Token (NLP)
NLP & Language Models
Smallest processing unit for language models.
A token is the smallest unit of text a language model processes: a whole word, a piece of a word (subword), or a single symbol such as a punctuation mark.
- Types: Words, subwords, punctuation marks.
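- Impact: Context length is measured in tokens, so how text is tokenized determines how much of it fits in the model's context window and how much computation an input requires.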
- Example: 'ChatGPT' may be split into 'Chat' and 'GPT', depending on the tokenizer's vocabulary.
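
The split in the example above can be illustrated with a minimal sketch of greedy longest-match subword tokenization. This is a toy under stated assumptions, not how any particular model tokenizes text: the function `tokenize` and the vocabulary `TOY_VOCAB` are hypothetical names invented for illustration, and production tokenizers typically use learned vocabularies (for example, byte-pair encoding) with tens of thousands of entries.

```python
def tokenize(text, vocab):
    """Split text into tokens by greedy longest-match against vocab."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a vocab entry matches.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # No vocab entry matches here: fall back to a single character.
            tokens.append(text[i])
            i += 1
    return tokens


# Hypothetical toy vocabulary, chosen only to reproduce the glossary example.
TOY_VOCAB = {"Chat", "GPT", "token", "s", "!"}

pieces = tokenize("ChatGPT", TOY_VOCAB)
print(pieces)       # the subword pieces
print(len(pieces))  # the token count, which is what counts against context length
```

With this toy vocabulary, a word absent from the vocabulary falls back to one token per character, which hints at why the same text can consume very different amounts of context under different tokenizers.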