Series · Part 4 of 21
Foundation · How AI Reads Your Words
AI doesn't read letters or words — it reads tokens. This small detail explains why AI costs money, why it stumbles on names, and why non-English is more expensive.
You type a sentence. The AI doesn’t read it the way you do.
Before any processing happens, your text is broken into tokens — chunks that might be a word, part of a word, a space, or a punctuation mark. The model never sees letters. It sees a list of integers.
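To make the text-to-integers step concrete, here is a minimal sketch of a toy tokenizer. The vocabulary `TOY_VOCAB` and its IDs are invented for illustration, and greedy longest-match is a simplification. Real tokenizers learn their vocabulary from data and apply their own matching rules.

```python
# Toy vocabulary: chunk -> integer ID. Entirely hypothetical.
TOY_VOCAB = {
    "The": 0, " cat": 1, " sat": 2, " on": 3, " the": 4, " mat": 5, ".": 6,
}

def encode(text, vocab=TOY_VOCAB):
    """Greedy longest-match: at each position, take the longest chunk in vocab."""
    ids = []
    i = 0
    max_len = max(len(chunk) for chunk in vocab)
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in vocab:
                ids.append(vocab[chunk])
                i += size
                break
        else:
            # No chunk in the toy vocabulary covers this character.
            raise ValueError(f"no token covers {text[i]!r}")
    return ids

def decode(ids, vocab=TOY_VOCAB):
    """Map IDs back to their chunks and join them."""
    by_id = {token_id: chunk for chunk, token_id in vocab.items()}
    return "".join(by_id[token_id] for token_id in ids)

ids = encode("The cat sat on the mat.")
print(ids)          # [0, 1, 2, 3, 4, 5, 6]
print(decode(ids))  # The cat sat on the mat.
```

Note that the space travels with the word that follows it (`" cat"`), which is one common convention and part of why a token can be "a word plus a space" rather than a clean word.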
Why Tokens, Not Words?
Words are inconsistent. “run”, “running”, “ran” — three different words that share a root. Handling every possible form of every word in every language would require a vocabulary of millions.
Tokens are a middle ground: a fixed vocabulary of chunks (roughly 100,000 in many current models) that covers the patterns in language efficiently. Common words are single tokens. Rare words get split. Code has its own patterns. Emoji can take anywhere from one to several tokens each, depending on the tokenizer.
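The article does not say how such a vocabulary is chosen. One widely used approach is byte-pair encoding (BPE): start from single characters and repeatedly merge the most frequent adjacent pair. The sketch below is a toy version on a made-up corpus, not any specific model's tokenizer. It shows why a common word ends up as one token while a rare one stays split.

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across all words, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return max(pairs, key=pairs.get) if pairs else None

def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = merged.get(tuple(out), 0) + freq
    return merged

# Made-up corpus: word -> how often it appears.
corpus = {"the": 50, "then": 5, "zyx": 1}
words = {tuple(word): freq for word, freq in corpus.items()}

for _ in range(2):  # two merge steps are enough for this toy corpus
    pair = most_frequent_pair(words)
    words = merge_pair(words, pair)

splits = {"".join(symbols): list(symbols) for symbols in words}
print(splits["the"])   # ['the']
print(splits["then"])  # ['the', 'n']
print(splits["zyx"])   # ['z', 'y', 'x']
```

With only two merges the frequent word "the" has already become a single symbol, while the rare "zyx" is still three separate pieces.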
The Practical Stuff
Cost — APIs charge per token. A short message might be 20 tokens. A full document pasted into the prompt might be 5,000. This is why “be concise” is genuinely good advice when using AI.
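The arithmetic behind per-token billing is simple enough to sketch. The price below is a placeholder, not any provider's real rate. The token counts are the two examples from the paragraph above.

```python
# Hypothetical price; check your provider's pricing page for real numbers.
PRICE_PER_MILLION_INPUT_TOKENS = 3.00  # US dollars, assumed

def input_cost(token_count, price_per_million=PRICE_PER_MILLION_INPUT_TOKENS):
    """Cost in dollars of sending `token_count` tokens as input."""
    return token_count / 1_000_000 * price_per_million

short_message = input_cost(20)
full_document = input_cost(5_000)

print(f"short message: ${short_message:.6f}")
print(f"full document: ${full_document:.6f}")
print(f"ratio: {full_document / short_message:.0f}x")  # ratio: 250x
```

Whatever the actual rate, the ratio stays the same: the pasted document costs 250 times as much as the short message, and it is charged again on every request that includes it.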
Context limits — every model has a maximum token count it can process at once. This is the “context window.” More tokens in = more expensive and eventually impossible once you hit the limit.
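One common way applications cope with the limit is to drop the oldest messages until the conversation fits. The sketch below assumes a tiny hypothetical limit and uses a crude word count as a stand-in for a real tokenizer. A real app would count tokens with the model's own tokenizer.

```python
CONTEXT_LIMIT = 12  # hypothetical; real limits are in the thousands or more

def rough_token_count(text):
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())

def fit_to_window(messages, limit=CONTEXT_LIMIT):
    """Keep the most recent messages whose combined token count fits the limit."""
    kept, used = [], 0
    for message in reversed(messages):
        cost = rough_token_count(message)
        if used + cost > limit:
            break  # this message and everything older is dropped
        kept.append(message)
        used += cost
    return list(reversed(kept))

history = [
    "My name is Abhishek and I like long walks",  # 9 words
    "What is a token",                            # 4 words
    "A token is a chunk of text",                 # 7 words
]
print(fit_to_window(history))
# ['What is a token', 'A token is a chunk of text']
```

The first message, the one containing the name, is the one that falls out of the window.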
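Names and unusual words — common English words are single tokens, but “Abhishek” might be tokenized as Ab + hi + shek. Because the model sees those chunks rather than individual letters, it can be surprisingly bad at word games, counting letters, or handling proper nouns. Non-English text tends to get split the same way, into more tokens for the same meaning, which is why it costs more.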
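A small sketch of why letter-level tasks are awkward. The split `Ab + hi + shek` is the example from the paragraph above; the integer IDs are invented.

```python
# The article's example split; the integer IDs are made up.
pieces = ["Ab", "hi", "shek"]
hypothetical_ids = {"Ab": 4821, "hi": 317, "shek": 90210}

# What the model receives: three integers, no letters.
model_input = [hypothetical_ids[piece] for piece in pieces]
print(model_input)  # [4821, 317, 90210]

# A program with access to the characters can count letters directly.
name = "".join(pieces)
print(name.count("h"))  # 2
```

Nothing in `[4821, 317, 90210]` says that two of those chunks contain an "h". The model can only answer if it has separately learned how each token is spelled.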
Next up: What happens to those tokens once the model has them? They all have to fit into a limited working space, the context window, and that limit is also why AI forgets things in long conversations.