Series · Part 2 of 21

Foundation
AI Demystified
Abhishek Saha
· 🤖 AI / ML

Transformers — The Architecture That Changed Everything

Nearly every major language model (GPT, Claude, Gemini, Llama) is built on the Transformer. Here's how it works, stage by stage.

In 2017, a Google paper titled "Attention Is All You Need" introduced an architecture that went on to displace the recurrent networks that had dominated language modeling. GPT, Claude, Gemini, and Llama all build on it.

Stage 1 of 5

Text becomes numbers

Before any computation, your text is broken into tokens: subword units that average roughly 3–4 characters in English. Each token is mapped to an integer ID from a fixed vocabulary. For example, "Transformer" might become ["Trans", "form", "er"] → [1423, 678, 91] (the IDs here are illustrative).

Vocabulary size: ~100k tokens
Avg token length: 3–4 characters
Output: sequence of integer IDs
Example: The | Trans | form | er | mod | el | is | fast → [1423, 678, 91, 44, 20, 119, 7, 201]
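
To make the text-to-IDs step concrete, here is a minimal sketch that reproduces the eight-token example above. It rests on some loud assumptions: the vocabulary is a hypothetical eight-entry toy (real vocabularies hold on the order of 100k entries), the IDs are simply copied from the illustration, and the splitting rule is a greedy longest-match lookup. Production tokenizers typically learn their splits with byte-pair encoding or a similar algorithm, so treat this as an illustration of the idea, not how any particular model tokenizes.

```python
# Hypothetical toy vocabulary; the IDs are the illustrative ones from the example above.
TOY_VOCAB = {
    "The": 1423, "Trans": 678, "form": 91, "er": 44,
    "mod": 20, "el": 119, "is": 7, "fast": 201,
}


def tokenize_word(word, vocab):
    """Split one word into the longest vocabulary pieces, scanning left to right."""
    pieces = []
    start = 0
    while start < len(word):
        # Try the longest remaining substring first, then shrink it.
        for end in range(len(word), start, -1):
            piece = word[start:end]
            if piece in vocab:
                pieces.append(piece)
                start = end
                break
        else:
            # No vocabulary entry matches at this position.
            raise ValueError(f"no vocabulary entry covers {word[start:]!r}")
    return pieces


def encode(text, vocab=TOY_VOCAB):
    """Return (subword pieces, integer IDs) for whitespace-separated text."""
    pieces = [p for word in text.split() for p in tokenize_word(word, vocab)]
    return pieces, [vocab[p] for p in pieces]


pieces, ids = encode("The Transformer model is fast")
print(pieces)
print(ids)
```

The model never sees the strings, only the list of integers. A word missing from this toy vocabulary raises an error here; real tokenizers avoid that by falling back to smaller pieces, down to single bytes if necessary.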

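The key insight: instead of processing tokens one at a time, as recurrent networks do (slow to train and prone to losing early context), attention lets every token relate to every other token in the sequence at once. That makes the computation parallelizable and scalable, which is a large part of why modern AI systems are feasible.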
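
A rough sketch of what "every token relates to every other token" looks like in code follows. It implements scaled dot-product self-attention in plain Python under simplifying assumptions: the two-dimensional embeddings are made up for illustration, and queries, keys, and values are all the raw embeddings themselves, whereas a real Transformer applies learned projection matrices and runs many attention heads in parallel.

```python
import math


def softmax(row):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(row)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x.

    x is a list of token vectors (seq_len rows, d columns).
    Returns (weights, outputs): weights[i][j] is how much token i
    attends to token j; outputs[i] is the weighted mix of all vectors.
    """
    d = len(x[0])
    scale = math.sqrt(d)
    weights = []
    for query in x:
        # Score this token against every token in the sequence.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in x]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * value[j] for w, value in zip(row, x)) for j in range(d)]
        for row in weights
    ]
    return weights, outputs


tokens = ["The", "model", "is", "fast"]
# Made-up embeddings, chosen only so the example runs.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]

weights, outputs = self_attention(embeddings)
for token, row in zip(tokens, weights):
    print(token, [round(w, 2) for w in row])
```

Each printed row is one token's attention distribution over the whole sequence. No row depends on another row's result, which is why the rows can be computed in parallel rather than one step after another.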

Next up: Part 3 covers how AI actually learns — training, sampling, and why the same prompt doesn’t always give the same answer.

