Series · Part 2 of 21

Foundation

Abhishek Saha · May 20, 2026 🤖 AI / ML

Transformers — The Architecture That Changed Everything

Every major AI — GPT, Claude, Gemini, Llama — runs on the Transformer. Here's how it works, stage by stage, with live visualizations.

#ai #transformers #attention #architecture #llm #interactive

In 2017, Google researchers published "Attention Is All You Need", a paper describing an architecture that went on to displace the recurrent models that had dominated language tasks. GPT, Claude, Gemini, and Llama all build on it.

Stage 1 of 5

Text becomes numbers

Before any computation, your text is broken into tokens — subword units of roughly 3-4 characters. Each token is mapped to an integer ID. "Transformer" might become ["Trans", "form", "er"] → [1423, 678, 91].

Vocabulary size: ~100k tokens

Avg token length: 3–4 characters

Output: sequence of integer IDs

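Example: "The Transformer model is fast" splits into ["The", "Trans", "form", "er", "mod", "el", "is", "fast"] → [1423, 678, 91, 44, 20, 119, 7, 201] (the IDs are illustrative).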
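The text-to-IDs step can be sketched in a few lines of Python. This is a toy, not how production tokenizers work: real models typically learn their vocabulary with an algorithm such as byte-pair encoding, while this sketch swaps in a simple greedy longest-match lookup. `TOY_VOCAB`, `UNKNOWN_ID`, `tokenize`, and `encode` are hypothetical names, and the vocabulary holds only the three "Transformer" pieces and IDs quoted earlier.

```python
# Toy vocabulary: only the three pieces of "Transformer" with the
# illustrative IDs from the article. A real vocabulary has ~100k entries.
TOY_VOCAB = {
    "Trans": 1423,
    "form": 678,
    "er": 91,
}
UNKNOWN_ID = 0  # hypothetical ID for text the toy vocabulary cannot cover


def tokenize(text, vocab=TOY_VOCAB):
    """Split text into subword pieces by greedy longest-match lookup."""
    pieces = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink until one matches.
        for end in range(len(text), i, -1):
            if text[i:end] in vocab:
                pieces.append(text[i:end])
                i = end
                break
        else:
            # Nothing in the vocabulary starts here: emit one raw character.
            pieces.append(text[i])
            i += 1
    return pieces


def encode(text, vocab=TOY_VOCAB):
    """Map each piece to its integer ID, or UNKNOWN_ID if it is missing."""
    return [vocab.get(piece, UNKNOWN_ID) for piece in tokenize(text, vocab)]


print(tokenize("Transformer"))  # ['Trans', 'form', 'er']
print(encode("Transformer"))    # [1423, 678, 91]
```

The model never sees the characters themselves, only the list of integers that `encode` returns.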

The key insight: instead of processing words one at a time (slow, forgetful), attention lets every token relate to every other token simultaneously. That makes the computation parallelizable and scalable, which is a large part of why modern AI is possible.
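To make "every token relates to every other token" concrete, here is a minimal sketch of scaled dot-product self-attention in plain Python. It assumes the standard formulation from the 2017 paper (scores scaled by the square root of the vector size, then a softmax) and simplifies heavily: each token vector acts as its own query, key, and value, whereas a real Transformer layer first applies learned weight matrices. The function names and the three 2-dimensional embeddings are made up for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention without learned projections.

    Simplification: each input vector is used directly as query, key,
    and value. Real layers project the inputs with learned matrices.
    """
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Score this token against every token in the sequence, itself included.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # The output is a weighted average of all token vectors.
        mixed = [
            sum(w * value[d] for w, value in zip(weights, vectors))
            for d in range(dim)
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical 2-dimensional embeddings for three tokens.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Each printed row shows how much one token attends to every token in the sequence. No row depends on another row's result, so all of them can be computed at the same time, which is the parallelism the paragraph above refers to.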

Next up: Part 3 covers how AI actually learns — training, sampling, and why the same prompt doesn’t always give the same answer.
