Foundations

Essential concepts for understanding LLMs

7 Articles

36 Papers Referenced

~56 min Reading Time

Tokens, vocabularies, BPE intuition, and engineering trade-offs.

Attention weights, Q/K/V, multi-head self-attention, and compute costs.

Temperature, top-k, top-p, and the choices made at inference time.

The Transformer block, decoder-only LLMs, and major design trade-offs.

Word vectors, contextual representations, and the foundation of semantic retrieval.

Absolute positions, relative positions, and RoPE.

A careful look at scale, data, compute, and emergence.