minimind: Understand LLM training from scratch

No more black-box training — understand every design choice via controlled experiments

Core Highlights

Truly understand how LLMs are trained

No more blind training — use controlled experiments to understand every design choice.

Principles First

No black boxes — understand the tradeoffs behind every design decision.

Controlled Experiments

Show, don’t tell — run experiments to see what breaks when a design choice is taken away.

Modular Learning

From normalization to the full Transformer — 6 independent modules that build on one another, step by step.

Low Barrier

Tiny datasets run in minutes on CPU — verify ideas fast and cheaply.

Pick Your Learning Path

Choose the best path for your time and goals

Different paths for different needs — from a quick taste to deep mastery.

Most Popular

Quick Start

Use 3 experiments to grasp the key LLM design choices. Great for first-timers.

30 min
Start Learning
Comprehensive

Systematic Study

Master all Transformer fundamentals with a complete, structured path.

6 hours
Start Learning
Ultimate Challenge

Deep Mastery

Train a full LLM from scratch and go deep into architecture and training.

30+ hours
Start Learning

Quick Start

30 minutes, three experiments — change how you understand LLM training

Terminal

# 1. Clone the repo

git clone https://github.com/joyehuang/minimind-notes.git

cd minimind-notes

# 2. Activate your virtual environment (if any)

source venv/bin/activate

# 3. Experiment 1: Why normalization?

cd modules/01-foundation/01-normalization/experiments

python exp1_gradient_vanishing.py

# 4. Experiment 2: Why RoPE position encoding?

cd ../../02-position-encoding/experiments

python exp1_rope_basics.py

# 5. Experiment 3: How does attention work?

cd ../../03-attention/experiments

python exp1_attention_basics.py

📊

Gradient Vanishing

Visualize gradient flow in deep networks
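
This is not the repository's exp1_gradient_vanishing.py, just a minimal sketch of the idea it explores (assuming PyTorch is available): stack many linear + tanh layers with and without LayerNorm and compare how much gradient reaches the first layer.

# Hypothetical sketch, not the repo's experiment script.
import torch
import torch.nn as nn

def make_net(depth: int, width: int, use_norm: bool) -> nn.Sequential:
    layers = []
    for _ in range(depth):
        layers.append(nn.Linear(width, width))
        if use_norm:
            layers.append(nn.LayerNorm(width))
        layers.append(nn.Tanh())
    return nn.Sequential(*layers)

torch.manual_seed(0)
x = torch.randn(32, 64)
for use_norm in (False, True):
    net = make_net(depth=20, width=64, use_norm=use_norm)
    loss = net(x).pow(2).mean()
    loss.backward()
    first = net[0].weight.grad.norm().item()  # gradient magnitude reaching layer 0
    print(f"LayerNorm={use_norm}: grad norm at first layer = {first:.2e}")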

🔄

RoPE Encoding

See the math behind rotary position embeddings
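
Again a minimal sketch, not the repo's exp1_rope_basics.py: RoPE rotates each pair of dimensions by an angle proportional to the token's position, so the query–key dot product depends only on the relative offset. The NumPy helper below is hypothetical and illustrative only.

# Hypothetical sketch of the RoPE rotation on a single vector.
import numpy as np

def rope(x: np.ndarray, pos: int, base: float = 10000.0) -> np.ndarray:
    """Rotate pairs (x[2i], x[2i+1]) by pos * theta_i."""
    d = x.shape[-1]
    theta = base ** (-np.arange(0, d, 2) / d)  # one frequency per dimension pair
    angles = pos * theta
    cos, sin = np.cos(angles), np.sin(angles)
    x_even, x_odd = x[0::2], x[1::2]
    out = np.empty_like(x)
    out[0::2] = x_even * cos - x_odd * sin
    out[1::2] = x_even * sin + x_odd * cos
    return out

rng = np.random.default_rng(0)
q, k = rng.standard_normal(8), rng.standard_normal(8)
# Same relative offset (2) at different absolute positions gives the same score:
print(np.dot(rope(q, 5), rope(k, 3)))
print(np.dot(rope(q, 105), rope(k, 103)))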

🎯

Attention

Visualize attention weight computation
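
For reference, a minimal sketch of scaled dot-product attention (not the repo's exp1_attention_basics.py): scores between queries and keys, a softmax over the keys, then a weighted sum of the values.

# Hypothetical sketch of attention weight computation.
import numpy as np

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q: np.ndarray, k: np.ndarray, v: np.ndarray):
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)      # (num_queries, num_keys) similarities
    weights = softmax(scores, axis=-1)   # each query's weights sum to 1
    return weights @ v, weights

rng = np.random.default_rng(0)
q = rng.standard_normal((4, 16))   # 4 query positions, head dim 16
k = rng.standard_normal((6, 16))   # 6 key positions
v = rng.standard_normal((6, 16))
out, w = attention(q, k, v)
print(out.shape, w.shape)          # (4, 16) (4, 6)
print(w.sum(axis=-1))              # each row sums to 1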

💡 Why choose this tutorial?

🎯 No more “just make it run”

Have you ever followed a tutorial, gotten the code working, and still not known why it works? This tutorial uses controlled experiments to show what breaks when a design is removed and why the alternatives fall short.

🔬 Every design choice is backed by experiments

No more armchair theory — each module includes runnable comparison experiments so you can see real effects. Theory + practice, down to the details.

💻 Low barrier for learning experiments

Learning-stage experiments use TinyShakespeare (~1 MB) and similar micro datasets, and run on a CPU in minutes. Full training is different: to train a complete model from scratch you will need a GPU (the upstream MiniMind project uses a single NVIDIA 3090 for about 2 hours).

🔗 Resources

📦 Upstream project: jingyaogong/minimind

🗺️ Learning roadmap: Full roadmap

💻 Code examples: Executable examples

📝 Learning notes: Learning log · Knowledge base

Built on MiniMind for learning and experiments