Summary
The open-source “llm-course” by Maxime Labonne offers a free roadmap in three tracks (LLM Fundamentals, LLM Scientist, LLM Engineer), supported by Colab notebooks and the paid LLM Engineer’s Handbook. A free assistant, available via HuggingChat and ChatGPT, quizzes learners and answers questions in real time.
Key Points
- Prerequisites: linear algebra, calculus, probability & statistics; Python fluency (NumPy, Pandas, Matplotlib, Seaborn).
- Core ML: scikit-learn algorithms, PCA/t-SNE, train/val/test splits, data-cleaning pipeline.
- Neural nets: back-propagation, optimizers (SGD, Adam), regularisation (dropout, L1/L2); build an MLP in PyTorch.
- NLP: tokenisation, TF-IDF, Word2Vec/GloVe/FastText, RNN/LSTM/GRU.
- Transformers: encoder-decoder → decoder-only GPT-style; self-attention; tokenisation choices affect speed & memory.
- Text generation: greedy/beam vs. temperature/nucleus sampling; visual guides by 3Blue1Brown, Karpathy, Bycroft.
- Pre-training: data-centric, compute-heavy (Llama 3.1 used 15 T tokens), but feasible below 1 B parameters with careful curation, deduplication, and tokenisation.
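The decoding strategies named above (greedy vs. temperature/nucleus sampling) can be sketched in a few lines of NumPy. This is an illustrative sketch, not code from the course: the function name and signature are my own, and it assumes a raw logit vector as input.

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, top_p=1.0, greedy=False, rng=None):
    """Pick a next-token id from a logit vector.

    greedy=True  -> argmax (deterministic, no sampling)
    temperature  -> divides logits before softmax (<1 sharpens, >1 flattens)
    top_p        -> nucleus sampling: sample only from the smallest set of
                    tokens whose cumulative probability reaches top_p
    """
    rng = rng or np.random.default_rng()
    if greedy:
        return int(np.argmax(logits))
    # Temperature-scaled softmax (max-subtraction for numerical stability).
    scaled = logits / temperature
    probs = np.exp(scaled - np.max(scaled))
    probs /= probs.sum()
    # Nucleus truncation: sort descending, cut where cumulative prob >= top_p.
    order = np.argsort(probs)[::-1]
    cumulative = np.cumsum(probs[order])
    cutoff = int(np.searchsorted(cumulative, top_p)) + 1
    kept_ids = order[:cutoff]
    kept_probs = probs[kept_ids] / probs[kept_ids].sum()
    return int(rng.choice(kept_ids, p=kept_probs))
```

With `greedy=True` the output is the single most likely token every time; lowering `temperature` or `top_p` trades diversity for coherence, which is the practical knob the course's text-generation module explores.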
Next Steps
Clone the repo, pick a track, run the Colabs, and use the interactive assistant to test understanding.