- This is the 2025 iteration of the course; materials are added as we prepare them.
- Lecture and seminar materials for each week are in the ./week* folders; see the README.md in each folder for materials and instructions.
- For technical issues, bugs in course materials, or contribution ideas, please open an issue.
- For installing libraries and troubleshooting, see this thread.
-
week01 Word Embeddings
- Lecture: Word embeddings. Distributional semantics. Count-based (pre-neural) methods. Word2Vec: learn vectors. GloVe: count, then learn. Evaluation: intrinsic vs extrinsic. Analysis and Interpretability. Interactive lecture materials and more.
- Seminar: Playing with word and sentence embeddings
- Homework: Embedding-based machine translation system
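As a taste of the seminar material, intrinsic evaluation with word analogies ("a is to b as c is to ?") fits in a few lines of plain Python. This is a minimal sketch with toy vectors; the names and embeddings are illustrative, not from the course notebooks:

```python
import math

def cosine(u, v):
    # Cosine similarity between two dense vectors.
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def analogy(a, b, c, embeddings):
    # Solve "a is to b as c is to ?" by nearest neighbor to (b - a + c).
    target = [vb - va + vc for va, vb, vc in zip(embeddings[a], embeddings[b], embeddings[c])]
    candidates = [w for w in embeddings if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, embeddings[w]))

# Toy 3-dimensional embeddings, chosen so the analogy works out exactly.
emb = {
    "man":   [1.0, 0.0, 0.0],
    "woman": [0.0, 1.0, 0.0],
    "king":  [1.0, 0.0, 1.0],
    "queen": [0.0, 1.0, 1.0],
    "apple": [0.5, 0.5, 0.0],
}
```

With real Word2Vec or GloVe vectors the same arithmetic recovers many syntactic and semantic relations, which is exactly what intrinsic evaluation on analogy benchmarks measures.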
-
week02 Language Modeling
- Lecture: Language Modeling: what does it mean? Left-to-right framework. N-gram language models. Neural Language Models: General View, Recurrent Models, Convolutional Models. Evaluation. Practical Tips: Weight Tying. Analysis and Interpretability. Interactive lecture materials and more.
- Seminar: Build an N-gram language model from scratch
- Homework: Neural LMs & smoothing in count-based models.
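The seminar's count-based model can be sketched in a few lines: count bigrams over a corpus and turn them into smoothed conditional probabilities. A minimal sketch with add-k smoothing on a toy corpus (the helper names are illustrative):

```python
from collections import Counter

def train_bigram_lm(corpus, k=1.0):
    """Count unigrams and bigrams over tokenized sentences (toy sketch)."""
    unigrams, bigrams = Counter(), Counter()
    for sent in corpus:
        tokens = ["<s>"] + sent + ["</s>"]
        unigrams.update(tokens[:-1])
        bigrams.update(zip(tokens[:-1], tokens[1:]))
    vocab_size = len(set(t for s in corpus for t in s) | {"</s>"})

    def prob(prev, word):
        # Add-k smoothing: P(word | prev) = (count(prev, word) + k) / (count(prev) + k * |V|)
        return (bigrams[(prev, word)] + k) / (unigrams[prev] + k * vocab_size)

    return prob

corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
prob = train_bigram_lm(corpus)
```

Thanks to smoothing, unseen bigrams like ("the", "sat") still get nonzero probability, and the distribution over the next word sums to one.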
-
week03 Seq2seq and Attention
- Lecture: Seq2seq Basics: Encoder-Decoder framework, Training, Simple Models, Inference (e.g., beam search). Attention: general, score functions, models. Transformer: self-attention, masked self-attention, multi-head attention; model architecture. Subword Segmentation (BPE). Analysis and Interpretability: functions of attention heads; probing for linguistic structure. Interactive lecture materials and more.
- Seminar: Basic sequence to sequence model
- Homework: Machine translation with attention
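The subword segmentation part of the lecture is easy to demystify in code: BPE repeatedly merges the most frequent adjacent symbol pair. A minimal sketch of merge learning on a toy word list (function names are illustrative; production tokenizers add byte fallback, caching, and tie-breaking rules):

```python
from collections import Counter

def get_pair_counts(vocab):
    # vocab maps a word, as a tuple of symbols, to its corpus frequency.
    pairs = Counter()
    for word, freq in vocab.items():
        for a, b in zip(word, word[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(pair, vocab):
    # Replace every occurrence of the pair with its concatenation.
    merged = {}
    for word, freq in vocab.items():
        out, i = [], 0
        while i < len(word):
            if i < len(word) - 1 and (word[i], word[i + 1]) == pair:
                out.append(word[i] + word[i + 1])
                i += 2
            else:
                out.append(word[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

def learn_bpe(words, num_merges):
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(vocab)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        vocab = merge_pair(best, vocab)
        merges.append(best)
    return merges

merges = learn_bpe(["low", "low", "lower", "newest", "newest"], 3)
```

Applying the learned merges in order segments new words into subwords, which is how open-vocabulary translation systems avoid unknown tokens.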
-
week04 Transfer Learning
- Lecture: What is Transfer Learning? Great idea 1: From Words to Words-in-Context (CoVe, ELMo). Great idea 2: From Replacing Embeddings to Replacing Models (GPT, BERT). (A Bit of) Adaptors. Analysis and Interpretability. Interactive lecture materials and more.
- Homework: fine-tuning a pre-trained BERT model
-
week05 Large Language Models
- Lecture: Scaling laws. Emergent abilities. Open-source LLMs.
- Practice: hands-on with open-source LLMs
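The scaling-law topic has a compact functional form: loss falls as a power law in parameters N and training tokens D. A sketch of the Chinchilla-style parametric fit; the constants below are the commonly cited fitted values from the Chinchilla paper, but treat them as illustrative here:

```python
def chinchilla_loss(N, D, E=1.69, A=406.4, B=410.7, alpha=0.34, beta=0.28):
    # L(N, D) = E + A / N^alpha + B / D^beta
    # E is the irreducible loss; the two power-law terms shrink as the
    # model (N parameters) and the dataset (D tokens) grow.
    return E + A / N**alpha + B / D**beta
```

The form makes the compute-allocation question concrete: for a fixed compute budget, you trade N against D to minimize L.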
-
week06 Prompting & In-Context Learning
- Lecture: Prompting techniques. Chain-of-Thought reasoning. In-context learning: how and why it works. Analysis and Interpretability.
- Homework: manual prompt engineering and chain-of-thought reasoning
-
week07 Fine-tuning (PEFT & RLHF)
- Lecture: Parameter-efficient fine-tuning (LoRA, adapters). Reinforcement Learning from Human Feedback (RLHF).
- Seminar + Homework
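The low-rank idea behind LoRA fits in a few lines: the frozen weight W is augmented with a trainable rank-r update B·A, scaled by alpha/r. A pure-Python sketch with illustrative shapes (real implementations use tensors, initialize B to zeros so training starts from the base model, and can merge the update into W for inference):

```python
def matvec(M, v):
    # Plain matrix-vector product over nested lists.
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def lora_forward(W, A, B, x, alpha=8, r=2):
    # Frozen base projection W x plus the low-rank update (alpha / r) * B (A x).
    # Only A (r x d) and B (d x r) would be trained; W stays fixed.
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]
```

With B initialized to zeros the forward pass equals the frozen model exactly, which is why LoRA fine-tuning starts from the pretrained behavior.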
-
week08 Efficiency
- Lecture: Quantization. Distillation. Pruning. Speculative decoding.
- Homework
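The simplest of the lecture's efficiency tricks to sketch is quantization. A minimal symmetric per-tensor int8 scheme (illustrative; real libraries quantize per-channel, handle outliers, and fuse dequantization into kernels):

```python
def quantize_int8(weights):
    # Symmetric quantization: map [-max|w|, max|w|] onto integers [-127, 127].
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid 0 scale for all-zero input
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Recover approximate floats; error is at most scale / 2 per weight.
    return [qi * scale for qi in q]

weights = [0.5, -1.0, 0.25]
q, scale = quantize_int8(weights)
```

Storing 8-bit integers plus one float scale per tensor cuts memory roughly 4x versus float32, at the cost of the rounding error bounded above.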
-
week09 Retrieval-Augmented Generation (RAG)
- Lecture: Dense retrieval. RAG architectures.
- Practice
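The retrieval half of RAG reduces to nearest-neighbor search in embedding space. A minimal sketch ranking toy document vectors by cosine similarity to a query vector (real systems use a trained dual encoder and an approximate index such as FAISS instead of exact sorting):

```python
import math

def cosine(u, v):
    # Cosine similarity between two dense vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms

def retrieve(query_vec, doc_vecs, k=2):
    # Return indices of the k documents most similar to the query.
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]

doc_vecs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
```

The retrieved passages are then prepended to the LLM prompt, which is the "augmented generation" half of the pipeline.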
-
week10 AI Agents
- Lecture: Agent architectures. Tool use. Memory.
- Seminar + Homework
-
week11 Interpretability
- Lecture: Probing. Mechanistic interpretability.
- Seminar + Homework
-
week12 Multimodal LLMs
-
week13 Building LLM Systems
-
week14 AI Agents in Production
Course materials and teaching by
- Elena Voita - original course author
- Michael Diskin - responsible for the 2025 edition; lecturer
- Just Heuristic - most of the seminars and some lectures
- Ignat Romanov, George Yakushev, Andrei Panferov - lectures
- Natasha Badanina - course admin for on-campus students
- Boris Kovarsky, David Talbot, Sergey Gubanov, Ruslan Svirschevski - helped build course materials and/or held some classes
- 30+ volunteers who contributed to and refined the notebooks and course materials. Without their help, the course would not be what it is today
- A mighty host of TAs who stoically grade hundreds of homework submissions from on-campus students each year