- This is the 2025 iteration of the course; materials are added as we prepare them.
- Lecture and seminar materials for each week are in the ./week* folders; see the README.md in each folder for materials and instructions.
- For technical issues, bugs in course materials, or contribution ideas, please open an issue.
- For installing libraries and troubleshooting, see this thread.
-
week01 Word Embeddings
- Lecture: Word embeddings. Distributional semantics. Count-based (pre-neural) methods. Word2Vec: learn vectors. GloVe: count, then learn. Evaluation: intrinsic vs extrinsic. Analysis and Interpretability. Interactive lecture materials and more.
- Seminar: Playing with word and sentence embeddings
- Homework: Embedding-based machine translation system
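As a taste of the seminar material, intrinsic evaluation with word analogies ("a is to b as c is to ?") fits in a few lines of plain Python. This is a minimal sketch with toy vectors; the names and embeddings are illustrative, not from the course notebooks:

```python
import math

def cosine(u, v):
    # Cosine similarity between two dense vectors.
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def analogy(a, b, c, embeddings):
    # Solve "a is to b as c is to ?" by nearest neighbor to (b - a + c).
    target = [vb - va + vc for va, vb, vc in zip(embeddings[a], embeddings[b], embeddings[c])]
    candidates = [w for w in embeddings if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, embeddings[w]))

# Toy 3-dimensional embeddings, chosen so the analogy works out exactly.
emb = {
    "man":   [1.0, 0.0, 0.0],
    "woman": [0.0, 1.0, 0.0],
    "king":  [1.0, 0.0, 1.0],
    "queen": [0.0, 1.0, 1.0],
    "apple": [0.5, 0.5, 0.0],
}
```

With real Word2Vec or GloVe vectors the same arithmetic recovers many syntactic and semantic relations, which is exactly what intrinsic evaluation on analogy benchmarks measures.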
-
week02 Language Modeling
- Lecture: Language Modeling: what does it mean? Left-to-right framework. N-gram language models. Neural Language Models: General View, Recurrent Models, Convolutional Models. Evaluation. Practical Tips: Weight Tying. Analysis and Interpretability. Interactive lecture materials and more.
- Seminar: Build an N-gram language model from scratch
- Homework: Neural LMs & smoothing in count-based models.
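The seminar's count-based model can be sketched in a few lines: count bigrams over a corpus and turn them into smoothed conditional probabilities. A minimal sketch with add-k smoothing on a toy corpus (the helper names are illustrative):

```python
from collections import Counter

def train_bigram_lm(corpus, k=1.0):
    """Count unigrams and bigrams over tokenized sentences (toy sketch)."""
    unigrams, bigrams = Counter(), Counter()
    for sent in corpus:
        tokens = ["<s>"] + sent + ["</s>"]
        unigrams.update(tokens[:-1])
        bigrams.update(zip(tokens[:-1], tokens[1:]))
    vocab_size = len(set(t for s in corpus for t in s) | {"</s>"})

    def prob(prev, word):
        # Add-k smoothing: P(word | prev) = (count(prev, word) + k) / (count(prev) + k * |V|)
        return (bigrams[(prev, word)] + k) / (unigrams[prev] + k * vocab_size)

    return prob

corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
prob = train_bigram_lm(corpus)
```

Thanks to smoothing, unseen bigrams like ("the", "sat") still get nonzero probability, and the distribution over the next word sums to one.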
-
week03 Seq2seq and Attention
- Lecture: Seq2seq Basics: Encoder-Decoder framework, Training, Simple Models, Inference (e.g., beam search). Attention: general, score functions, models. Transformer: self-attention, masked self-attention, multi-head attention; model architecture. Subword Segmentation (BPE). Analysis and Interpretability: functions of attention heads; probing for linguistic structure. Interactive lecture materials and more.
- Seminar: Basic sequence to sequence model
- Homework: Machine translation with attention
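The subword segmentation part of the lecture is easy to demystify in code: BPE repeatedly merges the most frequent adjacent symbol pair. A minimal sketch of merge learning on a toy word list (function names are illustrative; production tokenizers add byte fallback, caching, and tie-breaking rules):

```python
from collections import Counter

def get_pair_counts(vocab):
    # vocab maps a word, as a tuple of symbols, to its corpus frequency.
    pairs = Counter()
    for word, freq in vocab.items():
        for a, b in zip(word, word[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(pair, vocab):
    # Replace every occurrence of the pair with its concatenation.
    merged = {}
    for word, freq in vocab.items():
        out, i = [], 0
        while i < len(word):
            if i < len(word) - 1 and (word[i], word[i + 1]) == pair:
                out.append(word[i] + word[i + 1])
                i += 2
            else:
                out.append(word[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

def learn_bpe(words, num_merges):
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(vocab)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        vocab = merge_pair(best, vocab)
        merges.append(best)
    return merges

merges = learn_bpe(["low", "low", "lower", "newest", "newest"], 3)
```

Applying the learned merges in order segments new words into subwords, which is how open-vocabulary translation systems avoid unknown tokens.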
-
week04 Transfer Learning
- Lecture: What is Transfer Learning? Great idea 1: From Words to Words-in-Context (CoVe, ELMo). Great idea 2: From Replacing Embeddings to Replacing Models (GPT, BERT). (A Bit of) Adaptors. Analysis and Interpretability. Interactive lecture materials and more.
- Homework: fine-tuning a pre-trained BERT model
-
week05 Large Language Models
- Lecture: Scaling laws. Emergent abilities. Open-source LLMs.
- Practice: hands-on with open-source LLMs
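The scaling-law topic has a compact functional form: loss falls as a power law in parameters N and training tokens D. A sketch of the Chinchilla-style parametric fit; the constants below are the commonly cited fitted values from the Chinchilla paper, but treat them as illustrative here:

```python
def chinchilla_loss(N, D, E=1.69, A=406.4, B=410.7, alpha=0.34, beta=0.28):
    # L(N, D) = E + A / N^alpha + B / D^beta
    # E is the irreducible loss; the two power-law terms shrink as the
    # model (N parameters) and the dataset (D tokens) grow.
    return E + A / N**alpha + B / D**beta
```

The form makes the compute-allocation question concrete: for a fixed compute budget, you trade N against D to minimize L.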
-
week06 Prompting & In-Context Learning
- Lecture: Prompting techniques. Chain-of-Thought reasoning. In-context learning: how and why it works. Analysis and Interpretability.
- Homework: manual prompt engineering and chain-of-thought reasoning
-
week07 Fine-tuning (PEFT & RLHF)
- Lecture: Parameter-efficient fine-tuning (LoRA, adapters). Reinforcement Learning from Human Feedback (RLHF).
- Seminar + Homework
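The low-rank idea behind LoRA fits in a few lines: the frozen weight W is augmented with a trainable rank-r update B·A, scaled by alpha/r. A pure-Python sketch with illustrative shapes (real implementations use tensors, initialize B to zeros so training starts from the base model, and can merge the update into W for inference):

```python
def matvec(M, v):
    # Plain matrix-vector product over nested lists.
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def lora_forward(W, A, B, x, alpha=8, r=2):
    # Frozen base projection W x plus the low-rank update (alpha / r) * B (A x).
    # Only A (r x d) and B (d x r) would be trained; W stays fixed.
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]
```

With B initialized to zeros the forward pass equals the frozen model exactly, which is why LoRA fine-tuning starts from the pretrained behavior.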
-
week08 Efficiency
- Lecture: Quantization. Distillation. Pruning. Speculative decoding.
- Homework
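The simplest of the lecture's efficiency tricks to sketch is quantization. A minimal symmetric per-tensor int8 scheme (illustrative; real libraries quantize per-channel, handle outliers, and fuse dequantization into kernels):

```python
def quantize_int8(weights):
    # Symmetric quantization: map [-max|w|, max|w|] onto integers [-127, 127].
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid 0 scale for all-zero input
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Recover approximate floats; error is at most scale / 2 per weight.
    return [qi * scale for qi in q]

weights = [0.5, -1.0, 0.25]
q, scale = quantize_int8(weights)
```

Storing 8-bit integers plus one float scale per tensor cuts memory roughly 4x versus float32, at the cost of the rounding error bounded above.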
-
week09 Retrieval-Augmented Generation (RAG)
- Lecture: Dense retrieval. RAG architectures.
- Practice
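The retrieval half of RAG reduces to nearest-neighbor search in embedding space. A minimal sketch ranking toy document vectors by cosine similarity to a query vector (real systems use a trained dual encoder and an approximate index such as FAISS instead of exact sorting):

```python
import math

def cosine(u, v):
    # Cosine similarity between two dense vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms

def retrieve(query_vec, doc_vecs, k=2):
    # Return indices of the k documents most similar to the query.
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]

doc_vecs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
```

The retrieved passages are then prepended to the LLM prompt, which is the "augmented generation" half of the pipeline.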
-
week10 AI Agents
- Lecture: Agent architectures. Tool use. Memory.
- Seminar + Homework
-
week11 Interpretability
- Lecture: Probing. Mechanistic interpretability.
- Seminar + Homework
-
week12 Multimodal LLMs
-
week13 Building LLM Systems
-
week14 AI Agents in Production
Course materials and teaching by
- Elena Voita - original course author
- Michael Diskin - responsible for the 2025 edition; lecturer
- Just Heuristic - most of the seminars and some lectures
- Ignat Romanov, George Yakushev, Andrei Panferov - lectures
- Natasha Badanina - course admin for on-campus students
- Boris Kovarsky, David Talbot, Sergey Gubanov, Ruslan Svirschevski - helped build course materials and/or held some classes
- 30+ volunteers who contributed to and refined the notebooks and course materials. Without their help, the course would not be what it is today
- A mighty host of TAs who stoically grade hundreds of homework submissions from on-campus students each year