The rise of Large Generative Models (LGMs) and Large Language Models (LLMs) has revolutionized AI capabilities, and building efficient systems to support them is the critical next step. This course covers modern machine learning systems for LGMs, from the fundamentals to cutting-edge topics. Students will learn system design principles for training, inference, and serving of LGMs; scaling techniques to handle ever-growing models; memory reduction strategies to optimize resource utilization; and acceleration techniques to improve model performance. The course provides background for students who would like to pursue engineering or research in machine learning systems.

Prerequisites: undergraduate machine learning, undergraduate operating systems, Python coding.
Instructor: Yao Lu
TAs: Junyi Shen, Noppanat Wadlom
When and where: Wed 10:00-12:00 COM1-0212 (SR3)

Schedule:

Lecture date | Plan | Lecturer (if not Yao) | Note
Aug 13 | Week 1: Introduction [slides] | | [HW1 Release]
Aug 20 | Week 2: MLsys foundations [slides] | |
Aug 27 | Week 3: Automatic differentiation [slides] | | HW1 due
Sep 03 | Week 4: Hardware acceleration [slides] | | [HW2 Release]
Sep 10 | Week 5: Parallelism and training techniques [slides] | |
Sep 17 | Week 6: Transformers, Attention and Optimizations [slides] | | HW2 due, Project proposal due
Sep 24 | Recess week | |
Oct 01 | Week 7: Serving LLMs [slides] | | [HW3 Release]
Oct 08 | Week 8: Application Systems: RAGs, Vector DBs and AI Agents [slides] | |
Oct 15 | Week 9: Deep Research in Action | Bruce Yang | HW3 due, Mid-term project report due
Oct 22 | Week 10: LLM Alignment [slides] | | [HW4 Release]
Oct 29 | Week 11: LLM Safety | Junfeng Fang |
Nov 05 | Week 12: Cloud systems for AI | |
Nov 12 | Week 13: Project presentations | | HW4 due, final project report due

Grading schemes:
  • Mandatory: (1) Paper reading and discussion in class, and (2) HW1 for individual students.
  • Elective: (1) HW2-4 for individual students, and (2) course project for groups of 2-3 students.
    You can choose to do more (or all) of the homework and less (or none) of the project, or the other way around.
    Note: pure ML/AI/CV/NLP projects are not acceptable. The project must demonstrate systems design and implementation that lead to improvements in system efficiency, robustness, or generalizability.