The rise of Large Generative Models (LGMs) and Large Language Models (LLMs) has revolutionized AI capabilities, but building efficient systems to support them is a critical next step. This course delves into modern machine learning systems for LGMs, covering both the fundamentals and cutting-edge topics in the field. Students will learn the system design principles for training, inference, and serving of LGMs; scaling techniques to handle ever-growing models; memory reduction strategies to optimize resource utilization; and acceleration techniques to improve model performance. The course offers background for students who would like to pursue engineering or research in machine learning systems.

Prerequisites: undergraduate machine learning, undergraduate operating systems, Python coding.

Instructor: Yao Lu
TAs: Junyi Shen, Noppanat Wadlom
When and where: Wed 10:00-12:00, COM1-0212 (SR3)

Schedule:

| Lecture date | Plan | Lecturer (if not Yao) | Note |
|---|---|---|---|
| Aug 13 | Week 1: Introduction [slides] | | [HW1 Release] |
| Aug 20 | Week 2: MLsys foundations [slides] | | |
| Aug 27 | Week 3: Automatic differentiation [slides] | | HW1 due |
| Sep 03 | Week 4: Hardware acceleration | | [HW2 Release] |
| Sep 10 | Week 5: Parallelism and training techniques | | |
| Sep 17 | Week 6: Transformers, Attention and Optimizations | | HW2 due, project proposal due |
| Sep 24 | Recess week | | |
| Oct 01 | Week 7: Serving LLMs | | [HW3 Release] |
| Oct 08 | Week 8: Post-training techniques | | |
| Oct 15 | Week 9: Multi-Modal Models | | HW3 due, mid-term project report due |
| Oct 22 | Week 10: Application Systems: AI Agents, RAGs, Deep Research and beyond | | [HW4 Release] |
| Oct 29 | Week 11: LLM Safety | | |
| Nov 05 | Week 12: Cloud systems for AI | | |
| Nov 12 | Week 13: Project presentations | | HW4 due, final project report due |

Grading schemes:
Mandatory: (1) paper reading and discussion in class, and (2) HW1 for individual students.
Elective: (1) HW2-4 for individual students, and (2) a course project for groups of 2-3 students. You can choose between doing more (or all) of the homeworks with a smaller (or no) project, or the other way around.
Note: pure ML/AI/CV/NLP projects are not acceptable. The project must demonstrate systems design and implementation that lead to improvements in system efficiency, robustness, or generalizability.