The rise of Large Generative Models (LGMs) has revolutionized AI capabilities, but building efficient systems to support them is a critical next step. This course covers modern machine learning systems for LGMs, from fundamentals to cutting-edge topics in the field. Students will learn system design principles for training, inference, and serving of LGMs; scaling techniques for ever-growing models; memory reduction strategies to optimize resource utilization; and acceleration techniques to improve model performance. The course provides background for students who would like to pursue engineering or research in machine learning systems.

Prerequisites: undergraduate machine learning, undergraduate operating systems, Python coding.
Instructor: Yao Lu
TAs: Shenggan Cheng, Xuanlei Zhao
When and where: Wed 10:00-12:00, COM1-0212 (SR3)

Schedule:

| Lecture date | Plan | Lecturer if not Yao | Note |
|---|---|---|---|
| Aug 14 | Week 1: Introduction [slides] | | [HW1 Release] |
| Aug 21 | Week 2: MLsys foundations [slides] | | Jeff Dean and Prateek Jain's talk in the first hour |
| Aug 28 | Week 3: Automatic differentiation [slides] | | HW1 due (Aug 31) |
| Sep 04 | Week 4: Hardware acceleration [slides] | | [HW2 Release] |
| Sep 11 | Week 5: Parallelism and training techniques [slides] | | |
| Sep 18 | Week 6: Transformers, Attention and Optimizations [slides] | | HW2 due, Project proposal due (Sep 21) |
| Sep 25 | Recess week | | |
| Oct 02 | Week 7: Serving LLMs [slides] | | [HW3 Release] |
| Oct 09 | Week 8: Fine-tuning and alignment techniques [slides] | | |
| Oct 16 | Week 9: AI for systems | Guest lecture: Dr. Jialin Ding | HW3 due, Mid-term project report due |
| Oct 23 | Week 10: Application Systems: server design, AI Agents and RAGs [slides] | | |
| Oct 30 | Week 11: ML compilers | Guest lecture: Dr. Tianqi Chen | [HW4 Release] |
| Nov 06 | Week 12: Cloud systems for AI [slides] | | |
| Nov 13 | Week 13: Project presentations [Schedule] | | HW4 due, final project report due |

Grading schemes:
Mandatory: (1) Paper reading and discussion, and (2) HW1 for individual students.
Elective: (1) HW2-4 for individual students, and (2) course project for groups of 2-3 students. You can choose to do more (or all) of the homeworks and less (or none) of the project, or the other way around.
Note: pure ML/AI/CV/NLP projects are not acceptable. The project must demonstrate systems design and implementation that improves the efficiency, robustness, or generalizability of the system.