Modern data systems are evolving to cope with the increasing scale and complexity of multi-modal data used in emerging applications such as search engines, business intelligence, and large generative models. This course begins by exploring individual databases designed for various data models, progresses to cloud databases such as data lakes and warehouses, and then covers optimization techniques, tuning, and data integration strategies. Our goal is to provide a contemporary perspective that supports practical, data-intensive applications.

Instructor: Yao Lu
When and where: Fridays, LT15 18:30-20:30 (lecture), 20:30-21:30 (tutorial)

Schedule:

Lecture date / Plan / Note
Jan 17 Week 1: Introduction
[Lecture slides]
Jan 24 Week 2: Relational Databases I. Concepts
[Lecture slides]
Jan 31 Week 3: Relational Databases II. Tuning A
Tutorial session: Labs for relational DB design
[Lecture slides][Tutorial slides]
[HW1 Release]
Feb 07 Week 4: Relational Databases II. Tuning B
Tutorial session: Labs for relational DB tuning
[Lecture slides][Tutorial slides]
[HW2 Release]
Feb 14 Week 5: Modern Databases I. Streaming and Time Series Databases
Tutorial session: Labs for time series DB
[Lecture slides][Guest slides on stream DB][Tutorial slides]
[HW1 due]
Feb 21 Week 6: Modern Databases II. Vector Databases
Tutorial session: Labs for vector DB
[Lecture slides][Tutorial slides]
[Team project release]
Feb 28 Recess week, no lecture
Mar 07 Week 7: Modern Databases III. Data Curation and RAGs
[Lecture slides][Tutorial slides]
Tutorial session: Labs for data curation & RAGs
[HW2 due]
Mar 14 Week 8: Cloud Databases I: Fundamentals
[Lecture slides]
Tutorial session: team project presentations, time series DB
[Final project release]
Mar 21 Week 9: Cloud Databases II: Data Lake and Warehouse
[Lecture slides]
Tutorial session: team project presentations, vector DB
Mar 28 Well-Being Day, no lecture
Apr 04 Week 11: Cloud Databases III: OLAP optimizations

Tutorial session: team project presentations, data curation and RAGs
Apr 11 Week 12: Cloud Databases IV: Data Integration

Tutorial session: Labs for document DB
TBD Final project presentations, time & location: TBD
Grading scheme:
  • Tutorials: 10%, 2 Homeworks: 40%, 1 Team Project: 20%, Final Project: 30%