DB Systems
Non-deduplicating project


Simple SQL


Groupby


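- if the hash table grows larger than DRAM, the program will crash
- to scale to a larger number of groups, divide and conquer: partition the rows by a hash of the grouping key so that each partition's hash table fits in DRAM, then aggregate one partition at a time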
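
A minimal sketch of the divide-and-conquer idea above, assuming a SUM aggregate over (key, value) pairs. The function name `partitioned_group_sum`, the `num_partitions` parameter, and the use of pickle-encoded temporary files are illustrative assumptions, not something these notes prescribe.

```python
import os
import pickle
import tempfile
from collections import defaultdict


def partitioned_group_sum(rows, num_partitions=4):
    """Group-by SUM over (key, value) pairs using hash partitioning.

    Hypothetical helper: only one partition's hash table is built at a time.
    """
    tmpdir = tempfile.mkdtemp()
    paths = [os.path.join(tmpdir, f"part_{i}.bin") for i in range(num_partitions)]

    # Divide: stream the rows once and spill each row to the partition chosen
    # by hashing its key, so all rows of one group land in the same partition.
    files = [open(p, "wb") for p in paths]
    for key, value in rows:
        pickle.dump((key, value), files[hash(key) % num_partitions])
    for f in files:
        f.close()

    # Conquer: aggregate each partition with its own, smaller hash table.
    result = {}
    for p in paths:
        table = defaultdict(int)
        with open(p, "rb") as f:
            while True:
                try:
                    key, value = pickle.load(f)
                except EOFError:
                    break
                table[key] += value
        # For brevity the per-partition results are merged in memory here;
        # a real system would write each partition's output back to disk.
        result.update(table)
        os.remove(p)
    os.rmdir(tmpdir)
    return result
```

Because every group lives in exactly one partition, the per-partition results never need to be combined across partitions; they can simply be concatenated.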
Matrix Sum / Norms

- all storage layouts have the same I/O cost because every cell must be read once (summed, or squared and summed for a norm)
- generally, tiled partitioning is better because it scales in both dimensions and is more DRAM- and cache-efficient
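
As a rough illustration of the points above, the sketch below computes a Frobenius norm tile by tile. It assumes the matrix is a small in-memory list of lists; the function name `frobenius_norm_tiled` and the `tile_rows` / `tile_cols` parameters are hypothetical. In a real system each tile would be a separately stored block read from disk.

```python
import math


def frobenius_norm_tiled(matrix, tile_rows=2, tile_cols=2):
    """Frobenius norm computed one tile at a time (illustrative sketch)."""
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if n_rows else 0
    total = 0.0
    # Walk the matrix in tile_rows x tile_cols blocks; only one block needs
    # to be resident at a time, whatever the overall matrix shape is.
    for r0 in range(0, n_rows, tile_rows):
        for c0 in range(0, n_cols, tile_cols):
            partial = 0.0
            for r in range(r0, min(r0 + tile_rows, n_rows)):
                for c in range(c0, min(c0 + tile_cols, n_cols)):
                    partial += matrix[r][c] ** 2
            # Partial sums of squares from each tile simply add up.
            total += partial
    return math.sqrt(total)
```

Every cell is still visited exactly once, which is why the total I/O matches a row or column layout; the tile size only bounds how much data must be in memory at any moment.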
ML Systems