5/8 - Basics of Parallel Data Access - Scalable Data Access: Paged Access, I/O Costs, Layouts/Access Patterns TODO

Central Issue: Large data file does not fit entirely in DRAM

Basic Idea: Divide-and-conquer again
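A minimal sketch of the divide-and-conquer idea, assuming the file is a flat array of little-endian int64 values and that a 4096-byte page is a reasonable unit (both are illustrative assumptions, as is the name paged_sum). Only one page is held in memory at a time, so the file can be far larger than DRAM:

```python
import struct

PAGE_SIZE = 4096  # bytes per page (assumed value, for illustration only)

def paged_sum(path, page_size=PAGE_SIZE):
    """Sum the int64 values in a file while holding only one page in memory."""
    total = 0
    pages_read = 0
    with open(path, "rb") as f:
        while True:
            page = f.read(page_size)      # bring one page from disk into DRAM
            if not page:
                break                     # end of file
            pages_read += 1
            count = len(page) // 8        # the last page may be partially full
            total += sum(struct.unpack(f"<{count}q", page[:count * 8]))
    return total, pages_read
```

The partial result (total) is combined across pages, which is the "conquer" step; pages_read is the quantity the I/O cost model counts.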

4 regimes of scalability

Paged Access

[Figure: Screen Shot 2024-05-08 at 3.21.13 PM.png]

Caching: retaining pages from disk in DRAM

Eviction: removing a page's content from its page frame in DRAM

Spilling: writing out pages from DRAM to disk

Cache Replacement Policy: the algorithm that chooses which page frames to evict
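The four terms above can be seen working together in a toy buffer pool. This is a sketch, not a real database component: the class name BufferPool, the use of a dict to stand in for disk, and the choice of LRU (least recently used) as the replacement policy are all assumptions made for illustration.

```python
from collections import OrderedDict

class BufferPool:
    """Toy buffer pool: caches pages in 'DRAM' and evicts with an LRU policy."""

    def __init__(self, num_frames, disk):
        self.num_frames = num_frames   # number of page frames available in DRAM
        self.disk = disk               # dict page_id -> bytes, stands in for disk
        self.frames = OrderedDict()    # page_id -> (data, dirty), oldest first
        self.reads = 0                 # page reads from disk
        self.writes = 0                # page writes (spills) to disk

    def _evict(self):
        # Replacement policy: the least recently used page sits at the front.
        victim, (data, dirty) = self.frames.popitem(last=False)
        if dirty:
            # Spilling: a modified page is written back before its frame is reused.
            self.disk[victim] = data
            self.writes += 1

    def get(self, page_id):
        if page_id in self.frames:
            self.frames.move_to_end(page_id)   # hit: mark as most recently used
            return self.frames[page_id][0]
        if len(self.frames) >= self.num_frames:
            self._evict()                      # no free frame: evict one page
        data = self.disk[page_id]              # miss: one page read from disk
        self.reads += 1
        self.frames[page_id] = (data, False)   # caching: retain the page in DRAM
        return data

    def put(self, page_id, data):
        self.get(page_id)                      # make sure the page is cached
        self.frames[page_id] = (data, True)    # update in DRAM and mark dirty
```

The reads and writes counters are the page I/Os that a disk cost model would count; a different replacement policy (for example MRU or clock) would only change which page _evict picks.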

Disk cost: count the number of page I/Os, then multiply by the page size to convert to bytes
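As a worked example of this counting, the sketch below assumes a sequential scan in which every pass touches each page of the file exactly once; the function name disk_cost and the passes parameter are hypothetical, introduced only for illustration.

```python
def disk_cost(file_bytes, page_size, passes=1):
    """Return (page I/Os, bytes transferred) for full scans of a file."""
    pages = (file_bytes + page_size - 1) // page_size  # round up to whole pages
    page_ios = pages * passes                          # each pass touches every page
    return page_ios, page_ios * page_size              # map page I/Os to bytes

# A 1 GiB file with 4 KiB pages occupies 262144 pages, so two full
# passes cost 524288 page I/Os, i.e. 2 GiB of data moved.
ios, nbytes = disk_cost(1 << 30, 4096, passes=2)
```

Note that the byte count is in whole pages: a file one byte larger than a page still costs two page I/Os per pass.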

Communication cost: count the number of pages (or bytes) sent and received over the network