Lesson 11: Distributed Shared Memory

On readings: Recommended background readings are marked with (^) above. Optional historical or fun readings are marked with (*). If you feel comfortable with the topic already, you may skip these readings.

Notes

Page granularity sharing

DSM systems typically share data at the granularity of pages, in contrast to the cache-line granularity of hardware shared-memory multiprocessors. This is because we need to amortize the (relatively) large cost of communicating over a loosely coupled network. These costs suggest the use of large pages, but there is a problem…
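The amortization argument can be made concrete with a toy cost model (the latency and bandwidth numbers below are made up for illustration): each transfer pays a fixed per-message latency plus a per-byte cost, so larger chunks drive the cost per byte down.

```python
# Toy cost model with hypothetical numbers: moving a chunk over a
# loosely coupled network costs a fixed per-message latency plus a
# per-byte transfer cost. Larger chunks amortize the fixed cost.

LATENCY_US = 100.0    # assumed per-message latency (microseconds)
US_PER_BYTE = 0.01    # assumed per-byte transfer cost (microseconds)

def cost_per_byte(chunk_bytes):
    """Total transfer cost divided by bytes moved."""
    return (LATENCY_US + US_PER_BYTE * chunk_bytes) / chunk_bytes

print(cost_per_byte(64))    # 64 B cache line: latency dominates
print(cost_per_byte(4096))  # 4 KB page: latency is amortized away
```

With these assumed numbers, moving a 64-byte cache line costs roughly 45x more per byte than moving a 4 KB page, which is why DSM favors page-sized units despite the risk discussed next.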

False sharing

We’ve talked about this before, but any time we share data in chunks, there is a possibility of two threads/processes accessing distinct data in separate parts of the chunk. In a system that has to enforce coherence, such false sharing confounds the coherence protocol, generating unnecessary traffic on the interconnect.
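A minimal simulation (all names here are hypothetical, not from any real DSM implementation) shows the ping-ponging: under write-invalidate coherence at page granularity, two nodes writing distinct variables that happen to share a page still invalidate each other on every alternating write.

```python
# Simulated write-invalidate coherence at page granularity. Two nodes
# each write a *distinct* address on the same page; every write still
# has to yank the page from the other node, so the page ping-pongs.

PAGE_SIZE = 4096

def simulate(writes):
    """writes: list of (node, addr). Returns the invalidation count."""
    owner = {}  # page number -> node currently holding it writable
    invalidations = 0
    for node, addr in writes:
        page = addr // PAGE_SIZE
        if owner.get(page) not in (None, node):
            invalidations += 1  # must invalidate the other node's copy
        owner[page] = node
    return invalidations

# Node 0 writes offset 0, node 1 writes offset 8: logically unrelated
# data, but the shared page bounces on every write after the first.
false_sharing = [(i % 2, (i % 2) * 8) for i in range(10)]
print(simulate(false_sharing))  # 9 invalidations for 10 writes

# Same access pattern on separate pages: no coherence traffic at all.
no_sharing = [(i % 2, (i % 2) * PAGE_SIZE) for i in range(10)]
print(simulate(no_sharing))     # 0 invalidations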

Restricting shared memory

Note that in IVY, not all memory can be shared across nodes. Some portion of the address space (in particular, the low portion) is kept local, ensuring fast access. For example, a process's executable is kept in local memory, but its stack is not. The PCB is kept private.

Note on terminology

The authors use the term eventcount to describe a synchronization primitive. You will hopefully recognize this as what we’d today call a counting semaphore.
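As a quick refresher on counting-semaphore semantics, here is a small sketch using Python's `threading.Semaphore` (this illustrates the modern primitive, not the IVY authors' actual eventcount code): `release()` increments the count and `acquire()` blocks until the count is positive.

```python
# Counting-semaphore semantics: acquire() waits until the count is
# positive, then decrements it; release() increments it and wakes a
# waiter if one is blocked.

import threading

sem = threading.Semaphore(0)  # count starts at 0, so waiters block
results = []

def consumer():
    sem.acquire()             # blocks until a producer calls release()
    results.append("woke")

t = threading.Thread(target=consumer)
t.start()
sem.release()                 # signal: count goes 0 -> 1, waiter wakes
t.join()
print(results)                # ['woke']
```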