OS for parallel machines
memory + ICN (interconnection network) challenges
– numa effects
– deep mem hierarchy
– false sharing
principles
cache-conscious decisions
limit shared system data structures
keep memory accesses local
CPU – mem – ICN – mem
CPU -> VPN -> TLB lookup -> miss -> PT lookup -> miss -> locate file -> I/O -> page frame -> (VPN, PFN) -> PT update -> TLB update -> page fault service complete
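The chain above can be walked through in a small toy model. This is only a sketch for following the steps, not kernel code: the class `AddressSpace`, its fields, and the `log` list are hypothetical names, the "file" is a plain dict, and "I/O" is a dict read.

```python
from dataclasses import dataclass, field


@dataclass
class AddressSpace:
    """Toy model of the path above; every name here is hypothetical."""
    backing_file: dict                               # vpn -> page contents "on disk"
    tlb: dict = field(default_factory=dict)          # vpn -> pfn (per-CPU cache)
    page_table: dict = field(default_factory=dict)   # vpn -> pfn
    frames: list = field(default_factory=list)       # index is the pfn
    log: list = field(default_factory=list)          # steps taken, for inspection

    def translate(self, vpn: int) -> int:
        # 1. TLB lookup.
        if vpn in self.tlb:
            self.log.append("tlb hit")
            return self.tlb[vpn]
        self.log.append("tlb miss")

        # 2. Page table lookup.
        if vpn in self.page_table:
            self.log.append("pt hit")
        else:
            self.log.append("pt miss")
            # 3. Locate the backing data and do the "I/O".
            data = self.backing_file[vpn]
            # 4. Allocate a page frame and fill it.
            self.frames.append(data)
            pfn = len(self.frames) - 1
            # 5. PT update with the (vpn, pfn) mapping.
            self.page_table[vpn] = pfn

        # 6. TLB update; page fault service is complete.
        pfn = self.page_table[vpn]
        self.tlb[vpn] = pfn
        return pfn
```

Only steps 3 and 4 (I/O, frame fill) are inherently slow; the PT and TLB updates are the steps that touch data structures that other cores may also be using.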
Parallel OS + page fault service
Easy scenario
-multiprocess workload
T1 on N1 … Tn on Nn
*threads independent
*page tables distinct
*no serialization
Hard scenario
-multithreaded workload
one process: T1, T2, T3, T4 share an address space
T1, T3 on N1; T2, T4 on N2
*address space shared
*page table shared
*shared entries in processor TLBs
Recipe for scalable structure in a parallel OS
for every subsystem
-determine the functional needs of that service
-to ensure concurrent execution of the service
* minimize shared data structures
less sharing -> more scalable
– where possible, replicate/partition system data structures
-> less locking
-> more concurrency
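A minimal sketch of the "partition instead of share" step, using a counter as the system data structure. `SharedCounter`, `PartitionedCounter`, and `run` are hypothetical names, and the sketch assumes each worker thread is the only writer of its own slot (as if pinned to one core). Python threads will not show the real speedup; the point is only the structure: one lock for everyone versus no lock on the update path.

```python
import threading


class SharedCounter:
    """One value, one lock: every update from every core serializes here."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def increment(self, core_id: int) -> None:
        with self._lock:
            self._value += 1

    def total(self) -> int:
        with self._lock:
            return self._value


class PartitionedCounter:
    """One slot per core: an update touches only the caller's own slot."""

    def __init__(self, num_cores: int):
        self._slots = [0] * num_cores

    def increment(self, core_id: int) -> None:
        # No lock: by assumption only the thread for core_id writes this slot.
        self._slots[core_id] += 1

    def total(self) -> int:
        # The (rare) global read pays by visiting every slot.
        return sum(self._slots)


def run(counter, num_cores: int, per_core: int) -> int:
    """Start one worker per simulated core and return the final total."""

    def worker(core_id: int) -> None:
        for _ in range(per_core):
            counter.increment(core_id)

    threads = [threading.Thread(target=worker, args=(c,)) for c in range(num_cores)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.total()
```

Usage: `run(SharedCounter(), 4, 1000)` and `run(PartitionedCounter(4), 4, 1000)` compute the same total; the partitioned version trades a more expensive `total()` for a lock-free update path. In a real kernel the slots would also be padded to separate cache lines to avoid false sharing.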
Tornado’s secret sauce: clustered objects
object reference: illusion of a single object; under the covers, multiple representations
Degree of clustering
– choice of the implementor of the service
* e.g. singleton rep, or one rep per core
* PPC (protected procedure call) to keep the reps consistent
TLB miss – Process -> Region -> FCM -> COR (cached object representative); DRAM -> page frame
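The clustered-object idea (one reference, several representations, degree of clustering chosen by the implementor) might be sketched as below. This is an illustration under assumptions, not Tornado's implementation: `ClusteredObject`, `rep_for`, `for_each_rep`, and the `cores_per_rep` parameter are hypothetical names, reps are created lazily on first use, and a plain loop over the reps stands in for the PPC-based consistency step.

```python
class ClusteredObject:
    """Single object reference; each core resolves it to its cluster's rep."""

    def __init__(self, make_rep, num_cores: int, cores_per_rep: int):
        self._make_rep = make_rep            # factory for one representation
        self._num_cores = num_cores
        self._cores_per_rep = cores_per_rep  # degree of clustering
        self._reps = {}                      # cluster index -> rep, filled lazily

    def rep_for(self, core_id: int):
        """Return the rep serving core_id, creating it on first access.

        Not thread-safe as written; a real system would guard rep creation.
        """
        if not 0 <= core_id < self._num_cores:
            raise ValueError(f"no such core: {core_id}")
        cluster = core_id // self._cores_per_rep
        if cluster not in self._reps:
            self._reps[cluster] = self._make_rep()
        return self._reps[cluster]

    def for_each_rep(self, fn) -> None:
        """Apply fn to every existing rep (stand-in for the consistency step)."""
        for rep in self._reps.values():
            fn(rep)


# Singleton rep: all 4 cores share one representation.
singleton = ClusteredObject(dict, num_cores=4, cores_per_rep=4)

# One rep per core: each core gets its own representation.
per_core = ClusteredObject(dict, num_cores=4, cores_per_rep=1)
```

Callers always go through the same reference (`singleton` or `per_core`), so the degree of clustering can change without touching the code that uses the object. More reps mean less sharing on the common path but more work in `for_each_rep` when the reps must be brought back in sync.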