Dance Hall Architecture
CPUs with private caches on one side of the interconnection network, memory on the other
Write invalidate
CPU -> cache (y -> y') -> main memory: when one CPU writes y to y', the copies of y held in other caches are invalidated
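The write-invalidate idea can be sketched with a toy model. This is only an illustration, not a real coherence protocol: the names `CacheLine`, `cpu_read`, `cpu_write`, and `NCPU` are hypothetical, there is a single shared variable y, and write-through to memory is assumed for simplicity.

```c
#include <assert.h>
#include <stdbool.h>

#define NCPU 4   /* hypothetical number of processors */

/* Toy model: one shared variable y, one cache line per CPU. */
typedef struct {
    int  value;
    bool valid;
} CacheLine;

static int       memory_y;      /* y in main memory (starts at 0) */
static CacheLine cache[NCPU];   /* private copies of y, all invalid at start */

/* Read y on `cpu`: a miss fetches from memory, a hit uses the cached copy. */
int cpu_read(int cpu) {
    assert(cpu >= 0 && cpu < NCPU);
    if (!cache[cpu].valid) {
        cache[cpu].value = memory_y;
        cache[cpu].valid = true;
    }
    return cache[cpu].value;
}

/* Write y -> y' on `cpu`: invalidate every other copy, then update. */
void cpu_write(int cpu, int new_value) {
    assert(cpu >= 0 && cpu < NCPU);
    for (int i = 0; i < NCPU; i++) {
        if (i != cpu) cache[i].valid = false;
    }
    cache[cpu].value = new_value;
    cache[cpu].valid = true;
    memory_y = new_value;       /* write-through, assumed for simplicity */
}
```

After a write by one CPU, the next read on any other CPU misses and picks up y' instead of the stale y.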
Scalability
Expectation: performance increases with more processors
[Figure: performance vs. number of processors, showing the expected and actual curves and the overhead]
Synchronization
primitives for shared memory programming
Barrier
T1, T2, …, Tn: each thread waits at the barrier until all n have arrived
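One common way to build such a barrier is a centralized counter with sense reversal. The notes do not say which design is meant, so treat this as a sketch under that assumption; `Barrier`, `barrier_init`, and `barrier_wait` are hypothetical names, and C11 atomics are assumed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    int         n;       /* number of participating threads */
    atomic_int  count;   /* threads still to arrive in this episode */
    atomic_bool sense;   /* flips each time the barrier opens */
} Barrier;

void barrier_init(Barrier *b, int n) {
    assert(n > 0);
    b->n = n;
    atomic_init(&b->count, n);
    atomic_init(&b->sense, false);
}

/* local_sense is per-thread state and must start out false. */
void barrier_wait(Barrier *b, bool *local_sense) {
    *local_sense = !*local_sense;                /* sense for this episode */
    if (atomic_fetch_sub(&b->count, 1) == 1) {   /* last thread to arrive */
        atomic_store(&b->count, b->n);           /* reset for the next episode */
        atomic_store(&b->sense, *local_sense);   /* release the waiters */
    } else {
        while (atomic_load(&b->sense) != *local_sense) {
            /* spin until the last arrival flips the shared sense */
        }
    }
}
```

Each of T1 … Tn keeps its own `local_sense` and calls `barrier_wait` at the synchronization point. The counter is reset before the sense flips, so a thread that races ahead into the next episode sees a fresh count.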
P1: modify struct(A);
P2: wait for mod; use struct(A);
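The P1/P2 pattern needs only ordinary loads and stores on a flag, with no read-modify-write. A minimal sketch, assuming C11 atomics for the flag; the struct layout and the names `shared_a`, `ready`, `p1_modify`, and `p2_wait_and_use` are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

typedef struct { int x; int y; } A;   /* hypothetical shared struct */

static A          shared_a;
static atomic_int ready;              /* 0: not yet modified, 1: modified */

/* P1: modify struct(A), then publish the change by setting the flag. */
void p1_modify(int x, int y) {
    shared_a.x = x;
    shared_a.y = y;
    /* release: the writes above become visible before the flag does */
    atomic_store_explicit(&ready, 1, memory_order_release);
}

/* P2: wait for the modification, then use struct(A). */
int p2_wait_and_use(void) {
    /* acquire: once the flag reads 1, P1's writes to shared_a are visible */
    while (atomic_load_explicit(&ready, memory_order_acquire) == 0) {
        /* spin */
    }
    assert(atomic_load(&ready) == 1);
    return shared_a.x + shared_a.y;
}
```

In real use P1 and P2 would run on different threads; the flag is written by only one side and read by the other, which is why no atomic RMW instruction is required here.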
Atomic operations
Lock(L):
    // checking L and setting L = 1 must happen as one atomic step
    if (L == 0)
        L = 1;
    else {
        while (L == 1);   // wait
        go back;          // retry from the top
    }
Unlock(L):
    L = 0;
Atomic read-modify-write (RMW) instructions
test-and-set(mem):
    return current value in mem
    set mem to 1
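The semantics of test-and-set can be expressed with a C11 atomic exchange. This is a sketch of the behavior, not a claim about any particular hardware instruction; the function name `test_and_set` is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Atomically return the old value of *mem and set *mem to 1. */
int test_and_set(atomic_int *mem) {
    assert(mem);
    return atomic_exchange(mem, 1);
}
```

A return value of 0 means the caller changed the location from 0 to 1; a return value of 1 means it was already set.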
fetch-and-inc(mem):
    return current value in mem
    increment mem
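Fetch-and-inc maps directly onto a C11 atomic fetch-and-add of 1. Again a sketch of the semantics only; the function name `fetch_and_inc` is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Atomically return the old value of *mem and increment *mem by 1. */
int fetch_and_inc(atomic_int *mem) {
    assert(mem);
    return atomic_fetch_add(mem, 1);
}
```

Because the read and the increment are one atomic step, concurrent callers each receive a distinct old value.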
Latency, waiting time, contention
Lock Algorithm
Barrier Algorithms
Naive Spinlock
Lock(L):
    while (test(L) == locked);                // spin on reads while L is held
    if (test-and-set(L) == locked) go back;   // lost the race, spin again
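The lock loop above can be sketched in C11, assuming `atomic_exchange` stands in for test-and-set. The names `spin_lock`, `spin_unlock`, `LOCKED`, and `UNLOCKED` are hypothetical, and the default sequentially consistent ordering is used for simplicity.

```c
#include <assert.h>
#include <stdatomic.h>

enum { UNLOCKED = 0, LOCKED = 1 };

void spin_lock(atomic_int *L) {
    assert(L);
    for (;;) {
        /* test: spin on plain reads while the lock is held */
        while (atomic_load(L) == LOCKED) {
            /* wait */
        }
        /* test-and-set: an old value of UNLOCKED means this caller won */
        if (atomic_exchange(L, LOCKED) == UNLOCKED) {
            return;
        }
        /* another thread took the lock first: go back and spin again */
    }
}

void spin_unlock(atomic_int *L) {
    assert(L);
    atomic_store(L, UNLOCKED);
}
```

A thread brackets its critical section with `spin_lock(&L)` and `spin_unlock(&L)`. Spinning on the read before attempting the exchange keeps waiting threads from issuing an atomic RMW on every iteration.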