Parallel computing: many threads solving a problem by working together.
Map: tasks read from and write to specific data elements
Gather: tasks compute where to read input from
Scatter: tasks compute where to write output
Stencil: tasks read input from a fixed neighborhood in an array
Transpose: tasks re-order data elements in memory (array, matrix, image, data structure)
struct foo {
    float f;
    int i;
};
foo array[1000];
Transposing this layout: array of structures (AoS) -> structure of arrays (SoA)
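A minimal serial sketch of the array-of-structures to structure-of-arrays transpose, assuming plain C++ on the host where each loop iteration stands in for one task. The names foo_soa and aos_to_soa are hypothetical and not from the notes; only struct foo is.

```cpp
#include <cstddef>
#include <vector>

// Same layout as the notes' struct foo: one float and one int per element.
struct foo {
    float f;
    int i;
};

// Hypothetical structure-of-arrays counterpart: all floats are contiguous,
// and all ints are contiguous.
struct foo_soa {
    std::vector<float> f;
    std::vector<int> i;
};

// Transpose AoS -> SoA. Iteration t plays the role of task t: it reads one
// struct and writes its two fields into the two separate arrays.
foo_soa aos_to_soa(const std::vector<foo>& aos) {
    foo_soa soa;
    soa.f.resize(aos.size());
    soa.i.resize(aos.size());
    for (std::size_t t = 0; t < aos.size(); ++t) {
        soa.f[t] = aos[t].f;
        soa.i[t] = aos[t].i;
    }
    return soa;
}
```

In the SoA layout, tasks that only touch the f fields read adjacent memory locations instead of skipping over the interleaved i fields.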
float out[], in[];
int i = threadIdx.x;
int j = threadIdx.y;
const float pi = 3.1415;

out[i] = pi * in[i];                                   // map
out[i + j*128] = in[j + i*128];                        // transpose
if (i % 2) {
    out[i-1] += pi * in[i]; out[i+1] += pi * in[i];    // scatter
    out[i] = (in[i] + in[i-1] + in[i+1]) * pi / 3.0f;  // gather (only odd threads write, so not a stencil)
}