Parallel computing

Parallel computing: many threads solving a problem by working together.
Map: tasks read from and write to specific data elements
Gather: tasks compute where to read input
Scatter: tasks compute where to write output
Stencil: tasks read input from a fixed neighborhood in an array
Transpose: tasks re-order data elements in memory
(applies to an array, matrix, image, or other data structure)
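
As a minimal sketch of the stencil pattern in the list above, the kernel below averages a fixed 3-point neighborhood for every interior element. The names stencil3 and run_stencil3, the block size of 128, and the copy-through boundary policy are assumptions made for illustration, not part of the notes.

```cuda
#include <cassert>
#include <cuda_runtime.h>
#include <vector>

// 3-point averaging stencil: every interior output element reads the same
// fixed neighborhood (left, center, right) of the input.
__global__ void stencil3(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i > 0 && i < n - 1)
        out[i] = (in[i - 1] + in[i] + in[i + 1]) / 3.0f;
    else if (i < n)
        out[i] = in[i];  // boundary policy (assumption): copy the input through
}

// Hypothetical host helper: fills in[k] = 3k, runs the stencil, returns out.
std::vector<float> run_stencil3(int n)
{
    std::vector<float> h_in(n), h_out(n);
    for (int k = 0; k < n; ++k) h_in[k] = 3.0f * k;

    float *d_in, *d_out;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));
    cudaMemcpy(d_in, h_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    int threads = 128;                         // block size (assumption)
    int blocks = (n + threads - 1) / threads;  // enough blocks to cover n
    stencil3<<<blocks, threads>>>(d_in, d_out, n);
    cudaError_t err = cudaDeviceSynchronize();
    assert(err == cudaSuccess);

    cudaMemcpy(h_out.data(), d_out, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;
}
```

Because the input is linear in the index, the average of the three neighbors equals the center value, which makes the result easy to check by hand.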

struct foo {
	float f;
	int i;
};
foo array[1000];
Transposing the layout: array of structures (AoS) -> structure of arrays (SoA)
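
A minimal sketch of that layout change, assuming one thread per element. The struct foo_soa, the kernel aos_to_soa, and the helper check_aos_to_soa are hypothetical names introduced here. The block size of 256 is also an assumption.

```cuda
#include <cassert>
#include <cuda_runtime.h>
#include <vector>

struct foo {       // same structure as in the notes
    float f;
    int i;
};

// Hypothetical SoA counterpart of "foo array[N]": one device array per field.
struct foo_soa {
    float *f;
    int *i;
};

// Each thread copies one element from the AoS layout into the SoA layout.
__global__ void aos_to_soa(const foo *aos, foo_soa soa, int n)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) {
        soa.f[idx] = aos[idx].f;
        soa.i[idx] = aos[idx].i;
    }
}

// Hypothetical host helper: converts n elements on the GPU and returns true
// if every field arrives unchanged in the SoA arrays.
bool check_aos_to_soa(int n)
{
    std::vector<foo> h_aos(n);
    for (int k = 0; k < n; ++k) { h_aos[k].f = 0.5f * k; h_aos[k].i = k; }

    foo *d_aos;
    foo_soa d_soa;
    cudaMalloc(&d_aos, n * sizeof(foo));
    cudaMalloc(&d_soa.f, n * sizeof(float));
    cudaMalloc(&d_soa.i, n * sizeof(int));
    cudaMemcpy(d_aos, h_aos.data(), n * sizeof(foo), cudaMemcpyHostToDevice);

    int threads = 256;                         // block size (assumption)
    int blocks = (n + threads - 1) / threads;  // enough blocks to cover n
    aos_to_soa<<<blocks, threads>>>(d_aos, d_soa, n);
    cudaError_t err = cudaDeviceSynchronize();
    assert(err == cudaSuccess);

    std::vector<float> h_f(n);
    std::vector<int> h_i(n);
    cudaMemcpy(h_f.data(), d_soa.f, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaMemcpy(h_i.data(), d_soa.i, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_aos);
    cudaFree(d_soa.f);
    cudaFree(d_soa.i);

    bool ok = true;
    for (int k = 0; k < n; ++k)
        ok = ok && h_f[k] == h_aos[k].f && h_i[k] == h_aos[k].i;
    return ok;
}
```

In the SoA layout, neighboring threads that read the same field touch neighboring addresses, which is the usual motivation for this transpose.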

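float out[], in[];	// output and input arrays (sizes omitted here)
int i = threadIdx.x;
int j = threadIdx.y;

const float pi = 3.1415f;

out[i] = pi*in[i];	// map: one input element -> one output element
out[i + j*128] = in[j + i*128];	// transpose: element (i, j) moves to (j, i) of a 128-wide array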

if (i%2){
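	// scatter: each odd thread computes where to write
	out[i-1] += pi * in[i]; out[i+1] += pi * in[i];
	// gather: each odd thread reads three inputs for one output
	// (stencil-like, but only odd threads write)
	out[i] = (in[i]+ in[i-1] + in[i+1]) * pi/3.0f;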
}
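
The transpose line above indexes with threadIdx alone, which would need 128 x 128 threads in a single block. Current CUDA GPUs limit a block to 1024 threads, so a runnable version has to spread the work over several blocks. The following is a sketch under that assumption. The tile size of 16 and the names transpose and check_transpose are hypothetical choices. Only the width of 128 comes from the notes.

```cuda
#include <cassert>
#include <cuda_runtime.h>
#include <vector>

constexpr int N = 128;    // matrix side, from the index arithmetic in the notes
constexpr int TILE = 16;  // block side (assumption): 16 * 16 = 256 threads

// Same expression as in the notes, with global indices built from
// block and thread coordinates.
__global__ void transpose(const float *in, float *out)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    if (i < N && j < N)
        out[i + j * N] = in[j + i * N];
}

// Hypothetical host helper: transposes an N x N matrix on the GPU and
// returns true if out(r, c) == in(c, r) for every element.
bool check_transpose()
{
    std::vector<float> h_in(N * N), h_out(N * N);
    for (int k = 0; k < N * N; ++k) h_in[k] = static_cast<float>(k);

    float *d_in, *d_out;
    cudaMalloc(&d_in, N * N * sizeof(float));
    cudaMalloc(&d_out, N * N * sizeof(float));
    cudaMemcpy(d_in, h_in.data(), N * N * sizeof(float), cudaMemcpyHostToDevice);

    dim3 threads(TILE, TILE);
    dim3 blocks(N / TILE, N / TILE);  // N is a multiple of TILE here
    transpose<<<blocks, threads>>>(d_in, d_out);
    cudaError_t err = cudaDeviceSynchronize();
    assert(err == cudaSuccess);

    cudaMemcpy(h_out.data(), d_out, N * N * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);

    bool ok = true;
    for (int r = 0; r < N; ++r)
        for (int c = 0; c < N; ++c)
            ok = ok && h_out[c + r * N] == h_in[r + c * N];
    return ok;
}
```

This naive version keeps the notes' indexing as is. It makes no attempt at coalesced writes or shared-memory tiling.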