Power-efficient

CPU: optimizes for latency (time per task, in seconds)
GPU: optimizes for throughput (work per unit time, e.g. jobs/hour)

Latency vs Bandwidth

car:
latency: 22.5 hours
throughput: 0.089 people/hour
bus:
latency: 90 hours
throughput: 0.45 people/hour
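
A minimal sketch of how the throughput figures relate to the latencies, assuming each vehicle makes one trip and throughput is people delivered divided by trip time. The latencies are from the notes; the passenger counts (2 for the car, 40 for the bus) and the helper name `throughput` are assumptions for illustration, picked because they roughly reproduce the numbers above.

```python
def throughput(people: float, latency_hours: float) -> float:
    """People delivered per hour when one trip moves `people` and takes `latency_hours`."""
    return people / latency_hours


# Latencies come from the notes; passenger counts are assumed, not from the notes.
car = throughput(people=2, latency_hours=22.5)
bus = throughput(people=40, latency_hours=90)

print(f"car: {car:.3f} people/hour")
print(f"bus: {bus:.3f} people/hour")
print(f"bus/car throughput ratio: {bus / car:.1f}x")
```

The bus has 4x the latency of the car, yet under these assumptions it delivers several times more people per hour. This is the CPU vs GPU trade-off in miniature.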

8-core CPU (Intel)
8-wide AVX vector operations / core
2 threads / core (hyperthreading)
128-way parallelism (8 cores × 8 vector lanes × 2 threads)
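
The 128-way figure is the product of the three numbers listed above. A small sketch of that arithmetic follows; the function and parameter names are hypothetical and used only for illustration.

```python
def parallelism(cores: int, vector_width: int, threads_per_core: int) -> int:
    """Peak number of operations that can be in flight at once."""
    return cores * vector_width * threads_per_core


# Values taken from the notes: 8 cores, 8-wide AVX, 2 hyperthreads per core.
print(parallelism(cores=8, vector_width=8, threads_per_core=2))
```

Code that uses a single thread and no vector instructions reaches only 1 of those 128 ways.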

CUDA program
Written in C with extensions; one program contains code for both the CPU (“host”) and the GPU (“device”).