PAR Class 14, Wed 2018-05-02

W Randolph Franklin, RPI

1 Course recap

My teaching style is to work from particulars to the general.
You've seen OpenMP, a tool for shared memory parallelism.
You've seen the architecture of NVidia's GPU, a widely used parallel system, and CUDA, a tool for programming it.
You've seen Thrust, a tool on top of CUDA, built in the C++ STL style.
You've seen how widely used numerical tools like BLAS and FFT have versions built on CUDA.
You've had a chance to program in all of them on parallel.ecse, which has dual 14-core Xeons, a Pascal NVidia board, and a Xeon Phi coprocessor.
You've seen talks by leaders in high performance computing, such as Jack Dongarra.
You've seen quick references to parallel programming using Matlab, Mathematica, and the cloud.
Now you can reason inductively towards general design rules for shared-memory and non-shared-memory parallel computers, and for the software tools that exploit them.