PAR Homework 4, due Thu 2020-02-13, noon
Rules
- Submit the answers to Gradescope.
- You may do homework in teams of two students. Create a Gradescope team and make one submission with both names.
- For redundancy, at the top of your submitted homework write what it is and who you are. E.g., "Parallel Homework 2, 1/30/20, by Boris Badenov and Natasha Fatale".
Questions
These may require some research.
- (10 points) Compare and contrast these different types of memory on the Nvidia GPU. How big are they? How fast are they? How widely visible are they, i.e., which threads can access each one?
- register
- shared
- local
- global
- (10 points) Compare and contrast the following on size, synchronization of their components, and order in the hierarchy.
- thread block
- thread warp
- grid
- (10 points) In what way is the simple CUDA example in Module 2 obsolete?
- (10 points) On parallel, according to /local/cuda/samples/0_Simple/matrixMulCUBLAS, how many FLOPS does the RTX 8000 achieve? How does that compare to your program (that didn't use the GPU)?
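For the comparison in the last question, one possible way to put your own program on the same scale is to convert its run time into a FLOPS figure. The sketch below assumes your program is a dense matrix multiply and uses the usual count of one multiply and one add per inner-product term. The function names, matrix size, and timing are hypothetical placeholders, not measurements; substitute your own.

```python
def matmul_flop_count(m, n, k):
    """Floating-point operations in an (m x k) times (k x n) product,
    counting one multiply and one add per inner-product term."""
    return 2 * m * n * k


def gflops(m, n, k, seconds):
    """Achieved rate in GFLOP/s for a product that took `seconds` to run."""
    return matmul_flop_count(m, n, k) / seconds / 1e9


if __name__ == "__main__":
    # Hypothetical values for illustration only: replace with the matrix
    # size and wall-clock time from your own CPU run.
    size = 1024
    elapsed_seconds = 0.5
    rate = gflops(size, size, size, elapsed_seconds)
    print(f"CPU run: {rate:.2f} GFLOP/s")
```

Dividing the GPU figure by this number gives the speedup factor, provided both figures are in the same units.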
Total: 40 pts.