PAR Homework 4, due Thu 2020-02-13, noon

Rules

  1. Submit the answers to Gradescope.
  2. You may do homework in teams of 2 students. Create a Gradescope team and make one submission with both names.
  3. For redundancy, at the top of your submitted homework write what it is and who you are. E.g., "Parallel Homework 2, 1/30/20, by Boris Badenov and Natasha Fatale".

Questions

These may require some research.

  1. (10 points) Compare and contrast these types of memory on the Nvidia GPU. How big are they? How fast are they? What is the scope of their visibility?
    1. register
    2. shared
    3. local
    4. global
  2. (10 points) Compare and contrast the following on size, synchronization of their components, and order in the hierarchy.
    1. thread block
    2. thread warp
    3. grid
  3. (10 points) In what way is the simple CUDA example in Module 2 obsolete?
  4. (10 points) On parallel, according to /local/cuda/samples/0_Simple/matrixMulCUBLAS, how many FLOPS does the RTX 8000 achieve? How does that compare to your program (that didn't use the GPU)?
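
For question 4, one way to put a CPU program on the same scale as the GPU figure is to convert a timed matrix multiply into FLOPS. The following is a minimal sketch, assuming the usual convention that a dense (m x k) by (k x n) product costs 2*m*n*k floating-point operations. The function name `matmul_gflops` and the sizes and time in the usage line are hypothetical and are not taken from the sample's output.

```python
def matmul_gflops(m: int, n: int, k: int, seconds: float) -> float:
    """Return GFLOPS for an (m x k) @ (k x n) multiply that took `seconds`."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # Assumption: one multiply and one add per inner-product term.
    flops = 2.0 * m * n * k
    return flops / seconds / 1e9


if __name__ == "__main__":
    # Hypothetical measurement: a 1024 x 1024 x 1024 multiply in 0.5 s.
    print(f"{matmul_gflops(1024, 1024, 1024, 0.5):.2f} GFLOPS")
```

Substitute your own matrix dimensions and measured wall-clock time, then compare the result with the rate the sample reports.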

Total: 40 pts.