
PAR Homework 3, Thu 2022-01-27

Rules

  1. Due next Thu 2022-02-03

  2. Submit the answers to Gradescope.

  3. You may do homework in teams of 2 students. Create a Gradescope team and make one submission with both names.

Questions

  1. (10 pts) Investigate how the number of threads affects elapsed time and errors. Test sum, sum_atomic, sum_crit, and sum_reduc with 1, 2, 8, 16, 32, and 64 threads. Modify the programs as needed (for example, by increasing the amount of work) so that they run long enough for you to measure the elapsed times.

  2. (10 pts) Look at https://docs.nvidia.com/hpc-sdk/compilers/hpc-compilers-user-guide/index.html#openmp-use

    Try to modify sum_atomic and sum_crit to run on the GPU (using OpenMP). Full points for a good try, whether or not you succeed.

  3. (5 pts) Why is matmul4 segfaulting on n=1000? Try to fix it.

Total: 25
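
For question 1, one possible way to organize the measurements is sketched below. It is not one of the course programs: sum_with_threads is a hypothetical stand-in for a reduction-style sum, and now_seconds and time_one_run are illustrative helper names. The sketch assumes a compiler with OpenMP support (for example, gcc with -fopenmp). Without OpenMP the pragma is ignored and the loop runs serially.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Seconds from a timer; uses omp_get_wtime() when built with OpenMP,
   otherwise falls back to CPU time from clock(). */
double now_seconds(void) {
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Hypothetical stand-in for a reduction-style sum (not the course code):
   adds 0, 1, ..., n-1 using the requested number of threads. */
double sum_with_threads(long n, int nthreads) {
    assert(n >= 0 && nthreads > 0);
    double total = 0.0;
#pragma omp parallel for num_threads(nthreads) reduction(+ : total)
    for (long i = 0; i < n; i++) {
        total += (double)i;
    }
    return total;
}

/* Runs the sum once, prints thread count, elapsed time, and absolute
   error against the closed form n(n-1)/2, and returns that error. */
double time_one_run(long n, int nthreads) {
    double expected = 0.5 * (double)n * (double)(n - 1);
    double start = now_seconds();
    double got = sum_with_threads(n, nthreads);
    double elapsed = now_seconds() - start;
    double err = got - expected;
    if (err < 0.0) {
        err = -err;
    }
    printf("%2d threads: %.6f s, abs error %.3g\n", nthreads, elapsed, err);
    return err;
}
```

Calling time_one_run(n, t) for each t in 1, 2, 8, 16, 32, 64, with n large enough that each run takes a measurable time, gives one line per thread count. The same timing wrapper could be placed around the other sum variants to compare their elapsed times and errors.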