PAR Homework 3, Thu 2022-01-27
Rules
Due next Thu 2022-02-03
Submit the answers to Gradescope.
You may do homework in teams of 2 students. Create a Gradescope team and make one submission with both names.
Questions
(10 pts) Investigate how the number of threads affects elapsed time and errors. Test sum, sum_atomic, sum_crit, and sum_reduc with 1, 2, 8, 16, 32, and 64 threads. Modify the programs as needed so that they run slowly enough for you to measure the elapsed times.
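One possible way to organize the measurements is sketched below. It is not the course's sum_reduc source: the names timed_sum_reduc and expected_sum are hypothetical, and the loop simply sums 0..n-1 so that the exact answer is known and errors can be detected. Build with your compiler's OpenMP flag (for example -fopenmp with g++); without it the pragma is ignored and the loop runs serially.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <utility>

// Hypothetical helper (not from the course code): sums 0..n-1 with an
// OpenMP reduction on the requested number of threads.
// Returns {computed sum, elapsed wall-clock seconds}.
std::pair<std::int64_t, double> timed_sum_reduc(std::int64_t n, int nthreads) {
  assert(nthreads >= 1);
  (void)nthreads;  // avoids an unused-parameter warning when OpenMP is off
  const auto start = std::chrono::steady_clock::now();
  std::int64_t total = 0;
  #pragma omp parallel for num_threads(nthreads) reduction(+:total)
  for (std::int64_t i = 0; i < n; ++i) {
    total += i;
  }
  const auto stop = std::chrono::steady_clock::now();
  return {total, std::chrono::duration<double>(stop - start).count()};
}

// Exact value of 0 + 1 + ... + (n-1); compare against it to count errors.
std::int64_t expected_sum(std::int64_t n) { return n * (n - 1) / 2; }
```

Calling timed_sum_reduc(n, t) for each t in 1, 2, 8, 16, 32, 64 and comparing the first element of the result with expected_sum(n) gives one row of a time-versus-threads table. Increase n until the elapsed time is well above the timer's resolution.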
(10 pts) Look at https://docs.nvidia.com/hpc-sdk/compilers/hpc-compilers-user-guide/index.html#openmp-use
Try to modify sum_atomic and sum_crit to run on the GPU (using OpenMP). Full points for a good try, whether or not you succeed.
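As a starting point only, here is a minimal sketch of what an offloaded atomic sum might look like. It is not the course's sum_atomic: the function name sum_atomic_target is hypothetical, and the loop body is assumed to be a simple integer accumulation. With the NVIDIA HPC compilers, GPU offload is requested with -mp=gpu (for example nvc++ -mp=gpu); if no device is available, or if OpenMP is disabled, the loop runs on the host.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for sum_atomic: every iteration adds to one shared
// total, protected by an atomic update, inside an OpenMP target region.
std::int64_t sum_atomic_target(std::int64_t n) {
  assert(n >= 0);
  std::int64_t total = 0;
  // map(tofrom: total) copies total to the device and back to the host.
  #pragma omp target teams distribute parallel for map(tofrom: total)
  for (std::int64_t i = 0; i < n; ++i) {
    #pragma omp atomic update
    total += i;
  }
  return total;
}
```

For example, sum_atomic_target(1000) should return 499500. Whether the same pattern carries over to sum_crit is part of what the question asks you to investigate.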
(5 pts) Why is matmul4 segfaulting on n=1000? Try to fix it.
Total: 25