PAR Homework 3, due Thu 2021-02-15, noon
Rules
Submit the answers to Gradescope.
You may do the homework in teams of two students. Create a Gradescope team and make one submission with both names.
Question
The goal is to measure whether OpenMP actually makes matrix multiplication faster, with and without SIMD.
You may use anything in /parallel-class/openmp that seems useful.
-
Write a C++ program on parallel.ecse that pseudorandomly initializes and multiplies two 100x100 float matrices. One possible initialization:
a[i][j] = i*3 + (j*j)%5; b[i][j] = i*2 + (j*j)%7;
(10 points) Report the elapsed time. Include the program listing.
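As a starting point, the serial version might be sketched as follows. This is only one possible layout, not a required one: the names init, multiply, and time_multiply, the flat row-major std::vector storage, and the use of std::chrono::steady_clock for timing are all assumptions made for this sketch.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

constexpr std::size_t N = 100;      // matrix dimension from the assignment
using Matrix = std::vector<float>;  // row-major N*N storage (a choice made for this sketch)

// Fill a and b with the initialization suggested in the assignment.
void init(Matrix& a, Matrix& b) {
    assert(a.size() == N * N && b.size() == N * N);
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j) {
            a[i * N + j] = static_cast<float>(i * 3 + (j * j) % 5);
            b[i * N + j] = static_cast<float>(i * 2 + (j * j) % 7);
        }
}

// Serial c = a * b using the textbook i-j-k loop order.
void multiply(const Matrix& a, const Matrix& b, Matrix& c) {
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j) {
            float sum = 0.0f;
            for (std::size_t k = 0; k < N; ++k)
                sum += a[i * N + k] * b[k * N + j];
            c[i * N + j] = sum;
        }
}

// Time one multiplication and return the elapsed wall-clock seconds.
double time_multiply(const Matrix& a, const Matrix& b, Matrix& c) {
    const auto start = std::chrono::steady_clock::now();
    multiply(a, b, c);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

A main function would allocate three Matrix objects of size N * N, call init, and print the value returned by time_multiply. A single 100x100 multiplication is short, so timing several repetitions and averaging may give steadier numbers.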
Add an OpenMP pragma to do the work in parallel.
-
(10 points) Report the elapsed time, varying the number of threads thus: 1, 2, 4, 8, 16, 32, 64.
What do you conclude?
(5 points) Repeat that two more times to see how consistent the times are.
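One way to vary the thread count is the num_threads clause on the parallel loop, sketched below. The function names multiply_parallel and time_with_threads and the flat row-major storage are assumptions of this sketch; setting the OMP_NUM_THREADS environment variable is an alternative that needs no code change.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

constexpr int N = 100;  // matrix dimension from the assignment

// c = a * b with the rows of c divided among `threads` OpenMP threads.
// All matrices are row-major vectors of size N*N (a choice made for this sketch).
void multiply_parallel(const std::vector<float>& a, const std::vector<float>& b,
                       std::vector<float>& c, int threads) {
    assert(threads > 0);
    #pragma omp parallel for num_threads(threads)
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j) {
            float sum = 0.0f;
            for (int k = 0; k < N; ++k)
                sum += a[i * N + k] * b[k * N + j];
            c[i * N + j] = sum;
        }
}

// Build the assignment's test matrices, multiply once with the given
// thread count, store the product in c, and return the elapsed seconds.
double time_with_threads(int threads, std::vector<float>& c) {
    std::vector<float> a(N * N), b(N * N);
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j) {
            a[i * N + j] = static_cast<float>(i * 3 + (j * j) % 5);
            b[i * N + j] = static_cast<float>(i * 2 + (j * j) % 7);
        }
    c.assign(N * N, 0.0f);
    const auto start = std::chrono::steady_clock::now();
    multiply_parallel(a, b, c, threads);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling time_with_threads in a loop over 1, 2, 4, 8, 16, 32, 64 and printing each result gives one row of the requested table. With g++, the pragma takes effect only when compiling with -fopenmp; without that flag it is ignored and the loop runs serially.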
-
(10 points) Modify the pragma to use SIMD.
Report the elapsed time, varying the number of threads thus: 1, 2, 4, 8, 16, 32, 64.
What do you conclude?
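There is more than one way to bring SIMD into the loop nest. The sketch below keeps the threaded outer loop and adds a separate omp simd pragma on a unit-stride inner loop, which requires switching to the i-k-j loop order. The names fill_inputs and multiply_simd and the loop reordering are assumptions of this sketch; a combined parallel for simd construct is another option worth comparing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int N = 100;  // matrix dimension from the assignment

// Fill a and b with the initialization suggested in the assignment.
void fill_inputs(std::vector<float>& a, std::vector<float>& b) {
    a.assign(N * N, 0.0f);
    b.assign(N * N, 0.0f);
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j) {
            a[i * N + j] = static_cast<float>(i * 3 + (j * j) % 5);
            b[i * N + j] = static_cast<float>(i * 2 + (j * j) % 7);
        }
}

// c = a * b. Rows of c are divided among threads; within a row, the
// innermost loop walks contiguous memory and is marked for vectorization.
void multiply_simd(const std::vector<float>& a, const std::vector<float>& b,
                   std::vector<float>& c, int threads) {
    assert(threads > 0);
    assert(c.size() == static_cast<std::size_t>(N) * N);
    #pragma omp parallel for num_threads(threads)
    for (int i = 0; i < N; ++i) {
        float* crow = &c[i * N];
        for (int j = 0; j < N; ++j)
            crow[j] = 0.0f;
        for (int k = 0; k < N; ++k) {
            const float aik = a[i * N + k];
            const float* brow = &b[k * N];
            #pragma omp simd
            for (int j = 0; j < N; ++j)
                crow[j] += aik * brow[j];
        }
    }
}
```

Because every entry produced by the suggested initialization is a small integer, the product is exact in float regardless of summation order, so the SIMD result can be compared element by element with the serial one.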
(10 points) Compile and run your program with two different levels of compiler optimization, -O1 and -O3, and report the elapsed time for each. Modify your program so that the optimizer cannot eliminate the computation entirely, e.g., by printing a few values of the result.
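One way to keep the optimizer from discarding the multiplication is to make the program's output depend on the result, as sketched below. The helper names checksum and report are hypothetical, and the sketch assumes the product is stored as a row-major vector of n*n floats.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Sum every element of the product. Printing this value makes the whole
// computation observable, so the compiler cannot remove it as dead code.
double checksum(const std::vector<float>& c) {
    double sum = 0.0;
    for (float v : c)
        sum += v;
    return sum;
}

// Print the elapsed time together with a few values of the n x n result.
void report(const std::vector<float>& c, int n, double seconds) {
    assert(n > 0 && c.size() == static_cast<std::size_t>(n) * n);
    std::printf("elapsed: %g s\n", seconds);
    std::printf("c[0][0] = %g, c[n-1][n-1] = %g, checksum = %g\n",
                c[0], c[c.size() - 1], checksum(c));
}
```

With g++, the two builds could be produced with commands such as g++ -O1 -fopenmp mm.cpp and g++ -O3 -fopenmp mm.cpp, where mm.cpp is a placeholder file name.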
(5 points) What do you conclude about everything?
Total: 50 pts.