Homework 2, due Thurs 2014-02-07
- Install CUDA on your Linux system. The instructions are here:
http://docs.nvidia.com/cuda/cuda-getting-started-guide-for-linux/index.html
Note the need to set PATH and LD_LIBRARY_PATH. Then run nvcc -V to confirm that the compiler is found.
- On 2014-01-30 I debugged omp_bug1.c, omp_bug2.c, omp_bug3.c, and, sort of, omp_bug4.c on
https://computing.llnl.gov/tutorials/openMP/exercise.html.
- Finish exploring omp_bug4.c. Specifically, determine which combinations of the system stack size (accessed via ulimit), the OpenMP stack size, and the system virtual memory limit work.
- Debug omp_bug5.c and omp_bug6.c.
- I also demoed http://people.sc.fsu.edu/~%20jburkardt/c_src/heated_plate_openmp/ in class.
- Explore the parallel speedup. Run it with varying numbers of threads,
up to the max that your system allows, and report the time for each.
- Remove the OpenMP include file, compile without -fopenmp, and report the
time. How does this compare to the OpenMP time with 1 thread?
- http://people.sc.fsu.edu/~%20jburkardt/cpp_src/dijkstra_openmp/ is a less regularly structured program.
- Explore its parallel speedup as a function of the number of threads.
- Explore the effect of different scheduling strategies by comparing static, dynamic, etc.
- Make up a program to test the efficiency of parallel sections. It should
have 4 separate sections, each doing something compute-bound; I don't
care what. Run it under OpenMP with varying numbers of threads. Report your
observations.