PAR Class 26, Mon 2022-04-18

1 Final project

1.1 Talks today

  1. Phu TT, Mark B

  2. Ivan H

  3. Luoyan Z

  4. Sissi J

1.2 Prior work

If your final project builds on, or shares code with, another course or project (say on GitHub), then you must give the details and say what's new for this course.

1.3 Paper format

Try to use the IEEE conference format. It allows either LaTeX or MS Word. Submit the PDF paper to Gradescope.

1.4 Deliverables

From the syllabus (modified):

  1. An implementation showing parallel computing.

  2. An extended abstract or paper on your project. Follow the style guide for some major conference (I don't care which, but can point you to one).

  3. A more detailed manual, showing how to use it.

  4. A 10-minute project presentation.

  5. A write-up uploaded on the last class day. This will contain an academic paper, code, and perhaps a video or user manual.

2 No lecture on Thurs

I'll assign some good things to watch.

3 Software tips

3.1 Git

Git is good for keeping several versions of a project simultaneously. Here's a quick git intro:

Create a dir for the project:

mkdir PROJECT; cd PROJECT

Initialize:

git init

Create a branch (you can do this several times):

git branch MYBRANCHNAME

Go to a branch:

git checkout MYBRANCHNAME

Do things:

vi, make, ....

Save it:

git add .; git commit -m "COMMENT"

Repeat

At times I've used this to modify a program for class while keeping a copy of the original.
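
For example, here's roughly what that looks like (the branch and file names are hypothetical):

git checkout -b classdemo    # create a branch and switch to it in one step
vi example.cc; make          # modify the program for class
git add .; git commit -m "class variant"
git checkout master          # or main; the original version is untouched there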

3.2 Freeze decisions early: SW design paradigm

One of my rules is to push design decisions to take effect as early in the process execution as possible. Constructing variables at compile time is best, at function-call time is second best, and on the heap is worst.

  1. If I have to construct variables on the heap, I construct a few large variables, never many small ones.

  2. Often I compile the max dataset size into the program, which permits constructing the arrays at compile time; see the first sketch after this list. Recompiling for a larger dataset is quick (unless you're using CUDA).

    Accessing this type of variable uses one less level of pointer indirection than accessing a variable on the heap. I don't know whether this is faster with a good optimizing compiler, but it's probably not slower.

  3. If the dataset has unpredictably sized components, such as a ragged array, then I may do the following (see the second sketch below).

    1. Read the data once to accumulate the necessary statistics.

    2. Construct the required ragged array.

    3. Reread the data and populate the array.
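
Here's the first sketch, making item 2 concrete. It's a minimal hypothetical example; MAXN, the arrays, and the function are my invented names, not course code.

// Max dataset size frozen at compile time; recompile with a bigger
// MAXN for a larger dataset.
#include <vector>

constexpr int MAXN = 1'000'000;
static float data[MAXN];         // static storage: address fixed at link time

void fill(int n) {               // caller guarantees n <= MAXN
    std::vector<float> heap(n);  // heap alternative: sized at run time
    for (int i = 0; i < n; ++i) {
        data[i] = 2.0f * i;      // direct access, no pointer dereference
        heap[i] = data[i];       // one extra indirection through the vector
    }
}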

Update: However, with CUDA, maybe managed variables must be on the heap.
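
And here's the second sketch: the two-pass ragged-array pattern from item 3, which also follows item 1's advice by making one large allocation instead of many small per-row ones. The input format and all names are invented for illustration.

// Hypothetical input: lines of "row value" pairs, rows in any order.
#include <cstdio>
#include <vector>

int main() {
    FILE *f = fopen("data.txt", "r");
    if (!f) return 1;
    int row; double val;

    // Pass 1: read the data once, accumulating per-row counts.
    std::vector<int> count;
    while (fscanf(f, "%d %lf", &row, &val) == 2) {
        if (row >= (int)count.size()) count.resize(row + 1, 0);
        ++count[row];
    }

    // Construct the ragged array: one big block of values plus an
    // offset table, rather than one allocation per row.
    std::vector<int> start(count.size() + 1, 0);
    for (int r = 0; r < (int)count.size(); ++r)
        start[r + 1] = start[r] + count[r];
    std::vector<double> data(start.back());

    // Pass 2: reread the data and populate the array.
    rewind(f);
    std::vector<int> next(start.begin(), start.end() - 1);
    while (fscanf(f, "%d %lf", &row, &val) == 2)
        data[next[row]++] = val;
    fclose(f);
    // Row r's values are data[start[r]] .. data[start[r+1]-1].
}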

4 Exascale computing

https://www.nextplatform.com/2019/12/04/openacc-cozies-up-to-c-c-and-fortran-standards/

https://www.nextplatform.com/2019/01/09/two-thirds-of-the-way-home-with-exascale-programming/

They agree with me that fewer, bigger parallel processors are better than many smaller processors connected with MPI. I.e., they like my priorities in this course.

5 CppCon

CppCon is the annual, week-long face-to-face gathering for the entire C++ community. https://cppcon.org/

https://cppcon2018.sched.com/

CppCon 2014: Herb Sutter "Paying for Lunch: C++ in the ManyCore Age" https://www.youtube.com/watch?v=AfI_0GzLWQ8

CppCon 2016: Combine Lambdas and weak_ptrs to make concurrency easy (4min) https://www.youtube.com/watch?v=fEnnmpdZllQ

A Pragmatic Introduction to Multicore Synchronization by Samy Al Bahra. https://www.youtube.com/watch?v=LX4ugnzwggg

CppCon 2018: Tsung-Wei Huang “Fast Parallel Programming using Modern C++” https://www.youtube.com/watch?v=ho9bqIJkvkc&list=PLHTh1InhhwT7GoW7bjOEe2EjyOTkm6Mgd&index=13

CppCon 2018: Anny Gakhokidze “Workflow hacks for developers” https://www.youtube.com/watch?v=K4XxeB1Duyo&list=PLHTh1InhhwT7GoW7bjOEe2EjyOTkm6Mgd&index=33

CppCon 2018: Bjarne Stroustrup “Concepts: The Future of Generic Programming (the future is here)” https://www.youtube.com/watch?v=HddFGPTAmtU

CppCon 2018: Herb Sutter “Thoughts on a more powerful and simpler C++ (5 of N)” https://www.youtube.com/watch?v=80BZxujhY38

CppCon 2018: Jefferson Amstutz “Compute More in Less Time Using C++ Simd Wrapper Libraries” https://www.youtube.com/watch?v=80BZxujhY38

CppCon 2018: Geoffrey Romer “What do you mean "thread-safe"?” https://www.youtube.com/watch?v=s5PCh_FaMfM

6 Supercomputing conference

https://supercomputing.org/

"The International Conference for High Performance Computing, Networking, Storage, and Analysis"

7 Jack Dongarra

He has visited RPI more than once.

An Overview of High Performance Computing and Future Requirements, Jul 17, 2021 (35:03)

Abstract: In this talk we examine how high performance computing has changed over the last 10 years and look toward the future in terms of trends. These changes have had and will continue to have a major impact on our numerical scientific software. A new generation of software libraries and algorithms is needed for the effective and reliable use of (wide area) dynamic, distributed and parallel environments. Some of the software and algorithm challenges have already been encountered, such as management of communication and memory hierarchies through a combination of compile-time and run-time techniques, but the increased scale of computation, depth of memory hierarchies, range of latencies, and increased run-time environment variability will make these problems much harder.

About the Speaker: Jack Dongarra holds an appointment at the University of Tennessee, Oak Ridge National Laboratory, and the University of Manchester. He specializes in numerical algorithms in linear algebra, parallel computing, use of advanced computer architectures, programming methodology, and tools for parallel computers. He was awarded the IEEE Sid Fernbach Award in 2004; in 2008 he was the recipient of the first IEEE Medal of Excellence in Scalable Computing; in 2010 he was the first recipient of the SIAM Special Interest Group on Supercomputing's award for Career Achievement; in 2011 he was the recipient of the IEEE Charles Babbage Award; in 2013 he received the ACM/IEEE Ken Kennedy Award; in 2019 he received the ACM/SIAM Computational Science and Engineering Prize; and in 2020 he received the IEEE Computer Pioneer Award. He is a Fellow of the AAAS, ACM, IEEE, and SIAM; a foreign member of the Russian Academy of Sciences; a foreign member of the British Royal Society; and a member of the US National Academy of Engineering.

8 12 Ways to Fool the Masses with Irreproducible Results

https://lorenabarba.com/news/keynote-at-the-35th-international-parallel-and-distributed-processing-symposium-ipdps/

Lorena Barba IPDPS21 keynote May 20, 2021 (37:11)

Keynote at the IEEE International Parallel and Distributed Processing Symposium, May 19, 2021

Abstract: Thirty years ago, David Bailey published a humorous piece in the Supercomputing Review magazine, listing 12 ways of presenting results to artificially boost performance claims. That was at a time when the debate was between Cray "two-oxen" machines versus parallel "thousand-chickens" systems, when parallel standards (like MPI) were still unavailable, and the Top500 list didn't yet exist. In the years since, David and others updated the list of tricks a few times, notably in 2010–11 (when the marketing departments of Intel and Nvidia were really going at each other): Georg Hager in his blog and Scott Pakin in HPC Wire. Heterogeneity of computing systems has only escalated in the last decade, and many remiss reporting tactics continue unabated. Alas, two new ingredients have entered into the mix: wide adoption of machine learning techniques both in the science applications and systems research; and a swell of concern over reproducibility and replicability. My talk will be a new twist on the 12 ways to fool the masses, focusing on how researchers in computational science and high-performance computing miss the mark when conducting or reporting their results with poor reproducibility. By showcasing in a lighthearted manner a set of anti-patterns, I aim to encourage us to see the value and commit to adapting our practice to achieve more trustworthy scientific evidence with high-performance computing.

There's a link to the slides there.

9 Course survey

I see my role as a curator, selecting the best stuff to present to you. Since I like the topic (I asked for permission to create this course), I pick things I like, hoping you might like them too.

So, if you liked the course, then please officially tell RPI by completing the survey and saying what you think. Thanks.

10 After the semester

I'm open to questions and discussions about any legal, ethical topic, even after you graduate.