PAR Syllabus

1 Catalog info

Title

ECSE-4740-01 Applied Parallel Computing for Engineers

Semesters

Spring term annually

Credits

3 credit hours

Time and place

Mon and Thurs noon-1:20pm. Online for the 1st 2 weeks, then in class

2 Description

  1. This is intended to be a computer engineering course to provide students with knowledge and hands-on experience in developing applications software for affordable parallel processors. This course will mostly cover hardware that any lab can afford to purchase. It will cover the software that, in the prof's opinion, is the most useful. There will also be some theory.

  2. The target audiences are ECSE seniors and grads and others with comparable background who wish to develop parallel software.

  3. This course will have minimal overlap with parallel courses in Computer Science. We will not teach the IBM BlueGene, because it is so expensive, nor cloud computing and MPI, because most big data problems are in fact small enough to fit on our hardware.

  4. You may usefully take all the parallel courses at RPI.

  5. The unique features of this course are as follows:

    1. For 2/3 of the course, use of only affordable hardware that any lab might purchase, such as Nvidia GPUs. This is currently the most widely used and least expensive parallel platform.

    2. Emphasis on learning several programming packages, at the expense of theory. However, you will learn a lot about parallel architecture.

  6. Hardware taught, with reasons:

    Multicore Intel Xeon

    universally available and inexpensive, comparatively easy to program, powerful

    Nvidia GPU accelerator

    widely available (Nvidia external graphics processors are on 1/3 of all PCs), very inexpensive, powerful, but harder to program. Good cards cost only a few hundred dollars.

    Quantum computers

    They are new and hot and might some day be useful.

  7. Software that might be taught, with reasons:

    OpenACC C++ extension

    widely used, easy to use if your algorithm is parallelizable, backend is primarily multicore Xeon but also GPUs.

    Thrust C++ functional programming library

    FP is nice, hides low level details, backend can be any major parallel platform.

    MATLAB, Mathematica

    easy to use parallelism for operations that they have implemented in parallel, etc.

    CUDA C++ extension and library for Nvidia

    low level access to Nvidia GPUs.

    nvc++

    Nvidia's parallel C++ compiler.

  8. The techniques learned here will also be applicable to larger parallel machines -- numbers 2, 3, 5, and 6 on the top 500 list use NVIDIA GPUs.

  9. Effectively programming these processors will require in-depth knowledge about parallel programming principles, as well as the parallelism models, communication models, and resource limitations of these processors.
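To give a flavor of two of the programming styles listed above, here is a minimal sketch in standard C++. The first function uses an OpenACC directive on an ordinary loop; a compiler without OpenACC support simply ignores the pragma and runs the loop serially. The second function shows the functional style that Thrust encourages, but it uses the standard library's std::inner_product as a stand-in (Thrust offers a similarly named thrust::inner_product), so this is an analogy, not Thrust code. The function names saxpy and dot are illustrative assumptions, not course material.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Directive style: the pragma asks an OpenACC compiler to run the loop in
// parallel and to copy the arrays to and from the device if there is one.
// Without OpenACC support the pragma is ignored and the loop runs serially.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    const std::size_t n = x.size();
    const float* xp = x.data();
    float* yp = y.data();
    #pragma acc parallel loop copyin(xp[0:n]) copy(yp[0:n])
    for (std::size_t i = 0; i < n; ++i) {
        yp[i] = a * xp[i] + yp[i];
    }
}

// Functional style: the whole computation is one library call, with no
// explicit loop. This is the serial standard-library version, used here
// only as an analogue of the Thrust style.
float dot(const std::vector<float>& x, const std::vector<float>& y) {
    return std::inner_product(x.begin(), x.end(), y.begin(), 0.0f);
}
```

In both styles the source stays close to the serial algorithm. The directive or library call carries the parallelism, which is why these packages are easy to use when the algorithm is already parallelizable.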

3 Prerequisite

ECSE-2660 CANOS or equivalent, knowledge of C++.

4 Professor

W. Randolph Franklin. BSc (Toronto), AM, PhD (Harvard)

Office

online only

Phone

+1 (518) 276-6077 (forwards)

Email

frankwr@YOUKNOWTHEDOMAIN or mailATwrfranklinDOTorg or wrfranklinATpmDOTme

Email is my preferred communication medium. I generally send email from my own domain.

Sending from non-RPI accounts is fine; I do that. But please show your name, at least in the comment field. A subject prefix of #Par is helpful. I support encrypted email, either through inline GPG, or via protonmail.

We can also use webex or facetime.

Web

https://wrf.ecse.rpi.edu/

A quick way to get there is to google RPIWRF.

Office hours

At 3:20 pm after my probability lecture, usually as long as anyone wants to talk. Also by appointment.

Informal meetings

If you would like to talk with me, either individually or in a group, just mention it. We can then talk about most anything legal and ethical.

Why I'm teaching this course

I asked to create and teach it, because I like the topic.

5 Course websites

The homepage has lecture summaries, syllabus, homeworks, etc.

6 Reading material

6.1 Text

There is no required text, but the following inexpensive books may be used. I might mention others later.

  1. Sanders and Kandrot, CUDA by example. It gets excellent reviews, although it is several years old. Amazon has many options, including Kindle and renting hardcopies.

  2. Kirk and Hwu, 2nd edition, Programming massively parallel processors. It concentrates on CUDA.

One problem is that even recent books may be obsolete. For instance they may ignore the recent CUDA unified memory model, which simplifies CUDA programming at a performance cost. Even if the current edition of a book was published after unified memory was released, the author might not have updated the examples.

6.2 Web

There is a lot of free material on the web, which I'll reference class by class. Because web pages vanish so often (really!), I may cache some locally. We will start here:

https://hpc.llnl.gov/training/tutorials

7 Computer systems used

7.1 parallel.ecse

This course will primarily use (remotely via ssh) parallel.ecse.rpi.edu.

Parallel has:

  1. dual 14-core Intel Xeon E5-2660 2.0GHz

  2. 256GB of DDR4-2133 ECC Reg memory

  3. Nvidia GPUs:

    1. Quadro RTX 8000 with 48GB memory, 16 TFLOPS, and 4608 CUDA cores. This can do real time ray tracing.

    2. GeForce GTX 1080 with 8GB memory and 2560 CUDA cores.

  4. Samsung Pro 850 1TB SSD

  5. WD Red 6TB 6Gb/s hard drive

  6. CUDA 10.1

  7. OpenMP 4.5

  8. Thrust

  9. Ubuntu 21.10

Material for the class is stored in /parallel-class/ .

7.2 Amazon EC2

We might also use parallel virtual machines on the Amazon EC2. If so, you will be expected to establish an account. I expect the usage to be in the free category.

7.3 Gradescope

Gradescope will be used for you to submit homeworks and for us to distribute grades.

7.4 WebEx

WebEx will be used to deliver lectures, hold discussions, etc.

7.5 LMS

This is only for distributing the computed (total) grades. (Gradescope has no way to upload computed grades.)

7.6 Others

Delivering good remote courses is a work in progress. If I learn of a new tool that looks useful, I might use it.

8 Assessment measures, i.e., grades

  1. There will be no exams.

  2. Each student will make several in-class presentations summarizing some relevant topic.

  3. There will be a term project.

8.1 Homeworks

There will be some homeworks.

You may do homeworks in teams of up to 3 students. Create a Gradescope team and make one submission with all the names.

You may switch teams from homework to homework.

8.2 Term project

  1. For the latter part of the course, most of your homework time will be spent on a term project.

  2. You are encouraged to do it in teams of any size. A team of 3 people would be expected to do twice as much work as 1 person.

  3. You may combine this with work for another course, provided that both courses know about this and agree. I always agree.

  4. If you are a grad student, you may combine this with your research, if your prof agrees, and you tell me.

  5. You may build on existing work, either your own or others'. You have to say what's new, and have the right to use the other work. E.g., using any GPLed code or any code on my website is automatically allowable (because of my Creative Commons licence).

  6. You will implement, demonstrate, and document something vaguely related to parallel computing.

  7. Deliverables:

    1. An implementation showing parallel computing.

    2. An extended abstract or paper on your project, written up like a paper. You should follow the style guide for some major conference (I don't care which, but can point you to one).

    3. A more detailed manual, showing how to use it.

    4. A 2-minute project proposal given to the class around the middle of the semester.

    5. A 10-minute project presentation and demo given to the class in the last week.

    6. Some progress reports.

    7. A write-up uploaded on the last class day. This will contain an academic paper, code and perhaps video or user manual.

  8. Size

    It's impossible to specify how many lines of code make a good term project. E.g., I take pride in writing code that can be simultaneously shorter, more robust, and faster than some others. See my 8-line program for testing whether a point is in a polygon: Pnpoly.

    According to Big Blues, when Bill Gates was collaborating with IBM around 1980, he once rewrote a code fragment to be shorter. However, according to the IBM metric, number of lines of code produced, he had just caused that unit to officially do negative work.
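To illustrate how short a useful routine can be, here is a minimal sketch of a crossing-number point-in-polygon test, in the spirit of the Pnpoly program mentioned above. It is not a copy of that program: the Point struct and the function name point_in_polygon are hypothetical choices for this example. The idea is to cast a ray from the test point in the +x direction and count how many polygon edges it crosses; an odd count means the point is inside.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2-D point type for this sketch.
struct Point { double x, y; };

// Crossing-number test: toggle `inside` each time a horizontal ray from p
// toward +x crosses an edge (poly[j], poly[i]). The first condition checks
// that the edge straddles the ray's height, which also guarantees that
// b.y != a.y, so the division is safe.
bool point_in_polygon(const std::vector<Point>& poly, Point p) {
    bool inside = false;
    const std::size_t n = poly.size();
    for (std::size_t i = 0, j = n - 1; i < n; j = i++) {
        const Point& a = poly[i];
        const Point& b = poly[j];
        if ((a.y > p.y) != (b.y > p.y) &&
            p.x < (b.x - a.x) * (p.y - a.y) / (b.y - a.y) + a.x) {
            inside = !inside;
        }
    }
    return inside;
}
```

The whole test is one loop with one conditional. Each vertex pair is examined independently of the others, so testing many points against the same polygon is also a natural candidate for parallelization.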

9 Remote class procedures

Parallel will start virtual and then switch to in-person when RPI tells me to. I intend to lecture live, with some supplementary videos. If the format changes unexpectedly, I'll email everyone.

Since this is a small class, students will be encouraged to ask questions, live and with chat during the class. I'll try to save the chat window, but it may be deleted after each class. If you cannot watch the class live, then you may miss any info presented in the chat window.

Try to share your video when talking.

If you would like, I can try to set up a webex teams space for discussions between class.

I will maintain a class blog that briefly summarizes each class. Important points will be written down.

I will attempt to record all virtual classes and upload them to mediasite for students to watch or download later. However, this fall, about 10% of classes failed to record properly.

Since they increase learning, there might be graded in-class quizzes, depending on the class composition. That is, I would like to do this. However, if it causes too much trouble for too many students, then I won't.

10 Early warning system (EWS)

As required by the Provost, we may post notes about you to EWS, for example, if you're having trouble doing homeworks on time, or miss an exam. E.g., if you tell me that you had to miss a class because of family problems, then I may forward that information to the Dean of Students office.

11 Academic integrity

See the Student Handbook for the general policy. The summary is that students and faculty have to trust each other. After you graduate, your most important possession will be your reputation.

Specifics for this course are as follows.

  1. You may collaborate on homeworks, but each team of 1 or 2 people must write up the solution separately (one writeup per team) using their own words. We willingly give hints to anyone who asks.

  2. The penalty for two teams handing in identical work is a zero for both.

  3. You may collaborate in teams of up to 3 people for the term project.

  4. You may get help from anyone for the term project. You may build on a previous project, either your own or someone else's. However you must describe and acknowledge any other work you use, and have the other person's permission, which may be implicit. E.g., my web site gives a blanket permission to use it for nonprofit research or teaching. You must add something creative to the previous work. You must write up the project on your own.

  5. However, writing assistance from the Writing Center and similar sources is allowed, if you acknowledge it.

  6. The penalty for plagiarism is a zero grade.

  7. Cheating will be reported to the Dean of Students Office.

12 Other rules required by RPI

You've seen them in your other classes. Assume that they're included here.

13 Student feedback

Since it's my desire to give you the best possible course in a topic I enjoy teaching, I welcome feedback during (and after) the semester. You may tell me or write me, or contact a third party, such as Prof James Lu, the ECSE undergrad head, or Prof John Wen, the ECSE Dept head.