PAR Grad Syllabus

1   Catalog info

Title: ECSE-6xxx Independent study: Parallel, GPU, quantum comput.
Semesters: Spring, on demand
Credits: 3 credit hours
Time and place: Mon and Thurs noon-1:20pm, JONSSN 4107

2   Description

  1. This is a graduate individual study course.
  2. This is intended to be a computer engineering course to provide students with knowledge and hands-on experience in developing applications software for affordable parallel processors. This course will mostly cover hardware that any lab can afford to purchase. It will cover the software that, in the prof's opinion, is the most useful. There will also be considerable theory.
  3. The target audience is ECSE grads and others with comparable background who wish to learn the theory and then to develop parallel software.
  4. This course will have minimal overlap with parallel courses in Computer Science. We will not teach the IBM BlueGene, because it is so expensive, nor cloud computing and MPI, because most big data problems are in fact small enough to fit on our hardware.
  5. You may usefully take all the parallel courses at RPI.
  6. The unique features of this course are as follows:
    1. For 2/3 of the course, use of only affordable hardware that any lab might purchase, such as Nvidia GPUs. This is currently the most widely used and least expensive parallel platform.
    2. Emphasis on learning several programming packages, at the expense of theory. However, you will learn a lot about parallel architecture.
  7. Hardware taught, with reasons:
    Multicore Intel Xeon:
      universally available and inexpensive, comparatively easy to program, powerful
    Nvidia GPU accelerator:
      widely available (Nvidia external graphics processors are on 1/3 of all PCs), very inexpensive, powerful, but harder to program. Good cards cost only a few hundred dollars.
    IBM quantum computer:
      It's new and hot and might some day be useful.
  8. Software that might be taught, with reasons:
    OpenMP C++ extension:
      widely used, easy to use if your algorithm is parallelizable, backend is primarily multicore Xeon but also GPUs.
    Thrust C++ functional programming library:
      FP is nice, hides low level details, backend can be any major parallel platform.
    MATLAB, Mathematica:
      easy to use parallelism for operations that they have implemented in parallel, etc.
    CUDA C++ extension and library for Nvidia:
      low level access to Nvidia GPUs.
  9. The techniques learned here will also be applicable to larger parallel machines -- numbers 1 and 2 on the Top500 list use Nvidia GPUs. (Number 12 is a BlueGene.)
  10. Effectively programming these processors will require in-depth knowledge about parallel programming principles, as well as the parallelism models, communication models, and resource limitations of these processors.

3   Prerequisite

ECSE-2660 CANOS or equivalent, knowledge of C++.

4   Instructors

4.1   Professor

W. Randolph Franklin. BSc (Toronto), AM, PhD (Harvard)

Office:

Jonsson Engineering Center (JEC) 6026

Phone:

+1 (518) 276-6077 (forwards)

Email:

frankwr@YOUKNOWTHEDOMAIN

Email is my preferred communication medium.

Sending from non-RPI accounts is fine, but please show your name, at least in the comment field. A subject prefix of #Prob is helpful. GPG encryption is fine.

Web:

https://wrf.ecse.rpi.edu/

A quick way to get there is to google RPIWRF.

Office hours:

After each lecture, usually as long as anyone wants to talk. Also by appointment.

Informal meetings:
 

If you would like to lunch with me, either individually or in a group, just mention it. We can then talk about most anything legal and ethical.

4.2   Teaching assistant

Elkin Cruz, cruzce@THEUSUALDOMAIN

Office hour: TBA

5   Course websites

The homepage has lecture summaries, syllabus, homeworks, etc.

6   Reading material

6.1   Text

There is no required text, but the following inexpensive books may be used. I might mention others later.

  1. Sanders and Kandrot, CUDA by example. It gets excellent reviews, although it is several years old. Amazon has many options, including Kindle and renting hardcopies.
  2. Kirk and Hwu, 2nd edition, Programming massively parallel processors. It concentrates on CUDA.

One problem is that even recent books may be obsolete. For instance they may ignore the recent CUDA unified memory model, which simplifies CUDA programming at a performance cost. Even if the current edition of a book was published after unified memory was released, the author might not have updated the examples.
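To make the unified-memory point concrete, here is my own minimal sketch (the kernel, names, and sizes are mine, not from either book), assuming a CUDA 6 or later toolkit:

```cuda
#include <cstdio>

// With unified memory, cudaMallocManaged returns one pointer usable from
// both host and device, replacing the older cudaMalloc + cudaMemcpy dance
// that pre-unified-memory books still teach.
__global__ void scale(float *x, int n, float a) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= a;
}

int main() {
    const int n = 1 << 20;
    float *x;
    cudaMallocManaged(&x, n * sizeof(float));   // visible to CPU and GPU
    for (int i = 0; i < n; ++i) x[i] = 1.0f;    // host writes directly
    scale<<<(n + 255) / 256, 256>>>(x, n, 2.0f);
    cudaDeviceSynchronize();                    // wait, then read on host
    printf("x[0] = %f\n", x[0]);                // 2.0
    cudaFree(x);
    return 0;
}
```

The convenience costs performance: the driver migrates pages between host and device on demand, so explicit copies can still be faster, which is the trade-off mentioned above.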

6.2   Web

There is a lot of free material on the web, which I'll reference class by class. Because web pages vanish so often (really!), I may cache some locally. If interested, you might start here:

https://hpc.llnl.gov/training/tutorials

7   Computer systems used

7.1   parallel.ecse

This course will primarily use (remotely via ssh) parallel.ecse.rpi.edu.

Parallel has:

  1. dual 14-core Intel Xeon E5-2660 2.0GHz
  2. 256GB of DDR4-2133 ECC Reg memory
  3. Nvidia GPUs:
    1. Quadro RTX 8000 with 48GB memory, 16 TFLOPS, and 4608 CUDA cores. This can do real-time ray tracing.
    2. GeForce GTX 1080 with 8GB memory and 2560 CUDA cores.
  4. Samsung Pro 850 1TB SSD
  5. WD Red 6TB 6Gb/s hard drive
  6. CUDA 10.1
  7. OpenMP 4.5
  8. Thrust
  9. Ubuntu 19.10

Material for the class is stored in /parallel-class/ .

7.2   Amazon EC2

We might also use a parallel virtual machine on Amazon EC2. If so, you will be expected to establish an account. I expect the usage to fall within the free tier.

7.3   Piazza

Piazza will be available for discussion and questions.

7.4   Gradescope

Gradescope will be used for you to submit homeworks and for us to distribute grades.

The entry code for this course is 9DYRB2.

Please add yourself.

8   Assessment measures, i.e., grades

  1. There will be no exams.
  2. There might be some homeworks.
  3. Each student will make 2 or 3 in-class presentations summarizing some relevant topic.
  4. There will be a term project that will include a major research paper.
  5. A blog will be required.

8.1   Research blog

Students will be required to perform individual research into parallel computing principles and to record their findings as weekly entries in a personal blog, a sort of online lab book.

Possible platforms include:

  1. Mathematica notebook
  2. Jupyter notebook
  3. Google doc
  4. iCloud doc

8.2   Term project

  1. For the latter part of the course, most of your homework time will be spent on a term project.

  2. You are encouraged to do it in teams of up to 3 people. A team of 3 people would be expected to do twice as much work as 1 person.

  3. You may combine this with work for another course, provided that both courses know about this and agree. I always agree.

  4. If you are a grad student, you may combine this with your research, if your prof agrees, and you tell me.

  5. You may build on existing work, either your own or others'. You have to say what's new, and have the right to use the other work. E.g., using any GPLed code or any code on my website is automatically allowable (because of my Creative Commons licence).

  6. You will implement, demonstrate, and document something vaguely related to parallel computing.

  7. Deliverables:

    1. An implementation showing parallel computing.
    2. A major research paper on your project. You should follow the style guide for some major conference (I don't care which, but can point you to one).
    3. A more detailed manual, showing how to use it.
    4. A 2-minute project proposal given to the class around the middle of the semester.
    5. A 10-minute project presentation and demo given to the class in the last week.
    6. Some progress reports.
    7. A write-up uploaded on the last class day. This will contain an academic paper, code and perhaps video or user manual.
  8. Size

    It's impossible to specify how many lines of code makes a good term project. E.g., I take pride in writing code that can be simultaneously shorter, more robust, and faster than others'. See my 8-line program for testing whether a point is in a polygon: Pnpoly.

    According to Big Blues, when Bill Gates was collaborating with IBM around 1980, he once rewrote a code fragment to be shorter. However, according to the IBM metric, number of lines of code produced, he had just caused that unit to officially do negative work.

9   Early warning system (EWS)

As required by the Provost, we may post notes about you to EWS, for example, if you're having trouble doing homeworks on time, or miss an exam. E.g., if you tell me that you had to miss a class because of family problems, then I may forward that information to the Dean of Students office.

10   Academic integrity

See the Student Handbook for the general policy. The summary is that students and faculty have to trust each other. After you graduate, your most important possession will be your reputation.

Specifics for this course are as follows.

  1. You may collaborate on homeworks, but each team of 1 or 2 people must write up the solution separately (one writeup per team) using their own words. We willingly give hints to anyone who asks.
  2. The penalty for two teams handing in identical work is a zero for both.
  3. You may collaborate in teams of up to 3 people for the term project.
  4. You may get help from anyone for the term project. You may build on a previous project, either your own or someone else's. However you must describe and acknowledge any other work you use, and have the other person's permission, which may be implicit. E.g., my web site gives a blanket permission to use it for nonprofit research or teaching. You must add something creative to the previous work. You must write up the project on your own.
  5. However, writing assistance from the Writing Center and similar sources is allowed, if you acknowledge it.
  6. The penalty for plagiarism is a zero grade.
  7. Cheating will be reported to the Dean of Students Office.

11   Student feedback

Since it's my desire to give you the best possible course in a topic I enjoy teaching, I welcome feedback during (and after) the semester. You may tell me or write me, or contact a third party, such as Prof James Lu, the ECSE undergrad head, or Prof John Wen, the ECSE Dept head.