PAR Class 11, Wed 2018-04-11
1 Intel Xeon Phi 7120A
1.1 In general
-
The Xeon Phi is Intel's brand name for their MIC (for Many Integrated Core Architecture).
-
The 7120A is Intel's Knights Corner (1st generation) MIC architecture, launched in 2014.
-
It has 61 cores running 244 threads (4 hardware threads per core), clocked at 1.238 GHz.
Having several threads per core helps to hide memory latency: while one thread waits on a fetch, another can run.
-
It has 16 GB of memory accessible at 352 GB/s, 30.5 MB of L2 cache, and peaks at about 1.2 TFlops double precision.
-
It is a coprocessor on a card accessible from a host CPU on a local network.
-
It is intended as a supercomputing competitor to Nvidia.
-
The mic architecture is quite similar to the Xeon.
-
However, executables built for one don't run on the other, unless the source was compiled to include both versions in the executable file.
-
The mic has been tuned to emphasize floating-point performance at the expense of, e.g., speculative execution.
This helps to make it competitive with Nvidia, even though Nvidia GPUs can have many more cores.
-
Its OS is an embedded version of Linux with a BusyBox userland.
-
The SW is called MPSS (Manycore Platform Software Stack).
-
The mic can be integrated with the host in various ways that I haven't (yet) implemented.
- Processes on the host can execute subprocesses on the device, as happens with Nvidia CUDA.
- E.g., OpenMP on the host can run parallel threads on the mic.
- The mic can page virtual memory from the host.
-
The fastest machine on top500.org a few years ago used Xeon Phi cards.
The 2nd used Nvidia K20 cards, and the 3rd fastest was an IBM Blue Gene.
So, my course lets you use the 2 fastest architectures, and there's another course available at RPI for the 3rd.
-
Information:
- https://en.wikipedia.org/wiki/Xeon_Phi
- http://ark.intel.com/products/80555/Intel-Xeon-Phi-Coprocessor-7120A-16GB-1_238-GHz-61-core
- http://www.intel.com/content/www/us/en/products/processors/xeon-phi/xeon-phi-processors.html
- http://www.intel.com/content/www/us/en/architecture-and-technology/many-integrated-core/intel-many-integrated-core-architecture.html
- https://software.intel.com/en-us/articles/intel-manycore-platform-software-stack-mpss
- https://pleiades.ucsc.edu/hyades/MIC_QuickStart_Guide
1.2 parallel.ecse's mic
-
The hostname (of this particular MIC) is parallel-mic0 or mic0.
-
The local filesystem is in RAM and is reinitialized when mic0 is rebooted.
-
Parallel:/home and /parallel-class are NFS exported to mic0.
-
/home can be used to move files back and forth.
-
All the user accounts on parallel were given accounts on mic0.
-
You can ssh to mic0 from parallel.
-
Your current parallel ssh key pair should work.
-
Your parallel login password as of a few days ago should work on mic0.
However, future changes to your parallel password will not propagate to mic0 and you cannot change your mic0 password.
(The mic0 setup snapshotted parallel's accounts and created a read-only image to boot mic0 from. Any changes to mic0:/etc/shadow are reverted when mic0 reboots.)
So use your public key.
1.3 Programming the mic
-
Parallel:/parallel-class/mic/bin has versions of gcc, g++, etc, with names like k1om-mpss-linux-g++ .
-
These run on parallel and produce executable files that run (only) on mic0.
-
Here's an example of compiling (on parallel) a C program in /parallel-class/mic:
bin/k1om-mpss-linux-gcc hello.c -o hello-mic
-
Run it thus from parallel (it runs on mic0):
ssh mic0 /parallel-class/mic/hello-mic
2 Intel compilers on parallel
Note: they currently don't work, but may soon.
-
Intel Parallel Studio XE Cluster 2017 is now installed on parallel, in /opt/intel/ . It is a large package with compilers, debuggers, analyzers, MPI, etc. There is extensive documentation on Intel's web site. Have fun.
Students and free SW developers can also get free licenses for their machines. Commercial licenses cost $thousands.
-
What’s Inside Intel Parallel Studio XE 2017. There are PDF slides, a webinar, and training material.
-
https://goparallel.sourceforge.net/want-faster-code-faster-check-parallel-studio-xe/
-
Before using the compiler, you should initialize some envars thus:
source /opt/intel/bin/iccvars.sh -arch intel64
-
Then you can compile a C or C++ program thus:
icc -qopenmp -O3 foo.c -o foo
icpc -qopenmp -O3 foo.cc -o foo
-
On my simple tests, not using the mic, icpc and g++ produced equally good code.
-
Compile a C++ program with OpenMP thus:
icpc -qopenmp -std=c++11 omp_hello.cc -o omp_hello
-
Test it thus (it is in /parallel-class/mic):
OMP_NUM_THREADS=4 ./omp_hello
Note how the output from the various threads is mixed up.
3 Programming the MIC (ctd)
-
It turns out that I (but not you) can update a login password on mic0, but it's a little tedious. Use your ssh key.
Details: at startup, mic0:/etc is initialized from parallel:/var/mpss/mic0/etc, so I could edit shadow there and insert a new encrypted password.
/home is initialized the same way, but it's then replaced by the NFS mount.
-
I can also change, e.g., your login shell. Use bash on mic0 since zsh does not exist there.
-
Dr. Dobb's: Programming the Xeon Phi by Rob Farber.
-
Some MIC demos.
-
To cross compile with icc and icpc, see the MPSS users guide, Section 8.1.4. Use the -mmic flag.
-
Intel OpenMP* Support Overview.
-
New Era for OpenMP*: Beyond Traditional Shared Memory Parallel Programming.
-
Book: Parallel Programming and Optimization with Intel® Xeon Phi™ Coprocessors, 2nd Edition [508 Pages].
-
See also https://www.cilkplus.org/
-
Rolling Out the New Intel Xeon Phi Processor at ISC 2016 https://www.youtube.com/watch?v=HDPYymREyV8
-
Supermicro Showcases Intel Xeon Phi and Nvidia P100 Solutions at ISC 2016 https://www.youtube.com/watch?v=nVWqSjt6hX4
-
Insider Look: New Intel® Xeon Phi™ processor on the Cray® XC™ Supercomputer https://www.youtube.com/watch?v=lkf3U_5QG_4
4 Mic0 (Xeon Phi) setup
- mic0 is the hostname for the Xeon Phi coprocessor.
- It's logically a separate computer on a local net named mic0, which is established when the mpss service starts.
- It's accessible only from parallel.
- It has its own filesystem, accounts, etc.
- On parallel, /opt and /home are exported by being listed in /etc/exports
- mic0 has no disk; its root partition is in memory.
- parallel:/var/mpss/mic0/ is copied as mic0's root partition.
- This is copied when the mic boots.
- That is 1-way; changes done on mic0 are not copied back.
- Changes to parallel:/var/mpss/mic0/ are not visible to the mic until it reboots.
- Accounts on parallel have home dirs on mic0.
- I add the accounts themselves by copying the relevant lines from parallel:/etc/{passwd,shadow,group} to /var/mpss/mic0/etc/{passwd,shadow,group}.
- parallel is running an old linux kernel, 4.4.55, because the required kernel module, mic.ko, wouldn't compile with a newer kernel, even 4.8.
- The proximate problem is that newer kernels enforce secure boot, under which kernel modules must be validated (signed) before they can be loaded.
- This can allegedly be disabled, but that didn't work.
- I also tried to create a certificate authorizing mic.ko, but that didn't work.
- It's possible that sufficient work might get a newer kernel to work. However I spent far too much time getting it to this point.
- FYI, parallel:/opt/mpss/ has some relevant files.
- parallel:/opt/intel has useful Intel compilers and tools, most of which I haven't gotten working yet. If anyone wants to work on this, you're welcome to.
- On parallel, micctrl -s gives the mic's state.
- micctrl has other options that require root.
- On parallel, root controls the mic with service mpss start/stop
- The mpss service should start automatically when parallel reboots.
5 Boinc
- I've installed BOINC on parallel.ecse.
- Currently, it's running MilkyWay@Home.
- The command boincmgr gives its status.
- Would you like to play with it?
- If you want to run timing tests for other SW on parallel, tell me; I'll disable it.
6 Quantum physics talk at 4pm today
The Fascinating Quantum World of Two-dimensional Materials: Symmetry, Interaction and Topological Effects
Symmetry, interaction and topological effects, as well as environmental screening, dominate many of the quantum properties of reduced-dimensional systems and nanostructures. These effects often lead to manifestation of counter-intuitive concepts and phenomena that may not be so prominent or have not been seen in bulk materials. In this talk, I present some fascinating physical phenomena discovered in recent studies of atomically thin two-dimensional (2D) materials. A number of highly interesting and unexpected behaviors have been found – e.g., strongly bound excitons (electron-hole pairs) with unusual energy level structures and new topology-dictated optical selection rules, massless excitons, tunable magnetism and plasmonic properties, electron supercollimation, novel topological phases, etc. – adding to the promise of these 2D materials for exploration of new science and valuable applications.
Steven G. Louie, Physics Department, University of California at Berkeley, and Lawrence Berkeley National Lab
Darrin Communications Center (DCC) 337 4:00 pm
Announcement (link will decay soon.)