Jacobi
CUDA_Aware_MPI.c File Reference

The implementation details for the CUDA-aware MPI version.

#include "Jacobi.h"

Functions

void SetDeviceBeforeInit ()
 This allows the MPI process to set the CUDA device before the MPI environment is initialized. For the CUDA-aware MPI version, this is the only place where the device gets set. To do this, we rely on the process's local rank on its node, as the MPI environment has not been initialized yet.
void SetDeviceAfterInit (int rank)
 This allows the MPI process to set the CUDA device after the MPI environment is initialized. For the CUDA-aware MPI version, there is nothing to be done here.
void ExchangeHalos (MPI_Comm cartComm, real *devSend, real *hostSend, real *hostRecv, real *devRecv, int neighbor, int elemCount)
 Exchange halo values between two direct neighbors. This is the main difference between the regular CUDA and MPI version and the CUDA-aware MPI version. In the former, the exchange first requires a copy from device to host memory, then an MPI call using the host buffers, and finally a copy of the received host buffer back to device memory. In the latter, the host buffers are skipped completely, as the MPI environment uses the device buffers directly.

Detailed Description

The implementation details for the CUDA-aware MPI version.
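
The device selection described for SetDeviceBeforeInit can be sketched as follows. This is a minimal illustration, not the code from CUDA_Aware_MPI.c: the helper names ParseLocalRank, LocalRankFromEnv and DeviceForLocalRank are hypothetical, the list of environment variables is an assumption about which launcher is in use, and the round-robin mapping is only one common choice. In real code the resulting index would be passed to cudaSetDevice, with the device count obtained from cudaGetDeviceCount.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Environment variables that some launchers export with the node-local
 * rank. Assumption: the actual code may check a different variable. */
static const char *const kLocalRankVars[] = {
    "OMPI_COMM_WORLD_LOCAL_RANK", /* Open MPI */
    "MV2_COMM_WORLD_LOCAL_RANK",  /* MVAPICH2 */
    "SLURM_LOCALID"               /* Slurm */
};

/* Hypothetical helper: parse a non-negative decimal rank.
 * Returns -1 if the string is missing, empty, or not a valid rank. */
int ParseLocalRank(const char *value)
{
    if (value == NULL || *value == '\0')
        return -1;

    char *end = NULL;
    long rank = strtol(value, &end, 10);
    if (*end != '\0' || rank < 0 || rank > INT_MAX)
        return -1;
    return (int)rank;
}

/* Hypothetical helper: first valid local rank found in the environment,
 * or -1 if none of the known variables is set. No MPI call is needed,
 * so this can run before MPI_Init. */
int LocalRankFromEnv(void)
{
    size_t count = sizeof kLocalRankVars / sizeof kLocalRankVars[0];
    for (size_t i = 0; i < count; ++i) {
        int rank = ParseLocalRank(getenv(kLocalRankVars[i]));
        if (rank >= 0)
            return rank;
    }
    return -1;
}

/* Hypothetical helper: round-robin mapping of local ranks to devices. */
int DeviceForLocalRank(int localRank, int deviceCount)
{
    assert(localRank >= 0 && deviceCount > 0);
    return localRank % deviceCount;
}
```

With four devices per node, local ranks 0 to 3 map to devices 0 to 3, and a fifth process on the same node would share device 0 with local rank 0.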


Function Documentation

void ExchangeHalos ( MPI_Comm  cartComm,
real *  devSend,
real *  hostSend,
real *  hostRecv,
real *  devRecv,
int  neighbor,
int  elemCount 
)

Exchange halo values between two direct neighbors. This is the main difference between the regular CUDA and MPI version and the CUDA-aware MPI version. In the former, the exchange first requires a copy from device to host memory, then an MPI call using the host buffers, and finally a copy of the received host buffer back to device memory. In the latter, the host buffers are skipped completely, as the MPI environment uses the device buffers directly.

Parameters:
[in]  cartComm   The Cartesian MPI communicator
[in]  devSend    The device buffer that needs to be sent
[in]  hostSend   The host buffer to which the device buffer is first copied (not needed here)
[in]  hostRecv   The host buffer that receives the halo values (not needed here)
[in]  devRecv    The device buffer that receives the halo values directly
[in]  neighbor   The rank of the neighbor MPI process in the Cartesian communicator
[in]  elemCount  The number of elements to transfer
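
The two data paths described above can be contrasted in a small host-only sketch. Everything here is an assumption made for illustration: the names ExchangeHalosStaged, ExchangeHalosDirect, ExchangeFn, CopyFn, PlainCopy and LoopbackExchange are hypothetical, real is assumed to be double (Jacobi.h may define it differently), and the transport and copy steps are injected as callbacks so the sketch builds without CUDA or MPI. In a real build the exchange callback would correspond to a call such as MPI_Sendrecv and the copy callback to cudaMemcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef double real; /* assumption: Jacobi.h may define real as float */

/* Hypothetical stand-in for a blocking send/receive pair such as
 * MPI_Sendrecv(sendBuf, count, type, neighbor, tag,
 *              recvBuf, count, type, neighbor, tag, comm, &status). */
typedef void (*ExchangeFn)(const real *sendBuf, real *recvBuf,
                           int neighbor, int count);

/* Hypothetical stand-in for cudaMemcpy between host and device memory. */
typedef void (*CopyFn)(real *dst, const real *src, int count);

/* Regular CUDA and MPI path: stage the data through host buffers. */
void ExchangeHalosStaged(ExchangeFn exchange, CopyFn copy,
                         const real *devSend, real *hostSend,
                         real *hostRecv, real *devRecv,
                         int neighbor, int elemCount)
{
    assert(elemCount >= 0);
    copy(hostSend, devSend, elemCount);                /* device -> host */
    exchange(hostSend, hostRecv, neighbor, elemCount); /* MPI on host buffers */
    copy(devRecv, hostRecv, elemCount);                /* host -> device */
}

/* CUDA-aware path: hand the device buffers straight to the transport. */
void ExchangeHalosDirect(ExchangeFn exchange,
                         const real *devSend, real *devRecv,
                         int neighbor, int elemCount)
{
    assert(elemCount >= 0);
    exchange(devSend, devRecv, neighbor, elemCount);
}

/* Host-only stand-ins for a dry run: a memcpy-based copy ... */
void PlainCopy(real *dst, const real *src, int count)
{
    memcpy(dst, src, (size_t)count * sizeof(real));
}

/* ... and a "neighbor" that simply echoes back what it was sent. */
void LoopbackExchange(const real *sendBuf, real *recvBuf,
                      int neighbor, int count)
{
    (void)neighbor; /* unused in the loopback model */
    memcpy(recvBuf, sendBuf, (size_t)count * sizeof(real));
}
```

Both functions leave the same values in devRecv. The direct variant never touches hostSend or hostRecv, which is why those parameters are marked "not needed here" in the CUDA-aware version.
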
void SetDeviceAfterInit ( int  rank)

This allows the MPI process to set the CUDA device after the MPI environment is initialized. For the CUDA-aware MPI version, there is nothing to be done here.

Parameters:
[in]  rank  The global rank of the calling MPI process