General-Purpose Graphics Processing Unit (GPGPU)

NVIDIA has made the computing engine of its graphics processing units (GPUs) accessible to programmers. This engine, called the Compute Unified Device Architecture (CUDA), is a parallel computing architecture developed by NVIDIA for hardware that was originally designed to render images for display on a computer screen. These devices can be programmed using a variant of C or C++ and the NVIDIA compiler. While the CUDA engine is accessible on virtually all of NVIDIA's newer graphics cards, NVIDIA also makes special-purpose devices, called GPGPUs, that are used exclusively for computation.


Minerva has one GPU node:

1 node with 4 NVIDIA P100 GPUs, 20 Intel Broadwell cores and 128 GB of memory.

These devices can speed up the execution of certain codes by almost three orders of magnitude if the code has extensive vector content, but they can also slow execution down if there is very little vector content.


How to access:

Researchers whose programs can use the GPGPUs can access the appropriate nodes by specifying:

-n Ncpu
-R "p100"

Here, Ncpu is the total number of CPU cores. The number of cores per node should be from 1 through 20 on the P100 GPU nodes. Our experience is that the GPGPUs become saturated at around 10 cores per card, although this depends on the program. The number of GPUs requested per node is 1 by default. This can be changed with the parameter ngpus_excl_p=N, where N is the number of GPUs per node. For example, -R "rusage[ngpus_excl_p=2]" would request two GPUs.
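The options above can be combined into a single submission command. The following is a minimal sketch, not an official submission recipe: it only prints the command so it can be inspected before use. It assumes the options are passed to the LSF bsub command, that the two -R strings may be given together on one command line, and that the GPU limit is 4 per node as listed above. The helper name build_bsub_cmd and the program ./my_gpu_program are hypothetical, and any other options your account may need (queue, project, wall time) are left out.

```shell
#!/bin/bash
# Dry-run helper: prints a bsub command for a P100 GPU job without submitting it.
# Limits are taken from the text above: 1-20 cores and up to 4 GPUs on the P100 node.

build_bsub_cmd() {
    local ncpu="$1"      # total number of CPU cores (-n)
    local ngpus="$2"     # GPUs per node (ngpus_excl_p)
    local program="$3"   # command to run (hypothetical placeholder below)

    if [ "$ncpu" -lt 1 ] || [ "$ncpu" -gt 20 ]; then
        echo "error: Ncpu must be between 1 and 20" >&2
        return 1
    fi
    if [ "$ngpus" -lt 1 ] || [ "$ngpus" -gt 4 ]; then
        echo "error: number of GPUs must be between 1 and 4" >&2
        return 1
    fi

    # Assumption: both -R strings can be supplied on the same command line.
    echo "bsub -n $ncpu -R \"p100\" -R \"rusage[ngpus_excl_p=$ngpus]\" $program"
}

# Print (do not submit) a request for 10 cores and 2 GPUs.
build_bsub_cmd 10 2 ./my_gpu_program
```

Printing the command first makes it easy to check the core and GPU counts against the node limits before anything enters the queue.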

The CUDA toolkit must also be loaded to access the drivers and supporting subroutines:

module load cuda
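After loading the module, it can be useful to confirm that the tools are visible in the current session. The sketch below assumes that the cuda module places the nvcc compiler on the PATH and that nvidia-smi is installed with the NVIDIA driver on the GPU node. The function name check_cuda_env is hypothetical. It only reports what it finds and does not load any module itself.

```shell
#!/bin/bash
# Reports whether the CUDA compiler and the NVIDIA driver utility are on PATH.
# Intended to be run on a GPU node after "module load cuda".

check_cuda_env() {
    if command -v nvcc >/dev/null 2>&1; then
        echo "nvcc: found at $(command -v nvcc)"
    else
        echo "nvcc: not found (try 'module load cuda')"
    fi
    if command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi: found at $(command -v nvidia-smi)"
    else
        echo "nvidia-smi: not found (is this a GPU node?)"
    fi
}

check_cuda_env
```

If both tools are found, running nvcc --version and nvidia-smi directly shows the toolkit version and the GPUs visible to the job.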


Further Information:
CUDA programming
Available applications