GPGPU


General Purpose Graphics Processing Unit (GPGPU)

NVIDIA has made the computing engine of their graphics processing units (GPUs) accessible to programmers. This engine, technically called the Compute Unified Device Architecture (CUDA for short), is a parallel computing architecture developed by NVIDIA and used to render images for display on a computer screen. These devices can be programmed using a variant of C or C++ and the NVIDIA compiler. While the CUDA engine is programmer-accessible in virtually all of NVIDIA's newer graphics cards (GPUs), NVIDIA also makes specially purposed devices that are used exclusively for computation, called GPGPUs or, more commonly, just GPUs.


Minerva has a total of 20 nodes configured with 4 NVIDIA GPU cards each:

  • 12 Intel nodes each with 32 cores, 384GiB RAM, and 4 NVIDIA V100 GPU cards with 16GB memory on each card
  • 8 Intel nodes each with 48 cores, 384GiB RAM, and 4 NVIDIA A100 GPU cards with 40GB memory on each card


Accessing the GPU Nodes

The GPU nodes must be accessed by way of a queued job. There are GPU nodes in both the interactive and gpu queues, so your job must specify either:

-q interactive

or

-q gpu

To specify which type of GPU you want:

-R v100

or

-R a100

If the job is submitted to the interactive queue, make sure you specify the GPU model that is actually in that queue. Otherwise, your job will remain pending in the queue indefinitely.

In addition, the number of GPU cards on each node to be allocated to your job must be specified using the LSF rusage specification, e.g.:


-R rusage[ngpus_excl_p=1]       # For 1 GPU card

-R rusage[ngpus_excl_p=3]       # For 3 GPU cards


LSF sets the CUDA_VISIBLE_DEVICES environment variable to the indices of the cards assigned to your job, in case your program needs to know which GPU cards it has been allocated (not common).
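
For example, you can inspect the variable from within a job (a minimal check; the indices reported depend on your allocation):

echo $CUDA_VISIBLE_DEVICES      # e.g., prints 0,1 when two cards are assigned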


Supplemental Software

You will almost certainly need auxiliary software to utilize the GPUs, most likely the CUDA libraries from NVIDIA and perhaps the cuDNN libraries. There are several versions of each on Minerva. Use:

ml avail cuda

and/or

ml avail cudnn


to determine which versions are available for loading.
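
Once you have chosen a version, load it with ml. For example (the version string below is illustrative; use one that actually appears in the ml avail listing):

ml cuda/11.8
ml cudnn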

For developers, a number of CUDA-accelerated libraries are available for download from NVIDIA.

Interactive Submission

Minerva sets aside a number of GPU-enabled nodes to be accessed via the interactive queue. This number is changed periodically based on demand but is always small, e.g., 1 or 2. The number and type of GPU will be posted in the announcements section of the home page.

To open an interactive session on one of these nodes:

bsub -P acc_xxx -q interactive -n 1 -R v100 -R rusage[ngpus_excl_p=1] -W 01:00 -Is /bin/bash

Alternatively, one can open an interactive session on one of the batch GPU nodes. This is particularly useful if the interactive nodes do not have the model GPU you would like to use:

bsub -P acc_xxx -q gpu -n 1 -R a100 -R rusage[ngpus_excl_p=1] -W 01:00 -Is /bin/bash
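
Once the session starts, you can confirm which card(s) have been allocated to you with NVIDIA's nvidia-smi utility:

nvidia-smi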


Batch Submission

Batch submission is a straightforward specification of the GPU-related bsub options in your LSF script.

bsub < test.lsf

where test.lsf is something like:

# select the gpu queue, an A100 node, and 1 GPU card
#BSUB -q gpu
#BSUB -R a100
#BSUB -R rusage[ngpus_excl_p=1]
# 1 core, 4-minute wall-clock limit, project account, output file (overwritten)
#BSUB -n 1
#BSUB -W 4
#BSUB -P acc_xxx
#BSUB -oo test.out

# load the CUDA and cuDNN modules
ml cuda
ml cudnn

echo "salve mundi"
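
After submitting, you can check the job's status with LSF's bjobs command and review test.out once the job completes:

bjobs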


Accessing the Local SSD on the A100 GPU Nodes

To take advantage of the local 1.8 TB SSD, request the resource using the rusage specification, for example:

-R "rusage[ssd_gb=1000]"

This example will allocate 1000 GB of dedicated SSD space to your job.

We advise keeping your ssd_gb request at or below 1500 (i.e., 1.5 TB).

The symbolic link /ssd points to the SSD storage. You can specify /ssd in your job script and direct your temporary files there. At the end of your job script, please remember to clean up your temporary files.
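
For example, a job script might stage its temporary files like this (a minimal sketch; the per-job directory layout is illustrative, and $LSB_JOBID is the job ID set by LSF):

# create a per-job scratch directory on the local SSD
export TMPDIR=/ssd/$LSB_JOBID
mkdir -p $TMPDIR

# ... run your program, directing its temporary files to $TMPDIR ...

# clean up before the job ends
rm -rf $TMPDIR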


Further Information:

  • CUDA programming: http://www.nvidia.com/object/What-is-GPU-Computing.html
  • Available applications: http://www.nvidia.com/docs/IO/123576/nv-applications-catalog-lowres.pdf