Hardware and Technical Specs

  • The Minerva supercomputer is maintained by Scientific Computing and Data (SCD) at the Icahn School of Medicine at Mount Sinai.
  • Minerva was launched in 2012 and has been upgraded several times, most recently in Nov. 2024; it now provides over 11 petaflops of compute power.
  • It comprises 24,912 compute cores spread across several Intel Xeon Platinum processor generations (2.3 GHz, 2.6 GHz, and 2.9 GHz, with 96, 64, or 48 cores per dual-socket node) and up to 1.5 TB of memory per node. In addition, Minerva includes 236 Nvidia H100 graphics processing units (GPUs), 32 Nvidia L40S GPUs, 40 Nvidia A100 GPUs, 48 Nvidia V100 GPUs, 440 terabytes of total memory, and 40 petabytes of spinning storage accessed via IBM’s Spectrum Scale/General Parallel File System (GPFS).
  • Minerva has contributed to over 1,800 peer-reviewed publications since 2012.

The following sections summarize the overall Minerva configuration:

Compute Nodes

Chimera Partition

Nodes added in Nov. 2024:
  • 4 login nodes – Intel Emerald Rapids 8568Y+, 2.3 GHz – 96 cores with 512 GB memory per node.
  • 146 compute nodes* – Intel Emerald Rapids 8568Y+, 2.3 GHz – 96 cores with 1.5 TB memory per node.
    • 14,016 cores in total.
  • 188 H100 GPUs in 47 nodes – Intel Emerald Rapids 8568Y+, 2.3 GHz – 96 cores with 1.5 TB memory per node.
    • 4x H100-80GB (SXM5) NVLinked GPUs per node.
  • 32 L40S GPUs in 4 nodes – AMD Genoa 9334, 2.7 GHz – 64 cores with 1.5 TB memory per node.
    • 8x L40S-48GB GPUs per node. The L40S does not support FP64.
  • 3.84 TB local NVMe SSD (3.5 TB usable) per node.
    • It delivers a sustained read/write speed of 3.5 GB/s, in contrast with SATA SSDs, which top out at about 600 MB/s.
  • NDR InfiniBand fat-tree fabric networking (400 Gb/s).
  • 6 service nodes.
  • 295.5 TB memory in total.
  • Direct water-cooling solution.
  • New NFS storage (for users’ home directories) – 140 TB usable.

Nodes purchased prior to 2024 and integrated into the new NDR network via HDR InfiniBand (100 Gb/s):
  • 33 high memory nodes – Intel 8268 24C, 2.9 GHz – 1.5 TB memory.
  • 48 V100 GPUs in 12 nodes – Intel 6142 16C, 2.6 GHz – 384 GB memory – 4x V100-16GB GPUs.
  • 32 A100 GPUs in 8 nodes – Intel 8268 24C, 2.9 GHz – 384 GB memory – 4x A100-40GB GPUs – 1.92 TB SSD (1.8 TB usable) per node.
  • 8 A100 GPUs in 2 nodes – Intel 8358 32C, 2.6 GHz – 2 TB memory – 4x A100-80GB GPUs.
    • The A100 GPUs are connected via NVLink.
    • 7.68 TB NVMe SSD (7.0 TB usable) per node.
  • 8 H100 GPUs in 2 nodes – Intel 8358 32C, 2.6 GHz – 0.5 TB memory – 4x H100-80GB GPUs.
    • 3.84 TB NVMe SSD (3.5 TB usable) per node.
  • [Decommissioned on Nov. 5th, 2024] 4 login nodes – Intel Xeon Platinum 8168 24C, 2.7 GHz – 384 GB memory.
  • [Decommissioned on July 17th and Nov. 5th, 2024] 275 compute nodes* – Intel 8168 24C, 2.7 GHz – 192 GB memory.
    • 13,152 cores (48 cores per node).
  • *Compute nodes are where you run your applications. Users do not have direct access to these machines; access is managed through the LSF job scheduler (see the example submission sketch after this list).
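
As an illustration of LSF-managed access, here is a minimal sketch of submitting a batch job by piping a job script to bsub on its standard input. It assumes it is run where LSF's bsub command is available (e.g., a login node); the project account and queue names are hypothetical placeholders, not values from this page.

```python
# Minimal sketch: submit a batch job to LSF by piping a job script to `bsub`.
# Assumptions: `bsub` is on PATH; the account and queue names are placeholders.
import subprocess

job_script = """#!/bin/bash
# Job name, project/allocation account, and queue (account and queue are placeholders).
#BSUB -J example_job
#BSUB -P acc_MyProject
#BSUB -q premium
# 4 cores, 1-hour wall-clock limit, a 4000 MB memory request, and an output log
# (%J expands to the LSF job ID).
#BSUB -n 4
#BSUB -W 01:00
#BSUB -R "rusage[mem=4000]"
#BSUB -o example_%J.log

echo "Running on $(hostname)"
"""

# Equivalent to the usual `bsub < job_script.lsf` submission pattern.
result = subprocess.run(
    ["bsub"],
    input=job_script,
    text=True,
    capture_output=True,
    check=True,
)
print(result.stdout.strip())  # e.g. "Job <12345> is submitted to queue <premium>."
```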

BODE2 Partition

[Decommissioned on July 17th and Nov. 5th, 2024]

$2M S10 BODE2 awarded by NIH (Kovatch, PI).

  • 3,744 cores in 78 nodes – Intel Cascade Lake 8268, 2.9 GHz – 48 cores per node.

CATS Partition

$2M CATS awarded by NIH (Kovatch, PI).

  • 3,520 cores in 55 nodes – Intel IceLake 8358, 2.6 GHz – 64 cores with 1.5 TB memory per node.
  • 82.5 TB memory (collectively).
  • Open to eligible NIH-funded projects.

Private Nodes

  • Purchased by private groups.
  • Hosted on Minerva.

Summary

  • Total system memory (compute + GPU nodes): 440 TB.
  • Total number of cores (compute + GPU nodes): 24,912.
  • CPU peak performance of all nodes: > 1.8 PFLOPS.
  • H100 peak performance based on FP64 Tensor Cores: 12.5 PFLOPS.
  • Maximum performance from an HPL LINPACK run: 7.9 PFLOPS.
  • Private nodes are not counted in these calculations.
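
As a rough cross-check, the advertised core count can be reproduced from the node counts listed above. The sketch below assumes the dual-socket per-node core counts given in the Chimera and CATS sections and excludes login and service nodes, the decommissioned BODE2 and Intel 8168 compute nodes, and private nodes.

```python
# Cross-check of the 24,912-core total using the node counts listed above.
# Excludes login/service nodes, decommissioned nodes, and private nodes.
chimera_cores = (
    146 * 96   # Emerald Rapids compute nodes (96 cores each)
    + 47 * 96  # H100 GPU nodes (Emerald Rapids hosts)
    + 4 * 64   # L40S GPU nodes (AMD Genoa hosts)
    + 33 * 48  # high-memory nodes (Intel 8268, 2 x 24 cores)
    + 12 * 32  # V100 GPU nodes (Intel 6142, 2 x 16 cores)
    + 8 * 48   # A100-40GB GPU nodes (Intel 8268, 2 x 24 cores)
    + 2 * 64   # A100-80GB GPU nodes (Intel 8358, 2 x 32 cores)
    + 2 * 64   # pre-2024 H100 GPU nodes (Intel 8358, 2 x 32 cores)
)
cats_cores = 55 * 64   # CATS partition (Intel IceLake 8358, 2 x 32 cores)

print(chimera_cores + cats_cores)  # 24912
```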

File System Storage

Minerva uses IBM’s General Parallel File System (GPFS) because it offers features that are specifically useful for informatics workflows: high-speed metadata access, tiered storage, and sub-block allocation. Metadata is the information about the data in the file system, and it is stored in flash memory for fast access. A parallel file system was chosen for Minerva because NFS and other file systems cannot scale to Minerva’s node count or provide adequate performance for the large number of files involved in typical genomics workflows.

Currently we have one parallel file system on Minerva, Arion, which users can access at /sc/arion. The Hydra file system was retired at the end of 2020.

GPFS Name   Lifetime   Storage Type        Raw PB   Usable PB
Arion       2019 –     Lenovo DSS          14       9.6
Arion       2019 –     Lenovo G201 flash   0.12     0.12
Arion       2020 –     Lenovo DSS          16       11.2
Arion       2021 –     Lenovo DSS          16       11.2
                       Total               46       32
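
For reference, the totals in the last row can be recovered by summing the individual building blocks; the listed figures of 46 PB raw and 32 PB usable are those sums rounded to whole petabytes. A small illustrative check:

```python
# Sum the per-building-block Arion capacities from the table above.
raw_pb    = [14, 0.12, 16, 16]       # raw PB per storage building block
usable_pb = [9.6, 0.12, 11.2, 11.2]  # usable PB per storage building block

print(round(sum(raw_pb), 2), round(sum(usable_pb), 2))  # 46.12 32.12
```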

Acknowledging Mount Sinai in Your Work

This work was supported by grant UL1TR004419 from the National Center for Advancing Translational Sciences, National Institutes of Health.

Using the S10 BODE2 and CATS Minerva partitions requires acknowledging NIH support in your publications. To assist, we have provided the exact wording required by NIH for acknowledging Minerva and NIH support in publications and other work.