Scientific Computing and Data
Partnering with researchers to advance scientific discovery

Led by Dean for Scientific Computing Patricia Kovatch, the Scientific Computing team advances science by providing tools, resources, and assistance to researchers at institutions around the globe.
Advance your Science
Use our services to support your research, data, and computing needs.

High Performance Computing
The primary asset for Scientific Computing is the supercomputer, Minerva:
- Upgraded: November 2024.
- Computing power: >11 petaflops.
- 24,912 Intel Platinum processors of different generations
  - 2.3 GHz, 2.6 GHz, and 2.9 GHz computing cores with 1.5 terabytes (TB) of memory per node
  - 96, 64, or 48 cores per node, with two sockets in each node
- 356 graphics processing units (GPUs)
  - 236 Nvidia H100 GPUs, 32 Nvidia L40S GPUs, 40 Nvidia A100 GPUs, and 48 Nvidia V100 GPUs
- 440 TB of total memory
- 32 petabytes of spinning storage accessed via IBM’s Spectrum Scale/General Parallel File System (GPFS)
- Contributed to >1,900 peer-reviewed publications since 2012.

AIR·MS (Artificial Intelligence-Ready Mount Sinai)
AIR·MS is a cloud-based, multi-modal health data platform that integrates patient data generated from different clinical departments across Health Systems. At this time, the following datasets are offered on the platform:
Mount Sinai Data Warehouse (MSDW)
- Uses the OMOP Common Data Model.
- Clinical data extracted from Mount Sinai’s Epic Caboodle database.
- Both identifiable and de-identified versions of the MSDW dataset are offered in AIR·MS.
Clinical Notes
- Clinical notes in the form of unstructured data (progress notes, telephone encounters, nursing notes, procedures, etc.)
- In-memory full-text search (SAP HANA database management) lets researchers build patient cohorts from terms contained in unstructured reports in seconds or even milliseconds.
Computational Pathology
- The pathology metadata will aid researchers in the field of Computational Pathology.
- Researchers will be able to build patient cohorts across data modalities and then apply quantitative methods to analyze digital microscopy slides, relating the resulting statistical descriptors to patient outcomes.

Research Data Services
We partner with scientists to conduct their research via independent data collection, capture, and analysis.
- eRAP is a web-based interactive tool for data entry and reporting. Custom databases are rapidly developed for single- and multi-site longitudinal studies.
- REDCap* is a secure web application for building and managing online surveys and databases. It tracks data manipulation and user activity, and supports export procedures, scheduling, calendaring, and branching logic.
- Data Ark applies FAIR principles to provide regularly updated, high-quality data sets for reuse in research.
All applications are HIPAA compliant.
*Please note that our current build of REDCap is not 21 CFR Part 11 compliant.

Mount Sinai Data Warehouse
The Mount Sinai Data Warehouse gives researchers access to over 12 million patient records from the Mount Sinai Health System Epic Electronic Health Record (EHR).
- In total there are over 105 million patient encounters recorded in Epic
- The MSDW ecosystem is Epic and OMOP-centric
- Clinical data is extracted from the Epic Caboodle and Clarity databases and transformed to the OMOP Common Data Model (CDM)
- Data is located on the Minerva High Performance Computing cluster
Request your own custom data set or utilize self-service query tools to search the data.
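For researchers new to the OMOP Common Data Model, the sketch below illustrates the general shape of a cohort query against OMOP-style tables. It is a minimal illustration only, not the MSDW query tools or their interface: it builds a tiny in-memory SQLite database with simplified versions of the OMOP person and condition_occurrence tables, and the rows, the concept IDs, and the helper names build_toy_omop and cohort_for_concept are all hypothetical.

```python
import sqlite3


def build_toy_omop(conn):
    """Create simplified OMOP-style tables and load made-up rows.

    Only a few columns of the real OMOP CDM tables are included here,
    and every value is invented for illustration.
    """
    conn.executescript(
        """
        CREATE TABLE person (
            person_id INTEGER PRIMARY KEY,
            year_of_birth INTEGER
        );
        CREATE TABLE condition_occurrence (
            condition_occurrence_id INTEGER PRIMARY KEY,
            person_id INTEGER,
            condition_concept_id INTEGER,
            condition_start_date TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO person VALUES (?, ?)",
        [(1, 1950), (2, 1985), (3, 1972)],
    )
    conn.executemany(
        "INSERT INTO condition_occurrence VALUES (?, ?, ?, ?)",
        [
            (10, 1, 201826, "2020-01-15"),
            (11, 2, 316866, "2021-06-01"),
            (12, 3, 201826, "2019-03-09"),
            (13, 3, 316866, "2022-11-20"),
        ],
    )


def cohort_for_concept(conn, concept_id):
    """Return sorted person_ids with at least one matching condition record."""
    rows = conn.execute(
        """
        SELECT DISTINCT p.person_id
        FROM person AS p
        JOIN condition_occurrence AS co ON co.person_id = p.person_id
        WHERE co.condition_concept_id = ?
        ORDER BY p.person_id
        """,
        (concept_id,),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_toy_omop(conn)
    # Concept ID 201826 is used here purely as an illustrative value.
    print(cohort_for_concept(conn, 201826))
```

In practice, a cohort definition would usually combine several such criteria (conditions, drugs, measurements, date ranges); the join on person_id shown here is the common building block.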
Supported by grant UL1TR004419 from the National Center for Advancing Translational Sciences, National Institutes of Health.