The Mount Sinai Data Warehouse (MSDW) collects clinical and operational data for use in clinical and translational research, as well as quality improvement initiatives. Scientific Computing’s mission is to enable and accelerate biomedical discovery through collaborative research using high performance computing and data science. Led by Dean for Scientific Computing Patricia Kovatch, the Scientific Computing team provides tools, resources, and assistance to researchers across institutions around the globe.
The MSDW provides researchers access to data on more than 12 million unique patients in the Mount Sinai Health System (MSHS) Epic electronic health record (EHR), of whom nearly 5.5 million have had an encounter. In total, over 115 million patient encounters are recorded in Epic. The MSDW ecosystem is Epic- and OMOP-centric. Most of the data collected by the MSDW comes from the Epic Clarity and Caboodle databases, because Epic is the primary EHR across MSHS. Clinical data is extracted from these databases, transformed to the OMOP Common Data Model (CDM), and stored in the MSDW2 database, which is now located on the Minerva High Performance Computing cluster.
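As a rough illustration of what OMOP-shaped data looks like to a researcher, the sketch below counts unique patients, patients with at least one encounter, and total encounters. It is only a sketch under stated assumptions: it uses an in-memory SQLite database with a few made-up rows as a stand-in for MSDW2, whose actual connection details and schema extensions are not described here. The `person` and `visit_occurrence` tables and the `person_id` column follow the OMOP CDM naming; the function name `warehouse_counts` is hypothetical.

```python
import sqlite3

# Hypothetical in-memory stand-in for an OMOP CDM database. The rows are
# invented for illustration and do not come from MSDW2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (person_id INTEGER PRIMARY KEY, year_of_birth INTEGER);
    CREATE TABLE visit_occurrence (
        visit_occurrence_id INTEGER PRIMARY KEY,
        person_id INTEGER REFERENCES person(person_id),
        visit_start_date TEXT
    );
    INSERT INTO person VALUES (1, 1980), (2, 1975), (3, 1990);
    INSERT INTO visit_occurrence VALUES
        (10, 1, '2021-03-01'), (11, 1, '2022-07-15'), (12, 3, '2020-11-30');
""")


def warehouse_counts(conn):
    """Return (unique patients, patients with >= 1 encounter, total encounters)."""
    patients = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
    with_encounter, encounters = conn.execute(
        "SELECT COUNT(DISTINCT person_id), COUNT(*) FROM visit_occurrence"
    ).fetchone()
    return patients, with_encounter, encounters


print(warehouse_counts(conn))
```

In this toy data set, patient 2 has no visit rows, so the number of patients with an encounter is smaller than the number of unique patients, which mirrors the distinction drawn in the figures above.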
Click here to read more about data sources, transactions, and comprehensive descriptions of data in the Data Warehouse. MSDW is approved by the IRB.
Click here to learn more about data volume.
MSDW data is refreshed daily. MSDW’s core services include:
- Custom Queries: Work with MSDW analysts to compose custom SQL queries for research or QA. See the 10-step process.
- Leaf: Web-based, lightweight drag-and-drop query tool that quickly analyzes population demographics using ICD-10 codes. Launch Leaf
- Atlas: A web-based integrated platform for database exploration, standardized vocabulary browsing, cohort definition, and population-level analysis. Launch Atlas
- TriNetX: A global research network that optimizes clinical research. Launch TriNetX
Research Services enable you to collect and analyze your own data:
- eRAP: A 21 CFR Part 11 compliant, web-based interactive tool for data entry and reporting. We rapidly develop custom databases for your longitudinal single-site and multi-site studies. Launch eRAP
- REDCap: A secure web application for building and managing online surveys and databases. Launch REDCap
All of our systems and services are reviewed by IT Security and Compliance every year.
Minerva High Performance Computing Platform
The primary asset for Scientific Computing is the supercomputer Minerva. The HPC resource, upgraded in 2020, provides 14,304 Intel Xeon Platinum 8168 compute cores (24 cores per processor at 2.7 GHz; two sockets and 48 cores per node), for a total of 1.2 petaflops of compute power. It includes 286 nodes with 192 GB of memory per node, 65.7 terabytes of total memory, 350 terabytes of solid-state storage, and nearly 30 petabytes of spinning storage accessed via IBM’s Spectrum Scale/General Parallel File System (GPFS). Minerva has contributed to over 1,200 peer-reviewed publications in seven years. Read more about Minerva and learn how to connect to Minerva for computing jobs.
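The 1.2 petaflops figure is consistent with a simple theoretical-peak estimate from the core count and clock speed quoted above. The sketch below shows that arithmetic. The per-core throughput of 32 double-precision floating-point operations per cycle is an assumption (two AVX-512 fused multiply-add units, 8 doubles each, 2 operations per FMA) and is not stated in the text; sustained performance on real workloads would typically be lower.

```python
# Figures quoted in the text above.
CORES = 14_304
CLOCK_HZ = 2.7e9

# Assumption (not stated in the source): 32 double-precision FLOPs per core
# per cycle, i.e. 2 AVX-512 FMA units x 8 doubles x 2 operations per FMA.
FLOPS_PER_CYCLE = 32


def peak_petaflops(cores, clock_hz, flops_per_cycle):
    """Theoretical peak = cores x clock x FLOPs per cycle, in petaflops."""
    return cores * clock_hz * flops_per_cycle / 1e15


peak = peak_petaflops(CORES, CLOCK_HZ, FLOPS_PER_CYCLE)
print(f"{peak:.2f} PFLOPS")  # prints "1.24 PFLOPS"
```

Under these assumptions the estimate comes to about 1.24 petaflops, in line with the quoted total.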
The Mount Sinai Data Warehouse is supported by High Performance Computing. We partner with researchers to enable discovery, collaboration, research, and quality improvement.
Have a question, comment, or concern? Submit a ticket to let us know!