About Data Ark

The Data Ark team downloads and organizes the data and performs quality assurance and quality control on them. The team also manages the data access process, answers questions about the data, and updates the data sets to their latest versions. The Data Ark is located on Minerva at /sc/arion/projects/data-ark/. This Mount Sinai data commons is guided by the FAIR principles [1]: making data more findable, accessible, interoperable and reusable. Data Ark includes both public (restricted and unrestricted) and Sinai-generated data sets.

The overarching goal of the Data Ark is to ensure that research data at Mount Sinai are managed, processed and combined in a way that optimizes the power, pace and relevance of our science.

  • Power: Scientists typically use only a tiny fraction of available data
  • Pace: Users will have rapid access to large, powerful research data sets
  • Relevance: Our diverse patient population is ideal for testing the generalizability of our results

Data Ark is an initiative led by Associate Professor Paul O’Reilly and Dean for Scientific Computing and Data Patricia Kovatch, and supported by the Department of Genetics and Genomic Sciences and Scientific Computing. An advisory board has been convened to provide guidance and to help Data Ark become sustainable over time.

We are supported by grant UL1TR004419 from the National Center for Advancing Translational Sciences, National Institutes of Health.


Access Data Ark

For Public Unrestricted data sets, you can simply access the following path on Minerva:


For any other data sets, you must read, agree to, and sign the Data Use Agreement specific to the requested data set. Once you have submitted the agreement, along with evidence of approved permission where the data set is public restricted-use data, the Data Ark team will grant access within two working days. You will receive email confirmation that access has been granted.

The Data Use Agreement is accessible only through the Mount Sinai campus network or secure remote VPN. Click here for the Data Use Agreement and choose the data set that you would like to access from the drop-down list. From there you can follow the link to view and agree to the specific Data Use Agreement. You will need to log in with your Sinai account and password, and you can choose only one data set at a time.

For more information and for all inquiries relating to the Data Ark, please email: data-ark-team@lists.mssm.edu, or join our Data Ark Slack channel at https://join.slack.com/t/data-ark/signup and sign up using your Mount Sinai credentials. You will be able to interact with the researchers and the Data Ark group right away!


Data Ark User Feedback

We ask Data Ark users for feedback on the features and availability of data sets, and we solicit recommendations for improvement over time. Here are some specific recommendations and comments from Data Ark users:


Data Ark Support Materials

Scientific Computing and Data hosts Data Ark Town Hall sessions that are open to current and prospective Data Ark users. Here are the session archives:


Data Sets

The Data Ark is located on Minerva and the number, type, and diversity of data sets on the Data Ark are increasing on an ongoing basis.


Onboarding Data Ark Data Sets

PIs must complete a REDCap form and name the expected research groups. The approval process depends on data set size:

  • ≤1 TB: approved by the Data Ark operations team
  • >1 TB: must be approved by the Data Ark Advisory Board
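
The size rule above can be sketched as a small helper, for example in a script that pre-screens onboarding requests. This is an illustrative sketch only, not part of the Data Ark tooling: the function name approver_for and the constant SIZE_THRESHOLD_TB are hypothetical, and the sketch assumes the data set size is already known in terabytes.

```python
# Hypothetical helper illustrating the onboarding approval rule.
# The 1 TB threshold comes from the policy text; all names are assumptions.

SIZE_THRESHOLD_TB = 1.0  # data sets at or below this size go to the operations team


def approver_for(size_tb: float) -> str:
    """Return which body approves a data set of the given size (in TB)."""
    if size_tb < 0:
        raise ValueError("data set size cannot be negative")
    if size_tb <= SIZE_THRESHOLD_TB:
        return "Data Ark operations team"
    return "Data Ark Advisory Board"


if __name__ == "__main__":
    for size in (0.25, 1.0, 3.5):
        print(f"{size} TB -> {approver_for(size)}")
```

Note that a data set of exactly 1 TB falls under the operations team, matching the "≤1 TB" wording of the policy.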

Data Retention period: The original data owner will receive usage reports every quarter and will be alerted when other researchers are not using their data sets. If usage is low, then the data sets will be removed from Data Ark. Usage is evaluated annually.

To read more information about the Data Ark Onboarding Policy, including data retention and contacts, please click the downloadable “Data Ark Onboarding/Offboarding Policy” PDF below.


Contact Data Ark Team

The Data Ark team manages the data, data access, and data updates. For all inquiries related to the Data Ark, especially to access or utilize data, please email: data-ark-team@lists.mssm.edu


Data Ark Slack Channel

Join our Data Ark Slack channel at https://join.slack.com/t/data-ark/signup and sign up using your Mount Sinai credentials. You will be able to interact with the researchers and the Data Ark group right away!