Alumni

Former Ma’ayan Lab PhD students, postdoctoral fellows, bioinformaticians, and computational systems biologists have authored or co-authored publications in leading journals, including Nature, Science, PNAS, Nature Genetics, Nature Medicine, Bioinformatics, and Science Signaling. Ma’ayan Lab alumni have gone on to leadership positions in industry and have also been recruited to academic positions. Several of our bioinformatics and computational systems biology research trainees have entered PhD programs at Icahn School of Medicine at Mount Sinai, Princeton, MIT, Columbia, Duke, Cornell, and University of Pennsylvania.

MD/PhD Student

Alexandra Keenan, PhD
Training Period in Lab: 2017-2019
Thesis: Inferring Transcriptional Regulation by Data Integration

PhD Students

Zichen Wang, PhD
Training Period in Lab: 2013-2016
Thesis: Methods for Collecting, Improving and Exploring Mammalian Gene Expression Signatures

Qiaonan Duan, PhD
Training Period in Lab: 2012-2016
Thesis: Harnessing the LINCS L1000 Data for Drug Discovery and Drug Response Analysis

Yan Kou, PhD
Training Period in Lab: 2011-2015
Thesis: Big Data Analytics for Understanding Mechanism of Human Disease

Huilei Xu, PhD
Training Period in Lab: 2009-2012
Thesis: Analysis of Transcriptional Networks in mESCs

Master’s Student

Eryk Kropiwnicki, MS
Training Period in Lab: 2019-2020

Postdoctoral Fellows

Minji Jeon, PhD
Training Period in Lab: 2020-2022

Kathleen Jagodnik, PhD
Training Period in Lab: 2016-2021

Nicolas Fernandez, PhD
Training Period in Lab: 2013-2017

Andrew Rouillard, PhD
Training Period in Lab: 2013-2016
Research Focus: Inferring Cell Signaling Pathways from LINCS

Neil Clark, PhD
Training Period in Lab: 2011-2015
Research Focus: Statistical Methods for Network Analysis in Systems Biology

Ben MacArthur, PhD
Training Period in Lab: 2008-2011
Research Focus: Systems Biology of mESCs

Amin Mazloom, PhD
Training Period in Lab: 2008-2011
Research Focus: Network Analysis of NURSA HT IP-MS Proteomics

Bioinformaticians, Software Engineers, and Database Developers

Juncheng Pan, BS
Training Period in Lab: 2022

Eryk Kropiwnicki, MS
Training Period in Lab: 2020-2022

Megan Wojciechowicz, MS
Training Period in Lab: 2017-2019
Current Position: PhD Student, Icahn School of Medicine at Mount Sinai

Denis Torre, BS
Training Period in Lab: 2016-2019
Current Position: PhD Student, Icahn School of Medicine at Mount Sinai

Shai Magidi, MEng
Training Period in Lab: 2016-2017

Anders Dohlman, BA
Training Period in Lab: 2015-2017

Gregory Gundersen, MA
Training Period in Lab: 2014-2016
Current Position: PhD Student, Princeton University

Michael McDermott, BS
Training Period in Lab: 2014-2016

Matthew Jones, BS
Training Period in Lab: 2014-2015

Edward Chen, MS
Training Period in Lab: 2010-2013
Research Focus: Enrichr – Interactive and Collaborative HTML5 Gene List Enrichment Analysis Tool

Ruth Dannenfelser, MSE
Training Period in Lab: 2010-2012
Research Focus: Software Tools for Systems Biology

Simon Gordonov, PhD
Training Period in Lab: 2011-2012
Research Focus: Software Tools for Systems Biology

Christopher Tan, MD
Training Period in Lab: 2011-2012
Research Focus: Software Tools for Systems Biology

Caroline Baroukh, PhD
Training Period in Lab: 2010-2011
Research Focus: Embryonic Stem Cells Atlas of Pluripotency Evidence (ESCAPE) Database

Undergraduate and Post-bac Research Trainees

Clara Chen
Training Period in Lab: Summer 2022
Statistics, Harvard University

Sophia Colmenares
Training Period in Lab: Summer 2022
Chemical and Biological Engineering, Princeton University

Eden Deng
Training Period in Lab: Summer 2022
Statistics, Computer Science, Duke University

Lauren Druz
Training Period in Lab: Summer 2022
Biology and Society, Cornell University

Reid Fleishman
Training Period in Lab: Summer 2022
Computer Science, Cornell University

Sophie Goldman
Training Period in Lab: Summer 2022
Computer Science and Information Science, University of Pennsylvania

Jason Han
Training Period in Lab: Summer 2022
Computer Science, Rice University

Cole Heine
Training Period in Lab: Summer 2022
Biochemistry and Molecular Biology, Brown University

David Lewis
Training Period in Lab: Summer 2022
Computation and Cognition, Massachusetts Institute of Technology

Nhi Nguyen
Training Period in Lab: Summer 2022
Neuroscience, Dartmouth College

Hannah Qu
Training Period in Lab: Summer 2022
Biomedical Engineering, Johns Hopkins University

Derek Wang
Training Period in Lab: Summer 2022
Biological Sciences, Computer Science, Carnegie Mellon University

Eric Che
Training Period in Lab: Summer 2021
Computer Science and Pre-Med, Columbia University

Kerwin Chen
Training Period in Lab: Summer 2021
Biometry and Statistics, Cornell University

Sarah Fida
Training Period in Lab: Summer 2021
Computer Science and Biology, Williams College

Sophie Guo
Training Period in Lab: Summer 2021
Biological Engineering and Computer Science, Massachusetts Institute of Technology

Angelina Huang
Training Period in Lab: Summer 2021
Biomedical Engineering and Information Science, Cornell University

Marilyn Meyers
Training Period in Lab: Summer 2021
Electrical Engineering and Computer Science, Massachusetts Institute of Technology

Gabrielle Pile
Training Period in Lab: Summer 2021
Computer Science, Brandeis University

Helen Propson
Training Period in Lab: Summer 2021
Computer Science and Engineering, Massachusetts Institute of Technology

Emily Slobodenyuk
Training Period in Lab: Summer 2021
Biological Sciences and Computer Science, Carnegie Mellon University

Hunter Smith
Training Period in Lab: Summer 2021
Computer Science, University of Chicago

Alexander Tsai
Training Period in Lab: Summer 2021
Molecular Biology, Statistics and Machine Learning, Princeton University

Sophia Vincoff
Training Period in Lab: Summer 2021
Biomedical Engineering and Computer Science, Duke University

Cindy Xiong
Training Period in Lab: Summer 2021
Mathematical Sciences, Carnegie Mellon University

Tahmid Ahamed
Training Period in Lab: Summer 2020
Undergraduate, University of Pennsylvania
Project: Tracing Single Human and Mouse Gene Expression with Age

Samantha Bhuiyan
Training Period in Lab: Summer 2020
Undergraduate, Barnard College, Columbia University
Project: Functional Predictions for Undiagnosed Disease Network (UDN) Genes

Jessie Cheng
Training Period in Lab: Summer 2020
Undergraduate, Carnegie Mellon University
Project: Analysis and Visualization of SARS-CoV-2 Gene Expression Signatures

Charles Dai
Training Period in Lab: Summer 2020
Undergraduate, Cornell University
Project: Appyters to Extract, Transform, and Load (ETL) Data for the Harmonizome

Lucas Goldman
Training Period in Lab: Summer 2020
Undergraduate, Vanderbilt University
Project: L1000KD2: Analysis and Visualization of the LINCS L1000 shRNA Gene Knock-Down Dataset

Jason Hom
Training Period in Lab: Summer 2020
Undergraduate, University of Pennsylvania
Project: Gene Set Comparison Appyter

Julie Karam
Training Period in Lab: Summer 2020
Undergraduate, Brown University
Project: GeneBot and DrugBot – Slack Applications to Discover Knowledge about Genes and Drugs

Skylar Litz
Training Period in Lab: Summer 2020
Undergraduate, Harvey Mudd College
Project: Enrichr Visualization Appyters

Nicole Moiseyev
Training Period in Lab: Summer 2020
Undergraduate, Duke University
Project: An Appyter for the Visualization and Characterization of Cancer Patient Clusters based on RNA-Seq Profiles

Daniel Stein
Training Period in Lab: Summer 2020
Undergraduate, Massachusetts Institute of Technology
Project: Drugmonizome-ML: An Appyter for Predicting Drug Indications

Christine Yoon
Training Period in Lab: Summer 2020
Undergraduate, Duke University
Project: RNA-seq Co-Expression for Functional Prediction of Non-Coding Genes

Serena Zhang
Training Period in Lab: Summer 2020
Undergraduate, University of Pennsylvania
Project: The KINOMEscan and KEA3 Appyters

Zaire Bryant
Training Period in Lab: Summer 2019
Undergraduate, Rutgers University
Project: Global Analysis of Gene Sets and Signatures Associated with Senescence

Elizabeth Jung
Training Period in Lab: Summer 2019
Undergraduate, Cornell University
Project: FAIRification of NIH Common Fund Programs’ Datasets

Alessandra LaRocco
Training Period in Lab: Summer 2019
Undergraduate, Yale University
Project: Predicting Novel Targets and Drugs to Induce Autophagy

Damon Pham
Training Period in Lab: Summer 2017, Fall 2017, Summer 2019
Undergraduate, Indiana University Bloomington
Projects: Gencodr – Gene Set Enrichment Analysis with Deep Learning and Developing Novel Gene Set Enrichment Analysis Algorithms

Samuel Maltz
Training Period in Lab: Summer 2019
Undergraduate, The Cooper Union
Project: Gene Expression Imputations with the Latent Space of RNA-seq Data from ARCHS4

Kaeli Rizzo
Training Period in Lab: Summer 2019
Undergraduate, Georgia Institute of Technology
Project: Predicting the Function and Disease Associations of Under-Studied GPCRs, Ion Channels, and Protein Kinases

Allison Seiden
Training Period in Lab: Summer 2019
Undergraduate, Johns Hopkins University
Project: Analysis of Gene Sets Extracted from a GWAS of the UK Biobank

Ingrid Shu
Training Period in Lab: Summer 2019
Undergraduate, Amherst College
Project: KEA3: Kinase Enrichment Analysis Version 3

David Szanto
Training Period in Lab: Summer 2019
Undergraduate, Harvard University
Project: PALM2A – Mobile and Web App for Predicting Physiological Age from Lab Tests and Vital Signs

Justin Williams
Training Period in Lab: Summer 2019
Undergraduate, Cornell University
Projects: Drug Repurposing Strategies for Alzheimer’s Disease and PeerShake – A System to Transform the Scientific Review Process

Katherine Chew
Training Period in Lab: Summer 2018
Undergraduate, Massachusetts Institute of Technology
Project: Predicting Gene Signatures for Understudied Small Molecules

Zachary Flamholz
Training Period in Lab: Summer 2018
Undergraduate, Princeton University
Project: modEnrichr: Gene List Enrichment Analysis Tools for Fish, Fly, Worm, and Yeast
Current Position: MD/PhD Student, Albert Einstein College of Medicine

Hannah Freid
Training Period in Lab: Summer 2018
Undergraduate, Princeton University
Project: Integrative Analysis to Suggest Combination Therapies for Ovarian Cancer
Current Position: Medical Student, Icahn School of Medicine at Mount Sinai

Kevin Moses
Training Period in Lab: Summer 2018
Undergraduate, Yale University
Project: JOIN: The Journal of Interactive Notebooks

Vivek Sriram
Training Period in Lab: Summer 2018
Undergraduate, Duke University
Project: Predicting Diabetes Risk from Labs and Vitals with Machine Learning
Current Position: PhD Student, Genomics and Computational Biology, University of Pennsylvania Perelman School of Medicine

Vivian Utti
Training Period in Lab: Summer 2018
Undergraduate, Cornell University
Project: ChEA3: Transcription Factor Enrichment Analysis
Current Position: Medical Student, Icahn School of Medicine at Mount Sinai

Lily Wang
Training Period in Lab: Summer 2017, Fall 2018, Summer 2018
Undergraduate, Cornell University
Projects: FAIRshake: Digital Objects Evaluation System and FAIRshake: Toolkit to Assess the FAIRness of Biomedical Digital Objects
Current Position: PhD Student, Bioinformatics and Integrative Genomics, Harvard University

Sherry Xie
Training Period in Lab: Summer 2018
Undergraduate, Cornell University
Project: Standard API to Enable Semantic Integration of the Tools Developed by the Ma’ayan Lab
Current Position: Bioinformatician, Ma’ayan Laboratory, Icahn School of Medicine at Mount Sinai

Patrycja Krawczuk
Training Period in Lab: Summer 2017
Undergraduate, Hunter College
Project: Automated Indexing of Bioinformatics Tools
Current Position: PhD Student, Computer Science, University of Southern California

Marina Latif
Training Period in Lab: Summer 2017
Undergraduate, Princeton University
Project: Mining the Human Kinome

Joyce (Hyojin) Lee
Training Period in Lab: Summer 2017
Undergraduate, Princeton University
Project: Predicting Gene Function and PPIs with Co-Expression Data

Ariel Leong
Training Period in Lab: Summer 2017
Undergraduate, Stanford University
Project: ChEA3 – Improving Transcription Factor Enrichment Analysis by Data Integration

Christopher Tseng
Training Period in Lab: Summer 2017
Undergraduate, Emory University
Project: Analysis and Visualization of MCF10A Data for the LINCS Common Project
Current Position: Medical Student, Rutgers New Jersey Medical School

Charlotte Zuber
Training Period in Lab: Summer 2017
Undergraduate, Rutgers University
Project: Visualization of the Space of 200,000 Annotated Gene Sets

Esther Chen
Training Period in Lab: Summer 2016
Undergraduate, Cornell University
Project: Cite-D-Lite: Chrome Extension for Data and Paper Citations with Text Importance Highlighting
Current Position: Medical Student, SUNY Downstate Medical Center College of Medicine

Axel Feldmann
Training Period in Lab: Summer 2016
Undergraduate, Carnegie Mellon University
Project: X2K-Web: an Updated Web-based Version of the Expression2Kinases Pipeline
Current Position: PhD Student, Massachusetts Institute of Technology

Daniel Clarke
Training Period in Lab: Summer 2016
Undergraduate, Fairleigh Dickinson University
Projects: Adhesome 2016: An Updated Adhesome Site with Predictions of New Member Components and Genes2WordCloud: A Biology Oriented Interactive Word Cloud Generator
Current Position: Data Science Analyst II, Ma’ayan Laboratory, Icahn School of Medicine at Mount Sinai

Katie Lin, BS, MS
Training Period in Lab: Summer 2016
Undergraduate, Columbia University
Project: Visualization of the Multi-Layered Data from the LINCS MCF10A Dense Cube Project

Colette Malyack, BS, MS
Training Period in Lab: Summer 2016
Undergraduate, Fairleigh Dickinson University
Project: Predicting Experimental Platforms by Examining Gene-Set Content

Kevin Sani
Training Period in Lab: Summer 2016
Undergraduate, Harvard University
Project: Dr. Gene Budger: Web App to Predict Drugs to Modulate the Expression of a Specific Gene

Benjamin Kaplan
Training Period in Lab: Summer 2015
Undergraduate, Carnegie Mellon University
Project: Integrative Analysis and Visualization of Gene Expression Signatures toward the Repurposing of FDA Approved Drugs as Antiviral Medications

Emily Kuang
Training Period in Lab: Summer 2015
Undergraduate, New York City College of Technology
Project: Assessing the Dimensionality of MCF7 Cells Response to Perturbations

Azu Lee, MS
Training Period in Lab: Summer 2015
Undergraduate, University of Washington
Project: Interactive Mobile App Game for Deconvolution of Gene Set Modules from Gene Set Enrichment Analyses

Aditi Dandapani, PhD
Training Period in Lab: 2009-2010
Project: Dynamical Model of Viral DI Particles

Michael Komosinski, BS
Training Period in Lab: Summer 2011
Undergraduate, Colgate University
Project: Integrating, Predicting, and Visualizing Mammalian Protein-Protein Interaction Networks

John Zhuang, BS
Training Period in Lab: 2010
Undergraduate, Columbia University
Project: Regulatory Network Created from Loss of Function and Gain of Function Studies of Mouse Embryonic Stem Cells

Mariola Szenk, PhD
Training Period in Lab: 2009
Undergraduate, Hunter College
Project: PathwayGenerator2: Automated Visualization of Signaling Pathways using Flash and ActionScript 3

Visiting High School Students

Inko Bovenzi
Training Period in Lab: Summer 2019
Hunter College High School
Project: Crossing Gene Sets Across Libraries to Discover Novel Pathways

Derek Wang
Training Period in Lab: Summer 2018
East Brunswick High School
Project: GTEx2BioJupies: Jupyter Notebook Analysis Reports from Interactive Selection of GTEx Portal Samples
Current Position: Undergraduate, Carnegie Mellon University

Jennifer Lin
Training Period in Lab: Summer 2016
Oceanside High School
Project: Predicting Potential Drugs for Diabetic Nephropathy using L1000 Data

Mounica Kamesam
Training Period in Lab: Summer 2014 and 2015
Yorktown High School
Project: Automated Data Integration and Data Mining to Improve Breast Cancer Classification

Axel Feldmann
Training Period in Lab: Summer 2014 and 2016
Hunter College High School
Project: Hepatocellular Carcinoma Patient Classification with Enrichment Vectors
Current Position: PhD Student, Massachusetts Institute of Technology

Jayanath Krishnan
Training Period in Lab: Summer 2010
Mahopac High School
Project: Regulatory Signatures of Cancer Cell Lines Inferred from Expression Data

Visiting Medical Students

Andrew Warburton
Training Period in Lab: Summer 2019
Project: D2S2 – Collections of Disease and Drug Centric Gene Sets

Alexander Barash, MD
Training Period in Lab: 2007-2008
Research Focus: Systems Pharmacology

Ryan Logan Webb, MD
Training Period in Lab: 2007-2008
Research Focus: Software Tools for Systems Biology

Positions Available

The Ma’ayan Laboratory conducts multidisciplinary, NIH-funded research that uses Big Data analytics to develop a better understanding of drug action in human cells, build molecular regulatory networks from high-content genome-wide data, and predict optimized therapeutics for individual patients across several complex diseases.

Bioinformatics Software Engineer

Posted December 2023

A full-time position as a Bioinformatics Software Engineer is available in the Ma’ayan Laboratory of Computational Systems Biology and the Mount Sinai Center for Bioinformatics at the Icahn School of Medicine at Mount Sinai in New York.

What you’ll do:

The successful candidate will collaborate with an interdisciplinary team on developing, implementing, documenting, and maintaining web-based software applications used by the larger scientific community. As a member of our team, you will be expected to:

- Work independently to identify and define technical requirements for tasks and timelines
- Design, build, test, and deploy scalable bioinformatics web-based applications in a cloud environment
- Develop, document, and maintain version-controlled code
- Mock, develop, and enhance interactive UI designs
- Build Docker containers for various bioinformatics workflows
- Maintain and enhance efficient solutions to reproducible workflow orchestrations on the cloud and local HPC
- Author and manage technical documentation that concisely describes design and implementation details
- Manage, publish, and maintain code repositories (e.g., GitHub) and container repositories (e.g., DockerHub)
- Respond to new feature requests and assist with issues raised by the user base as needed
- Report project status regularly to the Principal Investigator

What you’ll bring:

- Bachelor’s or Master’s degree in Computer Science, Informatics, Mathematics, Statistics, Engineering or Biomedical Science
- Knowledge of open-source bioinformatics tools and workflows
- Experience working with high performance clusters and cloud technologies
- Experience developing web-based applications with front ends utilizing frameworks such as React, NodeJS, RShiny, Flask, or Dash
- Experience with building and orchestrating containers (Docker) using technologies like Kubernetes
- Extensive experience with Git or other version control systems
- Experience in more than one programming language such as Python, JavaScript, Java, C/C++, R
- Working knowledge of relational and non-relational databases
- Strong communication (written and verbal) and organizational skills

To apply, please e-mail your CV/resume and the contact information of three references to: sherry.jenkins@mssm.edu

Postdoctoral Fellow, Big Data Science and Computational Systems Biology

Posted December 2023

A full-time postdoctoral position is available in the Ma’ayan Laboratory of Computational Systems Biology and the Mount Sinai Center for Bioinformatics at the Icahn School of Medicine at Mount Sinai in New York.

What you’ll do:

The successful candidate will collaborate with an interdisciplinary team to develop tools and algorithms for the analysis, integration, and visualization of large scale biological omics datasets. The datasets include genomics, transcriptomics, epigenomics, proteomics, and metabolomics. In addition, the position involves the application of machine learning, including deep learning, to mining electronic medical records and combining such data with omics datasets.

What you’ll bring:

Candidates are required to have a recent PhD in Biomedical Science, Computer Science, Mathematics, Biostatistics, Statistics, Physics, or Engineering, and relevant experience with applications to biology.

- Experience with machine learning, multithread programming, and cloud computing
- Experience developing and deploying web-based and mobile apps
- Experience with bioinformatics research projects
- Knowledge of Python, R, Java, JavaScript, Node.js, MongoDB, MySQL, Docker

To apply, please e-mail your CV, research statement, and contact information of three references to: sherry.jenkins@mssm.edu

Bioinformatician II

Posted December 2023

A full-time position is available in the Ma’ayan Laboratory of Computational Systems Biology and the Mount Sinai Center for Bioinformatics at the Icahn School of Medicine at Mount Sinai in New York.

What you’ll do:

The successful candidate will collaborate with an interdisciplinary team on projects related to bioinformatics, big data science, and systems biology including developing, implementing, documenting and maintaining web-based software applications used by the larger scientific community. You will work on various aspects of research and infrastructure projects. Your work will include:

- Developing novel dynamic data visualizations
- Applying machine learning to identify patterns in large and complex datasets
- Harmonizing and abstracting data from a variety of sources
- Developing novel statistical mining strategies and algorithms
- Developing websites, databases, APIs and other data exchange protocols

What you’ll bring:

- Master’s degree in Computer Science, Informatics, Mathematics, Statistics, Physics, Engineering or Biological Sciences and a strong interest in working on data-intensive biomedical problems
- Experience with machine learning, multithread programming, and cloud computing
- Experience developing and deploying web-based and mobile apps
- Experience with bioinformatics research projects
- Knowledge of Python, R, Java, JavaScript, Node.js, MongoDB, MySQL, Docker
- Knowledge of molecular and cell biology

To apply, please e-mail your CV, research statement, and contact information of three references to: sherry.jenkins@mssm.edu

Bioinformatician I

Posted December 2023

What you’ll do:

- Developing novel dynamic data visualizations
- Applying machine learning to identify patterns in large and complex datasets
- Harmonizing and abstracting data from a variety of sources
- Developing novel statistical mining strategies and algorithms
- Developing websites, databases, APIs and other data exchange protocols

What you’ll bring:

- Bachelor’s degree in Computer Science, Informatics, Mathematics, Statistics, Physics, Engineering, or Biological Sciences and a strong interest in working on data-intensive biomedical problems
- Experience with machine learning, multithread programming, and cloud computing
- Experience developing and deploying web-based and mobile apps
- Experience with bioinformatics research projects
- Knowledge of Python, R, Java, JavaScript, Node.js, MongoDB, MySQL, Docker
- Knowledge of molecular and cell biology

To apply, please e-mail your CV, research statement, and contact information of three references to: sherry.jenkins@mssm.edu

Current Graduate Students

Posted September 2023

If you are interested in joining the lab as a graduate student, please email Dr. Ma’ayan at avi.maayan@mssm.edu. The Ma’ayan Laboratory accepts rotation students from all Multidisciplinary Training Areas (MTAs) within the ISMMS Graduate School of Biomedical Sciences.

Prospective Graduate Students

Posted September 2023

Prospective graduate students should apply to one of the programs at the ISMMS Graduate School of Biomedical Sciences.

Mount Sinai Health System is an equal opportunity/affirmative action employer. We recognize the power and importance of a diverse employee population and strongly encourage applicants with various experiences and backgrounds. Mount Sinai Health System – An EEO/AA-D/V Employer

News

Featured News

Researchers Characterize the Immune Landscape in Cancer
Mount Sinai Press Release

Icahn School of Medicine at Mount Sinai and the University of California San Diego Receive $8.5 Million Award to Establish a Data Integration Hub for NIH Common Fund Supported Program
Mount Sinai Press Release

AI Spotlight: Mapping Out Links Between Drugs and Birth Defects
Research Feature in Mount Sinai Today

Researchers Develop AI Model to Better Predict Which Drugs May Cause Birth Defects
Mount Sinai Press Release

Genes to Potentially Diagnose Long-Term Lyme Disease Identified
Mount Sinai Press Release

Mount Sinai Designated as National Cancer Institute Proteogenomics Data Analysis Center
Mount Sinai Press Release

Mount Sinai Lab Creates Shared Database to Help Scientists Find Drugs That Can Be Used to Treat COVID-19
Mount Sinai Today Newsletter

2020 Presentation Session Featuring the Research Projects of the Summer Fellows
Ma’ayan Lab and Mount Sinai Center for Bioinformatics

FAIR Your Data
Nature Methods

Ten Renowned Mount Sinai Faculty Members Honored at Convocation
Inside Mount Sinai

Mount Sinai Researchers Develop Software to Measure the Findability, Accessibility, Interoperability, and Reusability of Biomedical Digital Research Objects
Mount Sinai Press Release

2020 Summer Research Training Program in Biomedical Big Data Science
Ma’ayan Lab and Mount Sinai Center for Bioinformatics

Smoke Signals – Study Shows Path Linking Nicotine Addiction to Increased Risk for Diabetes
Nature via Twitter

2019 Presentation Session Featuring the Research Projects of the BD2K-LINCS Fellows
BD2K-LINCS Data Coordination and Integration Center

Mount Sinai Researchers Develop Tool that Analyzes Biomedical Data within Minutes
Mount Sinai Press Release

2018 Presentation Session Featuring the Research Projects of the BD2K-LINCS Fellows
BD2K-LINCS Data Coordination and Integration Center

Big Data, Networks Identify Cell Signaling Pathways in Lung Cancer
Medical Press

Mount Sinai Researchers Receive NIH Grant to Develop New Ways to Share and Reuse Research Data
Mount Sinai Press Release

Students Harness Big Data to Help Solve Medical Challenges
ISMMS Fall 2017 Dean’s Report

BD2K Centers Open Doors to Discovery
Biomedical Computation Review

Gene Expression’s Big Rethink
GEN

Crowdsourcing for Scientific Discovery: Mount Sinai Researchers Find Novel Ways to Analyze Data for Drug and Target Discovery
Mount Sinai Press Release

twoXAR Collaborates with Researchers at Mount Sinai to Advance New Medicines for Diabetic Nephropathy
Business Wire

Back on the Road with Coursera
ASBMB Today

Genetics: Big Hopes for Big Data
Nature | Outlook

Center to Seek New Therapeutics by Integrating Gene, Protein Databases
Mount Sinai Press Release

Systems Pharmacology Approaches for Drug and Cancer Research
Podcast

Society of Toxicology 2013 Annual Meeting
Drug Discovery News

New Computational Method to Help Organize Scientific Data
News-Medical.net

Mount Sinai Researchers Develop New Computational Method to Find Novel Connections from Gene to Gene, Drug to Drug and Between Scientists
Science Daily

Mutations in 3 Genes Linked to Autism Spectrum Disorders
Newswise.com

HIPK2 Regulator Protein Plays a Crucial Role in Kidney Fibrosis
News-Medical.net

Researchers Discover Drug Target for Kidney Failure
Mount Sinai Press Release

Recovering Protein-Protein and Domain-Domain Interactions from Aggregation of IP-MS Proteomics of Coregulator Complexes
Mount Sinai Press Release

Expression2Kinases: mRNA Profiling Linked to Multiple Upstream Regulatory Layers
Mount Sinai Press Release

Mount Sinai Researchers Develop New Computational Method to Aid Analysis of Gene Expression Experiments
Mount Sinai Press Release

Systematic Tracking of Cell Fate Changes
Nature Biotechnology

Stem Cells, Systems Biology and Human Feedback
Nature Reports Stem Cells

Courses

Big Data MOOCs on Coursera

Avi Ma’ayan, PhD, is the course director for two massive open online courses (MOOCs) on the Coursera platform. As of March 2023, these two MOOCs have had over 267,600 unique visitors and a combined total of over 25,000 enrolled students.

Big Data Science with the BD2K-LINCS Data Coordination and Integration Center
The BD2K-LINCS Data Coordination and Integration Center (DCIC) was commissioned to organize, analyze, visualize and integrate LINCS data with other publicly available relevant resources. In this course, we introduce the various Centers that collect data for LINCS, describing the experimental data procedures and the various data types. We will then cover the design and collection of metadata and how metadata is linked to ontologies. Additionally, basic data processing and data normalization methods to clean and harmonize LINCS data will be presented. This will be followed by a discussion of how the data is served through RESTful APIs as JSON, for which we will cover concepts from client-server computing. Most importantly, the course will focus on various bioinformatics methods of analysis, including unsupervised clustering, gene-set enrichment analyses, Bayesian integration, network visualization, and supervised machine learning applications to LINCS data and other relevant Big Data from molecular biomedicine.

Network Analysis in Systems Biology
An introduction to data integration and statistical methods used in contemporary Systems Biology, Bioinformatics and Systems Pharmacology research. The course covers methods to process raw data from genome-wide mRNA expression studies (microarrays and RNA-seq) including data normalization, differential expression, clustering, enrichment analysis and network construction. The course contains practical tutorials for using tools and setting up pipelines, but it also covers the mathematics behind the methods applied within the tools. The course is most appropriate for beginning graduate students and advanced undergraduates majoring in fields such as biology, math, physics, chemistry, computer science, and biomedical and electrical engineering. The course should also be useful for researchers who encounter large datasets in their own research. The course presents software, apps and tools developed by the Ma’ayan Laboratory, as well as other freely available data analysis and visualization tools. The ultimate aim of the course is to enable participants to apply the methods presented to their own data and projects. For participants who do not work in the field, the course introduces the current research challenges in computational systems biology.

Big Data Courses at the Icahn School of Medicine at Mount Sinai

Avi Ma’ayan, PhD, is the course director for two graduate courses at the Icahn School of Medicine at Mount Sinai. One course is delivered in the Fall and the other in the Spring. The Fall course is focused on data mining and the Spring course on computer programming.

BSR 6806: Programming for Big Data Biomedicine (2024)
The course covers computational methodologies applied to analyze data in the broad fields of bioinformatics and big data science. Topics covered include RNA-seq and proteomics data analysis, Machine Learning, Deep Learning, Text Mining, Python and Jupyter Notebooks, Appyters, cloud computing, data visualization, network analysis, version control, and Knowledge Graphs. Students are required to complete small programming assignments throughout the course. The course uses Jupyter Notebooks and Appyters to run most tutorials. [YouTube playlist of course lectures]

Data Mining and Network Analysis
This course covers machine learning applications in systems biology, including unsupervised clustering and supervised learning; analysis of the topology of biological regulatory networks; and a survey of how these approaches are applied to study biological molecular networks. Papers that combine computational predictions with experimental validation are highlighted. The course also covers the use of software tools to analyze proteomics and genomics expression data collected by LINCS.

Research

The Ma’ayan Laboratory applies computational and mathematical methods to study the complexity of regulatory networks in mammalian cells. We apply machine learning and other statistical mining techniques to study how intracellular regulatory systems function as networks to control cellular processes such as differentiation, dedifferentiation, apoptosis and proliferation. We develop software systems to help experimental biologists form novel hypotheses from high-throughput data, while aiming to better understand the structure and function of regulatory networks in mammalian cellular and multi-cellular systems.

NIH-funded Centers

- Data Resource Center (DRC) for the Common Fund Data Ecosystem (CFDE) (2023-2028)
- Mount Sinai Proteogenomic Data Analysis Center (PGDAC) (2022-2027)
- ARCHS4 an Informatics Technology for Cancer Research (ITCR) Resource (2022-2027)
- Diabetes Data and Hypothesis Hub (D2H2) (2022-2025)
- Mount Sinai’s Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG) (2014-2023)
- BD2K-LINCS Data Coordination and Integration Center (DCIC) (2014-2022)

Largest and Most Diverse Collection of Annotated Gene Sets

Gene set enrichment analysis is central to many biological and biomedical projects that measure mRNA and protein expression at the whole-genome scale. Gene set enrichment analysis is typically limited to a few literature-based background knowledge libraries, such as those created from the Gene Ontology and from pathway databases such as KEGG, WikiPathways, and Reactome. We have demonstrated that enrichment analysis can be expanded to use data from many other biological domains. For developing the tools Enrichr, Enrichr-KG, Rummagene, Rummageo, kinase enrichment analysis (KEA), ChIP-seq enrichment analysis (ChEA), and Harmonizome, we have integrated data from many key biomedical resources into useful gene set libraries. These libraries better inform enrichment analyses from omics studies. So far, over 2 million unique users have used these bioinformatics software applications, with a current rate of ~4,000 unique users per day.
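As a rough illustration of what an enrichment test against a gene set library computes, the sketch below scores the overlap between an input gene list and each library gene set with a right-tailed hypergeometric test, then ranks the gene sets by p-value. This is a generic over-representation test and not a description of how any of the tools above are implemented. The function names, the toy library, and the background size are hypothetical, and the sketch assumes that all genes come from the same background of `background_size` genes.

```python
from math import comb


def enrichment_p_value(gene_list, gene_set, background_size):
    """Right-tailed hypergeometric p-value for the overlap of two gene sets.

    Assumes both sets are drawn from one background of `background_size` genes.
    """
    query = set(gene_list)
    annotated = set(gene_set)
    overlap = len(query & annotated)   # observed number of shared genes
    n = len(query)                     # size of the input list
    k = len(annotated)                 # size of the library gene set
    total = comb(background_size, n)   # number of ways to draw the input list
    # Probability of seeing an overlap at least as large as the observed one.
    tail = sum(
        comb(k, i) * comb(background_size - k, n - i)
        for i in range(overlap, min(n, k) + 1)
    )
    return tail / total


def rank_gene_sets(gene_list, library, background_size):
    """Return (gene set name, p-value) pairs sorted from most to least enriched."""
    scored = [
        (name, enrichment_p_value(gene_list, genes, background_size))
        for name, genes in library.items()
    ]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical toy library; real libraries contain thousands of gene sets.
toy_library = {
    "example_pathway_1": {"STAT3", "JAK1", "JAK2", "IL6"},
    "example_pathway_2": {"TP53", "MDM2", "CDKN1A", "BAX"},
}
toy_input = ["STAT3", "JAK2", "IL6", "MYC"]

for name, p_value in rank_gene_sets(toy_input, toy_library, background_size=20000):
    print(name, p_value)
```

In practice, p-values computed across many gene sets would also need a multiple-testing correction, which this sketch omits.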

Original Methods to Identify Differentially Expressed Genes, Perform Gene Set Enrichment Analyses, and Benchmark these Data Analysis Methods

One of the key statistical tasks in the field of transcriptomics is the identification of differentially expressed genes. We developed a multivariate method called the Characteristic Direction to better identify the “correct” differentially expressed genes. The Characteristic Direction method was extended to also perform improved enrichment analysis using a similar concept. Using a unique benchmarking strategy, we can objectively evaluate the Characteristic Direction method and many other leading methods for differential expression and enrichment analyses, such as limma, GSEA and DESeq.
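The text above does not give the formula behind the Characteristic Direction, so the sketch below only illustrates the general idea of a multivariate approach: rather than testing each gene separately, it computes a single direction in expression space that separates two groups of samples. It assumes a regularized linear discriminant direction, which is one common way to define such a direction, and it is restricted to two genes so that it runs with the standard library alone. The function name, the `shrinkage` parameter, and the toy data are hypothetical; this should not be read as the lab's implementation.

```python
from math import sqrt


def separating_direction_2d(control, treated, shrinkage=0.5):
    """Unit vector separating two sample groups measured on two genes.

    control, treated: lists of (gene_1, gene_2) expression pairs.
    shrinkage: weight used to blend the pooled covariance with a scaled
    identity matrix (an assumed regularization choice).
    """
    def center(samples):
        means = [sum(s[j] for s in samples) / len(samples) for j in range(2)]
        deviations = [(s[0] - means[0], s[1] - means[1]) for s in samples]
        return means, deviations

    mean_c, dev_c = center(control)
    mean_t, dev_t = center(treated)
    deviations = dev_c + dev_t
    dof = len(deviations) - 2  # two group means were estimated

    # Pooled within-group covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum(d[0] * d[0] for d in deviations) / dof
    syy = sum(d[1] * d[1] for d in deviations) / dof
    sxy = sum(d[0] * d[1] for d in deviations) / dof

    # Shrink the covariance toward a scaled identity to keep it invertible.
    target = (sxx + syy) / 2
    a = (1 - shrinkage) * sxx + shrinkage * target
    d = (1 - shrinkage) * syy + shrinkage * target
    b = (1 - shrinkage) * sxy

    # Multiply the inverse of [[a, b], [b, d]] by the difference of group means.
    det = a * d - b * b
    diff = (mean_t[0] - mean_c[0], mean_t[1] - mean_c[1])
    v = ((d * diff[0] - b * diff[1]) / det, (a * diff[1] - b * diff[0]) / det)

    norm = sqrt(v[0] ** 2 + v[1] ** 2)
    return (v[0] / norm, v[1] / norm)


# Toy data: gene 1 shifts between groups, gene 2 does not.
control_samples = [(1.0, 5.0), (2.0, 5.5), (1.5, 4.5)]
treated_samples = [(4.0, 5.0), (5.0, 5.5), (4.5, 4.5)]
print(separating_direction_2d(control_samples, treated_samples))
```

Genes can then be prioritized by the absolute size of their component in the returned vector. Because the covariance between genes enters the calculation, a gene's weight depends on how it co-varies with other genes, which is what distinguishes a multivariate approach from gene-by-gene tests.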

Translational Computational Research in Cancer and Kidney Disease

In collaboration with other experimental and computational biology laboratories, we have made great strides in the past several years in studying kidney disease, diabetes, HIV, and cancer. We have developed unique computational methods that led to the identification of potential targets and drugs for attenuating kidney fibrosis, diabetic kidney disease, and HIVAN. Our collaborative work also proposed treatment combinations for early-stage kidney disease intervention. These advances were made possible by applying the unique algorithms that we developed, which include Expression2Kinases, SigCom LINCS, and TargetRanger.

Innovative Bioinformatics Software Infrastructure

To lower the barrier of entry for bioinformaticians and to streamline the development of bioinformatics software applications, we developed Appyters. With Appyters, bioinformaticians can rapidly develop full-stack web-based bioinformatics applications from their Jupyter Notebooks. Currently, over 100 Appyters are available from the Appyters Catalog. For a CFDE Partnership project, our team developed the Playbook Workflow Builder, a platform that facilitates the visual dynamic construction of bioinformatics workflows. Alongside these efforts, we also created FAIRshake, a flexible framework for performing manual and automated evaluation of digital objects for adherence to defined community-established standards.

Team

Head of Laboratory

Avi Ma’ayan PhD
Mount Sinai Endowed Professor in Bioinformatics
Center Director, Mount Sinai Center for Bioinformatics
Professor, Department of Pharmacological Sciences
Professor, Department of Artificial Intelligence and Human Health
Principal Investigator, Data Resource Center for the Common Fund Data Ecosystem
Principal Investigator, Mount Sinai’s Proteogenomic Data Analysis Center
Principal Investigator, LymeMIND Data Coordination Center
Icahn School of Medicine at Mount Sinai
E-mail: avi.maayan@mssm.edu

Program Director

Sherry Jenkins, MS
Director, Summer Research Training Program in Biomedical Big Data Science
MS in Computer Science
Fairleigh Dickinson University
E-mail: sherry.jenkins@mssm.edu

Research Assistant Professor

Alexander Lachmann, PhD
PhD in Biomedical Informatics
Columbia University

Data Scientist

Daniel Clarke, MS
Data Science Analyst III
BS, MS in Computer Engineering and Electrical Engineering
Fairleigh Dickinson University

Bioinformaticians

John Erol Evangelista, MS
Bioinformatician II
BS, MS in Computer Science
University of the Philippines Diliman

Sherry Xie, BS
Bioinformatician II
BS in Computer Science
Cornell University

Systems Analyst

Heesu Kim, MBA, MS
Systems Analyst I
MS, Business Analytics, Hult International Business School
MBA, Hult International Business School
BS in Computer Science and Information Engineering, Inha University (South Korea)

Bioinformatics Software Engineers

Nasheath Ahmed, BS, AB
Bioinformatics Software Engineer
BS in Computer Science and AB in Biology
Brown University

Eden Deng, BS
Bioinformatics Software Engineer
BS in Statistical Science and BS in Computer Science
Duke University

Ido Diamant, BS
Bioinformatics Software Engineer
BS in Biomedical Engineering, Minor: Computer Science
University of Massachusetts Amherst

Giacomo Marino, ScB, AB
Bioinformatics Software Engineer
ScB in Neuroscience and AB in Computer Science
Brown University

Stephanie Olaiya, ScB
Bioinformatics Software Engineer
ScB in Computational Neuroscience
Brown University

2023 Undergraduate Research Trainees in our Biomedical Big Data Science Program

Alexandra Agris
Bioinformatics Trainee
Bioengineering (major), Computer Science (minor), Stanford University

Gaurvi Awasthi
Bioinformatics Trainee
Biology and Society (major), Biometry and Statistics (minor), Cornell University

Rhea Desai
Bioinformatics Trainee
Biomedical Engineering, Cornell University

Nooha Kawsar
Bioinformatics Trainee
Molecular Biology, Princeton University

Adam Lalani
Bioinformatics Trainee
Computer Science and Biology, Brown University

Hannah Lee
Bioinformatics Trainee
Computer Science (major), Computational Biology (minor), Duke University

Lauren Malek
Bioinformatics Trainee
Electrical and Computer Engineering, Princeton University

Jacob Mayourian
Bioinformatics Trainee
Computer Science (major), Data Science (minor), Cornell University

Osaiyekemwen Ruth Ogbemudia
Bioinformatics Trainee
Biomedical Engineering (major), Anthropology (minor), Cornell University

Shriya Rangaswamy
Bioinformatics Trainee
Biological Engineering (major), Mathematics (minor), Massachusetts Institute of Technology

Abigail Zaroff
Bioinformatics Trainee
Bioinformatics and Computational Biology, Duke University

Mason Zhang
Bioinformatics Trainee
Computer Science (major), Philosophy (minor), Brown University

Alumni

About Us

The Ma’ayan Laboratory develops computational and mathematical methods to study the complexity of regulatory networks in mammalian cells. We apply machine learning and other statistical mining techniques to study how intracellular regulatory systems function as networks to control cellular processes such as differentiation, dedifferentiation, apoptosis and proliferation. We develop software systems to help experimental biologists form novel hypotheses from high-throughput data, while aiming to better understand the structure and function of regulatory networks in mammalian cellular and multi-cellular systems.

Enrichr

Gene-List Enrichment Analysis Tool
An integrative web-based and mobile gene list enrichment analysis tool providing various types of visualization summaries of collective functions of gene lists.
PMID: 27141961

Harmonizome

Biological Knowledge Engine
Built on top of information about genes and proteins from 114 datasets, Harmonizome is a knowledge engine for a diverse set of integrated resources.
PMID: 27374120

ARCHS4

All RNA-seq and ChIP-seq Signature Search Space
Provides access to gene counts from HiSeq 2000 and HiSeq 2500 platforms for human and mouse experiments from GEO and SRA.
PMID: 29636450

L1000CDS2

L1000 Characteristic Direction Signature Search Engine
Queries gene expression signatures against the LINCS L1000 to identify and prioritize small molecules that can reverse or mimic the observed input expression pattern.
PMID: 28413689
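To make the idea of reversing or mimicking an expression pattern concrete, the sketch below ranks a small set of perturbation signatures against an input signature by cosine similarity: strongly negative scores suggest reversal and strongly positive scores suggest mimicry. This is only an illustration of the concept under assumed inputs, not the scoring used by the search engine itself. The function names, compound names, and values are hypothetical.

```python
from math import sqrt


def cosine_similarity(u, v):
    """Cosine similarity between two signatures given as {gene: value} dicts.

    Genes missing from one signature are treated as zero in that signature.
    """
    dot = sum(u[gene] * v[gene] for gene in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


def rank_signatures(query, signatures, mode="reverse"):
    """Rank perturbation signatures against a query signature.

    mode="reverse" lists the most anti-correlated signatures first;
    mode="mimic" lists the most correlated signatures first.
    """
    scored = [
        (name, cosine_similarity(query, signature))
        for name, signature in signatures.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=(mode == "mimic"))


# Hypothetical input signature and perturbation signatures.
query_signature = {"GENE1": 1.0, "GENE2": -1.0}
perturbation_signatures = {
    "compound_A": {"GENE1": -1.0, "GENE2": 1.0},  # opposite pattern
    "compound_B": {"GENE1": 1.0, "GENE2": -1.0},  # same pattern
    "compound_C": {"GENE1": 1.0, "GENE2": 1.0},   # unrelated pattern
}

print(rank_signatures(query_signature, perturbation_signatures, mode="reverse"))
print(rank_signatures(query_signature, perturbation_signatures, mode="mimic"))
```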

L1000FWD

Large-scale Visualization of Drug-induced Transcriptomic Signatures
L1000 fireworks display (L1000FWD) is a web application that provides interactive visualization of over 16,000 drug and small-molecule induced gene expression signatures.
PMID: 29420694

BioJupies

Automated Generation of Interactive Notebooks for RNA-seq Data Analysis in the Cloud
Web server that enables automated creation, storage, and deployment of Jupyter Notebooks containing RNA-seq data analyses.
PMID: 30447998

Geneshot

Submit Biomedical Terms to Receive Ranked Lists of Relevant Genes
Enables researchers to enter arbitrary search terms and receive ranked lists of genes relevant to those terms. The returned gene lists contain genes that were previously published in association with the search terms, as well as genes predicted to be associated with the terms based on data integration from multiple sources. The search results are presented with interactive visualizations.
PMID: 31114885

ChEA3

ChIP-X Enrichment Analysis Version 3
A transcription factor enrichment analysis tool that ranks TFs associated with user-submitted gene sets. The ChEA3 background database contains a collection of gene set libraries generated from multiple sources including TF-gene co-expression from RNA-seq studies, TF-target associations from ChIP-seq experiments, and TF-gene co-occurrence computed from crowd-submitted gene lists.
PMID: 31114921

modEnrichr

A Suite of Gene Set Enrichment Analysis Tools for Model Organisms
An expansion of Enrichr for four model organisms: fish, fly, worm and yeast. The modEnrichr suite of tools provides the ability to convert gene lists across species using an ortholog conversion tool that automatically detects the species.
PMID: 31069376

eXpression2Kinases Web

Linking Expression Signatures to Upstream Cell Signaling Networks
Computationally predicts involvement of upstream cell signaling pathways, given a signature of differentially expressed genes.
PMID: 29800326

For a complete list of our software tools, databases and datasets, please visit our Resources page. We apply these and other computational methods in a variety of collaborative projects. The results from our analyses produce concrete suggestions and predictions for further functional experiments. The predictions are tested by our collaborators, and our analysis methods are delivered as software tools and databases for the systems biology research community.