Big Data MOOCs on Coursera
Avi Ma’ayan, PhD, is the course director for two massive open online courses (MOOCs) on the Coursera platform. As of March 2016, more than 33,000 students had registered for these two courses and 195,000 video lectures had been viewed.
Big Data Science with the BD2K-LINCS Data Coordination and Integration Center
The BD2K-LINCS Data Coordination and Integration Center (DCIC) is commissioned to organize, analyze, visualize and integrate LINCS data with other publicly available relevant resources. In this course we will introduce the various Centers that collect data for LINCS, describing their experimental procedures and the data types they produce. We will then cover the design and collection of metadata and how metadata is linked to ontologies. Additionally, basic data processing and normalization methods to clean and harmonize LINCS data will be presented. This will be followed by a discussion of how the data is served through RESTful APIs as JSON, and for this we will cover concepts from client-server computing. Most importantly, the course will focus on various bioinformatics methods of analysis, including unsupervised clustering, gene-set enrichment analyses, Bayesian integration, network visualization, and supervised machine learning, applied to LINCS data and other relevant Big Data from molecular biomedicine.
Network Analysis in Systems Biology
This course is an introduction to data integration and statistical methods used in contemporary Systems Biology, Bioinformatics and Systems Pharmacology research. The course covers methods to process raw data from genome-wide mRNA expression studies (microarrays and RNA-seq), including data normalization, differential expression, clustering, enrichment analysis and network construction. The course contains practical tutorials for using tools and setting up pipelines, and it also covers the mathematics behind the methods applied within the tools. The course is most appropriate for beginning graduate students and advanced undergraduates majoring in fields such as biology, math, physics, chemistry, computer science, and biomedical and electrical engineering. The course should also be useful for researchers who encounter large datasets in their own research. The course presents software, apps and tools developed by the Ma’ayan Laboratory, as well as other freely available data analysis and visualization tools. The ultimate aim of the course is to enable participants to apply the methods presented to their own data and projects. For participants who do not work in the field, the course introduces the current research challenges in computational systems biology.
Big Data Courses at the Icahn School of Medicine at Mount Sinai
Avi Ma’ayan, PhD, is the course director for two graduate courses at the Icahn School of Medicine at Mount Sinai. One course is delivered in the Fall and the other in the Spring. The Fall course focuses on data mining and the Spring course on computer programming.
Programming for Big Data Biomedicine
This course covers computer programming methodologies applied to data processing and analysis in the broad fields of Bioinformatics and Systems Biology. Topics include an overview of data structures and algorithms, Python scripting for processing text files, cloud computing, version control systems, computational platforms such as Jupyter Notebooks, and database technologies such as MySQL. Students will be required to complete small programming assignments throughout the course. Spring 2019 Course Dates
BD2K-LINCS: Data Mining and Network Analysis
This course covers machine learning applications in systems biology, including unsupervised clustering and supervised learning; analysis of the topology of biological regulatory networks; and a survey of how these approaches are applied to study biological molecular networks. Papers that combine computational predictions with experimental validation are highlighted, as is the use of software tools to analyze proteomic and genomic expression data collected by LINCS.