Mount Sinai Designated as National Cancer Institute Proteogenomics Data Analysis Center

The Office of Cancer Clinical Proteomics Research at the National Cancer Institute (NCI) has awarded $4.2 million over five years to Mount Sinai researchers establishing a Proteogenomic Data Analysis Center (PGDAC) to advance cancer research and treatments. Proteogenomics, which studies the roles of proteins and genes within a cell or organism, is a crucial area of exploration to further our understanding of cancer.

The award supports the Center’s work to identify potential biomarkers and drug targets for cancer, new insights into cancer biology, and bioinformatics tools to enable more advanced exploration and discovery from cancer-related datasets.

The Center’s lead investigators, Pei Wang, PhD, Professor, Genetic and Genomic Sciences, and Avi Ma’ayan, PhD, Professor, Pharmacological Sciences, and Director of the Mount Sinai Center for Bioinformatics, will leverage their expertise in statistics/biostatistics, machine learning, data integration, systems pharmacology, and proteomics data modeling to further understand the proteogenomic complexity of tumors.

The centers of excellence are part of NCI’s Clinical Proteomic Tumor Analysis Consortium (CPTAC) program, a national effort to advance and accelerate cancer research through the application of proteogenomics. Mount Sinai is one of 14 CPTAC centers nationwide.

Data generated and tools developed by the CPTAC centers will be made available to the public to help further advance and accelerate cancer research.

For more information, read the National Cancer Institute announcement at: https://proteomics.cancer.gov/news_and_announcements/cptac-announces-new-pcc-pgdac-and-ptrc-teams?cid=eb_govdel

Source: https://www.eurekalert.org/news-releases/958717

Introducing BayesDeBulk: A Flexible Bayesian Algorithm for the Deconvolution of Bulk Tumor Data

Characterizing the tumor microenvironment is crucial to improving responsiveness to immunotherapy and developing new therapeutic strategies. The fraction of different cell-types in the tumor microenvironment can be estimated from transcriptomic profiling of bulk tumor data via deconvolution algorithms. One class of such algorithms, known as reference-based, relies on a reference signature containing gene expression data for various cell-types. The limitation of these methods is that such a signature is derived from the gene expression of pure cell-types, which might not be consistent with the transcriptomic profiles observed in solid tumors. Reference-free methods, on the other hand, usually require only a set of cell-specific markers to perform deconvolution; however, once the different components have been estimated from the data, labeling them can be problematic. To overcome these limitations, we propose BayesDeBulk, a new reference-free Bayesian method for bulk deconvolution based on gene expression data. Given a list of markers expressed in each cell-type (cell-specific markers), a repulsive prior is placed on the mean of gene expression in different cell-types to ensure that cell-specific markers are upregulated in a particular component. In contrast to existing reference-free methods, the labeling of the different components is decided a priori through this repulsive prior. Furthermore, the advantage over reference-based algorithms is that the cell fractions and the gene expression of the different cells are estimated from the data simultaneously. Given its flexibility, BayesDeBulk can be used to perform bulk deconvolution beyond transcriptomic data, based on other data types such as proteomic profiles or the integration of both transcriptomic and proteomic profiles.
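To make the idea of deconvolution concrete, the sketch below illustrates the simpler reference-based setting described above, not BayesDeBulk itself. It assumes that bulk expression is a weighted average of two known pure cell-type profiles and recovers the mixing fraction by least squares. The gene values, the cell-type names, and the function names are hypothetical and chosen only for illustration.

```python
def mix(fraction_a, profile_a, profile_b):
    """Simulate bulk expression as a weighted average of two cell-type profiles."""
    return [fraction_a * a + (1.0 - fraction_a) * b
            for a, b in zip(profile_a, profile_b)]


def estimate_fraction(bulk, profile_a, profile_b):
    """Least-squares estimate of the fraction of cell type A, clipped to [0, 1].

    Model (an assumption of this sketch): bulk_g = f * a_g + (1 - f) * b_g,
    so bulk_g - b_g = f * (a_g - b_g) for every gene g.
    """
    num = sum((y - b) * (a - b) for y, a, b in zip(bulk, profile_a, profile_b))
    den = sum((a - b) ** 2 for a, b in zip(profile_a, profile_b))
    if den == 0:
        raise ValueError("profiles are identical; the fraction is not identifiable")
    return min(1.0, max(0.0, num / den))


# Hypothetical mean expression of four genes in two pure cell types.
t_cell = [9.0, 8.0, 1.0, 2.0]
tumor_cell = [1.0, 2.0, 7.0, 9.0]

# Simulate a bulk sample with 30% T cells, then recover that fraction.
bulk = mix(0.3, t_cell, tumor_cell)
estimated = estimate_fraction(bulk, t_cell, tumor_cell)
print(round(estimated, 3))
```

The sketch depends entirely on the pure cell-type profiles being correct, which is the limitation of reference-based methods noted above; a reference-free approach would have to estimate those profiles as well.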

The BayesDeBulk web application can be visited at: https://calina01.u.hpc.mssm.edu/bayesdebulk/

For more information, please see the bioRxiv preprint at: https://www.biorxiv.org/content/10.1101/2021.06.25.449763v1.full

CPTAC Pancancer Phosphoproteomics Kinase Enrichment Analysis with ProKAP Provides Insights into Immunogenic Signaling Pathways

The National Cancer Institute (NCI) Clinical Proteomic Tumor Analysis Consortium (CPTAC) initiative has generated extensive phosphoproteomics and proteomics data for tumor and tumor-adjacent normal tissue across multiple cancer types. This dataset provides an unprecedented opportunity to systematically characterize pan-cancer kinase activities, which is essential for matching tumor subtypes with kinase inhibitors as potential treatments. In this work, we performed Kinase Enrichment Analysis (KEA) using a CPTAC phosphoproteomics dataset to identify putative differences in kinase state between tumor and normal tissues within and across five types of cancer. We then implemented an interactive web portal, the ProTrack Kinase Activity Portal (ProKAP), for querying, visualizing, and downloading the derived pan-cancer kinase activity scores together with the corresponding sample metadata and the protein and phosphoprotein expression profiles. To illustrate the use of this digital resource, we analyzed the association between kinase activity scores and immune subtypes of clear cell renal cell carcinoma (ccRCC) derived from the CPTAC ccRCC study. We found that multiple kinases whose inhibition has been suggested to have a therapeutic effect in other tumor types are highly active in CD8+-enriched ccRCC tumors.
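As a rough illustration of what an enrichment analysis computes, the sketch below scores one kinase with a generic hypergeometric over-representation test: given a set of differentially phosphorylated proteins, it asks how surprising the overlap with the kinase's known substrates is. This is an assumption made for illustration and not necessarily the statistic used by KEA or ProKAP. The protein identifiers, the substrate set, and the function names are all hypothetical.

```python
from math import comb


def hypergeom_upper_tail(overlap, set_size, query_size, universe_size):
    """P(X >= overlap) when query_size proteins are drawn without replacement
    from a universe in which set_size proteins are substrates of the kinase."""
    max_overlap = min(set_size, query_size)
    total = comb(universe_size, query_size)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, query_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


def enrichment_p_value(query, substrates, universe):
    """Over-representation p-value of a kinase's substrates within the query set."""
    universe = set(universe)
    query = set(query) & universe
    substrates = set(substrates) & universe
    overlap = len(query & substrates)
    return hypergeom_upper_tail(overlap, len(substrates), len(query), len(universe))


# Hypothetical universe of 20 measured proteins.
universe = [f"P{i}" for i in range(20)]
# Hypothetical substrates of one kinase.
kinase_substrates = ["P0", "P1", "P2", "P3", "P4"]
# Hypothetical proteins with increased phosphorylation in tumor vs. normal.
query = ["P0", "P1", "P2", "P10"]

p_value = enrichment_p_value(query, kinase_substrates, universe)
print(f"{p_value:.4f}")
```

In practice the test would be repeated for every kinase in a substrate library and the resulting p-values corrected for multiple testing.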

The ProKAP web application is available at: http://pancan-kea3.cptac-data-view.org/

For more information, visit the preprint on bioRxiv: https://www.biorxiv.org/content/10.1101/2021.11.05.450069v1

In the following video, Anna Calinawan presents ProKAP to members of the e-IDG symposium:

https://youtu.be/a0tbojE9DEU

Proteogenomic analysis provides new insights into pediatric brain tumors

In collaboration with the Clinical Proteomic Tumor Analysis Consortium (CPTAC) and Children’s Brain Tumor Tissue Consortium (CBTTC), we have published the first large-scale, comprehensive proteogenomic analysis of 218 tumors across seven histologic types of childhood brain cancer.

Integrated Proteogenomic Characterization across Major Histological Types of Pediatric Brain Cancer
Petralia et al. Cell. Dec 2020

Highlights
• Proteogenomics characterization of 218 pediatric brain tumor samples of 7 histologies
• Proteomic clusters reveal actionable biological features spanning histological boundaries
• Proteomics reveal downstream effects of DNA alterations not evident in transcriptomics
• Kinase activity analyses provide insights into pathway activities and druggable targets

The pediatric brain tumor data can be explored as an interactive heatmap with our ProTrack web tool at http://pbt.cptac-data-view.org/


Related press:
First large-scale proteogenomic analysis offers insights into pediatric brain tumors

This study is the first comprehensive survey of genomics (which aims to characterize DNA sequence alterations in a sample), transcriptomics (which aims to quantify copies of RNAs), global proteomics (which aims to identify and quantify proteins), and phosphoproteomics (which quantifies active proteins) across a large cohort of 218 tumor samples representing seven distinct types of brain tumors.

Study offers new clues for treating brain cancer in children

“All in all, this study is fantastic news for the children with brain tumors and their families,” said the University of Alabama’s Van Meir. “Some of the findings are immediately applicable in the clinic, and others reveal new directions for future therapy development.”

Inaugural CPTAC Scientific Symposium 2019

The Mount Sinai team joined over 300 proteomic scientists and CPTAC collaborators at the first CPTAC Scientific Symposium, hosted at the NIH Bethesda campus. We were pleased to present the current state of our findings across multiple cancer types, including clear cell renal cell carcinoma, endometrial carcinoma, and pediatric brain tumors. We also gave live demos of the various interactive web tools for data download and analysis developed by the team.

Read about the highlights of the visit on the NIH website.

Introducing the CPTAC Network Exploration Portal and CPTAC Data Browser

Starting with the CPTAC3 clear cell renal cell carcinoma (ccRCC) dataset, we have developed two interactive visual data exploration portals, the CPTAC Data Browser and the CPTAC Network Exploration Portal. These portals will greatly enhance the sharing of large-scale proteomics resources by allowing researchers to intuitively browse, query, and download data and analysis results from CPTAC projects.

CPTAC Network Exploration Portal
http://ccrcc.cptac-network-view.org/

Selim Kalayci1, 2, Zeynep H. Gümüş1, 2

CPTAC Data Analysis centers are increasingly utilizing network analyses to interpret the massive, multi-scale data that are being generated by the consortium. While there are several network exploration tools available in the field, to share and explore these integrated proteomic datasets and their associated metadata in a user-intuitive way, we need a simple, unified and custom interface that integrates multiple network data types.

With this web tool, users can interactively explore complex CPTAC networks and modules within these networks in 3D. 

Users can also:
(i) display peptides and clinical variables associated with each gene;
(ii) filter the networks based on phenotype;
(iii) highlight pathways enriched within the network.

CPTAC Data Browser
http://ccrcc.cptac-data-view.org

Anna Calinawan1, 2, Boris Reva1, 2

This web application provides comprehensive and granular visualization of CPTAC data as interactive heatmaps. The browser has been implemented for three tumor types: kidney cancer, lung adenocarcinoma, and pediatric brain tumors. While each portal has been tailored to the needs of the underlying data, all serve the same basic function: enabling users to specify genes of interest and generate an interactive visualization.

With this web tool, users can render and explore heatmaps with proteome, phosphoproteome, transcriptome, and methylome data.

Users can also:
(i) sort the entire heatmap, in ascending or descending order
(ii) customize which tracks are shown
(iii) export the image and download the data set in exactly the sample order that produced the pattern of interest
(iv) view additional underlying data for each point, including clinical information, links to histologic images, and gene descriptions

 

1 Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY
2 Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY

Job Opening

Postdoc in biostatistics/bioinformatics at Icahn School of Medicine at Mount Sinai

Description: The research group led by Dr. Pei Wang in the Department of Genetics and Genomic Sciences at Icahn School of Medicine at Mount Sinai (New York, NY) invites applications for a postdoctoral researcher in Biostatistics/Bioinformatics. The position will provide wonderful research opportunities to develop novel statistical/computational methods to solve cutting-edge problems in cancer genomics and digital health research. The postdoctoral researcher will work with the established interdisciplinary research team consisting of statisticians, bioinformaticians, and laboratory and clinical scientists. He/She will be able to participate in exciting projects from digital health (e.g. mobile health) studies, as well as cancer studies in NCI-CPTAC (Clinical Proteomic Tumor Analysis Consortium), the sister consortium of TCGA.

Duties and Responsibilities: The successful applicant will collaborate with quantitative researchers on methodology development and data analyses for large/complex data sets from omics technologies and/or digital health studies.

Position Qualifications: A Ph.D. in statistics, biostatistics, data science, or a related area is required. Relevant areas of expertise include statistical learning, machine/deep learning, etc. Some experience with applied techniques and real analysis is expected. Experience with programming languages such as R and/or C is required.

  1. A strong quantitative background in computing, with a demonstrated ability to design and implement algorithms in working code.
  2. Coursework and practical experience in applying statistical and/or data mining approaches to complex and/or high dimensional biological data.
  3. Programming experience in a UNIX/Linux environment using programming languages such as R.
  4. Excellent communication and teamwork skills to take advantage of the highly collaborative environment, adaptability and willingness to contribute to the overall goals of the research.

Salary Range: $60,000-$70,000 Annual

Benefits: Excellent benefit packages including insurance and housing.

http://icahn.mssm.edu/education/postdoctoral-training/about-us

Application Information: Applicants should send a curriculum vitae/resume and provide contact information for references. The posting will remain open until the position is filled.

Contact Email: pei.wang@mssm.edu