Software

BayesDeBulk http://www.bayesdebulk.com

To improve immunotherapy response, a crucial step is to study the immune/stromal cell composition of the tumor microenvironment. Deconvolution analyses based on transcriptomic data have been used to estimate cell composition from bulk data. However, no existing deconvolution algorithm is well suited to proteomic data, which hinders this line of investigation in fast-growing proteogenomic research. To bridge this gap, we propose BayesDeBulk, a flexible deconvolution method that can estimate immune/stromal composition from gene expression as well as proteomic profiles. A challenge in deconvolution based on proteomic data is that abundance levels are generally not comparable across proteomic datasets, which makes reference-based deconvolution methods unsuitable. To overcome this problem, we propose a novel reference-free Bayesian model. The idea is to leverage known cell-type-specific markers without using their absolute abundance levels from purified cells. Specifically, BayesDeBulk imposes a novel prior on the mean abundance of cell-type-specific markers to ensure that these markers are upregulated in the corresponding component. This flexible framework enables BayesDeBulk to jointly model multiple types of omics data, such as proteomic and transcriptomic data, measured on the same set of samples.
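The intuition behind using marker upregulation rather than absolute reference levels can be illustrated with a small simulation. This is only a sketch under stated assumptions, not BayesDeBulk's Bayesian inference: the linear mixing model, the marker layout, the noise level, and all variable names (`means`, `marker_of`, `marker_score`) are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_samples, n_types = 200, 3
markers_per_type = 5
n_genes = n_types * markers_per_type

# Hypothetical cell-type mean abundances (genes x cell types): baseline 1.0,
# with each marker gene raised to 4.0 in the single cell type it marks.
means = np.ones((n_genes, n_types))
marker_of = np.repeat(np.arange(n_types), markers_per_type)  # gene -> cell type
means[np.arange(n_genes), marker_of] = 4.0

# True cell-type fractions per sample (each row sums to 1).
fractions = rng.dirichlet(np.ones(n_types), size=n_samples)

# Assumed bulk model: fraction-weighted sum of cell-type means, plus noise.
bulk = fractions @ means.T + rng.normal(scale=0.1, size=(n_samples, n_genes))

# An arbitrary per-gene rescaling mimics abundance levels that are not
# comparable across datasets; absolute values are no longer meaningful.
scale = rng.uniform(0.5, 20.0, size=n_genes)
bulk_rescaled = bulk * scale

# Standardize each gene, then average the markers of each cell type.
z = (bulk_rescaled - bulk_rescaled.mean(axis=0)) / bulk_rescaled.std(axis=0)
marker_score = np.stack(
    [z[:, marker_of == k].mean(axis=1) for k in range(n_types)], axis=1
)

# Correlation between each marker score and the true fraction of that cell type.
corr = [
    np.corrcoef(marker_score[:, k], fractions[:, k])[0, 1] for k in range(n_types)
]
print(np.round(corr, 3))
```

In this toy setting the marker scores track the true fractions even after rescaling, which is why knowing which markers are upregulated in which cell type can be informative without a reference profile from purified cells.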

Deconprolute https://github.com/PNNL-CompBio/proteomicsTumorDeconv

The goal of this package is to provide various metrics for assessing how well different deconvolution algorithms estimate the fractions of different cell types from bulk proteogenomic data. The package is fully Dockerized and can be run once Docker and a CWL-compliant tool are installed. We employed a modular architecture to enable ‘plug and play’ comparisons of different datasets and tools.
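The text does not say which metrics the package computes. As an illustration only, the sketch below shows two measures commonly used to compare estimated and known cell-type fractions: per-cell-type Pearson correlation and root-mean-square error. The function names and the toy data are hypothetical and are not taken from the package.

```python
import numpy as np


def per_cell_type_correlation(estimated, truth):
    """Pearson correlation between estimated and true fractions, per cell type.

    Both inputs are (samples x cell types) arrays.
    """
    estimated = np.asarray(estimated, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return np.array(
        [
            np.corrcoef(estimated[:, k], truth[:, k])[0, 1]
            for k in range(truth.shape[1])
        ]
    )


def rmse(estimated, truth):
    """Root-mean-square error over all samples and cell types."""
    diff = np.asarray(estimated, dtype=float) - np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean(diff**2)))


# Toy ground truth: 4 samples x 3 cell types, each row sums to 1.
truth = np.array(
    [
        [0.6, 0.3, 0.1],
        [0.2, 0.5, 0.3],
        [0.1, 0.2, 0.7],
        [0.4, 0.4, 0.2],
    ]
)

# A hypothetical estimate that is off by a constant 0.1 everywhere.
shifted = truth + 0.1

print(per_cell_type_correlation(shifted, truth))
print(rmse(shifted, truth))
```

The two measures are complementary: correlation is unaffected by a constant offset, so it rewards an algorithm that ranks samples correctly, while RMSE penalizes any deviation from the true fractions.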
R Package: TSNet
Tumor tissue samples often contain an unknown fraction of immune and stromal cells. This problem, widely known as tumor purity heterogeneity (TPH), was recently recognized as a severe issue in bulk data analysis. Specifically, if TPH is ignored when inferring co-expression networks, edges are likely to be estimated between genes whose mean expression shifts between non-tumor and tumor cells rather than between gene pairs that interact with each other in tumor cells. To address this issue, we propose Tumor Specific Net (TSNet), a new method that constructs tumor-cell-specific gene/protein co-expression networks from gene/protein expression profiles of tumor tissues.

TSNet https://github.com/petraf01/TSNet

Tumor tissue samples often contain an unknown fraction of stromal cells. This problem, widely known as tumor purity heterogeneity (TPH), was recently recognized as a severe issue in omics studies. Specifically, if TPH is ignored when inferring co-expression networks, edges are likely to be estimated between genes whose mean expression shifts between non-tumor and tumor cells rather than between gene pairs that interact with each other in tumor cells. To address this issue, we propose Tumor Specific Net (TSNet), a new method that constructs tumor-cell-specific gene/protein co-expression networks from gene/protein expression profiles of tumor tissues. TSNet treats the observed expression profile as a mixture of expression from different cell types and explicitly models the tumor purity percentage of each tumor sample.
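The effect described above can be reproduced in a small simulation. The sketch assumes a two-component mixture (tumor and non-tumor) with hypothetical means, variances, and purity range, and it uses a simple linear adjustment for purity in place of TSNet's own model fitting, so it illustrates the problem rather than the method.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000

# Tumor purity varies from sample to sample (hypothetical range).
purity = rng.uniform(0.2, 0.9, size=n)

# Two genes that are independent within each compartment, but whose mean
# expression is higher in tumor cells than in non-tumor cells.
tumor = rng.normal(loc=[5.0, 5.0], scale=0.5, size=(n, 2))
non_tumor = rng.normal(loc=[1.0, 1.0], scale=0.5, size=(n, 2))

# Observed bulk profile: purity-weighted mixture of the two compartments.
observed = purity[:, None] * tumor + (1.0 - purity[:, None]) * non_tumor

# Correlation within tumor cells (near zero by construction) versus the
# correlation seen in bulk data when purity is ignored.
r_tumor = np.corrcoef(tumor[:, 0], tumor[:, 1])[0, 1]
r_observed = np.corrcoef(observed[:, 0], observed[:, 1])[0, 1]

# Simplified adjustment (not TSNet itself): regress each gene on purity
# and correlate the residuals.
design = np.column_stack([np.ones(n), purity])
coef, *_ = np.linalg.lstsq(design, observed, rcond=None)
residual = observed - design @ coef
r_adjusted = np.corrcoef(residual[:, 0], residual[:, 1])[0, 1]

print(f"tumor-only: {r_tumor:.2f}")
print(f"bulk, purity ignored: {r_observed:.2f}")
print(f"bulk, purity adjusted: {r_adjusted:.2f}")
```

Ignoring purity produces a strong correlation between two genes that do not interact in tumor cells, purely because both shift with the tumor fraction. Accounting for purity removes this spurious edge in the toy example.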

R Package: JRF (available on CRAN)
Joint Random Forest (JRF) is a novel nonparametric algorithm to simultaneously estimate multiple co-expression networks by effectively borrowing information across different data types.

R Package: iRafNet (available on CRAN)
https://cran.r-project.org/web/packages/iRafNet/index.html
iRafNet is a flexible, unified integrative framework that allows information from heterogeneous data, such as protein–protein interactions, transcription factor (TF)-DNA binding, and gene knock-down, to be jointly considered for co-expression network inference.

PREDICTNet https://gumuslab.github.io/PREDICTNet/

This is the first integrative multi-omics study of serum data to investigate IBD initiation and progression up to diagnosis. Identifying key markers and pathways that drive disease initiation and expansion before diagnosis will allow a deeper understanding of the biology of the disease without the confounding effects of established and chronic inflammation. Based on longitudinal serum data, we will estimate networks capturing the associations across biomarkers at different times before diagnosis, for diseased participants and healthy controls, by integrating data from the PREDICTS cohort. We will publicly release all our networks and statistical associations through a web interface (PREDICTNet) that enables interactive exploration of the PREDICTS data, including time-varying trajectories, results from predictive models, and the associations across all biomarkers measured within this effort. This PREDICTS web portal will lower the barriers to exploring these high-dimensional PREDICTS network datasets and thereby accelerate our understanding of the preclinical phase of IBD.