Introducing BayesDeBulk: A Flexible Bayesian Algorithm for the Deconvolution of Bulk Tumor Data

Characterizing the tumor microenvironment is crucial for improving responsiveness to immunotherapy and developing new therapeutic strategies. The fractions of different cell types in the tumor microenvironment can be estimated from transcriptomic profiles of bulk tumor data using deconvolution algorithms. One class of such algorithms, known as reference-based, relies on a reference signature containing gene expression data for various cell types. The limitation of these methods is that the signature is derived from the gene expression of pure cell types, which might not be consistent with the transcriptomic profiles observed in solid tumors. Reference-free methods, on the other hand, usually require only a set of cell-specific markers to perform deconvolution; however, once the different components have been estimated from the data, labeling them can be problematic. To overcome these limitations, we propose BayesDeBulk, a new reference-free Bayesian method for bulk deconvolution based on gene expression data. Given a list of markers expressed in each cell type (cell-specific markers), a repulsive prior is placed on the mean gene expression in different cell types to ensure that cell-specific markers are upregulated in a particular component. Unlike existing reference-free methods, BayesDeBulk fixes the labeling of the components a priori through this repulsive prior. Furthermore, its advantage over reference-based algorithms is that the cell fractions and the gene expression of the different cell types are estimated from the data simultaneously. Given its flexibility, BayesDeBulk can also be used to perform bulk deconvolution beyond transcriptomic data, based on other data types such as proteomic profiles or on the integration of transcriptomic and proteomic profiles.
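To make the setup above concrete, here is a minimal Python sketch of the mixture view behind bulk deconvolution: bulk expression is treated as a fraction-weighted sum of cell-type mean expression, and each marker is expected to be higher in its own cell type than in the others. This is not the BayesDeBulk implementation and does not perform any Bayesian inference. The marker list, the expression values, the fractions, and the function names (simulate_means, mix, count_violations) are all assumptions chosen for illustration.

```python
# Illustrative marker list (an assumption, not taken from BayesDeBulk):
# each cell type maps to genes assumed to be upregulated in it.
markers = {
    "T_cell": ["CD3E", "CD8A"],
    "B_cell": ["CD19", "MS4A1"],
    "Macrophage": ["CD68", "CD163"],
}


def simulate_means(markers, high=8.0, low=2.0):
    """Toy mean expression: a marker is `high` in its own cell type, `low` elsewhere."""
    genes = [g for gene_list in markers.values() for g in gene_list]
    return {
        ct: {g: (high if g in markers[ct] else low) for g in genes}
        for ct in markers
    }


def mix(fractions, means):
    """Bulk expression of one sample: fraction-weighted sum of cell-type means."""
    genes = next(iter(means.values())).keys()
    return {g: sum(fractions[ct] * means[ct][g] for ct in fractions) for g in genes}


def count_violations(means, markers):
    """Count (marker, other cell type) pairs where the marker is NOT
    higher in its own cell type than in the other cell type."""
    violations = 0
    for ct, gene_list in markers.items():
        for g in gene_list:
            for other in markers:
                if other != ct and means[ct][g] <= means[other][g]:
                    violations += 1
    return violations


means = simulate_means(markers)
fractions = {"T_cell": 0.5, "B_cell": 0.3, "Macrophage": 0.2}  # sums to 1
bulk = mix(fractions, means)
```

In this toy setting, count_violations(means, markers) is zero because every marker is upregulated in its own cell type. That ordering is what the repulsive prior is described as encouraging; in the actual method, both the fractions and the cell-type means are unknown and estimated from the bulk data.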

The BayesDeBulk web application can be visited at:

For more information, please see the bioRxiv preprint at: