Our projects fall under one of the following pillars:
Pillar 1: Understanding the System Architecture through Genetic Discovery: A complete understanding of the immune system requires mapping its genetic blueprint. Thus, a central theme of our work is uncovering how genetic variation shapes infectious and immune-mediated disease risk.
Pillar 2: Understanding How Information Flows Through Immune Networks: A predictive virtual immune system requires tracing information flow across biological layers and mapping causal relationships within immune networks. We tackle this by analyzing system responses to perturbations and by applying causal inference to map causal links and shared components across traits and pathways.
Pillar 3: Understanding How System Context Shapes Genetic Risk and Information Flow: Environmental, behavioral, and social factors profoundly shape immune function, yet have traditionally been studied apart from genetics and molecular processes. Building a virtual immune system requires modeling how these influences dynamically alter immune states and contribute to disease risk.
You can read about our current projects related to each pillar below.
Understanding the System Architecture through Genetic Discovery
Human genetic architecture of infectious diseases. Infectious diseases, as a major evolutionary pressure, provide a powerful lens for studying immune network architecture. Multi-ancestry studies are especially informative, revealing both population-specific adaptations and shared mechanisms that transcend ancestry. To enable this work, we developed a cloud-based pipeline for large-scale genetic association studies across hundreds of thousands of individuals, automating QC, association testing, and downstream analyses. Using this framework, we are leading the largest genetic studies of infectious disease phenotypes to date, spanning 99 traits and >800,000 diverse individuals from three biobanks (Mount Sinai biobank, UK Biobank, and All of Us biobank). So far, we have identified 217 associations, 88% of which are novel. These include loci under positive selection, which provide insight into human evolution, and novel loci with therapeutic potential.
Somatic mutation landscape and (auto)inflammatory disease risk. Somatic mutations acquired during life represent an underexplored source of immune dysfunction. Using exome and genome sequencing data from >60,000 biobank participants, we developed a pipeline to systematically detect somatic variants at scale and benchmark findings for accuracy and reliability. By integrating mutation calls with longitudinal clinical data, we can identify causal variants while accounting for clonal expansion and temporal dynamics. This approach has confirmed UBA1 mutations underlying VEXAS syndrome and uncovered candidate variants in six additional immune genes with links to adult-onset (auto)inflammatory disease.
Understanding How Information Flows Through Immune Networks
Understanding how genetic risk translates to downstream molecular and physiological functions. Genetic variants act as natural perturbations, providing a powerful resource for studying information flow within the immune network. To this end, we combine results from genetic association studies with large-scale quantitative molecular biology datasets, including bulk expression and DNA methylation (GTEx), proteomics (UK Biobank), and single-cell expression (Human Cell Atlas), to determine how genetic variation shapes these molecular functions. To capture higher-level immune functions, we extended this approach to physiological and tissue-level data. Using hundreds of laboratory measurements from hundreds of thousands of biobank participants, we are mapping how genetic variants influence immune traits such as blood cell counts, inflammatory biomarkers, and immunoglobulin levels.
Causal inference analyses. A central challenge in network biology is distinguishing causal from correlative relationships. To address this, we use statistical frameworks such as colocalization, Mendelian randomization, and shared genetic correlation. Colocalization tests whether two traits (e.g., gene expression and immunoglobulin levels) share the same causal variant at a locus. Mendelian randomization leverages genetic variants as natural instruments to estimate the causal effect of one trait on another. Shared genetic correlation quantifies genome-wide overlap between traits, highlighting pathways that underlie their associations. Combined, these methods help us understand how information is shared across the immune system to shape its highly pleiotropic nature.
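To make the Mendelian randomization step concrete, here is a minimal sketch of one widely used estimator, the fixed-effect inverse-variance-weighted (IVW) estimate computed from per-variant summary statistics. It is an illustration only and not necessarily the estimator used in our analyses; the function name ivw_estimate and all numbers are hypothetical. It assumes the variants are independent, valid instruments and that the variant-exposure effects are measured without error.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW Mendelian randomization estimate.

    beta_exposure: per-variant effects on the exposure trait
    beta_outcome:  per-variant effects on the outcome trait
    se_outcome:    standard errors of the outcome effects
    Returns (causal_estimate, standard_error).
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    if not beta_exposure:
        raise ValueError("at least one variant is required")

    # Each variant is weighted by the inverse variance of its outcome effect.
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(w * bx * by
                    for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx * bx for w, bx in zip(weights, beta_exposure))
    if denominator == 0:
        raise ValueError("instruments have no effect on the exposure")

    estimate = numerator / denominator
    std_error = math.sqrt(1.0 / denominator)
    return estimate, std_error


# Hypothetical summary statistics for three independent variants.
bx = [0.10, 0.20, 0.15]    # variant effects on the exposure
by = [0.05, 0.10, 0.075]   # variant effects on the outcome
se = [0.01, 0.02, 0.015]   # standard errors of the outcome effects

estimate, std_error = ivw_estimate(bx, by, se)
print(f"IVW estimate: {estimate:.3f} (SE {std_error:.3f})")
```

The estimate is a weighted regression of outcome effects on exposure effects through the origin. In practice, estimates of this kind are usually checked with sensitivity analyses that relax the assumption that every variant is a valid instrument.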
Understanding How System Context Shapes Genetic Risk and Information Flow
Environmental determinants of immune function. Past pathogen exposures leave durable imprints on the immune system, but these effects are rarely examined alongside genetic or functional data. To address this, in a preliminary project, we used high-throughput profiling to reconstruct viral exposure histories in 12 individuals and integrated these data with proteomics after ex vivo activation of antiviral pathways (TLR3 and TLR7/8). We observed that chronic and acute viral infections can have lasting effects on cytokine production; for example, HSV-1 exposure was linked to altered CCL4 levels, while norovirus exposure affected TNF responses. Read about this project here: https://www.biorxiv.org/content/10.1101/2025.09.15.675924v1
Social and behavioral determinants of health. Social and behavioral factors strongly influence disease risk but remain underutilized in genetic and mechanistic studies. Using survey data from >400,000 biobank participants, we developed a framework to distill hundreds of sparse, correlated variables into low-dimensional, orthogonal embeddings that capture the non-genetic component of disease risk. These embeddings can be combined with genetic data to improve disease risk prediction and to explore how genetic and non-genetic factors interact to shape disease susceptibility and discovery power. Read about this project here: https://www.medrxiv.org/content/10.1101/2025.09.29.25336903v1
Recent preprints and publications
Viral infections shape inter-individual immune variability https://www.biorxiv.org/content/10.1101/2025.09.15.675924v1.full
A Scalable Framework to Integrate Social Determinants of Health into Disease Risk Models using Biobank Survey Data https://www.medrxiv.org/content/10.1101/2025.09.29.25336903v1
Shared genetic and neuroimmune architecture links type 1 diabetes with neurocognitive traits (collaboration with Alagpulinsa lab at Yale) https://www.medrxiv.org/content/10.1101/2025.09.14.25335719v2
Systematic comparison of phenome-wide admixture mapping and genome-wide association in a diverse biobank (collaboration with Kenny lab at Mount Sinai) https://www.medrxiv.org/content/10.1101/2024.11.18.24317494v1
Clinical Manifestations of VEXAS Syndrome Across a Broad Spectrum of UBA1 Mutation Burden (collaboration with Beck lab at NYU) https://doi.org/10.1002/art.43327
Full publication list: https://scholar.google.ch/citations?user=6RAl95sAAAAJ&hl=en