Pejaver Laboratory

We are recruiting

Postdoctoral fellows

Multiple postdoctoral positions are available in the Pejaver Lab. We are an interdisciplinary group with diverse but interlinked research interests related to biomedical data science and machine learning methodology development. We are broadly seeking postdocs with research interests and complementary expertise related to our three focus areas (see below for details). 

All candidates must have strong communication skills, a commitment to methodological rigor, and the ability to work creatively and collaboratively. Please send inquiries to vikas [dot] pejaver [at] mssm [dot] edu with “Postdoc position” in the subject line and the following materials:

  • A complete CV
  • A cover letter describing which focus area you would be most suited for and how your training and expertise relate to our research interests
  • Contact information for at least two references

1. Variant and genome interpretation

We have a longstanding interest in predicting the functional and phenotypic impact of genetic variants and in developing disease risk scores using genomic data. A key aspect of our work is the integration and predictive modeling of protein function, allowing for mechanistic “explanations” of how a variant may lead to disease. We have ongoing and planned collaborations with the ClinGen Consortium to standardize the use of predictive methods such as ours in clinical variant interpretation, and with the Impact of Genomic Variation on Function (IGVF) Consortium to expand our prior work to non-coding variation.

Candidate requirements include:

  • PhD in Bioinformatics, Computational Biology, Genomics, Molecular Biology, or a related discipline
  • Expertise in functional annotation of human genomes and proficiency in extracting information from biological knowledgebases such as UniProt and Ensembl
  • Experience developing efficient pipelines to process large genomic data sets in high-performance computing environments
  • Working knowledge of statistical testing and data exploration techniques

Desired qualifications include:

  • Experience applying machine learning methods to genomics and molecular data sets
  • Experience with functional genomics consortia data resources such as GTEx and ENCODE
  • Familiarity with large genomics data resources such as the 1000 Genomes Project, UK Biobank, ClinVar and gnomAD, among others
  • Working knowledge of modern web service and/or software implementation

2. Deep phenotyping from electronic health records

The Pejaver Lab is interested in extracting genetic disease-related information from electronic health records (EHRs) and integrating it with genomic, molecular and other clinical data sets to build better cohorts for variant discovery and to aid in the timely diagnosis of patients in the clinic. We are currently funded to develop data science methods to identify patients with rare genetic diseases from their health records, particularly by exploiting patterns in clinical notes and in patients' odysseys through the health care system. We also collaborate with other labs at IGH to develop phenotypic risk scores using information from EHRs.

Candidate requirements include:

  • PhD or MD-PhD in Biomedical Informatics, Biomedical Data Science, Clinical Research Informatics, or a related discipline
  • Expertise in integrating genomic and health record data sets and proficiency in common data models such as OMOP and i2b2
  • Working knowledge of statistical testing and data exploration techniques
  • Experience working in high-performance computing environments

Desired qualifications include:

  • Familiarity with standardized vocabularies and ontologies such as UMLS and HPO
  • Working knowledge of natural language processing techniques, including transformers and foundation models
  • Working knowledge of modern web service and/or software implementation
  • Familiarity with HIPAA and data governance in large healthcare systems

3. Applied machine learning for biomedical data sets

The applications for which we develop methods naturally require innovative problem formulations, novel learning algorithms and customized objective functions. We also place particular emphasis on end-to-end implementation of our methods, with the aim of increased adoption and improved decision support in research and clinical settings. Topics of interest include similarity learning, interpretable machine learning, multi-task learning, and structured output learning. We also collaborate with the ClinGen Consortium and the Critical Assessment of Genome Interpretation (CAGI) on developing novel metrics to evaluate computational methods on real-world biomedical data sets.

Candidate requirements include:

  • PhD in Computer Science, Informatics, Data Science, or a related discipline
  • Expertise in machine learning methodology development and proficiency in MLOps
  • Demonstrable interest in applying and evaluating computational methods for biomedical data sets, particularly genomic, longitudinal and/or text data
  • Experience with data pre-processing and cleaning techniques

Desired qualifications include:

  • Experience with deep learning frameworks such as TensorFlow and PyTorch
  • Proficiency in natural language processing techniques, including transformers and foundation models
  • Working knowledge of probabilistic modeling, statistical testing and/or data exploration techniques
  • Basic understanding of concepts in molecular biology or biomedical informatics

Doctoral students

The Pejaver Lab is seeking doctoral students interested in developing, adapting, evaluating and/or implementing data science and machine learning methods for biomedical data sets, with an emphasis on genetic diseases. We are best suited to advise and mentor doctoral students in two training areas: Genetics and Genomic Sciences, and Artificial Intelligence and Emerging Technologies in Medicine. However, students from other programs/training areas in the Graduate School of Biomedical Sciences are also welcome.

Students interested in setting up rotations in the Pejaver Lab should send inquiries to vikas [dot] pejaver [at] mssm [dot] edu with “Doctoral student” in the subject line and a brief description (1-2 paragraphs) of their educational background and skills.

Master’s students

The Pejaver Lab has opportunities for computationally inclined Master’s students looking to work on their capstone research project or thesis. We have several ideas for feasible projects related to our three focus areas. While we are most suited to design capstone projects or theses for students from the programs below, we are open to working with students from other programs/specializations as well (subject to the rules of their program):

  • MS in Biomedical Data Science (computational genomics, biomedical engineering, imaging and visualization, biostatistics, health information technology)
  • MS in Biomedical Science (Systems Biology Core)
  • MS in Biostatistics (Theory and Methods Track)

Interested students may send inquiries to vikas [dot] pejaver [at] mssm [dot] edu with “Master’s project” in the subject line and a brief description (1-2 paragraphs) of their educational background, skills and career goals. Students who already have a project idea may include it, along with its relevance to one or more of our focus areas.