Research Publications with leadership effort (from members in the Do lab)

Duffy Á, Petrazzini BO, Stein D, Park JK, Forrest IS, Gibson K, Vy HM, Chen R, Márquez-Luna C, Mort M, Verbanck M, Schlessinger A, Itan Y, Cooper DN, Rocheleau G, Jordan DM, Do R. Development of a human genetics-guided priority score for 19,365 genes and 399 drug indications. Nature Genetics. 2024 Jan 3. doi: 10.1038/s41588-023-01609-2.


Chen R, Petrazzini BO, Malick W, Rosenson R, Do R. Prediction of Venous Thromboembolism in Diverse Populations Using Machine Learning and Structured Electronic Health Records. Arterioscler Thromb Vasc Biol. 2023 Dec 14. doi: 10.1161/ATVBAHA.123.320331.

Chen R, Petrazzini BO, Nadkarni G, Rocheleau G, Bansal M, Do R. Machine Learning Enables Single-Score Assessment of MASLD Presence and Severity. medRxiv. 2023 Oct 25:2023.10.24.23297423. doi: 10.1101/2023.10.24.23297423.

Forrest IS, O’Neal AJ, Pedra JH, Do R. Cholesterol contributes to risk, severity, and machine learning-driven diagnosis of Lyme disease. Clinical Infectious Diseases. 2023 May 25:ciad307.

Forrest IS, Petrazzini BO, Duffy Á, Park JK, O’Neal AJ, Jordan DM, Rocheleau G, Nadkarni GN, Cho JH, Blazer AD, Do R. A machine learning model identifies patients in need of autoimmune disease testing using electronic health records. Nature Communications. 2023 Apr 25;14(1):2385.

Park JK, Petrazzini BO, Saha A, Vaid A, Vy HM, Márquez-Luna C, Chan L, Nadkarni GN, Do R. Machine Learning Identifies Plasma Metabolites Associated With Heart Failure in Underrepresented Populations With the TTR V122I Variant. Journal of the American Heart Association. 2023 Apr 18;12(8):e027736.

Park JK, Bafna S, Forrest IS, Duffy Á, Marquez-Luna C, Petrazzini BO, Vy HM, Jordan DM, Verbanck M, Narula J, Rosenson RS, Rocheleau G, Do R. Phenome-wide Mendelian randomization study of plasma triglyceride levels and 2600 disease traits. eLife. 2023 Mar 29;12:e80560.

Petrazzini B, Vaid A, Park J, Marquez-Luna C, Vy H, Saha A, Chaudhary K, Cho J, Chan L, Argulian E, Narula J, Nadkarni GN, Do R. Short-term prediction of coronary artery disease using serum metabolomic patterns. American Heart Journal Plus: Cardiology Research and Practice. 2022 Nov 23;24.

Forrest IS, Petrazzini BO, Duffy Á, Park JK, Marquez-Luna C, Jordan DM, Rocheleau G, Cho JH, Rosenson RS, Narula J, Nadkarni GN, Do R. Machine learning-based marker for coronary artery disease: derivation and validation in two longitudinal cohorts. The Lancet. 2023 Jan 21;401(10372):215-25.



Rocheleau G, Forrest IS, Duffy Á, Bafna S, Dobbyn A, Verbanck M, Won HH, Jordan DM, Do R. A tissue-level phenome-wide network map of colocalized genes and phenotypes in the UK Biobank. Communications Biology. 2022 Aug 20;5(1):849.

Forrest IS, Chan L, Chaudhary K, Saha A, Wen HH, Cepin CL, Marquez-Luna C, Rocheleau G, Cho J, Narula J, Nadkarni GN, Do R. Genome-first recall of healthy individuals by polygenic risk score reveals differences in coronary artery calcium. Am Heart J. 2022 Aug;250:29-33. doi: 10.1016/j.ahj.2022.04.006. Epub 2022 May 5. PMID: 35526571.

Petrazzini BO, Chaudhary K, Márquez-Luna C, Forrest IS, Rocheleau G, Cho J, Narula J, Nadkarni G, Do R. Coronary Risk Estimation Based on Clinical Data in Electronic Health Records. J Am Coll Cardiol. 2022 Mar;79(12):1155-1166.

Forrest IS, Rocheleau G, Bafna S, Argulian E, Narula J, Natarajan P, Do R. Genetic and phenotypic profiling of supranormal ejection fraction reveals decreased survival and underdiagnosed heart failure. European Journal of Heart Failure. 2022 Mar. DOI: 10.1002/ejhf.2482. PMID: 35278270.

Forrest IS, Chaudhary K, Vy HM, Petrazzini BO, Bafna S, Jordan DM, Rocheleau G, Loos RJ, Nadkarni GN, Cho JH, Do R. Population-Based Penetrance of Deleterious Clinical Variants. JAMA. 2022 Jan 25;327(4):350-9.

Balick DJ, Jordan DM, Sunyaev S, Do R. Overcoming constraints on the detection of recessive selection in human genes from population frequency data. Am J Hum Genet. 2022 Jan 6;109(1):33-49. doi: 10.1016/j.ajhg.2021.12.001.



Petrazzini BO, Balick DJ, Forrest IS, Cho J, Rocheleau G, Jordan DM, Do R. Prediction of recessive inheritance for missense variants in human disease.

Forrest IS, Jaladanki SK, Paranjpe I, Glicksberg BS, Nadkarni GN, Do R. Non-invasive ventilation versus mechanical ventilation in hypoxemic patients with COVID-19. Infection. 2021 Oct;49(5):989-997. doi: 10.1007/s15010-021-01633-6. Epub 2021 Jun 5. PMID: 34089483; PMCID: PMC8179090.

Forrest IS, Chaudhary K, Vy HMT, Bafna S, Kim S, Won HH, Loos RJF, Cho J, Pasquale LR, Nadkarni GN, Rocheleau G, Do R. Genetic pleiotropy of ERCC6 loss-of-function and deleterious missense variants links retinal dystrophy, arrhythmia, and immunodeficiency in diverse ancestries. Hum Mutat. 2021 Aug;42(8):969-977. doi: 10.1002/humu.24220. Epub 2021 May 31. PMID: 34005834; PMCID: PMC8295228.

Rocheleau G, Forrest IS, Duffy Á, Bafna S, Dobbyn A, Verbanck M, Won HH, Jordan DM, Do R. A tissue-level phenome-wide network map of colocalized genes and phenotypes in the UK Biobank. bioRxiv. 2021:2021.04.30.441974. doi: 10.1101/2021.04.30.441974.

Forrest IS, Chaudhary K, Paranjpe I, Vy HMT, Marquez-Luna C, Rocheleau G, Saha A, Chan L, Van Vleck T, Loos RJF, Cho J, Pasquale LR, Nadkarni GN, Do R. Genome-wide polygenic risk score for retinopathy of type 2 diabetes. Hum Mol Genet. 2021 May 29;30(10):952-960. doi: 10.1093/hmg/ddab067. PMID: 33704450; PMCID: PMC8165647.

Chaudhary K, Petrazzini BO, Narula J, Nadkarni GN, Do R. Prediction of Incident Heart Failure in TTR Val122Ile Carriers One Year Ahead of Diagnosis in a Multiethnic Biobank. Am J Cardiol. 2021 Mar 1;142:151-153. doi: 10.1016/j.amjcard.2020.12.015. Epub 2020 Dec 18. PMID: 33333072.

Vy HMT, Jordan DM, Balick DJ, Do R. Probing the aggregated effects of purifying selection per individual on 1,380 medical phenotypes in the UK Biobank. PLoS Genet. 2021 Jan 25;17(1):e1009337. doi: 10.1371/journal.pgen.1009337. PMID: 33493176; PMCID: PMC7861521.



Duffy Á, Verbanck M, Dobbyn A, Won HH, Rein JL, Forrest IS, Nadkarni G, Rocheleau G, Do R. Tissue-specific genetic features inform prediction of drug side effects in clinical trials. Sci Adv. 2020 Sep 10;6(37):eabb6242. doi: 10.1126/sciadv.abb6242. PMID: 32917698.

Aragam KG*, Dobbyn A*, Judy R, Chaffin M, Chaudhary K, Hindy G, Cagan A, Finneran P, Weng LC, Loos RJF, Nadkarni G, Cho JH, Kember RL, Baras A, Reid J, Overton J, Philippakis A, Ellinor PT, Weiss ST, Rader DJ, Lubitz SA, Smoller JW, Karlson EW, Khera AV, Kathiresan S, Do R°, Damrauer SM°, Natarajan P°. Limitations of Contemporary Guidelines for Managing Patients at High Genetic Risk of Coronary Artery Disease. J Am Coll Cardiol. 2020 Jun 9;75(22):2769-2780.



Damrauer SM*, Chaudhary K*, Cho JH*, Liang LW, Argulian E, Chan L, Dobbyn A, Guerraty MA, Judy R, Kay J, Kember RL, Levin MG, Saha A, Van Vleck T, Verma SS, Weaver J, Abul-Husn NS, Baras A, Chirinos JA, Drachman B, Kenny EE, Loos RJF, Narula J, Overton J, Reid J, Ritchie M, Sirugo G, Nadkarni G°, Rader DJ°, Do R°. Association of the V122I Hereditary Transthyretin Amyloidosis Genetic Variant With Heart Failure Among Individuals of African or Hispanic/Latino Ancestry. JAMA. 2019 Dec 10;322(22):2191-2202.

Jordan DM*, Verbanck M*, Do R. HOPS: a quantitative score reveals pervasive horizontal pleiotropy in human genetic variation is driven by extreme polygenicity of human traits and diseases. Genome Biol. 2019 Oct 25;20(1):222.

Jordan DM*, Choi HK*, Verbanck M*, Topless R, Won HH, Nadkarni G, Merriman TR, Do R. No causal effects of serum urate levels on the risk of chronic kidney disease: A Mendelian randomization study. PLOS Med. 2019 Jan 15;16(1):e1002725.



Verbanck M*, Chen CY*, Neale B°, Do R°. Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat Genet. 2018 Apr 23.

Jordan DM, Do R. Using Full Genomic Information to Predict Disease: Breaking Down the Barriers Between Complex and Mendelian Diseases. Annu Rev Genomics Hum Genet. 2018 Apr 11.



Jeff JM, Peloso GM, Do R. What can we learn about lipoprotein metabolism and coronary heart disease from studying rare variants? Curr Opin Lipidol. 2016 Apr;27(2):99-104.



Won HH, Natarajan P, Dobbyn A, Jordan DM, Roussos P, Lage K, Raychaudhuri S, Stahl E, Do R. Disproportionate Contributions of Select Genomic Compartments and Cell Types to Genetic Risk for Coronary Artery Disease. PLoS Genet. 2015 Oct 28;11(10):e1005622.

Do R, Balick D, Li H, Adzhubei I, Sunyaev S, Reich D. No evidence that natural selection has been less effective at removing deleterious mutations in Europeans than in West Africans. Nature Genetics. 2015 Feb;47(2):126-31.



Do R*, Stitziel NO*, Won H-H*, Jørgensen AB, Duga S, Merlini PA, Kiezun A, Won HH, Farrall M, Goel A, Zuk O, Guella I, Asselta R, Lange LA, Peloso GM, Auer PL, NHLBI Exome Sequencing Project, Girelli D, Martinelli N, Farlow DN, DePristo MA, Roberts R, Stewart AFR, Saleheen D, Danesh J, Epstein SE, Sivapalaratnam S, Hovingh GK, Kastelein JJ, Samani NJ, Schunkert H, Erdmann J, Shah SH, Kraus WE, Davies R, Nikpay M, Johansen CT, Wang J, Hegele RA, Hechter E, Marz W, Kleber ME, Huang J, Johnson AD, Li M, Burke GL, Gross M, Liu Y, Assimes TL, Heiss G, Lange EM, Folsom AR, Taylor HA, Olivieri O, Hamsten A, Clarke R, Rivas MA, Donnelly P, Rossouw JE, Psaty BM, Herrington DM, Wilson JG, Rich SS, Bamshad MJ, Tracy RP, Cupples LA, Rader DJ, Reilly MP, Spertus JA, Cresci S, Hartiala J, Wilson Tang WH, Hazen SL, Allayee H, Reiner AP, Carlson CS, Kooperberg C, Jackson RD, Boerwinkle E, Lander ES, Schwartz SM, Siscovick DS, McPherson R, Tybjaerg-Hansen A, Abecasis GR, Watkins H, Nickerson DA, Ardissino D, Sunyaev SR, O’Donnell CJ, Altshuler DA, Gabriel S, Kathiresan S. Multiple rare alleles at LDLR and APOA5 confer risk for early-onset myocardial infarction. Nature. 2014 Dec 10. doi: 10.1038/nature13917.



Do R, Willer CJ, Schmidt EM, Sengupta S, Gao C, Peloso GM, Gustafsson S, Kanoni S, Ganna A, Chen J, Buchkovich ML, Mora S, Beckmann JS, Bragg-Gresham JL, Chang HY, Demirkan A, Den Hertog HM, Donnelly, Ehret GB, Esko T, Feitosa MF, Ferreira T, Fischer K, Fontanillas P, Fraser RM, Freitag DF, Gurdasani D, Heikkila K, Hypponen E, Isaacs A, Jackson AU, Johansson J, Johnson T, Kaakinen M, Kettunen J, Kleber ME, Li X, Luan J, Lyytikainen LP, Magnusson PKE, Mangino M, Mihailov, Montasser ME, Muller-Nurasyid M, Nolte IM, O’Connell JR, Palmer CD, Perola M, Petersen AK, Sanna S, Saxena R, Service SK, Shah S, Shungin D, Sidore C, Song C, Strawbridge RJ, Surakka I, Tanaka T, Teslovich TM, Thorleifsson G, Van den Herik EG, Voight BF, Volcik KA, Waite LL, Wong A, Wu Y, Zhang W, Absher D, Asiki G, Barroso I, Been LF, Bolton JL, Bonnycastle LL, Brambilla P, Burnett MS, Cesana G, Dimitriou M, Doney ASF, Doring A, Elliott P, Epstein SE, Eyjolfsson GI, Gigante B, Goodarzi MO, Grallert H, Gravito ML, Groves CJ, Hallmans G, Hartikainen AL, Hayward C, Hernandez D, Hicks AA, Holm H, Hung YK, Illig T, Jones MR, Kaleebu P, Kastelein JJP, Khaw KT, Kim E, Klopp N, Komulainen P, Kumari M, Langenberg C, Lehtimäki T, Lin SY, Lindström J, Loos RJF, Mach F, McArdle WL, Meisinger C, Mitchell BD, Müller G, Nagaraja R, Narisu N, Nieminen TVM, Nsubuga RN, Olafsson I, Ong KK, Palotie A, Papamarkou T, Pomilla C, Pouta A, Rader DJ, Reilly MP, Ridker PM, Rivadeneira F, Rudan I, Ruokonen A, Samani N, Scharnagl H, Seeley J, Silander K, Stančáková A, Stirrups K, Swift AJ, Tiret L, Uitterlinden AG, Joost van Pelt L, Vedantam S, Wainwright N, Wijmenga C, Wild SH, Willemsen G, Wilsgaard T, Wilson JF, Young EH, Zhao JH, Adair LS, Arveiler D, Assimes TL, Bandinelli S, Bennett F, Bochud M, Boehm BO, Boomsma DI, Borecki IB, Bornstein SR, Bovet P, Burnier M, Campbell H, Chakravarti A, Chambers JC, Chen YI, Collins FS, Cooper RS, Danesh J, Dedoussis G, de Faire U, Feranil AB, Ferrières J, Ferrucci L, Freimer NB, Gieger C, Groop LC, Gudnason V, Gyllensten U, Hamsten A, Harris TB, Hingorani A, Hirschhorn JN, Hofman A, Hovingh GK, Hsiung CA, Humphries SE, Hunt SC, Hveem K, Iribarren C, Järvelin MR, Jula A, Kähönen M, Kaprio J, Kesäniemi A, Kivimaki M, Kooner JS, Koudstaal PJ, Krauss RM, Kuh D, Kuusisto J, Kyvik KO, Laakso M, Lakka TA, Lind L, Lindgren CM, Martin NG, März W, McCarthy MI, McKenzie CA, Meneton P, Metspalu A, Moilanen L, Morris AD, Munroe PB, Njølstad I, Pedersen NL, Power C, Pramstaller PP, Price JF, Psaty BM, Quertermous T, Rauramaa R, Saleheen D, Salomaa V, Sanghera DK, Saramies J, Schwarz PEH, Sheu WHH, Shuldiner AR, Siegbahn A, Spector TD, Stefansson K, Strachan DP, Tayo BO, Tremoli E, Tuomilehto J, Uusitupa M, van Duijn CM, Vollenweider P, Wallentin L, Wareham NJ, Whitfield JB, Wolffenbuttel BHR, Altshuler D, Ordovas JM, Boerwinkle E, Palmer CNA, Thorsteinsdottir U, Chasman DI, Rotter JI, Franks PW, Ripatti S, Cupples LA, Sandhu MS, Rich SS, Boehnke M, Deloukas P, Mohlke KL, Ingelsson E, Abecasis GR, Daly MJ, Neale BM, Kathiresan S. Common variants associated with plasma triglycerides and risk for coronary artery disease. Nature Genetics. 2013 Nov;45(11):1345-52. Source Code is available here.



Do R, Kathiresan S, Abecasis G. Exome sequencing and complex disease: practical aspects of rare variant association studies. Human Molecular Genetics. 2012 Oct 15;21(R1):R1-9.

Kiezun A*, Garimella K*, Do R*, Stitziel N*, Neale BM, McLaren P, Sklar P, Sullivan P, Moran J, Hultman C, Lichtenstein P, Magnusson P, International HIV Controllers Study, Lehner T, Shugart YY, Price A, de Bakker P, Purcell S, Sunyaev SS. Exome sequencing and the genetic basis of complex traits. Nature Genetics. 2012 May 29;44(6):623-30.



Do R, Xie C, Zhang X, Islam S, Bailey SD, Rangarajan S, McQueen M, Wang X, Yusuf S, Engert JC, Anand SS. The effect of chromosome 9p21 variants on cardiovascular disease may be modified by diet: evidence from a case/control and a prospective study. PLoS Medicine. 2011 Oct;8(10).



Musunuru K*, Pirruccello J*, Do R*, Peloso GM, Guiducci C, Sougnez C, Garimella K, Fisher S, Abreu J, Barry A, Fennell T, Banks E, Ambrogio L, Cibulskis K, Kernytsky A, Gonzalez E, Rudzicz N, Engert JC, DePristo M, Daly MJ, Cohen J, Hobbs H, Altshuler D, Schonfeld G, Gabriel S, Yue P, Kathiresan S. Exome sequencing identifies ANGPTL3 as a cause of familial combined hypolipidemia. New England Journal of Medicine. 2010 Dec 2;363(23):2220-7.

Do R, Bailey SD, Pare G, Montpetit A, Desbiens K, Hudson TJ, Bouchard C, Gaudet D, Perusse L, Anand SS, Vohl MC, Pastinen T, Engert JC. Fine mapping of the INSIG2 gene identifies a variant associated with low-density lipoprotein cholesterol and total ApoB levels. Circulation: Cardiovascular Genetics. 2010 Oct;3(5):454-61.


Do R, Kiss RS, Gaudet D, Engert JC. Squalene synthase: a critical enzyme in the cholesterol biosynthesis pathway. Clinical Genetics. 2009 Jan;75(1):19-29.



Do R, Bartlett KH, Dimich-Ward H, Chu W, Kennedy SM. Biomarkers of airway acidity and oxidative stress in exhaled breath condensate from grain workers. American Journal of Respiratory and Critical Care Medicine. 2008 Nov 15;178(10):1048-54.

Do R, Pare G, Montpetit A, Hudson TJ, Gaudet D, Engert JC. K45R variant of squalene synthase increases total cholesterol levels in two study samples from a French Canadian population. Human Mutation. 2008 May;29(5):689-94.

Do R, Bailey SD, Desbiens K, Belisle A, Montpetit A, Bouchard C, Perusse L, Vohl MC, Engert JC. Genetic variants of FTO influence adiposity, insulin sensitivity, leptin levels, and resting metabolic rate in the Quebec Family Study. Diabetes. 2008 Apr;57(4):1147-50.

Do R, Bartlett KH, Chu W, Dimich-Ward H, Kennedy SM. Within- and between-person variability of exhaled breath condensate pH and NH4+ in never and current smokers. Respiratory Medicine. 2008 Mar;102(3):457-63.