Whole Genome Sequencing in the NHLBI Trans-Omics for Precision Medicine (TOPMed) Program

Trans-Omics for Precision Medicine (TOPMed), sponsored by the National Institutes of Health's National Heart, Lung, and Blood Institute (NHLBI), is a program to generate scientific resources that enhance our understanding of the fundamental biological processes underlying heart, lung, blood, and sleep (HLBS) disorders. It is part of a broader Precision Medicine Initiative, which aims to provide disease treatments that are tailored to an individual’s unique genes and environment. TOPMed will contribute to this initiative by integrating whole-genome sequencing (WGS) and other -omics data (e.g., metabolic profiles, protein and RNA expression patterns) with molecular, behavioral, imaging, environmental, and clinical data. In doing so, the program seeks to uncover factors that increase or decrease the risk of disease, identify subtypes of disease, and develop more targeted and personalized treatments.

The Whole Genome Sequencing (WGS) project is part of NHLBI’s TOPMed program and serves as an initial step for the larger initiative. In recent years, genetic research on complex diseases using genome-wide association study (GWAS) and exome-sequencing approaches has produced an explosion of genetic discovery. However, a large portion of the heritability of complex diseases remains unexplained. WGS will provide a comprehensive view of the genome, an opportunity to further understand the genetic architecture relevant to HLBS disorders, and an unprecedented resource for the scientific community.

The WGS project started in 2014 to generate deep WGS data for studies with diverse ancestries and extensive characterization of HLBS-related traits. The current TOPMed studies span a variety of designs, including family, case-control, pharmacogenomic, cohort-based, founder-population, and clinical studies. As of November 2016, more than 54,000 whole genomes have been completed and an additional 18,000 are in progress. Future plans will bring the total to over 120,000 deeply sequenced whole genomes.

Currently, the TOPMed program consortium includes centers that support program activities such as data coordination, informatics research, whole-genome sequencing, RNA sequencing, and metabolite and methylation profiling. Two of these centers, the Data Coordination Center and the Informatics Research Center, serve the entire TOPMed program. There are multiple participating sequencing centers. As the TOPMed program grows, it is anticipated that other centers will join the consortium.

The sequence reads and genotype call sets for all TOPMed studies will be deposited in the database of Genotypes and Phenotypes (dbGaP) for controlled access by the scientific community. The first wave of data release was completed in October 2016, with more than 8,600 samples in 15 study accessions. These accessions can be identified by searching the dbGaP website for “TOPMed”. Additional waves of data will be released approximately every six months as the program progresses.