Twenty years after the publication of the first draft of the human genome, our views have changed on the cipher for humanity, the identity of our ancestors, the differences among us, and the genomic nuances that can govern disease. Much has been deciphered and plenty remains. It is an opportune moment to assess how far we have come in the application of genomics to medicine and the obstacles and priorities ahead.
The first draft of the human genome sequence was published in Nature in February 2001 by the International Human Genome Sequencing Consortium. The rival team at Celera Genomics published its own genome assembly the same week in Science.
Yardsticks of success
“There are three domains in which we should measure the success of genomics in clinical medicine,” says David Altshuler, Ph.D., EVP, global research and chief scientific officer at Vertex Pharmaceuticals. Altshuler started working on the Human Genome Project (HGP) in 1997, and had leadership roles in the SNP Consortium, the International HapMap Project, and the 1000 Genomes Project, public-private partnerships that generated rich maps of genome variation and serve as a reference for disease research.
“First, the tools, technologies and data used to perform biomedical research. Second, genetic testing to better understand people’s risk. And third, insights into human biology that inspire therapeutics,” Altshuler says. “It is the data technology and insights into causal biology that are most important, not genetic testing. Testing a patient is only of value if there is something you can do about it.” That comes about through “better understanding tools, technologies, and data, or developing new therapeutics.”
Genetic diseases caused by a single genetic lesion or a combination of lesions have reaped direct or indirect therapeutic benefits since the publication of the human genome. “The focus is on how common is a cause that can be targeted with a therapeutic approach,” says Heidi Rehm, Ph.D., chief genomics officer, Center for Genomic Medicine and Department of Medicine at Massachusetts General Hospital.
The reference human genome sequence annotated with comprehensive genetic variant datasets is making it possible to target rare diseases that in aggregate affect 10% of the population. “There are increasing numbers of drugs that are being developed for rare diseases, although we’re still in the infancy of therapeutic development,” says Rehm.
Joshua Denny, M.D., CEO of the National Institutes of Health’s All of Us Research Program, says there are four fundamental areas in clinical therapeutics that have been driven by genomics: the genetic bases of rare diseases, common diseases, pharmacogenomics, and cancer care. “For cystic fibrosis and spinal muscular atrophy, we now have genetically targeted therapies that are curative and transformational for parts of the population. We now know thousands of genetic loci for common diseases that can point us to better predictions in the future. We’re not there yet, but at least in one case, we actually have a drug on the market that was discovered purely with genetics—the PCSK9 inhibitors that lower cholesterol.”
At the onset of the HGP in 1990, few disease genes were known. Today we have catalogued thousands of Mendelian diseases along with an arsenal of analytical tools—SNP and haplotype maps, genotyping, exome sequencing, and whole-genome sequencing. We have also documented tens of thousands of genetic risk factors for common diseases. And the impact goes beyond the lab and the clinic.
“If you had said to me 15 years ago that I would watch football on TV and many of the commercials would be for consumer genomic testing for personalized risk and ancestry, I would never have believed it,” says Altshuler. “That was such a basic science activity. The idea that tens of millions of people would be tested didn’t occur to us.”
Obstacles in applying genomics to develop clinical therapeutics
Despite the progressive insights into the complexity of the human genome, the application of genomics to clinical diagnostics and the development of therapeutics is still fraught with roadblocks.
“The major obstacles include the rarity of each disorder, the diversity of causal genetic variation, and the lack of understanding of the pathogenic mechanisms of disease,” says Rehm. Despite major advances in the ability to diagnose genetic diseases through comprehensive next-generation sequencing panels, exome and whole-genome approaches, “diagnostic yields are still only 20 to 50% depending on the clinical phenotype due to the large number of causes of genetic disease that remain unknown,” Rehm says.
“We probably have a lot of rare diseases that are undiagnosed. Routine and earlier sequencing might help that,” says Denny.
But inroads are being made. A better understanding of the regulatory meshwork of gene expression and of splicing are two key benefits of the HGP that have facilitated insight into pathogenic mechanisms. “However, knowing what the expressed protein does and how its absence or augmented function leads to effects in the human body are more difficult tasks,” says Rehm. “Therapeutic approaches that target levels of gene expression as opposed to biological pathways have been easier to tackle.”
Perhaps surprisingly, more than 100 Mb of genome sequence was excluded from the reference genome until recently and was therefore inaccessible for biomedical research. “These new sequences introduced through the T2T [telomere-to-telomere] Consortium represent large regions of repetitive DNA. It is a new challenge to study these regions using conventional experimental and computational tools that are designed for single copy, easily mappable sites in our genome,” says Karen Miga, Ph.D., research scientist at the UC Santa Cruz Genomics Institute, which led the telomere-to-telomere assembly of a complete human X chromosome.
Research into fundamental biological pathways has lagged the rapid progress in genomics, creating a chasm between technological and analytic capabilities. “The most overarching barrier in using genomics for clinical care is the sizable gap between our technological capability of generating the sequence of a patient’s genome and how to rapidly and reliably interpret all of those variants that you immediately encounter, and not only know which ones are biologically relevant, but which ones are clinically relevant,” says Eric Green, M.D., Ph.D., director of the National Human Genome Research Institute (NHGRI).
Green hopes we reach a point when we can “rapidly interpret a patient’s genome sequence and tell the busy healthcare professional what in that genome sequence is medically important” as easily as generating the data in the first place.
Fostering precision medicine
There is continued excitement in the developing field of pharmacogenomics, which can prevent adverse events in patients receiving medications and allow the stratification of patients in clinical trials. However, the lack of rigorous evidence and the lack of population diversity remain major roadblocks.
“Nearly everyone has a genetic variant that would alter drug prescribing but only a handful of centers do pharmacogenomics to alter care,” says Denny. Challenges include inadequate health records and practical difficulties in bringing all the relevant data to the busy physician. “We don’t have as diverse a genomic sequence or genotype population as we’d like to have.” A third of the U.S. population is represented in 4% of the genome-wide association studies (GWAS). The NIH All of Us Research Program is focused on recruiting a diverse population to address this gap.
Most clinicians do not have access to patients’ genomic data. “The idea of having genomic information about a patient before you make decisions about what medications to give them is an exciting frontier,” says Green. But the field is still gathering data to correlate variants with either a positive or a negative response to a particular medication. “There’s just not enough evidence to indicate that it should be put into routine medical practice,” Green concludes.
There are exceptions, however. In some Asian countries, epilepsy patients must be genotyped before receiving lamotrigine, which can cause deadly skin reactions—truly an example of genomic medicine. “To implement precision medicine, we have to have our [EHRs] ready to support clinicians, with the complexity of the genomic knowledge at the patient’s bedside. [EHRs] have to become genomics aware and really help the physician do pharmacogenomics. That’s the last mile that will really make a difference in a patient’s life,” says Denny.
Digital technologies are key
Another lesson has been the application of digital technology. “The systematic collection and widespread sharing of standardized datasets, analyses and knowledge, have been, and will continue to be, necessary to decipher the human genome,” says Rehm.
Collaborative opportunities are emerging in sharing and analyzing standardized genome datasets. “ClinGen is a great example of an NIH-funded program working closely with commercial diagnostic labs to share data to interpret variants,” says Rehm.
The Schizophrenia Exome Sequencing Meta-analysis (SCHEMA), based on the UK Biobank data, will be released in early 2021 and originates from a multi-site collaboration including AbbVie, Alnylam Pharmaceuticals, AstraZeneca, Biogen, Pfizer, Regeneron, and the Broad Institute. This exome sequencing study has reported rare coding variants in ten genes that confer substantial risk for schizophrenia.
“We’re working across academia and industry in the Gene Curation Coalition database (GenCC DB),” says Rehm. The GenCC DB curates gene-disease relationships, particularly Mendelian diseases, submitted by member organizations that currently provide online resources and by diagnostic laboratories that have committed to sharing their curated gene-level knowledge.
We are at a point where genomics is at the heart of biomedical research. “It is as impossible to imagine doing biomedical research today without the human genome as it is impossible for us to function without computers and the internet,” says Altshuler. Having genomics data at your fingertips has become so commonplace in biotech that it is often taken for granted, he adds.
Bioinformatics-boosted genomics has shrunk the timelines between the discovery of a disease, its diagnosis, and the development of a treatment. For instance, the rapidity with which the cause of COVID-19 was identified and a vaccine developed is a direct result of the HGP.
It took three years between the description of the clinical symptoms of AIDS and the discovery that HIV was the cause. “How long did it take with COVID? Two weeks!” notes Altshuler. “The tools of genomics made it possible to sequence bodily fluids, to then computationally subtract out everything that wasn’t new, and arrive at COVID. If you hadn’t sequenced the human genome and all those viruses and bacteria—if you hadn’t had the technology for sequencing at high-depth and high-quality—you wouldn’t have that.”
One result is that we now have mRNA vaccines, developed in just nine months using technology that turned the viral sequence into a candidate vaccine.
Need for harmonized datasets
Hundreds of thousands of large-scale whole-genome sequencing (WGS) datasets from diverse populations are now publicly available. Harnessing these datasets to their full potential depends on the availability of high-quality variant calls—identification of single nucleotide polymorphisms (SNPs) and small insertions and deletions (indels)—from large populations. This, in turn, requires joint analysis of all raw data.
However, different WGS data processing pipelines cause substantial differences in variant calling in combined datasets and therefore require computationally expensive reprocessing to make joint analysis feasible. Many people recognize the need to collectively solve the issues of generating, sharing, analyzing, and standardizing the various datasets needed to make the dream of applying genomics for clinical therapeutics a reality.
Examples of such collaborative efforts include the Global Alliance for Genomics and Health (GA4GH), a nonprofit formed in 2013, and the International Hundred Thousand Plus Cohort Consortium (IHCC) established in 2018. These initiatives bring together international cohorts, develop standards for storing and representing data, harmonize genomic datasets for functional equivalence and joint analysis, promote genotyping and sequencing populations at greater depth, gather data on phenotypes, foster innovation in data processing pipelines, and work together to solve problems.
“It will be key that we work internationally to create bigger data sets, but also as a way for expanding the diversity and our population,” says Denny.
The next 20 years
The next 20 years will see a fundamental change in healthcare and the day-to-day maintenance of health, as a direct or tangential consequence of the HGP.
Rehm anticipates an acceleration in therapeutic development for rare diseases and the identification of common therapeutic strategies that can be implemented across multiple disorders. “I also anticipate widescale use of genomics in the coming years as a basic tool for both diagnostic and preventive clinical care,” she adds.
Over the next decade, Altshuler predicts further progress in understanding the role of genes and conserved non-coding elements, “which will fuel a whole new world of biology.” Genome sequencing and testing is becoming routine for some diseases and certain people, “but not necessarily for everyone.” Altshuler also anticipates “more examples of precision medicines that target underlying causes of human disease, and that provide transformative benefit to patients, because rather than basing their hypothesis on an overly reductionist model or a model system, they will be based on causal human biology.”
To expand the evidence base in general, Denny says we need to sequence more diverse populations. “That means not just knowing their genotypes and their sequences, but knowing what their clinical outcomes were. A lot of our genetic data isn’t necessarily as richly phenotyped to help move things forward. All these are things I see getting a lot better over the next decade,” says Denny.
“If you want to diagnose a rare disease, you really can’t do that with genotyping. We know at least 10% of the population has a rare disease. A lot of those will have genetic influences, but some common diseases—kidney disease, liver failure—may well have an underlying undiagnosed genetic component.” Denny predicts greater use of sequencing in discovering Mendelian disease as parts of those patient populations.
Once EHRs become more genomics-friendly, “we will have to figure out how to better connect them to knowledge resources. All of Us will provide richly phenotyped, diverse whole genome sequenced population that will help us interpret genetic variants that are hard right now,” Denny concludes.
“In the next 20 years,” Green says, “genomics will feel just as much a natural part of medicine as a thermometer or a stethoscope.” Genomic information will become as much a part of the patient’s medical data as their date of birth or blood type. “That’s just going to be fundamental information in the practice of medicine,” he says.