Researchers have sequenced 64 full human genomes in a project aimed in part at improving the diversity represented by current sequencing data and establishing a new “reference” genome. Their data includes “32 diverse human genomes” from around the world, the authors write, and will enable population-specific studies on genetic predispositions to human diseases as well as the discovery of more complex forms of genetic variation.
Researchers at the University of Maryland School of Medicine (UMSOM) co-authored the study, published this week in the journal Science.
When the first draft of the human genome was published twenty years ago, it was a composite that did not accurately capture the complexity of human genetic variation. Since then, researchers have tried to improve the human reference genome to better reflect human genetic diversity.
This new data was obtained using a combination of advanced sequencing and mapping technologies. Importantly, each of the genomes was assembled without guidance from the first human genome composite. As a result, the new dataset captures genetic differences from different human populations.
The researchers write that “Long-read and strand-specific sequencing technologies together facilitate the de novo assembly of high-quality haplotype-resolved human genomes without parent–child trio data.” Further, they found 107,590 structural variants, “of which 68% are not discovered by short-read sequencing, and 278 SV hotspots (spanning megabases of gene-rich sequence).”
“We’ve entered a new era in genomics where whole human genomes can be sequenced with exciting new technologies that provide more substantial and accurate reads of the DNA bases,” said study co-author Scott Devine, Ph.D., Associate Professor of Medicine at UMSOM and faculty member of IGS. “This is allowing researchers to study areas of the genome that previously were not accessible but are relevant to human traits and diseases.”
The Genome Resource Center (GRC) at the Institute for Genome Sciences (IGS) was one of three sequencing centers, along with Jackson Labs and the University of Washington, that generated the data using a new sequencing technology recently developed by Pacific Biosciences. The GRC was one of only five early access centers that were asked to test the new platform.
Devine helped to lead the sequencing efforts for this study and also led the team that discovered the presence of “mobile elements” (i.e., pieces of DNA that can move around and get inserted into other areas of the genome). Other members of the Institute for Genome Sciences (IGS) at the University of Maryland School of Medicine are among the 65 co-authors.
“The landmark new research demonstrates a giant step forward in our understanding of the underpinnings of genetically driven health conditions,” said E. Albert Reece, MD, PhD, MBA, Executive Vice President for Medical Affairs, UM Baltimore. “This advance will hopefully fuel future studies aimed at understanding the impact of human genome variation on human diseases.”
The scientists credit their work in part to new technology, writing “Advances in long-read sequencing, coupled with orthogonal genome-wide mapping technologies, have made it possible to fully resolve and assemble both haplotypes of a human genome.”