Human Proteome Sequence Map Released by International Research Collaboration

Ten years after starting the project, the Human Proteome Organization (HUPO) has completed a 90.4% read of the human proteome, knowledge essential for diagnosing and treating many human diseases and for advancing a more personalized approach to medicine.

The Human Proteome Project was launched in 2010, 10 years after the official completion of the human genome sequence. It has involved extensive data sharing and research collaboration between scientists based at universities and research institutions around the world.

While the human genome sequence is extremely useful, understanding how the proteins it encodes interact with each other, the body, and the environment to influence health has the potential to revolutionize the way medicine is practiced.

“In COVID-19, for instance, there are two proteomes involved, that of the SARS-CoV-2 virus and that of the infected cells, both of which likely interact with, modify, and change the function of the other,” says University of British Columbia’s Chris Overall, Ph.D., a professor who was involved with the project and is an author of the paper describing the work, which is published in Nature Communications.

“Understanding this relationship can shed light on why some cells and individuals are more resilient to COVID-19 and others more vulnerable, providing essential functional information about the human body that genomics alone cannot answer,” Overall adds.

Five different protein existence (PE) categories were used by the researchers working on the proteome map, with PE1 being the strongest and PE5 the weakest. PE1 classification requires clear experimental evidence of at least one form of the protein, whereas PE5 suggests evidence for the protein is inconclusive or may be incorrect.

The 90.4% coverage of the proteome reported by the team in 2020 (17,874 proteins) is accounted for entirely by well-validated PE1 proteins, whereas the remaining 9.6% is at levels PE2-5 and remains to be confirmed. This is an increase from the 13,588 PE1 proteins that had been identified in 2011. In the future, the team aims to find PE1 evidence for at least all the current PE2-level proteins, for which evidence is presently limited to the corresponding transcript.
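For readers who want to check the numbers, the arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration: the variable and function names are hypothetical, the only inputs are the counts and the percentage quoted above, and because 90.4% is a rounded value the derived totals are approximate rather than official figures.

```python
# Figures quoted in the article.
PE1_2020 = 17_874       # PE1 proteins reported in 2020
PE1_2011 = 13_588       # PE1 proteins identified in 2011
COVERAGE_2020 = 0.904   # rounded in the article, so derived values are approximate


def implied_total(pe1_count: int, coverage: float) -> int:
    """Estimate the total number of predicted proteins from a count and its coverage fraction."""
    return round(pe1_count / coverage)


total = implied_total(PE1_2020, COVERAGE_2020)   # approximate size of the predicted proteome
unconfirmed = total - PE1_2020                   # proteins still at levels PE2-5 (approximate)
gained = PE1_2020 - PE1_2011                     # PE1 proteins added since 2011 (exact)
growth_pct = 100 * gained / PE1_2011             # relative growth of the PE1 set since 2011

print(f"Implied total proteins: ~{total:,}")
print(f"Still unconfirmed (PE2-5): ~{unconfirmed:,}")
print(f"PE1 proteins gained since 2011: {gained:,} ({growth_pct:.1f}%)")
```

Under these assumptions, the quoted coverage implies a predicted proteome of a little under 20,000 proteins, with roughly 1,900 still awaiting PE1-level evidence, and a gain of 4,286 PE1 proteins since 2011.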

One of the areas that stands to gain from this understanding of the human proteome is precision medicine. Mutations in the genome can be informative for learning more about cancers, but they do not necessarily cause proteomic changes, so using genomic data alone can be limiting. Using a combination of proteomic and genomic data can help narrow down which mutations cause tumors to form and help find therapy targets more quickly.

Protein assays also form the basis of many diagnostic tests, so learning more about a wider range of proteins will help make these tests more accurate.

Although genes encode proteins, the process of gene expression is complex and many genes are subject to alternative splicing, a process in which one gene encodes several related but different proteins. Some genes are more prone to this than others, and altered gene expression due to alternative splicing is common in cardiovascular disorders. For example, advances in proteomics have enabled phospho-posttranslational modification analysis, which has helped identify targets for heart failure therapy associated with the PDE5A protein.

Proteomics also plays a vital role in diagnosing and treating infectious diseases such as COVID-19, which is caused by the SARS-CoV-2 virus and is responsible for the current global pandemic. It has already been used to discover more about the viral particle and the changes it causes in host cells during infection, revealing many of the possible drug targets currently under investigation.

The Human Proteome Project will continue to work on improving the quality of the map it has created. It also plans to encourage the establishment of more multi-omic studies and to promote the development of innovative technologies, such as single-cell proteomics, in the future.

“Proteomics plays increasingly important roles in understanding viral outbreak biology, accurate diagnosis and effective treatment and is positioned to continue to co-ordinate and drive international collaborations towards these goals,” conclude the authors in their paper.