Blood proteins may be better than clinical information at determining a person’s risk of developing 67 diseases within the next 10 years, according to information collected through the UK Biobank.
The findings demonstrate how thousands of proteins measured in a single blood sample can predict the onset of diverse diseases.
These diseases covered a broad range of pathology types and included multiple myeloma, motor neuron disease, pulmonary fibrosis, non-Hodgkin lymphoma and dilated cardiomyopathy.
The findings are published in the journal Nature Medicine.
“We are extremely excited about the opportunity to identify new markers for screening and diagnosis from the thousands of proteins circulating and now measurable in human blood,” said lead author Claudia Langenberg, PhD, director of the Precision Healthcare University Research Institute (PHURI) at Queen Mary University of London.
First author Julia Carrasco Zanini Sanchez, PhD, currently a postdoctoral researcher at PHURI, agreed that the protein signatures offered opportunities to improve the detection and prognosis of many diseases.
“Several of our protein signatures performed similar or even better than proteins already trialled for their potential as screening tests, such as prostate-specific antigen for prostate cancer,” she said.
A central challenge in precision medicine is to develop clinically useful tools for identifying individuals at high risk, the authors note.
Clinically recommended tools for predicting the risk of heart attack and stroke are widely used, but this is currently the case for few other diseases.
To explore prediction tools further, the researchers studied the value of approximately 3000 plasma proteins among 41,931 participants in the largest proteomics study to date, the UK Biobank Pharma Proteomics Project.
Specifically, the team examined the potential of the measurable plasma proteome to predict the 10-year incidence of 218 pathologically diverse diseases, each with more than 80 incident cases, by linking it to electronic health records.
The results were compared with prediction models based on basic clinical information collected during usual care, with and without data from 37 blood assays used in current clinical practice, and with polygenic risk scores.
Sparse protein signatures, using the 5 to 20 proteins most important for prediction, were better than models developed using basic clinical information for 67 pathologically diverse diseases.
For 52 of the diseases studied, protein-based models were better than clinical models with blood assays.
Polygenic risk scores were available for 23 diseases, and these significantly outperformed clinical models without blood assays in seven diseases. Proteins outperformed the polygenic risk scores for all these diseases except for breast cancer.
Single-cell RNA sequencing from bone marrow in patients newly diagnosed with multiple myeloma showed that four of the five predictor proteins were expressed specifically in plasma cells, which was consistent with the strong predictive power of these proteins.
External replication was possible for six of the 67 diseases using data from the EPIC-Norfolk study. Here, protein-based models were superior to clinical models, which supported the generalizability of the findings.
“A key challenge in drug development is the identification of patients most likely to benefit from new medicines,” said co-lead author Robert Scott, GSK head of human genetics and genomics.
“This work demonstrates the promise in the use of large-scale proteomic technologies to identify individuals at high risk across a wide range of diseases, and aligns with our approach to use tech to deepen our understanding of human biology and disease.”