As a graduate student at the Hebrew University’s Interdisciplinary Center for Neuronal Computation in the late 1990s, Tuvik Beker, PhD, applied his mathematics and machine learning training to investigate the use of artificial neural networks to solve complex problems.
Twenty years later, after a long and winding road of applying machine learning to various domains, including aerospace, Beker closed the loop and found himself as CEO of Pangea Biomed, a company founded on the scientific findings of his PhD advisor, Eytan Ruppin, MD, PhD, chief of the Cancer Data Science Laboratory at the National Cancer Institute.
Now, Beker is applying what he has learned in machine learning to one of medicine’s most difficult challenges: cancer.
Beker’s team at Pangea Biomed collaborated with the Australian National University and the National Cancer Institute to develop ENLIGHT-DP, a deep-learning method that uses imputed transcriptomics from histopathology images to predict cancer treatment responses. The study, published in Nature Cancer, describes a new deep-learning framework called DeepPT that can predict genome-wide tumor mRNA expression from hematoxylin and eosin (H&E)-stained slides. The Pangea Biomed ENLIGHT algorithm then uses these expression values to predict how each patient will respond to treatment.
“One of the biggest problems in medicine right now is treating cancer better with the arsenal of drugs that we already have because we have excellent drugs that are underutilized due to poor biomarkers,” Beker told Inside Precision Medicine. “We believe [ENLIGHT-DP] is a technology that will significantly affect how oncologists decide which treatment to give patients.”
The entire precision oncology pie
Precision oncology requires more than genetic variants. Variants are currently used to assign precision oncology treatments, but they help treat only a subset of cancer patients.
“The reality is that we have a lot of targeted and immune checkpoint blockers, and they can sometimes do miracles—today we see patients who have full remission after being in stage four advanced metastatic cancer, but these complete remission stories are still very rare,” Beker said. “When you look at the total patient population, this only applies to a small percentage of patients.”
According to Beker, the reason is that “actionable” mutations (those that guide the use of novel drugs such as EGFR inhibitors, BRAF inhibitors, and immune checkpoint inhibitors) are extremely rare. In fact, most patients do not have any targetable mutations, and even for those who do, the chances of a response are low, ranging from 30% down to nothing. To account for the remaining cancers, Pangea devised a strategy that considers the tumor’s overall picture.
Pangea began by looking at the transcriptome rather than just the DNA because, according to Beker, abnormal expression is much more common in cancer and carries richer information. The key is to find a way to interpret the transcriptome to identify cancer vulnerabilities that can guide treatment decisions for virtually all patients.
“Rather than asking whether a patient has a specific mutation, we are attempting to determine what gene expression patterns reveal about this patient’s ability to respond to inhibition,” Beker explained. “We find networks of interacting genes that inform treatment by predicting whether a tumor will respond strongly to treatment or quickly develop resistance.”
However, Beker doesn’t think RNA-seq is a viable solution to aid therapeutic recommendations globally because of the lack of access to sequencing instruments and the expertise to analyze genomic and transcriptomic data. Beker and his team wondered if there was a way to obtain gene expression profiles from common, standard, and easy-to-use samples.
“Since we are looking at the transcript and expression patterns to predict response, we asked if we could do it without actually sequencing the tumor because tumor sequencing is costly and time-consuming,” Beker explained. “It takes about a month to receive the test results, and it is available in the United States and Europe, but not the rest of the world.”
Spatial “sequencing”
Pangea enlisted the help of Danh-Tai Hoang, PhD, a research fellow at the Australian National University who specializes in precision oncology and machine learning. Hoang and colleagues used data from The Cancer Genome Atlas (TCGA) program, which contains genomic and transcriptomic (RNA sequencing) data for approximately 12,000 patients. The TCGA also has many slide images, including standard pathology slides stained with H&E, which pathologists use for diagnosis. The digitized versions of these slides are scanned at 20–40x magnification and very high resolution, yielding massive files.
“The good thing about digital pathology is that you get the spatial analysis for free,” Beker said. “When we do RNA sequencing, we usually use bulk RNA from the entire tumor piece from the whole slide, but when we do image analysis, we naturally break down the image into smaller pieces and analyze each of these separately so that the same approach can teach about tumor heterogeneity.”
This concept sparked the idea that machine learning could be used to match spatial features to gene expression patterns, resulting in DeepPT. DeepPT’s output is not as precise, accurate, or sensitive as RNA-seq, but it infers gene expression values well enough to serve as input for the ENLIGHT algorithm. The results were comparable to those obtained with RNA sequencing data, which the team did not expect.
“It was so surprising to us that we initially thought it was a bug, so we halted development and tried to figure out what could be the mistake,” Beker explained. “It took us some time to understand how a noisy inference could still produce such an accurate prediction of the treatment response.”
Beker stated that they cannot fully explain what DeepPT is doing because they are not using explainable artificial intelligence, implying that DeepPT is somewhat of a black box.
“We are just getting started with explainable models to truly understand what is going on,” Beker said. “What is it about the image that allows this machine to see so much more than we can?”
Understanding the inner workings of their algorithms could help turn ENLIGHT-DP into a global solution, which seems to be the goal of Beker and Pangea Biomed.
“We are not just replacing RNA sequencing with a computer program that analyzes image files—we are replacing, in principle, spatial transcriptomics, which costs 100 times more than regular bulk transcriptomics,” said Beker.
Seeking ENLIGHTenment
DeepPT might be enough to figure out gene expression patterns that can help with treatment choices, but it probably will not be able to replace spatial transcriptomics in all labs around the world. This is because spatial transcriptomics needs much more refined and sensitive RNA analysis, especially with high spatial precision and accuracy.
In fact, Beker and his team have a lot of work to do to prove that ENLIGHT-DP can be used in clinical settings. DeepPT was trained on five TCGA cohorts at the time of publication, and that number has since grown to seven. The five cohorts in the manuscript represented five different types of solid tumors: breast, lung, pancreatic, head and neck, and cervical cancer. Since the algorithm uses spatial information, Beker doesn’t think ENLIGHT-DP is very useful for liquid tumors.
“The ENLIGHT-DP pipeline was highly predictive, but as far as we know, it can theoretically apply to any solid tumor because the image analysis algorithm requires some structure to work on,” Beker explained. “If you just take a blood smear, you lose that structure and cannot use that methodology. So, we need a tissue section to see the morphology.”
Pangea Biomed will work to improve ENLIGHT’s prediction capabilities by testing it on thousands of new patients. Meanwhile, the company is developing a clinical trial strategy to bring tests based on this technology to market. To support that, it will spend the next 18 months collecting data from leading cancer centers. It recently completed a couple of such trials on blinded cohorts of patients with lung and neck cancer, both of which will be submitted to the FDA for regulatory review.
Beker stated, “We are working to bring this to the clinic. We have had several wonderful and heartwarming success stories of patients who have basically gained back their lives, patients who, in some cases, were already referred to hospice and are still with us four years later, thanks to a treatment that, according to standard biomarkers, they should not have gotten and would not have received otherwise. We are on the verge of another revolution in cancer care, driven by improved AI-based predictive biomarkers.”