Close up of a cancer cell - oncology target 3d illustration
Credit: peterschreiber.media/Getty Images

Scientists at the University of Leeds have developed a novel technique that uses artificial intelligence (AI) to predict cancer from patient data without putting personal information at risk. The researchers set out to discover whether a form of AI called swarm learning could help computers predict cancer in medical images of patient tissue samples without releasing the data from hospitals.

The scientists published their study in Nature Medicine.

“Artificial intelligence (AI) can predict the presence of molecular alterations directly from routine histopathology slides. However, training robust AI systems requires large datasets for which data collection faces practical, ethical, and legal obstacles. These obstacles could be overcome with swarm learning (SL), in which partners jointly train AI models while avoiding data transfer and monopolistic data governance,” write the investigators.

“Here, we demonstrate the successful use of SL in large, multi-centric datasets of gigapixel histopathology images from over 5,000 patients. We show that AI models trained using SL can predict BRAF mutational status and microsatellite instability directly from hematoxylin and eosin (H&E)-stained pathology slides of colorectal cancer. We trained AI models on three patient cohorts from Northern Ireland, Germany, and the United States, and validated the prediction performance in two independent datasets from the United Kingdom.

“Our data show that SL-trained AI models outperform most locally trained models, and perform on par with models that are trained on the merged datasets. In addition, we show that SL-based AI models are data efficient. In the future, SL can be used to train distributed AI models for any histopathology image analysis task, eliminating the need for data transfer.”

The swarm learning system then sends this newly trained algorithm, but no local data or patient information, to a central computer. There, it is combined with algorithms generated in an identical way by other hospitals to create an optimized algorithm. The optimized algorithm is then sent back to the local hospital, where it is reapplied to the original data and, being more sensitive, improves the detection of genetic changes.

Repeating this cycle several times improves the algorithm and produces one that works on all the data sets. This means that the technique can be applied without any data being released to third-party companies, sent between hospitals, or moved across international borders.
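The train, combine, and redistribute cycle described above can be illustrated with a small sketch. This is not the researchers' implementation: it assumes a toy linear model in place of an image classifier, synthetic numbers in place of pathology slides, and plain weight averaging as the combining step, which may differ from how the study merged models. All function names (`local_update`, `merge`, `swarm_rounds`, `make_site`) are hypothetical.

```python
import random

def local_update(weights, data, lr=0.1, epochs=5):
    """Train a copy of the shared weights on one site's private data.

    Uses simple least-squares gradient steps; the data never leaves this function.
    """
    w = list(weights)
    for _ in range(epochs):
        for x, y in data:
            pred = sum(wi * xi for wi, xi in zip(w, x))
            err = pred - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

def merge(weight_sets):
    """Combine per-site weights by plain averaging (an assumed merge rule)."""
    n = len(weight_sets)
    return [sum(ws[i] for ws in weight_sets) / n
            for i in range(len(weight_sets[0]))]

def swarm_rounds(sites, n_features, rounds=20):
    """Repeat: each site trains locally, only weights are combined and sent back."""
    shared = [0.0] * n_features
    for _ in range(rounds):
        local_weights = [local_update(shared, data) for data in sites]
        shared = merge(local_weights)
    return shared

def make_site(rng, true_w, n=50):
    """Generate synthetic (features, target) pairs standing in for one hospital."""
    data = []
    for _ in range(n):
        x = [rng.uniform(-1, 1) for _ in true_w]
        y = sum(w * xi for w, xi in zip(true_w, x))
        data.append((x, y))
    return data

rng = random.Random(0)
true_w = [2.0, -1.0, 0.5]           # hidden relationship all sites share
sites = [make_site(rng, true_w) for _ in range(3)]  # three separate "hospitals"
model = swarm_rounds(sites, n_features=3)
print([round(w, 3) for w in model])
```

Only the weight lists cross site boundaries in this sketch; each site's `data` is read solely inside `local_update`, which mirrors the article's point that models, not patient records, are exchanged.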

The team trained AI algorithms on study data from three groups of patients from Northern Ireland, Germany, and the U.S. The algorithms were tested on two large sets of image data generated at Leeds and were found to have successfully learned how to predict the presence of different subtypes of cancer in the images.

“Based on data from over 5,000 patients, we were able to show that AI models trained with swarm learning can predict clinically relevant genetic changes directly from images of tissue from colon tumors,” said Jakob Nikolas Kather, visiting associate professor at the University of Leeds’ School of Medicine and researcher at the University Hospital RWTH Aachen.

“We have shown that swarm learning can be used in medicine to train independent AI algorithms for any image analysis task,” added Phil Quirke, PhD, professor of pathology in the University of Leeds’ School of Medicine. “This means it is possible to overcome the need for data transfer without institutions having to relinquish secure control of their data. Creating an AI system which can perform this task improves our ability to apply AI in the future.”
