Opportunities in Cancer Research: Artificial Intelligence
Artificial intelligence (AI) is everywhere: personal digital assistants answer our questions, robo-advisors trade stocks for us, and driverless cars will someday take us where we want to go. AI has penetrated our lives, and its use is exploding in biomedical research and health care—including across all dimensions of cancer research, where the potential applications for AI are vast.
AI excels at recognizing patterns in large volumes of data, extracting relationships between complex features in the data, and identifying characteristics in data (including images) that cannot be perceived by the human brain. It has already produced results in radiology, where computers process images rapidly so that radiologists can focus their time on the aspects that require their technical judgment. For example, last year, the Food and Drug Administration approved the first AI-based software to process images rapidly and assist radiologists in detecting breast cancer in screening mammograms.
Integration of AI technology in cancer care could improve the accuracy and speed of diagnosis, aid clinical decision-making, and lead to better health outcomes. AI-guided clinical care has the potential to play an important role in reducing health disparities, particularly in low-resource settings. NCI will invest in supporting research, developing infrastructure, and training the workforce to help achieve these goals and more.
Emerging AI Applications in Oncology
NCI-funded research has already led to several opportunities for the use of AI. Here are some examples:
Improving Cancer Screening and Diagnosis
Scientists in NCI’s intramural research program are leveraging the capabilities of AI to improve screening for cervical and prostate cancer. NCI investigators developed a deep learning approach for the automated detection of precancerous cervical lesions from digital images.
Another group of NCI intramural investigators and their collaborators trained a computer algorithm to analyze MRI images of the prostate. Historically, standard biopsies of the prostate did not always produce the most accurate information. Starting 15 years ago, clinicians at NCI began performing biopsies guided by findings from MRI, enabling them to focus on regions of the prostate most likely to be cancerous. MRI-guided biopsy improved diagnosis and treatment when utilized by prostate cancer experts, but the method did not transfer well to clinics without prostate cancer expertise. The NCI clinicians used AI to capture their diagnostic expertise and made the algorithm accessible to clinics across the country as a tool to help with diagnosis and clinical decision-making.
The full potential of the MRI-guided biopsy developed by NCI researchers is being realized in clinics without prostate cancer–specific expertise because of this AI tool. New AI algorithms under development now aim to surpass the capabilities of well-trained radiologists by enabling the prediction of patient outcomes from MRI.
Aiding the Genomic Characterization of Tumors
AI methods can also be used to identify specific gene mutations from tumor pathology images instead of using traditional genomic sequencing. For instance, NCI-funded researchers at New York University used deep learning (DL) to analyze pathology images of lung tumors obtained from The Cancer Genome Atlas. Not only could the DL method accurately distinguish between two of the most common lung cancer subtypes, adenocarcinoma and squamous cell carcinoma, it could predict commonly mutated genes from the images.
In the context of brain tumors, identifying mutations using noninvasive techniques is a particularly challenging problem. With NCI support, an international team, including investigators at Harvard University and the University of Pennsylvania, recently developed a DL method to identify IDH mutations noninvasively from MRI images of gliomas. These research findings suggest that, in the future, AI could help identify gene mutations in innovative ways.
Accelerating Drug Discovery
NCI is leveraging the power of AI in multiple ways to discover new treatments for cancer. The Cancer Moonshot℠ is supporting two major efforts in partnership with the Department of Energy (DOE) to leverage its supercomputing expertise and power for cancer research. In one effort, AI is being used to detect and interpret features of target molecules (e.g., proteins or nucleic acids that are important in cancer growth), make predictions for new drugs to target those molecules, and help evaluate the effectiveness of those drugs. Research is also being done to identify novel approaches for creating new drugs more effectively.
A project that is part of the second effort is using computational methods to model the interaction of KRAS protein with the cell membrane in detailed ways that were not previously possible. A cross-agency research team collaborating with the RAS Initiative developed a model of KRAS–lipid membrane binding to simulate the behavior of KRAS at the membrane. This model could help identify novel ways to inhibit the activity of mutant KRAS protein and help scientists find new avenues to target mutations in the KRAS gene, one of the most frequently mutated oncogenes in tumors. In the future, this modeling approach could be applied to other important oncogenes.
Improving Cancer Surveillance
The NCI–DOE collaboration is also enabling the application of DL to analyze patient information and cancer statistics collected by the NCI Surveillance, Epidemiology, and End Results (SEER) program. As part of this effort, DL algorithms were developed to extract tumor features automatically from pathology reports, saving thousands of hours of manual processing time. The goal of the project is to transform cancer care by applying AI capabilities to population-based cancer data in real time. This will help us better understand how new diagnostic methods, treatments, and other factors affect patient outcomes. Real-time data analysis will also allow for newly diagnosed individuals to be linked with clinical trials that may benefit them. NCI’s long-term investment in the SEER program and its infrastructure, coupled with newer investments in AI, will enable pattern recognition in population data that was impossible before. AI will aid in predicting treatment response, likelihood of recurrence (local or metastatic), and survival.
Realizing the Promise of AI in Oncology and Avoiding the Pitfalls
The potential applications of AI in medicine and cancer research hold great promise. Leveraging these opportunities will require increased investment and overcoming several challenges.
Building an AI Cancer Research Community
The data science and AI communities will be important partners in realizing the promise of AI in cancer research. NCI can engage these communities by providing appropriate funding opportunities and access to data sources; linking cancer researchers and AI researchers; and supporting the training and development of a workforce with expertise in AI, data science, and cancer. Building on the NCI–DOE collaboration, a series of workshops is being held to build a community engaged in pushing the limits of current computational practices in cancer research to develop new computational technologies.
Bridging the Gap from Research to Practice
Currently, the use of AI in cancer research and care is in its infancy. Most research is focused on methods development, rather than on implementing those methods in clinical practice. NCI has an opportunity to lead the way in implementing AI in cancer care by supporting research to find effective pathways for clinical integration (including ways to understand uncertainty and validate AI approaches), educating medical personnel about the strengths and weaknesses of the technology, and rigorously assessing its benefits in terms of clinical outcomes, patient experience, and costs.
Accessing Quality Cancer Data
The lack of large, publicly available, well-annotated cancer datasets has been a significant barrier for AI research and algorithm development. The lack of benchmarking datasets in cancer research hampers reproducibility and validation. Supporting the annotation, harmonization, and sharing of standardized cancer datasets will be essential to drive AI innovation and to train and validate AI models. With even greater volumes of data anticipated in the future, support for developing approaches to generate and aggregate new research and clinical data coherently will be critical for long-term success.
To support this work and to make cancer data broadly available for all types of research, NCI is refining policies and practices to enhance and improve data sharing. As part of those efforts, NCI is building a Cancer Research Data Commons (CRDC). One node of the CRDC is an Imaging Data Commons that will connect to The Cancer Imaging Archive, a unique resource of publicly available, archival cancer images with supporting data to enable discovery. NCI also recently launched the Childhood Cancer Data Initiative to accelerate progress for children, adolescents, and young adults with cancer by optimizing the collection, aggregation, and utility of research and clinical data.
NCI’s data aggregation and sharing efforts are crucial to moving AI and many areas of cancer research forward. As new sources of biomedical and health data emerge, the amount of information will continue growing faster than it can be interrogated. AI will be an essential tool for processing, aggregating, and analyzing the vast amounts of information the data hold to drive discovery and improve patient care.
Understanding the Method Behind the Machine
One challenge of AI, and DL specifically, is the “black box” problem: not fully understanding what features of the data a computer has used in its decision-making process. For example, a DL algorithm that predicts the optimal treatment for a patient does not provide the reasoning it used to make that prediction. Additional efforts are needed to reveal how algorithms arrive at a decision or prediction so that the process becomes transparent to scientists and clinicians. Making these algorithms transparent could help researchers identify new biological features relevant to disease diagnosis or treatment.
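One generic way to probe which inputs a trained model relies on is permutation importance: shuffle one input feature at a time and measure how much the model's accuracy drops. The sketch below is only an illustration of that idea on made-up data, not a method described by NCI. The names `accuracy`, `permutation_importance`, and `toy_model` are hypothetical, and the toy model stands in for a real DL model.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle only column j, leaving every other feature untouched.
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical stand-in for a trained model: it looks only at feature 0.
def toy_model(row):
    return int(row[0] > 0.5)


# Made-up data: each row is a tuple of two numeric features.
rows = [(i / 9, ((i * 7) % 10) / 9) for i in range(10)]
labels = [toy_model(r) for r in rows]
scores = permutation_importance(toy_model, rows, labels)
```

In this toy setup the score for feature 1 is exactly zero because the model never reads it, while shuffling feature 0 lowers accuracy. With a real model, scores like these only hint at which inputs matter; they do not explain the model's reasoning.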
Incorporating information about biological processes into the algorithm is likely to improve its accuracy and decrease dependence on large amounts of annotated data, which may not be available. One danger of the “black box” problem is that DL may inadvertently perpetuate existing unconscious biases. Researchers need to carefully consider how potential biases affect the data being used to develop a model, adopt practices to address and monitor those biases, and track the performance and applicability of AI models.
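As one illustration of monitoring a model for uneven performance, the sketch below compares sensitivity (the share of true cases the model detects) across patient subgroups and flags any group that falls well behind the best one. It is a minimal example under stated assumptions: the function names, the record layout, and the 0.1 gap threshold are hypothetical choices, and the data are made up.

```python
from collections import defaultdict


def sensitivity_by_group(records):
    """records: iterable of (group, true_label, predicted_label), labels 0 or 1.

    Returns {group: sensitivity} for every group with at least one true case.
    """
    true_pos = defaultdict(int)
    false_neg = defaultdict(int)
    for group, truth, pred in records:
        if truth == 1:
            if pred == 1:
                true_pos[group] += 1
            else:
                false_neg[group] += 1
    groups = set(true_pos) | set(false_neg)
    return {g: true_pos[g] / (true_pos[g] + false_neg[g]) for g in groups}


def flag_gaps(sensitivities, max_gap=0.1):
    """Groups whose sensitivity trails the best group by more than max_gap."""
    best = max(sensitivities.values())
    return sorted(g for g, s in sensitivities.items() if best - s > max_gap)


# Made-up predictions for two hypothetical subgroups, "A" and "B".
records = [
    ("A", 1, 1), ("A", 1, 1), ("A", 1, 0), ("A", 0, 0),
    ("B", 1, 1), ("B", 1, 0), ("B", 1, 0), ("B", 0, 1),
]
sens = sensitivity_by_group(records)
flagged = flag_gaps(sens)  # ['B']: sensitivity 1/3 versus 2/3 for group A
```

Sensitivity is only one possible metric. A real audit would likely also compare specificity and calibration, and would need enough cases per subgroup for the differences to be meaningful.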
With increased investments, NCI’s efforts to realize AI’s potential will lead to more accurate and rapid diagnoses, improved clinical decision-making, and, ultimately, better health outcomes for patients with cancer and those at risk.