Automated Real-World Data Integration Improves Cancer Outcome Prediction

December 11, 2025 | 10:00 AM – 11:00 AM

Virtual

Watch the recording

To learn how machine learning can help you tap into the rich data within electronic health records, join us for a webinar on the “Memorial Sloan Kettering (MSK) Clinicogenomic Harmonized Oncologic Real-World Data set,” also known as MSK-CHORD.

This cancer data set combines annotations derived through natural language processing with structured medication, demographic, and genomic data from the MSK Cancer Center. The result is a data set of more than 25,000 cancer patients in which researchers can explore relationships between clinical outcomes and the genome.

During the webinar, you can ask MSK computational biologist and NCI grantee Dr. Nikolaus Schultz questions about the data set, such as:

  • How can I leverage MSK-CHORD to train my machine learning model to predict overall cancer survival?
  • How can MSK-CHORD uncover predictors of metastasis to specific organ sites?
  • How efficient is the annotation of unstructured notes, and how useful is MSK-CHORD for predicting patient outcomes?

You can find some of the resulting data through cBioPortal.

About the Speaker

Nikolaus Schultz, Ph.D.

Dr. Schultz is an attending computational oncologist in the Department of Epidemiology and Biostatistics and a Joint Member of the Human Oncology and Pathogenesis Program at MSK Cancer Center. He heads the new Cancer Data Science Initiative, which aims to abstract and integrate multi-modal patient data from various institutional resources and make the data accessible to, and interpretable by, clinicians and scientists.

NCI’s Informatics and Technology for Cancer Research (ITCR) program funded Dr. Schultz’s work to develop the cBioPortal for Cancer Genomics, a web-based resource for analyzing complex cancer genomics data.
