Genomic Data Commons

The NCI Genomic Data Commons (GDC) is a unified knowledge base that promotes sharing of genomic and clinical data between researchers and facilitates precision medicine in oncology.

Cancer is fundamentally a disease of the genome, caused by mutations and other harmful genomic changes that alter its function and contribute to the malignant behavior of cancer cells. Genomic aberrations can influence the aggressiveness of tumors and the response of tumors to particular drugs.

Cancer genomics makes use of advanced DNA sequencing technology to give scientists enormous power to uncover how these genomic changes drive cancer formation and growth.

The GDC contains standardized data from approximately 14,500 cancer patients, drawn from large-scale NCI programs such as TCGA.

These datasets are among the largest and most comprehensive in cancer genomics, together comprising more than two petabytes of data.

The GDC will also soon expand to provide data on over 30,000 cancer patients. The new submissions include data from about 18,000 cancer patients provided by Foundation Medicine, Inc., a molecular information company, and from over 1,000 patients with multiple myeloma contributed by the Multiple Myeloma Research Foundation (MMRF), a non-profit advocacy organization.

By providing an expandable data sharing platform to the cancer research community, the GDC aims to accelerate discoveries in cancer research and promote precision medicine in oncology.

Breaking Down Research Silos

As the success of the landmark TCGA program demonstrates, releasing the knowledge available in these huge datasets requires collaboration and data sharing across the cancer research community.

In today's cancer research framework, several barriers prevent most researchers from fully exploiting the genomic data that are available, impeding progress:

  • Genomic data from different projects, clinical trials, and cancer types are siloed in different locations with local management systems, making it difficult to share the data.
  • The data are often generated using different methods, so that even if researchers can access two different datasets, they cannot use both in a single study.
  • Sophisticated analysis tools that allow researchers to derive useful knowledge from large, complex datasets are not available to all researchers.

The NCI GDC breaks down these barriers by bringing cancer genomics datasets and associated clinical data into one location that any researcher may access, and “harmonizing” the data so that datasets that were generated with different protocols can be studied side by side. Then, by making these data available using secure and compliant cloud technology through the NCI Cancer Genomics Cloud Pilots, the GDC will make it possible for any researcher to ask new and fundamental questions about cancer.

A Strong Foundation for Cancer Research

The NCI GDC is more than just a data repository; it continues to evolve by encouraging independent groups such as clinical research consortia, companies, and advocacy organizations to submit their own cancer genomic data. Submitters can analyze the data they contributed in concert with other datasets in the GDC, while expanding the resources available to the cancer research community.

The GDC will also house data from a new era of NCI programs that will sequence the DNA of patients enrolled in NCI clinical trials. These datasets will lead to a much deeper understanding of which therapies are most effective for individual cancer patients.

With each new addition, such as that from Foundation Medicine, Inc. and others, the GDC will evolve into a smarter, more comprehensive knowledge base that will foster important discoveries in cancer research and increase the success of cancer treatment for patients.

This NCI initiative is being built and managed by the University of Chicago Center for Data Intensive Science, in collaboration with the Ontario Institute for Cancer Research and under a subcontract with Leidos Biomedical Research.

Learn more about the GDC from CCG Director Louis M. Staudt, M.D., Ph.D., and other GDC experts in their article entitled Toward a Shared Vision for Cancer Genomic Data, published in the New England Journal of Medicine. To find out more about the resources provided for data access, analysis, and submission, download the GDC Fact Sheet.