Computational Genomics Research

The Genomic Data Commons (GDC) was launched at the University of Chicago on June 6, 2016. The GDC is a unified data system that promotes sharing of genomic and clinical data between researchers.

Credit: Univ. of Chicago

Computational genomics applies algorithms and statistical models to large datasets. NCI's Office of Cancer Genomics (OCG) generates large genomic and clinical datasets through the Genome Characterization Pipeline, shares them through the Genomic Data Commons (GDC), and makes them accessible on commercial clouds by partnering with the NCI Cloud Resources. Members of OCG’s Genomic Data Analysis Network and external researchers from around the world develop and apply a range of computational techniques to analyze data in the GDC.

Key Questions

Can analyzing cancer genomic datasets compiled from an exceedingly large number of patients increase our power to discover new cancer driver mutations?
What are the best ways to display cancer genomic data so that cancer researchers can explore and visualize large, complex datasets?
How can investigators effectively integrate data from multiple modes of genomic analysis into a unified view of oncogenic pathways?
What new technologies, such as single-cell DNA and RNA sequencing, provide fresh views of cancer mechanisms?


Programs and Collaborations

Genomic Data Commons

NCI's Genomic Data Commons (GDC) is a data sharing and analysis platform aiming to make genomic and clinical data truly accessible to the cancer research community. In addition to serving as one of the most comprehensive genomic data repositories, the GDC identifies and implements best-in-class bioinformatic pipelines and produces a standardized collection of data that may be cross-analyzed and compared. The GDC also develops web-based, interactive and clinically relevant tools and offers multiple high-performance options for accessing data.
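One of the access options is the GDC's public REST API. As a minimal sketch, the Python below builds a search URL for the API's files endpoint, using its JSON filters parameter to narrow results by project and data category. The function name build_files_query, the chosen fields, and the example project TCGA-BRCA are illustrative assumptions rather than part of any official GDC client, and the code only constructs the URL without contacting the server.

```python
import json
from urllib.parse import urlencode

# Base URL of the public GDC API (assumed current; check the GDC documentation).
GDC_API = "https://api.gdc.cancer.gov"


def build_files_query(project_id, data_category, size=10):
    """Build a URL for the GDC `files` endpoint (hypothetical helper).

    The filter combines two conditions with "and": the file must belong to
    the given project and to the given data category.
    """
    filters = {
        "op": "and",
        "content": [
            {
                "op": "in",
                "content": {
                    "field": "cases.project.project_id",
                    "value": [project_id],
                },
            },
            {
                "op": "in",
                "content": {
                    "field": "data_category",
                    "value": [data_category],
                },
            },
        ],
    }
    params = {
        "filters": json.dumps(filters),  # filters are sent as a JSON string
        "fields": "file_id,file_name,data_type",  # illustrative field choice
        "format": "JSON",
        "size": str(size),  # maximum number of records to return
    }
    return f"{GDC_API}/files?{urlencode(params)}"


# Example: somatic variant files from one project (illustrative values).
url = build_files_query("TCGA-BRCA", "Simple Nucleotide Variation")
print(url)
```

The resulting URL can be opened in a browser or passed to any HTTP client. Keeping URL construction separate from the network call makes the query easy to inspect before any data are requested.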

The Genomic Data Analysis Network

NCI’s Genomic Data Analysis Network (GDAN) is a collaborative team that develops and applies computational analysis methods to large-scale datasets. The GDAN’s goal is to help the research community leverage the genomic data produced by NCI and other programs.