
Access OCG Data

An overarching goal of the Office of Cancer Genomics (OCG) is to create genomics resources for the research community, and data is a key resource. OCG strives to generate high-quality, accessible genomic data and to disseminate it to the research community in a timely manner, in accordance with the National Institutes of Health’s data sharing policies.

Data is generally made available to the public once OCG researchers have published an initial overview and analysis of the data. The majority of genomics data generated by OCG programs is available through the Genomic Data Commons (GDC). Analysis data and supplementary data files generated by program researchers are available through publication pages at the GDC. Raw and harmonized genomic characterization data (i.e., primary data) is available through the GDC Data Portal.

Data is open access when possible. However, certain data that may contain patient-identifying information, such as raw DNA sequencing data, is controlled access. GDC guidelines describe how to apply for access through NIH’s Database of Genotypes and Phenotypes (dbGaP) using the study accession numbers below. Researchers using OCG program data in their work are also encouraged to check program descriptions for appropriate acknowledgement.
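As an illustration of the open-access versus controlled-access distinction, the sketch below builds a query URL for the public GDC API that asks only for open-access files from one program. It is a minimal example, not an official OCG or GDC recipe. The endpoint `https://api.gdc.cancer.gov/files`, the field names `cases.project.program.name` and `access`, and the helper names `build_open_access_filter` and `build_query_url` are assumptions made for illustration and should be checked against the current GDC API documentation.

```python
import json
from urllib.parse import urlencode

# Assumed public GDC API endpoint for file metadata.
GDC_FILES_ENDPOINT = "https://api.gdc.cancer.gov/files"


def build_open_access_filter(program_name):
    """Build a GDC-style filter: files from one program that are open access.

    The field names are assumptions based on the GDC API's nested filter
    format; verify them against the GDC documentation before relying on them.
    """
    return {
        "op": "and",
        "content": [
            {
                "op": "in",
                "content": {
                    "field": "cases.project.program.name",
                    "value": [program_name],
                },
            },
            {
                "op": "in",
                "content": {"field": "access", "value": ["open"]},
            },
        ],
    }


def build_query_url(program_name, size=5):
    """Return a GET URL that requests a few open-access file records."""
    params = {
        "filters": json.dumps(build_open_access_filter(program_name)),
        "fields": "file_id,file_name,data_category,access",
        "format": "JSON",
        "size": str(size),
    }
    return GDC_FILES_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    # Print the URL only; fetching it is left to the reader.
    print(build_query_url("TARGET"))
```

Changing the `access` value from `open` to `controlled` would list controlled-access files instead. Downloading those files still requires dbGaP authorization for the relevant study accession.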

OCG Genomic Data Resources by Program

Program | Description
CGCI | Clinical, biospecimen, and molecular characterization data for selected rare cancers (phs000235).
CTD2 | Experimental data investigating cancer targets and drug combinations.
Exceptional Responders | Clinical, biospecimen, and molecular characterization data for patients with unexpected and long-lasting responses to treatment (phs001145).
HCMI | Clinical, biospecimen, and molecular characterization data for patient-derived next-generation cancer models such as organoids (phs001486).
TARGET | Clinical, biospecimen, and molecular characterization data for pediatric cancers (phs000218).

TCGA & Continuing Analyses Genomic Data Resources

Topic | Description
TCGA | Clinical, biospecimen, molecular characterization, and imaging data for samples from 11,000 patients spanning 33 cancer types. All primary TCGA and subsequently produced data can be accessed via phs000178.
PanCancer Atlas | A collection of studies analyzing TCGA data as a whole, investigating cross-cancer topics: cell of origin, oncogenic processes, and oncogenic pathways.
ATAC-seq | Genome-wide chromatin accessibility profiles of 410 tumor samples spanning 23 cancer types from TCGA.
Ancestry and Molecular Correlates | A study of ancestry effects on mutation rates, DNA methylation, and mRNA and miRNA expression among TCGA patients.

If you would like to reproduce some or all of this content, see Reuse of NCI Information for guidance about copyright and permissions. In the case of permitted digital reproduction, please credit the National Cancer Institute as the source and link to the original NCI product using the original product's title; e.g., “Access OCG Data was originally published by the National Cancer Institute.”
