CPTAC, the Complementary Sibling of TCGA: An Interview with Dr. Henry Rodriguez about NCI’s Proteomics Program

September 18, 2015, by NCI Staff

NCI’s Clinical Proteomic Tumor Analysis Consortium is focused on studying whether proteomics data can be used to help improve cancer diagnosis, treatment, and prevention.

Credit: National Cancer Institute

Henry Rodriguez, Ph.D., M.B.A., is director of NCI’s Office of Cancer Clinical Proteomics Research. In this interview, Dr. Rodriguez talks about proteomics and NCI’s Clinical Proteomic Tumor Analysis Consortium (CPTAC).

What is proteomics?

Proteomics is a highly automated and rapid method for measuring all the proteins in a biological sample. Proteins are the molecules that actually do most of the work inside a cell. When researchers develop cancer drugs, those drugs typically target proteins, so scientists and clinicians really have to understand what the proteins are doing.

Proteomics researchers are now able to measure up to 10,000 proteins per tumor sample. This is termed “discovery proteomics,” which refers to the unbiased identification and quantification of as many proteins as possible in a biological sample.

How are proteins measured?

Proteins are large molecules and, historically, identifying and measuring them has been a challenge. A technique called mass spectrometry has been used for more than 30 years to measure small molecules and metabolites. But in the last several years, it’s been adapted as a way to help measure proteins.

Doing so, however, requires breaking proteins in a tissue sample into small fragments (called peptides) using enzymes that cut the protein at specific locations. Once the protein fragments have been measured, researchers need to put the puzzle pieces back together to identify the full protein. Researchers use the information contained in the genome to determine which pieces go together.
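The cut-and-reassemble idea described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not CPTAC's actual pipeline: the interview does not name the enzyme, so the sketch assumes a simplified trypsin-like rule (cut after K or R unless the next residue is P); the protein names and sequences are invented; and real workflows match mass spectra against predicted peptides rather than comparing strings directly.

```python
def digest(sequence):
    """Split a protein sequence into peptides.

    Assumed rule (simplified trypsin): cut after K or R,
    except when the next residue is P.
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Keep any trailing fragment that does not end in K or R.
        peptides.append(sequence[start:])
    return peptides


def match_peptides(observed, reference):
    """Map each observed peptide to the reference proteins that could yield it.

    `reference` is a dict of protein name -> sequence, standing in for the
    protein sequences predicted from the genome.
    """
    index = {}
    for name, sequence in reference.items():
        for peptide in digest(sequence):
            index.setdefault(peptide, set()).add(name)
    return {peptide: sorted(index.get(peptide, ())) for peptide in observed}


# Hypothetical reference sequences, made up for this example.
REFERENCE = {
    "protein_A": "MKWVTFISLLRGAPK",
    "protein_B": "MKTTAYLRGAPK",
}

if __name__ == "__main__":
    observed = ["WVTFISLLR", "GAPK", "XXXXK"]
    for peptide, proteins in match_peptides(observed, REFERENCE).items():
        print(peptide, "->", proteins or "no match")
```

In this toy case, one peptide points to a single protein, one is shared by both proteins and so cannot distinguish them on its own, and one has no match in the reference. Ambiguity of that kind is part of why putting the puzzle pieces back together takes work.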

The term “targeted proteomics” refers to measurements focused on a defined subset of proteins of interest in a biological sample. Once the protein targets of interest are identified, high-throughput targeted assays are developed for confirmatory studies: tests to affirm that the initial tests were accurate. If higher detection sensitivity is required (for example, when a protein is present at very low concentrations), researchers use antibodies that bind to the proteins of interest to amplify the signal produced by mass spectrometry.

But antibodies have historically been problematic reagents; they are the culprit in many irreproducible studies, and many commercially available antibodies are poorly characterized, which means that researchers don’t actually know what they bind to.

To address the lack of affordable, well-characterized and analytically validated antibodies available to the scientific community, CPTAC leadership also built an antibody characterization lab at NCI’s Frederick National Laboratory for Cancer Research. Every year, researchers in this resource lab solicit the cancer research community for feedback on which protein targets are in need of a good antibody. If a new antibody is warranted, they will develop it, characterize it, and make it available to the research community.

What is the CPTAC?

The CPTAC is a collaborative consortium of institutions and investigators with expertise in proteomics, genomics, cancer biology, oncology, bioinformatics, and clinical chemistry. They perform coordinated research projects to identify the proteins found in cancer specimens whose genomes have already been characterized, such as in NCI’s The Cancer Genome Atlas (TCGA) program.

One of the tenets of the consortium is that all the proteomics data is made publicly available in a repository that is accessible by the global research community, similar to the TCGA data portal.

What is CPTAC’s goal?

The goal of the CPTAC program is to shed new light on tumor biology—biology that remains unclear based on genomic analysis alone—while accelerating research in cancer proteomics and genomics by disseminating research resources for the scientific community, such as data, assays, and reagents. We believe that proteomics data can eventually complement genomic and transcriptomic analyses to improve diagnostics, therapeutics, and prevention for cancer.

Before we could really get to that point, though, the CPTAC members first had to demonstrate that proteomics methods were reliable, quantifiable, and reproducible.

CPTAC’s first program, from 2006 to 2011, focused on understanding and addressing the experimental and analytical sources of error in existing proteomics technologies. Standardization of proteomics methods was particularly important for making sure that data produced by different proteome characterization centers were reproducible and could be combined into a coherent dataset.

Now that CPTAC researchers have successfully demonstrated the reliability of proteomic methods, they are pursuing a more difficult question: What is the biological or clinical value of proteomics data?

A challenge for both genomics and proteomics is to find the biological and clinical signal within the treasure trove of data. Imagine you’ve identified 10,000 proteins. While the majority are biologically relevant, most likely few will be clinically meaningful. How do you pick the few that are clinically relevant? Finding the needles in the haystack, or uncovering the biological and/or clinical meaning from these datasets, is the role of cancer biologists, in collaboration with the clinical community, in the years to come.

I used to believe that producing the proteomics data would be the hardest part. Although the process of producing proteomics data has become much more efficient, researchers are only starting to extract knowledge out of that data. It is so rich, and there are so many ways to look at it. Figuring out the biological and clinical significance of all this data is still an art more than a science. You can have many investigators, and each is going to look at the data in a different way. That is the beauty of publicly sharing the research data.

What are the future directions for CPTAC?

CPTAC has established that proteomics methods—mass spectrometry analysis of protein fragments, bioinformatics methods to piece the fragments back together, and antibody-based targeted assays to detect proteins at low concentrations—are trustworthy.

In the second CPTAC project, researchers at the proteomics characterization centers comprehensively characterized a subset of tumor samples that were initially genomically characterized by NCI’s TCGA program. Investigators studied three cancer types (colorectal, ovarian, and breast tumors), and successfully demonstrated the scientific benefits of integrating proteomic research with genomics to produce a more unified understanding of tumor biology.

For example, in a study published last year in Nature, CPTAC researchers comprehensively analyzed samples from 95 tumors previously analyzed by TCGA researchers. They demonstrated that measurements of messenger RNA do not reliably predict protein abundance and identified possible therapeutic targets and five colon cancer subtypes, including one associated with highly aggressive disease and poor clinical outcome.

But the real promise of proteomics is to apply it in the context of clinical trials, where you have control and treatment groups. Clinicians frequently see patients who don’t respond to treatment in clinical trials the way they anticipated, and genomic analyses often aren’t enough to understand the full picture of the cancer’s biology.

Photo: Henry Rodriguez, Ph.D., Director, NCI Office of Cancer Clinical Proteomics Research

Credit: National Cancer Institute