Researcher Interview: Tom Hudson
September 9, 2015, by Emma J. Spaulding, M.P.H.
Tom Hudson, M.D., President and Scientific Director of the Ontario Institute for Cancer Research (OICR), Chair of the Executive Committee for the International Cancer Genome Consortium (ICGC) and member of the Global Alliance for Genomics and Health, spoke with Emma J. Spaulding, M.P.H., for this Researcher Interview.
Emma Spaulding (ES): Can you explain the ICGC and describe the most exciting projects happening there right now?
Tom Hudson (TH): ICGC is an international collaborative project to obtain a comprehensive description of genomic, transcriptomic and epigenomic changes in at least 50 different tumor types or subtypes around the world that have a significant public health impact. We’re still in a period of growth, even for the first wave of analyses. By the end of the next data release, there will be data from 55 tumor types in the public domain. Each quarter, we collect an average of 1,000 more tumor samples and that growth has been steady for the past two years.
One of the things I am most impressed by at ICGC is the increasing geographic diversity of samples. For example, China has released new data sets on colorectal cancer. Pathologists tell me that when they look at samples under the microscope that come from colon cancer patients in Asia or North America, there can be noticeable differences. They’re actually slightly different diseases. While there are similar projects in North America, such as TCGA, we have large data sets from many parts of the world. I think we’re going to see distinct mutational profiles because of population-level differences in exposures, diet and other factors.
The potential variety of cancer mutations around the world is not well understood. To better elucidate it, we need to be able to do comparative studies. Not only will this research reveal new mutations and driver genes, it will help us start to identify why we see such diversity in cancer types and subtypes across the world.
These studies could shed light on the environmental differences for cancers around the world. A lot of data will be from clinical trials and some from cohort studies or epidemiological studies where we have rich environmental information. So, we’ll be able to link more risk factors with mutation profiles.
ES: What additional international tumor projects is ICGC planning?
TH: Well, there’s a project coming out of France on liver hepatocellular macronodules. It’s a relatively slow-growing cancer, with some aggressive exceptions. We’ll be able to examine mutational differences between benign and aggressive tumors. Again, this is a great resource for the research community to examine the evolution of cancer mutations in benign and aggressive subtypes.
So far, we’ve published on the most common cancers, but there are many forms of rare cancers across the world. Rare cancers were missing from the first wave of projects. About 30 percent of cancer patients actually have a rare form, but there are so many subtypes it takes longer to collect samples for these projects. We’ll see the next phase of ICGC include more rare forms of cancer, such as the French project on the bone cancer Ewing sarcoma, a gallbladder cancer project from Singapore, a type of endocrine cancer from Italy, and TCGA’s clear cell kidney cancer project.
ES: What are the most exciting recent advances in genome sequencing?
TH: Next-generation sequencing can now generate high-quality data from formalin-fixed, paraffin-embedded samples. We weren’t able to do that with previous technology. We had to work with fresh tissue. Now, we have the ability to examine samples from existing biobanks, tumor repositories, previous cohort studies, and other similar collections. From there, we can select cases that we think will be informative for biomarker generation or for linking risk factors or treatment outcomes with mutation profiles, for example. This will lead to better biomarkers for prognosis or prediction of treatment outcomes.
ES: What other projects does ICGC have in the works?
TH: The Pan-Cancer Analysis of Whole Genomes (PCAWG) is a coordinated effort between TCGA and ICGC to analyze 2,500 whole cancer genomes, along with their matched normal samples. To date, many studies have focused on analyzing the exome, or protein-coding regions of the genome. Our goal is to explore the non-coding parts of the genome, because we believe there are many non-coding driver mutations.
This very large collaboration is re-analyzing the raw data sets to make sure they’re harmonized before analysis begins. We hope that our research on genomes will reveal even more information than previous projects focused on only exomes. There’s a lot of unexplored space in the genome. I think in the next two or three years, we’ll discover more about cancer genomes and what’s shaping cancer because of these studies.
ES: What are the major challenges for the future of cancer genomics and how should the cancer genomics research community solve these problems?
TH: What I struggle with the most is the logistics of data sharing – both in terms of the size of the data and privacy protection. These data are so big that very few institutions in the world can actually download them for analysis. We need to look at new IT solutions which will make sure all approved scientists have access to the data and access to compute environments where they can actually perform the query.
Even with the best bioinformatics teams in the world, we are still struggling with the size. It takes four months to move the whole ICGC data set from one data center to another. Can you imagine thousands of labs trying to download that information? There’s not enough capacity!
We need to ask ourselves, “How do we exploit novel IT solutions for big data for health research information?” The solution requires both novel algorithm development and better methods of computing and analyzing the data.
We also need policy changes; we especially need clear policy on how to protect confidential information. There are differences between countries on privacy protection leading to the question: "How do we get equivalent safeguard mechanisms in place so there can be true data sharing for international projects?" That’s how we came to develop the Global Alliance for Genomics and Health. Many of us in the genomics research community realized that we needed new solutions for data sharing.
ES: What are the goals for the Global Alliance for Genomics and Health?
TH: The goal is to create harmonized standards for datasets, just like how my cell phone can contact your cell phone even though they’re created and run by different companies. Just as many companies have established standards for transmitting data between cell phones, we need to create the same standards for transmitting bioinformatics data.
The difference is that for genomic data we need to ensure the highest levels of data security as the data are transmitted across borders. In other sectors, like finance, some equivalent procedures have been established for sharing of non-health-related data.
There’s been a lot of momentum in the Global Alliance for Genomics and Health in the past two years. Several major organizations have put their efforts behind it: the NIH, the Wellcome Trust, OICR, and the Broad Institute. Key leaders in the cancer genomics community have coalesced around this because we can see the importance.
I have been so impressed with the dedication of the group, especially the scientists who are volunteering to take the lead on some projects. These people are so motivated and willing to put in their personal time. By working together, we can answer more difficult questions. I’m very much looking forward to continuing to work with the Global Alliance for Genomics and Health and to what the future will bring for it.