Researcher Interview: Jinghui Zhang
January 12, 2016, by Amy E. Blum, M.A.
Jinghui Zhang, Ph.D., Chair of the newly established Computational Biology Department at St. Jude Children’s Research Hospital, lead computational analyst for the Pediatric Cancer Genome Project (PCGP), and member of the St. Jude Faculty, spoke to Amy E. Blum, M.A., for this Researcher Interview.
Amy E. Blum: What is your primary area of expertise as a researcher?
Jinghui Zhang: I am a computational biologist, so the work that I do focuses on the development of novel computational methods for data analysis. Specifically, my research in algorithm development aims to identify and analyze chromosomal rearrangements – structural variations, copy number changes, and fusion genes – that cause cancer, with a focus on pediatric cancers. This type of research is critically important because understanding the genomic changes that drive cancer requires advanced computational tools.
The genome of a tumor is fundamentally different from the genome of normal tissue, where each cell has the same genome. Instead, each tumor is more akin to an ecological site; as the tumor evolves and progresses over time, it morphs into a complex combination of multiple genomes. We often study a tumor’s genome by analyzing a tissue biopsy, which represents a small snapshot of the tumor, and we must try to decipher the genomic characteristics of the entire tumor from this small section of tissue. To think that just one tool can illuminate the complexity of cancer biology vastly simplifies the disease. The research community benefits greatly from the continual development of computational tools that help us understand genomic data, and in this effort we must constantly improve the balance between the sensitivity of our tools to changes in the genome and their specificity in identifying the critical changes that drive tumor progression. This is why algorithm development is an important and continuous effort.
AEB: What is your intellectual process for developing new tools?
JZ: My team and I identify areas of need by working closely with scientists and oncologists. The oncologists who are very familiar with a particular disease help us recognize that there are new ways to pull additional data from whole genome sequence that may shed light on the root of their patients’ cancer. For example, when working with scientists studying adrenocortical carcinoma, we recognized that viral DNA integration might be a very important signature in this disease that we had not looked into. To test this hypothesis, we built a new algorithm to identify viral genome sequences in the whole genome sequence of the tumors, including integrated viral DNA from viruses such as herpesvirus, Epstein-Barr virus, and hepatitis B virus.
Another example is our study of pediatric neuroblastoma, in which we identified the important role of telomere length. Before we started the project, we did not anticipate that telomere length derived from whole genome sequences would be an important measure for neuroblastoma. The “abnormal” telomere length in a subset of neuroblastoma and its association with ATRX mutation emerged after we applied our newly developed telomere analysis method to whole genome sequencing data generated from a variety of pediatric cancer types. When we discussed our findings with pediatric neuroblastoma experts, they pointed out that aberrant telomere length occurred in a subtype of neuroblastoma with a chronic or indolent course of disease, highlighting that the relationship between ATRX and telomere length may be clinically significant in neuroblastoma. The beauty of the data and the comprehensiveness of these sequencing approaches, coupled with biological questions from doctors and scientists, guide us to new approaches to extract the most information possible from next generation sequencing.
AEB: What research project are you most proud of?
JZ: I am most proud of the joint TARGET/PCGP project on the diagnosis, relapse, and remission of Acute Lymphoblastic Leukemia (ALL), published in Nature Communications. We looked at a dataset that included diagnosis, relapse, and remission data from twenty cases of ALL, a subset of which also had the whole genome sequence. We found a lot more complexity at the clonal level in leukemia than previously recognized, prompting us to perform deep sequencing on variants and structural rearrangements to map out the clonal structures, that is, the distinct populations of cells within the larger tumor.
What was very striking was that 60 percent of the relapsed tumors appeared to derive from a very minor subclone. We further observed that some of the variants, like kinase activating mutations, originated in subclone populations of the original tumor. Unfortunately, this means that even if a drug kills the majority of a tumor, it may still leave a subclone intact that can grow and become the dominant clone, causing the patient to relapse.
AEB: Does this finding have therapeutic implications for patients with ALL?
JZ: It suggests that some of the monitoring, like minimal residual disease monitoring, may need to become an ongoing activity in order to detect the possible emergence of subclones. Additional monitoring to detect mutations that are characteristic of relapse may also be valuable during the course of treatment. The finding also indicates that combination therapy may be more important than we had anticipated. Treatment with drugs that target subclones concurrently with drugs that target the dominant clone may prevent the emergence of a subclone and therefore prevent relapse.
AEB: You have studied adult cancers as part of The Cancer Genome Atlas (TCGA), and pediatric cancers through TARGET and PCGP. What are the major differences between adult and pediatric cancer?
JZ: People always ask, “Is pediatric cancer a miniature version of adult cancer?” Absolutely not! This is of particular interest to me because I have studied both pediatric and adult cancers. In brain cancer, for example, we discovered that the broad biological pathways influencing these cancers in children and adults are similar, but the key genes are very different. In pediatric brain cancer, mutations in EGFR are quite rare, while EGFR is the dominant player in adult brain cancer. On the other hand, mutations in PDGFRA are very common in pediatric brain cancer but rare in adults. Further, the most recurrently mutated gene in pediatric glioma, which encodes Histone H3, is almost never mutated in adult brain tumors.
Because pediatric cancer is clearly not a miniature version of adult cancer, it must not be treated as such. There must be drug discovery aimed specifically at the biological alterations in pediatric cancer.
AEB: How do differences between pediatric and adult cancers influence research on these diseases?
JZ: The many differences between pediatric and adult cancers lead to different challenges in the research. One major difference that has a large impact on research is the mutation rate, which is much lower in pediatric cancer than in adult cancer. For example, some pediatric diseases have only one gene mutated in the entire tumor genome, while an adult cancer with a high mutation rate, like lung cancer, can harbor hundreds, or even thousands, of mutations. This puts a large emphasis on the sensitivity of the computational algorithms used to analyze pediatric genomic information, because if you were to miss the one mutation, you would have a 100 percent failure rate in finding the cause of the cancer. On the other hand, once you have correctly identified the mutations of the tumor, the analysis to determine the causal mutation of a pediatric cancer is comparatively straightforward.
Another key difference is the much lower incidence of pediatric cancer. This presents a real challenge for research because it is difficult to find enough cases to give the analysis strong power to discover variants associated with cancer. I am very lucky to be involved in consortia like TARGET and PCGP that share data across institutions for collaborative studies to increase the power of the analyses, but I also see a strong need to aggregate data from more institutions around the world. My team and I directly addressed this need with our most recent work, ProteinPaint, a data portal designed to present pediatric cancer genomic data generated by multiple research groups, which we hope will become a platform to facilitate collaboration and data sharing. While ProteinPaint represents a significant step forward, I believe that further expanding our efforts to form cross-institutional, international collaborations must be an important aspect of pediatric cancer research, so that we can work as a community to get more power to understand this disease.