Special Issue: Bioinformatics
Cancer Genomics: Building Haystacks, Finding Needles
Without computers and sophisticated mathematics to organize and sift through the recent explosion of genomic information about tumors, important clues to cancer might have remained hidden within jumbles of genetic code.
But of course this has not happened. Instead, with increasing efficiency over the last decade, biological information has been gathered, stored on computer servers, and shared through the Internet. With the help of informatics, researchers around the world have been mining this repository of data and uncovering more than a few new cancer-related findings.
One surprise was the recent discovery of fused genes in prostate cancer. Gene fusions, which arise when DNA sequences from two genes merge inappropriately, are a hallmark of cancers of the blood. But they had eluded detection in "solid" tumors until 2005, when a bioinformatics approach was applied to the challenge.
Researchers at the University of Michigan Medical School developed an algorithm to search an online database called Oncomine for unusual patterns of gene activity in subsets of prostate tumors. The search pointed to genes that were unusually active in only a fraction of tumors, and follow-up work traced that activity to gene fusions. It's now known that fused genes are common in prostate tumors and may drive the disease. Fused genes have also been found in lung tumors and may yet be discovered in other common cancers.
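To give a sense of what a search for "unusual patterns of gene activity in subsets" of tumors can look like, here is a minimal sketch of an outlier-style score. It is an illustration, not necessarily the Michigan group's exact method: each gene's values are centered on the median, scaled by the median absolute deviation, and summarized by a high percentile, so a gene that is extremely active in only a few samples scores high. The gene names, expression values, and the function name `outlier_score` are all hypothetical.

```python
from statistics import median

def outlier_score(values, percentile=0.9):
    """Median-center the values, scale by the median absolute deviation (MAD),
    and return the scaled value found at the requested percentile."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No spread around the median: nothing can be called an outlier.
        return 0.0
    scaled = sorted((v - med) / mad for v in values)
    index = min(len(scaled) - 1, int(percentile * len(scaled)))
    return scaled[index]

# Hypothetical expression values for two genes across ten tumors.
# GENE_B is strongly active in only two of the ten samples.
expression = {
    "GENE_A": [1.0, 1.1, 0.9, 1.0, 1.2, 0.8, 1.0, 1.1, 0.9, 1.0],
    "GENE_B": [1.0, 1.1, 0.9, 1.0, 1.2, 0.8, 1.0, 9.0, 8.5, 1.0],
}

# Rank genes so that those with the most extreme subset-specific activity come first.
ranked = sorted(expression, key=lambda g: outlier_score(expression[g]), reverse=True)
print(ranked)
```

Using the median and MAD rather than the mean and standard deviation keeps the handful of extreme samples from inflating the scale they are measured against.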
Like many online databases, Oncomine can be used to explore diverse questions about cancer biology, and it could not have been built in the days before bioinformatics. The database contains results from 20,000 microarray experiments, most of which collected information on thousands of genes.
Another such bioinformatics resource is the Connectivity Map. Developed at the Broad Institute and supported by NCI's Integrative Cancer Biology Program, this is an online database of gene signatures that can help identify potential drugs for treating disease. Users can search for drugs that modify the genetic program of a cancer cell in a way that may benefit patients. Recent studies have yielded candidates for targeting leukemia stem cells and treating a rare leukemia.
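The idea of searching for drugs that push a cancer cell's genetic program in a helpful direction can be sketched in a few lines. The example below is a deliberate simplification: the Connectivity Map itself relies on a rank-based enrichment statistic, whereas this stand-in simply compares average expression changes. The drug names, gene names, numbers, and the function `reversal_score` are hypothetical.

```python
from statistics import mean

def reversal_score(drug_profile, disease_up, disease_down):
    """Mean drug-induced change in genes that are up in disease, minus the mean
    change in genes that are down in disease. A strongly negative score suggests
    the drug pushes expression in the opposite direction to the disease."""
    up = [drug_profile[g] for g in disease_up if g in drug_profile]
    down = [drug_profile[g] for g in disease_down if g in drug_profile]
    if not up or not down:
        raise ValueError("drug profile shares no genes with the disease signature")
    return mean(up) - mean(down)

# Hypothetical disease signature: genes raised and lowered in the cancer cells.
disease_up = {"G1", "G2"}
disease_down = {"G3", "G4"}

# Hypothetical expression changes caused by each drug (positive = raised).
drug_profiles = {
    "drug_x": {"G1": -2.0, "G2": -1.5, "G3": 1.0, "G4": 2.0},
    "drug_y": {"G1": 1.0, "G2": 0.5, "G3": -1.0, "G4": -0.5},
    "drug_z": {"G1": 0.1, "G2": -0.1, "G3": 0.0, "G4": 0.1},
}

# Most negative score first: the strongest candidates for reversing the signature.
candidates = sorted(
    drug_profiles,
    key=lambda d: reversal_score(drug_profiles[d], disease_up, disease_down),
)
print(candidates)
```

Here `drug_x` ranks first because it lowers the genes the disease raises and raises the genes the disease lowers; any such candidate would still need laboratory validation.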
As with all such computational predictions, the findings need validation. Nonetheless, bioinformatics can provide leads when there are few other options. For example, computational algorithms have uncovered what are truly needles in the haystack of the human genome: microRNAs. First identified in other species, these snippets of genetic material are only about 22 nucleotides long, yet they are important regulators of genes and have been linked to cancer and metastasis.
Bioinformatics can also help with a central challenge of cancer genomics: distinguishing genetic alterations that initiate and fuel cancers (known as drivers) from changes that are merely present in tumors but do not contribute to the disease (the passengers). Computational tools can help identify potential drivers based on statistical measures such as how often a mutation recurs across tumors.
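One simple way to turn "how common a mutation is" into a statistical measure is to ask how surprising a gene's mutation count would be if mutations arose only by chance. The sketch below uses a one-sided binomial test for that purpose. It assumes a single background mutation rate per gene per tumor, which is a simplification, since real analyses typically adjust for factors such as gene length. The gene names, counts, background rate, and the function `recurrence_p_value` are hypothetical.

```python
from math import comb

def recurrence_p_value(mutated, tumors, background_rate):
    """Probability of seeing at least `mutated` mutated tumors out of `tumors`
    if each tumor carried a mutation in the gene with probability
    `background_rate` purely by chance (one-sided binomial tail)."""
    return sum(
        comb(tumors, k) * background_rate**k * (1 - background_rate) ** (tumors - k)
        for k in range(mutated, tumors + 1)
    )

# Hypothetical cohort: 200 tumors and an assumed 2% chance background rate.
tumors = 200
background_rate = 0.02
mutation_counts = {"GENE_X": 30, "GENE_Y": 5}

p_values = {
    gene: recurrence_p_value(count, tumors, background_rate)
    for gene, count in mutation_counts.items()
}

# A very small p-value marks a gene mutated far more often than chance predicts,
# making it a candidate driver; a large one is consistent with a passenger.
for gene, p in sorted(p_values.items(), key=lambda item: item[1]):
    print(gene, p)
```

With these numbers, chance alone would produce about four mutated tumors per gene, so `GENE_X` stands out while `GENE_Y` does not. A low p-value only nominates a candidate; it does not prove that the gene drives the disease.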
With all studies of cancer genomes, the goal is always to translate knowledge into improvements in the prevention, detection, and treatment of the disease. Being able to collect and compare large amounts of clinical and genomic data can help achieve this goal, as a recent finding from The Cancer Genome Atlas (TCGA) project suggests.
By comparing genomic and epigenomic data on brain tumors with the treatment records of patients, the study uncovered a potential mechanism of resistance to a cancer drug. While the insight itself is perhaps not unusual, the promise of bioinformatics is that by making such comparisons on a large scale and with powerful analytical tools, scientists can accelerate the pace of discovery.