The potential of large studies for building genetic risk prediction models

  • Posted: March 4, 2013
NCI Press Office


NCI scientists have developed a new paradigm for assessing hereditary risk prediction in common diseases, such as prostate cancer. This genetic risk prediction concept is based on polygenic analysis: the study of a group of common DNA sequence variants, known as single nucleotide polymorphisms (SNPs), each of which contributes a very small amount to overall disease risk but which together can have a strong effect. Nilanjan Chatterjee, Ph.D., of the Biostatistics Branch, Division of Cancer Epidemiology and Genetics, NCI, and colleagues assessed the potential of using very large genome-wide association studies (GWAS) to develop polygenic models that can compute hereditary disease risk scores for individuals in the general population. The results of their work appeared March 3, 2013, in Nature Genetics.

The investigators estimate that thousands of SNPs are likely to contribute to common health outcomes ranging from type 2 diabetes to cancer. They project how the predictive accuracy of risk models may improve as GWAS sample sizes increase in the future. For example, they estimate that a polygenic model for prostate cancer built from the largest GWAS currently available would place seven percent of the population in a high-risk group, meaning men with at least twice the average risk of prostate cancer. The investigators project that tripling GWAS sample sizes could improve the model's risk stratification so that 12 percent of the population would be placed in the high-risk group. If polygenic models could be built using very large sample sizes, they could be a useful tool for identifying a significant number of high-risk individuals in the population. However, the authors note that any strategy for early diagnosis and prevention that targets only genetically high-risk individuals would be incomplete. The analysis suggests that information about other risk factors, including family medical history, would be needed to enhance the broader utility of these models.
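
As a rough illustration of the polygenic idea described above, a risk score is commonly computed as a weighted sum of the number of risk alleles a person carries at each SNP. The sketch below is not the authors' model: the function names, the weights, and the threshold are hypothetical values chosen only to show the arithmetic. In practice the weights would be per-allele effect estimates taken from a GWAS.

```python
from typing import Sequence


def polygenic_score(allele_counts: Sequence[int], weights: Sequence[float]) -> float:
    """Weighted sum of risk-allele counts (0, 1, or 2 per SNP).

    `weights` are hypothetical per-allele effect sizes, one per SNP.
    """
    if len(allele_counts) != len(weights):
        raise ValueError("allele_counts and weights must have the same length")
    return sum(count * weight for count, weight in zip(allele_counts, weights))


def high_risk_fraction(scores: Sequence[float], threshold: float) -> float:
    """Fraction of individuals whose score is at or above `threshold`.

    The threshold is an assumed cutoff, not a value from the study.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    return sum(1 for s in scores if s >= threshold) / len(scores)


# Illustrative data only: three SNPs, four individuals.
weights = [0.10, 0.05, 0.20]
genotypes = [[0, 1, 2], [0, 0, 0], [2, 2, 2], [1, 0, 1]]
scores = [polygenic_score(g, weights) for g in genotypes]
print(scores)
print(high_risk_fraction(scores, threshold=0.45))
```

Larger GWAS would be expected to improve the effect estimates used as weights and to add SNPs to the sum, which is how the score could separate a larger share of the population into a high-risk group.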