Chapter 3: Statistics Concepts and Principles for Cancer Data Science

Do you need to be a math genius to understand the statistical concepts of data science? Don’t let the math daunt you. In this chapter, you will learn how to avoid common pitfalls in designing studies for data science and explore common statistical concepts you’ll need to know.

Watch the Video

Watch 5 Common Stats Questions from Early Career Researcher (approx. 6 minutes long).

Test Your Knowledge

You’re planning to do a study. When is the best time to talk to a statistician?

A. Before you determine your hypothesis.
B. When you’re designing your study.
C. After you’ve collected your data. 
D. Once you’re trying to choose what formula to use for analysis.
E. Never. Statisticians are in a separate field and won’t be able to advise cancer researchers.

The correct answer is B. 
You don’t have to know all there is to know about statistics for cancer data science, as long as you recognize the importance of engaging statistical expertise as needed. Statisticians can provide advice on study design and choice of analysis methods or assist with analyses. You can contact NCI and other NIH statisticians who may consult on your research project or join the research team as full collaborators.

The incorrect answers are: 

A. You should determine your hypothesis before reaching out to a statistician. Once you know what question you want to answer, the statistician can help advise you on study design.

C. You should talk to a statistician when designing your study. If there’s a flaw in your study design, it may be too late to fix by the time you collect and start to analyze your data, and your findings may not be statistically valid.

D. You should talk to a statistician when designing your study. If there’s a flaw in your study design, it may be too late to fix by the time you collect and start to analyze your data, and your findings may not be statistically valid.

E. Statisticians are able (and often happy) to advise cancer researchers. They can help you avoid serious flaws in your study design.

  • Reporting Guidelines: Find reporting guidelines for a wide variety of health research studies under EQUATOR (Enhancing the QUAlity and Transparency Of health Research). Review the guidelines relevant to your type of study, along with their accompanying explanatory publications, while you are planning your study. Doing so gives you a good sense of what you will need to report and why each of those elements matters for others to evaluate the quality of your study and properly interpret its results.

These courses require sign-up to attend. Keep an eye on their calendars to see when they become available.

Keep Going

Continue to Chapter 4 to learn about big data technologies we think can accelerate your education and research.

Instructor

Lisa McShane, Ph.D., NCI Division of Cancer Treatment and Diagnosis (DCTD)
Dr. McShane is the associate director of NCI DCTD’s Biometric Research Program. She is an internationally recognized expert on precision medicine clinical trial design, the development of tumor markers and omics predictors, and reporting guidelines for health research studies.

For questions and feedback about this chapter, email our team at ncicbiit@mail.nih.gov

If you would like to reproduce some or all of this content, see Reuse of NCI Information for guidance about copyright and permissions. In the case of permitted digital reproduction, please credit the National Cancer Institute as the source and link to the original NCI product using the original product's title; e.g., “Chapter 3: Statistics Concepts and Principles for Cancer Data Science was originally published by the National Cancer Institute.”
