
Career Confessions From a Cancer Data Scientist

We asked eight cancer data scientists across the National Cancer Institute (NCI) to share their advice and career journeys. Together, they help answer the question: What should I know to start my career in cancer data science?

Featuring insights from:

  • Yuri Kotliarov, Ph.D. – Computational Biologist, NCI DCTD
  • Jay Donquillo, M.D. – Data Scientist, NCI CBIIT
  • Dana Wolff-Hughes, Ph.D. – Program Director, NCI DCCPS
  • Stephanie Harmon, Ph.D. – Staff Scientist, NCI CCR
  • Yu Fan, M.S. – Bioinformatician, NCI CBIIT
  • Roxanne Jensen, Ph.D. – Program Director, NCI DCCPS
  • Shashi Ratnayaka, M.S. – Bioinformatician, NCI CBIIT
  • Peng Jiang, Ph.D. – Investigator, NCI CCR


What Did You Wish You Knew at the Beginning?

Yu Fan:
“I wish I spent more time on statistics. It will help you choose the right algorithm or model for your research.”

Shashi:
“Personally, I would’ve considered pursuing a computer science degree. I like coding, and it helps to understand major tools and packages used in data science.”

Yuri:
“One of my early mistakes was thinking data was just data. If you knew the structure, you could analyze it. I’ve since learned you must know both how the data was generated and the biology behind it.”

Dana:
“I didn’t think of myself as a data scientist at first—I was working with many of the tools. Whether you start in this career or not, data science is a tool you can learn to complement whatever field you are in.”

Peng:
“Data science is not for one individual. To do meaningful work you need collaborators from other fields (e.g., cancer biologists for bioinformatics). Teamwork is key.”

Stephanie:
“The knowledge transfer with clinicians is so important—context for deployment and development of algorithms. I didn’t ask enough questions as a graduate student. Turns out, that was really important.”

Jay:
“Biomedical informatics and data science apply across healthcare. From specific conditions to patient data to education policies—broad initiatives all benefit.”

Roxanne:
“I wish I knew more about how best to work with teams. Technology evolves quickly and it’s always a challenge to balance the fast-moving tech world with the steady research environment.”

Advice for Those Starting Out

Yu Fan:
“Don’t let the learning stay in the lecture! Get hands-on research or internship experience as early as possible.”

Shashi:
“Hands-on experience is critical at the beginning of your career.”

Stephanie:
“Jump in! There are so many publicly available data sets. I encourage people to start exploring them right away.”

Dana:
“Data science is a team science. Don’t try to do it all yourself. Find others who can help you.”

Jay:
“Build a strong scientific and technical foundation. Find mentors to guide you and seek out opportunities.”

Roxanne:
“When I started, I was in Japan where most textbooks were in Japanese. But there are lots of resources now! Take advantage of the online community and ask questions.”

Peng:
“A thorough understanding of biology is essential, especially for those from math and computer science. Read basic textbooks, keep up with new research, and know when not to pursue less useful problems.”

Skills a Cancer Data Scientist Needs

  • Programming
  • Stats and Math
  • Science Backgrounds
  • Technology and Trends
  • Collaboration

What Programming Languages Do They Work With?

  • R – 67% of respondents use it
  • Python – 67% of respondents use it
  • C or C++ – 17% of respondents use it
  • Matlab – 17% of respondents use it
  • SPSS – 17% of respondents use it

If you would like to reproduce some or all of this content, see Reuse of NCI Information for guidance about copyright and permissions. In the case of permitted digital reproduction, please credit the National Cancer Institute as the source and link to the original NCI product using the original product's title; e.g., “Career Confessions From a Cancer Data Scientist was originally published by the National Cancer Institute.”
