
Career Confessions: 5 Things a Research Fellow Learned in a Cancer Data Scientist Lab

By Kun Wang, Ph.D.

As part of our Career Confessions from Cancer Data Scientists series, we sit down one-on-one with an NCI data scientist, who shares his career journey.

Kun Wang, Ph.D.
Research Fellow
Cancer Data Science Lab
NCI’s Center for Cancer Research
 

What did you do after getting your doctorate?

I immediately started looking for a postdoctoral training position and landed my current role with NCI. I’ve worked with Dr. Eytan Ruppin in the Cancer Data Science Lab for almost five years now.

How different is your current data science position from when you were pursuing your education?

While working on my doctorate, besides dedicating time to my research, I had to take many classes and work as a teaching assistant. Now, in my current position, I can concentrate on my research without those competing demands.

From the time you began your professional career, what are five things you understand now that you didn’t know while getting your doctorate?

  1. Work on your collaboration skills. While I was earning my Ph.D., my collaboration experience was very limited. Now, I collaborate with many scientists who have different expertise, and I draw on their input for our studies.
  2. Know your data. As data scientists, we do not generate the data; we work with it. Talking to biologists and collaborators about the data helps me learn and answer my questions. We need to know the optimal way to preprocess the data and understand if there are potential confounding factors that can distort our results.
  3. Come up with a good research question. You need to know the field and understand what has and hasn’t been done. Ask yourself: how important is your question, and why do you think it could be impactful? Hopefully, you can convince yourself first and then convince your colleagues.
  4. Understand that biological knowledge is much more powerful than the models. As data scientists, we need good data for our models. That requires a strong understanding of the data's background and of how to interpret the data once you collect it. Many effective models, methods, and algorithms are available or can be developed, but we cannot guarantee that they will work well with the data. You can always incorporate biological knowledge into suitable models to substantially boost their performance.
  5. Develop mentoring skills. I feel fortunate to be mentored by my current principal investigator. He offers me a unique opportunity to co-mentor post-bacs and junior post-bacs in our lab. I dedicate time to teaching them fundamental knowledge and guiding their research. It is essential to know how to inspire others when they get stuck or feel disappointed by failure. You should know how to point them in a new direction.

What programming language do you use the most?

I use R more than other languages because, for me, it makes the data I work with easier to process. It is an open-source programming language that is especially powerful for statistical analyses and visualization. I can almost always find a suitable R package to address my specific challenges because so many new R packages and tools have been developed and released.

What closing advice do you have for anyone pursuing a post-doctorate in a data science-oriented field?

There are times when I try to pursue perfection, but you must remember there is no perfection in research. It is good enough to develop experiments, test hypotheses, and compare your approach with other methods based on your best knowledge and reasonable assumptions. In my opinion, young scientists struggle with figuring out when it’s time to converge. I’ve learned it’s essential to consult with experienced mentors and collaborators to determine when the time is right.


If you would like to reproduce some or all of this content, see Reuse of NCI Information for guidance about copyright and permissions. In the case of permitted digital reproduction, please credit the National Cancer Institute as the source and link to the original NCI product using the original product's title; e.g., “Career Confessions: 5 Things a Research Fellow Learned in a Cancer Data Scientist Lab was originally published by the National Cancer Institute.”
