FusOn-pLM: A Fusion Oncoprotein-Specific Language Model for Predicting Drug-Resistant Mutations

Fusion oncoproteins are major drivers of several types of pediatric cancer. If you study pediatric cancer, you’ll want to learn about “FusOn-pLM,” a fusion oncoprotein-specific language model that predicts drug-resistant mutations and can inform the design of therapies. NCI-funded researchers built and trained the model, which could help predict which medications are most likely to be useful and support drug development research.

Fusion oncoproteins are different from other proteins because they are highly disordered (i.e., they don’t fold into predictable shapes) and have altered structural and functional properties. This makes them difficult to target with drugs designed using traditional structure-based methods. Researchers believe FusOn-pLM is the first protein language model (pLM) focusing on the unique characteristics of fusion oncoproteins.

What makes FusOn-pLM different from other models?

  • Unlike other protein models (e.g., AlphaFold or ESM-2) trained on “normal” proteins, researchers trained FusOn-pLM on a curated database of fusion proteins.
  • FusOn-pLM uses a new training strategy called “cosine-scheduled masking,” which gradually increases the difficulty of the learning task. It helps the model focus on what makes fusion proteins distinct. 
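
To make the idea of “cosine-scheduled masking” concrete, here is a minimal Python sketch of one way such a schedule could look: the fraction of hidden amino acids rises along a half-cosine curve as training progresses, so the fill-in-the-blank task gets harder over time. This is an illustration only, not the researchers' code. The function names, the 15% and 40% bounds, the example sequence, and the direction and shape of the curve are all assumptions made for this sketch.

```python
import math
import random


def cosine_mask_rate(step, total_steps, min_rate=0.15, max_rate=0.40):
    """Return the masking fraction for a training step.

    The rate rises smoothly from min_rate to max_rate along a half-cosine
    curve. The default bounds are illustrative assumptions.
    """
    # Fraction of training completed, clamped to [0, 1].
    progress = min(max(step / max(total_steps - 1, 1), 0.0), 1.0)
    # 0.5 * (1 - cos(pi * progress)) goes from 0 to 1 as progress goes 0 to 1.
    return min_rate + (max_rate - min_rate) * 0.5 * (1.0 - math.cos(math.pi * progress))


def mask_sequence(sequence, rate, rng, mask_token="<mask>"):
    """Replace a random subset of residues with a mask token (hypothetical helper)."""
    if not sequence:
        return []
    # Always mask at least one residue so the model has something to predict.
    n_mask = max(1, round(len(sequence) * rate))
    positions = set(rng.sample(range(len(sequence)), n_mask))
    return [mask_token if i in positions else aa for i, aa in enumerate(sequence)]


if __name__ == "__main__":
    rng = random.Random(0)
    toy_sequence = "MKTAYIAKQRQISFVKSHFS"  # made-up 20-residue sequence
    total_steps = 5
    for step in range(total_steps):
        rate = cosine_mask_rate(step, total_steps)
        masked = mask_sequence(toy_sequence, rate, rng)
        print(f"step {step}: rate={rate:.2f} masked={masked.count('<mask>')}")
```

In this sketch the model would see only a few hidden residues early in training and progressively more later on, which is one reading of a task that “gradually increases” in difficulty.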

An NCI-funded study revealed that FusOn-pLM:

  • performs better than existing models on tasks specific to fusion proteins (including identifying disordered regions, forecasting which mutations might cause drug resistance, and more).
  • enables the prediction of current and future drug-resistant mutations in fusion oncoproteins.
  • has the potential to inform therapeutic strategies and anticipate resistance mechanisms.
  • could help create new therapeutic models that focus only on fusion proteins in cancer cells and not on healthy proteins.

Read the full article for results of the study and more information on how the researchers built the model.

If you would like to reproduce some or all of this content, see Reuse of NCI Information for guidance about copyright and permissions. In the case of permitted digital reproduction, please credit the National Cancer Institute as the source and link to the original NCI product using the original product's title; e.g., “FusOn-pLM: A Fusion Oncoprotein-Specific Language Model for Predicting Drug-Resistant Mutations was originally published by the National Cancer Institute.”
