Healthcare
Predicting Recurrent SARS-CoV-2 Mutations Using Machine Learning
- SARS-CoV-2 mutation prediction
- machine learning virology
- AI COVID-19 genomics
- recurrent mutation prediction
- neural networks genomics
- MLJAR AutoML research
- SHAP interpretability biology
- viral evolution modeling
- variant of concern prediction
- genome mutation forecasting
MLJAR tools were used in the following publication.
Prediction of Recurrent Mutations in SARS-CoV-2 Using Artificial Neural Networks
Bryan Saldivar-Espinoza, Guillem Macip, Pol Garcia-Segura, Júlia Mestres-Truyol, Pere Puigbò, Adrià Cereto-Massagué, Gerard Pujadas, Santiago Garcia-Vallve
Universitat Rovira i Virgili, Spain | University of Turku, Finland | EURECAT Technology Centre of Catalonia, Spain
This study, published in the International Journal of Molecular Sciences, presents a machine learning framework for predicting recurrent mutations in the SARS-CoV-2 genome. Using large-scale genomic data from GISAID (over 877,000 genomes for training and 4.6 million for evaluation), neural network models were developed to predict both recurrent mutation positions and the specific nucleotide changes. The models achieved ROC-AUC values of up to 0.84 for mutation prediction and up to 0.81 for position prediction, with particularly strong performance for the M-pro protein (ROC-AUC 0.879). The study demonstrates that machine learning can identify biologically meaningful mutation patterns and anticipate future recurrent mutations.
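The paper's exact pipeline is not reproduced on this page, but the task it describes can be framed as binary classification: encode the sequence context around each genome position as features, train a small feed-forward neural network, and score it with ROC-AUC. The sketch below illustrates that framing with scikit-learn on synthetic data; the window size, feature encoding, and labels are all illustrative assumptions, not the study's actual setup.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)

# Hypothetical feature set: a one-hot encoded nucleotide context around each
# genome position (e.g. a 9-nt window -> 9 * 4 = 36 binary features).
n_positions, window = 2000, 9
X = rng.integers(0, 2, size=(n_positions, window * 4)).astype(float)
# Synthetic label: 1 if the position showed a recurrent mutation, else 0.
y = (X[:, :4].sum(axis=1) + rng.normal(0, 0.5, n_positions) > 2).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Small feed-forward network standing in for the paper's artificial neural net.
clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0)
clf.fit(X_tr, y_tr)

# Evaluate with ROC-AUC, the metric reported in the study.
auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
print(f"ROC-AUC: {auc:.3f}")
```

On real GISAID-derived data the features and labels would of course come from aligned genome sequences rather than random draws.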
International Journal of Molecular Sciences • November 24, 2022
Research Domains
Explore peer-reviewed and applied machine learning studies across diverse domains, including healthcare analytics, financial modeling, manufacturing optimization, and structured data classification problems.
Why Researchers and ML Engineers Choose MLJAR Studio
A private, AI-powered Python notebook designed for reproducible machine learning experiments, structured benchmarking, and applied research workflows - fully under your control.
Reproducible Machine Learning Experiments
Design structured pipelines, save experiment runs, and compare results across iterations with full transparency. Every validation setup, hyperparameter configuration, and model benchmark is recorded - making your research repeatable and defensible.
Local-First Execution & Data Control
Run all workflows directly on your machine. Sensitive datasets remain private, with no mandatory cloud uploads or external AI services required. Maintain full control over runtime environments and compliance requirements.
Autonomous Model Benchmarking & Optimization
Automatically compare candidate models, perform cross-validation, and run hyperparameter optimization while retaining full visibility into generated Python code and evaluation metrics. Accelerate experimentation without sacrificing methodological rigor.
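MLJAR Studio's generated code is not shown here, but the benchmarking loop it automates can be sketched with plain scikit-learn: define candidate models, run cross-validated hyperparameter search for each, and rank them by score. The model choices and grids below are illustrative assumptions, not the tool's actual defaults.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic tabular dataset standing in for a research dataset.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Candidate models with small, illustrative hyperparameter grids.
candidates = {
    "logistic_regression": (LogisticRegression(max_iter=1000),
                            {"C": [0.1, 1.0, 10.0]}),
    "random_forest": (RandomForestClassifier(random_state=0),
                      {"n_estimators": [50, 100]}),
}

results = {}
for name, (model, grid) in candidates.items():
    # 5-fold cross-validated grid search, scored by ROC-AUC.
    search = GridSearchCV(model, grid, cv=5, scoring="roc_auc")
    search.fit(X, y)
    results[name] = (search.best_score_, search.best_params_)

# Rank candidates by cross-validated score, best first.
for name, (score, params) in sorted(results.items(), key=lambda kv: -kv[1][0]):
    print(f"{name}: CV ROC-AUC = {score:.3f}, best params = {params}")
```

The value of automating this loop is that every fold split, grid point, and score is recorded, so the leaderboard can be audited and rerun rather than trusted blindly.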
Build Research-Grade ML Workflows Locally
Run automated model benchmarking, hyperparameter optimization, and autonomous experiments while keeping full control over your data.