NLP
Comparison of AutoML Tools for SMS Spam Message Filtering Including mljar-supervised
- AutoML comparison
- mljar-supervised benchmark
- SMS spam classification
- machine learning text filtering
- ensemble AutoML
- H2O AutoML vs mljar
- TPOT AutoML comparison
- short text classification
- Log Loss AutoML
- AUC spam filtering
MLJAR tools were used in the following publication.
Comparison of Automated Machine Learning Tools for SMS Spam Message Filtering
Waddah Saeed
Center for Artificial Intelligence Research (CAIR), University of Agder, Grimstad, Norway
This study presents a comparative evaluation of three Automated Machine Learning (AutoML) tools (mljar-supervised, H2O AutoML, and TPOT) for SMS spam message filtering. Using a dataset of 5,610 SMS messages and feature subset sizes of 50, 100, and 200, ensemble models consistently achieved the best classification performance. The mljar-supervised Stacked Ensemble model achieved a Log Loss as low as 0.8863 and an AUC as high as 0.9487, demonstrating competitive performance. The work highlights the effectiveness of ensemble-based AutoML pipelines for short-text classification tasks.
arXiv • June 28, 2021
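Below is a minimal sketch of how an experiment in this style could be assembled with mljar-supervised. The file name "sms_spam.csv", the "text" and "label" columns, the TF-IDF feature extraction, the train/test split, and the time limit are illustrative assumptions, not the paper's exact preprocessing or configuration.

```python
# Sketch: SMS spam filtering with mljar-supervised (assumed setup, not the study's exact code).
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import log_loss, roc_auc_score
from supervised.automl import AutoML

# Hypothetical dataset file with "text" and "label" columns (1 = spam, 0 = ham).
df = pd.read_csv("sms_spam.csv")

# Feature subset size mirrors the study's 50/100/200 configurations.
vectorizer = TfidfVectorizer(max_features=100)
X = pd.DataFrame(
    vectorizer.fit_transform(df["text"]).toarray(),
    columns=vectorizer.get_feature_names_out(),
)
y = df["label"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# "Compete" mode enables stacked ensembles, the model family the study found most effective.
automl = AutoML(mode="Compete", eval_metric="logloss", total_time_limit=600)
automl.fit(X_train, y_train)

# Evaluate on the held-out set with the paper's two reported metrics.
proba = automl.predict_proba(X_test)[:, 1]
print("Log Loss:", log_loss(y_test, proba))
print("AUC:", roc_auc_score(y_test, proba))
```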
Research Domains
Explore peer-reviewed and applied machine learning studies across diverse domains, including healthcare analytics, financial modeling, manufacturing optimization, and structured-data classification.
Why Researchers and ML Engineers Choose MLJAR Studio
A private, AI-powered Python notebook designed for reproducible machine learning experiments, structured benchmarking, and applied research workflows - fully under your control.
Reproducible Machine Learning Experiments
Design structured pipelines, save experiment runs, and compare results across iterations with full transparency. Every validation setup, hyperparameter configuration, and model benchmark is recorded - making your research repeatable and defensible.
Local-First Execution & Data Control
Run all workflows directly on your machine. Sensitive datasets remain private, with no mandatory cloud uploads or external AI services required. Maintain full control over runtime environments and compliance requirements.
Autonomous Model Benchmarking & Optimization
Automatically compare candidate models, perform cross-validation, and run hyperparameter optimization while retaining full visibility into generated Python code and evaluation metrics. Accelerate experimentation without sacrificing methodological rigor.
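As a hedged sketch (not MLJAR Studio's exact generated code), this kind of benchmark can be configured explicitly with mljar-supervised; X_train and y_train are assumed to be prepared as in the earlier example.

```python
from supervised.automl import AutoML

# An explicit 5-fold stratified cross-validation keeps the evaluation protocol visible.
automl = AutoML(
    mode="Perform",
    eval_metric="auc",
    validation_strategy={
        "validation_type": "kfold",
        "k_folds": 5,
        "shuffle": True,
        "stratify": True,
    },
    results_path="automl_benchmark",  # models, configs, and reports are saved here
)
automl.fit(X_train, y_train)

# Compare all candidate models on the recorded metric.
print(automl.get_leaderboard())
```

Writing results to a fixed results_path keeps every validation setup and model benchmark on disk, so a run can be re-examined or repeated later.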
Build Research-Grade ML Workflows Locally
Run automated model benchmarking, hyperparameter optimization, and autonomous experiments while keeping full control over your data.