Responsible AI

Responsible AI and AutoML: Studying the Impact of Missing Data on Group Fairness

Keywords: responsible AI, fairness-aware machine learning, group fairness, missing data imputation, AutoML research, MLJAR AutoML, mljar-supervised, machine learning fairness, data preprocessing, data-centric AI, bias in machine learning, AI ethics, fair AI systems, artificial intelligence research

MLJAR tools were used in the following publication.

Exploring the Influence of Missing Data Imputation in Group Fairness Metrics

Arthur Dantas Mangussi, Ricardo Cardoso Pereira, Miriam Seoane Santos, Ana Carolina Lorena, Mykola Pechenizkiy, Pedro Henriques Abreu

This research investigates how missing-data imputation affects group fairness in machine learning systems. The study uses mljar-supervised in its experimental workflow to evaluate how preprocessing choices, classifiers, and missing-data mechanisms influence fairness metrics. By analyzing the interaction between missing values, imputation strategies, and predictive models, the work helps researchers and practitioners understand how technical decisions in data preparation can affect different groups of people. This case study illustrates how MLJAR AutoML can support responsible AI research by helping build and evaluate machine learning pipelines that are not only accurate but also more transparent, reproducible, and socially responsible.
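To make the idea concrete, the following is a minimal sketch of how an imputation choice can shift a group fairness metric. It is not the method used in the publication: the toy data, the threshold "classifier", the helper names (`impute`, `positive_rate`, `statistical_parity_difference`), and the choice of statistical parity difference as the metric are all assumptions made for illustration, and the code uses only the Python standard library rather than mljar-supervised.

```python
from statistics import mean


def impute(values, fill):
    """Replace missing entries (None) with a constant fill value."""
    return [fill if v is None else v for v in values]


def positive_rate(y_pred, groups, group):
    """Share of positive predictions among members of one group."""
    members = [p for p, g in zip(y_pred, groups) if g == group]
    return sum(members) / len(members)


def statistical_parity_difference(y_pred, groups, privileged, unprivileged):
    """Positive rate of the unprivileged group minus that of the privileged group.

    A value of 0 means both groups receive positive predictions at the same rate.
    """
    return (positive_rate(y_pred, groups, unprivileged)
            - positive_rate(y_pred, groups, privileged))


def predict(scores, threshold=0.5):
    """Hypothetical stand-in for a trained classifier: threshold a single feature."""
    return [1 if s >= threshold else 0 for s in scores]


# Toy feature with values missing only in group "b" (illustrative data, not from the study).
scores = [0.9, 0.8, 0.6, 0.4, 0.7, None, None, 0.2]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

observed_mean = mean(v for v in scores if v is not None)

# Strategy 1: fill missing values with the mean of the observed values.
spd_mean = statistical_parity_difference(
    predict(impute(scores, observed_mean)), groups, "a", "b")

# Strategy 2: fill missing values with zero.
spd_zero = statistical_parity_difference(
    predict(impute(scores, 0.0)), groups, "a", "b")

print("SPD with mean imputation:", spd_mean)
print("SPD with zero imputation:", spd_zero)
```

On this toy data, mean imputation leaves the two groups with equal positive rates, while zero imputation lowers the positive rate of group "b" only. The same predictor and the same observed values therefore yield different fairness scores depending solely on how the gaps were filled.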

Artificial Intelligence • May 1, 2026

DOI: https://doi.org/10.1016/j.artint.2026.104559

Research Domains

Explore peer-reviewed and applied machine learning studies across diverse domains, including healthcare analytics, financial modeling, manufacturing optimization, and structured data classification problems.

Why Researchers and ML Engineers Choose MLJAR Studio

A private, AI-powered Python notebook designed for reproducible machine learning experiments, structured benchmarking, and applied research workflows, all fully under your control.

Reproducible Machine Learning Experiments

Design structured pipelines, save experiment runs, and compare results across iterations with full transparency. Every validation setup, hyperparameter configuration, and model benchmark is recorded, making your research repeatable and defensible.

Local-First Execution & Data Control

Run all workflows directly on your machine. Sensitive datasets remain private, with no mandatory cloud uploads or external AI services required. Maintain full control over runtime environments and compliance requirements.

Autonomous Model Benchmarking & Optimization

Automatically compare candidate models, perform cross-validation, and run hyperparameter optimization while retaining full visibility into generated Python code and evaluation metrics. Accelerate experimentation without sacrificing methodological rigor.

Build Research-Grade ML Workflows Locally

Run automated model benchmarking, hyperparameter optimization, and autonomous experiments while keeping full control over your data.
