Open-source AutoML projects in 2026
If you haven't looked at AutoML tools in a while, the landscape has changed more than you might expect.
A few years ago, the main question was simple:
Which AutoML library gives the best score?
In 2026, the question is different:
How much of the machine learning workflow do you want to automate, and how much control, transparency, and privacy do you need to keep?
AutoML has matured, but it has also fragmented. Some tools are focused on tabular data and production-ready model training. Some tools are broader platforms that support tabular, text, image, time series, or multimodal data. Some older projects are still useful, but are no longer actively developed. And a new category is growing very quickly: AutoML agents that use LLMs to write code, run experiments, debug errors, and improve models step by step.
This article is a practical overview of open-source AutoML in 2026. I focus on projects that are useful for Python users, data scientists, analysts, and teams working with real-world machine learning problems.
The state of AutoML in 2026
Open-source AutoML in 2026 can be split into four main groups.
The first group is tabular-first AutoML. These tools focus on classical supervised machine learning: binary classification, multiclass classification, and regression. They train a portfolio of models, compare them, tune them, and often build ensembles. This group includes MLJAR AutoML, AutoGluon, H2O AutoML, FLAML, and auto-sklearn.
The second group is pipeline-search AutoML. Instead of only tuning a fixed list of algorithms, these systems search for the structure of the machine learning pipeline itself. The most important project here is TPOT, which uses genetic programming and has been refreshed with a newer graph-based architecture.
The third group is deep learning and multimodal AutoML. These tools are less focused on classic tabular problems and more focused on neural networks, text, images, multimodal inputs, or model configuration. Ludwig, AutoKeras, and parts of AutoGluon belong here.
The fourth group is new and very important: agent-based AutoML. These systems use LLMs as autonomous machine learning engineers. They can inspect data, write Python code, run experiments, debug failures, compare metrics, and improve solutions over multiple iterations. This group includes AIDE ML, AutoGluon Assistant / MLZero, R&D-Agent, MLE-STAR, and MLJAR Studio AutoLab.
The biggest change is that AutoML is no longer only about choosing the best model. It is becoming about automating the whole experiment loop.
Which projects are actively developed?
Project activity matters. A machine learning library can be excellent, but if it does not support recent Python, pandas, scikit-learn, PyTorch, or NumPy versions, it becomes harder to use in real projects.
As of 2026, the most active open-source AutoML projects include:
- MLJAR AutoML — actively developed product for explainable tabular AutoML, powered by the open-source mljar-supervised package.
- AutoGluon — very active, broad AutoML stack with tabular, time series, multimodal, and foundation-model support.
- FLAML — active, lightweight, and focused on resource-aware AutoML and tuning.
- Ludwig — active, low-code deep learning and LLM-oriented framework.
- TPOT — active again after a major rewrite and renewed architecture.
- AutoGluon Assistant / MLZero — active agent-based AutoML research project.
- AIDE ML — active open-source ML engineering agent.
- MLE-STAR — research system showing strong results for agentic machine learning engineering.
Some projects are still known and useful, but their activity level is different:
- H2O AutoML is mature and enterprise-oriented. It is still relevant, especially for distributed tabular workloads.
- AutoKeras is still relevant for neural AutoML, but recent releases narrowed its scope.
- auto-sklearn remains historically important, but its latest official release is from 2023, so it should be treated as a legacy or research-baseline option rather than a fresh production default.
This does not mean that older tools are bad. It means that in 2026, maintenance status should be part of your selection process.
Why tabular data still matters
A lot of AI discussion today is about LLMs, agents, images, and multimodal systems. But most practical machine learning work still happens on tables.
Customer churn, fraud detection, credit scoring, insurance pricing, clinical risk prediction, sales forecasting, lead scoring, conversion prediction, demand planning, and many internal business models are still built on structured tabular data.
This is why tabular AutoML is still important.
In many companies, the real challenge is not training a neural network on images. The real challenge is taking a CSV file, understanding messy columns, handling missing values, training reliable models, explaining results to stakeholders, and saving the whole process in a way that can be reviewed later.
This is where focused tabular AutoML tools are very strong.
A good tabular AutoML tool should not only return a model. It should also answer questions like:
- What preprocessing was applied?
- Which models were trained?
- Which model won and why?
- Which features were important?
- How stable was the validation score?
- Can I reproduce this result later?
- Can I explain this model to a business user?
- Can I keep all data local?
For many real-world projects, those questions are more important than a tiny improvement in AUC.

MLJAR AutoML
MLJAR AutoML is our AutoML product for supervised tabular machine learning. It is powered by the open-source Python package mljar-supervised.
This naming is important:
- MLJAR AutoML is the product name.
- mljar-supervised is the open-source Python package name.
MLJAR AutoML supports binary classification, multiclass classification, and regression. It is built for tabular data and focuses on one idea that many AutoML tools ignore: users should be able to understand, audit, and reproduce every step of the machine learning process.
The package has four main modes:
- Explain — for exploratory analysis and model understanding.
- Perform — for production-oriented training.
- Compete — for stronger performance and competition-style workflows.
- Optuna — for longer hyperparameter tuning with Optuna.
This mode-based design is practical. Not every project needs the same strategy. Sometimes you want quick explanations. Sometimes you want a stronger ensemble. Sometimes you want a longer tuning process. Having clear modes makes it easier for users to choose the right workflow.
MLJAR AutoML includes automated preprocessing, algorithm selection, hyperparameter search, feature engineering, Golden Features, feature selection, hill climbing, stacking, and ensembling. Golden Features are especially interesting because they automatically create useful interactions from pairs of original columns, such as differences or ratios.
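To make this concrete, here is a minimal sketch using mljar-supervised. The dataset and column names are placeholders, but the parameters shown follow the package's documented API:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from supervised.automl import AutoML  # pip install mljar-supervised

df = pd.read_csv("churn.csv")  # placeholder dataset
X_train, X_test, y_train, y_test = train_test_split(
    df.drop(columns=["churn"]), df["churn"], test_size=0.25, random_state=42
)

# "Compete" runs the stronger search strategy; golden_features enables
# the automatic pairwise feature interactions described above.
automl = AutoML(
    mode="Compete",
    golden_features=True,
    total_time_limit=600,         # seconds
    results_path="automl_churn",  # local folder for reports and artifacts
)
automl.fit(X_train, y_train)
predictions = automl.predict(X_test)
```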
The most distinctive part is reporting. After training, MLJAR AutoML writes results to a local folder. You get a leaderboard, model reports, metrics, learning curves, confusion matrices, feature importance, and explanations. These artifacts can be opened later, shared with a teammate, or committed to version control.
This is very important in real work. Many AutoML systems behave like black boxes. They return a predictor object, but it is hard to understand what happened inside. MLJAR AutoML takes a different approach: the results are visible, local, and readable.
You can use MLJAR AutoML directly from Python with the mljar-supervised package. You can also use it inside MLJAR Studio, where AutoML workflows can be created in notebooks with help from AI.
Another strong feature is fairness-aware training. MLJAR AutoML can measure and mitigate fairness issues for provided sensitive features. This is useful for regulated or high-risk domains, especially when models are used in finance, insurance, healthcare, HR, or public services.
The main limitation is also its strength: MLJAR AutoML is focused on tabular supervised learning. It is not a multimodal framework. It is not a native forecasting framework. It is not an LLM fine-tuning system.
But if your problem is tabular classification or regression, and you care about local execution, explanations, reproducibility, and readable reports, MLJAR AutoML is one of the easiest open-source AutoML solutions to recommend.
AutoGluon
AutoGluon is one of the broadest open-source AutoML projects.
It started with a very practical promise: strong machine learning with very little code. In 2026, it covers much more than tabular data. It includes tabular prediction, time series forecasting, multimodal learning, text, image, document data, and foundation-model-based workflows.
For tabular data, AutoGluon is known for strong predictive performance. It uses bagging, stacking, weighted ensembling, presets, and carefully tuned model portfolios. It often performs very well in benchmarks because it combines many strong learners and ensemble strategies.
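A minimal sketch of the tabular workflow — the file names and label column are placeholders:

```python
from autogluon.tabular import TabularDataset, TabularPredictor

train = TabularDataset("train.csv")  # placeholder files; pandas DataFrames also work
test = TabularDataset("test.csv")

# "best_quality" activates heavy bagging and stacking; lighter presets
# exist for faster runs.
predictor = TabularPredictor(label="target").fit(train, presets="best_quality")

leaderboard = predictor.leaderboard(test)
predictions = predictor.predict(test.drop(columns=["target"]))
```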
AutoGluon 1.5 added new tabular foundation models and improved model portfolios. This is important because foundation models for tabular data are becoming one of the most interesting trends in AutoML. AutoGluon is moving quickly in that direction.
AutoGluon is also strong outside tabular data. AutoGluon TimeSeries supports probabilistic forecasting. AutoGluon MultiModal supports tasks such as classification, regression, object detection, semantic matching, named entity recognition, and image segmentation.
This breadth makes AutoGluon very attractive if you want one open-source stack for many ML tasks.
The trade-off is complexity. AutoGluon is powerful, but it is not as focused or artifact-centric as MLJAR AutoML for pure tabular business modeling. If your main need is a local folder with easy-to-read reports for every model, MLJAR AutoML may feel more transparent. If your main need is maximum coverage across tabular, forecasting, text, image, and multimodal tasks, AutoGluon is stronger.
AutoGluon is probably the best open-source choice when you want one flexible AutoML ecosystem that can grow with many types of data.
H2O AutoML
H2O AutoML is the enterprise-style system in this comparison.
H2O-3 is an open-source, distributed, in-memory machine learning platform. It has APIs for Python, R, Java, Scala, REST, and the Flow UI. H2O AutoML automates model training and tuning within a user-specified time budget or model-count budget.
It trains a portfolio of algorithms such as GBM, XGBoost, GLM, DRF, Extremely Randomized Trees, Deep Learning models, and Stacked Ensembles. H2O also includes explainability tools and deployment options such as MOJO export.
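A minimal sketch of the Python workflow — the file and target column are placeholders:

```python
import h2o
from h2o.automl import H2OAutoML

h2o.init()  # starts or connects to a local H2O cluster

train = h2o.import_file("train.csv")          # placeholder file
train["target"] = train["target"].asfactor()  # mark a classification target
x = [c for c in train.columns if c != "target"]

# Train as many models as fit in the time budget, then build stacked ensembles.
aml = H2OAutoML(max_runtime_secs=600, seed=1)
aml.train(x=x, y="target", training_frame=train)

print(aml.leaderboard.head())
```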
The biggest strength of H2O AutoML is operational maturity. It is well suited for organizations that already think in terms of clusters, distributed data, controlled deployment, and enterprise infrastructure.
The main limitation is that it can feel heavier than Python-native libraries. If you want a simple local Python package that creates model reports in a folder, H2O may be more infrastructure than you need. But if you need scalable tabular AutoML with enterprise deployment patterns, H2O remains a serious option.
H2O AutoML is best for teams that need mature, distributed, operational tabular AutoML.
TPOT
TPOT is different from most AutoML tools.
Many AutoML libraries search over a fixed list of models and hyperparameters. TPOT searches over pipeline structures. It uses genetic programming to evolve machine learning pipelines.
This is useful when you do not only want to tune a model, but also discover an interesting pipeline. TPOT can combine preprocessing steps, feature selection, models, and other transformations into candidate pipelines.
In recent years, TPOT has gone through an important rewrite. The newer version moved toward graph-based pipelines, more flexible search spaces, feature selection, and multi-objective optimization. This makes TPOT interesting again, especially for research and experimentation.
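A minimal sketch using the long-standing TPOTClassifier interface; the rewritten, graph-based version keeps a similar estimator-style entry point, but check the current documentation for exact parameter names:

```python
from tpot import TPOTClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Evolve candidate pipelines; small budgets keep this example short.
tpot = TPOTClassifier(generations=5, population_size=20, random_state=42)
tpot.fit(X_train, y_train)

print(tpot.score(X_test, y_test))
tpot.export("best_pipeline.py")  # classic TPOT exports the winning pipeline as code
```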
The main trade-off is that evolutionary pipeline search can be computationally expensive and less predictable than portfolio-based AutoML. It is not always the easiest tool when you simply want a reliable production model quickly.
TPOT is best when the pipeline structure itself is part of the search problem.
FLAML
FLAML is a lightweight AutoML and tuning library from Microsoft.
Its main idea is efficiency. FLAML tries to find good models with low computational cost. This makes it different from heavier AutoML systems that rely on large ensembles or long training budgets.
FLAML supports classification, regression, ranking, forecasting, and several other task-oriented workflows. It also has strong support for custom estimators and custom search spaces.
This makes FLAML useful when compute budget matters. If you need fast turnaround, limited resource usage, or direct control over the search process, FLAML is a strong candidate.
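A minimal sketch of a budget-bounded run:

```python
from flaml import AutoML
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split

X, y = fetch_california_housing(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

automl = AutoML()
# time_budget is in seconds; FLAML searches learners and hyperparameters
# within that budget.
automl.fit(X_train, y_train, task="regression", time_budget=60)

print(automl.best_estimator)  # e.g. "lgbm"
predictions = automl.predict(X_test)
```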
The trade-off is that FLAML is more code-first and less report-first. It is excellent as a search and tuning engine, but it does not try to produce the same kind of rich per-model documentation that MLJAR AutoML produces.
FLAML is best when you want efficient AutoML with strong control over compute cost.
Ludwig
Ludwig is not a classical AutoML library. It is better described as a declarative machine learning framework.
With Ludwig, users define models using configuration files. It supports many data types and deep learning workflows, including text, image, audio, tabular, multimodal data, and LLM-related tasks. It also supports distributed training and model export.
Ludwig has an AutoML component, but its AutoML is not the center of the project. The real strength of Ludwig is low-code deep learning, configuration-driven model building, and LLM or multimodal workflows.
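A minimal sketch of the declarative style through Ludwig's Python API — the column names and dataset are placeholders:

```python
from ludwig.api import LudwigModel

# The config declares what the inputs and outputs are, not how to train them.
config = {
    "input_features": [
        {"name": "review_text", "type": "text"},      # placeholder columns
        {"name": "product_price", "type": "number"},
    ],
    "output_features": [{"name": "sentiment", "type": "category"}],
}

model = LudwigModel(config)
train_stats, _, _ = model.train(dataset="reviews.csv")  # placeholder dataset
predictions, _ = model.predict(dataset="reviews.csv")
```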
Choose Ludwig when you want a declarative framework for deep learning systems, not when you only want the simplest tabular AutoML solution.
AutoKeras
AutoKeras is one of the best-known neural AutoML projects.
It is based on Keras and has a history connected to neural architecture search. It can be useful for quick experiments with image and text models, especially for users already working in the Keras ecosystem.
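For a sense of the interface, here is a sketch in the classic AutoKeras style; the 3.0 release reworked parts of the API, so treat this as the pre-3.0 pattern:

```python
import autokeras as ak
from tensorflow.keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()

# max_trials bounds how many architectures the search will try.
clf = ak.ImageClassifier(max_trials=3, overwrite=True)
clf.fit(x_train, y_train, epochs=5)

print(clf.evaluate(x_test, y_test))
```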
However, AutoKeras changed direction. Recent releases removed the structured-data and time-series public APIs, and AutoKeras 3.0 repositioned the project toward quick experiments on small datasets with simple deep learning models.
This means AutoKeras should not be treated as a broad AutoML solution for tabular business problems in 2026. It is still useful, but its current role is narrower.
AutoKeras is best for lightweight neural AutoML experiments and education.
auto-sklearn
auto-sklearn is historically one of the most important open-source AutoML libraries.
It represents the classic academic AutoML approach: algorithm selection, hyperparameter optimization, meta-learning, and ensemble selection for scikit-learn pipelines.
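A minimal sketch, assuming an environment where auto-sklearn still installs cleanly (it pins older dependency versions):

```python
import autosklearn.classification
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# Bayesian optimization plus meta-learning over scikit-learn pipelines,
# bounded by a total time budget in seconds.
automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=300
)
automl.fit(X_train, y_train)

print(automl.leaderboard())
print(automl.score(X_test, y_test))
```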
The problem is maintenance. The latest official release on GitHub is 0.15.0 from February 2023. That does not make auto-sklearn useless, but it changes how it should be evaluated.
In 2026, auto-sklearn is still valuable as a research baseline or for teams that already have working environments built around it. But for new production projects, I would be careful. Compatibility with modern Python and machine learning package versions matters a lot.
auto-sklearn is best understood as a classic and respected AutoML project, but not the freshest default choice.
Agent-based AutoML systems
The most interesting new category in 2026 is agent-based AutoML.
Traditional AutoML automates model training. Agent-based AutoML tries to automate the whole machine learning workflow.
An AutoML agent can:
- read the task description,
- inspect files,
- understand columns,
- write Python code,
- run experiments,
- debug errors,
- compare metrics,
- create new features,
- try different models,
- save outputs,
- and improve the solution in multiple iterations.
This is much closer to how a human data scientist works.
The reason this category became serious is that LLMs are now good enough to write and modify code, and benchmarks such as MLE-bench made machine learning engineering agents measurable. MLE-bench evaluates agents on Kaggle-style machine learning tasks that require model training, dataset preparation, and experiment execution.
Agent-based AutoML is not only about calling an LLM and asking for a model. The best systems combine code execution, memory, search, debugging, evaluation, and iterative refinement.
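Conceptually, most of these systems run a loop like the sketch below. Everything here is illustrative: the helper functions are stubs, not the API of any project mentioned in this article.

```python
import random
from dataclasses import dataclass

@dataclass
class RunResult:
    failed: bool
    score: float
    traceback: str = ""

def llm_write_code(prompt: str) -> str:
    return f"# code drafted for: {prompt}"  # stub for a real LLM call

def run_in_sandbox(code: str) -> RunResult:
    return RunResult(failed=False, score=random.random())  # stub executor

def agent_loop(task: str, n_iterations: int = 10):
    best_score, best_code = float("-inf"), None
    memory = []  # past attempts and scores the LLM can condition on
    code = llm_write_code(task)
    for _ in range(n_iterations):
        result = run_in_sandbox(code)
        if result.failed:
            code = llm_write_code(f"fix this error: {result.traceback}")
            continue
        if result.score > best_score:  # keep only the best branch
            best_score, best_code = result.score, code
        memory.append((code, result.score))
        code = llm_write_code(f"improve on {best_score:.3f}: {best_code}")
    return best_code, best_score

print(agent_loop("predict churn from churn.csv"))
```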
AIDE ML
AIDE ML is one of the clearest examples of an open-source machine learning engineering agent.
It treats machine learning engineering as a search problem in the space of code. The agent drafts code, runs it, debugs it, benchmarks it, and then explores improved versions. Promising solutions are kept. Bad branches are discarded.
This is different from traditional AutoML. AIDE is not only selecting a model from a predefined list. It can change the whole script.
This is powerful, but it also introduces risk. Generated code must be inspected. Validation must be designed carefully. Data leakage must be checked. Compute usage can grow quickly.
AIDE is best for research, Kaggle-style workflows, and advanced users who want to experiment with autonomous ML engineering.
AutoGluon Assistant and MLZero
AutoGluon Assistant, also known as MLZero, is a multi-agent system for end-to-end machine learning automation.
It is especially interesting because it connects agentic workflows with the AutoGluon ecosystem. The goal is to transform raw multimodal data into trained ML solutions with little or no human intervention.
MLZero uses specialized agents and memory modules to handle perception, tool knowledge, code generation, execution history, and iterative debugging.
This direction is important because it shows where AutoML is going. The future may not be a single `fit()` call. It may be a team of agents that understand the dataset, choose tools, write code, execute experiments, and explain what happened.
AutoGluon Assistant is still more research-oriented than classical AutoGluon, but it is one of the strongest signals that agentic AutoML is becoming a real category.
MLE-STAR
MLE-STAR is a machine learning engineering agent from Google Research.
It combines search, code refinement, ablation studies, debugging, and ensembling. One of its key ideas is to use web search to retrieve useful model ideas and then refine specific parts of the pipeline.
This is close to how a strong human practitioner works. A human data scientist does not start from nothing. They look for similar problems, check known methods, test feature engineering ideas, run ablations, and improve the pipeline.
MLE-STAR shows that agentic systems can do some of this automatically.
It is not a simple AutoML library for everyday business users, but it is an important research signal. It shows that AutoML is moving from model selection toward autonomous machine learning engineering.
R&D-Agent
R&D-Agent from Microsoft focuses on broader data-driven research and development workflows.
It can be used for Kaggle-style machine learning engineering, data mining, quantitative research, and iterative experiment workflows. The project is important because it does not treat AutoML as a single modeling step. It treats machine learning work as an iterative research process.
This is a useful perspective. In real projects, the first model is rarely the final model. Data scientists test ideas, compare metrics, create features, remove bad features, change validation, and try again.
Agentic systems try to automate more of this loop.
MLJAR Studio AutoLab
A practical direction for agent-based AutoML is notebook-first automation.
This is where MLJAR Studio AutoLab is interesting. Instead of hiding everything inside a black-box agent, AutoLab runs autonomous machine learning experiments while keeping the process inspectable through notebooks and artifacts.
This is important because trust matters.
If an agent improves a model, users should be able to inspect what it changed. Did it create a new feature? Did it change the validation split? Did it tune hyperparameters? Did it introduce leakage? Did it compare models fairly?
Notebook-first AutoML agents can give users the best of both worlds:
- automation from agents,
- reproducibility from code,
- transparency from notebooks,
- and local execution for privacy.
This is probably one of the most practical directions for AutoML agents in business use.
You can read more about this approach in the AutoLab Experiments documentation.
Comparative table
The table below summarizes the main open-source AutoML tools in 2026.
| Project | Main approach | Best for | Active in 2026? | Main limitation |
|---|---|---|---|---|
| MLJAR AutoML | Tabular AutoML with reports, explanations, feature engineering, tuning, stacking, and ensembling. Powered by mljar-supervised. | Inspectable local tabular ML | Yes | Not multimodal or native forecasting |
| AutoGluon | Broad AutoML stack with tabular, time series, multimodal, and foundation-model support | Broadest open-source coverage | Yes | More complex and less report-centric |
| H2O AutoML | Distributed enterprise AutoML with algorithm portfolios and stacked ensembles | Scalable tabular ML in enterprise environments | Mature / maintained | Heavier infrastructure |
| TPOT | Genetic programming and pipeline search | Discovering pipeline structures | Yes | Can be computationally expensive |
| FLAML | Resource-aware AutoML and tuning | Fast, budget-aware model search | Yes | Less artifact/report focused |
| Ludwig | Declarative deep learning and multimodal configuration | Low-code deep learning and LLM workflows | Yes | AutoML itself is not the main focus |
| AutoKeras | Neural AutoML with Keras | Small neural experiments and education | Partly | Narrower scope after recent API removals |
| auto-sklearn | Bayesian optimization, meta-learning, ensemble selection | Research baselines and legacy scikit-learn AutoML | No recent major release | Latest official release from 2023 |
| AIDE ML | Agent that writes, runs, debugs, and improves ML code | Autonomous ML engineering research | Yes | Needs careful review and compute control |
| AutoGluon Assistant / MLZero | Multi-agent end-to-end AutoML | Agentic multimodal ML automation | Yes | Still research-oriented |
| MLE-STAR | Search and targeted code refinement agent | Advanced ML engineering automation | Research | Not a simple user-facing AutoML library |
| MLJAR Studio AutoLab | Notebook-first autonomous ML experiments | Practical local agent-based AutoML | Yes | Product workflow, not a standalone OSS library |
How to choose the right AutoML tool
If your problem is tabular classification or regression, start with tabular-first tools.
For privacy-sensitive, explainable, local business modeling, MLJAR AutoML is the easiest recommendation. It gives you models, reports, explanations, and artifacts that can be reviewed later.
For maximum benchmark performance and broad task coverage, AutoGluon is a very strong choice. It is especially attractive if you may later need time series, multimodal data, or tabular foundation models.
For enterprise-scale tabular workloads, H2O AutoML is still one of the most mature options. It is a good fit when you already work with clusters and controlled deployment environments.
For fast and resource-aware search, FLAML is excellent. It is a good choice when compute budget and training time matter.
For pipeline discovery, TPOT is the interesting option. It is useful when you want AutoML to search the structure of the pipeline, not only tune model parameters.
For deep learning, multimodal workflows, or LLM-related model building, Ludwig is more relevant than classical tabular AutoML tools.
For small neural experiments in the Keras ecosystem, AutoKeras is still useful, but it should not be treated as a general-purpose tabular AutoML system.
For legacy scikit-learn AutoML baselines, auto-sklearn is still important historically, but I would not choose it as the default for a new production project in 2026.
For autonomous experimentation, look at AIDE ML, AutoGluon Assistant / MLZero, MLE-STAR, and MLJAR Studio AutoLab.
The rise of tabular foundation models
Another important trend is the arrival of foundation models for tabular data.
For years, tabular machine learning was dominated by gradient boosting: XGBoost, LightGBM, CatBoost, random forests, and ensembles. These methods are still very strong. They are not going away.
But tabular foundation models are changing the conversation, especially for small and medium datasets.
Models such as TabPFN show that pretraining can be useful for structured data. Instead of training every model completely from scratch, a foundation model can use prior knowledge learned from many synthetic or real tabular tasks.
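The published tabpfn package follows the familiar scikit-learn interface; a rough sketch (mind its documented limits on dataset size):

```python
from tabpfn import TabPFNClassifier  # pip install tabpfn
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# No per-dataset hyperparameter search: the pretrained transformer performs
# in-context learning on the training rows at prediction time.
clf = TabPFNClassifier()
clf.fit(X_train, y_train)
print(accuracy_score(y_test, clf.predict(X_test)))
```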
This is especially interesting when the dataset is small. In many business problems, we do not have millions of rows. We may have a few thousand rows, hundreds of columns, missing values, and a need for fast results.
AutoGluon is already moving in this direction by adding tabular foundation models into its model portfolio. This is likely to continue.
I do not think tabular foundation models will replace gradient boosting everywhere. Large tabular datasets, strict latency requirements, custom feature engineering, and interpretability needs still make gradient boosting very strong.
But foundation models will become another tool in the AutoML toolbox.
Explainability is no longer optional
Explainability used to be treated as a nice extra feature. In 2026, it is becoming a requirement.
This is especially true in finance, healthcare, insurance, HR, and public-sector use cases. It is not enough to say that a model has good accuracy. Teams need to understand why the model makes predictions, which features matter, how stable the model is, and whether the model behaves fairly across groups.
This is one reason why local reports and model artifacts matter.
A good AutoML system should help with:
- feature importance,
- SHAP explanations,
- learning curves,
- validation metrics,
- fairness metrics,
- reproducibility,
- and audit trails.
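Producing some of these artifacts takes only a few lines of Python, even outside a full AutoML system; here is a sketch with the shap package on a generic tree model:

```python
import shap
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import RandomForestRegressor

X, y = fetch_california_housing(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# TreeExplainer computes exact SHAP values for tree ensembles.
explainer = shap.TreeExplainer(model)
sample = X.iloc[:200]                   # subsample to keep it fast
shap_values = explainer.shap_values(sample)
shap.summary_plot(shap_values, sample)  # global feature-impact view
```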
This is where MLJAR AutoML has a strong identity. It was built around reports and explainability from the beginning.
Agent-based systems will also need this. If an agent changes a feature engineering step or validation strategy, it must explain what it did. Otherwise, users will not trust the result.
The future of AutoML is not only automation. It is transparent automation.
Local and private AutoML matters more
Privacy is another major trend.
Many companies cannot send raw data to external cloud systems. Even when it is technically possible, it may be risky from a legal, security, or business perspective.
This is especially important in Europe, where data privacy expectations are high. For many organizations, local execution is not just a preference. It is a requirement.
Open-source AutoML has an advantage here. Tools such as mljar-supervised, FLAML, TPOT, auto-sklearn, AutoKeras, and Ludwig can run where Python runs. H2O can run in a user-controlled cluster. AutoGluon can run locally too, even though cloud services exist around it.
The same question will become important for AutoML agents.
If an agent reads data, writes code, and runs experiments, where does that happen? Is the data sent to an external API? Are notebooks saved locally? Can the user inspect every step? Can the system run with local models?
For practical business adoption, local-first and privacy-aware AutoML agents will have a strong advantage.
This is one reason why MLJAR Studio is designed as a private AI data lab that runs on your own computer.
AutoML libraries vs AutoML agents
The difference between AutoML libraries and AutoML agents is important.
An AutoML library usually starts from structured input:
```python
automl.fit(X_train, y_train)
```
It receives data and a target column, then searches for a good model.
An AutoML agent starts from a broader task:
Build the best model for this dataset and explain the result.
The agent may need to inspect files, infer the target, clean data, generate code, run experiments, fix errors, create features, and save a final report.
This is a bigger problem.
AutoML libraries are more deterministic, easier to test, and easier to trust. AutoML agents are more flexible, but also more risky.
In 2026, I would not say that agents replace AutoML libraries. The best architecture is probably both:
- a reliable AutoML engine underneath,
- an AI agent above it to plan, test, explain, and iterate.
This combination gives users automation without losing structure.

What I would recommend in 2026
For most real-world tabular machine learning projects, I would still start with a focused tabular AutoML library.
If the work needs to be local, explainable, reproducible, and easy to hand off to another person, I would choose MLJAR AutoML. It gives a complete local workflow with reports and explanations.
If the work needs maximum task coverage or may include time series, image, text, document, or multimodal data, I would choose AutoGluon.
If the organization already uses distributed infrastructure and wants enterprise-style tabular AutoML, I would consider H2O AutoML.
If compute budget is limited, I would use FLAML.
If the goal is pipeline search or research, I would use TPOT.
If the goal is low-code deep learning or LLM workflows, I would use Ludwig.
If the goal is autonomous experimentation, I would explore AIDE ML, AutoGluon Assistant / MLZero, MLE-STAR, or MLJAR Studio AutoLab.
The important point is that there is no single best AutoML tool anymore. The category is too broad.
The right choice depends on the problem.
Final thoughts
AutoML in 2026 is more interesting than it was a few years ago.
The mature tools are better. The active projects are clearer. The older projects are easier to identify. Foundation models are entering tabular data. And agent-based systems are expanding AutoML from model training into full machine learning engineering.
But the core need has not changed.
Most teams still need to build reliable models from real data. They need to understand those models. They need to share results. They need to keep data private. They need to reproduce what happened later.
This is why focused, local, explainable AutoML is still valuable.
At the same time, agents are opening a new direction. They can automate the experiment loop, generate ideas, write code, and improve notebooks step by step. This will not replace mature AutoML engines immediately, but it will change how users interact with them.
The future of AutoML is probably not a single magic button.
It is a practical combination of:
- strong AutoML engines,
- transparent reports,
- tabular foundation models,
- local execution,
- and AI agents that help run experiments.
That is the real state of AutoML in 2026.