Automated Machine Learning
Compare Frameworks

We compare AutoML frameworks on Kaggle datasets. Binary classification, multiclass classification, and regression tasks are covered. All datasets are in tabular format.

Comparison overview

Kaggle datasets

Competition datasets

The comparison uses Kaggle competition datasets. Each AutoML framework is trained on the competition's training dataset, and the final result is the Private Leaderboard score.

Time

Training

The frameworks were trained on m5.24xlarge EC2 machines (96 vCPUs, 384 GB RAM). The training time limit was set to 4 hours (except GCP-Tables, which uses its own machine types).

Compared AutoML frameworks

The following popular AutoML frameworks were compared:
Auto-WEKA, Auto-sklearn, TPOT, mljar, H2O, GCP-Tables, and AutoGluon.

Datasets description

The dataset selection is the same as in the article:
N. Erickson, et al.: AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data
(except House Prices Adv. Regression, which had no Private Leaderboard results because the competition was still ongoing).
The results for Auto-WEKA, Auto-sklearn, TPOT, H2O, GCP-Tables, and AutoGluon presented here are taken from that article.

| Competition | Task | Metric | Year | Teams | Rows | Columns |
| Mercedes-Benz Greener Manufacturing | regression | R2 | 2017 | 3,823 | 4,209 | 377 |
| Santander Value Prediction Challenge | regression | RMSLE | 2019 | 4,463 | 4,459 | 4,992 |
| Allstate Claims Severity | regression | MAE | 2017 | 3,045 | 180,000 | 131 |
| BNP Paribas Cardif Claims Management | binary | log-loss | 2016 | 2,920 | 110,000 | 132 |
| Santander Customer Transaction Prediction | binary | AUC | 2019 | 8,751 | 220,000 | 201 |
| Santander Customer Satisfaction | binary | AUC | 2016 | 5,115 | 76,000 | 370 |
| Porto Seguro Safe Driver Prediction | binary | Gini | 2018 | 5,156 | 600,000 | 58 |
| IEEE Fraud Detection | binary | AUC | 2019 | 6,351 | 590,000 | 432 |
| Walmart Recruiting Trip Type Classification | multi-class | log-loss | 2016 | 1,043 | 650,000 | 7 |
| Otto Group Product Classification Challenge | multi-class | log-loss | 2015 | 3,507 | 62,000 | 94 |

Results

The results are presented as Percentile Rank: the fraction of submissions that scored worse.
Better solutions have higher Percentile Rank values; the first-place solution in a competition gets Percentile Rank = 1.
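As a minimal sketch of this computation (assuming the scores of all other submissions are available and that higher scores are better; the comparison flips for error metrics):

# Minimal sketch of the Percentile Rank computation.
# Assumes `others` holds the scores of all other submissions and
# `score` is our submission's score, with higher scores being better.
def percentile_rank(score, others):
    worse = sum(1 for s in others if s < score)
    return worse / len(others)

# A score beating 750 of 1,000 other submissions gives 0.75;
# beating all of them gives 1.0 (first place).
print(percentile_rank(0.91, [0.85] * 750 + [0.95] * 250))  # 0.75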

State-of-the-art Performance

MLJAR Competition winner

Better in 7 out of 10 competitions

The mljar AutoML outperformed the other frameworks in 7 out of 10 competitions, with 1 draw and 2 losses. It performed well across the full range of Machine Learning tasks and dataset sizes.

State-of-the-art MLJAR Performance

5 times in the Top-25%

The mljar AutoML, without any human intervention, finished in the Top-25% in 5 of the 10 Kaggle competitions, including 3 finishes in the Top-10%. This is a huge accomplishment.

The AutoML Winning Code

The code used to run mljar AutoML in the competitions.
Only 3 lines of code are needed for the AutoML itself:
the initialization, the fit, and the prediction.

"""AutoML code """
import pandas as pd
from supervised import AutoML

# Load data
train = pd.read_csv("/your_path/train.csv")
test = pd.read_csv("/your_path/test.csv")
x_cols = train.columns[2:]

# Train AutoML
automl = AutoML(mode="Compete", total_time_limit=4*3600)
automl.fit(train[x_cols], train["target"])

# Prepare submission
sub = pd.read_csv("/your_path/sample_submission.csv")
sub["target"] = automl.predict_proba(test)[:,1]
sub.to_csv("submission.csv", index=False)

The code needs to be adjusted for each dataset,
since feature names, evaluation metrics, and ML tasks differ between competitions.
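For instance, a regression competition such as Allstate Claims Severity optimizes MAE and uses a different target column. The sketch below shows one possible adjustment; the file paths and column names are placeholders, and the ml_task and eval_metric arguments explicitly set the task type and the optimized metric.

# A hypothetical adjustment for a regression competition (MAE metric).
# Paths and column names are placeholders, not taken from the benchmark code.
import pandas as pd
from supervised import AutoML

train = pd.read_csv("/your_path/train.csv")
test = pd.read_csv("/your_path/test.csv")
x_cols = [c for c in train.columns if c not in ("id", "loss")]

automl = AutoML(
    mode="Compete",             # competition preset: ensembling and stacking
    ml_task="regression",       # force the task type instead of auto-detection
    eval_metric="mae",          # optimize the competition metric
    total_time_limit=4 * 3600,  # the same 4-hour budget as in the benchmark
)
automl.fit(train[x_cols], train["loss"])

sub = pd.read_csv("/your_path/sample_submission.csv")
sub["loss"] = automl.predict(test[x_cols])  # regression uses predict(), not predict_proba()
sub.to_csv("submission.csv", index=False)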

Explore more mljar features

Golden Features

K-Means Features

Model Ensembling

ML Explainability

Automatic Documentation
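
Most of these features can be enabled explicitly through AutoML constructor parameters. A minimal sketch (parameter names as in mljar-supervised; the exact defaults depend on the selected mode):

# A sketch showing how the features above can be toggled explicitly.
# Parameter names follow mljar-supervised; defaults depend on the mode.
from supervised import AutoML

automl = AutoML(
    mode="Compete",
    golden_features=True,          # search for Golden Features (constructed feature pairs)
    kmeans_features=True,          # add K-Means based features
    stack_models=True,             # model stacking
    train_ensemble=True,           # ensemble the trained models
    explain_level=2,               # full ML explainability (importance, SHAP)
    results_path="automl_report",  # directory with the automatic Markdown documentation
)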