Dec 07 2017 · Piotr Płoński

AutoML comparison

Automated Machine Learning (AutoML) is the process of building machine-learning models algorithmically, with no human intervention. Several AutoML packages are available for building predictive models:

Update: many AutoML packages are now available; the list of AutoML software is available here

Datasets

In this post we compare three AutoML packages (auto-sklearn, h2o and mljar). The comparison is performed on a binary classification task on 28 datasets from OpenML. The datasets are described below.

AutoML comparison datasets

Comparison methodology

  1. Each dataset was divided into train and test sets (70% of samples for training, 30% for testing). All packages were tested on the same data splits.
  2. Each AutoML model was trained on the train set, with a 1-hour limit on training time.
  3. The final AutoML model was used to compute predictions on the test set (on samples not used for training).
  4. Logloss was used to assess model performance (the lower the logloss, the better the model). Logloss was chosen because, unlike plain accuracy, it takes the predicted probabilities into account.
  5. The process was repeated 10 times (with different seeds used for the splits). Final results are averaged over the 10 repetitions.
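To make point 4 concrete, below is a minimal sketch of binary logloss computed from its definition (stdlib only; `logloss` is a helper written here for illustration, not part of any of the compared packages). It shows why logloss separates models that plain accuracy would tie: both predictions below are "correct", but the confident one scores much better.

```python
import math

def logloss(y_true, y_prob, eps=1e-15):
    """Binary cross-entropy averaged over samples: lower is better."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)

# Both predictions have 100% accuracy, yet logloss tells them apart:
print(logloss([1, 0], [0.9, 0.1]))  # ~0.105, confident and correct
print(logloss([1, 0], [0.6, 0.4]))  # ~0.511, barely correct
```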

Results

The results are presented in the table and chart below. The best approach for each dataset is bolded.

AutoML comparison results
AutoML comparison plot bar

Discussion

The poor performance of auto-sklearn can be explained by the 1-hour training limit. Auto-sklearn uses Bayesian optimization for hyperparameter tuning, which is sequential by nature and requires many iterations to find a good solution. The 1-hour limit was selected from a business perspective — in my opinion, a user of an AutoML package would rather wait 1 hour than 72 hours for a result. The h2o results are better than auto-sklearn's on almost all datasets.
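The sequential dependence can be sketched with a toy greedy local search — not auto-sklearn's actual Bayesian optimization (which models the objective probabilistically), and with a hypothetical objective standing in for an expensive model training. The point it illustrates is that each proposal depends on all previous evaluations, so trials cannot run in parallel and a tight time budget allows only a few of them.

```python
import random

def evaluate(lr):
    # Stand-in for one expensive model training; in reality each call
    # could take minutes, which is what makes sequential search slow.
    return (lr - 0.1) ** 2  # hypothetical objective, minimum at lr=0.1

def sequential_search(iterations, seed=0):
    rng = random.Random(seed)
    best_lr = rng.uniform(0.0, 1.0)
    best_loss = evaluate(best_lr)
    for _ in range(iterations):
        # Each proposal is built from the results seen so far, so the
        # iterations cannot be parallelized -- the wall-clock cost grows
        # linearly with the number of model trainings.
        candidate = best_lr + rng.gauss(0, 0.05)
        loss = evaluate(candidate)
        if loss < best_loss:
            best_lr, best_loss = candidate, loss
    return best_lr, best_loss

lr, loss = sequential_search(20)
print(lr, loss)
```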

The best results were obtained by the mljar package — it was the best algorithm on 26 of the 28 datasets. On average it was 47.15% better than auto-sklearn and 13.31% better than the h2o AutoML solution.

A useful feature of mljar is its user interface: all models obtained during the optimization are saved and available through a web browser.

MLJAR machine learning model details

The code used for the comparison is available at https://github.com/mljar/automl_comparison