Automated Machine Learning

Automated Machine Learning (AutoML) is the process of running a full machine learning pipeline automatically. An AutoML solution can handle feature preprocessing and engineering, algorithm training and hyperparameter selection.

Build great machine learning models without coding!

Training data

The service works with structured data and accepts CSV (comma-separated values) files as input. A file used for training must have a target column. You upload the data file to the mljar service. You can find example data files at this link.
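
As a rough illustration of the expected input, the sketch below parses a small CSV with Python's standard `csv` module and separates the target column from the input columns. The column names (`age`, `income`, `target`) and the helper `split_features_target` are made up for this example and are not part of mljar.

```python
import csv
import io

# Hypothetical training data: two input columns and one target column.
CSV_TEXT = """age,income,target
25,40000,0
47,82000,1
33,,0
"""

def split_features_target(csv_text, target_column):
    """Return (rows_without_target, target_values) parsed from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if target_column not in reader.fieldnames:
        raise ValueError(f"missing target column: {target_column}")
    features, target = [], []
    for row in reader:
        target.append(row.pop(target_column))  # remove the target from the inputs
        features.append(row)
    return features, target

X_rows, y_values = split_features_target(CSV_TEXT, "target")
print(X_rows[0])
print(y_values)
```

Values stay as strings here, and the empty `income` field in the last row is the kind of missing value that the preprocessing options deal with.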

Dataset statistics

For each dataset uploaded to the service, the following statistics are computed:

  • Min, Max, Median, Mean, Std
  • Percent of missing values
  • Number of unique values
  • Distribution
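
To make these statistics concrete, here is a minimal sketch for a single numeric column using Python's `statistics` module. It is not the service's implementation: the function name `column_statistics` is hypothetical, missing values are assumed to be represented as `None`, and the sample standard deviation is an assumption (the page does not say which variant is used). The distribution plot is left out.

```python
import statistics

def column_statistics(values):
    """Summarize one numeric column; None marks a missing value."""
    present = [v for v in values if v is not None]
    return {
        "min": min(present),
        "max": max(present),
        "median": statistics.median(present),
        "mean": statistics.mean(present),
        "std": statistics.stdev(present),  # sample standard deviation (assumed)
        "missing_percent": 100.0 * (len(values) - len(present)) / len(values),
        "unique": len(set(present)),
    }

# Hypothetical column with one missing value.
stats = column_statistics([1.0, 2.0, 2.0, None, 5.0])
print(stats)
```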

For each column (feature) you can:

  • select its usage, that is, how it will be used in the ML model. A column can be an Id column, a model input, the model output (the target), a sample weight, or excluded from analysis
  • select its type. It can be numeric, discrete or categorical.

Machine Learning Experiment

To train a machine learning model, you need to create an ML experiment. This is easy and takes only a few clicks. Most of the selectable parameters are set to smart defaults. You are required to select the training data.

Available validation:
  • k-fold cross validation
  • train / validation split
  • validation with separate dataset
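
For readers unfamiliar with k-fold cross validation, the sketch below shows one way the row indices can be split into folds. It is a simplified illustration, not mljar's code: `k_fold_indices` is a hypothetical helper, and it does not shuffle or stratify the rows.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size

folds = list(k_fold_indices(n_samples=7, k=3))
for train, validation in folds:
    print(train, validation)
```

Each row lands in the validation part of exactly one fold, so every sample is used for both training and validation across the k runs.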

Available preprocessing:
  • Fill missing values with mean
  • Fill missing values with median
  • Fill missing values with minimum
  • Convert categorical to integers
  • Convert categorical to binary with one-hot encoding
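
The sketch below illustrates three of these preprocessing steps in plain Python: median imputation, integer encoding and one-hot encoding. The function names are hypothetical, missing values are assumed to be `None`, and the ordering of categories (sorted alphabetically) is an arbitrary choice for the example.

```python
import statistics

def fill_missing_with_median(values):
    """Replace None entries with the median of the observed values."""
    median = statistics.median(v for v in values if v is not None)
    return [median if v is None else v for v in values]

def categorical_to_integers(values):
    """Map each category to an integer code, in sorted category order."""
    codes = {category: i for i, category in enumerate(sorted(set(values)))}
    return [codes[v] for v in values]

def one_hot_encode(values):
    """Return one binary indicator row per value, one column per category."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]

filled = fill_missing_with_median([3.0, None, 1.0, 8.0])
codes = categorical_to_integers(["red", "blue", "red"])
one_hot = one_hot_encode(["red", "blue", "red"])
print(filled, codes, one_hot)
```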

Available algorithms:
  • Extreme Gradient Boosting
  • LightGBM
  • Random Forest
  • Regularized Greedy Forest
  • Extra Trees
  • k-Nearest Neighbor
  • Logistic Regression
  • Neural Network
  • Ensemble

Available tuning:
  • Optimize LogLoss or AUC for binary classification
  • Optimize MSE or MAE for regression
  • Select number of models trained
  • Select time limit for model training
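
As a reference for the metrics named above, here is a minimal sketch of LogLoss, MSE and MAE using their standard textbook definitions. How the service computes them internally is not documented on this page; the probability clipping constant `eps` is an assumption added to avoid taking the logarithm of zero. AUC is omitted to keep the example short.

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood for binary labels (0 or 1)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def mean_squared_error(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_absolute_error(y_true, y_pred):
    """Average of absolute differences between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

print(log_loss([1, 0], [0.9, 0.2]))
print(mean_squared_error([3.0, 5.0], [2.0, 7.0]))
print(mean_absolute_error([3.0, 5.0], [2.0, 7.0]))
```

Lower is better for all three; MSE penalizes large errors more heavily than MAE.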

Machine Learning model information

The service stores information about each model and its training process. You can check:

  • hyperparameter values
  • preprocessing used
  • scores in each cross validation fold
  • learning curves, computed as the mean over all cross validation folds
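
A mean learning curve over folds can be sketched as follows. This is only an illustration: `mean_learning_curve` is a hypothetical helper, the loss values are invented, and truncating to the shortest fold is an assumption for the case where folds stop at different iterations.

```python
def mean_learning_curve(fold_curves):
    """Average per-iteration scores across folds (truncated to the shortest fold)."""
    n_iterations = min(len(curve) for curve in fold_curves)
    return [
        sum(curve[i] for curve in fold_curves) / len(fold_curves)
        for i in range(n_iterations)
    ]

# Hypothetical validation losses from three folds, one value per iteration.
curves = [[0.9, 0.6, 0.5], [0.7, 0.6, 0.7], [0.8, 0.6, 0.6]]
mean_curve = mean_learning_curve(curves)
print(mean_curve)
```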

To prevent overfitting, early stopping is used for all models. The model stored in the service is always the one from the best iteration.
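
The idea of early stopping can be sketched as below. This is a generic patience-based variant, not necessarily the rule the service applies: the function name, the `patience` parameter and the list of validation losses (standing in for a real training loop) are all assumptions.

```python
def early_stopping_best_iteration(validation_losses, patience):
    """Return (best_iteration, best_loss); stop once `patience` iterations pass without improvement."""
    best_iteration, best_loss = 0, float("inf")
    for iteration, loss in enumerate(validation_losses, start=1):
        if loss < best_loss:
            best_iteration, best_loss = iteration, loss
        elif iteration - best_iteration >= patience:
            break  # no improvement for `patience` iterations
    return best_iteration, best_loss

# Hypothetical validation loss after each training iteration.
losses = [0.70, 0.55, 0.50, 0.52, 0.51, 0.53, 0.40]
best = early_stopping_best_iteration(losses, patience=3)
print(best)
```

Training stops at iteration 6, so the later value 0.40 is never reached; the model kept is the one from iteration 3, which mirrors the "best iteration" behavior described above.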

Feature importance

You can check the importance of your features for the following algorithms:

  • Extreme Gradient Boosting
  • LightGBM
  • Random Forest
  • Extra Trees

Deploy Machine Learning model

There are several ways to use your model:

  • You can compute predictions in the user interface
  • You can download the model and use it locally - yes, all models are yours and you can do what you want with them!
  • You can download the model's code
  • You can use our REST API to access your models in the cloud

Python and R support

For expert users we provide Python and R interfaces over the mljar REST API. You can train and access your models from code.

from mljar import Mljar

# X (input features) and y (target) are assumed to be loaded already

# create MLJAR project and experiment
model = Mljar(project='Project 1', experiment='Start!')

# train models
model.fit(X, y)

Demo

Would you like to preview the service without registering? Or would you like to start building great machine learning models today?

Build ML models! Show me demo