MLJAR AutoML

Automated Machine Learning
Features Preprocessing

Advanced feature preprocessing to save you time. You no longer need to worry about converting data to numeric form or handling missing values.

Automatic detection of data type

AutoML detects the data type of each feature and decides which preprocessing to apply to suit the needs of the selected Machine Learning algorithm.
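To make the idea of data type detection concrete, here is a minimal sketch in plain Python. The function name `detect_feature_type`, the type labels, and the `min_text_words` threshold are all assumptions made for illustration; they are not MLJAR's actual rules.

```python
from datetime import datetime


def _is_number(value):
    """Return True if the value can be read as a float."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def _is_datetime(value):
    """Return True if the value is a datetime or an ISO-formatted date string."""
    if isinstance(value, datetime):
        return True
    try:
        datetime.fromisoformat(str(value))
        return True
    except ValueError:
        return False


def detect_feature_type(values, min_text_words=3):
    """Guess a feature type from its values (hypothetical heuristic)."""
    present = [v for v in values if v is not None]  # ignore missing values
    if not present:
        return "empty"
    if all(_is_number(v) for v in present):
        return "numeric"
    if all(_is_datetime(v) for v in present):
        return "datetime"
    # Assumption: values with several words on average are free text,
    # shorter string values are treated as categories.
    avg_words = sum(len(str(v).split()) for v in present) / len(present)
    return "text" if avg_words >= min_text_words else "categorical"
```

Each detected type would then be routed to a matching preprocessing step, such as the conversions described in the following sections.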

Convert into numeric


Text

Convert text into numeric features with TfidfVectorizer, which computes the term frequency-inverse document frequency (TF-IDF) of each term in your data.
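As an illustration of the conversion itself, the sketch below applies scikit-learn's `TfidfVectorizer` with its default settings to a few made-up sentences. The sample documents are invented, and the settings AutoML actually uses may differ.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Made-up example documents, one per row of a text feature.
documents = [
    "the model predicts house prices",
    "the model predicts customer churn",
    "missing values are imputed",
]

vectorizer = TfidfVectorizer()  # default tokenizer and L2 row normalization
tfidf = vectorizer.fit_transform(documents)  # sparse matrix: documents x terms

# Each column corresponds to one term of the learned vocabulary.
for term, column in sorted(vectorizer.vocabulary_.items()):
    print(term, tfidf[:, column].toarray().ravel())
```

Terms that occur in many documents (such as "the") receive lower weights than terms that are specific to a single document.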


Categoricals

Convert categorical features into numeric ones with a label encoder, one-hot encoder, or target encoder. The appropriate encoder type is selected automatically based on feature cardinality and the AutoML training stage.
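The three encoders can be sketched in plain Python as follows. The helper names, the `max_one_hot` threshold, and the cardinality rule in `encode_by_cardinality` are assumptions for illustration only; the selection logic in AutoML also depends on the training stage and is not reproduced here.

```python
def label_encode(values):
    """Map each category to an integer, using the sorted order of categories."""
    mapping = {cat: idx for idx, cat in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values]


def one_hot_encode(values):
    """Create one 0/1 column per category (columns in sorted category order)."""
    categories = sorted(set(values))
    return [[1 if v == cat else 0 for cat in categories] for v in values]


def target_encode(values, targets):
    """Replace each category with the mean target observed for it.

    Simplified: a production version would compute the means out-of-fold
    to avoid leaking the target into the features.
    """
    sums, counts = {}, {}
    for v, t in zip(values, targets):
        sums[v] = sums.get(v, 0.0) + t
        counts[v] = counts.get(v, 0) + 1
    means = {v: sums[v] / counts[v] for v in sums}
    return [means[v] for v in values]


def encode_by_cardinality(values, max_one_hot=5):
    """Hypothetical rule: one-hot for few categories, label encoding otherwise."""
    if len(set(values)) <= max_one_hot:
        return "one-hot", one_hot_encode(values)
    return "label", label_encode(values)


colors = ["red", "green", "blue", "green"]
print(encode_by_cardinality(colors))
print(target_encode(colors, [1, 0, 1, 1]))
```

One-hot encoding keeps categories independent but adds one column per category, which is why a cardinality limit of some kind is a common design choice.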

Date & Time

Convert date and time features into numbers that ML algorithms can use. Features like year, month, day, weekday, day of the year, hour, and the difference from the earliest date are extracted automatically from your data.
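A minimal sketch of this extraction, using only the Python standard library, might look like the following. The function name, the output keys, and the choice of whole days as the unit for the difference from the earliest date are assumptions.

```python
from datetime import datetime


def extract_datetime_features(timestamps):
    """Turn datetime values into numeric features, one dict per value."""
    earliest = min(timestamps)
    rows = []
    for ts in timestamps:
        rows.append({
            "year": ts.year,
            "month": ts.month,
            "day": ts.day,
            "weekday": ts.weekday(),  # Monday is 0, Sunday is 6
            "day_of_year": ts.timetuple().tm_yday,
            "hour": ts.hour,
            # Assumption: difference measured in whole days.
            "days_since_earliest": (ts - earliest).days,
        })
    return rows


timestamps = [datetime(2021, 1, 1, 8), datetime(2021, 3, 1, 14)]
features = extract_datetime_features(timestamps)
for row in features:
    print(row)
```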

Handling Missing Values


No need to worry about missing values. If needed, they are imputed based on the statistics of each feature.

Imputation also works when missing values appear in features that had none in the training data.
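The behavior described above can be sketched with a small imputer that learns one fill value per feature during training. The class name `MedianImputer` and the use of the median are assumptions; the statistic AutoML actually uses is not specified here.

```python
from statistics import median


class MedianImputer:
    """Fill missing values (None) with per-column medians learned in fit()."""

    def fit(self, columns):
        # Learn a fill value for every column, including columns without gaps,
        # so that gaps first seen at prediction time can still be filled.
        # A column that is entirely None is not handled in this sketch.
        self.fill_values_ = {
            name: median([v for v in values if v is not None])
            for name, values in columns.items()
        }
        return self

    def transform(self, columns):
        return {
            name: [self.fill_values_[name] if v is None else v for v in values]
            for name, values in columns.items()
        }


# "income" has no missing values during training...
train = {"age": [20, None, 40], "income": [1000, 2000, 3000]}
imputer = MedianImputer().fit(train)

# ...but a missing income at prediction time is still imputed.
new_data = {"age": [None, 35], "income": [None, 1500]}
filled = imputer.transform(new_data)
print(filled)
```

Storing the statistics at fit time, rather than recomputing them on new data, keeps the preprocessing consistent between training and prediction.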

Explore more feature engineering methods

Golden Features Search

K-Means Features

Features Selection