Automated Machine Learning
K-Means Features
Use the K-Means clustering algorithm to enhance your data
and improve your model's performance
AutoML computes K-Means centers from the numeric features.
Scaling is applied if needed.
The distance to each K-Means center and the number of the nearest center are added to each sample.
The number of clusters is determined from the number of rows and columns in the training data.
We use the Mini-Batch K-Means implementation available in the scikit-learn package.
For details, please check the source code on GitHub.
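The steps described here can be sketched with scikit-learn directly. This is a minimal illustration, not the AutoML implementation: the function name `add_kmeans_features` is hypothetical, `StandardScaler` is an assumed choice for the scaling step, and `n_clusters` is passed in by hand because the rule that derives it from the data shape is not reproduced here.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.preprocessing import StandardScaler


def add_kmeans_features(X_train, X_new, n_clusters=8, random_state=42):
    """Append distances to each K-Means center and the center number to X_new.

    Hypothetical helper: n_clusters is fixed by the caller here, whereas
    AutoML selects it automatically from the training data.
    """
    # Assumed scaling step: standardize numeric features before clustering.
    scaler = StandardScaler().fit(X_train)

    # Fit Mini-Batch K-Means on the scaled training data only.
    kmeans = MiniBatchKMeans(
        n_clusters=n_clusters, n_init=3, random_state=random_state
    )
    kmeans.fit(scaler.transform(X_train))

    X_scaled = scaler.transform(X_new)
    distances = kmeans.transform(X_scaled)  # shape: (n_samples, n_clusters)
    labels = kmeans.predict(X_scaled)       # shape: (n_samples,)

    # New columns: one distance per center, then the assigned center number.
    return np.hstack([X_new, distances, labels.reshape(-1, 1)])


# Toy numeric data standing in for a real training set.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
X_aug = add_kmeans_features(X, X, n_clusters=5)
print(X_aug.shape)  # (500, 10): 4 original + 5 distances + 1 center number
```

The scaler and the clustering model are fitted on the training data and then reused on new samples, so the same centers are applied at prediction time.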
The feature generation step is very fast thanks to the Mini-Batch version of the K-Means algorithm.
The number of clusters in K-Means is selected automatically based on the properties of the training data.
Improve the accuracy of your Machine Learning pipeline by including K-Means features in the data.