MLJAR's Blog
-
Extract Rules from Decision Tree in 3 Ways with Scikit-Learn and Python
February 25, 2021 by Piotr Płoński Decision tree Scikit learn
Extracting rules from a Decision Tree helps to better understand how samples propagate through the tree during prediction. This is useful if we want to implement a Decision Tree without Scikit-learn, or in a language other than Python. Decision Trees are easy to port to any programming language because they are just a set of if-else statements. I've seen many examples of moving scikit-learn Decision Trees into C, C++, Java, or even SQL.
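As a quick illustration of the idea, here is a minimal sketch using scikit-learn's built-in `export_text`, which prints a fitted tree as nested if-else style rules (one of the ways such a post would likely cover):

```python
# Minimal sketch: extract if-else rules from a trained Decision Tree
# with scikit-learn's export_text.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(iris.data, iris.target)

# export_text renders the tree as indented if-else style rules
rules = export_text(clf, feature_names=list(iris.feature_names))
print(rules)
```

The printed rules can then be translated by hand (or by a small script) into if-else statements in any target language.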
-
Tensorflow vs Scikit-learn
October 01, 2020 by Piotr Płoński Tensorflow Scikitlearn Neuralnetwork
Have you ever wondered what the difference is between Tensorflow and Scikit-learn? Which one is better? Have you ever needed Tensorflow when you already use Scikit-learn?
-
PostgreSQL and Machine Learning
September 16, 2020 by Piotr Płoński Postgresql Automl Supervised
-
AutoML as easy as MLJar
September 12, 2020 by Jeff King Automl Supervised
If there is one open-source library that has made me an avid machine learning practitioner and won the battle of the AutoMLs hands down, it has to be MLJAR. I simply can't stop eulogizing this library: it has helped me overcome my deficiencies in coding and programming while automating the predictive modeling flow with very little user involvement. I have taken it for a spin in a few Hackathons and am not overly surprised to find it amongst the top performers. It saves a lot of time, as you do not need data preprocessing and feature engineering before feeding the dataset to the model.
-
Xgboost Feature Importance Computed in 3 Ways with Python
August 17, 2020 by Piotr Płoński Xgboost
Xgboost is a gradient boosting library. It provides a parallel boosted trees algorithm that can solve Machine Learning tasks. It is available in many languages, like: C++, Java, Python, R, Julia, Scala. In this post, I will show you how to get feature importance from an Xgboost model in Python. In this example, I will use the boston dataset available in the scikit-learn package (a regression task). -
How many trees in the Random Forest?
June 30, 2020 by Piotr Płoński Random forest
I have trained 3,600 Random Forest Classifiers (each with 1,000 trees) on 72 data sets (from the OpenML-CC18 benchmark) to check how many trees should be used in the Random Forest. What I've found: -
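A small-scale sketch of the same kind of experiment: track out-of-bag accuracy as the number of trees grows (the post's benchmark is far larger, spanning 72 datasets; the dataset and tree counts below are illustrative):

```python
# Sketch: check how out-of-bag accuracy changes with the number of trees.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

scores = {}
for n_trees in [25, 50, 100, 200]:
    rf = RandomForestClassifier(n_estimators=n_trees, oob_score=True,
                                random_state=0).fit(X, y)
    scores[n_trees] = rf.oob_score_  # accuracy on out-of-bag samples
print(scores)
```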
How to visualize a single Decision Tree from the Random Forest in Scikit-Learn (Python)?
June 29, 2020 by Piotr Płoński Random forest
The Random Forest is an ensemble of Decision Trees. A single Decision Tree can be easily visualized in several different ways. In this post, I will show you how to visualize a Decision Tree from the Random Forest.
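A minimal sketch of one such way: pull a single fitted tree out of the forest's `estimators_` list and render it with scikit-learn's `plot_tree` (matplotlib is assumed to be available):

```python
# Sketch: visualize the first Decision Tree from a fitted Random Forest.
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import plot_tree

iris = load_iris()
rf = RandomForestClassifier(n_estimators=10, random_state=0).fit(iris.data, iris.target)

# every fitted tree lives in rf.estimators_; draw the first one
fig, ax = plt.subplots(figsize=(8, 6))
plot_tree(rf.estimators_[0], feature_names=iris.feature_names, filled=True, ax=ax)
fig.savefig("single_tree.png")
```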
-
Random Forest Feature Importance Computed in 3 Ways with Python
June 29, 2020 by Piotr Płoński Random forest
The feature importance (variable importance) describes which features are relevant. It can help with a better understanding of the solved problem and sometimes lead to model improvements by employing feature selection. In this post, I will present 3 ways (with code examples) to compute feature importance for the Random Forest algorithm from the scikit-learn package (in Python). -
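A sketch of two of the usual approaches: the built-in impurity-based importances and scikit-learn's permutation importance (the post's third way may be something else, e.g. SHAP values; that is an assumption on my part):

```python
# Sketch: two ways to compute Random Forest feature importance.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=300, n_features=6, random_state=0)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# 1) mean decrease in impurity, computed during training
print(rf.feature_importances_)

# 2) permutation importance, computed by shuffling each feature
result = permutation_importance(rf, X, y, n_repeats=5, random_state=0)
print(result.importances_mean)
```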
How to save and load Random Forest from Scikit-Learn in Python?
June 24, 2020 by Piotr Płoński Random forest
In this post, I will show you how to save and load a Random Forest model trained with scikit-learn in Python. The method presented here can be applied to any algorithm from scikit-learn (this is what is amazing about scikit-learn!).
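A minimal sketch of the idea using `joblib`, which is what the scikit-learn docs recommend for model persistence:

```python
# Sketch: save a fitted Random Forest to disk and load it back with joblib.
import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
rf = RandomForestClassifier(n_estimators=10, random_state=0).fit(iris.data, iris.target)

joblib.dump(rf, "random_forest.joblib")       # save to disk
loaded = joblib.load("random_forest.joblib")  # load it back

# the loaded model predicts exactly like the original
assert (loaded.predict(iris.data) == rf.predict(iris.data)).all()
```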
-
How to reduce memory used by Random Forest from Scikit-Learn in Python?
June 24, 2020 by Piotr Płoński Random forest
The Random Forest algorithm from scikit-learn package can sometimes consume too much memory:
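One common fix, sketched below, is to limit tree depth and compare the serialized size of the two models (the exact byte counts will vary; this may or may not be the approach the post takes):

```python
# Sketch: shrink a Random Forest by limiting max_depth and compare
# the pickled sizes of the full and depth-limited models.
import pickle
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

full = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
small = RandomForestClassifier(n_estimators=100, max_depth=5,
                               random_state=0).fit(X, y)

full_bytes = len(pickle.dumps(full))
small_bytes = len(pickle.dumps(small))
print(full_bytes, small_bytes)  # the depth-limited forest is much smaller
```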