I have trained 3,600 Random Forest Classifiers (each with 1,000 trees) on 72 data sets (from the OpenML-CC18 benchmark) to check how many trees should be used in the Random Forest. What I’ve found:
-
How many trees in the Random Forest?
June 30, 2020 by Piotr Płoński Random forest
-
How to visualize a single Decision Tree from the Random Forest in Scikit-Learn (Python)?
June 29, 2020 by Piotr Płoński Random forest
The Random Forest is an ensemble of Decision Trees. A single Decision Tree can be easily visualized in several different ways. In this post I will show you how to visualize a Decision Tree from the Random Forest.
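As a quick illustration of the idea, here is a minimal sketch (not necessarily the approach used in the post) that pulls one tree out of a fitted forest and prints it as text rules. The iris data set, the small `n_estimators` and `max_depth` values, and the variable names are assumptions chosen only to keep the output short; `estimators_` and `export_text` are part of scikit-learn.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import export_text

# Assumption: iris is used only as a small, built-in example data set.
iris = load_iris()

# A small, shallow forest so that a single tree stays readable.
rf = RandomForestClassifier(n_estimators=10, max_depth=3, random_state=0)
rf.fit(iris.data, iris.target)

# Each fitted tree is available in the estimators_ list.
single_tree = rf.estimators_[0]

# Render the tree as indented if/else rules.
tree_rules = export_text(single_tree, feature_names=list(iris.feature_names))
print(tree_rules)
```

For a graphical view, `sklearn.tree.plot_tree` accepts the same `single_tree` object, assuming matplotlib is installed.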
-
Random Forest Feature Importance Computed in 3 Ways with Python
June 29, 2020 by Piotr Płoński Random forest
The feature importance (variable importance) describes which features are relevant. It can help you better understand the problem being solved and sometimes leads to model improvements through feature selection. In this post, I will present 3 ways (with code examples) to compute feature importance for the Random Forest algorithm from the scikit-learn package (in Python).
-
How to save and load Random Forest from Scikit-Learn in Python?
June 24, 2020 by Piotr Płoński Random forest
In this post I will show you how to save and load a Random Forest model trained with scikit-learn in Python. The method presented here can be applied to any algorithm from scikit-learn (this is what is amazing about scikit-learn!).
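A minimal sketch of one common way to do this, assuming `joblib` is used for serialization (the excerpt does not name the method, so treat this as an illustration rather than the post's exact code). The data set, file name, and `compress` level are hypothetical choices.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Assumption: iris is used only as a small example data set.
X, y = load_iris(return_X_y=True)
rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Save to a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "random_forest.joblib")  # hypothetical file name
    joblib.dump(rf, model_path, compress=3)  # compress trades save time for a smaller file
    loaded_rf = joblib.load(model_path)

# The restored model should give the same predictions as the original.
same_predictions = np.array_equal(rf.predict(X), loaded_rf.predict(X))
print(same_predictions)
```

The same `dump` and `load` calls work for other scikit-learn estimators. Only load files from sources you trust, since joblib files are pickle-based, and load them with the same scikit-learn version that saved them where possible.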
-
How to reduce memory used by Random Forest from Scikit-Learn in Python?
June 24, 2020 by Piotr Płoński Random forest
The Random Forest algorithm from the scikit-learn package can sometimes consume too much memory:
-
Random Forest vs Neural Network (classification, tabular data)
May 10, 2019 by Piotr Płoński Random forest Neural network
Which is better: Random Forest or Neural Network? This is a common question, with a very easy answer: it depends :) I will try to show you when it is good to use Random Forest and when to use Neural Network.
-
Random Forest vs AutoML (with python code)
May 07, 2019 by Piotr Płoński Random forest Automl
Random Forest versus AutoML, you say. Hmmm…, it’s obvious that the performance of AutoML will be better. You will check many models and then ensemble them. This is true, but I would like to show you other advantages of AutoML that will help you deal with dirty, real-life data.
-
Does Random Forest overfit?
April 05, 2019 by Piotr Płoński Random forest
When I first saw this question I was a little surprised. My first thought was: of course it does! Any complex machine learning algorithm can overfit. I’ve trained hundreds of Random Forest (RF) models and have many times observed that they overfit. My second thought: wait, why are people asking such a question? Let’s dig deeper and do some research. After some quick googling, I found the following paragraph on the website of Leo Breiman (the creator of the Random Forest algorithm):