Starting from version 1.0.0, our open-source Automated Machine Learning Python package mljar-supervised supports fairness-aware training of Machine Learning pipelines. Our AutoML can measure fairness and mitigate bias for provided sensitive features. We support three Machine Learning tasks: binary classification, multiclass classification, and regression. We provide usage and implementation details in this article.
The US job market is filled with exciting opportunities for aspiring Data Analysts. However, landing your first job can be challenging due to the diverse range of requirements employers are looking for. In this article, we analyze data from 15,963 Data Analyst job listings. We build a Jupyter Notebook with data analysis and visualization, and serve it as an interactive web app. For example, we search for the most needed skills and show how they relate to the average yearly salary. Let’s check what the most needed skills for a Data Analyst are!
December 03, 2022 by Aleksandra Płońska, Piotr Płoński Jupyter
Jupyter Notebook is a popular open-source tool for development and exploration in the data world. It started as the IPython Notebook, developed by Fernando Pérez and Brian Granger. Currently, Jupyter Notebook is available as 4 different web applications: Classic Jupyter Notebook, Jupyter Lab, Jupyter RetroLab, and Jupyter Lite. Let’s take a closer look at the differences between those Jupyter versions.
November 21, 2022 by Aleksandra Płońska, Piotr Płoński Matplotlib
Matplotlib is a powerful visualization package for Python. It is highly customizable, which is why it is widely used in both commercial and academic settings. In this article, I will show you 9 different ways to set colors in Matplotlib plots. All parts of the plot can be customized with a new color. You can set colors for axes, labels, background, and title. However, not every data scientist is a graphic designer who can compose good-looking colors in a single plot, so I will also show you how to use predefined Matplotlib styles to get attractive plots.
November 12, 2022 by Aleksandra Płońska, Piotr Płoński Pandas
Pandas is a popular data manipulation library. It has over 15k stars on GitHub. It’s an open-source project that allows, among other things: automatic and explicit data alignment; easy handling of missing data; intelligent label-based slicing, indexing, and subsetting of large data sets; merging data sets; and flexible reshaping and pivoting of data sets. There are 3 ways to get the row count from a Pandas DataFrame. I will describe them all in this article. My preferred way is to use df.shape to get the number of rows and columns. This method is fast and simple.
Jupyter Notebook saves files in the .ipynb format. It is a JSON file with code, Markdown, and outputs. There are many cases in which we would like to convert a Jupyter Notebook to a plain Python script. For example, you would like to keep Python code in the repository or turn your notebook into a standalone package. I will show you 3 ways to export the Jupyter Notebook file to a Python script.
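Because an .ipynb file is plain JSON, the idea can be illustrated with the standard library alone. The sketch below is not necessarily one of the 3 ways from the article (the excerpt does not name them); it only assumes the usual notebook layout, where each entry in "cells" has a "cell_type" and a "source". The notebook content and the helper name notebook_to_script are made up for illustration.

```python
import json

# A tiny hand-written notebook in the .ipynb JSON layout (nbformat 4).
# The cell contents are made up for illustration.
notebook_json = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Demo notebook"]},
        {"cell_type": "code", "metadata": {}, "execution_count": 1,
         "outputs": [], "source": ["x = 1\n", "y = x + 1"]},
        {"cell_type": "code", "metadata": {}, "execution_count": 2,
         "outputs": [], "source": ["print(y)"]},
    ],
})


def notebook_to_script(ipynb_text: str) -> str:
    """Return the code cells of a notebook joined into one Python script."""
    notebook = json.loads(ipynb_text)
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip Markdown and raw cells
        source = cell.get("source", "")
        # "source" may be stored as one string or as a list of lines
        if isinstance(source, list):
            source = "".join(source)
        chunks.append(source.rstrip("\n"))
    # Separate cells with a blank line and end the script with a newline
    return "\n\n".join(chunks) + "\n"


script = notebook_to_script(notebook_json)
print(script)
```

For a real notebook, you would read the file contents with open(path, encoding="utf-8") and pass the text to the same function. Markdown cells and outputs are dropped here; a fuller converter might keep Markdown as comments.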
November 08, 2022 by Aleksandra Płońska, Piotr Płoński Matplotlib
Matplotlib is a popular plotting library for Python. It can be used in Python scripts and Jupyter Notebooks. The plot can be displayed in a separate window or in a notebook. What if you would like to save the plot to a file? In this article, I will show you how to save a Matplotlib plot to a file. It can be saved in 14 different formats.
November 08, 2022 by Aleksandra Płońska, Piotr Płoński Python
November 04, 2022 by Aleksandra Płońska, Piotr Płoński Scikit-learn
After training a Machine Learning model, you need to save it for future use. In this article, I will show you 2 ways to save and load scikit-learn models. One method uses the pickle package; it is fast, but the model can take more storage than in the second approach. The alternative is to use the joblib package, which can save some space on disk but is slower.
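The save-and-load round trip with pickle can be sketched with the standard library only. To keep the example self-contained, a plain dict with made-up parameters stands in for a fitted model; a real scikit-learn estimator would be passed to pickle.dump in the same way. The file name model.pkl is an arbitrary choice.

```python
import os
import pickle
import tempfile

# Stand-in for a fitted model: a plain dict of made-up parameters.
# A real scikit-learn estimator would be passed to pickle.dump the same way.
model = {"coef": [0.5, -1.25], "intercept": 2.0}

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "model.pkl")

    # Save: pickle needs a file opened in binary write mode
    with open(model_path, "wb") as f:
        pickle.dump(model, f)

    # Size on disk, useful when comparing storage between approaches
    size_bytes = os.path.getsize(model_path)

    # Load: open the same file in binary read mode
    with open(model_path, "rb") as f:
        restored = pickle.load(f)

print(restored == model, size_bytes > 0)
```

The joblib package offers dump and load functions with a similar shape, so switching approaches is mostly a matter of changing these two calls. In either case, only load files from sources you trust, since unpickling can execute arbitrary code.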
Presentation created with Jupyter Notebook is exported to an HTML file. It is interactive, thanks to the Reveal.js library. There are several options to publish HTML presentations in the cloud.