AutoML HR Use Case

Explore the transformative impact of MLJAR AutoML on HR processes! From talent acquisition to employee retention, harness the power of automated machine learning to make smarter, data-driven HR decisions effortlessly.

This notebook was created with MLJAR Studio

MLJAR Studio is a Python code editor with interactive code recipes and a local AI assistant.
The code recipes UI is displayed at the top of code cells.

How AutoML Can Help

In this notebook we use the employee attrition dataset. It contains employee information such as age, job role, job satisfaction, monthly income, work environment satisfaction, and promotion history. The prediction task is to identify which employees are likely to leave the company, and the resulting model can support HR activities such as retention planning and workforce management.

MLJAR AutoML automates the model development for this task: it trains and tunes several models on the tabular data and selects the best one, so an HR team can get attrition predictions without building each model by hand.

Let's begin 🤓!

# import packages
import pandas as pd
from supervised import AutoML
from sklearn.metrics import accuracy_score

Load training data

Import the employee attrition dataset containing information about employee demographics, job roles, and departure status.

# load example dataset
train = pd.read_csv("https://raw.githubusercontent.com/pplonski/datasets-for-start/master/employee_attrition/HR-Employee-Attrition-train.csv")
# display DataFrame shape
print(f"Loaded data shape {train.shape}")
# display first rows
train.head()

Select X,y for ML training

Identify the feature variables (X), such as employee attributes, and the target variable (y), such as whether the employee left or stayed.

# create X columns list and set y column
x_cols = ["Age", "BusinessTravel", "DailyRate", "Department", "DistanceFromHome", "Education", "EducationField", "EmployeeCount", "EmployeeNumber", "EnvironmentSatisfaction", "Gender", "HourlyRate", "JobInvolvement", "JobLevel", "JobRole", "JobSatisfaction", "MaritalStatus", "MonthlyIncome", "MonthlyRate", "NumCompaniesWorked", "Over18", "OverTime", "PercentSalaryHike", "PerformanceRating", "RelationshipSatisfaction", "StandardHours", "StockOptionLevel", "TotalWorkingYears", "TrainingTimesLastYear", "WorkLifeBalance", "YearsAtCompany", "YearsInCurrentRole", "YearsSinceLastPromotion", "YearsWithCurrManager"]
y_col = "Attrition"
# set input matrix
X = train[x_cols]
# set target vector
y = train[y_col]
# display data shapes
print(f"X shape is {X.shape}")
print(f"y shape is {y.shape}")
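Before training, it can help to check two things about the selection above: whether any feature column holds a single value for every row (columns such as EmployeeCount, Over18, or StandardHours may turn out to be constant), and how balanced the target is. The sketch below is not part of the original notebook. The helper name `summarize_features` and the toy data are hypothetical, used only for illustration.

```python
import pandas as pd


def summarize_features(X: pd.DataFrame, y: pd.Series) -> dict:
    """Report constant feature columns and the class shares of the target."""
    # dropna=False counts NaN as a value, so all-NaN columns are flagged too
    n_unique = X.nunique(dropna=False)
    constant_cols = n_unique[n_unique <= 1].index.tolist()
    # fraction of rows that belong to each class
    class_share = y.value_counts(normalize=True).to_dict()
    return {"constant_cols": constant_cols, "class_share": class_share}


# tiny made-up frame, not taken from the real dataset
toy = pd.DataFrame(
    {
        "Age": [29, 41, 35, 50],
        "StandardHours": [80, 80, 80, 80],
        "Attrition": ["No", "Yes", "No", "No"],
    }
)
report = summarize_features(toy[["Age", "StandardHours"]], toy["Attrition"])
print(report)
```

In the notebook, the same helper could be called as `summarize_features(X, y)`. Constant columns carry no information for the model, and a strongly imbalanced target is worth keeping in mind when reading the accuracy score later.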

Fit AutoML

Train the AutoML model using the fit() method to predict employee attrition. Here the training is limited to 300 seconds and runs in the Explain mode.

# create automl object
automl = AutoML(total_time_limit=300, mode="Explain")
# train automl
automl.fit(X, y)

Load test data

Let's load the test data. It also contains the true labels in the Attrition column, so we can check the accuracy of the AutoML predictions later on.

# load example dataset
test = pd.read_csv("https://raw.githubusercontent.com/pplonski/datasets-for-start/master/employee_attrition/HR-Employee-Attrition-test.csv")
# display DataFrame shape
print(f"Loaded data shape {test.shape}")
# display first rows
test.head()

Compute predictions

Generate predictions on the test data to identify which employees are predicted to leave.

# predict with AutoML, passing only the feature columns used in training
predictions = automl.predict(test[x_cols])
# predicted values
print(predictions)
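The predictions above are class labels. For retention planning, a ranked list of employees by estimated risk is often more useful. The sketch below is an illustration only: it assumes you already have one probability of leaving per employee (for example from the `predict_proba` method of the AutoML object, after checking which output column corresponds to the "Yes" class). The helper name `rank_by_risk`, the threshold, and the numbers are hypothetical.

```python
import pandas as pd


def rank_by_risk(employee_ids, leave_proba, threshold=0.5):
    """Sort employees by probability of leaving and flag those at or above `threshold`."""
    ranked = pd.DataFrame(
        {"EmployeeNumber": list(employee_ids), "leave_proba": list(leave_proba)}
    )
    # flag employees whose estimated risk reaches the chosen threshold
    ranked["flagged"] = ranked["leave_proba"] >= threshold
    # highest risk first
    return ranked.sort_values("leave_proba", ascending=False).reset_index(drop=True)


# made-up probabilities for illustration only
ranked = rank_by_risk([101, 102, 103], [0.12, 0.81, 0.47], threshold=0.4)
print(ranked)
```

The threshold is a business choice: a lower value flags more employees at the cost of more false alarms.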

Compute accuracy

We need to retrieve the true values of employee attrition to compare with our predictions. After that, we compute the accuracy score.

# select the target column as a 1-D Series
true_values = test["Attrition"]
# display new data shape
print(f"true_values shape is {true_values.shape}")
# compute metric
metric_accuracy = accuracy_score(true_values, predictions)
print(f"Accuracy: {metric_accuracy}")
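Accuracy alone can look good when most employees stay, because a model that always predicts "No" already scores well. The sketch below, which is not part of the original notebook, compares accuracy with a majority-class baseline, recall for the "Yes" class, balanced accuracy, and the confusion matrix. The labels are made up for illustration. In the notebook you would pass `true_values` and `predictions` instead.

```python
from sklearn.metrics import (
    accuracy_score,
    balanced_accuracy_score,
    confusion_matrix,
    recall_score,
)

# made-up labels: 8 employees stayed, 2 left
y_true = ["No"] * 8 + ["Yes"] * 2
# the model misses one of the two leavers
y_pred = ["No"] * 8 + ["No", "Yes"]

labels = ["No", "Yes"]
acc = accuracy_score(y_true, y_pred)
# accuracy of always predicting the most frequent class
baseline = max(y_true.count(label) for label in labels) / len(y_true)
# share of actual leavers that the model found
recall_yes = recall_score(y_true, y_pred, pos_label="Yes")
# mean of per-class recall, less sensitive to class imbalance
bal_acc = balanced_accuracy_score(y_true, y_pred)
# rows are true classes, columns are predicted classes, in `labels` order
cm = confusion_matrix(y_true, y_pred, labels=labels)

print(f"Accuracy: {acc}, baseline: {baseline}")
print(f"Recall (Yes): {recall_yes}, balanced accuracy: {bal_acc}")
print(cm)
```

In this toy case the accuracy is only slightly above the baseline, and the recall shows that half of the leavers were missed.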

Conclusions

Using MLJAR AutoML makes predicting employee attrition easier. It automatically builds and fine-tunes models, so HR teams can quickly analyze employee data and spot those likely to leave. With AutoML, there's less need for manual data work, making it a handy tool for improving staff retention.

See you soon 👋.

Packages used in the automl-hr-use-case.ipynb

The following packages need to be installed in your Python environment to run this notebook: pandas, mljar-supervised (imported as supervised), and scikit-learn. Please note that MLJAR Studio automatically installs and imports required modules for you.