AutoML Sales Use Case

How AutoML Can Help

This example uses a bank marketing dataset. The goal is to predict, from data about a marketing campaign, whether a customer will subscribe to a term deposit. The dataset contains 45,211 records of bank customers, with attributes such as age, job, marital status, education level, and the outcome of previous campaigns. We will build predictive models that estimate how likely a customer is to subscribe, which can help banks refine their marketing strategy and run more effective campaigns.

AutoML is changing how banks approach data analysis, customer interaction, and marketing. By automating the model development process, MLJAR AutoML offers several advantages that can translate into business growth.

Business Value

25%
More Accurate

Compared to conventional techniques, MLJAR AutoML can increase customer segmentation accuracy by roughly 25%, allowing for more focused targeting and customized marketing campaigns.

20%
Higher ROIs

MLJAR AutoML's real-time campaign performance analysis can help boost marketing ROI by about 20% by offering actionable insights that allow strategies to be revised more quickly.

30%
Faster Production

MLJAR AutoML may speed up the generation of customized emails, advertisements, and social media posts by around 30%, hence improving efficiency and engagement.

40%
More Efficient

By using MLJAR AutoML to automate data analysis and predictive modeling, teams may increase productivity and efficiency by about 40% while concentrating more on strategic activities rather than tedious analysis.

AutoML Report

MLJAR AutoML generates comprehensive reports that give deep insight into model performance, data analysis, and evaluation metrics. Here are a few examples.

Leaderboard

AutoML used logloss to assess model performance (lower is better). The best model was 3_Default_Xgboost, as can be seen in the table and graph below; the Ensemble reached the same logloss value.

Best model  name                     model_type      metric_type  metric_value  train_time
            1_Baseline               Baseline        logloss      0.360857      1.16
            2_DecisionTree           Decision Tree   logloss      0.290959      19.17
the best    3_Default_Xgboost        Xgboost         logloss      0.195108      9.18
            4_Default_NeuralNetwork  Neural Network  logloss      0.22854       6.59
            5_Default_RandomForest   Random Forest   logloss      0.256554      7.77
            Ensemble                 Ensemble        logloss      0.195108      2.24
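
To make the logloss values easier to read, here is a minimal sketch of how binary logloss can be computed by hand. The labels and probabilities are hypothetical and only illustrate the formula; they are not taken from the report, and the helper name log_loss is chosen here for illustration.

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary logloss; y_true holds 0/1 labels, y_prob the predicted P(label == 1)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# hypothetical labels and predicted probabilities, for illustration only
y_true = [0, 0, 1, 1]
confident = [0.1, 0.2, 0.9, 0.8]    # mostly right and fairly sure
uninformed = [0.5, 0.5, 0.5, 0.5]   # always predicts 50%

print(log_loss(y_true, confident))   # lower is better
print(log_loss(y_true, uninformed))  # equals ln(2), about 0.693
```

A model that always outputs 0.5 scores ln(2), so confident and correct probabilities push the value toward 0, while confident mistakes are penalized heavily.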

AutoML Performance


Spearman Correlation of Models

The heatmap shows the pairwise Spearman correlation coefficients between the models' predictions. Each cell gives the strength and direction of the monotonic relationship between the predictions of two models. Values close to 1 indicate a strong positive correlation, and values close to 0 indicate a weak correlation or none at all. The heatmap makes it easier to see how similarly the models rank the data points.

models spearman correlation
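
As an illustration of what each cell in such a heatmap measures, the sketch below computes the Spearman correlation between two prediction vectors by ranking the values and then taking the Pearson correlation of the ranks. The prediction values and the model_a/model_b/model_c names are hypothetical; this is not the code used to produce the report.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def spearman(a, b):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(a), average_ranks(b))

# hypothetical predicted probabilities from three models on the same five rows
model_a = [0.10, 0.35, 0.80, 0.55, 0.20]
model_b = [0.05, 0.30, 0.95, 0.70, 0.40]
model_c = [0.90, 0.60, 0.10, 0.40, 0.70]

print(spearman(model_a, model_b))  # close to 1: similar ordering of rows
print(spearman(model_a, model_c))  # close to -1: reversed ordering
```

Because only the ordering matters, two models can produce very different probability values and still be highly correlated in this sense.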

Feature Importance

The graph shows how much each feature contributes to the model's prediction accuracy. Each feature is given a value based on how much it adds to the predictive power of the model, and larger values indicate greater importance. This visualization helps identify which features have the greatest impact on the model's performance, which in turn guides feature selection and model interpretation.

Feature Importance across models
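
One common way to obtain scores of this kind is permutation importance: shuffle one feature column at a time and measure how much the model's accuracy drops. The sketch below shows the idea on a toy model and randomly generated data. Everything in it (toy_model, the data, the helper names) is hypothetical, and whether it matches the exact procedure behind the chart above is an assumption.

```python
import random

def accuracy(model, rows, labels):
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(model, rows, labels, n_features, seed=0):
    """Drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        rng.shuffle(column)  # break the link between feature j and the label
        shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
        importances.append(baseline - accuracy(model, shuffled, labels))
    return importances

# hypothetical toy model: predicts 1 when feature 0 is above 0.5, ignores feature 1
def toy_model(row):
    return int(row[0] > 0.5)

data_rng = random.Random(42)
rows = [(data_rng.random(), data_rng.random()) for _ in range(200)]
labels = [int(r[0] > 0.5) for r in rows]

scores = permutation_importance(toy_model, rows, labels, n_features=2)
print(scores)  # feature 0 gets a large score, feature 1 gets exactly 0
```

A feature the model never uses gets a score of zero, since shuffling it cannot change any prediction.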

Install and import necessary packages

Install the packages with the command:

pip install mljar-supervised scikit-learn

Import the packages into your code:

# import packages
from supervised import AutoML
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

Load data

Import relevant data from an OpenML dataset.

# load dataset
data = fetch_openml(data_id=1461, as_frame=True)
X = data.data
y = data.target
# display data shape
print(f"Loaded X shape {X.shape}")
print(f"Loaded y shape {y.shape}")
# display first rows
X.head()
Loaded X shape (45211, 16)
Loaded y shape (45211,)
   V1  V2            V3       V4         V5  V6    V7   V8   V9       V10  V11  V12  V13  V14  V15  V16
0  58  management    married  tertiary   no  2143  yes  no   unknown  5    may  261  1    -1   0    unknown
1  44  technician    single   secondary  no  29    yes  no   unknown  5    may  151  1    -1   0    unknown
2  33  entrepreneur  married  secondary  no  2     yes  yes  unknown  5    may  76   1    -1   0    unknown
3  47  blue-collar   married  unknown    no  1506  yes  no   unknown  5    may  92   1    -1   0    unknown
4  33  unknown       single   unknown    no  1     no   no   unknown  5    may  198  1    -1   0    unknown

Split dataframe to train/test

We split the dataframe into separate train and test sets: one for training the model and one for evaluating it. This lets us assess the model's performance on unseen data.

This step is essential when you have only one base dataset.

# split data
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.90, shuffle=True, stratify=y, random_state=42)
# display data shapes
print(f"X_train shape {X_train.shape}")
print(f"y_train shape {y_train.shape}")
print(f"X_test shape {X_test.shape}")
print(f"y_test shape {y_test.shape}")
X_train shape (40689, 16)
y_train shape (40689,)
X_test shape (4522, 16)
y_test shape (4522,)
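
Because the split uses stratify=y, the class proportions in the train and test sets should be nearly identical. A minimal sketch of how that could be checked is shown below; the label lists are hypothetical stand-ins for y_train and y_test, and class_shares is a helper name introduced here for illustration.

```python
from collections import Counter

def class_shares(labels):
    """Return the fraction of samples that belongs to each class."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}

# hypothetical label lists standing in for y_train and y_test
y_train_demo = ["1"] * 90 + ["2"] * 10
y_test_demo = ["1"] * 9 + ["2"] * 1

print(class_shares(y_train_demo))
print(class_shares(y_test_demo))
```

With the real data, the same check could be run as class_shares(y_train) and class_shares(y_test).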

Fit AutoML

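We need to train a model for our dataset. The fit() method handles model training and optimization automatically. Here the total training time is limited to 600 seconds (total_time_limit) and AutoML runs in the "Explain" mode.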

# create automl object
automl = AutoML(total_time_limit=600, mode="Explain")
# train automl
automl.fit(X_train, y_train)

Compute predictions

Generate predictions on the test data and display the results.

# predict with AutoML
predictions = automl.predict(X_test)
# predicted values
print(predictions)
['1' '1' '1' ... '1' '1' '1']

Compute accuracy

We compute the accuracy score by comparing the true values (y_test) with our predictions.

# compute metric
metric_accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy: {metric_accuracy}")
Accuracy: 0.9040247678018576
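
Accuracy alone can hide weak performance on a rarer class, so it may be worth also looking at per-class precision and recall. The sketch below computes both by hand on hypothetical labels. The helper precision_recall and the demo lists are illustrative only, and which of the labels marks a subscription should be checked against the dataset description.

```python
def precision_recall(y_true, y_pred, positive):
    """Precision and recall for the class given as `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# hypothetical labels: "2" plays the role of the rarer class
y_true_demo = ["1", "1", "1", "1", "1", "1", "1", "1", "2", "2"]
y_pred_demo = ["1", "1", "1", "1", "1", "1", "1", "2", "2", "1"]

accuracy_demo = sum(t == p for t, p in zip(y_true_demo, y_pred_demo)) / len(y_true_demo)
print(f"Accuracy: {accuracy_demo}")
print(precision_recall(y_true_demo, y_pred_demo, positive="2"))
```

In this toy case the accuracy is 0.8 even though only half of the rarer class is found, which is the kind of gap these two metrics are meant to expose.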

Conclusions

MLJAR AutoML offers significant advantages in sales and marketing. By automating complex data analysis, it enables accurate forecasts, targeted marketing campaigns, and personalized customer interactions. Because it handles large datasets efficiently, AutoML can find patterns and insights that traditional methods might miss. As the technology develops, its influence on sales and marketing is likely to grow, improving customer engagement and leading to more successful business strategies.

See you soon👋.