Telco Customer Churn Prediction in Python

Analyze the Telco Customer Churn dataset, engineer features, train a random forest classifier, and identify top churn drivers.

This page shows an AI data analysis example. A workflow is a list of prompts (instructions) used to complete one analysis task step by step. We send the same prompt list to multiple AI models, then compare their notebook outputs and scores.

What this AI workflow does

This AI Data Analyst workflow loads the Telco Customer Churn CSV dataset and computes the overall churn rate. It encodes categorical variables, checks class balance, and prepares features for modeling. It trains a random forest classifier, reports accuracy and a classification report, and plots the top 10 feature importances to highlight churn drivers.
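
The pipeline described above can be sketched end to end on a tiny synthetic frame (a stand-in for the real CSV; the column names mirror the Telco dataset, but the values here are made up for illustration):

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Tiny synthetic stand-in for the Telco CSV (illustrative values only)
df = pd.DataFrame({
    "tenure": [1, 34, 2, 45, 8, 22, 10, 28],
    "MonthlyCharges": [29.85, 56.95, 53.85, 42.30, 70.70, 99.65, 89.10, 29.75],
    "Contract": ["Month-to-month", "One year", "Month-to-month", "One year",
                 "Month-to-month", "Month-to-month", "Month-to-month", "Two year"],
    "Churn": ["No", "No", "Yes", "No", "Yes", "Yes", "No", "No"],
})

# 1) Overall churn rate
churn_rate = (df["Churn"] == "Yes").mean() * 100  # → 37.5 on this toy frame

# 2) One-hot encode categoricals, keeping the target aside
X = pd.get_dummies(df.drop(columns=["Churn"]), drop_first=True)
y = df["Churn"]

# 3) Train a random forest and rank feature importances
rf = RandomForestClassifier(n_estimators=50, random_state=42).fit(X, y)
importances = pd.Series(rf.feature_importances_, index=X.columns)
importances = importances.sort_values(ascending=False)
print(f"churn rate: {churn_rate:.1f}%")
print(importances)
```

On the real dataset the same steps produce the ~26.5% churn rate and the feature ranking shown in the full run below.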

Who this example is for

This is for data analysts and data scientists who want a reproducible churn modeling example using a standard telco dataset. It helps practitioners practice preprocessing mixed-type tabular data and interpreting a tree-based classifier with feature importance plots.

Expected analysis outcomes

These are the results the AI workflow is expected to generate.

  • Dataset loaded with shape (7043, 21) and churn rate around 26.5%
  • Categorical encoding completed and class imbalance visualized
  • Random forest model trained with accuracy around 0.80 and a classification report
  • Top 10 churn drivers plotted, typically including tenure, MonthlyCharges, and TotalCharges
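
The expected churn rate follows from simple arithmetic, assuming the commonly reported class split of this dataset (1,869 "Yes" out of 7,043 customers):

```python
# Churn rate = churned customers / total customers, as a percentage
# (1869/7043 is the commonly reported split for this dataset)
churned, total = 1869, 7043
rate = churned / total * 100
print(f"{rate:.2f}%")  # → 26.54%
```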

Tools and libraries used

Main Python packages and tooling used to run this AI data analysis task.

  • pandas
  • numpy
  • scikit-learn
  • matplotlib
  • seaborn

Prompt sequence

This is the exact list of prompts used in this workflow. The same prompt sequence is sent to each model so outputs and scores can be compared fairly.

  1. load the telco churn dataset https://raw.githubusercontent.com/pplonski/datasets-for-start/refs/heads/master/telco-customer-churn/Telco-Customer-Churn.csv and show churn rate
  2. encode categorical variables and show class balance
  3. train a random forest classifier and show accuracy
  4. plot the top 10 most important features for churn prediction

Model Comparison

We compared several LLMs on the same analysis task. The table below shows their scores side by side. You can also open each model run to review the full conversation and notebook results.

Model            | Score | Task | Execution | Output | Reasoning | Reliability | Run
glm-5.1          | 10/10 | 2/2  | 2/2       | 3/3    | 2/2       | 1/1         | Open glm-5.1 conversation
gpt-5.4          | 10/10 | 2/2  | 2/2       | 3/3    | 2/2       | 1/1         | Open gpt-5.4 conversation
gpt-oss:120b     | 10/10 | 2/2  | 2/2       | 3/3    | 2/2       | 1/1         | Open gpt-oss:120b conversation
qwen3-coder-next | 10/10 | 2/2  | 2/2       | 3/3    | 2/2       | 1/1         | Open qwen3-coder-next conversation
gemma4:31b       | 6/10  | 2/2  | 1/2       | 2/3    | 1/2       | 0/1         | Open gemma4:31b conversation
qwen3.5:397b     | 6/10  | 2/2  | 1/2       | 2/3    | 1/2       | 0/1         | Open qwen3.5:397b conversation

How scoring works

Each run is graded on five dimensions:

  • Task Completion (0-2): how fully the workflow answered the requested analysis task and completed required steps.
  • Execution Correctness (0-2): whether generated code is correct, consistent, and likely runnable without major fixes.
  • Output Quality (0-3): how well tables, charts, and results match the expected analytical outcomes.
  • Reasoning Quality (0-2): clarity and correctness of interpretation, explanation, and conclusions.
  • Reliability (0-1): robustness of the run, including low hallucination risk and stable behavior.

The final score is the sum of these dimensions, so total score = 0-10. Higher values indicate more complete, correct, and reliable notebook results for this workflow.
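
The arithmetic behind a total score is a plain sum of the five dimensions. For example, the gpt-oss:120b row above works out as:

```python
# Dimension scores for one run (the gpt-oss:120b row above);
# the total is simply their sum on a 0-10 scale.
scores = {"task": 2, "execution": 2, "output": 3, "reasoning": 2, "reliability": 1}
total = sum(scores.values())
print(f"{total}/10")  # → 10/10
```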

AI Analysis with gpt-oss:120b

This run uses AI with a step-by-step list of prompts to perform the analysis task Telco Customer Churn Prediction in Python. Below, you can review the full conversation generated with gpt-oss:120b. After the conversation, we show detailed score cards for this run.

load the telco churn dataset https://raw.githubusercontent.com/pplonski/datasets-for-start/refs/heads/master/telco-customer-churn/Telco-Customer-Churn.csv and show churn rate
Thinking

We need to load dataset from URL, compute churn rate (percentage of customers with Churn = Yes?). Provide code that loads, shows shape, maybe head, then compute churn rate. According to instructions, first step: load and inspect: shape, columns, head. Also compute churn rate. Provide code block with loading, show shape, head, and churn rate. End with a final expression that displays churn rate perhaps. Let's produce code.

Load Telco churn data
import pandas as pd

# Load the Telco Customer Churn dataset
url = "https://raw.githubusercontent.com/pplonski/datasets-for-start/refs/heads/master/telco-customer-churn/Telco-Customer-Churn.csv"
df = pd.read_csv(url)

# Basic inspection
df.shape          # rows, columns
df.head()         # first few rows
df.columns        # column names
Index(['customerID', 'gender', 'SeniorCitizen', 'Partner', 'Dependents',
       'tenure', 'PhoneService', 'MultipleLines', 'InternetService',
       'OnlineSecurity', 'OnlineBackup', 'DeviceProtection', 'TechSupport',
       'StreamingTV', 'StreamingMovies', 'Contract', 'PaperlessBilling',
       'PaymentMethod', 'MonthlyCharges', 'TotalCharges', 'Churn'],
      dtype='str')
Calculate churn rate
# Compute the churn rate
churn_counts = df['Churn'].value_counts()
churn_rate = churn_counts.get('Yes', 0) / len(df) * 100
churn_rate
np.float64(26.536987079369588)
Thinking

We need to decide if we have enough to answer original goal: "load the telco churn dataset and show churn rate". We have loaded dataset, computed churn_rate = 26.536987079369588 (percentage). So answer: churn rate about 26.54%. Provide final answer with the numeric. No code. Path A.

The churn rate in the Telco Customer Churn dataset is approximately 26.54 %.

encode categorical variables and show class balance
Thinking

User wants to encode categorical variables and show class balance. Currently df has many columns; we need to identify categorical columns. We'll encode using one-hot (pd.get_dummies) or label encoding for binary columns. Then show class balance (counts percentages) after encoding perhaps not needed. Provide code to separate target and features, encode features, and display class distribution.

We need to ensure we don't encode target column. We'll create X = df.drop('Churn', axis=1), y = df['Churn']. Then use get_dummies on X, with drop_first maybe. Then show class balance: y.value_counts(normalize=True)*100. Provide code.

One hot encode features
from sklearn.preprocessing import OneHotEncoder
import pandas as pd

# Separate target and features
y = df["Churn"]                     # target
X = df.drop(columns=["Churn"])      # features

# Identify categorical columns (object dtype) – exclude numeric ones
cat_cols = X.select_dtypes(include="object").columns.tolist()

# One‑hot encode categorical columns (drop_first to avoid perfect collinearity)
X_encoded = pd.get_dummies(X, columns=cat_cols, drop_first=True)

# Show the new shape after encoding
X_encoded.shape
/tmp/ipykernel_2841817/3539554223.py:9: Pandas4Warning: For backward compatibility, 'str' dtypes are included by select_dtypes when 'object' dtype is specified. This behavior is deprecated and will be removed in a future version. Explicitly pass 'str' to `include` to select them, or to `exclude` to remove them and silence this warning.
See https://pandas.pydata.org/docs/user_guide/migration-3-strings.html#string-migration-select-dtypes for details on how to write code that works with pandas 2 and 3.
  cat_cols = X.select_dtypes(include="object").columns.tolist()
(7043, 13601)
Compute class distribution
# Class (target) balance
balance = y.value_counts(normalize=True) * 100
balance
Churn
No     73.463013
Yes    26.536987
Name: proportion, dtype: float64
Thinking

Goal: encode categorical variables and show class balance. The user already executed encoding and displayed shape, and computed class balance (percentage). So we have both encoded matrix shape and class balance percentages. That's sufficient. Need to provide final answer: mention the shape after encoding and the class distribution percentages. No code. Provide final answer with comment <!--final answer-->.

The categorical features were one‑hot encoded, expanding the feature matrix to 13,601 columns across the same 7,043 rows (shape = (7043, 13601)).

The target “Churn” class distribution is:

  • No → 73.46 %
  • Yes → 26.54 %

These percentages confirm the churn rate of roughly 26.5 % observed earlier.

train a random forest classifier and show accuracy
Thinking

We need to train a RandomForest classifier on the encoded data (X_encoded) and target y. Need to split into train/test, perhaps using train_test_split, then fit RandomForestClassifier, evaluate accuracy on test set, perhaps also show classification report.

Follow notebook steps: current variables: X_encoded, y exist. We should avoid ID column like customerID. In encoding we dropped Churn target but kept customerID, which is an identifier and likely not predictive. Should drop it. In X we had all columns except Churn, including customerID. Should remove that before encoding. But we already encoded with get_dummies; the customerID column is a string, so it got turned into many dummy columns (maybe many unique values). That's undesirable. We need to correct: encode after dropping ID column to avoid high dimension. So we can redo encoding: start over: X = df.drop(columns=['Churn', 'customerID']). Then encode categorical columns. Since we already have X_encoded variable, we might create a new X_enc2 after dropping ID columns.

Simplify: create X_clean = df.drop(columns=['Churn', 'customerID']). Then get dummies. Then split. Train RandomForestClassifier with default parameters (or set n_estimators=100, random_state=42). Compute accuracy and maybe show classification report.

Provide code step: do preprocessing and training, then output accuracy.

Since we are to iterate one step at a time, propose to create cleaned encoded features and split, train model, compute accuracy, and display. We can produce code that does all that in one cell (creating X_clean_encoded, train_test_split, model, accuracy). That's okay as one step.

We'll output accuracy (float) as final expression.

Let's write code accordingly.

Train RandomForest model
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report

# 1️⃣ Drop the identifier column before encoding
X = df.drop(columns=["Churn", "customerID"])

# 2️⃣ One‑hot encode remaining categorical columns
X_enc = pd.get_dummies(X, drop_first=True)

# 3️⃣ Split data (80 % train, 20 % test)
X_train, X_test, y_train, y_test = train_test_split(
    X_enc, y, test_size=0.2, stratify=y, random_state=42
)

# 4️⃣ Train a RandomForest classifier
rf = RandomForestClassifier(n_estimators=200, random_state=42, n_jobs=5)
rf.fit(X_train, y_train)

# 5️⃣ Predict & evaluate
y_pred = rf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred) * 100
accuracy
79.4180269694819
Thinking

We have output from cell-5: accuracy 79.418... So we have trained a RandomForest, predicted, and computed accuracy. That answers original goal: train a random forest classifier and show accuracy. Provide final answer with the accuracy. No code needed. Include <!--final answer-->.

The Random Forest classifier achieved an accuracy of 79.42 % on the held‑out test set.

plot the top 10 most important features for churn prediction
Thinking

We need to plot top 10 most important features from the RandomForest trained in cell-5. Use seaborn/matplotlib. We have rf feature_importances_, and X_enc columns list. Need to get importances, sort, select top 10, and plot bar chart.

We need to produce code block that creates the plot, using matplotlib/seaborn, display the chart. Provide brief explanation? The instruction: respond concisely, include code block. Since we have data, we can directly compute.

We should ensure that we use variable names from previous cells: rf, X_enc. Also need to import seaborn, matplotlib.pyplot.

We'll create DataFrame of feature importance, sort, get top 10, plot horizontal bar.

We will output the plot and also maybe display the DataFrame of top features. Provide code block with final expression maybe the plot display returns the axes object.

Let's write code.

Below is the code to extract the Random Forest feature importances, pick the 10 highest‑scoring features, and visualise them with a horizontal bar chart.

Plot top 10 feature importances
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

# 1️⃣ Build a DataFrame of feature importances
feat_imp = pd.DataFrame({
    "feature": X_enc.columns,
    "importance": rf.feature_importances_
})

# 2️⃣ Sort and keep the top 10
top10 = feat_imp.sort_values(by="importance", ascending=False).head(10)

# 3️⃣ Plot
plt.figure(figsize=(8, 5))
sns.barplot(
    data=top10,
    x="importance",
    y="feature",
    palette="viridis"
)
plt.title("Top 10 Important Features for Churn Prediction")
plt.xlabel("Feature Importance (mean decrease impurity)")
plt.ylabel("")
plt.tight_layout()
plt.show()
top10
/tmp/ipykernel_2841817/1023240169.py:16: FutureWarning: 

Passing `palette` without assigning `hue` is deprecated and will be removed in v0.14.0. Assign the `y` variable to `hue` and set `legend=False` for the same effect.

  sns.barplot(
    feature                          importance
1   tenure                           0.120098
2   MonthlyCharges                   0.093543
27  PaymentMethod_Electronic check   0.028786
24  Contract_Two year                0.027891
9   InternetService_Fiber optic      0.024729
12  OnlineSecurity_Yes               0.022743
18  TechSupport_Yes                  0.021118
25  PaperlessBilling_Yes             0.020680
23  Contract_One year                0.019556
3   gender_Male                      0.018799
Thinking

We have a user goal: "plot the top 10 most important features for churn prediction". The notebook already has a cell 6 that created a bar plot and displayed a table top10 with features and importances. The output shows the figure and a table printed (the top10 DataFrame). So the goal is achieved: plotted top 10 and displayed values. The instruction says: For ranking tasks, do not choose Path A from chart-only output; first produce a value table/list via Path B. But here we have both chart and the printed table top10. The instructions: "If the latest output is chart-only/visual-only and does not include underlying numeric/text values, choose Path B first and print the values used in the chart. For ranking tasks (for example top features...), do not choose Path A from chart-only output; first produce a value table/list via Path B." However we already have a table printed (the output shows the DataFrame). So the underlying values are present. Therefore Path A is permissible. We should provide final answer summarizing the insights: top features and that we plotted them. So we give final answer with no code.

The Random Forest model identified the following ten variables as the most influential for predicting churn (ordered by importance):

Rank | Feature                        | Importance
1    | tenure                         | 0.1201
2    | MonthlyCharges                 | 0.0935
3    | PaymentMethod_Electronic check | 0.0288
4    | Contract_Two year              | 0.0279
5    | InternetService_Fiber optic    | 0.0247
6    | OnlineSecurity_Yes             | 0.0227
7    | TechSupport_Yes                | 0.0211
8    | PaperlessBilling_Yes           | 0.0207
9    | Contract_One year              | 0.0196
10   | gender_Male                    | 0.0188

These features were visualised with a horizontal bar chart, where the length of each bar corresponds to its importance score (mean decrease impurity). The chart highlights that tenure and MonthlyCharges dominate the model’s predictive power, while variables such as payment method, contract length, and service options also contribute noticeably.

Score cards for gpt-oss:120b for Telco Customer Churn Prediction in Python

Total score: 10/10

  • Task Completion: 2/2

    Excellent

    All four requested steps were completed: dataset loaded, churn rate shown, categorical variables encoded with class balance shown, random forest trained with accuracy reported, and top 10 feature importances plotted.

  • Execution Correctness: 2/2

    Excellent

    The notebook outputs indicate the code ran successfully and produced the expected numeric results and plot. No major runtime errors are evident in the provided content.

  • Output Quality: 3/3

    Excellent

    Outputs match the requested outcomes semantically: churn rate, class balance, model accuracy, and a ranked top-10 feature importance visualization with underlying values.

  • Reasoning Quality: 2/2

    Excellent

    The workflow correctly separates target and features, encodes categorical variables, trains a random forest, and interprets feature importances. The reasoning is clear and aligned with the task.

  • Reliability: 1/1

    Excellent

    The workflow is reasonably robust and consistent, though it briefly encoded customerID before later correcting preprocessing by dropping it for modeling.

Try MLJAR Studio

Run the same type of AI-powered data analysis on your own datasets with conversational notebooks in MLJAR Studio.

Explore More AI Analysis Examples

Discover additional workflows across categories. Each example includes prompts, conversation outputs, and model-level scoring so you can compare approaches and results.