Telco Customer Churn Prediction in Python

Analyze the Telco Customer Churn dataset, engineer features, train a random forest classifier, and identify top churn drivers.

This page shows an AI data analysis example. A workflow is a list of prompts (instructions) used to complete one analysis task step by step. We send the same prompt list to multiple AI models, then compare their notebook outputs and scores.

What this AI workflow does

This AI Data Analyst workflow loads the Telco Customer Churn CSV dataset and computes the overall churn rate. It encodes categorical variables, checks class balance, and prepares features for modeling. It trains a random forest classifier, reports accuracy and a classification report, and plots the top 10 feature importances to highlight churn drivers.

Who this example is for

This is for data analysts and data scientists who want a reproducible churn modeling example using a standard telco dataset. It helps practitioners practice preprocessing mixed-type tabular data and interpreting a tree-based classifier with feature importance plots.

Expected analysis outcomes

These are the results the AI workflow is expected to generate.

  • Dataset loaded with shape (7043, 21) and churn rate around 26.5%
  • Categorical encoding completed and class imbalance visualized
  • Random forest model trained with accuracy around 0.80 and a classification report
  • Top 10 churn drivers plotted, typically including tenure, MonthlyCharges, and TotalCharges

Tools and libraries used

Main Python packages and tooling used to run this AI data analysis task.

  • pandas
  • numpy
  • scikit-learn
  • matplotlib
  • seaborn

Prompt sequence

This is the exact list of prompts used in this workflow. The same prompt sequence is sent to each model so outputs and scores can be compared fairly.

  1. load the telco churn dataset https://raw.githubusercontent.com/pplonski/datasets-for-start/refs/heads/master/telco-customer-churn/Telco-Customer-Churn.csv and show churn rate
  2. encode categorical variables and show class balance
  3. train a random forest classifier and show accuracy
  4. plot the top 10 most important features for churn prediction

Model Comparison

We compared several LLMs on the same analysis task. The table below shows their scores side by side. You can also open each model run to review the full conversation and notebook results.

Model              Score  Task  Execution  Output  Reasoning  Reliability  Run
glm-5.1            10/10  2/2   2/2        3/3     2/2        1/1          Open glm-5.1 conversation
gpt-5.4            10/10  2/2   2/2        3/3     2/2        1/1          Open gpt-5.4 conversation
gpt-oss:120b       10/10  2/2   2/2        3/3     2/2        1/1          Open gpt-oss:120b conversation
qwen3-coder-next   10/10  2/2   2/2        3/3     2/2        1/1          Open qwen3-coder-next conversation
gemma4:31b         6/10   2/2   1/2        2/3     1/2        0/1          Open gemma4:31b conversation
qwen3.5:397b       6/10   2/2   1/2        2/3     1/2        0/1          Open qwen3.5:397b conversation

How scoring works

Each run is graded on five dimensions:

  • Task Completion (0-2): how fully the workflow answered the requested analysis task and completed required steps.
  • Execution Correctness (0-2): whether generated code is correct, consistent, and likely runnable without major fixes.
  • Output Quality (0-3): how well tables, charts, and results match the expected analytical outcomes.
  • Reasoning Quality (0-2): clarity and correctness of interpretation, explanation, and conclusions.
  • Reliability (0-1): robustness of the run, including low hallucination risk and stable behavior.

The final score is the sum of these dimensions, so total score = 0-10. Higher values indicate more complete, correct, and reliable notebook results for this workflow.
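The summation is straightforward; as a minimal sketch, the total for a perfect run (values taken from the glm-5.1 row in the comparison table) can be reproduced like this:

```python
# Per-dimension scores for one run (values from the glm-5.1 row above)
scores = {
    "task_completion": 2,        # max 2
    "execution_correctness": 2,  # max 2
    "output_quality": 3,         # max 3
    "reasoning_quality": 2,      # max 2
    "reliability": 1,            # max 1
}

total = sum(scores.values())
print(f"Total score: {total}/10")
```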

AI Analysis with qwen3-coder-next

This run uses AI with a step-by-step list of prompts to perform the analysis task Telco Customer Churn Prediction in Python. Below, you can review the full conversation generated with qwen3-coder-next. After the conversation, we show detailed score cards for this run.

load the telco churn dataset https://raw.githubusercontent.com/pplonski/datasets-for-start/refs/heads/master/telco-customer-churn/Telco-Customer-Churn.csv and show churn rate
Load Telco churn data and compute churn rate
import pandas as pd

# Load the Telco Customer Churn dataset
url = "https://raw.githubusercontent.com/pplonski/datasets-for-start/refs/heads/master/telco-customer-churn/Telco-Customer-Churn.csv"
df = pd.read_csv(url)

# Show basic info and churn rate
print(f"Dataset shape: {df.shape}")
print(f"Columns: {list(df.columns)}")
print("\nChurn Rate:")
churn_rate = df['Churn'].value_counts(normalize=True) * 100
churn_rate
Dataset shape: (7043, 21)
Columns: ['customerID', 'gender', 'SeniorCitizen', 'Partner', 'Dependents', 'tenure', 'PhoneService', 'MultipleLines', 'InternetService', 'OnlineSecurity', 'OnlineBackup', 'DeviceProtection', 'TechSupport', 'StreamingTV', 'StreamingMovies', 'Contract', 'PaperlessBilling', 'PaymentMethod', 'MonthlyCharges', 'TotalCharges', 'Churn']

Churn Rate:
Churn
No     73.463013
Yes    26.536987
Name: proportion, dtype: float64

The dataset has been successfully loaded, and the churn rate has been computed and displayed. The output shows that approximately 26.54% of customers churned (labeled as Yes), while 73.46% did not churn (No). This directly answers the user’s goal.

The Telco Customer Churn dataset contains 7,043 rows and 21 columns. The churn rate is 26.54% — meaning about 1 in 4 customers left the service.

encode categorical variables and show class balance

I'll encode the categorical variables and show the class balance of the target (Churn) column.

First, let’s identify the categorical columns and encode them using label encoding or one-hot encoding — label encoding is simpler and sufficient for binary targets like Churn. However, for features with more than two categories, one-hot encoding is preferred to avoid imposing false ordinality.

We’ll:

  1. Identify categorical columns (excluding customerID, which is an ID and not useful).
  2. Encode them using pd.get_dummies() (one-hot encoding).
  3. Show class balance for the Churn column (as proportions).
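The ordinality concern mentioned above can be illustrated on a toy column; this is a side sketch for clarity, not part of the recorded run:

```python
import pandas as pd

# Toy three-category column, similar to Contract in the Telco data
toy = pd.DataFrame({"Contract": ["Month-to-month", "One year", "Two year"]})

# Label encoding assigns integers, which implies an order (0 < 1 < 2)
codes, _ = pd.factorize(toy["Contract"])
print(list(codes))  # [0, 1, 2]

# One-hot encoding gives each category its own indicator column instead
onehot = pd.get_dummies(toy, columns=["Contract"])
print(list(onehot.columns))
```

For a binary column like Churn, both encodings are equivalent, which is why mapping it directly to 0/1 is fine.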
Prepare and encode churn dataset
# Copy the dataset to avoid modifying original
df_encoded = df.copy()

# Drop 'customerID' as it's just an identifier
df_encoded.drop(columns=['customerID'], inplace=True)

# Convert 'TotalCharges' to numeric, coercing errors to NaN
df_encoded['TotalCharges'] = pd.to_numeric(df_encoded['TotalCharges'], errors='coerce')

# Encode 'Churn' to binary (0 = No, 1 = Yes) first
df_encoded['Churn'] = df_encoded['Churn'].map({'No': 0, 'Yes': 1})

# Now one-hot encode all remaining object columns (categorical features)
categorical_cols = df_encoded.select_dtypes(include=['object']).columns
df_encoded = pd.get_dummies(df_encoded, columns=categorical_cols)

# Show class balance for Churn
class_balance = df_encoded['Churn'].value_counts(normalize=True) * 100
class_balance
/var/folders/pp/x90pfm0140d4glxshn77bthh0000gn/T/ipykernel_5641/2744803467.py:14: Pandas4Warning: For backward compatibility, 'str' dtypes are included by select_dtypes when 'object' dtype is specified. This behavior is deprecated and will be removed in a future version. Explicitly pass 'str' to `include` to select them, or to `exclude` to remove them and silence this warning.
See https://pandas.pydata.org/docs/user_guide/migration-3-strings.html#string-migration-select-dtypes for details on how to write code that works with pandas 2 and 3.
  categorical_cols = df_encoded.select_dtypes(include=['object']).columns
Churn
0    73.463013
1    26.536987
Name: proportion, dtype: float64

The latest execution successfully encoded the categorical variables and computed the class balance for Churn. The output confirms:

  • Categorical features (e.g., gender, Contract, PaymentMethod) were one-hot encoded.
  • Churn was mapped to binary values (No → 0, Yes → 1).
  • Class balance shows 73.46% of customers are non-churners (0) and 26.54% are churners (1), indicating a moderate class imbalance.

This fully satisfies the original goal: encoding categorical variables and showing class balance.

The dataset has been successfully encoded: all categorical features (except Churn) are one-hot encoded, and Churn is converted to binary (0 = No, 1 = Yes). Class balance shows ~73.5% non-churners and ~26.5% churners, indicating the target is imbalanced but not extremely skewed.
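One caveat worth noting: the errors='coerce' step leaves NaN values in TotalCharges for rows where the field is blank, and many scikit-learn estimators reject NaN inputs. A minimal sketch of a common fix (median imputation), shown on a toy frame rather than the run's df_encoded:

```python
import pandas as pd

# Toy frame mimicking the blank-TotalCharges rows in the Telco data
df = pd.DataFrame({"TotalCharges": ["29.85", " ", "1889.5", " "]})

# Blank strings become NaN after coercion, just as in the run above
df["TotalCharges"] = pd.to_numeric(df["TotalCharges"], errors="coerce")
print(df["TotalCharges"].isna().sum())  # 2 missing values

# A common fix: fill missing totals with the column median
df["TotalCharges"] = df["TotalCharges"].fillna(df["TotalCharges"].median())
print(df["TotalCharges"].isna().sum())  # 0
```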

train a random forest classifier and show accuracy

We'll now train a Random Forest classifier to predict Churn using the encoded dataset (df_encoded). We'll:

  1. Split features (X) and target (y),
  2. Exclude any identifier/leakage columns (none remain),
  3. Train a RandomForestClassifier,
  4. Evaluate accuracy.

We’ll use scikit-learn’s standard pipeline.

Train Random Forest model for churn prediction
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Prepare features (X) and target (y)
X = df_encoded.drop(columns=['Churn'])
y = df_encoded['Churn']

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)

# Train Random Forest
rf = RandomForestClassifier(random_state=42)
rf.fit(X_train, y_train)

# Predict and compute accuracy
y_pred = rf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
accuracy
0.7799858055358411

The Random Forest classifier was successfully trained, and the accuracy on the test set is 0.780 (or 78.0%), as shown in output [cell-3]. This directly fulfills the original goal of training a Random Forest classifier and showing its accuracy.

The Random Forest classifier achieved an accuracy of 78.0% on the test set (0.780), indicating reasonable predictive performance for churn prediction.
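The expected outcomes also mention a classification report, which this run did not print. A self-contained sketch of how it could be added, using a synthetic imbalanced dataset as a stand-in (with the run above, the trained rf, X_test, and y_test would slot in directly):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the encoded Telco features, ~73/27 imbalance like Churn
X, y = make_classification(n_samples=1000, n_features=20, weights=[0.73], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

rf = RandomForestClassifier(random_state=42)
rf.fit(X_train, y_train)

# Per-class precision, recall, and F1 are more informative than accuracy alone
report = classification_report(y_test, rf.predict(X_test), target_names=["No churn", "Churn"])
print(report)
```

On an imbalanced target, recall for the minority (churn) class is usually the number to watch, since accuracy is dominated by the majority class.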

plot the top 10 most important features for churn prediction

We now have the trained Random Forest model (rf) and can extract feature importances.

Next step: Plot the top 10 most important features using matplotlib/seaborn.

Let’s:

  • Extract feature importances from the model,
  • Combine with feature names from X.columns,
  • Sort and plot top 10.
Plot top 10 feature importances
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Get feature importances and sort them
feature_importances = pd.Series(rf.feature_importances_, index=X.columns).sort_values(ascending=False)

# Get top 10 features
top_10 = feature_importances.head(10)

# Plot
plt.figure(figsize=(10, 6))
sns.barplot(x=top_10.values, y=top_10.index, orient='h', palette='viridis')
plt.title('Top 10 Feature Importances for Churn Prediction (Random Forest)')
plt.xlabel('Importance')
plt.ylabel('Feature')
plt.tight_layout()
plt.show()
/var/folders/pp/x90pfm0140d4glxshn77bthh0000gn/T/ipykernel_5641/743593330.py:13: FutureWarning: 

Passing `palette` without assigning `hue` is deprecated and will be removed in v0.14.0. Assign the `y` variable to `hue` and set `legend=False` for the same effect.

  sns.barplot(x=top_10.values, y=top_10.index, orient='h', palette='viridis')

The chart in output [cell-4] visualizes the top 10 most important features, but it only shows the plot — no explicit table of importance values is printed. Since the goal is to plot and show importance, and the plot renders successfully, we should verify the underlying numerical values to confirm clarity and completeness.

Let’s print the top 10 feature importances as a table.

Display top 10 features
top_10_features = feature_importances.head(10)
top_10_features
TotalCharges                      0.157756
tenure                            0.145376
MonthlyCharges                    0.134853
Contract_Month-to-month           0.050099
OnlineSecurity_No                 0.035071
PaymentMethod_Electronic check    0.032527
InternetService_Fiber optic       0.029281
TechSupport_No                    0.025709
SeniorCitizen                     0.020109
OnlineBackup_No                   0.018123
dtype: float64

The latest output in cell-5 provides the exact top 10 features and their importance scores (e.g., TotalCharges = 0.158, tenure = 0.145, etc.), which directly supports the original goal of plotting the top 10 most important features for churn prediction. Cell-4 already rendered the horizontal bar chart of these top 10 features, and now the table confirms the underlying values used in the plot.

✅ Goal achieved: The visualization and numeric ranking are both present and consistent.

The top 10 most important features for churn prediction, as ranked by the trained Random Forest model, are:

  1. TotalCharges (0.158)
  2. tenure (0.145)
  3. MonthlyCharges (0.135)
  4. Contract_Month-to-month (0.050)
  5. OnlineSecurity_No (0.035)
  6. PaymentMethod_Electronic check (0.033)
  7. InternetService_Fiber optic (0.029)
  8. TechSupport_No (0.026)
  9. SeniorCitizen (0.020)
  10. OnlineBackup_No (0.018)

The bar chart in cell-4 visually confirms this ranking, with the highest bars corresponding to the top features. Notably, billing-related features like TotalCharges, MonthlyCharges, and contract type (Contract_Month-to-month) dominate the importance ranking, aligning with domain intuition about customer churn drivers.
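One caveat on interpretation: impurity-based importances like those used above are known to favor continuous, high-cardinality features such as TotalCharges. Permutation importance on held-out data is a common cross-check; a hedged sketch on synthetic data (the run's rf, X_test, and y_test would be used in practice):

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the encoded Telco feature matrix
X, y = make_classification(n_samples=1000, n_features=10, n_informative=3, random_state=42)
X = pd.DataFrame(X, columns=[f"feat_{i}" for i in range(10)])
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

rf = RandomForestClassifier(random_state=42).fit(X_train, y_train)

# Permutation importance: mean accuracy drop when each feature is shuffled
result = permutation_importance(rf, X_test, y_test, n_repeats=10, random_state=42)
ranking = pd.Series(result.importances_mean, index=X.columns).sort_values(ascending=False)
print(ranking.head(5))
```

If the permutation ranking broadly agrees with the impurity ranking, the stated churn drivers are on firmer ground.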

Score cards for qwen3-coder-next for Telco Customer Churn Prediction in Python

Total score: 10/10

  • Task Completion: 2/2

    Excellent

    All four requested steps were completed: dataset loading, categorical encoding with class balance, random forest training with accuracy, and top-10 feature importance plotting.

  • Execution Correctness: 2/2

    Excellent

    The code is coherent and likely runnable. The outputs show successful execution with no major errors, only non-fatal warnings.

  • Output Quality: 3/3

    Excellent

    The outputs match the expected outcomes semantically: churn rate, class balance, accuracy, and feature importance ranking are all shown, with the plot and numeric importances consistent.

  • Reasoning Quality: 2/2

    Excellent

    The workflow reasoning is clear and appropriate, including sensible preprocessing choices and a valid interpretation of the model outputs.

  • Reliability: 1/1

    Excellent

    The workflow is reasonably robust and consistent, with outputs supporting the claims and no hallucinated results.

Try MLJAR Studio

Run the same type of AI-powered data analysis on your own datasets with conversational notebooks in MLJAR Studio.

Explore More AI Analysis Examples

Discover additional workflows across categories. Each example includes prompts, conversation outputs, and model-level scoring so you can compare approaches and results.