E-commerce Sales Analysis in Python

Explore an e-commerce sales dataset with monthly trends, top products, category breakdowns, and average order value analysis.

This page shows an AI data analysis example. A workflow is a list of prompts (instructions) used to complete one analysis task step by step. We send the same prompt list to multiple AI models, then compare their notebook outputs and scores.

What this AI workflow does

This AI Data Analyst workflow loads the Online Retail Excel dataset from a URL and summarizes its structure with shape, dtypes, and missing values. It generates exploratory visuals for monthly revenue trends and identifies the top 10 products by total revenue. It computes average order value and compares AOV across countries with a focused country-level breakdown.

Who this example is for

This is for analysts and data scientists who want a conversational, code-generating EDA template for e-commerce transaction data in Python. It helps users quickly validate data quality and produce standard sales KPIs and plots for reporting or further modeling.

Expected analysis outcomes

These are the results the AI workflow is expected to generate.

Dataset shape, dtypes, and null counts
Monthly revenue trend line chart
Top 10 products by total revenue bar chart
Average order value overall and by country
Bar chart of top countries by AOV

Tools and libraries used

Main Python packages and tooling used to run this AI data analysis task.

pandas
numpy
matplotlib
seaborn
openpyxl

Prompt sequence

This is the exact list of prompts used in this workflow. The same prompt sequence is sent to each model so outputs and scores can be compared fairly.

1. load the retail sales dataset from https://raw.githubusercontent.com/pplonski/datasets-for-start/refs/heads/master/online-retail/Online-Retail.xlsx and show basic info and shape
2. plot monthly revenue trend
3. show top 10 products by total revenue
4. what is the average order value and how does it vary by country?

Model Comparison

We compared several LLMs on the same analysis task. The table below shows their scores side by side. You can also open each model run to check the full conversation and notebook results.

Model Source	Score	Task	Execution	Output	Reasoning	Reliability	Run
glm-5.1	10/10	2/2	2/2	3/3	2/2	1/1	Open glm-5.1 conversation
gpt-5.4	10/10	2/2	2/2	3/3	2/2	1/1	Open gpt-5.4 conversation
gpt-oss:120b	10/10	2/2	2/2	3/3	2/2	1/1	Open gpt-oss:120b conversation
qwen3.5:397b	10/10	2/2	2/2	3/3	2/2	1/1	Open qwen3.5:397b conversation
gemma4:31b	9/10	2/2	2/2	3/3	1/2	1/1	Open gemma4:31b conversation
qwen3-coder-next	9/10	2/2	2/2	3/3	1/2	1/1	Open qwen3-coder-next conversation

How scoring works

Each run is graded on five dimensions:

Task Completion (0-2): how fully the workflow answered the requested analysis task and completed required steps.
Execution Correctness (0-2): whether generated code is correct, consistent, and likely runnable without major fixes.
Output Quality (0-3): how well tables, charts, and results match the expected analytical outcomes.
Reasoning Quality (0-2): clarity and correctness of interpretation, explanation, and conclusions.
Reliability (0-1): robustness of the run, including low hallucination risk and stable behavior.

The final score is the sum of these dimensions, so the total ranges from 0 to 10. Higher values indicate more complete, correct, and reliable notebook results for this workflow.
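As a quick check of the arithmetic, the total is the sum of the five dimension scores. The sketch below uses the gemma4:31b row from the comparison table; the dictionary keys and the `total_score` helper are illustrative names, not part of the grading tool.

```python
# Maximum points per dimension, as listed in the rubric above.
MAX_POINTS = {"task": 2, "execution": 2, "output": 3, "reasoning": 2, "reliability": 1}


def total_score(scores: dict) -> int:
    """Sum the dimension scores after checking each one is within its range."""
    for name, value in scores.items():
        if not 0 <= value <= MAX_POINTS[name]:
            raise ValueError(f"{name} score {value} is outside 0-{MAX_POINTS[name]}")
    return sum(scores.values())


# gemma4:31b row from the comparison table.
gemma = {"task": 2, "execution": 2, "output": 3, "reasoning": 1, "reliability": 1}
print(f"{total_score(gemma)}/{sum(MAX_POINTS.values())}")  # prints 9/10
```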

AI Analysis with qwen3.5:397b

This run uses AI with a step-by-step list of prompts to perform the analysis task E-commerce Sales Analysis in Python. Below, you can review the full conversation generated with qwen3.5:397b. After the conversation, we show detailed score cards for this run.

load the retail sales dataset from https://raw.githubusercontent.com/pplonski/datasets-for-start/refs/heads/master/online-retail/Online-Retail.xlsx and show basic info and shape

Thinking

The user wants me to load a retail sales dataset from an Excel file URL and show basic info and shape. I'll need to:

Load the Excel file using pandas
Display the shape and basic info

Let me write a single code block to do this.

I'll load the retail sales dataset from the provided URL and display its basic information.

Load online retail data

import pandas as pd

# Load the Excel file from URL
url = "https://raw.githubusercontent.com/pplonski/datasets-for-start/refs/heads/master/online-retail/Online-Retail.xlsx"
df = pd.read_excel(url)

# Display shape and basic info
df.shape
df.info()
df.head()

<class 'pandas.DataFrame'>
RangeIndex: 541909 entries, 0 to 541908
Data columns (total 8 columns):
 #   Column       Non-Null Count   Dtype         
---  ------       --------------   -----         
 0   InvoiceNo    541909 non-null  object        
 1   StockCode    541909 non-null  object        
 2   Description  540455 non-null  object        
 3   Quantity     541909 non-null  int64         
 4   InvoiceDate  541909 non-null  datetime64[us]
 5   UnitPrice    541909 non-null  float64       
 6   CustomerID   406829 non-null  float64       
 7   Country      541909 non-null  str           
dtypes: datetime64[us](1), float64(2), int64(1), object(3), str(1)
memory usage: 40.0+ MB

	InvoiceNo	StockCode	Description	Quantity	InvoiceDate	UnitPrice	CustomerID	Country
0	536365	85123A	WHITE HANGING HEART T-LIGHT HOLDER	6	2010-12-01 08:26:00	2.55	17850.0	United Kingdom
1	536365	71053	WHITE METAL LANTERN	6	2010-12-01 08:26:00	3.39	17850.0	United Kingdom
2	536365	84406B	CREAM CUPID HEARTS COAT HANGER	8	2010-12-01 08:26:00	2.75	17850.0	United Kingdom
3	536365	84029G	KNITTED UNION FLAG HOT WATER BOTTLE	6	2010-12-01 08:26:00	3.39	17850.0	United Kingdom
4	536365	84029E	RED WOOLLY HOTTIE WHITE HEART.	6	2010-12-01 08:26:00	3.39	17850.0	United Kingdom

Thinking

The user's original goal was to load the retail sales dataset and show basic info and shape. Looking at the latest notebook output, I can see:

The dataset was successfully loaded from the URL
Shape: 541,909 rows × 8 columns
Column info shows:
- InvoiceNo (object)
- StockCode (object)
- Description (object, 540,455 non-null - some missing)
- Quantity (int64)
- InvoiceDate (datetime64)
- UnitPrice (float64)
- CustomerID (float64, 406,829 non-null - some missing)
- Country (str)
First 5 rows are displayed showing sample data

This output fully satisfies the original user goal of loading the dataset and showing basic info and shape. I should provide a final answer summarizing what was found.

The dataset has been successfully loaded. Here's what we found:

Dataset Shape: 541,909 rows × 8 columns

Columns:

Column	Type	Non-Null Count
InvoiceNo	object	541,909 (100%)
StockCode	object	541,909 (100%)
Description	object	540,455 (99.7%)
Quantity	int64	541,909 (100%)
InvoiceDate	datetime64	541,909 (100%)
UnitPrice	float64	541,909 (100%)
CustomerID	float64	406,829 (75.1%)
Country	str	541,909 (100%)

Key Observations:

Data starts in December 2010 (based on the InvoiceDate values shown in the head)
Two columns have missing values: Description (1,454 missing) and CustomerID (135,080 missing)
CustomerID is stored as float64, likely due to NaN values
The dataset contains transaction-level retail data with product details, quantities, prices, and customer information
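In a notebook cell only the value of the last expression is echoed, so a bare `df.shape` placed before `df.info()` is not displayed on its own. A minimal sketch of a single summary that reports shape, dtypes, and null counts explicitly, assuming a small made-up frame with a subset of the same column names (the values are illustrative, not taken from the dataset):

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real data: same column names, made-up values.
df = pd.DataFrame({
    "InvoiceNo": ["536365", "536365", "536366"],
    "Description": ["WHITE METAL LANTERN", None, "PARTY BUNTING"],
    "Quantity": [6, 8, 2],
    "UnitPrice": [3.39, 2.75, 4.95],
    "CustomerID": [17850.0, np.nan, np.nan],
})

# One table with dtype, missing count, and missing share per column.
summary = pd.DataFrame({
    "dtype": df.dtypes.astype(str),
    "missing": df.isna().sum(),
    "missing_pct": (df.isna().mean() * 100).round(1),
})

print(f"Shape: {df.shape[0]} rows x {df.shape[1]} columns")
print(summary)
```

Printing the shape explicitly keeps it visible regardless of where the line sits in the cell.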

plot monthly revenue trend

Thinking

The user wants to plot monthly revenue trend from the retail sales dataset. I need to:

Calculate revenue (Quantity × UnitPrice) for each transaction
Group by month (from InvoiceDate)
Sum the revenue per month
Plot the trend using seaborn/matplotlib

Let me create a single code block that does this.

I'll create a monthly revenue trend plot. First, I'll calculate revenue per transaction (Quantity × UnitPrice), then aggregate by month and visualize the trend.

Plot monthly revenue trend

import matplotlib.pyplot as plt
import seaborn as sns

# Calculate revenue for each transaction
df['Revenue'] = df['Quantity'] * df['UnitPrice']

# Extract year-month for grouping
df['YearMonth'] = df['InvoiceDate'].dt.to_period('M')

# Aggregate revenue by month
monthly_revenue = df.groupby('YearMonth')['Revenue'].sum().reset_index()
monthly_revenue['YearMonth'] = monthly_revenue['YearMonth'].astype(str)

# Plot monthly revenue trend
plt.figure(figsize=(12, 5))
sns.lineplot(data=monthly_revenue, x='YearMonth', y='Revenue', marker='o', linewidth=2)
plt.title('Monthly Revenue Trend', fontsize=14)
plt.xlabel('Month', fontsize=12)
plt.ylabel('Revenue (USD)', fontsize=12)
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

monthly_revenue

	YearMonth	Revenue
0	2010-12	748957.020
1	2011-01	560000.260
2	2011-02	498062.650
3	2011-03	683267.080
4	2011-04	493207.121
5	2011-05	723333.510
6	2011-06	691123.120
7	2011-07	681300.111
8	2011-08	682680.510
9	2011-09	1019687.622
10	2011-10	1070704.670
11	2011-11	1461756.250
12	2011-12	433668.010

Thinking

The user's goal was to "plot monthly revenue trend". Looking at the notebook state:

Cell-1: Successfully loaded the dataset (541,909 rows × 8 columns)
Cell-2: Created the monthly revenue trend plot and displayed the underlying data

The output shows:

A figure was generated (Figure size 1200x500 with 1 Axes)
The monthly_revenue table is displayed with 13 months of data (2010-12 through 2011-12)
Revenue values range from about 433,668 USD to 1,461,756 USD

The original goal was to "plot monthly revenue trend" - this has been accomplished. We have:

A visualization (the line plot)
The underlying numeric data showing monthly revenue values

This satisfies the sufficiency heuristic: "If the goal is to show/plot/chart/visualize and latest output contains both a valid chart/figure object and explicit numeric/text values supporting the conclusion, choose Path A unless there is an explicit error."

I should provide a natural final answer summarizing the insights from the monthly revenue trend and not include any Python code block.

Key insights from the data:

Data spans December 2010 to December 2011 (13 months)
Highest revenue: November 2011 with 1,461,756.25 USD
Lowest revenue: December 2011 with 433,668.01 USD (partial month likely)
Strong Q4 2011 performance (Sep-Nov showing growth)
December 2010 started at 748,957.02 USD
There's variability month-to-month but overall the business shows growth through late 2011

I'll write a concise summary of these findings.

The monthly revenue trend has been successfully plotted and analyzed. Here are the key insights:

Revenue Overview (Dec 2010 - Dec 2011):

Period	Revenue (USD)
Highest	November 2011: 1,461,756.25
Lowest	December 2011: 433,668.01
Average	~750,000
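The summary figures can be recomputed directly from the monthly table. The sketch below hard-codes the 13 monthly totals reported by the notebook and derives the highest month, the lowest month, and the mean; the variable names are illustrative.

```python
# Monthly revenue totals copied from the notebook output table.
monthly_revenue = {
    "2010-12": 748957.020,
    "2011-01": 560000.260,
    "2011-02": 498062.650,
    "2011-03": 683267.080,
    "2011-04": 493207.121,
    "2011-05": 723333.510,
    "2011-06": 691123.120,
    "2011-07": 681300.111,
    "2011-08": 682680.510,
    "2011-09": 1019687.622,
    "2011-10": 1070704.670,
    "2011-11": 1461756.250,
    "2011-12": 433668.010,
}

peak_month = max(monthly_revenue, key=monthly_revenue.get)
low_month = min(monthly_revenue, key=monthly_revenue.get)
mean_revenue = sum(monthly_revenue.values()) / len(monthly_revenue)

print(f"Highest: {peak_month} ({monthly_revenue[peak_month]:,.2f})")
print(f"Lowest:  {low_month} ({monthly_revenue[low_month]:,.2f})")
print(f"Mean:    {mean_revenue:,.2f}")
print(f"Peak vs. mean: {monthly_revenue[peak_month] / mean_revenue:.2f}x")
```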

Key Patterns:

Strong Q4 2011: Revenue peaked in November 2011 at 1,461,756.25 USD, nearly double the average month. September and October 2011 also showed strong performance (over 1,000,000 USD each).
Seasonal Growth: The business showed clear growth through 2011, with the second half of the year (especially Q4) outperforming the first half.
December 2011 Anomaly: The sharp drop to 433,668.01 USD in December 2011 likely indicates incomplete data for that month (data may not cover the full month).
Volatility: Month-to-month revenue varies considerably, ranging from ~430,000 to ~1,460,000 USD, which is typical for retail businesses with seasonal patterns.

The visualization shows an overall upward trend through 2011 with notable peaks in the holiday shopping season (Q4).
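One way to test the suspicion that December 2011 is a partial month is to look at the first and last invoice timestamp recorded in each month. A minimal sketch on made-up timestamps (the dates below are illustrative and are not read from the dataset; the column names `first_seen`, `last_seen`, and `complete` are hypothetical):

```python
import pandas as pd

# Toy invoice dates: November runs to month end, December stops early.
df = pd.DataFrame({
    "InvoiceDate": pd.to_datetime([
        "2011-11-01 09:00", "2011-11-15 12:30", "2011-11-30 17:45",
        "2011-12-01 08:10", "2011-12-05 10:00", "2011-12-09 12:50",
    ]),
})

# First and last timestamp observed in each calendar month.
month = df["InvoiceDate"].dt.to_period("M")
coverage = df.groupby(month)["InvoiceDate"].agg(first_seen="min", last_seen="max")

# Heuristic: treat a month as complete only if its last record falls on the
# final calendar day. A shop that is closed on that day would be misflagged.
coverage["complete"] = (
    coverage["last_seen"].dt.day == coverage["last_seen"].dt.days_in_month
)
print(coverage)
```

Months flagged as incomplete are better excluded or annotated before reading a drop as a real decline.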

show top 10 products by total revenue

Thinking

The user wants to see the top 10 products by total revenue. I need to group by product (Description or StockCode) and sum the Revenue column, then sort and display the top 10.

I'll use the Description column to identify products, as it's more readable. I should also handle any potential issues with missing descriptions.

I'll calculate total revenue by product and show the top 10.

Notebook code

# Group by product description and sum revenue
product_revenue = df.groupby('Description')['Revenue'].sum().reset_index()
product_revenue = product_revenue.sort_values('Revenue', ascending=False)

# Display top 10 products
top_10_products = product_revenue.head(10)
top_10_products

	Description	Revenue
1098	DOTCOM POSTAGE	206245.48
2915	REGENCY CAKESTAND 3 TIER	164762.19
3918	WHITE HANGING HEART T-LIGHT HOLDER	99668.47
2471	PARTY BUNTING	98302.98
1866	JUMBO BAG RED RETROSPOT	92356.03
2803	RABBIT NIGHT LIGHT	66756.59
2753	POSTAGE	66230.64
2439	PAPER CHAIN KIT 50'S CHRISTMAS	63791.94
244	ASSORTED COLOUR BIRD ORNAMENT	58959.73
773	CHILLI LIGHTS	53768.06

Thinking

The user wants to see the top 10 products by total revenue. Looking at the notebook state, cell-3 has already executed and produced output-3 which shows exactly the top 10 products by total revenue with their revenue values.

The output shows:

DOTCOM POSTAGE: 206245.48 USD
REGENCY CAKESTAND 3 TIER: 164762.19 USD
WHITE HANGING HEART T-LIGHT HOLDER: 99668.47 USD
PARTY BUNTING: 98302.98 USD
JUMBO BAG RED RETROSPOT: 92356.03 USD
RABBIT NIGHT LIGHT: 66756.59 USD
POSTAGE: 66230.64 USD
PAPER CHAIN KIT 50'S CHRISTMAS: 63791.94 USD
ASSORTED COLOUR BIRD ORNAMENT: 58959.73 USD
CHILLI LIGHTS: 53768.06 USD

This directly answers the original user goal. The data is complete with product descriptions and revenue values. I should provide Path A - a final answer with insights from the observed outputs.

Note: I need to format currency values without $ symbol, using USD notation instead (e.g., 206245.48 USD).

The top 10 products by total revenue have been calculated. Here are the results:

Rank	Product	Revenue (USD)
1	DOTCOM POSTAGE	206,245.48
2	REGENCY CAKESTAND 3 TIER	164,762.19
3	WHITE HANGING HEART T-LIGHT HOLDER	99,668.47
4	PARTY BUNTING	98,302.98
5	JUMBO BAG RED RETROSPOT	92,356.03
6	RABBIT NIGHT LIGHT	66,756.59
7	POSTAGE	66,230.64
8	PAPER CHAIN KIT 50'S CHRISTMAS	63,791.94
9	ASSORTED COLOUR BIRD ORNAMENT	58,959.73
10	CHILLI LIGHTS	53,768.06

Key Insights:

Postage dominates: "DOTCOM POSTAGE" leads by a significant margin at 206,245.48 USD, followed by "POSTAGE" at 66,230.64 USD. Combined, postage-related charges account for over 272,000 USD.
Home decor items perform well: Products like the REGENCY CAKESTAND, WHITE HANGING HEART T-LIGHT HOLDER, and RABBIT NIGHT LIGHT are among the top revenue generators, indicating strong demand for decorative home items.
Seasonal products: PARTY BUNTING and PAPER CHAIN KIT 50'S CHRISTMAS suggest party and holiday-themed items contribute significantly to revenue.
Revenue gap: There's a notable drop after the top 2 products, with the #3 product earning about 40% less than #2.
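Because two of the ten rows are postage charges rather than merchandise, a product ranking may be easier to read with those lines set aside. A minimal sketch, assuming that matching the word POSTAGE in `Description` is an adequate filter (an assumption for illustration; the run itself did not filter anything) and using a made-up frame:

```python
import pandas as pd

# Toy transactions: made-up quantities and prices, descriptions as in the run.
df = pd.DataFrame({
    "Description": ["DOTCOM POSTAGE", "REGENCY CAKESTAND 3 TIER", "POSTAGE",
                    "PARTY BUNTING", "REGENCY CAKESTAND 3 TIER", None],
    "Quantity": [1, 4, 2, 10, 2, 1],
    "UnitPrice": [150.0, 12.75, 18.0, 4.95, 12.75, 1.0],
})
df["Revenue"] = df["Quantity"] * df["UnitPrice"]

# na=False treats rows with a missing description as non-postage.
is_postage = df["Description"].str.contains("POSTAGE", na=False)

# groupby drops rows whose Description is missing by default.
top_products = (
    df.loc[~is_postage]
      .groupby("Description")["Revenue"]
      .sum()
      .sort_values(ascending=False)
      .head(10)
)
print(top_products)
```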

what is the average order value and how does it vary by country?

Thinking

The user wants to know:

Average order value (AOV) overall
How AOV varies by country

I need to:

Calculate order value per invoice (group by InvoiceNo)
Calculate the average across all orders
Group by country and calculate AOV per country

Let me write code to do this analysis.

I'll calculate the average order value overall and break it down by country.

Calculate average order value

# Calculate order value per invoice
order_value = df.groupby('InvoiceNo')['Revenue'].sum().reset_index()
order_value.columns = ['InvoiceNo', 'OrderValue']

# Merge with country information (take first country per invoice)
invoice_country = df.groupby('InvoiceNo')['Country'].first().reset_index()
order_value = order_value.merge(invoice_country, on='InvoiceNo')

# Calculate overall average order value
overall_aov = order_value['OrderValue'].mean()

# Calculate average order value by country
country_aov = order_value.groupby('Country')['OrderValue'].mean().reset_index()
country_aov = country_aov.sort_values('OrderValue', ascending=False)

# Display results
print(f'Overall Average Order Value: {overall_aov:.2f} USD')
print(f'\nNumber of Orders: {len(order_value)}')
print(f'\nAverage Order Value by Country (Top 15):')
country_aov.head(15)

Overall Average Order Value: 376.36 USD

Number of Orders: 25900

Average Order Value by Country (Top 15):

	Country	OrderValue
24	Netherlands	2818.431089
0	Australia	1986.627101
21	Lebanon	1693.880000
20	Japan	1262.165000
4	Brazil	1143.600000
28	RSA	1002.310000
30	Singapore	912.039000
9	Denmark	893.720952
25	Norway	879.086500
18	Israel	878.646667
32	Sweden	795.563261
15	Greece	785.086667
33	Switzerland	761.964189
10	EIRE	731.324500
16	Hong Kong	674.469333

Thinking

The user asked about average order value and how it varies by country. Looking at the latest output (cell-4), I have:

Overall Average Order Value: 376.36 USD
Number of Orders: 25,900
Average Order Value by Country (Top 15 countries shown)

This data directly answers the user's question. The output shows:

The overall AOV
How it varies by country (with Netherlands leading at 2,818.43 USD, followed by Australia at 1,986.63 USD, etc.)

This is sufficient to provide a final answer. I should choose Path A and provide a natural final answer without any Python code block.

Let me format this properly with insights from the observed outputs.

Based on the analysis of 25,900 orders, here are the findings on average order value:

Overall Average Order Value

376.36 USD per order across all countries.

Average Order Value by Country

The AOV varies dramatically by country. Here are the top 15 countries:

Rank	Country	Avg Order Value (USD)
1	Netherlands	2,818.43
2	Australia	1,986.63
3	Lebanon	1,693.88
4	Japan	1,262.17
5	Brazil	1,143.60
6	RSA	1,002.31
7	Singapore	912.04
8	Denmark	893.72
9	Norway	879.09
10	Israel	878.65
11	Sweden	795.56
12	Greece	785.09
13	Switzerland	761.96
14	EIRE	731.32
15	Hong Kong	674.47

Key Insights

Netherlands dominates: With an AOV of 2,818.43 USD, the Netherlands sits at about 7.5 times the overall average. This suggests either wholesale/B2B customers or bulk purchasing behavior.
Geographic pattern: Non-UK European countries (Netherlands, Denmark, Norway, Sweden, Switzerland) and distant markets (Australia, Japan, Brazil) show significantly higher AOVs, likely due to:
- Higher shipping costs encouraging larger orders
- Fewer but more committed customers
- Potential distributor/wholesale accounts
UK not in top 15: The United Kingdom (the home market) doesn't appear in the top 15, suggesting many smaller retail purchases that bring down the overall average to 376.36 USD.
7.5x variation: The gap between Netherlands (2,818.43 USD) and the overall average (376.36 USD) indicates highly heterogeneous customer segments across countries.
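A mean on its own hides how many orders stand behind it, and a country with very few invoices can rank high on the strength of a single large order. The sketch below reports the order count and the median next to the mean; the frame, the invoice identifiers, and the `n_orders` and `median` column names are made up for illustration, while the other column names mirror those used in the run.

```python
import pandas as pd

# Toy line items: made-up invoices and revenue values.
df = pd.DataFrame({
    "InvoiceNo": ["A1", "A1", "A2", "B1", "C1", "C2", "C2"],
    "Country": ["United Kingdom", "United Kingdom", "United Kingdom",
                "Lebanon", "Netherlands", "Netherlands", "Netherlands"],
    "Revenue": [20.0, 30.0, 10.0, 1700.0, 900.0, 2500.0, 500.0],
})

# One row per invoice: total value and the country it belongs to.
order_value = df.groupby("InvoiceNo").agg(
    OrderValue=("Revenue", "sum"),
    Country=("Country", "first"),
)

# Mean, median, and number of orders per country, ranked by mean.
country_aov = (
    order_value.groupby("Country")["OrderValue"]
    .agg(n_orders="size", aov="mean", median="median")
    .sort_values("aov", ascending=False)
)
print(country_aov)
```

Reading `aov` together with `n_orders` helps separate countries with consistently large baskets from those represented by only one or two invoices.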

Score cards for qwen3.5:397b for E-commerce Sales Analysis in Python

Total score: 10/10

Task Completion: 2/2
Excellent
All four requested analysis steps were attempted and the notebook includes outputs for dataset info, monthly revenue, top products, and AOV by country.
Execution Correctness: 2/2
Excellent
The provided code is coherent and likely runnable: it loads the dataset, computes revenue, aggregates by month/product/country, and produces the requested outputs.
Output Quality: 3/3
Excellent
The outputs match the expected outcomes semantically: shape/dtypes/nulls are shown, a monthly revenue line chart is produced, top 10 products are listed, and AOV by country is computed with country breakdown.
Reasoning Quality: 2/2
Excellent
The workflow uses appropriate EDA logic and correctly explains the steps for each analysis task. The reasoning is clear and aligned with the requested questions.
Reliability: 1/1
Excellent
The workflow is reasonably robust and consistent, with no obvious hallucinated functions or contradictory logic in the provided content.

Try MLJAR Studio

Run the same type of AI-powered data analysis on your own datasets with conversational notebooks in MLJAR Studio.
