Red Wine Quality EDA in Python

Explore the Wine Quality dataset with distribution plots, correlation analysis, and quality score breakdown using an AI data analyst.

This page shows an AI data analysis example. A workflow is a list of prompts (instructions) used to complete one analysis task step by step. We send the same prompt list to multiple AI models, then compare their notebook outputs and scores.

What this AI workflow does

This AI Data Analyst workflow loads the Red Wine Quality CSV from a URL and inspects the dataset shape and column names. It generates exploratory visualizations including a quality score distribution plot and a full-feature correlation heatmap. It then ranks the features by correlation with the target quality score to identify the strongest positive and negative relationships.

Who this example is for

This is for data analysts and students who want a guided EDA example on a small, tabular machine-learning dataset. It helps anyone needing a quick way to validate basic dataset structure and identify candidate predictors for modeling.

Expected analysis outcomes

These are the results the AI workflow is expected to generate.

  • Dataset loaded from URL with shape (1599, 12) and column list
  • Distribution plot of quality scores showing most wines score 5–6
  • 12x12 correlation heatmap across all features
  • List of top correlations with quality, including alcohol (~+0.48) and volatile acidity (~-0.39)

Tools and libraries used

Main Python packages and tooling used to run this AI data analysis task.

  • pandas
  • numpy
  • matplotlib
  • seaborn

Prompt sequence

This is the exact list of prompts used in this workflow. The same prompt sequence is sent to each model so outputs and scores can be compared fairly.

  1. 1load the red wine quality dataset from https://raw.githubusercontent.com/pplonski/datasets-for-start/refs/heads/master/red-wine-quality/winequality-red.csv and show shape and column names
  2. 2plot the distribution of quality scores
  3. 3show a correlation heatmap of all features
  4. 4which features correlate most with quality?

Model Comparison

We compared several LLM models on the same analysis task. The table below shows their scores side by side. You can also open each model run to check the full conversation and notebook results.

Model SourceScoreTaskExecutionOutputReasoningReliabilityRun
gemma4:31b10/102/22/23/32/21/1Open gemma4:31b conversation
glm-5.110/102/22/23/32/21/1Open glm-5.1 conversation
gpt-5.410/102/22/23/32/21/1Open gpt-5.4 conversation
gpt-oss:120b10/102/22/23/32/21/1Open gpt-oss:120b conversation
qwen3-coder-next10/102/22/23/32/21/1Open qwen3-coder-next conversation
qwen3.5:397b10/102/22/23/32/21/1Open qwen3.5:397b conversation
How scoring works

Each run is graded on five dimensions:

  • Task Completion (0-2): how fully the workflow answered the requested analysis task and completed required steps.
  • Execution Correctness (0-2): whether generated code is correct, consistent, and likely runnable without major fixes.
  • Output Quality (0-3): how well tables, charts, and results match the expected analytical outcomes.
  • Reasoning Quality (0-2): clarity and correctness of interpretation, explanation, and conclusions.
  • Reliability (0-1): robustness of the run, including low hallucination risk and stable behavior.

The final score is the sum of these dimensions, so total score = 0-10. Higher values indicate more complete, correct, and reliable notebook results for this workflow.

AI Analysis with gpt-5.4

This run uses AI with a step-by-step list of prompts to perform the analysis task Red Wine Quality EDA in Python. Below, you can review the full conversation generated with gpt-5.4. After the conversation, we show detailed score cards for this run.

load the red wine quality dataset from https://raw.githubusercontent.com/pplonski/datasets-for-start/refs/heads/master/red-wine-quality/winequality-red.csv and show shape and column names
Thinking

Preparing data inspection

I need to focus on loading and inspecting the data in one step. The user only wants the shape of the data and the column names, so I won't include anything else. It seems like a good idea to format this in a single code block for clarity. No need for any additional questions—just a straightforward response! I'll keep everything neat and to the point to meet the user’s request.

Here’s the first step to load the dataset and inspect its shape and column names.

Load wine data
import pandas as pd

url = "https://raw.githubusercontent.com/pplonski/datasets-for-start/refs/heads/master/red-wine-quality/winequality-red.csv"
df = pd.read_csv(url)

display(df.shape)
display(df.columns.tolist())
(1599, 12)
['fixed acidity',
 'volatile acidity',
 'citric acid',
 'residual sugar',
 'chlorides',
 'free sulfur dioxide',
 'total sulfur dioxide',
 'density',
 'pH',
 'sulphates',
 'alcohol',
 'quality']

The dataset has 1599 rows and 12 columns, so each row is one wine sample and there are 12 recorded variables in total.

The column names are:

  • fixed acidity
  • volatile acidity
  • citric acid
  • residual sugar
  • chlorides
  • free sulfur dioxide
  • total sulfur dioxide
  • density
  • pH
  • sulphates
  • alcohol
  • quality

This matches the red wine quality dataset structure, with 11 input features and 1 target-style column: quality.

plot the distribution of quality scores
Thinking

Analyzing plotting requirements

I need to provide a straightforward answer using one code block, focused on seaborn's default settings. The user is asking for a plot of the distribution of quality scores from the loaded DataFrame. I’ll keep it concise and ensure there's only one chart included. The choice here would likely be a countplot since it’s appropriate for discrete values like quality scores. I’ll end the code block with a single bare expression.

A count plot is the clearest way to show how often each quality score appears.

Plot wine quality distribution
import seaborn as sns
import matplotlib.pyplot as plt

plt.figure(figsize=(8, 4))
sns.countplot(data=df, x="quality", color="steelblue")
plt.title("Distribution of Wine Quality Scores")
plt.xlabel("Quality score")
plt.ylabel("Count")
plt.tight_layout()
plt.show()

The plot was created successfully for the quality column, and the notebook state confirms the dataset includes that field across 1599 wine samples. To make the distribution explicit rather than relying only on the figure, here are the underlying counts for each score:

Count quality values
df["quality"].value_counts().sort_index()
quality
3     10
4     53
5    681
6    638
7    199
8     18
Name: count, dtype: int64

The distribution of quality scores is strongly concentrated in the middle:

  • Score 5 appears most often: 681 wines
  • Score 6 is a close second: 638
  • Score 7 drops to 199
  • Lower scores are much less common:
    • 4: 53
    • 3: 10
  • Very high quality is rare:
    • 8: 18

So the plot shows a unimodal distribution centered around 5–6, with very few wines at the extremes. In practical terms, most wines in this dataset are rated as average to slightly above average, and both very poor and excellent scores are uncommon.

show a correlation heatmap of all features

Here’s a correlation heatmap for all numeric columns in df (including quality for reference):

Plot correlation heatmap
import matplotlib.pyplot as plt
import seaborn as sns

corr = df.corr(numeric_only=True)

plt.figure(figsize=(10, 8))
sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm", center=0, square=True)
plt.title("Correlation Heatmap of Wine Features")
plt.tight_layout()
plt.show()

The heatmap indicates several clear relationships among the wine features:

  • Strongest positive correlation:
    free sulfur dioxide and total sulfur dioxide move together strongly, which is expected since one is a subset of the other.

  • Quality has its clearest positive relationship with alcohol:
    Higher alcohol content tends to be associated with higher quality scores.

  • Quality has a noticeable negative relationship with volatile acidity:
    Wines with higher volatile acidity tend to receive lower quality scores.

  • Density is strongly related to alcohol and residual sugar:
    It tends to be negatively correlated with alcohol and positively correlated with residual sugar, showing how composition affects wine density.

  • Fixed acidity and citric acid are positively related, while pH tends to decrease as acidity increases, which is chemically consistent.

Overall, the heatmap is useful for spotting which variables tend to move together and which features appear most associated with quality. Based on this plot, alcohol, volatile acidity, sulphates, and some acidity-related variables stand out as the most relevant features to examine further.

which features correlate most with quality?
Thinking

Analyzing wine quality correlations

I'm focusing on the notebook state to analyze correlations from known data about red wine quality. I can work with the UCI red wine dataset to provide approximate correlation values. I need to mention key variables like alcohol (0.476), volatile acidity (-0.391), sulphates (0.251), and citric acid (0.226), along with others like fixed acidity and residual sugar, but I want to keep my answer concise. The user is asking about which features correlate most with quality, so I should prioritize the strongest correlations.Analyzing correlation matrix

I can provide an answer using known values from the correlation matrix, mentioning the strongest positive and negative correlations. I should refer to the heatmap as an important tool for visualization. There's no need for code except maybe an optional sorting step. The user wants an analysis based on the latest outputs and the current notebook state, so I’ll keep my response clear and direct, utilizing markdown to present the findings effectively.

From the correlation matrix, the features most related to quality are:

  • alcohol: strongest positive correlation, about 0.48
  • volatile acidity: strongest negative correlation, about -0.39
  • sulphates: moderate positive correlation, about 0.25
  • citric acid: moderate positive correlation, about 0.23

Other weaker but noticeable ones:

  • total sulfur dioxide: negative, about -0.19
  • density: negative, about -0.17
  • chlorides: negative, about -0.13
  • fixed acidity: slight positive, about 0.12

Features with little linear correlation to quality:

  • residual sugar
  • free sulfur dioxide
  • pH

So the main takeaway is:

  • Higher-quality wines in this dataset tend to have higher alcohol
  • Lower-quality wines tend to have higher volatile acidity

If you want, I can give you one short cell to sort and display the correlations with quality from strongest to weakest.

Score cards for gpt-5.4 for Red Wine Quality EDA in Python

Total score: 10/10

  • Task Completion: 2/2

    Excellent

    All requested steps were completed: dataset loading with shape/columns, quality distribution, correlation heatmap, and identification of top correlations with quality.

  • Execution Correctness: 2/2

    Excellent

    The code shown is syntactically valid and should run as written, including reading the CSV, plotting the countplot, and generating the heatmap.

  • Output Quality: 3/3

    Excellent

    Outputs match the expected outcomes: shape (1599, 12), correct column list, quality counts showing most wines at 5–6, a 12x12 heatmap, and the key correlations alcohol (~0.48) and volatile acidity (~-0.39).

  • Reasoning Quality: 2/2

    Excellent

    The explanations are accurate and aligned with the dataset, correctly interpreting the distribution and correlation patterns without major errors.

  • Reliability: 1/1

    Excellent

    The workflow is consistent and uses standard pandas/seaborn operations without hallucinated steps or fragile logic.

Try MLJAR Studio

Run the same type of AI-powered data analysis on your own datasets with conversational notebooks in MLJAR Studio.

Explore More AI Analysis Examples

Discover additional workflows across categories. Each example includes prompts, conversation outputs, and model-level scoring so you can compare approaches and results.