Value at Risk (VaR) Analysis in Python

Compute Value at Risk (VaR), Conditional VaR (CVaR), and maximum drawdown for a stock portfolio using historical simulation.

This page shows an AI data analysis example. A workflow is a list of prompts (instructions) used to complete one analysis task step by step. We send the same prompt list to multiple AI models, then compare their notebook outputs and scores.

What this AI workflow does

This AI Data Analyst workflow downloads 3 years of SPY, QQQ, and GLD price data, computes daily returns, and builds an equal-weight portfolio return series. It estimates 95% and 99% Value at Risk (VaR) using historical simulation and computes 95% Conditional VaR (CVaR/Expected Shortfall). It visualizes the return distribution with VaR and CVaR thresholds and reports maximum drawdown for the portfolio.

Who this example is for

This is for analysts and students who need a reproducible Python notebook for portfolio risk metrics from market data. It helps validate downside risk estimates and communicate them with standard plots and summary statistics.

Expected analysis outcomes

These are the results the AI workflow is expected to generate.

  • Daily equal-weight portfolio return series
  • Historical-simulation VaR estimates at 95% and 99%
  • CVaR (Expected Shortfall) at 95%
  • Maximum drawdown statistic
  • Histogram of returns with VaR/CVaR threshold lines

Tools and libraries used

Main Python packages and tooling used to run this AI data analysis task.

  • yfinance
  • pandas
  • numpy
  • matplotlib
  • scipy

Prompt sequence

This is the exact list of prompts used in this workflow. The same prompt sequence is sent to each model so outputs and scores can be compared fairly.

  1. download 3 years of SPY, QQQ, GLD prices and compute daily portfolio returns assuming equal weights with yfinance
  2. calculate 95% and 99% Value at Risk using historical simulation
  3. calculate Conditional VaR (Expected Shortfall) at 95%
  4. plot the return distribution with VaR and CVaR thresholds marked

Model Comparison

We compared several LLMs on the same analysis task. The table below shows their scores side by side. You can also open each model run to check the full conversation and notebook results.

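Model Source     | Score | Task | Execution | Output | Reasoning | Reliability | Run
-----------------|-------|------|-----------|--------|-----------|-------------|-----------------------------------
glm-5.1          | 10/10 | 2/2  | 2/2       | 3/3    | 2/2       | 1/1         | Open glm-5.1 conversation
gpt-oss:120b     | 10/10 | 2/2  | 2/2       | 3/3    | 2/2       | 1/1         | Open gpt-oss:120b conversation
qwen3-coder-next | 10/10 | 2/2  | 2/2       | 3/3    | 2/2       | 1/1         | Open qwen3-coder-next conversation
gemma4:31b       | 9/10  | 2/2  | 2/2       | 2/3    | 2/2       | 1/1         | Open gemma4:31b conversation
gpt-5.4          | 9/10  | 2/2  | 2/2       | 2/3    | 2/2       | 1/1         | Open gpt-5.4 conversation
qwen3.5:397b     | 2/10  | 1/2  | 0/2       | 0/3    | 1/2       | 0/1         | Open qwen3.5:397b conversation

How scoring works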

Each run is graded on five dimensions:

  • Task Completion (0-2): how fully the workflow answered the requested analysis task and completed required steps.
  • Execution Correctness (0-2): whether generated code is correct, consistent, and likely runnable without major fixes.
  • Output Quality (0-3): how well tables, charts, and results match the expected analytical outcomes.
  • Reasoning Quality (0-2): clarity and correctness of interpretation, explanation, and conclusions.
  • Reliability (0-1): robustness of the run, including low hallucination risk and stable behavior.

The final score is the sum of these dimensions, so total score = 0-10. Higher values indicate more complete, correct, and reliable notebook results for this workflow.
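
As a minimal sketch of that arithmetic, the snippet below sums the five dimensions for one row of the comparison table. The `total_score` helper and the dictionary keys are hypothetical names chosen for illustration; they are not part of the grading tooling described on this page.

```python
# Maximum points per dimension, taken from the scoring description above.
# The key names are illustrative assumptions, not an official schema.
MAX_POINTS = {
    "task_completion": 2,
    "execution_correctness": 2,
    "output_quality": 3,
    "reasoning_quality": 2,
    "reliability": 1,
}


def total_score(scores):
    """Sum the five dimension scores after checking each one is in range."""
    for name, cap in MAX_POINTS.items():
        if not 0 <= scores[name] <= cap:
            raise ValueError(f"{name} must be between 0 and {cap}")
    return sum(scores[name] for name in MAX_POINTS)


# Dimension scores from the gemma4:31b row of the comparison table.
gemma_scores = {
    "task_completion": 2,
    "execution_correctness": 2,
    "output_quality": 2,
    "reasoning_quality": 2,
    "reliability": 1,
}
print(total_score(gemma_scores))  # 9
```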

AI Analysis with gpt-oss:120b

This run uses AI with a step-by-step list of prompts to perform the analysis task Value at Risk (VaR) Analysis in Python. Below, you can review the full conversation generated with gpt-oss:120b. After the conversation, we show detailed score cards for this run.

download 3 years of SPY, QQQ, GLD prices and compute daily portfolio returns assuming equal weights with yfinance
Thinking

We need to guide user. First step: download data using yfinance for 3 years (maybe from today). Compute daily returns and equal-weighted portfolio returns. Should load data, show shape, columns, head. We'll write code to download SPY, QQQ, GLD adjusted close, combine into DataFrame, compute daily pct change, compute portfolio return = mean across assets (equal weights). Show maybe first few rows.

We'll provide code block and let it run.

Here’s the first thing we need to do: pull three years of daily Adjusted Close prices for SPY, QQQ and GLD, calculate each asset’s daily return, and then combine them into an equal‑weight portfolio return series.

Download adjusted close prices
import yfinance as yf
import pandas as pd

# define tickers and period
tickers = ["SPY", "QQQ", "GLD"]
# 3 years back from today
data = yf.download(tickers, period="3y", interval="1d", auto_adjust=False, progress=False)

# keep only Adjusted Close prices and rename columns for clarity
adj_close = data["Adj Close"].rename(columns=lambda x: x.upper())
adj_close.head()
Ticker GLD QQQ SPY
Date
2023-04-11 186.279999 310.325409 394.275970
2023-04-12 187.190002 307.584076 392.668884
2023-04-13 189.720001 313.607208 397.884644
2023-04-14 186.360001 313.017670 396.912689
2023-04-17 185.529999 313.282990 398.336945

Run the cell; it will display the first few rows of the adjusted‑close price matrix for the three symbols. Once we have that, we’ll compute daily returns and the equal‑weight portfolio.

Thinking

We have loaded adj_close with columns GLD, QQQ, SPY. Need to compute daily returns for each and then equal-weight portfolio returns. We have not yet computed daily returns or portfolio returns. So we need to calculate daily percent change and combine with equal weights (1/3 each). Provide code to compute daily returns and portfolio returns and display first few rows. That's one minimal code block.

We’ve successfully retrieved the three‑year adjusted‑close price series for GLD, QQQ, and SPY.
Next we’ll compute each asset’s daily simple return and then combine them into an equal‑weight portfolio (1/3 GLD + 1/3 QQQ + 1/3 SPY). The resulting series shows the daily portfolio return you asked for.

Compute equal weighted returns
# daily simple returns for each ticker
daily_ret = adj_close.pct_change().dropna()

# equal weight for the three assets
weights = [1/3, 1/3, 1/3]

# portfolio daily return = weighted sum of individual returns
port_daily_ret = daily_ret.dot(weights)

# show the first few portfolio returns
port_daily_ret.head()
Date
2023-04-12   -0.002675
2023-04-13    0.015460
2023-04-14   -0.007344
2023-04-17   -0.000006
2023-04-18    0.001532
dtype: float64
Thinking

We need to decide if we have sufficient answer. Goal: download prices and compute daily portfolio returns assuming equal weights. The notebook shows adj_close head and port_daily_ret head. That essentially satisfies the goal. Need to provide final answer: give daily portfolio returns series maybe summary. Provide insights: show that we computed returns etc. Provide final answer with series head or maybe full series? The user wants "download 3 years of SPY, QQQ, GLD prices and compute daily portfolio returns assuming equal weights". The outputs already show that we have the series, but we might provide the computed daily returns series (maybe show first few, explain). That's enough. So Path A: provide final answer. No code.

The three‑year price data for GLD, QQQ, and SPY were successfully downloaded, and their daily simple returns were calculated. By applying an equal‑weight allocation ( 1/3 GLD + 1/3 QQQ + 1/3 SPY ), the portfolio’s daily return series was obtained:

Date       | Equal-weight daily return
-----------|--------------------------
2023-04-12 | -0.002675
2023-04-13 |  0.015460
2023-04-14 | -0.007344
2023-04-17 | -0.000006
2023-04-18 |  0.001532

The port_daily_ret variable holds the complete daily return series for the entire three‑year period, ready for any further analysis (e.g., cumulative performance, risk metrics, visualisation, etc.).
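
The workflow description lists maximum drawdown among the expected outcomes, but this run's prompts do not ask for it. The sketch below shows one common way to derive it from a daily return series; `max_drawdown` is a hypothetical helper, and the toy returns are made up for illustration rather than taken from market data.

```python
import numpy as np


def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded wealth curve,
    returned as a positive fraction (0.244 means a 24.4% drop)."""
    wealth = np.cumprod(1.0 + np.asarray(returns, dtype=float))
    # Prepend the starting wealth of 1.0 so a loss on day one counts too.
    wealth = np.concatenate(([1.0], wealth))
    running_peak = np.maximum.accumulate(wealth)
    drawdown = wealth / running_peak - 1.0  # zero at new highs, negative below
    return -drawdown.min()


# Toy series: wealth goes 1.0 -> 1.1 -> 0.88 -> 0.924 -> 0.8316
toy_returns = [0.10, -0.20, 0.05, -0.10]
print(round(max_drawdown(toy_returns), 4))  # 0.244
```

In the notebook above, `max_drawdown(port_daily_ret)` would apply the same calculation to the real portfolio series; that value was not computed in this run.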

calculate 95% and 99% Value at Risk using historical simulation
Thinking

We have portfolio daily returns series port_daily_ret. Need to compute VaR at 95% and 99% using historical simulation (i.e., empirical quantiles of returns). Provide code to compute VaR as negative of appropriate percentile (e.g., VaR_95 = -np.percentile(port_daily_ret, 5)). Also could present results. Since we haven't executed code for VaR, we need to compute. According to rules: propose next step with code and stop (if no output). But we have data variable port_daily_ret present. We can compute VaR directly. Provide code that calculates and displays VaR values. End with bare expression showing VaR values. Since they ask to calculate, we can compute and display. No need to ask for output. Provide code and it will output VaR values.

Compute portfolio VaR
import numpy as np

# 95 % VaR = loss that is exceeded only 5 % of the time
var_95 = -np.percentile(port_daily_ret, 5)

# 99 % VaR = loss that is exceeded only 1 % of the time
var_99 = -np.percentile(port_daily_ret, 1)

# show the results
var_95, var_99
(np.float64(0.01285838991567934), np.float64(0.022490334062017593))
Thinking

The user wants to calculate 95% and 99% VaR using historical simulation. The notebook already computed var_95 and var_99 and displayed them as np.float64 values.

Need to provide a natural final answer with those values, in plain language, no code. Should include <!--final answer-->.

We must not repeat the output exactly as shown? We can restate them as numeric values. Provide insights: 95% VaR ~ 1.2858% loss, 99% VaR ~ 2.2490% loss.

Let's produce final answer.

The historical‑simulation VaR for the equal‑weighted SPY‑QQQ‑GLD portfolio over the past three years is:

  • 95 % VaR: 0.01286 ≈ 1.29 % loss (the portfolio loses more than this on only 5 % of trading days)
  • 99 % VaR: 0.02249 ≈ 2.25 % loss (the portfolio loses more than this on only 1 % of trading days)

These values were obtained by taking the 5th‑percentile and 1st‑percentile of the daily return distribution and converting the negative return to a positive loss figure.
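
The percentile-and-negate step can be wrapped in a small function. This is a sketch under stated assumptions: `historical_var` is a hypothetical helper, it uses `np.quantile` with NumPy's default linear interpolation (equivalent to the `np.percentile` calls in the notebook), and the evenly spaced toy returns are synthetic.

```python
import numpy as np


def historical_var(returns, confidence=0.95):
    """Historical-simulation VaR as a positive loss figure:
    minus the (1 - confidence) quantile of the return sample."""
    returns = np.asarray(returns, dtype=float)
    return -np.quantile(returns, 1.0 - confidence)


# Synthetic sample: 101 evenly spaced returns from -5% to +5%.
toy_returns = np.linspace(-0.05, 0.05, 101)

print(round(historical_var(toy_returns, 0.95), 4))  # 0.045
print(round(historical_var(toy_returns, 0.99), 4))  # 0.049
```

A higher confidence level reads further into the left tail, so the 99% figure is never smaller than the 95% figure for the same sample.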

calculate Conditional VaR (Expected Shortfall) at 95%
Thinking

We need to compute Conditional VaR (Expected Shortfall) at 95% using historical simulation. Use portfolio daily returns var already computed. Provide code to compute ES: average loss beyond VaR threshold. So for 95%: average of returns less than -var (or lower than VaR threshold). Since we have returns (negative for loss). ES = -np.mean(port_daily_ret[port_daily_ret <= -var_95])? Actually VaR is positive loss amount. So threshold = -var_95 (a negative return). Compute mean of returns that are below that threshold (more negative). Then take negative sign. Provide code block; final output expression.

Compute Expected Shortfall
import numpy as np

# 95 % VaR we already have (positive loss amount)
# ES = average loss conditional on a loss exceeding the VaR level
threshold = -var_95                     # this is a negative return
es_95 = -np.mean(port_daily_ret[port_daily_ret <= threshold])

es_95
np.float64(0.020015855597329227)
Thinking

We need to determine if we have sufficient output to answer the original user goal: calculate Conditional VaR (Expected Shortfall) at 95%.

We have already computed es_95 in cell-4, output is np.float64(0.020015855597329227). That's the 95% Expected Shortfall (average loss beyond VaR). So we have the answer.

We need to provide a natural final answer based on observed outputs, with <!--final answer-->. No code block. Provide insight. Use USD? Not needed as it's a percentage loss. Maybe just give as decimal or percent.

We should mention that ES = 0.0200 ≈ 2.00% loss. Provide final answer.

The 95 % Conditional Value at Risk (Expected Shortfall) for the equally weighted SPY‑QQQ‑GLD portfolio over the past three years is 0.0200, i.e., the average loss on the worst 5 % of days is about 2.0 % of the portfolio value.
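
To make the relationship between VaR and CVaR concrete, here is a small sketch on synthetic data. `historical_cvar` is a hypothetical helper that mirrors the notebook's approach (mean of the returns at or below the VaR threshold); the toy sample is constructed so the tail contains exactly two observations.

```python
import numpy as np


def historical_cvar(returns, confidence=0.95):
    """Expected Shortfall as a positive loss: the average of all returns
    at or below the (1 - confidence) quantile, with the sign flipped."""
    returns = np.asarray(returns, dtype=float)
    threshold = np.quantile(returns, 1.0 - confidence)  # a negative return
    tail = returns[returns <= threshold]
    return -tail.mean()


# Synthetic sample of 40 returns: three bad days plus 37 ordinary ones.
toy_returns = np.concatenate(([-0.08, -0.06, -0.03], np.linspace(-0.02, 0.03, 37)))

var_95 = -np.quantile(toy_returns, 0.05)
cvar_95 = historical_cvar(toy_returns, 0.95)
print(round(var_95, 4))   # 0.0315
print(round(cvar_95, 4))  # 0.07
```

Because CVaR averages only the losses beyond the VaR threshold, it is at least as large as VaR at the same confidence level, which matches the 2.00% versus 1.29% figures reported in this run.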

plot the return distribution with VaR and CVaR thresholds marked
Thinking

We need to plot return distribution (histogram or density) and mark VaR and CVaR (ES) thresholds. Use seaborn/matplotlib. Provide code that plots histogram of port_daily_ret, adds vertical lines at -var_95 (loss) and -es_95? Wait thresholds: VaR is positive loss amount (0.01286). The actual return value threshold is -var_95 = -0.01286 (negative). ES is average loss beyond that: es_95 = 0.0200 positive, so the threshold for line is -es_95? Actually ES is average loss (positive) on worst tail; we can mark the VaR line at -var_95 and a line at -es_95 (more negative). Provide legend. Use seaborn histplot with kde. Provide code block. That is the next step.

Here’s a quick distribution plot of the portfolio daily returns with the 95 % VaR and the 95 % Conditional VaR (Expected Shortfall) highlighted.

Plot returns with VaR
import matplotlib.pyplot as plt
import seaborn as sns

# portfolio returns already in `port_daily_ret`
# VaR and ES values from previous cells
var_95  # 0.01285838991567934
es_95   # 0.020015855597329227

# Convert to the actual return levels (negative numbers)
var_level = -var_95
es_level  = -es_95

plt.figure(figsize=(8, 5))
# histogram + KDE
sns.histplot(port_daily_ret, bins=50, kde=True, color="steelblue", edgecolor="white")
# VaR line
plt.axvline(var_level, color="red", linestyle="--", linewidth=2,
            label=f"95 % VaR = {var_95:.2%}")
# ES line
plt.axvline(es_level, color="orange", linestyle="-.", linewidth=2,
            label=f"95 % CVaR = {es_95:.2%}")

plt.title("Portfolio Daily Return Distribution")
plt.xlabel("Daily Return")
plt.ylabel("Frequency")
plt.legend()
plt.tight_layout()
plt.show()
Thinking

We need to decide if current outputs are sufficient to answer the original goal: "plot the return distribution with VaR and CVaR thresholds marked". The latest output [output-5] is a Figure displaying histogram KDe and lines for VaR and CVaR. That matches the request. The user asks to plot; they have the plot. They might want insights? The task is to "plot the return distribution with VaR and CVaR thresholds marked". The code produced a figure. The output shows figure. So we have satisfied the goal. According to policy, if sufficient, we must give final answer with <!--final answer--> and no code. Provide concise answer summarizing the plot and thresholds. No extra.

The distribution of the portfolio’s daily returns has been plotted. The red dashed line marks the 95 % VaR at 1.29 % loss, and the orange dash‑dot line marks the 95 % Conditional VaR (Expected Shortfall) at 2.00 % loss, showing the average loss in the worst 5 % of days.
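
The notebook's plot relies on seaborn, which is not in the list of tools for this workflow. A roughly equivalent figure can be sketched with matplotlib alone, as below. The returns here are randomly generated for illustration, not market data, and the `Agg` backend line is only an assumption so the sketch can run without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; omit inside a notebook
import matplotlib.pyplot as plt
import numpy as np

# Synthetic daily returns (assumption: roughly normal, about 3 years of days).
rng = np.random.default_rng(0)
toy_returns = rng.normal(loc=0.0005, scale=0.01, size=750)

# Thresholds expressed as return levels, so both are negative numbers.
var_level = np.quantile(toy_returns, 0.05)                # 5th-percentile return
es_level = toy_returns[toy_returns <= var_level].mean()   # mean return in the tail

fig, ax = plt.subplots(figsize=(8, 5))
ax.hist(toy_returns, bins=50, color="steelblue", edgecolor="white")
ax.axvline(var_level, color="red", linestyle="--", linewidth=2,
           label=f"95% VaR = {-var_level:.2%}")
ax.axvline(es_level, color="orange", linestyle="-.", linewidth=2,
           label=f"95% CVaR = {-es_level:.2%}")
ax.set_title("Portfolio Daily Return Distribution (synthetic data)")
ax.set_xlabel("Daily Return")
ax.set_ylabel("Frequency")
ax.legend()
fig.tight_layout()
```

The CVaR line always sits to the left of the VaR line, since it is the mean of the returns that fall at or below the VaR threshold.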

Score cards for gpt-oss:120b for Value at Risk (VaR) Analysis in Python

Total score: 10/10

  • Task Completion: 2/2

    Excellent

    The notebook completed the requested workflow: downloaded SPY/QQQ/GLD data, computed equal-weight portfolio returns, calculated VaR/CVaR, and produced a histogram with threshold lines.

  • Execution Correctness: 2/2

    Excellent

    The code shown is syntactically coherent and the outputs indicate it ran successfully, including the plotted figure and numeric risk metrics.

  • Output Quality: 3/3

    Excellent

    Outputs match the expected outcomes semantically: portfolio return series, VaR values, CVaR value, and a histogram with marked thresholds. The VaR magnitudes are reasonable and the plot is present.

  • Reasoning Quality: 2/2

    Excellent

    The workflow correctly uses historical simulation and equal-weight aggregation, and the explanations for VaR and CVaR are appropriate and consistent with the calculations.

  • Reliability: 1/1

    Excellent

    The approach is standard and consistent, with no obvious hallucinations or unstable logic in the provided notebook content.
