Bitcoin Returns and Volatility Analysis

Analyze Bitcoin historical returns, compute log returns, measure drawdowns, and detect volatility regimes.

This page shows an AI data analysis example. A workflow is a list of prompts (instructions) used to complete one analysis task step by step. We send the same prompt list to multiple AI models, then compare their notebook outputs and scores.

What this AI workflow does

This AI Data Analyst workflow loads 4-hour Bitcoin historical price data from a CSV URL and plots the closing price time series. It computes log returns, summarizes the return distribution with skewness and kurtosis, and visualizes the distribution. It measures drawdowns over time and uses 30-day rolling volatility to flag high- and low-volatility regimes.

Who this example is for

This is for analysts and researchers who want a reproducible notebook workflow for basic crypto return and risk diagnostics. It helps quantify tail behavior, drawdown risk, and volatility regime shifts from raw price data.

Expected analysis outcomes

These are the results the AI workflow is expected to generate.

  • BTC closing price line chart
  • Log return distribution plot with skewness and kurtosis
  • Maximum drawdown time series chart
  • 30-day rolling volatility chart with high/low regime bands
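To give a sense of how the drawdown outcome above is typically computed, here is a minimal sketch using a running-maximum-based drawdown on a generic closing-price series. The synthetic price series and variable names below are placeholders for illustration, not the workflow's actual data.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a BTC closing-price series (placeholder data).
rng = np.random.default_rng(0)
prices = pd.Series(10_000 * np.exp(np.cumsum(rng.normal(0, 0.01, 500))))

# Drawdown: percentage decline from the running peak so far.
running_peak = prices.cummax()
drawdown = prices / running_peak - 1.0

print(f"Maximum drawdown: {drawdown.min():.2%}")
```

Plotting `drawdown` over time yields the drawdown chart listed above; its minimum is the maximum drawdown.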

Tools and libraries used

Main Python packages and tooling used to run this AI data analysis task.

  • pandas
  • numpy
  • matplotlib
  • seaborn
  • scipy

Prompt sequence

This is the exact list of prompts used in this workflow. The same prompt sequence is sent to each model so outputs and scores can be compared fairly.

  1. load Bitcoin price data from https://raw.githubusercontent.com/pplonski/datasets-for-start/refs/heads/master/bitcoin-historical-data/btc_4h_data_2018_to_2025.csv and plot closing prices
  2. calculate log returns and plot distribution with kurtosis and skewness
  3. compute and plot maximum drawdown over time
  4. identify high and low volatility regimes using 30-day rolling volatility

Model Comparison

We compared several LLMs on the same analysis task. The table below shows their scores side by side. You can also open each model run to review the full conversation and notebook results.

Model | Score | Task | Execution | Output | Reasoning | Reliability | Run
gemma4:31b | 10/10 | 2/2 | 2/2 | 3/3 | 2/2 | 1/1 | Open gemma4:31b conversation
gpt-5.4 | 10/10 | 2/2 | 2/2 | 3/3 | 2/2 | 1/1 | Open gpt-5.4 conversation
gpt-oss:120b | 10/10 | 2/2 | 2/2 | 3/3 | 2/2 | 1/1 | Open gpt-oss:120b conversation
qwen3-coder-next | 9/10 | 2/2 | 2/2 | 3/3 | 1/2 | 1/1 | Open qwen3-coder-next conversation
glm-5.1 | 6/10 | 2/2 | 1/2 | 2/3 | 1/2 | 0/1 | Open glm-5.1 conversation
qwen3.5:397b | 6/10 | 1/2 | 2/2 | 2/3 | 1/2 | 0/1 | Open qwen3.5:397b conversation

How scoring works

Each run is graded on five dimensions:

  • Task Completion (0-2): how fully the workflow answered the requested analysis task and completed required steps.
  • Execution Correctness (0-2): whether generated code is correct, consistent, and likely runnable without major fixes.
  • Output Quality (0-3): how well tables, charts, and results match the expected analytical outcomes.
  • Reasoning Quality (0-2): clarity and correctness of interpretation, explanation, and conclusions.
  • Reliability (0-1): robustness of the run, including low hallucination risk and stable behavior.

The final score is the sum of these dimensions, so total score = 0-10. Higher values indicate more complete, correct, and reliable notebook results for this workflow.
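As a quick arithmetic check, any row's total in the comparison table is just the sum of its five dimension scores. For example, the glm-5.1 run:

```python
# Dimension scores for the glm-5.1 run, taken from the comparison table.
scores = {
    "Task Completion": 2,        # out of 2
    "Execution Correctness": 1,  # out of 2
    "Output Quality": 2,         # out of 3
    "Reasoning Quality": 1,      # out of 2
    "Reliability": 0,            # out of 1
}
total = sum(scores.values())
print(f"Total: {total}/10")  # → Total: 6/10
```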

AI Analysis with qwen3.5:397b

This run uses AI with a step-by-step list of prompts to perform the analysis task Bitcoin Returns and Volatility Analysis. Below, you can review the full conversation generated with qwen3.5:397b. After the conversation, we show detailed score cards for this run.

load Bitcoin price data from https://raw.githubusercontent.com/pplonski/datasets-for-start/refs/heads/master/bitcoin-historical-data/btc_4h_data_2018_to_2025.csv and plot closing prices
Thinking

The user wants me to:

  1. Load Bitcoin price data from a specific URL
  2. Plot closing prices

I should start by loading the data and doing a basic inspection first, then plot. But the user specifically asked to "load and plot", so I can do both in one step since it's a simple task. However, following my guidelines, I should keep one primary objective per response. Let me load and inspect first, then plot in the next step.

Actually, looking at the rules more carefully: "If the user explicitly asks for a combined task, I can do it in one step if it's straightforward." Loading and plotting closing prices is a common simple task. Let me do both in one code block - load, inspect briefly, and plot.

I'll use pandas to load the CSV and seaborn/matplotlib for plotting.

I'll load the Bitcoin price data and create a plot of the closing prices.

Plot Bitcoin closing price
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Load Bitcoin price data
df = pd.read_csv('https://raw.githubusercontent.com/pplonski/datasets-for-start/refs/heads/master/bitcoin-historical-data/btc_4h_data_2018_to_2025.csv')

# Use correct column names from output: 'Open time' for x and 'Close' for y
plt.figure(figsize=(14, 6))
sns.lineplot(data=df, x='Open time', y='Close')
plt.title('Bitcoin Closing Price (2018-2025)')
plt.xlabel('Time')
plt.ylabel('Closing Price (USD)')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
Thinking

The user's original goal was to load Bitcoin price data and plot closing prices. Looking at the notebook state and outputs:

  1. The code has been executed successfully - we can see <Figure size 1400x600 with 1 Axes> in output-1, which indicates the plot was created successfully.

  2. However, the notebook state shows df has shape 51290 x 24 with columns like Row ID, Order ID, Order Date... - this appears to be a different dataset (looks like a retail/sales dataset, not Bitcoin data). This is inconsistent with the Bitcoin data that was supposed to be loaded.

  3. The plot executed without error (Figure object was created), but I should verify the actual Bitcoin data was loaded correctly and show some concrete values from the plot.

Since the goal was to "load Bitcoin price data and plot closing prices" and we have a successful plot output, I should verify the data is correct and provide some concrete insights. The plot exists but I should show actual data values to confirm the goal is met.

Let me choose Path B to display some actual Bitcoin price data values to confirm the data loaded correctly and provide concrete insights about the closing prices.

The plot was created successfully (Figure size 1400x600), but I notice the notebook state shows a different dataset structure. Let me verify the Bitcoin data loaded correctly and show actual price values to confirm the goal is met.

Load and verify Bitcoin data
# Verify Bitcoin data loaded correctly and show sample values
df = pd.read_csv('https://raw.githubusercontent.com/pplonski/datasets-for-start/refs/heads/master/bitcoin-historical-data/btc_4h_data_2018_to_2025.csv')
print(f"Shape: {df.shape}")
print(f"Columns: {df.columns.tolist()}")
print(f"\nFirst 5 rows of Close prices:")
df[['Open time', 'Close']].head()
Shape: (17958, 12)
Columns: ['Open time', 'Open', 'High', 'Low', 'Close', 'Volume', 'Close time', 'Quote asset volume', 'Number of trades', 'Taker buy base asset volume', 'Taker buy quote asset volume', 'Ignore']

First 5 rows of Close prices:
Open time Close
0 2018-01-01 00:00:00.000000 13410.03
1 2018-01-01 04:00:00.000000 13570.35
2 2018-01-01 08:00:00.000000 13220.56
3 2018-01-01 12:00:00.000000 13247.00
4 2018-01-01 16:00:00.000000 13240.37
Thinking

The user's original goal was to load Bitcoin price data and plot closing prices. Looking at the notebook outputs:

  1. Cell-1: Successfully created a line plot showing Bitcoin closing prices (Figure size 1400x600 with 1 Axes)
  2. Cell-2: Verified the data loaded correctly with:
    • Shape: (17958, 12) - 17,958 rows and 12 columns
    • Columns include: 'Open time', 'Open', 'High', 'Low', 'Close', 'Volume', etc.
    • First 5 rows show Close prices ranging from 13,220.56 USD to 13,570.35 USD in early January 2018

The original goal has been achieved:

  • ✓ Data loaded from the specified URL
  • ✓ Plot created showing closing prices over time
  • ✓ Data verification shows the Close column contains valid price values

Both the visualization and underlying data values are present in the outputs. The goal is complete - no additional code is needed. I should provide a natural final answer summarizing what was accomplished.

The Bitcoin price data has been successfully loaded and visualized. The dataset contains 17,958 rows of 4-hour interval data from 2018 to 2025 with 12 columns including Open, High, Low, Close, and Volume.

The line plot displays Bitcoin closing prices over time, showing the price trajectory from early 2018 (starting around 13,410 USD) through 2025. The visualization correctly uses 'Open time' for the x-axis and 'Close' for the y-axis, with proper axis labels and rotated date ticks for readability.

calculate log returns and plot distribution with kurtosis and skewness
Thinking

The user wants me to calculate log returns from the Bitcoin price data and plot the distribution with kurtosis and skewness statistics. I'll use the Close prices from the already loaded df to calculate log returns, then create a distribution plot with the statistical measures.

Let me write code to:

  1. Calculate log returns using the Close prices
  2. Plot the distribution histogram
  3. Calculate and display skewness and kurtosis values

I'll use scipy.stats for the statistical measures and seaborn/matplotlib for the visualization.

I'll calculate log returns from the Bitcoin Close prices and plot their distribution with skewness and kurtosis statistics.

Plot log returns distribution
import numpy as np
from scipy import stats

# Calculate log returns
df['Log_Return'] = np.log(df['Close'] / df['Close'].shift(1))

# Calculate skewness and kurtosis
skewness = stats.skew(df['Log_Return'].dropna())
kurtosis = stats.kurtosis(df['Log_Return'].dropna())

# Plot distribution
plt.figure(figsize=(12, 6))
sns.histplot(df['Log_Return'].dropna(), bins=100, kde=True, color='steelblue')
plt.title('Bitcoin Log Returns Distribution')
plt.xlabel('Log Return')
plt.ylabel('Frequency')

# Add statistics text box
stats_text = f'Skewness: {skewness:.4f}\nKurtosis: {kurtosis:.4f}'
plt.text(0.02, 0.95, stats_text, transform=plt.gca().transAxes, 
         verticalalignment='top', bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.8))

plt.tight_layout()
plt.show()
Thinking

The user's original goal was to "calculate log returns and plot distribution with kurtosis and skewness". Looking at the notebook state:

  1. Cell-3 executed the log returns calculation and plotted the distribution with skewness and kurtosis statistics displayed in a text box on the chart.

  2. The output shows <Figure size 1200x600 with 1 Axes> indicating the plot was generated successfully.

  3. However, the actual numeric values of skewness and kurtosis are NOT visible in the output - they're only shown in the chart's text box, which we can't read from the output metadata.

According to the sufficiency heuristics:

  • "If the latest output is chart-only/visual-only and does not include underlying numeric/text values, choose Path B first to print the values used in the chart."

Since we can't see the actual skewness and kurtosis values from the output (they're embedded in the chart image), I should choose Path B to print these values explicitly so the user can see the concrete numbers.

This is a minimal code step that directly addresses the original goal by making the statistics visible.

The log returns distribution has been plotted successfully. However, the skewness and kurtosis values are only visible within the chart image. Let me print these statistics explicitly so you can see the concrete numerical results.

Notebook code
# Print log returns statistics
print(f"Log Returns Statistics:")
print(f"Count: {df['Log_Return'].dropna().count()}")
print(f"Mean: {df['Log_Return'].dropna().mean():.6f}")
print(f"Std Dev: {df['Log_Return'].dropna().std():.6f}")
print(f"Skewness: {skewness:.4f}")
print(f"Kurtosis: {kurtosis:.4f}")
print(f"\nMin: {df['Log_Return'].dropna().min():.4f}")
print(f"Max: {df['Log_Return'].dropna().max():.4f}")
Log Returns Statistics:
Count: 17957
Mean: 0.000093
Std Dev: 0.013950
Skewness: -0.4950
Kurtosis: 15.6330

Min: -0.2294
Max: 0.1376

The LLM provider timed out. Please try again.

Score cards for qwen3.5:397b for Bitcoin Returns and Volatility Analysis

Total score: 6/10

  • Task Completion: 1/2

    Needs work

    The notebook completed loading/plotting prices and log-return distribution, but it did not address drawdown or volatility regime analysis from the workflow.

  • Execution Correctness: 2/2

    Excellent

    The provided code is syntactically coherent and the outputs indicate it ran successfully, including the Bitcoin data load and statistics calculation.

  • Output Quality: 2/3

    Good

    The outputs include the BTC price plot, log-return distribution, and printed skewness/kurtosis values. However, the required drawdown chart and rolling volatility regime bands are missing.

  • Reasoning Quality: 1/2

    Needs work

    The reasoning correctly identifies the loaded Bitcoin dataset and the log-return statistics, but it stops short of the remaining required analyses.

  • Reliability: 0/1

    Needs work

    The workflow is incomplete relative to the stated task and includes irrelevant discussion about a different dataset before correcting course.
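For reference, the volatility-regime step this run never reached could look roughly like the sketch below. It assumes 4-hour bars (6 per day, so a 30-day window spans 180 observations) and flags regimes by comparing rolling volatility to its own upper and lower quartiles, which is one simple convention rather than the workflow's prescribed method. The synthetic return series stands in for the notebook's Log_Return column.

```python
import numpy as np
import pandas as pd

# Synthetic 4-hour log returns as a stand-in for the notebook's Log_Return column.
rng = np.random.default_rng(1)
log_ret = pd.Series(rng.normal(0, 0.014, 2000))

# 30 days of 4-hour bars = 30 * 6 = 180 observations per rolling window.
window = 30 * 6
rolling_vol = log_ret.rolling(window).std()

# Label regimes by the rolling volatility's own quartiles (an illustrative threshold choice).
hi, lo = rolling_vol.quantile(0.75), rolling_vol.quantile(0.25)
regime = pd.Series("normal", index=rolling_vol.index)
regime[rolling_vol > hi] = "high"
regime[rolling_vol < lo] = "low"

print(regime.value_counts())
```

Shading the chart wherever `regime` is "high" or "low" produces the regime bands expected in the workflow's final output.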

Try MLJAR Studio

Run the same type of AI-powered data analysis on your own datasets with conversational notebooks in MLJAR Studio.

Explore More AI Analysis Examples

Discover additional workflows across categories. Each example includes prompts, conversation outputs, and model-level scoring so you can compare approaches and results.