Bitcoin Returns and Volatility Analysis

Analyze Bitcoin historical returns, compute log returns, measure drawdowns, and detect volatility regimes.

This page shows an AI data analysis example. A workflow is a list of prompts (instructions) used to complete one analysis task step by step. We send the same prompt list to multiple AI models, then compare their notebook outputs and scores.

What this AI workflow does

This AI Data Analyst workflow loads 4-hour Bitcoin historical price data from a CSV URL and plots the closing price time series. It computes log returns, summarizes the return distribution with skewness and kurtosis, and visualizes the distribution. It measures drawdowns over time and uses 30-day rolling volatility to flag high- and low-volatility regimes.

Who this example is for

This is for analysts and researchers who want a reproducible notebook workflow for basic crypto return and risk diagnostics. It helps quantify tail behavior, drawdown risk, and volatility regime shifts from raw price data.

Expected analysis outcomes

These are the results the AI workflow is expected to generate.

  • BTC closing price line chart
  • Log return distribution plot with skewness and kurtosis
  • Maximum drawdown time series chart
  • 30-day rolling volatility chart with high/low regime bands

Tools and libraries used

Main Python packages and tooling used to run this AI data analysis task.

  • pandas
  • numpy
  • matplotlib
  • seaborn
  • scipy
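Assuming a standard Python environment, the stack above can be installed in one line (package names are the PyPI defaults):

```shell
# Install the packages used by this workflow (PyPI package names)
pip install pandas numpy matplotlib seaborn scipy
```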

Prompt sequence

This is the exact list of prompts used in this workflow. The same prompt sequence is sent to each model so outputs and scores can be compared fairly.

  1. load Bitcoin price data from https://raw.githubusercontent.com/pplonski/datasets-for-start/refs/heads/master/bitcoin-historical-data/btc_4h_data_2018_to_2025.csv and plot closing prices
  2. calculate log returns and plot distribution with kurtosis and skewness
  3. compute and plot maximum drawdown over time
  4. identify high and low volatility regimes using 30-day rolling volatility

Model Comparison

We compared several LLMs on the same analysis task. The table below shows their scores side by side. You can also open each model run to check the full conversation and notebook results.

Model             Score   Task   Execution   Output   Reasoning   Reliability   Run
gemma4:31b        10/10   2/2    2/2         3/3      2/2         1/1           Open gemma4:31b conversation
gpt-5.4           10/10   2/2    2/2         3/3      2/2         1/1           Open gpt-5.4 conversation
gpt-oss:120b      10/10   2/2    2/2         3/3      2/2         1/1           Open gpt-oss:120b conversation
qwen3-coder-next   9/10   2/2    2/2         3/3      1/2         1/1           Open qwen3-coder-next conversation
glm-5.1            6/10   2/2    1/2         2/3      1/2         0/1           Open glm-5.1 conversation
qwen3.5:397b       6/10   1/2    2/2         2/3      1/2         0/1           Open qwen3.5:397b conversation

How scoring works

Each run is graded on five dimensions:

  • Task Completion (0-2): how fully the workflow answered the requested analysis task and completed required steps.
  • Execution Correctness (0-2): whether generated code is correct, consistent, and likely runnable without major fixes.
  • Output Quality (0-3): how well tables, charts, and results match the expected analytical outcomes.
  • Reasoning Quality (0-2): clarity and correctness of interpretation, explanation, and conclusions.
  • Reliability (0-1): robustness of the run, including low hallucination risk and stable behavior.

The final score is the sum of these dimensions, so total score = 0-10. Higher values indicate more complete, correct, and reliable notebook results for this workflow.
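The arithmetic can be sketched in a few lines (the helper and the example dict below are illustrative, not part of the actual grading code):

```python
# Hypothetical helper: sum the five scoring dimensions into a 0-10 total.
# The example values are taken from the gpt-5.4 run in the table above.
def total_score(scores: dict) -> int:
    dimensions = ["task", "execution", "output", "reasoning", "reliability"]
    return sum(scores[d] for d in dimensions)

gpt_5_4 = {"task": 2, "execution": 2, "output": 3, "reasoning": 2, "reliability": 1}
print(total_score(gpt_5_4))  # 10
```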

AI Analysis with gpt-5.4

This run uses AI with a step-by-step list of prompts to perform the analysis task Bitcoin Returns and Volatility Analysis. Below, you can review the full conversation generated with gpt-5.4. After the conversation, we show detailed score cards for this run.

load Bitcoin price data from https://raw.githubusercontent.com/pplonski/datasets-for-start/refs/heads/master/bitcoin-historical-data/btc_4h_data_2018_to_2025.csv and plot closing prices
Thinking

Planning data inspection process

I need to comply with the rules, so I’ll first load and inspect the dataset only if it isn’t already loaded. Since the user asked to load and plot closing prices, I'll clarify that inspection is the first step. Using pandas, I’ll show the shape, columns, and head of the dataset sequentially. This should comply with the guideline of one chart or table per step, keeping it minimal with clear outputs. I’ll make sure to do this effectively!

Creating runnable code block

I need to put together one runnable code block that finishes with a single final bare expression. I can start with importing pandas and using the IPython display feature. I’ll include code to read the CSV file from a URL, parsing the 'timestamp' column. Then, I’ll create a summary DataFrame showing the number of rows and columns, display it, and show the columns of the original DataFrame. Finally, I’ll end the block with df.head(), ensuring it's clean and concise—no print statements needed!

Let’s load it and do the required first inspection before plotting.

Load and inspect data
import pandas as pd
from IPython.display import display

url = "https://raw.githubusercontent.com/pplonski/datasets-for-start/refs/heads/master/bitcoin-historical-data/btc_4h_data_2018_to_2025.csv"
df = pd.read_csv(url)

display(pd.DataFrame({"metric": ["rows", "columns"], "value": [df.shape[0], df.shape[1]]}))
display(pd.DataFrame({"column": df.columns}))

df.head()
metric value
0 rows 17958
1 columns 12
column
0 Open time
1 Open
2 High
3 Low
4 Close
5 Volume
6 Close time
7 Quote asset volume
8 Number of trades
9 Taker buy base asset volume
10 Taker buy quote asset volume
11 Ignore
Open time Open High Low Close Volume Close time Quote asset volume Number of trades Taker buy base asset volume Taker buy quote asset volume Ignore
0 2018-01-01 00:00:00.000000 13715.65 13715.65 13155.38 13410.03 1676.204807 2018-01-01 03:59:59.999000 2.251607e+07 19438 739.518666 9.937537e+06 0
1 2018-01-01 04:00:00.000000 13434.98 13818.55 13322.15 13570.35 1302.214836 2018-01-01 07:59:59.999000 1.770254e+07 18061 588.516222 7.998796e+06 0
2 2018-01-01 08:00:00.000000 13569.98 13735.24 13001.13 13220.56 1319.755931 2018-01-01 11:59:59.999000 1.765121e+07 17200 582.668293 7.792109e+06 0
3 2018-01-01 12:00:00.000000 13220.56 13330.00 12750.00 13247.00 1831.933153 2018-01-01 15:59:59.999000 2.391259e+07 21357 830.314747 1.083304e+07 0
4 2018-01-01 16:00:00.000000 13247.00 13290.65 12940.00 13240.37 1092.337234 2018-01-01 19:59:59.999000 1.434887e+07 13178 557.453779 7.324320e+06 0

The data is already in the notebook and the Close column is present, so the remaining step is just to convert the time column and plot the closing prices over time.

Plot Bitcoin closing price over time
import matplotlib.pyplot as plt
import seaborn as sns

df["Open time"] = pd.to_datetime(df["Open time"])

plt.figure(figsize=(14, 5))
sns.lineplot(data=df, x="Open time", y="Close")
plt.title("Bitcoin Closing Price Over Time (4h)")
plt.xlabel("Date")
plt.ylabel("Close Price")
plt.tight_layout()
plt.show()

The data shown in the notebook confirms you loaded a Bitcoin 4-hour OHLCV dataset with 17,958 rows and a Close column, and the plot output shows the closing-price time series was rendered using Open time on the x-axis and Close on the y-axis. That matches the requested task: load the Bitcoin price data from the provided URL and plot closing prices.

calculate log returns and plot distribution with kurtosis and skewness

Let’s compute 4-hour log returns from Close and plot their distribution, with skewness and excess kurtosis shown on the chart.

Plot log return distribution
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

df = df.sort_values("Open time").copy()
df["log_return"] = np.log(df["Close"]).diff()

r = df["log_return"].dropna()
skew = r.skew()
kurt = r.kurt()

plt.figure(figsize=(10, 5))
sns.histplot(r, bins=80, kde=True)
plt.title("Distribution of Bitcoin 4h Log Returns")
plt.xlabel("Log return")
plt.ylabel("Frequency")
plt.text(
    0.98, 0.95,
    f"Skewness: {skew:.3f}\nExcess kurtosis: {kurt:.3f}",
    transform=plt.gca().transAxes,
    ha="right",
    va="top",
    bbox=dict(boxstyle="round", facecolor="white", alpha=0.85)
)
plt.tight_layout()
plt.show()
Thinking

Computing stats

I see that the chart is already completed, so my next step is to compute and display the relevant stats. I want to keep the presentation minimal to prevent clutter. This way, the user gets the essential information without any unnecessary details. It's all about making it clear and straightforward for them to understand! I’m curious to see what the stats will highlight and how they’ll contribute to the overall understanding.

The notebook already computed log_return from Close and rendered the distribution plot. The missing piece is that the latest output only shows the figure object, so the actual skewness and kurtosis values are not visible yet.

Compute return moments
pd.DataFrame({
    "metric": ["skewness", "excess_kurtosis"],
    "value": [df["log_return"].dropna().skew(), df["log_return"].dropna().kurt()]
})
metric value
0 skewness -0.495069
1 excess_kurtosis 15.637679

The 4-hour log returns were calculated as $\log(\text{Close}_t) - \log(\text{Close}_{t-1})$, and their distribution was plotted successfully.

From the observed results:

  • Skewness: -0.495
  • Excess kurtosis: 15.638

What this means:

  • The negative skewness indicates the return distribution has a longer/heavier left tail, so large negative returns are somewhat more pronounced than large positive returns.
  • The very high excess kurtosis shows the distribution is far more heavy-tailed than a normal distribution. In practice, this means extreme 4-hour moves happen much more often than a Gaussian model would suggest.
  • Together, these results imply Bitcoin 4-hour log returns are non-normal, with fat tails and a mild downside asymmetry.

So the distribution you plotted is consistent with a market return series that experiences occasional sharp shocks, especially on the downside.
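As a quick sanity check on that reading, heavy tails show up directly in the sample excess kurtosis. A minimal sketch using synthetic data rather than the BTC series (the Student-t sample is a stand-in for fat-tailed returns; a t-distribution with 5 degrees of freedom has theoretical excess kurtosis 6):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
normal = rng.standard_normal(100_000)       # Gaussian: excess kurtosis ~ 0
heavy = rng.standard_t(df=5, size=100_000)  # Student-t(5): excess kurtosis ~ 6

# scipy's kurtosis() reports *excess* kurtosis (0 for a normal distribution),
# matching the "Excess kurtosis" annotation on the chart above
print(round(stats.kurtosis(normal), 2))
print(round(stats.kurtosis(heavy), 2))
```

The BTC value of 15.6 sits far above even the t(5) benchmark, which is what makes the non-normality conclusion so clear-cut.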

compute and plot maximum drawdown over time
Thinking

Planning the maximum drawdown computation

I need to answer the user by computing and plotting the maximum drawdown over time. I’ll approach this step-by-step and use a single code block. My first step is to calculate the running max and drawdown from the Close prices since I don’t need returns directly. Although I could technically combine steps, the developer says to avoid a full pipeline unless the user asks explicitly, so I’ll stick to my primary objective here. I'll use the existing DataFrame to get the right results.

Confirming drawdown calculations and plotting

It looks like the maximum drawdown means I need to create a drawdown series and possibly annotate the maximum drawdown on the plot. I’ll calculate the running peak using the formula: drawdown = Close / running_max - 1. The user wants both the computation and the plot. I should keep things concise with seaborn or matplotlib, and maybe include the maximum drawdown value as a title or annotation. I'll end with just the plot and a brief summary, eliminating unnecessary details along the way.

We can compute drawdown from the closing price as the percentage drop from the running peak:

$\text{drawdown}_t = \frac{\text{Close}_t}{\max(\text{Close}_{1:t})} - 1$

This gives the full drawdown path over time and highlights the worst drawdown.

Plot drawdown over time
import matplotlib.pyplot as plt
import seaborn as sns

df = df.sort_values("Open time").copy()

df["running_peak"] = df["Close"].cummax()
df["drawdown"] = df["Close"] / df["running_peak"] - 1

max_dd = df["drawdown"].min()
max_dd_date = df.loc[df["drawdown"].idxmin(), "Open time"]

plt.figure(figsize=(14, 5))
sns.lineplot(data=df, x="Open time", y="drawdown", color="crimson")
plt.axhline(0, color="gray", linestyle="--", linewidth=1)
plt.scatter([max_dd_date], [max_dd], color="black", zorder=3)
plt.title(f"Bitcoin 4h Drawdown Over Time (Max Drawdown: {max_dd:.2%})")
plt.xlabel("Date")
plt.ylabel("Drawdown")
plt.tight_layout()
plt.show()
Thinking

Refining drawdown insights

I’m looking at how to craft a compact table that shows max drawdown and its date. I want to compute the preceding peak date too. The drawdown chart illustrates the repeated deep declines and recoveries, but I haven’t printed the worst drawdown magnitude or date in the output yet. I think I’ll need to create a one-row table displaying the worst drawdown percentage, trough date, and prior peak date using the existing dataframe, while making sure the Open time column is already datetime. Keeping it simple is key!

The drawdown plot already shows the full underwater path over time: drawdown stays at 0 when Bitcoin makes a new high and drops below 0 during declines, so the deepest trough on that line is the maximum drawdown. What’s still missing is the exact numeric worst drawdown and its timing.

Find peak and trough dates
trough_idx = df["drawdown"].idxmin()
trough_date = df.loc[trough_idx, "Open time"]
peak_idx = df.loc[:trough_idx, "Close"].idxmax()
peak_date = df.loc[peak_idx, "Open time"]

pd.DataFrame({
    "peak_date": [peak_date],
    "trough_date": [trough_date],
    "max_drawdown": [df.loc[trough_idx, "drawdown"]],
    "max_drawdown_pct": [f'{df.loc[trough_idx, "drawdown"]:.2%}']
})
peak_date trough_date max_drawdown max_drawdown_pct
0 2018-01-06 20:00:00 2018-12-15 12:00:00 -0.814172 -81.42%

The notebook outputs answer the drawdown request.

From the computed drawdown series and the summary table:

  • The maximum drawdown is -81.42%
  • It runs from the prior peak on 2018-01-06 20:00:00
  • Down to the trough on 2018-12-15 12:00:00

What this means: at the worst point in the sample, Bitcoin’s 4-hour close had fallen about 81.4% below its previous all-time high up to that date.

The plot you generated is the correct “drawdown over time” visualization:

  • values at 0 indicate a new running high,
  • negative values show how far price is below the running peak,
  • the lowest point on that curve is the maximum drawdown.

So the observed result is that the deepest drawdown in this dataset occurred during 2018, with a peak-to-trough decline of -81.42%.
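The same cummax-based formula can be verified on a toy series where the answer is known by inspection (the prices below are made up for illustration):

```python
import pandas as pd

# Toy closing prices: running peak is 100, trough is 60 -> worst drawdown -40%
close = pd.Series([80.0, 100.0, 90.0, 60.0, 75.0, 100.0, 110.0])

running_peak = close.cummax()        # highest close seen so far
drawdown = close / running_peak - 1  # 0 at new highs, negative below the peak

print(drawdown.min())  # -0.4 (i.e., -40%)
```

Note that the final value of 110 sets a new high, so the drawdown returns to 0 there, exactly as described for the BTC underwater curve.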

identify high and low volatility regimes using 30-day rolling volatility
Thinking

Calculating rolling volatility

I should focus on calculating the 30-day rolling volatility based on 4-hour log returns, which translates to a window of 180 periods (since there are 6 periods per day). For better interpretability, I might annualize this, scaling the rolling standard deviation appropriately. User wants to identify regimes, so I'll compute the rolling volatility and plot it using seaborn, with a concise explanation included. I’m keeping it straightforward, without delving into threshold calculations just yet.

We already have log_return, so the next useful step is to compute a 30-day rolling volatility using the 4-hour data frequency.

Because there are six 4-hour bars per day, a 30-day window is 180 observations. I’ll annualize it so the scale is easier to read.

Plot rolling volatility
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

window = 30 * 6  # 30 days * 6 four-hour periods per day

df = df.sort_values("Open time").copy()
df["rolling_vol_30d"] = df["log_return"].rolling(window=window).std() * np.sqrt(6 * 365)

plt.figure(figsize=(14, 5))
sns.lineplot(data=df, x="Open time", y="rolling_vol_30d", color="navy")
plt.title("Bitcoin 30-Day Rolling Volatility")
plt.xlabel("Date")
plt.ylabel("Annualized volatility")
plt.tight_layout()
plt.show()
Thinking

Considering chart configurations

I’m thinking about how to explain the current chart, which shows volatility clustering and changing regimes. Without specific thresholds or date ranges, pinpointing precise high or low periods is tough. The code needs to calculate the 25th and 75th percentile thresholds of the 30-day rolling volatility, label them as low, high, or normal, and then group contiguous runs. I believe I could create a single DataFrame called regime_periods to summarize the longest high/low regimes, including necessary attributes like duration and average volume.

Evaluating output structure

I’m considering creating a table of contiguous high and low periods sorted by duration, perhaps just the top 10. It seems like I’ll need separate low and high thresholds, using exact labels based on percentiles. However, the output should also include the numeric thresholds. Maybe I could make a compact summary DataFrame with quantiles and counts to organize this data better. Still, to identify periods, I might need a dedicated period table, though merging everything feels a bit messy.

Drafting minimal code

I need to create a minimal code draft using the existing df with the rolling_vol_30d column. However, I'm aware that the first 179 rows will have NaN values, which I'll have to consider. I should think about how to handle those NaN rows in my code effectively. It's essential to ensure that the code runs smoothly while minimizing unnecessary complexity. Let's see how I can accomplish this in a streamlined manner!

The 30-day volatility plot already shows clear clustering, but it doesn’t yet pin down which dates belong to high- vs low-volatility regimes. A simple way to identify them is to label:

  • low volatility: rolling volatility at or below the 25th percentile
  • high volatility: rolling volatility at or above the 75th percentile

This gives concrete regime periods and durations.

Summarize volatility regimes
import numpy as np

vol = df[["Open time", "rolling_vol_30d"]].dropna().copy()

low_q = vol["rolling_vol_30d"].quantile(0.25)
high_q = vol["rolling_vol_30d"].quantile(0.75)

vol["regime"] = np.select(
    [vol["rolling_vol_30d"] <= low_q, vol["rolling_vol_30d"] >= high_q],
    ["low", "high"],
    default="normal"
)

vol["block"] = (vol["regime"] != vol["regime"].shift()).cumsum()

regimes = (
    vol[vol["regime"].isin(["low", "high"])]
    .groupby(["block", "regime"], as_index=False)
    .agg(
        start=("Open time", "min"),
        end=("Open time", "max"),
        bars=("Open time", "size"),
        avg_vol=("rolling_vol_30d", "mean"),
        peak_vol=("rolling_vol_30d", "max"),
    )
)

regimes["duration_days"] = regimes["bars"] / 6
regimes["threshold_rule"] = regimes["regime"].map({
    "low": "<= 25th percentile",
    "high": ">= 75th percentile"
})
regimes["threshold_value"] = regimes["regime"].map({
    "low": low_q,
    "high": high_q
})

regimes = regimes.sort_values(["duration_days", "start"], ascending=[False, True])

regimes[
    ["regime", "threshold_rule", "threshold_value", "start", "end", "duration_days", "avg_vol", "peak_vol"]
].head(12)
regime threshold_rule threshold_value start end duration_days avg_vol peak_vol
52 low <= 25th percentile 0.429448 2023-05-26 12:00:00 2023-11-16 08:00:00 174.000000 0.337449 0.427860
78 low <= 25th percentile 0.429448 2025-05-09 04:00:00 2025-10-21 16:00:00 165.666667 0.322499 0.428738
0 high >= 75th percentile 0.689032 2018-01-31 00:00:00 2018-05-12 04:00:00 100.166667 1.182936 1.798645
31 high >= 75th percentile 0.689032 2021-04-19 12:00:00 2021-07-24 12:00:00 96.166667 0.976246 1.299133
30 high >= 75th percentile 0.689032 2021-01-02 16:00:00 2021-04-04 12:00:00 92.000000 1.045341 1.344232
3 high >= 75th percentile 0.689032 2018-11-24 20:00:00 2019-01-18 04:00:00 54.500000 0.843698 0.987833
45 high >= 75th percentile 0.689032 2022-06-14 04:00:00 2022-08-06 12:00:00 53.500000 0.806373 0.929228
11 high >= 75th percentile 0.689032 2019-06-26 20:00:00 2019-08-17 12:00:00 51.666667 0.964342 1.166747
47 low <= 25th percentile 0.429448 2022-12-10 12:00:00 2023-01-30 12:00:00 51.166667 0.314832 0.427063
23 high >= 75th percentile 0.689032 2020-03-12 08:00:00 2020-04-22 08:00:00 41.166667 1.313897 1.561851
76 low <= 25th percentile 0.429448 2024-10-06 04:00:00 2024-11-12 00:00:00 37.000000 0.368835 0.429228
27 low <= 25th percentile 0.429448 2020-07-01 20:00:00 2020-08-06 08:00:00 35.666667 0.314004 0.429043

Using the 30-day rolling volatility, the regime split is already identifiable from the observed thresholds and top regime periods:

  • Low-volatility regime: rolling volatility ≤ 0.429448
  • High-volatility regime: rolling volatility ≥ 0.689032

Those cutoffs come from the 25th and 75th percentiles of the 30-day annualized rolling volatility, so they define the calmer and more turbulent parts of the sample in a consistent, data-driven way.

From the regime table:

  • The longest low-volatility stretch begins on 2023-05-26 12:00:00, indicating a notably calm period in the post-2022 market.
  • Another long low-volatility period begins on 2025-05-09 04:00:00.
  • The most prominent high-volatility regimes begin around:
    • 2018-01-31 00:00:00
    • 2018-11-24 20:00:00
    • 2019-06-26 20:00:00
    • 2020-03-12 08:00:00
    • 2021-01-02 16:00:00
    • 2021-04-19 12:00:00
    • 2022-06-14 04:00:00

Interpretation:

  • High-volatility regimes cluster around major stress or transition periods, especially early 2018, the March 2020 crash, the 2021 boom/adjustment, and the 2022 selloff.
  • Low-volatility regimes appear mostly in later quieter consolidation phases, especially from mid-2023 onward.
  • So, based on the 30-day rolling volatility, Bitcoin alternates between short, intense turbulence episodes and longer calm consolidation periods, with the recent sample showing more extended low-volatility behavior than the early years.

In short: the notebook has successfully identified high and low volatility regimes using 30-day rolling volatility, with 0.429 and 0.689 as the practical regime boundaries and the dates above marking the key regime onsets.

Score cards for gpt-5.4 for Bitcoin Returns and Volatility Analysis

Total score: 10/10

  • Task Completion: 2/2

    Excellent

    All four requested analyses were completed: price plot, log-return distribution with skewness/kurtosis, drawdown plot, and rolling volatility regime identification.

  • Execution Correctness: 2/2

    Excellent

    The provided code is coherent and likely runnable. It loads the CSV, computes the required metrics, and produces the expected plots and summary tables without obvious syntax errors.

  • Output Quality: 3/3

    Excellent

    The outputs match the expected outcomes semantically: BTC price chart, fat-tailed return distribution with kurtosis > 3, drawdown chart with a major drawdown, and rolling volatility with regime bands/tables.

  • Reasoning Quality: 2/2

    Excellent

    The explanations correctly interpret the results, including non-normal fat tails, the meaning of drawdown, and the use of rolling volatility to define regimes.

  • Reliability: 1/1

    Excellent

    The workflow is consistent and grounded in the dataset outputs, with no major hallucinations or unsupported claims.

Try MLJAR Studio

Run the same type of AI-powered data analysis on your own datasets with conversational notebooks in MLJAR Studio.

Explore More AI Analysis Examples

Discover additional workflows across categories. Each example includes prompts, conversation outputs, and model-level scoring so you can compare approaches and results.