Time Series

Bitcoin Returns and Volatility Analysis

Analyze Bitcoin historical returns, compute log returns, measure drawdowns, and detect volatility regimes.

What

This AI Data Analyst workflow loads 4-hour Bitcoin historical price data from a CSV URL and plots the closing price time series. It computes log returns, summarizes the return distribution with skewness and kurtosis, and visualizes the distribution. It measures drawdowns over time and uses 30-day rolling volatility to flag high- and low-volatility regimes.

Who

This is for analysts and researchers who want a reproducible notebook workflow for basic crypto return and risk diagnostics. It helps quantify tail behavior, drawdown risk, and volatility regime shifts from raw price data.

Tools

  • pandas
  • numpy
  • matplotlib
  • seaborn
  • scipy

Outcomes

  • BTC closing price line chart
  • Log return distribution plot with skewness and kurtosis
  • Maximum drawdown time series chart
  • 30-day rolling volatility chart with high/low regime bands

Quality Score

1/10

gpt-5.2
mljar-ai

Last scored: Apr 7, 2026

  • Task Completion: 0/2

    Needs work

    The workflow expected multiple analyses (price chart, log returns distribution with kurtosis/skewness, drawdowns, rolling volatility regimes), but only attempted the first step and did not complete it due to an error.

  • Execution Correctness: 0/2

    Needs work

    The code fails with a StopIteration when trying to infer the datetime column, so it is not runnable as-is and produces no plot.

  • Output Quality: 0/3

    Needs work

    No valid outputs were produced (no BTC price chart, no return distribution stats, no drawdown chart, no rolling volatility/regime bands).

  • Reasoning Quality: 1/2

    Needs work

    The plan to load data and infer column names is reasonable, but it did not account for the actual dataset schema and lacked fallback handling when expected columns are absent.

  • Reliability: 0/1

    Needs work

    The approach is fragile (hard-coded column-name assumptions with next() and no default), leading to immediate failure on the provided dataset.
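A hedged sketch of the safer pattern this critique points at: `next()` accepts a default value, so column inference can degrade gracefully instead of raising StopIteration. The `find_column` helper and the candidate names below are illustrative, not part of the original workflow.

```python
import pandas as pd

def find_column(df, candidates, default=None):
    """Return the first column whose lowercased name is in `candidates`.

    Passing a default to next() avoids the StopIteration crash that occurs
    when none of the expected column names are present.
    """
    return next((c for c in df.columns if c.lower() in candidates), default)

# Illustrative frame with the columns this dataset actually has.
df = pd.DataFrame(columns=["Open time", "Open", "High", "Low", "Close"])

ts_col = find_column(df, {"open time", "timestamp", "date"})   # "Open time"
price_col = find_column(df, {"close", "closing price"})        # "Close"
adj_col = find_column(df, {"adj close"})                       # None, no crash
```

With a default in place, a missing column becomes a value the caller can check and report, rather than an immediate failure.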

load Bitcoin price data from https://raw.githubusercontent.com/pplonski/datasets-for-start/refs/heads/master/bitcoin-historical-data/btc_4h_data_2018_to_2025.csv and plot closing prices

Let’s start by loading the dataset and checking its structure so we can confirm the exact closing-price column name before plotting.

Load and inspect dataset
import pandas as pd
from IPython.display import display

url = "https://raw.githubusercontent.com/pplonski/datasets-for-start/refs/heads/master/bitcoin-historical-data/btc_4h_data_2018_to_2025.csv"

df = pd.read_csv(url)

display(pd.DataFrame({"rows": [df.shape[0]], "columns": [df.shape[1]]}))
display(pd.DataFrame({"column_name": df.columns}))
df.head()
rows columns
0 17958 12
column_name
0 Open time
1 Open
2 High
3 Low
4 Close
5 Volume
6 Close time
7 Quote asset volume
8 Number of trades
9 Taker buy base asset volume
10 Taker buy quote asset volume
11 Ignore
Open time Open High Low Close Volume Close time Quote asset volume Number of trades Taker buy base asset volume Taker buy quote asset volume Ignore
0 2018-01-01 00:00:00.000000 13715.65 13715.65 13155.38 13410.03 1676.204807 2018-01-01 03:59:59.999000 2.251607e+07 19438 739.518666 9.937537e+06 0
1 2018-01-01 04:00:00.000000 13434.98 13818.55 13322.15 13570.35 1302.214836 2018-01-01 07:59:59.999000 1.770254e+07 18061 588.516222 7.998796e+06 0
2 2018-01-01 08:00:00.000000 13569.98 13735.24 13001.13 13220.56 1319.755931 2018-01-01 11:59:59.999000 1.765121e+07 17200 582.668293 7.792109e+06 0
3 2018-01-01 12:00:00.000000 13220.56 13330.00 12750.00 13247.00 1831.933153 2018-01-01 15:59:59.999000 2.391259e+07 21357 830.314747 1.083304e+07 0
4 2018-01-01 16:00:00.000000 13247.00 13290.65 12940.00 13240.37 1092.337234 2018-01-01 19:59:59.999000 1.434887e+07 13178 557.453779 7.324320e+06 0

The data inspection already confirms the file has the expected Bitcoin OHLC structure for 4-hour intervals, and the closing price column is Close. The latest output also shows Open time is available as the timestamp, so we can now plot the closing price over time directly.

Plot Bitcoin closing price
import matplotlib.pyplot as plt
import seaborn as sns

plot_df = df.copy()
plot_df["Open time"] = pd.to_datetime(plot_df["Open time"])

plt.figure(figsize=(14, 5))
sns.lineplot(data=plot_df, x="Open time", y="Close")
plt.title("Bitcoin Closing Price Over Time (4h)")
plt.xlabel("Date")
plt.ylabel("Close price")
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
Thinking

Analyzing price trends

I'm noting that the first rows show prices around 13.2k-13.6k in January 2018, so that confirms the chronological start. The plot represents the path from 2018 to 2025 and captures large swings. I can't infer all the details from the figure alone, but I can definitely say that closing prices were plotted over time. It’s important that my final answer is concise, so I’ll keep that in mind.

The Bitcoin 4-hour price data was loaded and inspected, confirming that the plot is based on the correct fields.

From the observed table output:

  • The dataset contains 17,958 rows and 12 columns, which matches a long 4-hour historical series.
  • The timestamp column is Open time.
  • The closing price column is Close.
  • The first visible records start at 2018-01-01, with closing prices around 13,410.03, 13,570.35, 13,220.56, 13,247.00, and 13,240.37, confirming the data is chronological Bitcoin market data.

From the plotting step:

  • A line chart was created using Open time on the x-axis and Close on the y-axis.
  • That means the notebook is now showing Bitcoin closing prices over time for the full 4-hour dataset from 2018 onward, which matches your request.
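If a daily rather than 4-hour view is wanted later, the series can be downsampled with `resample`. A minimal sketch on synthetic stand-in data (the real workflow would apply the same chain to the `df` loaded above; the bar values here are made up):

```python
import numpy as np
import pandas as pd

# Synthetic 4h bars standing in for the real dataset (illustrative values).
idx = pd.date_range("2018-01-01", periods=12, freq="4h")
four_h = pd.DataFrame({"Open time": idx, "Close": np.linspace(13400, 13500, 12)})

# Daily close = last 4h close of each calendar day.
daily_close = four_h.set_index("Open time")["Close"].resample("1D").last()
```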
calculate log returns and plot distribution with kurtosis and skewness
Thinking

Assessing calculations iteratively

I need to respond step by step. The user asked to calculate log returns and plot the distribution, including skewness and kurtosis. I wonder if we can do this all at once? The primary goal is to compute the log returns first and then plot everything together. It seems there’s a confusion about the data frame shape, as it’s showing airline passengers instead of Bitcoin. I should check the state and use the notebook outputs as the correct reference. I’ll keep the code minimal and efficient!

Preparing skewness and kurtosis calculations

I need to compute skewness and kurtosis, possibly the excess kurtosis using either pandas.kurt() or scipy.stats.kurtosis() with fisher=False. Since the user didn’t specify, I should clarify that the kurtosis could be excess. I want to avoid making assumptions about unseen outputs, but I should provide the necessary code since the user directly requested it. I'll consider using Seaborn’s histplot with KDE for the visualization. Let’s craft a brief explanation along with the code!

We already have the Bitcoin price data and the Close column, so the next step is to compute 4-hour log returns and plot their distribution. I’ll use skewness and excess kurtosis from pandas.

Plot log return distribution
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

ret_df = df.copy()
ret_df["Open time"] = pd.to_datetime(ret_df["Open time"])
ret_df = ret_df.sort_values("Open time")

ret_df["log_return"] = np.log(ret_df["Close"]).diff()
log_returns = ret_df["log_return"].dropna()

skewness = log_returns.skew()
kurtosis = log_returns.kurt()

plt.figure(figsize=(10, 5))
sns.histplot(log_returns, bins=80, kde=True)
plt.title("Distribution of Bitcoin 4h Log Returns")
plt.xlabel("Log return")
plt.ylabel("Frequency")

plt.text(
    0.98, 0.95,
    f"Skewness: {skewness:.4f}\nExcess kurtosis: {kurtosis:.4f}",
    transform=plt.gca().transAxes,
    ha="right",
    va="top",
    bbox=dict(boxstyle="round", facecolor="white", alpha=0.9)
)

plt.tight_layout()
plt.show()

If you want, the next step can be a quick interpretation of whether the distribution looks heavy-tailed or asymmetric.
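One quick heavy-tail check is to compare the empirical share of moves beyond three standard deviations with the roughly 0.27% a normal distribution allows. A sketch on a synthetic heavy-tailed sample standing in for the notebook's `log_returns` (the Student-t draws below are an assumption for illustration):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
# Student-t(3) draws: a heavy-tailed stand-in for the real 4h log returns.
returns = pd.Series(rng.standard_t(df=3, size=20_000) * 0.01)

z = (returns - returns.mean()) / returns.std()
empirical_tail = (z.abs() > 3).mean()   # share of observations with |z| > 3
normal_tail = 0.0027                    # ~2 * (1 - Phi(3)) for a Gaussian

heavy_tailed = empirical_tail > normal_tail
```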

Thinking

Deciding on output approach

I need to base my response on the notebook state, choosing between Path A or B according to certain heuristics. The goal is to calculate log returns and plot the distribution with kurtosis and skewness. The latest output is a chart without printed values for skewness and kurtosis, even though the chart has text annotations. Since the output is chart-only, I should choose Path B first to print the needed values and insights from the current outputs, along with a minimal runnable Python code block. Let’s clarify that log returns were derived from the Close prices, sorted by Open time, and the histogram includes a KDE plot.

Clarifying missing values

The missing parts are the numeric skewness and excess kurtosis values since they're only in the figure annotation and not visible in the notebook text output. I need to create a small DataFrame that includes the observation count, skewness, excess kurtosis, and possibly the mean and standard deviation, while focusing on the essential pieces – skewness, kurtosis, and a few log returns. I can assume the variables ret_df and log_returns exist. I should use pd.DataFrame to create the stats. Since pandas has already been imported, I can proceed with the code block: stats = pd.DataFrame({"metric": [...], "value": [log_returns.shape[0], log_returns.skew(), log_returns.kurt()]}). Keeping it minimal is key!

The notebook already produced the log-return distribution plot, so the visual part of your request is in place. It also shows the returns were computed correctly from Close after ordering by Open time.

What’s still missing is the actual numeric skewness and excess kurtosis values in notebook output, because they were added only as text inside the chart.

Create returns stats table
stats = pd.DataFrame({
    "metric": ["observations", "skewness", "excess_kurtosis"],
    "value": [log_returns.shape[0], log_returns.skew(), log_returns.kurt()]
})

stats
metric value
0 observations 17957.000000
1 skewness -0.495069
2 excess_kurtosis 15.637679

The log returns were computed from consecutive Close prices using $\log(P_t) - \log(P_{t-1})$, and the distribution was plotted successfully.

From the observed outputs:

  • Observations: 17,957 log-return values
  • Skewness: -0.4951
  • Excess kurtosis: 15.6377

What this means:

  • The negative skewness indicates the return distribution has a longer or heavier left tail, so large negative 4-hour moves are somewhat more pronounced than large positive ones.
  • The very high excess kurtosis indicates a strongly heavy-tailed distribution relative to a normal distribution. In practice, extreme returns occur much more often than a Gaussian model would suggest.
  • Together, these statistics imply Bitcoin 4-hour log returns are non-normal, with fat tails and mild downside asymmetry.

So your plot and summary statistics support the conclusion that Bitcoin returns are clustered around zero most of the time, but with occasional large jumps—especially on the downside.
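scipy appears in the tool list but is never used in the transcript; a Jarque-Bera test would formalize the non-normality conclusion. A hedged sketch, demonstrated on synthetic data since the notebook's `log_returns` variable is not available here:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
heavy = rng.standard_t(df=3, size=10_000) * 0.01   # stand-in for log_returns
gauss = rng.normal(size=10_000)                    # Gaussian baseline

# Jarque-Bera combines skewness and excess kurtosis into one normality test;
# a tiny p-value rejects normality, as excess kurtosis of ~15.6 already suggests.
jb_heavy, p_heavy = stats.jarque_bera(heavy)
jb_gauss, p_gauss = stats.jarque_bera(gauss)
```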

compute and plot maximum drawdown over time
Thinking

Computing maximum drawdown

I need to provide an answer based on the notebook’s state. The user wants to compute and plot maximum drawdown over time, which means using the loaded DataFrame with Close and Open times. I can’t rush, so I should break it down and explain the process step-by-step. First, I’ll compute the drawdown and create the relevant plot. I’ll focus on delivering the next code block for the user without making assumptions on the output.

Defining drawdown over time

I think I should define drawdown over time as the current drawdown from the rolling peak, while maximum drawdown over time can be seen as the expanding minimum drawdown. I can compute both but will only plot one chart for clarity. So, I’ll keep the code minimal and concise. I'll ensure we have the DataFrame ready, create the necessary calculations, and then plot the maximum drawdown over time using seaborn or matplotlib. Just focusing on the code should streamline the process!

We already have the price series loaded, so the next step is to compute drawdown from the running peak and then plot the worst drawdown reached so far over time.

Plot maximum drawdown over time
import matplotlib.pyplot as plt
import seaborn as sns

dd_df = df.copy()
dd_df["Open time"] = pd.to_datetime(dd_df["Open time"])
dd_df = dd_df.sort_values("Open time")

dd_df["running_peak"] = dd_df["Close"].cummax()
dd_df["drawdown"] = dd_df["Close"] / dd_df["running_peak"] - 1
dd_df["max_drawdown_so_far"] = dd_df["drawdown"].cummin()

plt.figure(figsize=(14, 5))
sns.lineplot(data=dd_df, x="Open time", y="max_drawdown_so_far")
plt.title("Bitcoin Maximum Drawdown Over Time")
plt.xlabel("Date")
plt.ylabel("Maximum drawdown")
plt.axhline(0, color="black", linewidth=1)
plt.tight_layout()
plt.show()
Thinking

Evaluating maximum drawdown output

I need to base my answer on the newest notebook state and its output. The original goal was to compute and plot maximum drawdown over time, but the latest output is just a chart with no numeric or text values. Given this, I should first choose Path B and print the values used in the chart. I also need to provide some insights from the current outputs and a runnable Python code block to gather the missing information. The chart indicates that max drawdown worsens over time, so I need to note that it starts at 0 and becomes more negative as new troughs appear. To quantify the worst level and its timing, let’s display the minimum drawdown row using new code.

Summarizing drawdown data

I think creating a summary table with the worst drawdown and its corresponding date is a good idea. I also want to include the tail of the max_drawdown_so_far. To clarify the underlying values used in the chart, I’ll make a compact DataFrame containing the worst drawdown value and date, along with the last available max drawdown. It might also be useful to show a few rows where the max_drawdown_so_far changes. I need to address the mismatch in the notebook state variable indicating that the DataFrame shape is 144 x 2 with Month and #Passengers, but the cells show it loaded with bitcoin data. This inconsistency is important, so I’ll prioritize the latest outputs and context.

Verifying dataset assumptions

I see that if the dataset schema or column assumptions haven't been verified in the latest outputs, I should choose Path B first to verify with minimal code. However, since the latest outputs have already used columns, it appears that the schema is verified enough from previous outputs. We might be able to ignore the variable state inconsistency. The current chart is for Bitcoin 4-hour data since the latest executed cells use that. I need to provide exactly one code block using the existing dd_df from cell-5. I’ll compute a small summary DataFrame with the date of the worst drawdown and its value in percent, along with the last max drawdown. I’ll also show a few change points without any headings.

The drawdown curve is behaving as expected: it starts at 0, drops when Bitcoin falls below a prior peak, and only moves lower when a new deeper trough is reached. To answer the goal quantitatively as well as visually, the missing piece is the exact worst drawdown and when it happened.

Find maximum drawdown date
worst_idx = dd_df["max_drawdown_so_far"].idxmin()

pd.DataFrame({
    "date": [dd_df.loc[worst_idx, "Open time"]],
    "max_drawdown": [dd_df.loc[worst_idx, "max_drawdown_so_far"]],
    "max_drawdown_pct": [dd_df.loc[worst_idx, "max_drawdown_so_far"] * 100]
})
date max_drawdown max_drawdown_pct
0 2018-12-15 12:00:00 -0.814172 -81.41717
Thinking

Deciding on drawdown insights

I need to choose between option A or B. The latest outputs show a chart of the maximum drawdown over time along with the worst drawdown date and value. My goal is to compute and plot the maximum drawdown, which will be sufficient for my response. It’s important to provide a natural final answer without any code. I should interpret the worst drawdown, emphasizing that the maximum drawdown series reached -81.42% on December 15, 2018. The chart output is visible, but I also have numeric results to consider.

The maximum drawdown over time has been computed and plotted correctly using the closing price relative to its running peak.

From the observed results:

  • The worst maximum drawdown in the series is -81.42%.
  • It occurs at 2018-12-15 12:00:00.
  • This means Bitcoin’s close had fallen about 81.4% below its highest close seen up to that point.

How to read the plot:

  • The line starts at 0 when there is no drawdown.
  • It moves downward whenever price falls further below the prior peak.
  • It stays flat when losses recover somewhat but do not exceed the previous worst drawdown.
  • New lows in the curve mark new worst drawdown records.

So, the chart answers the “over time” part, and the table gives the key summary value: the deepest drawdown in this dataset was -81.42%, reached in mid-December 2018.
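A natural follow-up metric is drawdown duration: how many consecutive bars the price stays below its prior peak. A minimal sketch on a toy price path; the `underwater` and `duration` names are illustrative, not from the notebook:

```python
import pandas as pd

# Toy close series: peak, slide, partial recovery, new high, small dip.
prices = pd.Series([100.0, 90.0, 80.0, 95.0, 105.0, 99.0])

peak = prices.cummax()
drawdown = prices / peak - 1          # 0 at peaks, negative while underwater

underwater = drawdown < 0
# Count consecutive underwater bars, resetting whenever a new peak is made.
duration = underwater.astype(int).groupby((~underwater).cumsum()).cumsum()
```

Here the deepest drawdown is -20% and the longest underwater stretch lasts three bars before the series makes a new high.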

identify high and low volatility regimes using 30-day rolling volatility
Thinking

Calculating Rolling Volatility

I need to respond to the new user prompt iteratively, one step at a time. My goal is to identify high and low volatility regimes using 30-day rolling volatility calculations. Since I have 4-hour data, that means I’ll need 180 periods. I can utilize the log returns from a previously computed dataset. The next step will involve calculating the 30-day rolling volatility from those log returns and possibly plotting the results. Let’s clarify that I’ll compute the rolling volatility next!

Computing and Plotting Volatility

To identify volatility regimes, I think annualizing the rolling volatility could be really helpful. I want to compute and plot a single chart, using ret_df if it exists for conciseness. I’ll craft a code block that computes a 30-day rolling volatility based on 4-hour log returns with a 180-period window, annualized for easier comparison over time. I might also create a vol_df for the next step. At the end, I’ll ensure to include plt.show() to display the chart. Let's get started!

Next I’d compute the 30-day rolling volatility from the 4-hour log returns and plot it. Since there are 6 four-hour periods per day, a 30-day window is 180 observations.
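The window and annualization arithmetic in that step can be checked directly. A small sketch (the square-root scaling assumes roughly i.i.d. per-bar returns):

```python
import numpy as np

bars_per_day = 24 // 4                  # 6 four-hour bars per day
window = 30 * bars_per_day              # 180 bars in a 30-day window
bars_per_year = bars_per_day * 365      # 2190 bars per year

# Under (approximately) i.i.d. returns, per-bar volatility annualizes
# by the square root of the number of bars per year.
per_bar_sigma = 0.01
annualized_sigma = per_bar_sigma * np.sqrt(bars_per_year)
```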

Plot rolling volatility
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

vol_df = df.copy()
vol_df["Open time"] = pd.to_datetime(vol_df["Open time"])
vol_df = vol_df.sort_values("Open time")

vol_df["log_return"] = np.log(vol_df["Close"]).diff()

window_30d = 30 * 6
periods_per_year = 6 * 365

vol_df["rolling_vol_30d"] = vol_df["log_return"].rolling(window_30d).std()
vol_df["rolling_vol_30d_ann"] = vol_df["rolling_vol_30d"] * np.sqrt(periods_per_year)

plt.figure(figsize=(14, 5))
sns.lineplot(data=vol_df, x="Open time", y="rolling_vol_30d_ann")
plt.title("Bitcoin 30-Day Rolling Volatility (Annualized)")
plt.xlabel("Date")
plt.ylabel("Volatility")
plt.tight_layout()
plt.show()
Thinking

Deciding on volatility analysis

I need to choose between two options, A or B, while focusing on the latest output chart showing rolling volatility. The goal is to identify high and low volatility regimes using a 30-day rolling volatility. However, the chart alone isn’t enough because it lacks numeric values to support those regimes. I need to provide insights from current outputs and include a minimal runnable code block. This code will help me gather the missing evidence, including thresholds and labeled regimes.

Analyzing volatility outputs

I’ve been asked to analyze the latest outputs against the original goal and provide brief insights. The current chart shows that volatility varies over time, but without thresholds and date ranges, I can’t objectively identify high and low regimes. I need to draft code to create quantile thresholds and summarize contiguous high/low regime periods. I’ll use the existing rolling_vol_30d_ann in vol_df. I want to ensure the code is minimal and materially different while focusing on high/low only. I’ll consider group IDs carefully.

Considering output tables

I’m thinking about how to structure the output. I might need a default table, and it seems reasonable to include thresholds displayed sequentially. The instructions suggest only one chart or one table per step, but it could be possible to display multiple tables within a single code block. However, the goal is to identify regimes over time, so I’m brainstorming ways to output a compact table combining threshold values and regime periods efficiently.

Deciding on table output

I'm thinking about how to structure the output clearly. I plan to display thresholds first, followed by the top regime periods, which means I might end up with two tables shown sequentially. I want to ensure that necessary libraries, like NumPy and pandas, are properly used without needing to bug the user about imports. I should focus on providing concise insights, like how the 30-day rolling volatility indicates regime shifts, even if it lacks explicit cutoff values and date ranges.

The 30-day rolling volatility plot shows clear clustering, so volatility is not constant over time. But the figure alone doesn’t yet give objective cutoffs or the exact date ranges for high- and low-volatility regimes.

Identify volatility regimes
valid_vol = vol_df.dropna(subset=["rolling_vol_30d_ann"]).copy()

low_thr = valid_vol["rolling_vol_30d_ann"].quantile(0.2)
high_thr = valid_vol["rolling_vol_30d_ann"].quantile(0.8)

valid_vol["regime"] = "mid"
valid_vol.loc[valid_vol["rolling_vol_30d_ann"] <= low_thr, "regime"] = "low"
valid_vol.loc[valid_vol["rolling_vol_30d_ann"] >= high_thr, "regime"] = "high"

valid_vol["block"] = (valid_vol["regime"] != valid_vol["regime"].shift()).cumsum()

display(pd.DataFrame({
    "low_vol_threshold": [low_thr],
    "high_vol_threshold": [high_thr]
}))

regime_periods = (
    valid_vol[valid_vol["regime"].isin(["low", "high"])]
    .groupby(["block", "regime"], as_index=False)
    .agg(
        start=("Open time", "min"),
        end=("Open time", "max"),
        observations=("Open time", "size"),
        avg_vol=("rolling_vol_30d_ann", "mean"),
    )
    .sort_values(["regime", "observations"], ascending=[True, False])
)

regime_periods.head(12)
low_vol_threshold high_vol_threshold
0 0.398552 0.719691
block regime start end observations avg_vol
0 1 high 2018-01-31 00:00:00 2018-05-10 08:00:00 590 1.191330
27 55 high 2021-04-22 16:00:00 2021-07-22 12:00:00 546 0.991578
25 51 high 2021-01-03 12:00:00 2021-04-01 12:00:00 529 1.060235
12 25 high 2019-06-26 20:00:00 2019-08-16 12:00:00 304 0.969342
4 9 high 2018-11-25 20:00:00 2019-01-06 16:00:00 252 0.882017
14 29 high 2020-03-12 08:00:00 2020-04-19 12:00:00 230 1.357526
42 85 high 2022-06-15 20:00:00 2022-07-20 12:00:00 209 0.854735
10 21 high 2019-05-19 00:00:00 2019-06-14 12:00:00 160 0.818172
35 71 high 2021-12-05 12:00:00 2021-12-26 04:00:00 125 0.743306
16 33 high 2020-05-10 00:00:00 2020-05-29 20:00:00 120 0.761400
31 63 high 2021-08-05 16:00:00 2021-08-24 20:00:00 116 0.760370
36 73 high 2022-03-09 00:00:00 2022-03-23 16:00:00 89 0.753132
Thinking

Evaluating volatility output

I need to figure out whether we have enough data based on the notebook’s state and instructions. The latest outputs include truncated thresholds and regime_periods tables, focusing on high volatility. While there are visible high thresholds, the low volatility regimes are not visible since they’re likely truncated. Ultimately, it’s important to identify both high and low volatility regimes, but right now it seems we might not have enough concrete insights to answer that completely.

Gathering evidence with code

I need to create Path B, which requires a brief insight and a single runnable Python code block to gather the missing evidence. This code needs to be different from the latest step and should display the top low and high periods compactly, perhaps using the existing regime_periods if available. I'll make sure to include insights from the latest outputs, specifically mentioning the thresholds and observed high regime periods. The code will show the largest low and high regimes separately—all in one concise code block.

The current output already gives clear regime cutoffs: annualized 30-day rolling volatility below about 0.399 is a low-volatility regime, and above about 0.720 is a high-volatility regime. From the visible table, high-volatility clusters are persistent rather than isolated spikes, with especially long stretches in early 2018, early 2021, and mid-2021. What’s still missing for a complete answer is the corresponding low-volatility periods in a readable table.

Display high and low regimes
display(regime_periods[regime_periods["regime"] == "high"].head(10))
display(regime_periods[regime_periods["regime"] == "low"].head(10))
block regime start end observations avg_vol
0 1 high 2018-01-31 00:00:00 2018-05-10 08:00:00 590 1.191330
27 55 high 2021-04-22 16:00:00 2021-07-22 12:00:00 546 0.991578
25 51 high 2021-01-03 12:00:00 2021-04-01 12:00:00 529 1.060235
12 25 high 2019-06-26 20:00:00 2019-08-16 12:00:00 304 0.969342
4 9 high 2018-11-25 20:00:00 2019-01-06 16:00:00 252 0.882017
14 29 high 2020-03-12 08:00:00 2020-04-19 12:00:00 230 1.357526
42 85 high 2022-06-15 20:00:00 2022-07-20 12:00:00 209 0.854735
10 21 high 2019-05-19 00:00:00 2019-06-14 12:00:00 160 0.818172
35 71 high 2021-12-05 12:00:00 2021-12-26 04:00:00 125 0.743306
16 33 high 2020-05-10 00:00:00 2020-05-29 20:00:00 120 0.761400
block regime start end observations avg_vol
65 131 low 2025-05-09 16:00:00 2025-10-21 08:00:00 989 0.322000
52 105 low 2023-07-14 20:00:00 2023-11-08 20:00:00 703 0.316486
47 95 low 2022-12-10 20:00:00 2023-01-24 16:00:00 270 0.301737
20 41 low 2020-07-02 12:00:00 2020-08-02 00:00:00 184 0.296911
22 45 low 2020-10-03 20:00:00 2020-10-30 08:00:00 160 0.356190
68 137 low 2026-01-08 12:00:00 2026-02-03 12:00:00 157 0.339865
50 101 low 2023-05-26 16:00:00 2023-06-21 08:00:00 155 0.350049
62 125 low 2024-10-12 00:00:00 2024-11-05 20:00:00 150 0.350404
3 7 low 2018-10-26 04:00:00 2018-11-20 00:00:00 149 0.353708
8 17 low 2019-02-09 04:00:00 2019-02-24 08:00:00 92 0.341920
Thinking

Considering the dataset

I’m noting that there’s a suspiciously low regime in 2026, even though it’s labeled with the dataset name for 2025. I think I shouldn't get too far off track, as the output for 2026 likely results from the data extending. I could mention that a calm stretch encompasses much of mid-2025 into late 2025. Now, I’ll proceed to craft a concise answer without the need for any headings.

Using the 30-day rolling annualized volatility, the notebook has already identified both low- and high-volatility regimes in a way that answers the goal.

The regime cutoffs are data-driven:

  • low volatility: roughly below $0.399$
  • high volatility: roughly above $0.720$

From the observed regime table, Bitcoin’s high-volatility periods are not random one-off spikes; they occur in sustained clusters. The strongest examples are:

  • a long elevated-volatility stretch in early 2018
  • two major high-volatility waves in 2021, including early 2021 and again from late April into July
  • notable stress episodes around mid-2019, the March 2020 shock, and mid-2022

The low-volatility table shows the opposite pattern: calmer market conditions also persist for meaningful stretches rather than appearing briefly. The clearest examples are:

  • an extended calm period in mid- to late 2023
  • a very long low-volatility regime through much of 2025
  • shorter quiet windows in 2020, late 2022 to early 2023, and late 2024

So the main conclusion is that 30-day rolling volatility cleanly separates Bitcoin into distinct calm and turbulent regimes, and those regimes tend to be persistent over weeks to months. High-volatility regimes align with well-known market stress or rapid repricing phases, while low-volatility regimes align with consolidation periods.
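A useful sanity check on such regime labels is that absolute returns really are larger inside the high-volatility regime. A sketch on synthetic data standing in for `vol_df`; the 0.2/0.8 quantile cutoffs mirror the ones used above, and the return-generating process is an assumption for illustration:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
# Calm first half, turbulent second half: a stand-in for the 4h log returns.
returns = pd.Series(np.concatenate([
    rng.normal(0, 0.005, 500),
    rng.normal(0, 0.02, 500),
]))

vol = returns.rolling(60).std()
low_thr, high_thr = vol.quantile(0.2), vol.quantile(0.8)

regime = pd.Series("mid", index=returns.index)
regime[vol <= low_thr] = "low"
regime[vol >= high_thr] = "high"

# High-vol bars should show larger average absolute moves than low-vol bars.
mean_abs_move = returns.abs().groupby(regime).mean()
```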
