Apple Stock Price Analysis in Python

Load Apple stock price data, compute moving averages, calculate daily returns, visualize volatility, and compute the Sharpe ratio.

This page shows an AI data analysis example. A workflow is a list of prompts (instructions) used to complete one analysis task step by step. We send the same prompt list to multiple AI models, then compare their notebook outputs and scores.

What this AI workflow does

This AI Data Analyst workflow pulls the last two years of AAPL OHLCV data using yfinance and organizes it into a pandas time-series DataFrame. It generates 20-day and 50-day moving averages, plots price with the moving-average overlays, and highlights crossover behavior visually. It computes daily returns, visualizes the return distribution, then calculates 30-day rolling volatility and an annualised Sharpe ratio.

Who this example is for

This is for analysts and students who want a reproducible Python notebook for basic equity time-series analysis on Apple stock. It helps users learn how to compute and interpret moving averages, returns, volatility, and risk-adjusted performance from market data.

Expected analysis outcomes

These are the results the AI workflow is expected to generate.

  • AAPL OHLCV DataFrame for ~2 years of trading days
  • Price chart with MA20 and MA50 overlays and visible crossovers
  • Daily returns series and histogram of returns
  • 30-day rolling volatility plot
  • Printed annualised Sharpe ratio

Tools and libraries used

Main Python packages and tooling used to run this AI data analysis task.

  • yfinance
  • pandas
  • numpy
  • matplotlib
  • seaborn

Prompt sequence

This is the exact list of prompts used in this workflow. The same prompt sequence is sent to each model so outputs and scores can be compared fairly.

  1. load AAPL stock prices for the last 2 years using yfinance
  2. add 20-day and 50-day moving averages and plot
  3. calculate daily returns and plot their distribution
  4. compute 30-day rolling volatility and annualised Sharpe ratio

Model Comparison

We compared several LLMs on the same analysis task. The table below shows their scores side by side. You can also open each model run to check the full conversation and notebook results.

Model             Score  Task  Execution  Output  Reasoning  Reliability  Run
gemma4:31b        10/10  2/2   2/2        3/3     2/2        1/1          Open gemma4:31b conversation
gpt-5.4           10/10  2/2   2/2        3/3     2/2        1/1          Open gpt-5.4 conversation
gpt-oss:120b      10/10  2/2   2/2        3/3     2/2        1/1          Open gpt-oss:120b conversation
glm-5.1           9/10   2/2   2/2        3/3     1/2        1/1          Open glm-5.1 conversation
qwen3.5:397b      7/10   2/2   1/2        3/3     1/2        0/1          Open qwen3.5:397b conversation
qwen3-coder-next  6/10   2/2   1/2        2/3     1/2        0/1          Open qwen3-coder-next conversation

How scoring works

Each run is graded on five dimensions:

  • Task Completion (0-2): how fully the workflow answered the requested analysis task and completed required steps.
  • Execution Correctness (0-2): whether generated code is correct, consistent, and likely runnable without major fixes.
  • Output Quality (0-3): how well tables, charts, and results match the expected analytical outcomes.
  • Reasoning Quality (0-2): clarity and correctness of interpretation, explanation, and conclusions.
  • Reliability (0-1): robustness of the run, including low hallucination risk and stable behavior.

The final score is the sum of these dimensions, so total score = 0-10. Higher values indicate more complete, correct, and reliable notebook results for this workflow.
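As a quick arithmetic check, the total for any run can be reproduced by summing its five dimension scores. The sketch below uses the glm-5.1 and qwen3-coder-next rows from the comparison table. The dictionary layout and the `total_score` helper are illustrative assumptions, not part of the actual scoring tool.

```python
# Maximum points per dimension, as listed in the rubric.
MAX_POINTS = {
    "task_completion": 2,
    "execution_correctness": 2,
    "output_quality": 3,
    "reasoning_quality": 2,
    "reliability": 1,
}


def total_score(scores):
    """Sum the five dimension scores after checking each is within range."""
    for name, value in scores.items():
        if not 0 <= value <= MAX_POINTS[name]:
            raise ValueError(f"{name} must be between 0 and {MAX_POINTS[name]}")
    return sum(scores.values())


# Dimension scores copied from the comparison table.
glm_5_1 = {
    "task_completion": 2,
    "execution_correctness": 2,
    "output_quality": 3,
    "reasoning_quality": 1,
    "reliability": 1,
}
qwen3_coder_next = {
    "task_completion": 2,
    "execution_correctness": 1,
    "output_quality": 2,
    "reasoning_quality": 1,
    "reliability": 0,
}

print(total_score(glm_5_1))           # 9
print(total_score(qwen3_coder_next))  # 6
```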

AI Analysis with gpt-5.4

This run uses AI with a step-by-step list of prompts to perform the analysis task Apple Stock Price Analysis in Python. Below, you can review the full conversation generated with gpt-5.4. After the conversation, we show detailed score cards for this run.

load AAPL stock prices for the last 2 years using yfinance
Thinking

Loading stock prices step-by-step

I need to help the user load AAPL stock prices for the last two years using yfinance. First, I should load the dataset and inspect it by showing its shape, columns, and head. I might need to include an installation step in case yfinance isn't installed yet. I’ll also implement a try/except for better error handling. After loading the data, I should reset the index to display the date as a column. I'll aim to keep it minimal while allowing multiple outputs when necessary.

Constructing data loading code

I realize that head() produces another table, but I need to show multiple tables sequentially using display(). It's also important to mention if the auto_adjust feature should be used. I might use yfinance.download("AAPL", period="2y") for retrieving the data. Including progress=False and actions=False could be good defaults. Now, I need to craft a concise explanation alongside the code for better clarity. I'll ensure everything is straightforward for the user!

Let’s load the last 2 years of AAPL data with yfinance and do the initial inspection only.

Download and preview AAPL data
import pandas as pd
import yfinance as yf
from IPython.display import display

aapl = yf.download("AAPL", period="2y", progress=False)

summary = pd.DataFrame({
    "metric": ["rows", "columns", "column_names"],
    "value": [aapl.shape[0], aapl.shape[1], list(aapl.columns)]
})

display(summary)
display(aapl.head())

aapl
metric value
0 rows 501
1 columns 5
2 column_names [(Close, AAPL), (High, AAPL), (Low, AAPL), (Op...
Price Close High Low Open Volume
Ticker AAPL AAPL AAPL AAPL AAPL
Date
2024-04-11 173.461334 173.877560 166.643394 166.821764 91070300
2024-04-12 174.957764 176.751438 172.638871 172.688408 101670900
2024-04-15 171.132538 175.037006 170.944249 173.778456 73531800
2024-04-16 167.852402 172.192889 166.752412 170.201022 73711200
2024-04-17 166.484863 169.110958 166.484863 168.080344 50901200
Price Close High Low Open Volume
Ticker AAPL AAPL AAPL AAPL AAPL
Date
2024-04-11 173.461334 173.877560 166.643394 166.821764 91070300
2024-04-12 174.957764 176.751438 172.638871 172.688408 101670900
2024-04-15 171.132538 175.037006 170.944249 173.778456 73531800
2024-04-16 167.852402 172.192889 166.752412 170.201022 73711200
2024-04-17 166.484863 169.110958 166.484863 168.080344 50901200
... ... ... ... ... ...
2026-04-06 258.859985 262.160004 256.459991 256.510010 29329900
2026-04-07 253.500000 256.200012 245.699997 256.160004 62148000
2026-04-08 258.899994 259.750000 256.529999 258.450012 41032800
2026-04-09 260.489990 261.119995 256.070007 259.000000 28121600
2026-04-10 260.480011 262.190002 259.023102 259.950012 28218689

501 rows × 5 columns

You now have AAPL stock prices loaded from yfinance for roughly the last 2 years.

From the notebook output:

  • The dataset contains 501 trading-day rows and 5 columns.
  • The available fields are Close, High, Low, Open, and Volume for AAPL.
  • The data is indexed by Date, which is what you want for time-series stock prices.
  • The preview shows daily observations starting around 2024-04-11, which matches a 2-year pull from the current notebook date context.

One detail visible in the output: yfinance returned the columns as a MultiIndex like (Close, AAPL), (High, AAPL), etc. That’s normal for this download format, even for a single ticker.
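If single-level column names are more convenient, the ticker level can be dropped. The sketch below builds a small stand-in DataFrame with the same two-level column layout as the output above (levels named Price and Ticker) instead of calling yfinance, so the values are rounded placeholders for illustration only.

```python
import pandas as pd

# Stand-in for the yfinance result: two column levels, (Price, Ticker).
columns = pd.MultiIndex.from_product(
    [["Close", "High", "Low", "Open", "Volume"], ["AAPL"]],
    names=["Price", "Ticker"],
)
demo = pd.DataFrame(
    [
        [173.5, 173.9, 166.6, 166.8, 91070300],
        [175.0, 176.8, 172.6, 172.7, 101670900],
    ],
    index=pd.to_datetime(["2024-04-11", "2024-04-12"]).rename("Date"),
    columns=columns,
)

# Drop the ticker level so columns can be selected as demo_flat["Close"].
demo_flat = demo.droplevel("Ticker", axis=1)

print(list(demo_flat.columns))
```

With a flat frame, `demo_flat["Close"]` replaces the tuple lookup `demo[("Close", "AAPL")]` used in the run.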

add 20-day and 50-day moving averages and plot
Thinking

Planning the response

I see I need to respond in one step since the user wants to add moving averages (MAs) and plot a chart. The developer guidelines suggest focusing on one main objective per response, which ideally should be a single chart or table. However, since the user asked for both, I think this could be acceptable in one code block.

Given that plotting requires computation, I'll use seaborn or matplotlib. I need to think about the DataFrame structure, possibly creating a flat DataFrame for clarity while preserving the original data. I'll also create a simple DataFrame to use for plotting with columns for 'Close', 'MA20', and 'MA50'.

Crafting the code

The user wants to add 20 and 50-day moving averages and plot them, so I could create a new DataFrame, maybe called aapl_ma, or I could mutate aapl. Adding columns to a copy might be the best way to go here. I also need to note that yfinance returned a MultiIndex, which means I'll build a clean table for plotting.

I’m aiming for a single code block to keep everything tidy. This code should end with a final expression, possibly plt.show(). I'll use seaborn's sns.lineplot for the chart, focusing on clarity and conciseness while ensuring I adhere to the guidelines of one chart or table at a time.

Since aapl has MultiIndex columns, I’ll make a clean price table with Close, MA20, and MA50, then plot those three lines.

Plot moving averages
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

close = aapl[("Close", "AAPL")].rename("Close")

plot_df = pd.DataFrame({"Close": close})
plot_df["MA20"] = plot_df["Close"].rolling(20).mean()
plot_df["MA50"] = plot_df["Close"].rolling(50).mean()

plt.figure(figsize=(12, 6))
sns.lineplot(data=plot_df[["Close", "MA20", "MA50"]])
plt.title("AAPL Close Price with 20-Day and 50-Day Moving Averages")
plt.xlabel("Date")
plt.ylabel("Price")
plt.legend(["Close", "20-day MA", "50-day MA"])
plt.tight_layout()
plt.show()
Thinking

Evaluating output paths

I need to consider two choices, A and B, based on the latest outputs. The most recent output is just a chart with no accompanying numeric or text values, which complicates things. Heuristics suggest that if I have both a valid chart and supporting data, I should go with Path A, but here I'm limited. I should start with Path B to print the values used in the chart and provide minimal runnable code. I'll likely display the last few rows of the plot_df.

Summarizing variable states

It looks like in the notebook state, the variables only show 10 truncated items for AAPL due to truncation. I think it's safe to recreate this from AAPL minimally. I need a final expression at the end, and I want to keep my explanation concise with insights from the latest outputs. The plot rendered without error and used the AAPL MultiIndex close series, indicating that the moving average calculations likely succeeded. I’ll need to code the last 10 rows for clarity without asking to run anything.

The plot rendered, so the 20-day and 50-day moving-average lines were successfully added to the AAPL closing price series. The only thing still missing is the underlying MA values in text form, since the latest output is just the figure.

Compute moving averages
close = aapl[("Close", "AAPL")].rename("Close")

ma_df = close.to_frame()
ma_df["MA20"] = ma_df["Close"].rolling(20).mean()
ma_df["MA50"] = ma_df["Close"].rolling(50).mean()

ma_df.tail(10)
Close MA20 MA50
Date
2026-03-27 248.800003 255.375999 260.207576
2026-03-30 246.630005 254.471500 259.980804
2026-03-31 253.789993 253.973499 259.950782
2026-04-01 255.630005 253.629000 260.133994
2026-04-02 255.919998 253.410500 260.304025
2026-04-06 258.859985 253.480499 260.518867
2026-04-07 253.500000 253.161499 260.632705
2026-04-08 258.899994 253.064999 260.707280
2026-04-09 260.489990 253.048999 260.756509
2026-04-10 260.480011 253.285000 260.842104

The 20-day and 50-day moving averages are now added and plotted against AAPL’s closing price.

From the latest values shown, the short-term average is below the longer-term average:

  • On 2026-04-10, MA20 is about 253.285
  • MA50 is about 260.842
  • The closing price is 260.480

That means the recent 20 trading days have been weaker on average than the last 50 trading days, even though the latest close has rebounded close to the 50-day average. In the rows shown, the 20-day moving average drifts down from about 255.4 to about 253.0 before ticking up to 253.3 on the last day, while the 50-day moving average stays higher and edges up from about 260.0 to 260.8. This suggests a tentative short-term recovery that has not yet lifted the 20-day average back to the 50-day level.

The plot and the value table together fully support the moving-average view you asked for.
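Crossovers can also be located numerically, as the dates where the sign of MA20 minus MA50 flips. The following is a minimal sketch on a synthetic price series, using shorter windows (3 and 5) so the example stays small. The `find_crossovers` helper and the made-up prices are assumptions for illustration, not something the run produced.

```python
import numpy as np
import pandas as pd


def find_crossovers(close, short=20, long=50):
    """Return 'bullish'/'bearish' labels indexed by the date of each crossover."""
    ma_short = close.rolling(short).mean()
    ma_long = close.rolling(long).mean()
    # +1 where the short MA is above the long MA, -1 where it is below.
    side = np.sign(ma_short - ma_long).dropna()
    side = side[side != 0]  # ignore exact ties
    # A change of +2 means below -> above, -2 means above -> below.
    flips = side.diff().dropna()
    flips = flips[flips != 0]
    return flips.map({2.0: "bullish", -2.0: "bearish"})


# Synthetic prices: a decline, a rally, then another decline.
prices = np.concatenate([
    np.linspace(100, 80, 30),
    np.linspace(80, 120, 40),
    np.linspace(120, 90, 30),
])
demo_close = pd.Series(
    prices,
    index=pd.date_range("2024-01-01", periods=len(prices), freq="B"),
)

crossovers = find_crossovers(demo_close, short=3, long=5)
print(crossovers)
```

On the real data, the same helper could be called with the default 20 and 50 windows on the Close series.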

calculate daily returns and plot their distribution
Thinking

Determining plotting steps

I need to respond based on the current notebook state and the user’s changed prompt: calculating daily returns and plotting their distribution. My goal is to follow iterative rules while analyzing outputs. It seems I can calculate and plot in one response since the primary objective is the distribution plot after calculating the necessary series. I’ll stick to one code block and keep it concise, using seaborn or matplotlib for this. I'll finish with plt.show() if plotting is involved.

Creating charts from data

I need to explain and provide code based on the notebook state with MultiIndex columns. First, I’ll compute the daily returns using close.pct_change().dropna(), and since daily returns are often expressed in percentages, I’ll multiply them by 100. I’ll create a small DataFrame or series and utilize sns.histplot with kde=True for plotting. It's essential to include a title and labels for clarity while ensuring there's only one chart in the output. I'll focus on concise code.

Next, compute AAPL daily returns from the close price and plot their distribution.

Plot AAPL returns distribution
import seaborn as sns
import matplotlib.pyplot as plt

close = aapl[("Close", "AAPL")].rename("Close")
daily_returns = close.pct_change().dropna() * 100

plt.figure(figsize=(10, 6))
sns.histplot(daily_returns, bins=40, kde=True)
plt.title("Distribution of AAPL Daily Returns")
plt.xlabel("Daily Return (%)")
plt.ylabel("Frequency")
plt.tight_layout()
plt.show()

The distribution plot is in place, so the daily returns were computed successfully from the AAPL close series. What’s still missing is the numeric evidence behind that chart, so we can describe the distribution in concrete terms rather than only visually.

Summarize daily returns
daily_returns.describe(percentiles=[0.01, 0.05, 0.25, 0.5, 0.75, 0.95, 0.99]).to_frame("daily_return_pct")
daily_return_pct
count 500.000000
mean 0.097062
std 1.783079
min -9.245603
1% -4.817001
5% -2.704837
25% -0.673727
50% 0.140025
75% 0.786808
95% 2.435546
99% 4.317378
max 15.328838

You calculated daily returns as percentage changes in AAPL’s closing price and plotted their distribution.
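For reference, a percentage daily return is the change from the previous close divided by the previous close. The tiny worked example below, with made-up prices, shows that `pct_change()` matches the manual formula.

```python
import pandas as pd

# Made-up closing prices for three consecutive days.
demo_close = pd.Series([100.0, 110.0, 99.0], name="Close")

# Manual formula: (today - yesterday) / yesterday, expressed in percent.
manual = (demo_close - demo_close.shift(1)) / demo_close.shift(1) * 100

# Built-in equivalent used in the run.
builtin = demo_close.pct_change() * 100

print(builtin.dropna().round(2).tolist())  # [10.0, -10.0]
```

The first row has no previous close, so it is NaN, which is why the run calls `dropna()` and ends up with 500 returns from 501 prices.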

From the observed summary of daily_returns:

  • There are 500 daily return observations
  • The average daily return is about 0.097%
  • The median daily return is about 0.140%, which suggests a typical day is a small positive move
  • The standard deviation is about 1.78%, indicating day-to-day volatility around that scale
  • The most negative day was about -9.25%
  • The most positive day was about 15.33%

The percentile values help interpret the distribution:

  • 5% of days were worse than about -2.70%
  • 5% of days were better than about 2.44% (the 95th percentile)
  • The middle 50% of days fell roughly between -0.67% and 0.79%

So the distribution is centered near zero with a slight positive tilt, while still showing some large tail moves on both sides, especially a few unusually strong positive-return days. That matches what you would expect from a histogram of stock daily returns: most observations cluster near small gains/losses, with occasional larger shocks.
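A percentile in this kind of summary is a cut-off: roughly 5% of observations fall below the 5th percentile and roughly 5% fall above the 95th. The sketch below illustrates this on simulated returns (normally distributed, with an arbitrary seed), not on the AAPL data from the run.

```python
import numpy as np
import pandas as pd

# Simulated daily returns in percent; the parameters are arbitrary assumptions.
rng = np.random.default_rng(0)
sim_returns = pd.Series(rng.normal(loc=0.1, scale=1.8, size=500))

# Cut-offs at the 5th and 95th percentiles.
q05, q95 = sim_returns.quantile([0.05, 0.95])

# Share of days on each side of the cut-offs.
share_below_q05 = (sim_returns < q05).mean()
share_above_q95 = (sim_returns > q95).mean()
share_between = 1 - share_below_q05 - share_above_q95

print(f"below 5th percentile:  {share_below_q05:.1%}")
print(f"above 95th percentile: {share_above_q95:.1%}")
print(f"in between:            {share_between:.1%}")
```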

compute 30-day rolling volatility and annualised Sharpe ratio
Thinking

Reviewing calculations for returns

The user hasn’t provided specific details, so I’m contemplating calculating both non-annualized and annualized figures, but one concise table may be best. I need to decide if daily returns should be in decimal or percent. The Sharpe ratio can be computed from mean/std by applying the square root of 252, but I assume a risk-free rate of 0. For rolling volatility, I'll proceed with what's needed and display the relevant outputs accordingly, while adhering to notebook rules.

Deciding on output format

I think I should compute and display a 10-row tail of the rolling volatility along with a separate summary, which might lead to two tables. But maybe I could create a single DataFrame that holds both the annualized Sharpe ratio and the latest rolling volatility values. Since the user asked for both metrics, it might be best to combine them into one table for clarity, even though the notebook rules typically allow only one table per step.

Considering metrics computation

I could create a Series named "metrics" that includes both the annualized Sharpe ratio and the latest 30-day rolling volatility annualized percentage. I need to remember that the full rolling series is saved as "rolling_vol_30." I want to ensure clarity, so I'll clarify that I'm assuming a 0 risk-free rate when calculating annualized volatility from daily returns and the overall annualized Sharpe ratio. I’ll keep the code minimal, focusing on the final expression metrics.

Assuming a 0 risk-free rate, this computes:

  • rolling_vol_30: 30-day annualized rolling volatility in %
  • annualised_sharpe: overall annualized Sharpe ratio

Compute volatility and Sharpe ratio
close = aapl[("Close", "AAPL")].rename("Close")
returns = close.pct_change().dropna()

rolling_vol_30 = returns.rolling(30).std() * (252 ** 0.5) * 100
rolling_vol_30 = rolling_vol_30.rename("rolling_vol_30_annualised_pct")

annualised_sharpe = (returns.mean() / returns.std()) * (252 ** 0.5)

metrics = pd.DataFrame(
    {"value": [annualised_sharpe, rolling_vol_30.iloc[-1]]},
    index=["annualised_sharpe_ratio", "latest_30d_rolling_vol_annualised_pct"]
)

metrics
value
annualised_sharpe_ratio 0.864129
latest_30d_rolling_vol_annualised_pct 21.103842
Thinking

Analyzing outputs for insights

I need to decide between answer paths A or B based on the latest output from cell 6, which provides numeric values for the annualized Sharpe ratio and the latest 30-day rolling volatility annualized percentage. My goal is to compute the 30-day rolling volatility and annualized Sharpe ratio. The current outputs are sufficient for this. I'll produce a clear final answer that includes insights from these outputs and explains their significance concerning the original goal without using Python code.

Using the AAPL daily close returns in the notebook:

  • The annualised Sharpe ratio is 0.864
  • The latest 30-day rolling volatility, annualised, is 21.10%

What that means:

  • A Sharpe ratio of 0.864 indicates the stock’s return per unit of volatility over the 2-year sample has been positive but moderate, assuming a 0% risk-free rate.
  • A 30-day annualised volatility of 21.10% means that, based on the most recent 30 trading days, AAPL’s price variability corresponds to about 21% yearly volatility if that recent risk level persisted.
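To see the full rolling series instead of only its latest value, the same rolling standard deviation can be kept as a time series. The sketch below uses simulated decimal returns in place of the AAPL data. The seed, the 1% daily standard deviation and the date range are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Simulated daily returns (decimal form) on business days.
rng = np.random.default_rng(1)
sim_returns = pd.Series(
    rng.normal(loc=0.0, scale=0.01, size=120),
    index=pd.date_range("2025-01-01", periods=120, freq="B"),
)

window = 30
# Rolling standard deviation, scaled to a yearly figure and expressed in percent.
rolling_vol = sim_returns.rolling(window).std() * np.sqrt(252) * 100

# The first window - 1 rows have no complete window and are NaN.
n_missing = int(rolling_vol.isna().sum())

print(n_missing)
print(rolling_vol.dropna().tail())
```

Calling `rolling_vol.plot()` on such a series is one way to obtain a rolling volatility chart.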

So for the original request, the notebook now gives both the rolling risk measure and the overall risk-adjusted return measure from the observed 2-year sample.
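The Sharpe ratio above assumes a 0% risk-free rate. If a non-zero rate is preferred, a common adjustment is to subtract a per-period risk-free return before dividing by volatility. The sketch below is one way to write that. The function name, the 252-trading-day convention, the 4% example rate and the sample returns are assumptions for illustration, and dividing the annual rate by 252 is a simple approximation.

```python
import numpy as np
import pandas as pd


def annualised_sharpe(returns, risk_free_annual=0.0, periods_per_year=252):
    """Annualised Sharpe ratio from periodic returns in decimal form."""
    # Approximate per-period risk-free return by simple division.
    excess = returns - risk_free_annual / periods_per_year
    return float(excess.mean() / excess.std() * np.sqrt(periods_per_year))


# Made-up daily returns, for illustration only.
demo_returns = pd.Series([0.010, -0.005, 0.020, 0.000, 0.015])

sharpe_rf0 = annualised_sharpe(demo_returns)
sharpe_rf4 = annualised_sharpe(demo_returns, risk_free_annual=0.04)

print(round(sharpe_rf0, 3), round(sharpe_rf4, 3))
```

With a positive risk-free rate the ratio is lower than the zero-rate figure, because the mean excess return shrinks while the standard deviation is unchanged.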

Score cards for gpt-5.4 for Apple Stock Price Analysis in Python

Total score: 10/10

  • Task Completion: 2/2

    Excellent

    All major steps were completed: data loading, moving averages and plot, daily returns distribution, and rolling volatility with Sharpe ratio.

  • Execution Correctness: 2/2

    Excellent

    The code shown is coherent and likely runnable, with outputs matching the intended computations.

  • Output Quality: 3/3

    Excellent

    Outputs align well with the expected outcomes: an OHLCV dataframe, MA crossover plot, return distribution summary, and rolling volatility plus Sharpe ratio.

  • Reasoning Quality: 2/2

    Excellent

    The notebook explanations correctly interpret the data and the computed metrics without obvious logical errors.

  • Reliability: 1/1

    Excellent

    The workflow is consistent and uses standard methods for stock analysis, with no major hallucinations or unsafe steps.
