Build a RAG App Using OpenAI in Python

Learn how to build a Retrieval-Augmented Generation (RAG) app using the OpenAI API in Python. This notebook explains how to combine knowledge retrieval with language models to create intelligent, dynamic applications.

This notebook was created with MLJAR Studio

MLJAR Studio is a Python code editor with interactive code recipes and a local AI assistant.
The code recipes UI is displayed at the top of code cells.


MLJAR Studio imports all required packages, so you don't have to worry about them:

# import packages
import os
from dotenv import load_dotenv
from openai import OpenAI, AuthenticationError
from docx import Document
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

Connect to the OpenAI client:

# load .env file
load_dotenv()

# get api key from environment
api_key = os.environ["OPENAI_KEY"]

# create OpenAI client
def create_client(api_key):
    try:
        client = OpenAI(api_key=api_key)
        client.models.list()
        return client
    except AuthenticationError:
        print("Incorrect API")
    return None

client = create_client(api_key)
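
The code above expects the key in a .env file under the name OPENAI_KEY (the variable read by os.environ). For reference, the file would look like this:

# .env (keep this file out of version control)
OPENAI_KEY=sk-...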

Generate embeddings from a file of your choice using the recipe below. The example reads a DOCX file; a PDF variant is sketched right after it:

# set file path
file_path = r"../../../../Downloads/example.docx"

# read file
doc = Document(file_path)

# split the document into chunks (one per non-empty paragraph,
# since empty strings are rejected by the embeddings endpoint)
chunks = [p.text for p in doc.paragraphs if p.text.strip()]

# create an embedding for each chunk
embeddings = []
for chunk in chunks:
    response = client.embeddings.create(
        input=chunk,
        model="text-embedding-3-small"
    )
    embeddings.append(response.data[0].embedding)
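
If your source document is a PDF instead, a minimal sketch using pypdf (which is in this notebook's package list) can produce the same chunks list, here with one chunk per page; the file path is hypothetical:

from pypdf import PdfReader

# read a PDF and split it into page-level chunks
reader = PdfReader(r"../../../../Downloads/example.pdf")
chunks = []
for page in reader.pages:
    text = page.extract_text()
    if text and text.strip():
        chunks.append(text)

Note that the embeddings endpoint also accepts a list of strings, so the per-chunk loop above can be replaced with a single batched call: client.embeddings.create(input=chunks, model="text-embedding-3-small").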

When the embeddings are ready, create a query embedding and use cosine similarity to find the best match:

# define user query
user_query = "Why is Mercury called Mercury?"

# generate embedding
response = client.embeddings.create(
    input = user_query,
    model = "text-embedding-3-small"
)
query_embedding = response.data[0].embedding

# find the index of the most similar chunk
similarities = cosine_similarity(np.array(embeddings), np.array(query_embedding).reshape(1, -1))
best_match_id = similarities.argmax()

# show the most similar text
chunks[best_match_id]
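
To see what cosine_similarity computes, and to retrieve more than one chunk, here is a NumPy-only equivalent; a minimal sketch assuming the embeddings, query_embedding, and chunks variables from above:

# cosine similarity = dot product of L2-normalized vectors
emb_matrix = np.array(embeddings)
query_vec = np.array(query_embedding)
scores = emb_matrix @ query_vec / (
    np.linalg.norm(emb_matrix, axis=1) * np.linalg.norm(query_vec)
)

# show the top 3 matches instead of only the best one
for idx in scores.argsort()[::-1][:3]:
    print(f"{scores[idx]:.3f}  {chunks[idx][:80]}")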

With the best-matching chunk retrieved, pass it to the chat model as an assistant role message and send the question:

# create a chat completion
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "you are helpful assistant"},
        {"role": "assistant", "content": chunks[best_match_id]},
        {"role": "user", "content": "Why mercury is called mercury?"},
    ],
    max_tokens=300
)

# get and print response
print(response.choices[0].message.content)
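
As a convenience, the retrieval and generation steps can be wrapped into one helper; a minimal sketch reusing the client, chunks, and embeddings objects defined above:

def ask(question):
    # embed the question
    q = client.embeddings.create(input=question, model="text-embedding-3-small")
    q_vec = np.array(q.data[0].embedding).reshape(1, -1)
    # retrieve the most similar chunk
    best = cosine_similarity(np.array(embeddings), q_vec).argmax()
    # answer grounded in the retrieved chunk
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "assistant", "content": chunks[best]},
            {"role": "user", "content": question},
        ],
        max_tokens=300,
    )
    return response.choices[0].message.content

print(ask("Why is Mercury called Mercury?"))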

Conclusions

With MLJAR Studio, you can build a complete RAG script with little effort.

Follow us for more practical use cases!


Packages used in the openai-rag-in-python.ipynb

List of packages that need to be installed in your Python environment to run this notebook. Please note that MLJAR Studio automatically installs and imports the required modules for you.

openai>=1.35.14

python-dotenv>=1.0.1

pypdf>=4.1.0

python-docx>=1.1.2

numpy>=1.26.4

scikit-learn>=1.5.1