How to generate text embeddings using OpenAI in Python

Learn how to generate text embeddings with OpenAI models in Python. This notebook creates an embedding for a given text with a specified model and prints the length of the resulting vector.

This notebook was created with MLJAR Studio

MLJAR Studio is a Python code editor with interactive code recipes and a local AI assistant.
The code recipes UI is displayed at the top of code cells.

Don't worry about the imports; they will be set automatically:

# import packages
import os
from dotenv import load_dotenv
from openai import OpenAI, AuthenticationError

Create the OpenAI client and check that the API key is accepted:

# load .env file
load_dotenv()

# get api key from environment
api_key = os.environ["OPENAI_KEY"]

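# create OpenAI client
def create_client(api_key):
    try:
        client = OpenAI(api_key=api_key)
        # listing models sends a request, so an invalid key fails here
        client.models.list()
        return client
    except AuthenticationError:
        print("Incorrect API key")
    # reached only when authentication failed
    return None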

client = create_client(api_key)
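
If the `OPENAI_KEY` variable is missing from the `.env` file, `os.environ["OPENAI_KEY"]` raises a `KeyError` before the client is ever created. Below is a minimal sketch of a friendlier lookup. The helper name `read_api_key` and its `env` parameter are hypothetical; they are not part of the notebook or of the OpenAI library.

```python
import os


def read_api_key(name="OPENAI_KEY", env=None):
    """Return the API key stored under `name`, or None if it is missing or empty.

    Hypothetical helper for illustration; `env` defaults to os.environ and
    can be replaced with a plain dict when testing.
    """
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        print(f"Environment variable {name} is not set; check your .env file.")
        return None
    return value
```

Called as `api_key = read_api_key()` after `load_dotenv()`, it returns the key or prints a hint and returns `None` instead of raising.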

Use the code recipe below to create the embedding:

# create embedding
embedding = client.embeddings.create(
    input = "This is an example text that I want to turn into an embedding.",
    model = "text-embedding-3-small"
)
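
The `input` parameter also accepts a list of strings, in which case the response contains one embedding per text. The sketch below wraps that in a small function. The name `embed_texts` is hypothetical, and `client` is assumed to be the OpenAI client created earlier in this notebook.

```python
def embed_texts(client, texts, model="text-embedding-3-small"):
    """Embed several texts in one request and return one vector per text.

    Hypothetical helper for illustration; `client` is expected to expose
    `embeddings.create(input=..., model=...)` like the OpenAI client above.
    """
    response = client.embeddings.create(input=list(texts), model=model)
    # Assumption: response.data holds the results in the same order as the input.
    return [item.embedding for item in response.data]
```

For example, `vectors = embed_texts(client, ["first text", "second text"])` would return a list with two vectors, one for each input string.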

You can also print the embedding itself, but the output is very long: the vector has 1536 values by default for text-embedding-3-small. Check how many elements it has instead:

print(len(embedding.data[0].embedding))
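
To look at the values without flooding the output, you can print only the first few. This is a small sketch; `preview_vector` is a hypothetical helper, not an MLJAR recipe or an OpenAI function.

```python
def preview_vector(vector, n=5):
    """Return a short text preview of a vector: its first n values and its length."""
    head = ", ".join(f"{value:.4f}" for value in vector[:n])
    return f"[{head}, ...] ({len(vector)} values)"
```

With the response from above, this would be called as `print(preview_vector(embedding.data[0].embedding))`.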

Conclusions

Generating text embeddings is only the first step of a retrieval-augmented generation (RAG) pipeline.
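
In the retrieval step, embeddings are commonly compared with cosine similarity: the closer the score is to 1, the more similar the two texts are taken to be. Below is a minimal sketch in plain Python. `cosine_similarity` is a hypothetical helper written for illustration, and it assumes both vectors have the same length and are not all zeros.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

In a RAG setting the two arguments would be the embedding of a question and the embedding of a stored document chunk.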

See more in our other notebooks!

Packages used in the openai-embedding.ipynb

List of packages that need to be installed in your Python environment to run this notebook. Please note that MLJAR Studio automatically installs and imports required modules for you.

openai>=1.35.14

python-dotenv>=1.0.1