Fill PostgreSQL database with AI using Python
Combining power of both SQL and AI we are going to fill out missing description columns in database using AI generated text.
MLJAR Studio is Python code editior with interactive code recipes and local AI assistant.
You have code recipes UI displayed at the top of code cells.
All required packages are automatically imported by MLJAR Studio for you so you don't have to worry about them.
# import packages
import psycopg
import os
from dotenv import load_dotenv
from psycopg import sql
from openai import OpenAI, AuthenticationError
Make sure you opened a connection in your notebook. To learn how to do it, check out open and test database connection example notebook.
# load credentials from .env file:
load_dotenv(override=True)
# get the credentials
def create_new_connection():
try:
conn = psycopg.connect(
dbname=os.getenv("POSTGRES_DB_NAME"),
user=os.getenv("POSTGRES_USERNAME"),
password=os.getenv("POSTGRES_PASSWORD"),
host=os.getenv("POSTGRES_HOST"),
port=os.getenv("POSTGRES_PORT"),
)
return conn
# check for errors
except psycopg.Error as e:
raise psycopg.Error(f"""
Error occurred while establishing connection:
{e}
Maybe you provided wrong credentials, use define new connection recipe to edit them.
Other option is that database server is down or you dont have the right acces to this database.
""")
# open new connection:
conn = create_new_connection()
I chose the products
table. Lets list all columns of that table using show all columns recipe
# if connection was used and closed it is reopen here
if conn.closed:
conn = create_new_connection()
# run query
with conn:
with conn.cursor() as cur:
# query db
try:
cur.execute("""
SELECT attname AS col, atttypid::regtype AS datatype
FROM pg_attribute
WHERE attrelid = %s::regclass
AND attnum > 0
AND NOT attisdropped
ORDER BY attnum;""",
("products",))
# check for errors
except psycopg.ProgrammingError as e:
raise psycopg.ProgrammingError(f"""
Problem running query:
{e}
Did you spell everything correctly?
You can use show tables and columns recipes.
""")
# print the results
print("Columns of products:")
for column in cur.fetchall():
print(f"{column}")
Lets check the contents of products table using select query recipe.
# if connection was used and closed it is reopen here
if conn.closed:
conn = create_new_connection()
# run query
with conn:
with conn.cursor() as cur:
# query db
try:
cur.execute(
sql.SQL("SELECT {columns} FROM {table}").format(
columns=sql.SQL("*"),
table=sql.Identifier("products"),
)
)
# check for errors
except psycopg.ProgrammingError as e:
raise psycopg.ProgrammingError(f"""
Problem running query:
{e}
Did you spell everything correctly?
You can use show tables and columns recipes.
""")
# print the results
for row in cur.fetchall():
print(f"{row}")
We can see that the description
column is empty. Lets fill it out. To do so we will use AI.
First we need to open a new connection to the LLM provider. Here is an example on how to do it
# load .env file
load_dotenv()
# get api key from environment
api_key = os.environ["OPENAI_KEY"]
# create OpenAI client
def create_client(api_key):
try:
client = OpenAI(api_key=api_key)
client.models.list()
return client
except AuthenticationError:
print("Incorrect API")
return None
client = create_client(api_key)
Now lets ask our question to the all mighty, super inteligent LLM.
# create a chat completion
response = client.chat.completions.create(
model="gpt-4o",
messages=[
{"role": "user", "content": "I have a list of products in my database: (1, 'grass cutter', 130, None) (2, 'laptop', 1529, None) (3, 'metal detector', 502, None) (4, 'cup', 10, None) (5, 'camera', 655, None) those records are missing values in a description column i'd like you to fill it out, so there are no blanks left return only the descriptions and in a form of a python list"},
],
max_tokens=300
)
# get and print response
print(response.choices[0].message.content)
We will copy the returned list and paste it into the notebook. We will also create an ids list. Don't forget to execute this cell!
# list of descriptions generated by LLM
descriptions = [
"A tool used to cut grass in gardens and lawns efficiently.",
"A portable personal computer with a screen and keyboard.",
"A device used for detecting metal objects beneath the ground.",
"A small bowl-shaped drinking vessel used to drink liquids.",
"A device for capturing photographs and recording videos."
]
# list of id to update on,
# the range is 1 to 6 because in pytohn the upper bound is exclusive
ids = [id for id in range(1,6)]
# result is: ids = [1, 2, 3, 4, 5]
To update records in out database we will use Update selected columns
recipe. Here is a notebook explaining how it works.
# if connection was used and closed it is reopen here
if conn.closed:
conn = create_new_connection()
# run query
with conn:
with conn.cursor() as cur:
# update data
try:
query = sql.SQL(
"UPDATE {table} SET {column} = %(value)s WHERE id = %(id)s RETURNING *"
).format(table=sql.Identifier("products"), column=sql.Identifier("description"))
cur.executemany(query, params_seq=[{"id": ids[i], "value": descriptions[i]} for i in range(len(ids))], returning=True)
# check for errors
except psycopg.ProgrammingError as e:
raise psycopg.ProgrammingError(f"""
Problem running query:
{e}
Are you sure every name is spelled correctly?
You can use show all tables or columns to check db contents.
""")
# print the results
while True:
for row in cur.fetchall():
print(row)
if not cur.nextset():
break
Conclusions
MLJAR studio provides plethora of very powerful tool which when strung together empowers you to accomplish complex task in just a few clicks.
Recipes used in the postgresql-python-fill-out-db-with-ai.ipynb
All code recipes used in this notebook are listed below. You can click them to check their documentation.
Packages used in the postgresql-python-fill-out-db-with-ai.ipynb
List of packages that need to be installed in your Python environment to run this notebook. Please note that MLJAR Studio automatically installs and imports required modules for you.
psycopg[binary]>=3.2.1
python-dotenv>=1.0.1
openai>=1.35.14
Similar notebooks
List of similar Python notebooks, so you can find more inspiration ๐
Define a new PostgreSQL connection in Python
Connect to PostgreSQL database in Python
Show tables from PostgreSQL database
Show columns from table in PostgreSQL
Create new table using python and PostgreSQL
Drop table using Python and PostgreSQL
Backup table using python
Update column in table using Python
Select every column from table using python
Insert row into a table using Python
Run raw SQL query in python
Raw SQL query straight to pandas dataframe
From database to chart plotting
Update selected rows of a column in table ...
Fill PostgreSQL database with AI using Python
Transfer data from database to a Google sheet