Transfer data from database to a Google sheet

By combining two very useful MLJAR Studio cookbooks, the PostgreSQL and Google Sheets integrations, you can perform complex data science operations with just a few clicks. The recipes abstract the annoyances away, letting you focus on the important work.

This notebook was created with MLJAR Studio

MLJAR Studio is a Python code editor with interactive code recipes and a local AI assistant.
The code recipes UI is displayed at the top of code cells.


All required packages are automatically imported by MLJAR Studio, so you don't have to worry about them.

# import packages
import psycopg
import os
from dotenv import load_dotenv
from psycopg import sql
import gspread

First, let's open a new PostgreSQL connection. To learn how to do it, check out the open and test database connection example notebook.

# load credentials from .env file:
load_dotenv(override=True)

# create a new connection using the credentials
def create_new_connection():
    try:
        conn = psycopg.connect(
            dbname=os.getenv("POSTGRES_DB_NAME"),
            user=os.getenv("POSTGRES_USERNAME"),
            password=os.getenv("POSTGRES_PASSWORD"),
            host=os.getenv("POSTGRES_HOST"),
            port=os.getenv("POSTGRES_PORT"),
        )
        return conn
    # check for errors
    except psycopg.Error as e:
        raise psycopg.Error(f"""
Error occurred while establishing connection: 
    {e}

Maybe you provided wrong credentials; use the define new connection recipe to edit them.
Another option is that the database server is down or you don't have the right access to this database.
            """)

# open new connection:
conn = create_new_connection()

# display connection status
print("Connection status: " + conn.info.status.name)

Let's show all the tables from the database we connected to, using the show all tables recipe. This will enumerate all the tables we can operate on.

# if the connection was used and closed, it is reopened here
if conn.closed:
    conn = create_new_connection()

# run query
with conn:
    with conn.cursor() as cur:

        # query db
        try:
            cur.execute("""
                SELECT table_name
                FROM information_schema.tables
                WHERE table_type = 'BASE TABLE'
                AND table_schema NOT IN ('pg_catalog', 'information_schema');
            """)
        # check for errors
        except psycopg.ProgrammingError as e:
            raise psycopg.ProgrammingError(f"""
Problem running query:
    {e}

Did you spell everything correctly?
You can use show tables and columns recipes.
            """)

        # save query results to a variable
        queryResults = cur.fetchall()

        # print the results
        print("Tables:")
        for table in queryResults:
            print(f"{table}")

Let's choose the test table and list all of its columns using the show all columns recipe.

# if the connection was used and closed, it is reopened here
if conn.closed:
    conn = create_new_connection()

# run query
with conn:
    with conn.cursor() as cur:

        # query db
        try:
            cur.execute("""
                    SELECT attname AS col, atttypid::regtype AS datatype
                    FROM pg_attribute
                    WHERE attrelid = %s::regclass 
                    AND attnum > 0
                    AND NOT attisdropped
                    ORDER BY attnum;""", 
                    ("test",))
        # check for errors
        except psycopg.ProgrammingError as e:
            raise psycopg.ProgrammingError(f"""
Problem running query:
    {e}

Did you spell everything correctly?
You can use show tables and columns recipes.
            """)

        # save query results to a variable
        queryResults = cur.fetchall()

        # print the results
        print("Columns of test:")
        for column in queryResults:
            print(f"{column}")

Now that we have all the necessary information about our database tables, let's grab the data we need using the run select query recipe.

# if the connection was used and closed, it is reopened here
if conn.closed:
    conn = create_new_connection()

# run query
with conn:
    with conn.cursor() as cur:

        # query db
        try:
            cur.execute(
                sql.SQL("SELECT {columns} FROM {table}").format(
                    columns=sql.SQL(",").join(
                        [sql.Identifier(column) for column in ["id","num","data","description"]]
                    ),
                    table=sql.Identifier("test"),
                )
            )
        # check for errors
        except psycopg.ProgrammingError as e:
            raise psycopg.ProgrammingError(f"""
Problem running query:
    {e}

Did you spell everything correctly?
You can use show tables and columns recipes.
            """)

        # save query results to a variable
        queryResults = cur.fetchall()

        # print the results
        for row in queryResults:
            print(f"{row}")

Great! Now that our database data is saved in the queryResults variable, it's time to connect to a Google Sheets account using a Google API key (a service account JSON file).

# set filepath
filepath=r"yourGoogleSheetsAPIKey.json"

# create connection function
def connection(filepath):
    try:
        gc = gspread.service_account(filepath)
        print("Connection succeeded!")
        return gc
    except ValueError:
        print("Incorrect private key")

# connection
gc = connection(filepath)
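In case you don't have the key file yet: gspread.service_account expects the JSON key of a Google Cloud service account, typically with the Google Sheets and Drive APIs enabled for the project. A typical key file looks roughly like this; the field values are placeholders here, and the real contents come from Google Cloud when you create and download the key.

{
  "type": "service_account",
  "project_id": "your-project-id",
  "private_key_id": "...",
  "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
  "client_email": "your-bot@your-project-id.iam.gserviceaccount.com",
  "client_id": "...",
  "token_uri": "https://oauth2.googleapis.com/token"
}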

Here we create a new spreadsheet and name it: test_data_spreadsheet.

# create spreadsheet
sh = gc.create('test_data_spreadsheet')

Let's choose the first worksheet.

# open worksheet
worksheet = sh.get_worksheet(0)
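If you'd rather keep the export in a dedicated worksheet, you can add one instead; the title, rows, and cols below are arbitrary example values.

# add a new worksheet (example title and size)
worksheet = sh.add_worksheet(title="db_export", rows=100, cols=20)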

And append our data to it.

# append rows
worksheet.append_rows(queryResults)
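Note that append_rows sends values to the Sheets API as JSON, and psycopg can return column types that are not JSON-serializable (for example datetime or Decimal). If the plain call above fails for your table, here is a minimal sketch that coerces such values to strings first; the str() fallback is an assumption, so adjust it to your column types.

# coerce non-JSON-serializable values (e.g. datetime, Decimal) to strings
safe_rows = [
    [v if isinstance(v, (int, float, str, bool)) or v is None else str(v) for v in row]
    for row in queryResults
]
worksheet.append_rows(safe_rows)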

Just to confirm, let's read data from the spreadsheet.

# get all values from worksheet
worksheet_values = worksheet.get_all_values()

# print worksheet values
print(worksheet_values)

Don't forget to share the spreadsheet so you can see and edit it from the Google Sheets web application. Enter your email and follow the steps in the email message you should shortly receive.

# share spreadsheet
sh.share(email_address='jhon.deen@example.com', perm_type='user', role='reader')
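You can also print the spreadsheet's URL and open it directly in your browser:

# print the spreadsheet URL
print(sh.url)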

Conclusions

MLJAR Studio shows its full potential when many of its numerous integrations are used together, giving you the best data science experience.


Packages used in the postgresql-to-google-sheets.ipynb

List of packages that need to be installed in your Python environment to run this notebook. Please note that MLJAR Studio automatically installs and imports required modules for you.

psycopg>=3.2.1

python-dotenv>=1.0.1

gspread>=6.1.2
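If you are running this notebook outside MLJAR Studio, you can install them manually, for example with pip:

pip install "psycopg>=3.2.1" "python-dotenv>=1.0.1" "gspread>=6.1.2"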