
Objective
This recipe aims to show developers and users how to get started with Prem's Generative AI Platform and build different use cases around it. Through this tutorial, we also want to show how easily you can integrate Prem with existing open-source LLM orchestration tools like llama-index. We will give you a basic understanding of how to connect data sources (like a SQL database) using llama-index and then use PremAI LLMs and embeddings to index the data and fetch meaningful results.
Setting up the project
Let's start by creating a virtual environment and installing dependencies. Before getting started, make sure you have an account on the Prem AI Platform, a valid project ID, and an API key. As a last requirement, you also need at least one repository ID.
Prem AI offers a variety of models, comprising LLMs and embedding models. You can check out all the available models under this section.
Next, create two files: db_utils.py and main.py. In the db_utils.py file, we will write all the utility functions to connect llama-index with our SQL database and the table of our choice. The main.py file contains the Streamlit code to provide a nice front end.
Ingesting movies data into a SQL table (Optional)
In this section, we will create a sample SQL table using a local Postgres DB and this Netflix movies dataset from Kaggle. We download the dataset (a single CSV file), load it using pandas, and then parse each row of the dataframe and insert it into our table.
Please note
If you already have a database and some tables, you are not required to follow the steps in this section. However, if you are still interested in reproducing the results, kindly head over to this Jupyter Notebook, where we walk through all the details of this process on a Postgres DB using SQLAlchemy.
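For a quick local setup, here is a minimal sketch of the ingestion step; the connection string, CSV path, and table name are placeholders, and the single to_sql call stands in for the row-by-row inserts described in the notebook:

```python
import pandas as pd
from sqlalchemy import create_engine

# Placeholder connection string for a local Postgres instance; adjust to your setup.
engine = create_engine("postgresql://postgres:password@localhost:5432/netflix")

# Load the Kaggle Netflix CSV and write it into a `netflix_titles` table.
df = pd.read_csv("netflix_titles.csv")
df.to_sql("netflix_titles", engine, if_exists="replace", index=False)
```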
Connecting SQL table using llama-index and PremAI
We will now walk through some of the functions inside the db_utils.py file. Let's look at the schematic to understand what is happening under the hood.
llama-index first connects to the database using SQLDatabase and then builds the query engine using NLSQLTableQueryEngine. This engine is prompted to generate proper SQL from the table schema and its contents.
The workflow shown above does not use any embedding-based retrieval techniques. It provides an excellent abstraction for text-to-SQL workflows once connected to a database through SQLAlchemy.
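As a rough sketch of db_utils.py (assuming the llama-index-llms-premai and llama-index-embeddings-premai integration packages and placeholder credentials), the models and the single-table engine could look like this:

```python
import os

from llama_index.core import SQLDatabase
from llama_index.core.query_engine import NLSQLTableQueryEngine
from llama_index.llms.premai import PremAI
from llama_index.embeddings.premai import PremAIEmbeddings

# Placeholder credentials; use your own Prem project ID and API key.
llm = PremAI(
    project_id=int(os.environ["PREMAI_PROJECT_ID"]),
    premai_api_key=os.environ["PREMAI_API_KEY"],
)
embed_model = PremAIEmbeddings(
    project_id=int(os.environ["PREMAI_PROJECT_ID"]),
    premai_api_key=os.environ["PREMAI_API_KEY"],
    model_name="text-embedding-3-large",  # any embedding model available on Prem
)

def setup_index_before_chat(engine, table_name: str) -> NLSQLTableQueryEngine:
    """Connect one table via SQLDatabase and build a text-to-SQL query engine."""
    sql_database = SQLDatabase(engine, include_tables=[table_name])
    return NLSQLTableQueryEngine(
        sql_database=sql_database,
        tables=[table_name],
        llm=llm,
    )
```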
Writing a simple frontend using streamlit
Now let's write a simple frontend using Streamlit to put a chat interface on top of this approach (use the main.py file to see/edit the changes). Create an additional .streamlit folder. Inside that folder, create a file named secrets.toml and save all the DB-specific credentials in it. Here is an example:
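The exact keys are up to you; the [postgres] section and key names below are placeholders that the main.py sketch further down assumes:

```toml
[postgres]
host = "localhost"
port = 5432
dbname = "netflix"
user = "postgres"
password = "your-password"
```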
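In main.py, these secrets can then be read through st.secrets to build the SQLAlchemy engine; a minimal sketch, assuming the secrets.toml layout above:

```python
import streamlit as st
from sqlalchemy import create_engine

# Read the Postgres credentials stored in .streamlit/secrets.toml.
pg = st.secrets["postgres"]
engine = create_engine(
    f"postgresql://{pg['user']}:{pg['password']}@{pg['host']}:{pg['port']}/{pg['dbname']}"
)
```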
We then fetch all the tables available in the database using the helper function get_all_tables_from_db.
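The body of this helper is an assumption based on its name; SQLAlchemy's inspector makes it a one-liner:

```python
from sqlalchemy import inspect
from sqlalchemy.engine import Engine

def get_all_tables_from_db(engine: Engine) -> list[str]:
    """Return the names of all tables in the connected database."""
    return inspect(engine).get_table_names()
```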
Once the user selects a table, we build the query engine with the setup_index_before_chat function.
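In main.py this might be wired to a simple table selector; the module layout and function signatures below follow the earlier sketches:

```python
import streamlit as st

# Helpers from the db_utils.py sketch above (hypothetical module layout).
from db_utils import get_all_tables_from_db, setup_index_before_chat

# `engine` is the SQLAlchemy engine built from st.secrets earlier.
tables = get_all_tables_from_db(engine)
selected_table = st.sidebar.selectbox("Choose a table", tables)

# Build the text-to-SQL query engine for the selected table.
query_engine = setup_index_before_chat(engine, selected_table)
```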
Boilerplate Streamlit chat implementation
In this section, we are going to implement the same boilerplate chat code that we used in the Chat with PDF recipe of our cookbook. You can check that recipe out for more details on what each part of the code does.
- Get the prompt from the user, save it to the Streamlit session to capture history, and show the prompt in the chat UI.
- Once we have the prompt, pass it to the query_engine and get the result.
- Stream the result and show the metadata (including the SQL query written to fetch the results, for debugging purposes).
Streamlit chat implementation
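A sketch of that loop, assuming the query_engine built above; NLSQLTableQueryEngine exposes the generated SQL under response.metadata["sql_query"], which we surface for debugging:

```python
import streamlit as st

# Replay the chat history stored in the Streamlit session.
if "messages" not in st.session_state:
    st.session_state.messages = []
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])

# Get the prompt, pass it to the query engine, and show the result plus SQL.
if prompt := st.chat_input("Ask something about your table"):
    st.session_state.messages.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.markdown(prompt)

    response = query_engine.query(prompt)  # `query_engine` from the earlier sketch
    answer = str(response)
    with st.chat_message("assistant"):
        st.markdown(answer)
        # Show the SQL that was generated to fetch the results.
        st.code(response.metadata.get("sql_query", ""), language="sql")
    st.session_state.messages.append({"role": "assistant", "content": answer})
```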
Chat with multiple SQL tables (Optional)
In this section, we will write an additional, more advanced function to replace the existing setup_index_before_chat. We will also show how to index multiple tables and chat with them using llama-index and Prem AI.
The workflow remains largely the same, with a few changes:
- We first connect to our database.
- After this, we gather all the tables and initialize a VectorStoreIndex, which indexes all the available table schemas. Here, our embedding model is used to embed each schema along with all the metadata related to the tables.
- When we pass a user query, it goes to the in-memory vector index to fetch the nearest embeddings.
- Finally, the matched schema is decoded and put into the LLM context to generate the SQL.
- This SQL is run inside the database to fetch the proper results, which are then fed to another LLM call to produce a human-readable answer.

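A hedged sketch of what setup_index_before_chat_use_embedding could look like, reconstructed from the workflow above using llama-index's ObjectIndex and SQLTableRetrieverQueryEngine (llm and embed_model are the PremAI models from the earlier db_utils.py sketch):

```python
from llama_index.core import SQLDatabase, VectorStoreIndex
from llama_index.core.indices.struct_store.sql_query import (
    SQLTableRetrieverQueryEngine,
)
from llama_index.core.objects import (
    ObjectIndex,
    SQLTableNodeMapping,
    SQLTableSchema,
)

def setup_index_before_chat_use_embedding(engine, table_names: list[str]):
    """Index every table schema with Prem embeddings and pick the right one per query."""
    sql_database = SQLDatabase(engine, include_tables=table_names)

    # Map each table schema into objects the vector index can store.
    table_node_mapping = SQLTableNodeMapping(sql_database)
    table_schema_objs = [SQLTableSchema(table_name=name) for name in table_names]

    # Embed all schemas (and their metadata) into an in-memory vector index.
    obj_index = ObjectIndex.from_objects(
        table_schema_objs,
        table_node_mapping,
        VectorStoreIndex,
        embed_model=embed_model,  # PremAIEmbeddings from the earlier sketch
    )

    # At query time, the nearest table schema is retrieved and handed to the LLM.
    return SQLTableRetrieverQueryEngine(
        sql_database,
        obj_index.as_retriever(similarity_top_k=1),
        llm=llm,  # PremAI LLM from the earlier sketch
    )
```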
Finally, we replace setup_index_before_chat with the setup_index_before_chat_use_embedding function inside main.py, and we are now able to chat with multiple tables. You can check out the full source code to run and reproduce the results and extend it to your SQL-based Generative AI workflows.