Agents have been quite popular for a while. Simply put, an agent is an orchestrated workflow between different LLMs/SLMs. PremSQL Agents are focused on executing database-related tasks. Briefly, PremSQL agents can:

  • Query a database from the user’s natural language input.
  • Analyse the database output together with the user’s query and return an answer in natural language.
  • Plot basic charts based on the user’s query.
  • For anything that does not fit the above three categories, give you a followup suggesting what to do next.

PremSQL is a semi-autonomous agentic framework as of now. This means it needs some supervision from the user to execute tasks. Hence, the degree of freedom of the agents is strictly restricted to four routes:

  • /query: Tells the agent to write a SQL query and execute it against the database.
  • /analyse: Analyses the dataframe and, based on the user’s query, gives a concise natural language answer.
  • /plot: Plots basic charts / graphs from the output dataframes and the user’s input query.
  • /followup: If the user does not start with one of the above three markers, or when the agent’s workers (LLM / SLM) fail, the agent goes into this route. Here the worker tells you what to do next.

PremSQL also comes with its built-in memory. The memory is just a simple SQLite database which captures all the conversations following a specific schema (more details below). Here is the basic architecture, or workflow, of how PremSQL agents operate.

Prerequisites

Before diving in, ensure you have a basic understanding of the following premsql components:

Now let’s go over some basic terminology used when working with PremSQL agents.

Terminologies

Before proceeding, here are the terminology conventions we use for PremSQL Agents.

  • Agent: The complete, overall workflow.
  • Workers: The LLMs / SLMs that are instructed / designed to do a specific task.
  • Tools: Deterministic / programmable functions (which generally do not contain models) that help a worker either by providing context or by executing some task.

PremSQL comes with a baseline agent. Here, baseline means the minimal agentic implementation that can do all of the tasks described above. Using the premsql library you can also build task-specific database RAG agents or extend the existing ones.

BaseLine Agent

PremSQL comes with a minimal agentic implementation (more implementation variants will come in later versions) that can query a DB, provide analysis over dataframes, answer user questions, and plot simple graphs. This is how you use our baseline Text to SQL agent:

from premsql.agents import BaseLineAgent
from premsql.generators import Text2SQLGeneratorMLX
from premsql.executors import ExecutorUsingLangChain
from premsql.agents.tools import SimpleMatplotlibTool

text2sql_worker = Text2SQLGeneratorMLX(
    model_name_or_path="premai-io/prem-1B-SQL", experiment_name="text2sql_worker",
    type="test"
)

analyser_plotter_followup_worker = Text2SQLGeneratorMLX(
    model_name_or_path="meta-llama/Llama-3.2-1B-Instruct", experiment_name="analyser_plotter_followup_worker",
    type="test",
)

PremSQL is a local-first open source library, so in this documentation we are going to use Apple MLX as our inference engine. We also support other inference engines like PremAI, HuggingFace, OpenAI, and Ollama. We are going to use the following two models for our baseline agent:

  • Prem-1B-SQL: A 1.3B parameter model, fully fine-tuned from DeepSeek Coder 1.3B and specialized for Text to SQL tasks. This will be used for our text to SQL worker.

  • Llama-3.2-1B: For all the other tasks (analyse, plot and followup), we are going to use this model.

You can also use Text2SQLGeneratorHF / Text2SQLGeneratorOpenAI / Text2SQLGeneratorPremAI / Text2SQLGeneratorOllama to run other open or closed source models. Now, let’s proceed with instantiating our BaseLine agent.
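
For example, a sketch of swapping in the OpenAI-backed generator could look like the following (the constructor arguments shown are assumptions, not the verified signature; check the generators documentation):

from premsql.generators import Text2SQLGeneratorOpenAI

# Hypothetical sketch: the argument names below are assumptions, not the verified API.
text2sql_worker = Text2SQLGeneratorOpenAI(
    model_name="gpt-4o",             # assumed parameter name
    experiment_name="text2sql_worker",
    type="test",
)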

Check out our introduction page where we have used Ollama to do the same thing.

# You can also use a Postgres or MySQL database here.
db_connection_uri = "sqlite://///path/to/db.sqlite"

baseline = BaseLineAgent(
    session_name="local_db_rag",
    db_connection_uri=db_connection_uri,
    specialized_model1=text2sql_worker,
    specialized_model2=analyser_plotter_followup_worker,
    executor=ExecutorUsingLangChain(),
    auto_filter_tables=False,
    plot_tool=SimpleMatplotlibTool()
)

Arguments

  • session_name: A string identifier for the current session, used for tracking and logging purposes.

  • db_connection_uri: A string containing the database connection URI that specifies how to connect to the database.

  • specialized_model1: A Text2SQL model instance used primarily for SQL generation tasks; in the example above this is the text2sql_worker.

  • specialized_model2: A second Text2SQL model instance used for analysis, plotting, and followup tasks. This model is shared among multiple workers.

  • executor: An instance of BaseExecutor that handles the execution of SQL queries against the database.

  • plot_tool: An instance of BasePlotTool that provides plotting capabilities for data visualization.

Additional arguments

  • session_db_path: An optional string parameter specifying the path to store session-related data.

  • include_tables: An optional list of tables to include in the database operations, used for filtering available tables.

  • exclude_tables: An optional list of tables to exclude from database operations, used for filtering available tables.

  • auto_filter_tables: An optional boolean flag that controls automatic table filtering behavior.

  • route_worker_kwargs: An optional dictionary containing additional keyword arguments for the different worker routes (query, analyse, plot, etc.); see the sketch below.
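
For illustration, a BaseLineAgent configured with some of these optional arguments might look like the sketch below (the table names, session path, and the structure of route_worker_kwargs are placeholders / assumptions):

baseline = BaseLineAgent(
    session_name="local_db_rag",
    db_connection_uri=db_connection_uri,
    specialized_model1=text2sql_worker,
    specialized_model2=analyser_plotter_followup_worker,
    executor=ExecutorUsingLangChain(),
    plot_tool=SimpleMatplotlibTool(),
    # Optional arguments; the values below are placeholders
    session_db_path="/path/to/session.db",                 # where the session memory (SQLite) is stored
    include_tables=["votes", "comments"],                   # restrict schema context to these tables
    auto_filter_tables=False,                               # skip the extra table-filtering model call
    route_worker_kwargs={"query": {"temperature": 0.1}},    # per-route kwargs; structure assumed
)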

Running BaseLine agent

Here is how you can run the agent:

For querying the database

output = baseline(question="/query show me all the rows in votes")
# If you want to see the pandas dataframe
output.show_output_dataframe()

For analysing the most recent output dataframe

analysis = baseline("/analyse what do you understand from the table")
# If you want to see the analysis
print(analysis.analysis)
# If you want to see the reasoning behind the analysis
print(analysis.reasoning)

Here your agent analyses the latest dataframe that was output via the /query route. It searches for the latest output, feeds that in as the input, and then provides the analysis.

For plotting charts

plot = baseline("/plot the relation between creation date and vote type id") # This will generate a matplotlib plot

Here your agent again picks up the latest dataframe output via the /query route. It searches for the latest output, feeds it in as the input, and then decides, from the user’s input, which columns of that dataframe to use and which plot type to choose. Currently supported plot types are: scatter, line, area, histogram and bar.
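
If you want to keep the rendered chart, the plot_config and image_to_plot fields on the returned object expose what was generated. A minimal sketch for saving the image, assuming it is base64-encoded PNG data:

import base64

# Inspect the configuration the worker chose (plot type, columns, etc.)
print(plot.plot_config)

# image_to_plot is usually base64 encoded; PNG encoding is assumed here.
if plot.image_to_plot:
    with open("vote_plot.png", "wb") as f:
        f.write(base64.b64decode(plot.image_to_plot))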

Everything else: the followup route

agent("hello")

The followup route tells you what to do next, either when you provide an input that does not match any of the routes shown above or when the overall agentic workflow fails.
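
The returned object carries the suggestion, so you can read back the recommended next route and the natural language hint:

# Suggested next route and a natural language suggestion for the user
print(followup.followup_route_to_take)
print(followup.followup_suggestion)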

Modes of Agent run

There are three modes in which your agent can run. One way to run the agent is with the following calls:

output = baseline(question="/query show me all the rows in votes")
analysis = baseline(question="/analyse what do you understand from the table")
plot = baseline(question="/plot the relation between creation date and vote type id")
followup = baseline(question="hello")

Here you will get an output which is a Pydantic model named ExitWorkerOutput: when one conversation cycle finishes, it accumulates all the information and returns it in a structured format. Here is all the information ExitWorkerOutput provides:

class ExitWorkerOutput(BaseModel):
    """
    Comprehensive output model that combines results from all pipeline workers.
    
    Attributes:
        session_name (str): Unique identifier for the processing session
        question (str): Original user question that initiated the pipeline
        db_connection_uri (str): Database connection string used for queries
        route_taken (Literal["plot", "analyse", "query", "followup"]): Primary processing path chosen
        
        # Text2SQL Worker Results
        sql_string (Optional[str]): Generated SQL query if route_taken is "query"
        sql_reasoning (Optional[str]): Explanation of SQL generation process
        sql_input_dataframe (Optional[Dict]): Input data used for SQL generation
        sql_output_dataframe (Optional[Dict]): Results of SQL query execution
        error_from_sql_worker (Optional[str]): Any errors from SQL processing
        
        # Analysis Worker Results
        analysis (Optional[str]): Generated analysis text if route_taken is "analyse"
        analysis_reasoning (Optional[str]): Explanation of analysis process
        analysis_input_dataframe (Optional[Dict]): Data used for analysis
        error_from_analysis_worker (Optional[str]): Any errors from analysis processing
        
        # Plot Worker Results
        plot_config (Optional[Dict]): Chart configuration if route_taken is "plot"
        plot_input_dataframe (Optional[Dict]): Data used for plotting
        plot_output_dataframe (Optional[Dict]): Processed data for visualization
        image_to_plot (Optional[str]): Generated plot image (usually base64 encoded)
        plot_reasoning (Optional[str]): Explanation of plotting decisions
        error_from_plot_worker (Optional[str]): Any errors from plot generation
        
        # Followup Worker Results
        followup_route_to_take (Optional[Literal]): Suggested next action
        followup_suggestion (Optional[str]): Generated followup question/suggestion
        error_from_followup_worker (Optional[str]): Any errors from followup processing
        
        additional_input (Optional[Dict]): Any additional parameters used in processing
    """

Now, if you feel this is overwhelming, you can simply turn on server_mode while calling the agent, which looks like this:

output = baseline(question="/query show me all the rows in votes", server_mode=True)
analysis = baseline(question="/analyse what do you understand from the table", server_mode=True)
plot = baseline(question="/plot the relation between creation date and vote type id", server_mode=True)
followup = baseline(question="hello", server_mode=True)

All of these output a Pydantic model named AgentOutput, which is a less granular form of ExitWorkerOutput and contains the following attributes:

class AgentOutput(BaseModel):
  """Final output model that provides a streamlined view of the pipeline results.
  
  Attributes:
      session_name (str): Unique identifier for the processing session
      question (str): Original user question that initiated the pipeline
      db_connection_uri (str): Database connection string used for queries
      route_taken (Literal["plot", "analyse", "query", "followup"]): Primary processing path chosen
      
      # Data Fields
      input_dataframe (Optional[Dict]): Input data used for processing
      output_dataframe (Optional[Dict]): Final processed data in {"columns": [...], "data": [[...]]} format
      
      # Route-specific Results
      sql_string (Optional[str]): Generated SQL query if route_taken is "query"
      analysis (Optional[str]): Analysis text if route_taken is "analyse"
      reasoning (Optional[str]): Explanation of processing decisions
      plot_config (Optional[Dict]): Chart configuration if route_taken is "plot"
      image_to_plot (Optional[str]): Generated plot image data
      
      # Followup Information
      followup_route (Optional[Literal]): Suggested next processing route
      followup_suggestion (Optional[str]): Generated followup question/suggestion
      
      # Error and Metadata
      error_from_pipeline (Optional[str]): Any errors encountered during processing
      created_at (datetime): Timestamp of result creation
    """

Please Note

When you call the agent directly (e.g. baseline(...)), routes like /analyse and /plot assume the analysis or plot should run on the latest output dataframe produced by the /query route. So, for example, if you want an analysis of the main topics of the comments table, you first need to query the agent so that it outputs a dataframe containing the comments information.

However, there is another way to run agents, where you can explicitly provide your own input dataframe for analysis or plotting. Here is how you do that:

# let's say you have some input pandas dataframe: df
analysis = baseline.run(question="some question", input_dataframe=df)

That’s it. That’s how you can explicitly provide dataframes for certain tasks. However, this feature is not available in the Agent Server.
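
As a sketch (the CSV path is hypothetical, and we assume the same route markers apply to run()), you could load an external dataframe and analyse it directly, bypassing the /query step:

import pandas as pd

# Hypothetical CSV; any dataframe you already have works the same way
df = pd.read_csv("/path/to/comments.csv")

analysis = baseline.run(
    question="/analyse what are the main topics of these comments",
    input_dataframe=df,
)
print(analysis.analysis)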

Workers inside Baseline

You can either use the BaseLine agent as a whole, or use each of the workers independently. Inside BaseLine there are mainly three workers:

  1. Text to SQL Worker: This worker gets invoked when the input starts with /query.
  2. Analysis Worker: This worker gets invoked when the input starts with /analyse.
  3. Plot Worker: This worker gets invoked when the input starts with /plot.

Now let’s explore each of the workers in detail.

Text to SQL Worker

The task of the Text to SQL worker is pretty much self-explanatory. Inside the BaseLine agent, this worker gets invoked when the input starts with /query. However, if you use this worker independently, there is no need for the prefix. Here is how we define the Text to SQL worker inside the BaseLine agent:

import os
from premsql.generators import Text2SQLGeneratorPremAI
from premsql.executors import SQLiteExecutor
from premsql.agents.baseline.workers import BaseLineText2SQLWorker

generator = Text2SQLGeneratorPremAI(
    model_name="gpt-4o",
    project_id=os.environ.get("PREMAI_PROJECT_ID"),
    experiment_name="testing_text2sql_worker",
    type="test",
    premai_api_key=os.environ.get("PREMAI_API_KEY")
)

text2sql_worker = BaseLineText2SQLWorker(
    db_connection_uri="",
    generator=generator,
    helper_model=generator,
    executor=SQLiteExecutor()
)

Arguments:

  • db_connection_uri (str): Database connection string to connect to the target database.
  • generator (Text2SQLGeneratorBase): Main model for generating SQL queries from text.
  • helper_model (Optional[Text2SQLGeneratorBase]): Secondary model for table filtering and error correction. Defaults to None.
  • executor (Optional[BaseExecutor]): Custom executor for running SQL queries. If None, uses default.
  • include_tables (Optional[list]): List of tables to specifically include in schema context. If None, includes all tables.
  • exclude_tables (Optional[list]): List of tables to exclude from schema context. If None, no tables are excluded.
  • auto_filter_tables (Optional[bool]): Whether to automatically filter relevant tables for each query. Defaults to False. This makes one extra model call and is recommended when the model is strong and the database has a very large number of tables / schemas.

Now let’s understand how to run the worker.

response = text2sql_worker.run(
    question="What are the tables present inside the database"
)

Arguments:

  • question (str): The natural language query to be converted to SQL
  • additional_knowledge (Optional[str]): Extra context or information to help with query generation
  • fewshot_dict (Optional[dict]): Dictionary of example question-SQL pairs for few-shot learning (see the example below)
  • temperature (Optional[float]): Controls randomness in SQL generation (0.0 = deterministic, 1.0 = most random)
  • max_new_tokens (Optional[int]): Maximum number of tokens to generate in the SQL query
  • render_results_using (Optional[Literal["json", "dataframe"]]): Output format for query results
  • prompt_template (Optional[str]): Custom template for the main SQL generation prompt
  • error_handling_prompt_template (Optional[str]): Custom template for error correction prompts

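For example, you can pass domain hints and a few example pairs to steer generation. The schema details and examples below are made up, and the key/value layout of fewshot_dict (question mapped to SQL) is an assumption:

response = text2sql_worker.run(
    question="How many users voted more than 10 times?",
    additional_knowledge="The votes table has one row per vote; user_id identifies the voter.",
    fewshot_dict={
        "How many votes are there in total?": "SELECT COUNT(*) FROM votes;"
    },
    temperature=0.1,
    max_new_tokens=256,
)
print(response.sql_string)
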
This will output a Pydantic model which is as follows:

class Text2SQLWorkerOutput(BaseWorkerOutput):
    db_connection_uri: str          # Database connection string
    sql_string: str                 # Generated SQL query
    sql_reasoning: Optional[str]    # Explanation of the SQL generation process
    input_dataframe: Optional[Dict] # Input data used for generation
    output_dataframe: Optional[Dict] # Query results in format:
                                    # {
                                    #    "columns": [...],
                                    #    "data": [[...], ...]
                                    # }
    question: str                   # Original natural language question
    error_from_model: Optional[str] # Any errors encountered during execution
    additional_input: Optional[Dict] # Additional input parameters used

As you can see, the output_dataframe is returned in JSON format by default. However, you can convert it to a pandas DataFrame with the following method:

response.show_output_dataframe()

This converts the JSON output into a pandas DataFrame.
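
Equivalently, since the dictionary follows the {"columns": [...], "data": [[...], ...]} layout documented above, you can rebuild the DataFrame yourself:

import pandas as pd

raw = response.output_dataframe  # {"columns": [...], "data": [[...], ...]}
df = pd.DataFrame(data=raw["data"], columns=raw["columns"])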

Analyser Worker

The Analyser Worker is responsible for analyzing data and providing insights from dataframes. In the BaseLine agent, this worker is invoked when queries start with /analyse. Here’s how to initialize the Analyser Worker:

import os
from premsql.generators import Text2SQLGeneratorPremAI
from premsql.agents.baseline.workers import BaseLineAnalyserWorker

generator = Text2SQLGeneratorPremAI(
    model_name="gpt-4",
    project_id=os.environ.get("PREMAI_PROJECT_ID"),
    experiment_name="testing_analyser_worker",
    type="test",
    premai_api_key=os.environ.get("PREMAI_API_KEY")
)

analyser_worker = BaseLineAnalyserWorker(
    generator=generator
)

Arguments:

  • generator (Text2SQLGeneratorBase): Model used for generating analysis from the data.

The Analyser Worker can process data in two ways: analyzing the entire dataframe at once, or breaking it into chunks for larger datasets. Here’s how to run the worker:

import pandas as pd

df = pd.DataFrame(...)  # Your input dataframe
response = analyser_worker.run(
    question="What insights can you provide about this data?",
    input_dataframe=df,
    do_chunkwise_analysis=True
)

Arguments:

  • question (str): The natural language question about what to analyze in the data
  • input_dataframe (pd.DataFrame): The dataframe to analyze
  • do_chunkwise_analysis (Optional[bool]): Whether to break analysis into chunks for large datasets. Defaults to False
  • chunk_size (Optional[int]): Number of rows per chunk when doing chunkwise analysis. Defaults to 20
  • max_chunks (Optional[int]): Maximum number of chunks to analyze. Defaults to 20
  • temperature (Optional[float]): Controls randomness in analysis generation (0.0 = deterministic). Defaults to 0.19
  • max_new_tokens (Optional[int]): Maximum number of tokens to generate in the analysis. Defaults to 600
  • analysis_prompt_template (Optional[str]): Custom template for the main analysis prompt
  • analysis_merger_template (Optional[str]): Custom template for merging chunk analyses
  • verbose (Optional[bool]): Whether to print detailed progress. Defaults to False

The worker returns a Pydantic model with the following structure:

class AnalyserWorkerOutput(BaseWorkerOutput):
    analysis: str                   # Generated analysis text
    input_dataframe: Optional[Dict] # Input data in format:
                                   # {
                                   #    "columns": [...],
                                   #    "data": [[...], ...]
                                   # }
    analysis_reasoning: Optional[str] # Explanation of the analysis process
    question: str                    # Original natural language question
    error_from_model: Optional[str]  # Any errors encountered during analysis
    additional_input: Optional[Dict]  # Additional input parameters used

You can convert the JSON-formatted dataframe back to a pandas DataFrame using:

response.show_output_dataframe()

Chunked Analysis

For large datasets, the worker can break the analysis into chunks using do_chunkwise_analysis=True. This process:

  1. Splits the input dataframe into chunks of size chunk_size
  2. Analyzes each chunk separately
  3. Combines the chunk analyses using a merger prompt
  4. Returns a consolidated analysis

This is particularly useful when dealing with large datasets that might exceed token limits or when you want to ensure thorough analysis of all data points.
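
For example, tuning the chunking behaviour for a large dataframe (big_df here is assumed to be a pandas DataFrame you already have) looks like this:

response = analyser_worker.run(
    question="Summarise the main trends in this data",
    input_dataframe=big_df,     # a large pandas DataFrame you already have
    do_chunkwise_analysis=True,
    chunk_size=50,              # rows per chunk
    max_chunks=10,              # cap on the number of chunks analysed
    verbose=True,
)
print(response.analysis)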

Chart Plot Worker

The Chart Plot Worker is responsible for generating visualization configurations and plotting data based on natural language queries. In the BaseLine agent, this worker is triggered when queries start with /plot. However, it can be used independently without this prefix. Here’s how to initialize the Chart Plot Worker in the BaseLine agent:

import os
from premsql.generators import Text2SQLGeneratorPremAI
from premsql.agents.baseline.workers import BaseLinePlotWorker
from premsql.agents.tools.plot import PlotlyTool

generator = Text2SQLGeneratorPremAI(
    model_name="gpt-4",
    project_id=os.environ.get("PREMAI_PROJECT_ID"),
    experiment_name="testing_plot_worker",
    type="test",
    premai_api_key=os.environ.get("PREMAI_API_KEY")
)

plot_worker = BaseLinePlotWorker(
    generator=generator,
    plot_tool=PlotlyTool()
)

Arguments:

  • generator (Text2SQLGeneratorBase): Model for generating plot configurations from text
  • plot_tool (BasePlotTool): Tool for creating and rendering plots (e.g., PlotlyTool)

Now let’s see how to run the worker:

import pandas as pd

# Sample dataframe
df = pd.DataFrame({
    'date': ['2023-01-01', '2023-01-02', '2023-01-03'],
    'sales': [100, 150, 200]
})

response = plot_worker.run(
    question="Create a line plot showing sales over time",
    input_dataframe=df
)

Arguments:

  • question (str): Natural language description of the desired visualization
  • input_dataframe (pd.DataFrame): Data to be plotted
  • temperature (Optional[float]): Controls randomness in configuration generation (default: 0.1)
  • max_new_tokens (Optional[int]): Maximum tokens for configuration generation (default: 100)
  • plot_image (Optional[bool]): Whether to generate base64 image of plot (default: True)
  • prompt_template (Optional[str]): Custom template for plot configuration generation

The worker returns a Pydantic model with the following structure:

class ChartPlotWorkerOutput(BaseWorkerOutput):
    input_dataframe: Optional[Dict]  # Input data in format:
                                    # {
                                    #    "columns": [...],
                                    #    "data": [[...], ...]
                                    # }
    plot_config: Optional[Dict]      # Generated plot configuration
    image_plot: Optional[str]        # Base64 encoded plot image
    plot_reasoning: Optional[str]    # Explanation of plot generation
    output_dataframe: Optional[Dict] # Processed data used for plotting
    question: str                    # Original natural language question
    error_from_model: Optional[str]  # Any errors encountered
    additional_input: Optional[Dict]  # Additional input parameters used

The plot_config will contain the visualization configuration that can be used with the specified plotting tool (e.g., Plotly). The image_plot field contains a base64-encoded string of the rendered plot when plot_image=True.

Like the Text2SQL worker, you can convert the dataframes to pandas format using:

response.show_output_dataframe()

This will convert any JSON-formatted dataframe in the output to a pandas DataFrame.

Current State and Future Roadmap

PremSQL is actively evolving, with a strong foundation in reliable output generation and structured error handling. Here are some areas we’re currently working on to make PremSQL even better:

  1. Memory and Context Management

    • Current Implementation: Uses an efficient SQLite-based system for conversation tracking, with smart truncation for large datasets (first 200 rows) to maintain performance
    • Future Enhancement: Planning to add semantic search capabilities to enable more intelligent context retrieval and cross-reference previous conversations
  2. Visualization Capabilities

    • Current Implementation: Supports essential chart types with reliable rendering for common data visualization needs
    • Future Enhancement: Expanding to support more complex visualizations and comparative analysis features
  3. Retrieval and Search

    • Current Implementation: Direct and efficient table/data access
    • Future Enhancement: Integration with vector databases to enable more sophisticated semantic search and context understanding

These planned improvements will build upon PremSQL’s existing strengths in reliability and structured output handling, making it an even more powerful tool for database interactions.