Introduction
Welcome to the PremSQL library, a powerful tool for building self-hosted, end-to-end autonomous data analysis pipelines powered by Text-to-SQL. PremSQL offers a modular design where each component functions independently, enabling you to create fully customized workflows. Watch our Quick Demo of the latest PremSQL Agent Server and Playground:
Core Components
Each component works independently and is designed to accomplish a specific task. While we recommend exploring the components sequentially to gain a comprehensive understanding, it’s not mandatory. Since components operate independently, you can focus on those that meet your immediate needs and return later for a deeper dive into others.
PremSQL GitHub
Star the project to stay updated with our rapid development of the best local Text-to-SQL solution.
News
- [Sep 10th 2024] Initial release of PremSQL
- [Sep 10th 2024] Launch of Prem-1B-SQL (a fully local Text-to-SQL model)
- [Oct 30th 2024] Prem-1B-SQL surpassed 5K downloads
- [Nov 5th 2024] Release of PremSQL Playground, Agents, and AgentServer
- [Nov 10th 2024] Release of Prem-1B-SQL Ollama model with Ollama support
Installation
Start by creating a virtual environment and installing PremSQL:
Note: We currently recommend using Python virtualenv instead of conda, as some users have reported compatibility issues with conda environments.
Note: The latest PremSQL release does not bundle backend dependencies. This keeps the package lighter and accommodates different backends. Install the dependencies for your preferred backend:
For Hugging Face Transformers:
For Apple MLX backend:
For Ollama integration, first install Ollama, then install the Python client:
PremSQL is designed to be versatile and hackable, with a simple code structure and decoupled components. Here are the main ways to use it:
- Use PremSQL’s pre-built Agent UI with our baseline agent to analyze CSVs, databases, or Kaggle datasets (as demonstrated in the demo video)
- Leverage PremSQL as a Python library to:
- Run the PremSQL backend API server and integrate it with your preferred programming language
Quick Start
Let’s explore how to use PremSQL’s latest baseline agent with Ollama. We’ve chosen Ollama for this guide because it’s easy to set up, requires minimal computational resources, and runs everything locally at no cost. However, you can also use Apple MLX, Hugging Face Transformers, or other supported backends.
PremSQL installation with Ollama and model downloads
First, ensure PremSQL is installed with the Ollama client. If you haven’t done so, follow the installation instructions above. We’ll use two models: Prem-1B-SQL and Llama3.2 1B. Download both models using these commands:
Launch PremSQL Server and Agent UI
PremSQL includes a CLI tool for managing the backend API server and Agent UI. Running premsql in your terminal displays:
This confirms that PremSQL is installed correctly. Verify you have version 0.1.11 or higher. Launch both the backend API server and playground with:
On first run, it will execute database migrations before starting the server and Streamlit agent UI. A successful launch looks like this:
You can now use pre-built datasets, import CSVs, or import from Kaggle. Let’s try analyzing this student performance dataset from Kaggle.
Import a dataset from Kaggle
To import a Kaggle dataset into PremSQL, ensure it contains only CSV files (multiple files are supported). Simply copy the dataset ID (in this case, spscientist/students-performance-in-exams) and paste it into the “Upload csvs or use Kaggle” field in the PremSQL navigation. After submission, you’ll see:
You’ll now see a starter code template specific to your chosen backend.
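The dataset ID pasted in this step has the form owner/dataset-slug. As a minimal sketch (not part of PremSQL), the helper below pulls that ID out of either a bare ID or a dataset page URL before you paste it. The function name `kaggle_dataset_id`, the allowed-character pattern, and the assumption that dataset pages live under `/datasets/<owner>/<slug>` are illustrative choices, not something the PremSQL documentation specifies.

```python
import re
from urllib.parse import urlparse

# Assumed shape of a Kaggle dataset ID: "<owner>/<dataset-slug>".
_DATASET_ID = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9_.-]*/[A-Za-z0-9][A-Za-z0-9_.-]*$"
)


def kaggle_dataset_id(text: str) -> str:
    """Return an "owner/slug" ID from a bare ID or a Kaggle dataset URL."""
    text = text.strip()
    if "://" in text:
        parts = [p for p in urlparse(text).path.split("/") if p]
        # Assumption: dataset pages live under /datasets/<owner>/<slug>.
        if len(parts) < 3 or parts[0] != "datasets":
            raise ValueError(f"not a Kaggle dataset URL: {text!r}")
        text = "/".join(parts[1:3])
    if not _DATASET_ID.match(text):
        raise ValueError(f"not a valid dataset ID: {text!r}")
    return text


if __name__ == "__main__":
    url = "https://www.kaggle.com/datasets/spscientist/students-performance-in-exams"
    print(kaggle_dataset_id(url))
```

Validating the ID up front gives a clear error for a mistyped value, instead of a failed import later in the UI.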
Start a PremSQL analysis session
For this demo, we’ll use the Ollama starter code. Create a new file anywhere and add this code:
Run this code in your terminal within your PremSQL environment:
You should see FastAPI server output similar to:
Copy the localhost URL (http://localhost:8162) and paste it here:
Note: This is a starter implementation using our baseline agent. You can create custom agents with different functionalities (within data analysis scope) by extending this code. The snippet above demonstrates our baseline implementation for Autonomous Analysis agents.
You’re all set! You can now perform analysis on various data sources such as CSVs, databases, and Kaggle CSV datasets.
That’s how simple it is! From here, explore the many features PremSQL offers:
PremSQL Datasets
Pre-processed datasets hosted on Hugging Face for Text-to-SQL tasks. Ideal for evaluation, fine-tuning, and creating custom datasets.
PremSQL Generators
Models that transform natural language input into SQL queries based on your database schema.
PremSQL Executors
Connects to databases and executes generated SQL queries to fetch results.
PremSQL Evaluators
Evaluates Text-to-SQL models using metrics like execution accuracy and Valid Efficiency Score (VES).
PremSQL Error Handling
Creates error handling prompts and datasets to enhance inference reliability and self-correction capabilities.
PremSQL Tuner
Fine-tunes open-source models on Text-to-SQL datasets with custom evaluation methods for optimal performance.
PremSQL Agents
End-to-end agentic workflows for querying, analyzing, and visualizing database insights using natural language. Supports custom implementations for specialized use cases.
PremSQL Playground
A ChatGPT-like interface specialized for database interactions. Deploy PremSQL agents with customized configurations for an interactive experience.
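To make the Executors and Evaluators descriptions above more concrete, here is a minimal sketch of the idea behind execution accuracy: run the predicted SQL and the reference (gold) SQL against the same database and count how often the results match. It uses only Python's built-in `sqlite3` module and does not use PremSQL's own classes. The helper names `run_query` and `execution_accuracy`, the toy `students` table, and the choice to ignore row order are all assumptions made for illustration. VES is not covered here.

```python
import sqlite3
from collections import Counter


def run_query(conn, sql):
    """Execute SQL and return (rows, error); error is None on success."""
    try:
        return conn.execute(sql).fetchall(), None
    except sqlite3.Error as exc:
        return None, str(exc)


def execution_accuracy(conn, pairs):
    """Fraction of (predicted_sql, gold_sql) pairs whose result rows match.

    Rows are compared as multisets, so row order is ignored.
    A predicted query that fails to execute counts as incorrect.
    """
    if not pairs:
        return 0.0
    correct = 0
    for predicted_sql, gold_sql in pairs:
        predicted_rows, predicted_error = run_query(conn, predicted_sql)
        gold_rows, gold_error = run_query(conn, gold_sql)
        if gold_error is not None:
            raise ValueError(f"gold query failed: {gold_error}")
        if predicted_error is None and Counter(predicted_rows) == Counter(gold_rows):
            correct += 1
    return correct / len(pairs)


# Hypothetical toy database, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, math_score INTEGER)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [("Ada", 91), ("Ben", 67), ("Cy", 78)],
)

pairs = [
    # Different SQL text, same result rows: counted as correct.
    ("SELECT name FROM students WHERE math_score > 70",
     "SELECT name FROM students WHERE math_score >= 71"),
    # Both execute, but the results differ: counted as incorrect.
    ("SELECT COUNT(*) FROM students",
     "SELECT COUNT(*) FROM students WHERE math_score > 70"),
    # Predicted query references a misspelled column: counted as incorrect.
    ("SELECT nme FROM students",
     "SELECT name FROM students"),
]

accuracy = execution_accuracy(conn, pairs)
print(f"execution accuracy: {accuracy:.2f}")
```

Comparing executed results rather than SQL text means two differently written queries still count as equivalent when they return the same rows, which is why this style of metric suits Text-to-SQL evaluation.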
Why PremSQL? The Vision
PremSQL is focused on creating local Text-to-SQL workflows. In many scenarios, organizations need to maintain data privacy while leveraging generative AI solutions for productivity and innovation. PremSQL addresses this need by keeping your data entirely local.
Key Use Cases:
- Interactive database querying and analysis
- RAG systems with database integration
- Intelligent SQL autocompletion
- Self-hosted AI-powered data analysis
- Autonomous agentic pipelines with secure database access
How is it different?
While many libraries excel at building general AI workflows, they often present a steep learning curve for customization. PremSQL simplifies this process, giving you complete control over your data while seamlessly integrating with existing LangChain, LlamaIndex, or DSPy workflows.
Join Our Community
We invite you to participate in our open-source initiative! Your contributions, feedback, and issue reports are crucial to our growth. For more information on how to contribute, please check our contributing guidelines.
Stay connected and follow our GitHub repository for the latest updates and improvements!