Welcome to the PremSQL library, a powerful tool for building self-hosted, end-to-end Text-to-SQL pipelines. PremSQL offers a modular design where each component functions independently, enabling you to create fully customized workflows.

PremSQL GitHub

Star the project to follow our rapid development toward the best local Text-to-SQL solution.

News

  • [Sep 10th 2024] First release of PremSQL
  • [Sep 10th 2024] First release of Prem-1B-SQL (fully local Text-to-SQL model)

Core Components

All the components work independently, and each aims to accomplish a single task. We recommend going through the components in sequence to get an overall picture of how they fit together and how to use them to the fullest.

Why PremSQL? The Vision

PremSQL focuses on creating local Text-to-SQL workflows. Often you cannot or do not want to share data with third-party systems, yet you still need to build generative AI solutions for productivity and innovation. PremSQL addresses this by keeping your data local.

Use Cases:

  • Database Q&A
  • RAG with database integration
  • SQL autocompletion
  • AI-powered, self-hosted data analysis
  • Autonomous agentic pipelines with database access

How is it different from LangChain, Llama-Index, or DSPy?

While these libraries excel at building general AI workflows, they often come with a steep learning curve for customization. PremSQL simplifies this, giving you full control over your data and smooth integration with existing LangChain, Llama-Index, or DSPy workflows.

Getting Started

Install PremSQL with:

pip install -U premsql
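
To confirm which version ended up in the active environment, the standard library's `importlib.metadata` can be queried. This is a small sketch; `installed_version` is a hypothetical helper written for this example and is not a PremSQL function.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report whether premsql is visible to the current interpreter.
premsql_version = installed_version("premsql")
if premsql_version is None:
    print("premsql is not installed in this environment")
else:
    print(f"premsql {premsql_version} is installed")
```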

Here’s a quick starter example for chatting with a sample database:

from premsql.pipelines import SimpleText2SQLAgent
from premsql.generators import Text2SQLGeneratorHF
from premsql.executors import SQLiteExecutor

# Path to a local SQLite file; replace it with the path to your own database
dsn_or_db_path = "./data/db/california_schools.sqlite"

agent = SimpleText2SQLAgent(
    dsn_or_db_path=dsn_or_db_path,
    generator=Text2SQLGeneratorHF(
        model_or_name_or_path="premai-io/prem-1B-SQL",
        experiment_name="simple_pipeline",
        device="cuda:0",
        type="test"
    ),
)

question = "please list the phone numbers of the direct charter-funded schools that are opened after 2000/1/1"

# Run the question through the agent and print the "table" entry of the response
response = agent.query(question)
print(response["table"])
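
Before pointing the agent at a database, it can help to confirm that the SQLite file exists and to see which tables it contains. The following is a minimal sketch using only Python's standard `sqlite3` module. `list_tables` and `make_demo_db` are hypothetical helpers written for this example and are not part of PremSQL. The demo builds a throwaway database so the block runs on its own, and its table names (`schools`, `districts`) are invented for illustration.

```python
import os
import sqlite3
import tempfile
from pathlib import Path
from typing import List


def list_tables(db_path: str) -> List[str]:
    """Return the names of user tables in a SQLite file, sorted alphabetically."""
    path = Path(db_path)
    if not path.is_file():
        # sqlite3.connect would silently create an empty database here,
        # so fail early with a clear message instead.
        raise FileNotFoundError(f"No SQLite database found at {db_path}")
    conn = sqlite3.connect(str(path))
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


def make_demo_db() -> str:
    """Create a small throwaway database (table names are invented for the demo)."""
    fd, demo_path = tempfile.mkstemp(suffix=".sqlite")
    os.close(fd)
    conn = sqlite3.connect(demo_path)
    try:
        conn.execute("CREATE TABLE schools (name TEXT, phone TEXT)")
        conn.execute("CREATE TABLE districts (name TEXT)")
        conn.commit()
    finally:
        conn.close()
    return demo_path


demo_path = make_demo_db()
tables = list_tables(demo_path)
print(tables)  # ['districts', 'schools']
os.remove(demo_path)
```

With a real file, `list_tables(dsn_or_db_path)` could be called before constructing the agent, so that a wrong path fails immediately rather than after the model has been loaded.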

Explore more detailed tutorials and learn about PremSQL’s offerings and future plans below.

Roadmap

We are excited to announce the successful rollout of the first release of the PremSQL library. Alongside the release, we are committed to continuously improving the existing documentation to enhance the overall developer experience.

  • Synthesizer Component:
    A significant feature of PremSQL is the synthesizer component, designed to generate synthetic datasets from private data. This capability allows for fine-tuning smaller language models, enabling fully private text-to-SQL workflows that safeguard sensitive data.

  • Agentic Pipelines with Function-Calling Features:
    Future releases will add agentic methods built on function calling, including graph plotting, natural language analysis, and other enhancements that broaden what the system can do.

  • Training Better Small Language Models:
    We are actively training small language models tailored specifically to PremSQL’s unique requirements. These models will be continually refined and optimized, ensuring they become more efficient and effective in handling designated tasks.

  • Optimization of Generators and Executors:
    We are working to make existing components, such as generators and executors, more robust. Planned improvements include parallel processing to speed up both generation and execution.

  • Stability and UI Enhancements:
    As we move forward, we aim to include comprehensive stability tests for the entire library. A simple UI will also be rolled out to further improve user interaction and accessibility.

We invite you to join us in our open-source initiative! Your contributions, feedback, and issue submissions are invaluable in helping us grow. For more details on how to contribute, please refer to our contributing guidelines.

Stay tuned and follow our GitHub repository for the latest updates and improvements!