Before you get started

Use these cards to navigate through this repositories guide depending on your use case. We suggest you go through each section to get a better understanding of how to use Prem’s repositories.

Repositories are Prem’s way of utlizing easy to use RAG (Retrieval-Augmented Generation) with a model of your choice.

RAG is a powerful technique that combines the strengths of retrieval-based systems and generative models to enable assistants to provide accurate and contextually relevant responses based on the content of your documents.

Also in plain english, RAG gives your models the ability to interact with your documents, gain knowledge and use it to generate responses based on the content of your documents.

This guide will show you how to create RAG-based assistant without having to worry about the underlying infrastructure like integrating a vector database or setting up a retrieval pipeline.

All you need to do is provide your documents and focus on the fun part, building your custom models and putting shipping your apps.

image of the repositories page.

Prerequisites

  • Collect the documents that you’d like the AI model of your choice to interact with.

  • Get a basic understanding of Prem’s APIs and SDKs.

You can also connect services like AWS S3, Cloudeflare R2 or a self hosted Qdrant vector database to a repository.

Make sure you have access to the credentials of the service you want to connect to.

Keep in mind you can only connect one serivce to a repository. We will cover how to connect these serrvices in this guide.

Quick-Start: Manual Document Upload

Here is a step by step guide on how to upload documents to a repository manually.

Its really simple and only takes a few steps. Just drag and drop your documents or go through your file explorer/finder to upload your documents and we will take care of the rest.

1

Create A Repository and Upload Your Documents

The max file size is: 10MB.

Prem supports file extensions such as: pdf, txt, docx, md, sh, bash, zsh, js, ts, py, tf, tfvars, and tfstate.

Ensure that the documents are clean, well-structured, and free of any formatting issues.

  • Use the sidebar to navigate to the Repositories page. Click New Repository to create a new place to store your documents.
Create a new repository

You can also create a Repository using the SDK or the API.

read the SDK and API documentation for more information.

const response = await client.repositories.create({
    name: "test-repository",
    organization: "your-org-name",
    description: "This is a test repository"
}) 
  • Upload your documents to a repository using the Prem web app (click the add document button), API or using the SDK. You can either upload documents individually or in batches. Make sure to provide meaningful names or identifiers for your documents.
upload a new document to a repository

You can also upload documents using the SDK or the API.

const response = await client.repository.document.create({
    repository_id: 3,
    file: "./testfile.txt" // This could be a .txt, .pdf, .docx file.
})

console.log(response)
  • Once the documents are uploaded, Prem will automatically index them to enable fast and efficient retrieval.

The indexing process may take some time depending on the size and number of documents.

You can view the indexes of a document by clicking on each document in the repository as shown in the image below.

View the indexes of a document
2

Choose a Model and Configure Retrieval Settings

You have three ways to configure retrival settings. In the Lab for testing, in the Launchpad for production and in your code using the SDK or API for custom use cases.

Once you’ve uploaded your documents, you can choose a model and configure the retrieval settings. Head over to the Lab or the Launchpad if you’re ready to give a model you’re planning to put in production the knowledge of the doucments.

You can edit the following retrieval settings in the Lab,Launchpad or in your code using the SDK or API:

  • Limit: Sets the maximum number of top-matching documents to retrieve based on similarity.

  • Similarity: Measures how closely a query matches document embeddings. Higher values indicate greater similarity, ranging from 0 to 1.

Here’s a quick overview of the differences between the three ways to configure retrieval settings:

Configure in Lab

Test out how your model performs with the documents in the Lab playground. Click Repo and add a repository to each model you’d like to test. You can add up to 4 repositories to each model.

add a repository to the lab playground

Configure on the Launchpad

Head over to the Launchpad and click on the Add Repository button. Its right underneath the system prompt text area. You can also edit the retrieval settings in the Launchpad under Linked repositories. You can add up to 4 repositories to the model you are launching.

add a repository to a launchpad

Configure in Code

Use the SDK or API to configure the retrieval settings in your code. Its really simple and only takes a few lines of code. You can add up to 4 repositories to a model.

const messages = [
    { role: "user", content: "Which is Jack's pet name?" },
]

const repositories = {
  ids: [REPOSITORY_ID, ...],
  similarity_threshold: 0.65,
  limit: 3
}

// Create completion
const response = await client.chat.completions.create({
  project_id,
  messages,
  repositories,
  stream: false
})

console.log(response.choices[0].message.content)
// "Jack's pet name is Sparky."

console.log(response.document_chunks)
// E.g. [{ repository_id=4, document_id=14, chunk_id=15, document_name="pets_and_their_owners.txt", similarity_score=0.67, content="..." }, ...]

3

Testing and Evaluation

Before deploying your configured model, it’s crucial to thoroughly test and evaluate its performance in the Lab.

  • Test Queries: Prepare a set of test queries that cover different scenarios and edge cases. Send these queries to your assistant and evaluate the quality and relevance of the generated responses.

  • Iterative Refinement: Based on the evaluation results, iteratively refine your assistant by adjusting the retrieval settings, fine-tuning the generative model, or improving the document collection.

You can see which documents were used as a verified source at the bottom of the model’s response.
View the verified source of a response when using a repository in the lab
4

Deployment and Integration

Once you’re satisfied with the performance of your models, you can deploy them with the Launchpad and integrate your application with the Prem SDKs or API.

Advanced Guide

Advanced Configurations

When creating a repository, you can configure the following advanced settings:

  • Connector: Connect your repository to an external service like AWS S3, or Cloudflare R2.

  • Vector Store Type: Choose between Prem’s Vector DB or your own self hosted Qdrant instance.

  • Strategy: Choose between Default RAG or Hybrid RAG strategies.

Default RAG and Hybrid RAG are two different strategies for using a repository. Here’s a quick overview of the differences:

Default RAG (Tradtional)

  • Single retrieval step before generation
  • Simple implementation
  • Best for straightforward Q&A

Hybrid RAG

  • Multiple retrieval steps and/or methods
  • Can combine different retrieval approaches
  • Better for complex queries
  • More robust but complex implementation
  • Examples: recursive retrieval, self-querying, mixed knowledge sources
  • Embedding Model: Choose between the following Embedding Models to use for your repository:

    • text-embedding-3-large
    • text-embedding-3-small
    • text-embedding-ada-002

Learn more about embeddings here.

An embedding⁠ is a sequence of numbers that represents the concepts within content such as natural language or code. Embeddings make it easy for machine learning models and other algorithms to understand the relationships between content and to perform tasks like clustering or retrieval. They power applications like knowledge retrieval in both ChatGPT and the Assistants API, and many retrieval augmented generation (RAG) developer tools.

Connecting A Repository to AWS S3

You need the following parameters to connect your repository to AWS S3:

You can find the connector name, bucket name, access key and secret key in the AWS S3 Console.

  • Connector Name: Your AWS S3 connector name.

  • Bucket Name: Your AWS S3 bucket name.

  • Access Key: Your AWS S3 access key.

  • Secret Key: Your AWS S3 secret key.

Connect a repository to AWS S3

Connecting A Repository to Cloudflare R2

You need the following parameters to connect your repository to Cloudflare R2:

You can find the connector name, bucket name, access key and secret key in the Cloudflare R2 Dashboard.

  • Connector Name: Your Cloudflare R2 connector name.

  • Bucket Name: Your Cloudflare R2 bucket name.

  • Account ID: Your Cloudflare R2 account ID.

  • R2 Access Key: Your Cloudflare R2 access key.

  • R2 Secret Key: Your Cloudflare R2 secret key.

Connect a repository to Cloudflare R2

Using your own self hosted Qdrant Vector DB

You need the following parameters to connect your repository to your self hosted Qdrant Vector DB:

  • API URL: Your Qdrant API URL.

  • API Key: Your Qdrant API key.

You only need the API key if a password is required to access the Qdrant instance.

Connect a repository to your self hosted Qdrant Vector DB