POST /v1/chat/completions
curl --request POST \
  --url https://premai.io/v1/chat/completions \
  --header 'Authorization: <api-key>' \
  --header 'Content-Type: application/json' \
  --data '{
  "project_id": 123,
  "session_id": "<string>",
  "repositories": {
    "ids": [
      123
    ],
    "limit": 3,
    "similarity_threshold": 0.5
  },
  "messages": [
    {
      "role": "user",
      "content": "<string>"
    }
  ],
  "model": "<string>",
  "system_prompt": "<string>",
  "frequency_penalty": 2,
  "logit_bias": {},
  "max_tokens": 1,
  "presence_penalty": 0,
  "response_format": {},
  "seed": 123,
  "stop": "<string>",
  "stream": true,
  "temperature": 1,
  "top_p": 123,
  "tools": [
    {}
  ],
  "user": "<string>"
}'
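For callers working in Python, the curl request above can be reproduced with the standard library alone. This is a sketch: the endpoint URL and the `Authorization` header format are copied from the example as-is, and the API key and body values are placeholders.

```python
import json
import urllib.request


def build_chat_request(api_key, project_id, messages,
                       base_url="https://premai.io"):
    """Build a POST request mirroring the curl example above.

    The URL and header format are taken from the example, not
    independently verified; adjust them if your deployment differs.
    """
    body = {
        "project_id": project_id,
        "messages": messages,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": api_key,  # placeholder key format
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Sending is left to the caller, e.g.:
# with urllib.request.urlopen(build_chat_request("<api-key>", 123, [...])) as resp:
#     print(json.load(resp))
```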
{
  "choices": [
    {
      "index": 123,
      "message": {
        "role": "user",
        "content": "<string>"
      },
      "finish_reason": "<string>"
    }
  ],
  "created": 123,
  "model": "<string>",
  "provider_name": "<string>",
  "provider_id": "<string>",
  "document_chunks": [
    {
      "repository_id": 123,
      "document_id": 123,
      "chunk_id": 123,
      "document_name": "<string>",
      "similarity_score": 123,
      "content": "<string>"
    }
  ],
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 123,
    "total_tokens": 123
  },
  "trace_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a"
}

Authorizations

Authorization
string
header
required

Body

project_id
integer
required

The ID of the project to use.

messages
object[]
required

A list of messages comprising the conversation so far.

session_id
string

The ID of the session to use; it is used to track the chat history across requests.

repositories
object

Options for Retrieval-Augmented Generation (RAG). These settings override the launched model's settings.
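As a sketch, the `repositories` object matches the shape in the request example above; the field meanings below are inferred from the names and the `similarity_score` field in the response, and the values are illustrative.

```python
# RAG options, shaped like the `repositories` object in the request example.
repositories = {
    "ids": [123, 456],           # repository IDs to retrieve from
    "limit": 3,                  # cap on retrieved document chunks
    "similarity_threshold": 0.5, # drop chunks scoring below this (inferred)
}
```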

model
string

ID of the model to use. See the model endpoint compatibility table for details.

system_prompt
string

The system prompt to use.

frequency_penalty
number

Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency.

Required range: -2 < x < 2
logit_bias
object | null

JSON object that maps tokens to an associated bias value from -100 to 100.
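A minimal illustration of the mapping, assuming token-ID keys as in similar APIs; the token IDs below are placeholders, not real IDs for any particular model.

```python
# Hypothetical logit_bias map: keys are token IDs (model-specific,
# placeholders here), values are biases in [-100, 100].
logit_bias = {
    "50256": -100,  # effectively ban this token
    "1234": 10,     # nudge this token to be more likely
}
```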

max_tokens
integer | null

The maximum number of tokens to generate in the chat completion.

Required range: x > 0
presence_penalty
number

Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far.

Required range: -2 < x < 2
response_format
object | null

An object specifying the format that the model must output.

seed
integer | null

This feature is in Beta. If specified, our system will make a best effort to sample deterministically.

stop
string | null

Up to 4 sequences where the API will stop generating further tokens.

stream
boolean

If true, partial message deltas are sent as they become available, as in ChatGPT.
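The wire format of the stream is not shown in this reference; assuming the common `data:`-prefixed server-sent-events framing used by similar APIs, a consumer might look like this sketch.

```python
import json


def iter_stream_deltas(lines):
    """Parse server-sent-event lines into delta payloads.

    Assumes 'data: {...}' framing terminated by 'data: [DONE]',
    as used by comparable chat APIs; the actual format may differ.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip comments, blank keep-alives, etc.
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        yield json.loads(payload)


# Illustrative chunks, not captured from a real response:
sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
text = "".join(c["choices"][0]["delta"]["content"]
               for c in iter_stream_deltas(sample))
# text == "Hello"
```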

temperature
number | null

What sampling temperature to use, between 0 and 2.

Required range: 0 < x < 2
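The numeric constraints above can be checked client-side before sending a request. This is only a convenience sketch of the documented ranges; the server remains the authority on what it accepts.

```python
def validate_sampling_params(params):
    """Check the parameter ranges documented above; return a list of
    human-readable problems (empty when everything is in range)."""
    errors = []
    fp = params.get("frequency_penalty")
    if fp is not None and not -2 <= fp <= 2:
        errors.append("frequency_penalty must be between -2.0 and 2.0")
    pp = params.get("presence_penalty")
    if pp is not None and not -2 <= pp <= 2:
        errors.append("presence_penalty must be between -2.0 and 2.0")
    t = params.get("temperature")
    if t is not None and not 0 <= t <= 2:
        errors.append("temperature must be between 0 and 2")
    mt = params.get("max_tokens")
    if mt is not None and mt <= 0:
        errors.append("max_tokens must be greater than 0")
    return errors
```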
top_p
number | null

An alternative to sampling with temperature, called nucleus sampling, where the model considers only the tokens comprising the top_p probability mass.

tools
object[]

A list of tools the model may call. Currently, only functions are supported as a tool.
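The reference does not show the tool schema itself. Assuming an OpenAI-compatible function-tool shape (a common convention for chat-completions APIs), a tool entry might look like this hypothetical example.

```python
# Hypothetical function tool, assuming an OpenAI-compatible schema;
# the exact shape accepted by this endpoint is not shown above.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",  # illustrative name
        "description": "Look up the current weather for a city.",
        "parameters": {  # JSON Schema for the function arguments
            "type": "object",
            "properties": {
                "city": {"type": "string"},
            },
            "required": ["city"],
        },
    },
}

# The request body would then carry: "tools": [weather_tool]
```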

user
string | null

A unique identifier representing your end-user.

Response

200
application/json
choices
object[]
required

A list of chat completion choices. There can be more than one if multiple completions are requested.
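Given a response shaped like the 200 example above, the reply text can be pulled out as in this sketch (error handling omitted).

```python
def first_message(response):
    """Return the first choice's message content from a response
    shaped like the 200 example above."""
    return response["choices"][0]["message"]["content"]


# Illustrative response fragment, following the documented schema:
sample = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hi!"},
            "finish_reason": "stop",
        }
    ]
}
# first_message(sample) == "Hi!"
```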

created
integer
required

The Unix timestamp (in seconds) of when the chat completion was created. When streaming, each chunk carries the same timestamp.

model
string
required

The model used to generate the completion.

provider_name
string
required

The name of the provider that generated the completion.

provider_id
string
required

The ID of the provider that generated the completion.

usage
object
required

The usage statistics for the completion.

trace_id
string
required

The trace ID of the completion.

document_chunks
object[]

The document chunks retrieved via RAG and used to improve the completion.
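A small client-side sketch for inspecting the returned chunks: the field names follow the response schema above, and the threshold value is illustrative.

```python
def top_chunks(document_chunks, min_score=0.5):
    """Keep chunks scoring at or above `min_score`, highest first.

    Field names (`similarity_score`, etc.) follow the response
    schema above; the default threshold is illustrative.
    """
    kept = [c for c in document_chunks if c["similarity_score"] >= min_score]
    return sorted(kept, key=lambda c: c["similarity_score"], reverse=True)


# Illustrative chunks, not real retrieval output:
chunks = [
    {"chunk_id": 1, "similarity_score": 0.4, "content": "a"},
    {"chunk_id": 2, "similarity_score": 0.9, "content": "b"},
    {"chunk_id": 3, "similarity_score": 0.6, "content": "c"},
]
# top_chunks(chunks) keeps chunk_ids 2 and 3, in that order.
```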