POST /v1/chat/completions
Creates a model response for the given chat conversation. Supports streaming via server-sent events (SSE); see the streaming documentation for details.
Authorizations
Body
The ID of the project to use.
A list of messages comprising the conversation so far.
The ID of the session to use. Used to track the chat history.
Options for Retrieval-Augmented Generation (RAG). Overrides the launched model's settings.
ID of the model to use. See the model endpoint compatibility table for details.
The system prompt to use.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency.
-2 < x < 2
JSON object that maps tokens to an associated bias value from -100 to 100.
The maximum number of tokens to generate in the chat completion.
x > 0
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far.
-2 < x < 2
An object specifying the format that the model must output.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically.
Up to 4 sequences where the API will stop generating further tokens.
If set, partial message deltas will be sent, like in ChatGPT.
What sampling temperature to use, between 0 and 2.
0 < x < 2
An alternative to sampling with temperature, called nucleus sampling.
A list of tools the model may call. Currently, only functions are supported as a tool.
A unique identifier representing your end-user.
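The request body described above can be sketched as follows. This is a hedged example: the endpoint path comes from the section title, and the field names mirror the parameters listed above, but the exact JSON key names, the model ID, and the project/session IDs are placeholders, not values confirmed by this reference.

```python
import json

# Hypothetical request body for POST /v1/chat/completions.
# Key names follow the parameters documented above; all IDs are placeholders.
payload = {
    "project_id": "proj_example",        # the ID of the project to use
    "session_id": "sess_example",        # used to track the chat history
    "model": "example-model",            # see the model endpoint compatibility table
    "system_prompt": "You are a helpful assistant.",
    "messages": [                        # the conversation so far
        {"role": "user", "content": "Hello!"}
    ],
    "temperature": 0.7,                  # 0 < x < 2
    "frequency_penalty": 0.0,            # -2 < x < 2
    "presence_penalty": 0.0,             # -2 < x < 2
    "max_tokens": 256,                   # x > 0
    "stop": ["\n\n"],                    # up to 4 stop sequences
    "stream": False,                     # set True to receive SSE partial deltas
}

body = json.dumps(payload)
```

With `"stream": True`, the server would send partial message deltas over SSE instead of a single JSON response.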
Response
A list of chat completion choices. Can be more than one if n is greater than 1.
The Unix timestamp (in seconds) of when the chat completion was created. Each chunk has the same timestamp.
The model used to generate the completion.
The name of the provider that generated the completion.
The ID of the provider that generated the completion.
The usage statistics for the completion.
The trace ID of the completion.
The chunks used to improve the completion.
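A non-streaming response shaped after the fields above might be parsed as follows. This is a sketch under assumptions: the key names (`choices`, `created`, `provider_name`, `usage`, `trace_id`, `chunks`) are inferred from the field descriptions and may differ from the actual wire format.

```python
import json

# Hypothetical response body modeled on the documented fields;
# the actual key names are assumptions, not confirmed by this reference.
raw = """
{
  "choices": [
    {"index": 0, "message": {"role": "assistant", "content": "Hi there!"}}
  ],
  "created": 1700000000,
  "model": "example-model",
  "provider_name": "example-provider",
  "provider_id": "prov_123",
  "usage": {"prompt_tokens": 5, "completion_tokens": 3, "total_tokens": 8},
  "trace_id": "trace_abc",
  "chunks": []
}
"""

resp = json.loads(raw)

# With n > 1, "choices" contains more than one entry; take the first here.
answer = resp["choices"][0]["message"]["content"]
total_tokens = resp["usage"]["total_tokens"]
```

When streaming is enabled, each SSE chunk carries the same `created` timestamp, so the field can be read from any chunk.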