Authorizations
Bearer authentication header of the form Bearer <token>
, where <token>
is your auth token.
Body
An array of messages comprising the conversation so far. Must contain at least one message. System messages are only allowed as the first message.
1
The identifier of the model to use for generating completions. This can be a model ID or an alias.
A value between -2.0 and 2.0 that penalizes new tokens based on their frequency in the text so far. Higher values decrease the likelihood of the model repeating the same tokens.
-2 <= x <= 2
The maximum number of tokens to generate in the completion. If null, will use the model's maximum context length. This is the maximum number of tokens that will be generated.
x > 0
A value between -2.0 and 2.0 that penalizes new tokens based on whether they appear in the text so far. Higher values increase the likelihood of the model talking about new topics.
-2 <= x <= 2
A seed value for deterministic sampling. Using the same seed with the same parameters will generate the same completion.
One or more sequences where the API will stop generating further tokens. Can be a single string or an array of strings.
If true, partial message deltas will be sent as server-sent events. Useful for showing progressive generation in real-time.
Controls randomness in the model's output. Values between 0 and 2. Lower values make the output more focused and deterministic, higher values make it more random and creative.
0 <= x <= 2
An alternative to temperature for controlling randomness. Controls the cumulative probability of tokens to consider. Lower values make output more focused.
0 <= x <= 1
Specifies the format of the model's output. Use "json_schema" to constrain responses to valid JSON matching the provided schema.
A list of tools the model may call. Each tool has a specific function the model can use to achieve specific tasks.
Controls how the model uses tools. "none" disables tools, "auto" lets the model decide, or specify a particular tool configuration.
none
Response
chat completion response
A unique identifier for this chat completion response. Can be used for tracking or debugging.
An array of completion choices. Each choice represents a possible completion for the input prompt, though currently only one choice is typically returned.
The Unix timestamp (in seconds) indicating when this completion was generated by the API.
The specific model used to generate this completion. This will be the model's full identifier string.
The type of object returned, always "chat.completion" for chat completion responses.
chat.completion
A unique identifier for the system state that generated this response. Useful for tracking model behavior across requests.
Statistics about token usage for this request and response. May be omitted in error cases or when not available.