Configuration via Portal
You can modify your configuration on the Intent Overview page. Parameters passed in your Chat Completion request override whatever is set in the Portal.
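For example, a request-level temperature takes precedence over the Portal value for that request only. Below is a minimal sketch, assuming the maitai Python SDK exposes an OpenAI-style chat.completions.create interface; the import name and client construction are illustrative, not a confirmed API.

```python
import maitai  # assumed import name for the maitai Python SDK

# Client construction mirrors the OpenAI-style interface; treat this as
# a sketch rather than the confirmed API surface.
client = maitai.Maitai()

# Parameters passed here override the Portal configuration for this
# request only; anything omitted falls back to the Portal values.
response = client.chat.completions.create(
    model="gpt-4o",   # overrides the Model setting in the Portal
    temperature=0.2,  # overrides the Temperature setting in the Portal
    messages=[{"role": "user", "content": "Summarize our return policy."}],
)
print(response.choices[0].message.content)
```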
Parameters Available in Portal
**Inference Location**: Setting this to Client enables provider calls client-side through the SDK, using your own provider keys; this is only available for OpenAI models. Setting it to Server tells Maitai to make the request to the provider from our servers.

**Evaluations**: Enable Evaluations to monitor your LLM output for faults.

**Apply Corrections**: Corrections are only available for Server inference and require Evaluations to be turned on.

**Safe Mode**: Prioritizes accuracy over latency. Only available when Apply Corrections is enabled.

**Model**: Primary AI model to be used for inference. Models are grouped by provider and sorted alphabetically, with Maitai models (recommended) appearing first.

**Secondary Model**: Optional fallback model to use if the primary model is unavailable or its performance is degraded.
**Fallback Strategy**: Strategy to use when falling back to the secondary model (see the sketch after this list). Options:
- reactive: Falls back when the primary model fails. If a timeout is specified, the fallback request is initiated after the timeout period.
- timeout: Falls back after the specified timeout period.
- first_response: Uses whichever model responds first. If a timeout is specified, the fallback model's response is only used after the timeout period.
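The behavior of the three strategies can be sketched in plain Python with asyncio. This is only an illustration of the semantics described above, not Maitai's implementation; call_model and all timings are hypothetical.

```python
import asyncio

async def call_model(model: str, prompt: str) -> str:
    # Hypothetical stand-in for one provider request; simulates latency.
    await asyncio.sleep(0.5 if model == "primary" else 0.2)
    return f"{model} answered: {prompt!r}"

async def reactive(primary, secondary, prompt, timeout=None):
    # Fall back only when the primary call fails (or, if a timeout is
    # set, once that period elapses without a response).
    try:
        return await asyncio.wait_for(call_model(primary, prompt), timeout)
    except Exception:
        return await call_model(secondary, prompt)

async def timeout_fallback(primary, secondary, prompt, timeout):
    # Fall back once the specified timeout period elapses.
    try:
        return await asyncio.wait_for(call_model(primary, prompt), timeout)
    except asyncio.TimeoutError:
        return await call_model(secondary, prompt)

async def first_response(primary, secondary, prompt, timeout=None):
    # Race both models and use whichever answers first; a timeout delays
    # the secondary request so its answer is only used after that period.
    async def secondary_call():
        if timeout:
            await asyncio.sleep(timeout)
        return await call_model(secondary, prompt)

    tasks = [asyncio.create_task(call_model(primary, prompt)),
             asyncio.create_task(secondary_call())]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    return done.pop().result()

if __name__ == "__main__":
    print(asyncio.run(first_response("primary", "secondary", "hello", timeout=0.1)))
```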
**Fallback Timeout**: Timeout in seconds before falling back to the secondary model. Only applicable when using the "timeout" fallback strategy.
**Temperature**: Controls randomness in the model's output. Range 0-2, where:
- Lower values (e.g., 0): more deterministic output
- Higher values (e.g., 2): more random output

**Stop**: Sequences where the API will stop generating further tokens.

**Logprobs**: Whether to include the log probabilities of the output tokens.

**N**: How many completions to generate for each prompt.

**Max Tokens**: The maximum number of tokens to generate in the completion.

**Presence Penalty**: Range -2 to 2. Positive values increase the model's likelihood to talk about new topics.

**Frequency Penalty**: Range -2 to 2. Positive values decrease the model's likelihood to repeat the same line verbatim.
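These sampling parameters follow the usual OpenAI chat completion conventions and can be supplied per request to override the Portal values. A hedged usage sketch, reusing the same hypothetical client as above (all values are examples, not recommendations):

```python
import maitai  # assumed import name for the maitai Python SDK

client = maitai.Maitai()  # illustrative client construction, as above

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Draft a product tagline."}],
    n=3,                    # generate three completions for the prompt
    max_tokens=64,          # cap each completion at 64 tokens
    temperature=0.9,        # favor more varied output
    stop=["\n\n"],          # stop generating at a blank line
    logprobs=True,          # include log probabilities of output tokens
    presence_penalty=0.5,   # nudge the model toward new topics
    frequency_penalty=0.5,  # discourage verbatim repetition
)
for choice in response.choices:
    print(choice.message.content)
```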