PUT /chat/completions/response
Option A (recommended): Use Maitai SDK with client-side inference + storage
If you’re okay with Maitai wrapping your provider client, you can run inference locally and have the SDK automatically store the request/response in Maitai. Key flags:

- `server_side_inference = false` (don’t route inference through Maitai)
- `evaluation_enabled = false` (store-only; no evaluation)
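A minimal sketch of Option A. The two flags above come from this page; the import, client surface, and the remaining keyword names are assumptions about the Python SDK, not its documented API, so adapt them to the SDK version you have installed.

```python
# Sketch only: server_side_inference / evaluation_enabled are the flags documented
# above; every other name here is illustrative and may differ in the real SDK.
import maitai  # assumes the Maitai Python SDK is installed

response = maitai.chat.completions.create(
    application_ref_name="my_app",      # your Application
    action_type="CONVERSATION",         # the Intent / ApplicationAction
    session_id="session-123",           # threads this traffic into Sessions
    server_side_inference=False,        # run inference locally with your own provider client
    evaluation_enabled=False,           # store-only; skip evaluations/corrections
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)
```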
Option B: Bring your own client and upload the request/response for storage only
If you already have an LLM response from your own client, you can upload the request/response pair directly to Maitai for indexing.

Required fields

- `chat_completion_request.application_ref_name`
- `chat_completion_request.action_type` (this is the “Intent” / “ApplicationAction”)
- `chat_completion_request.session_id` (strongly recommended so the traffic threads into Sessions)
- `chat_completion_request.params` (the model inputs; at minimum include `messages` and `model`)
- `chat_completion_response` (OpenAI-style `chat.completion` response shape)
cURL example
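A hedged sketch of the storage call using the fields listed above. The base URL and the `x-api-key` header are placeholders, not confirmed values; substitute whatever endpoint and authentication your Maitai environment uses. The payload values are illustrative.

```bash
# Base URL and auth header are placeholders -- replace with your Maitai endpoint and API key.
curl -X PUT "https://api.example-maitai-host.com/chat/completions/response" \
  -H "Content-Type: application/json" \
  -H "x-api-key: $MAITAI_API_KEY" \
  -d '{
    "chat_completion_request": {
      "application_ref_name": "my_app",
      "action_type": "CONVERSATION",
      "session_id": "session-123",
      "params": {
        "model": "gpt-4o",
        "messages": [
          {"role": "user", "content": "What is the capital of France?"}
        ]
      }
    },
    "chat_completion_response": {
      "id": "chatcmpl-abc123",
      "object": "chat.completion",
      "created": 1715000000,
      "model": "gpt-4o",
      "choices": [
        {
          "index": 0,
          "message": {"role": "assistant", "content": "Paris."},
          "finish_reason": "stop"
        }
      ],
      "usage": {"prompt_tokens": 14, "completion_tokens": 3, "total_tokens": 17}
    }
  }'
```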
Notes
- This endpoint is storage/indexing only. It does not run inference, evaluations, or corrections.
- Maitai stores these as PROD traffic and marks the inference location as CLIENT (inference did not run server-side through Maitai).