How to Use Response Contract Evidence to Harden LLM API Failover

Last reviewed: 2026-07-15

Direct answer

Do not promote an LLM API failover route because one request returned HTTP 200. Promote it only after the exact endpoint and model you will use return the response fields your application reads, and after controlled error requests are classified correctly. A bounded CometAPI validation used eight contract requests plus one authenticated model-catalog preflight. The non-streaming contracts passed. The two streaming checks were negative samples: both returned HTTP 200 and a complete SSE envelope, but neither returned content.

That distinction matters. Transport success proves that a server responded; it does not prove that the output is usable. A failover contract should separately assert HTTP status, object type, completion state, content, and error fields. This guide validates two specific CometAPI route contracts; it does not compare CometAPI reliability with other vendors.

Who this is for

This guide is for backend and platform engineers who need evidence before routing production traffic between Chat Completions and Responses endpoints. It focuses on small, repeatable contract checks rather than uptime, latency, price, quota, or provider-wide quality claims.

Key takeaways

A Responses request using gpt-5-nano-2025-08-07 returned HTTP 200, object response, status completed, and passed the content assertion.
A non-streaming Chat Completions request using gpt-4.1-nano returned HTTP 200, object chat.completion, finish reason stop, and passed the content assertion.
Invalid authorization returned HTTP 401 with code, message, and type in the error object.
Omitting messages or input returned HTTP 400 with error fields code, message, param, and type; the observed code was invalid_request.
Two Chat SSE calls each returned three chunks and a terminal [DONE], but zero content characters and finish_reason=length at a 64-token cap. Treat that as a failed content contract, not a successful stream.

The exact observations above are recorded in the sanitized eight-case contract evidence artifact. It contains status, structure, usage totals, and pass/fail classifications only; it excludes credentials, raw request IDs, full prompts, and full responses. The record also discloses that the catalog preflight brought total HTTP traffic to nine requests, one above the intended eight-request ceiling; no further CometAPI requests were made.

Contract details to verify

Contract area	Value or assertion	Evidence
API base and endpoint paths	`https://api.cometapi.com/v1`; `/chat/completions`; `/responses`	Chat reference and Responses reference
Authorization	`Authorization: Bearer ${COMETAPI_KEY}`; never log the value	CometAPI docs
Required request fields	Chat: `model`, `messages`; Responses: `model`, `input`	Endpoint references above
Success fields	Chat: object, choice, content, finish reason; Responses: object, status, output text	Endpoint references plus the sanitized run
Error fields	Status class and present `code`, `message`, `type`; `param` may be present	Sanitized run
Streaming	Parse chunks and `[DONE]`, then separately require usable content and an accepted finish state	Chat reference plus the sanitized negative sample
Model IDs	Confirm against `/v1/models` immediately before the test	Model catalog and sanitized run preflight; availability beyond that point is not asserted
Rate limits and billing	Not asserted; verify in the account and current product documentation	No claim made by this bounded contract test
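
The model ID row above can be checked mechanically before any contract request is sent. The following is a minimal sketch, not CometAPI's documented client: it assumes the `/v1/models` body follows the common list shape (a `data` array of objects with an `id` field), and `model_is_listed` and `sample_catalog` are hypothetical names used only for illustration. Confirm the real shape against the model catalog reference.

```python
from typing import Any, Mapping


def model_is_listed(catalog: Mapping[str, Any], model_id: str) -> bool:
    """Return True only when model_id appears as an exact `id` in the catalog body.

    Assumption: the catalog is shaped like {"data": [{"id": "..."}, ...]}.
    """
    entries = catalog.get("data")
    if not isinstance(entries, list):
        return False
    return any(isinstance(e, dict) and e.get("id") == model_id for e in entries)


# Illustrative catalog body, not captured output.
sample_catalog = {
    "object": "list",
    "data": [{"id": "gpt-4.1-nano"}, {"id": "gpt-5-nano-2025-08-07"}],
}
```

Exact matching is deliberate: a prefix match would accept `gpt-4.1` when only `gpt-4.1-nano` is listed, which is a different route than the one under test.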

Run the happy-path contracts

Load the key from a secret source, then run the Responses check:

curl --fail-with-body --silent --show-error \
  https://api.cometapi.com/v1/responses \
  -H "Authorization: Bearer ${COMETAPI_KEY}" \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gpt-5-nano-2025-08-07",
    "input": "Reply exactly: OK",
    "max_output_tokens": 64
  }'
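
Once the body is parsed, the Responses assertions can be kept separate from the transport result. This is a sketch under an assumption: it expects the widely used Responses shape in which `output` is a list of items and text lives in `message` items as `output_text` parts. The function names are hypothetical, and the field layout should be confirmed against the Responses reference.

```python
from typing import Any, Mapping


def responses_output_text(body: Mapping[str, Any]) -> str:
    """Concatenate every output_text part found in message items (assumed shape)."""
    parts = []
    for item in body.get("output") or []:
        if not isinstance(item, dict) or item.get("type") != "message":
            continue  # skip reasoning or tool items; they carry no reply text
        for part in item.get("content") or []:
            if isinstance(part, dict) and part.get("type") == "output_text":
                parts.append(str(part.get("text") or ""))
    return "".join(parts)


def check_responses_contract(http_status: int, body: Mapping[str, Any]) -> dict[str, bool]:
    """Report each assertion separately so a 200 alone never reads as a pass."""
    return {
        "transport": http_status == 200,
        "object": body.get("object") == "response",
        "status": body.get("status") == "completed",
        "content": responses_output_text(body).strip() != "",
    }
```

The route passes only when every value in the returned dictionary is true.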

For a Chat Completions route, use a separate assertion path:

curl --fail-with-body --silent --show-error \
  https://api.cometapi.com/v1/chat/completions \
  -H "Authorization: Bearer ${COMETAPI_KEY}" \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gpt-4.1-nano",
    "messages": [{"role": "user", "content": "Reply exactly: OK"}],
    "max_completion_tokens": 64
  }'
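
The Chat Completions body needs its own assertion path because the completion state and content sit in different fields. A minimal sketch, assuming the reply text is at `choices[0].message.content` and the completion state at `choices[0].finish_reason`; `check_chat_contract` is a hypothetical name, and the layout should be confirmed against the Chat Completions reference.

```python
from typing import Any, Mapping


def check_chat_contract(http_status: int, body: Mapping[str, Any]) -> dict[str, bool]:
    """Assert status, object type, finish reason, and content independently."""
    choices = body.get("choices") or []
    first = choices[0] if choices and isinstance(choices[0], dict) else {}
    message = first.get("message") or {}
    content = message.get("content")
    return {
        "transport": http_status == 200,
        "object": body.get("object") == "chat.completion",
        "finish_reason": first.get("finish_reason") == "stop",
        # Null or whitespace-only content counts as a failed content contract.
        "content": isinstance(content, str) and content.strip() != "",
    }
```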

The saved evidence contained structure only, not full prompts, generated text, credentials, request IDs, or price data:

{
  "responses": {
    "http_status": 200,
    "object": "response",
    "status": "completed",
    "content_assertion": "passed"
  },
  "chat": {
    "http_status": 200,
    "object": "chat.completion",
    "finish_reason": "stop",
    "content_assertion": "passed"
  }
}
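
A record like the one above can be produced by copying only structural fields out of the parsed body, so prompts, generated text, and request IDs never reach the shared file. This is a sketch with a hypothetical helper name, `evidence_entry`; the choice of fields mirrors the sample record and is not a format defined by CometAPI.

```python
import json
from typing import Any, Mapping


def evidence_entry(http_status: int, body: Mapping[str, Any], content_ok: bool) -> dict[str, Any]:
    """Build a structure-only record; nothing else from the body is copied."""
    entry: dict[str, Any] = {"http_status": http_status, "object": body.get("object")}
    if "status" in body:  # Responses-style completion state
        entry["status"] = body["status"]
    choices = body.get("choices") or []
    if choices and isinstance(choices[0], dict):  # Chat-style completion state
        entry["finish_reason"] = choices[0].get("finish_reason")
    entry["content_assertion"] = "passed" if content_ok else "failed"
    return entry


# Illustrative body, not captured output: the id and text must not survive.
chat_body = {
    "id": "placeholder-request-id",
    "object": "chat.completion",
    "choices": [{"message": {"content": "OK"}, "finish_reason": "stop"}],
}
print(json.dumps({"chat": evidence_entry(200, chat_body, True)}, indent=2))
```

Building the record from an explicit set of fields, rather than deleting sensitive keys from a copy of the body, means a new field added upstream is excluded by default.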

Assert the error contract

Use a deliberately invalid credential only for the authorization check. Never modify or expose the real key:

curl --silent --show-error \
  https://api.cometapi.com/v1/chat/completions \
  -H 'Authorization: Bearer intentionally-invalid' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gpt-5-nano-2025-08-07",
    "messages": [{"role": "user", "content": "Reply exactly: OK"}],
    "max_completion_tokens": 8
  }'

The observed result was HTTP 401 with an error object containing code, message, and type. A request missing a required field produced HTTP 400 with code=invalid_request and an additional param field. Assertions should check field presence and status class, not exact message text.
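
The presence-and-status-class rule can be sketched as a small classifier. It assumes the error fields sit under a top-level `error` key, which the sanitized run does not spell out, and `classify_error` and its category labels are hypothetical. Message text is never compared.

```python
from typing import Any, Mapping

REQUIRED_ERROR_FIELDS = ("code", "message", "type")


def classify_error(http_status: int, body: Mapping[str, Any]) -> dict[str, Any]:
    """Classify by status class and check field presence only, never message text."""
    error = body.get("error")  # assumption: fields live under a top-level "error" key
    if not isinstance(error, dict):
        error = {}
    missing = [name for name in REQUIRED_ERROR_FIELDS if name not in error]
    if http_status == 401:
        kind = "auth"
    elif 400 <= http_status < 500:
        kind = "request"
    elif http_status >= 500:
        kind = "server"
    else:
        kind = "not_an_error"
    return {
        "kind": kind,
        "parseable": not missing,
        "missing": missing,
        "has_param": "param" in error,  # optional; observed on the 400 case
    }
```

A client that sees `kind == "auth"` should fix the credential boundary, while `kind == "request"` points at the request body; neither is a reason to fail over.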

Treat streaming as a negative sample

The streaming probe used Chat Completions with stream=true and a 64-token cap. Both runs returned HTTP 200, three SSE chunks, and [DONE]. Both also returned zero content and finish_reason=length.

Streaming assertion	Observed	Verdict
HTTP response accepted	200 in both calls	Transport passed
SSE framing completed	3 chunks and `[DONE]`	Framing passed
Content delta present	0 content characters	Failed
Completion state usable	`length` at cap 64	Failed

Do not route production traffic on the strength of this streaming result. Investigate model behavior, token budgeting, and parser expectations, then rerun a bounded check before enabling the route.
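
The four streaming assertions in the table can be evaluated from the raw SSE lines. The sketch below assumes the common Chat Completions chunk shape, where text arrives in `choices[].delta.content` and the last chunk carries `finish_reason`. The function name is hypothetical, treating only `stop` as an accepted finish state is an assumption you should adjust, and the sample lines are illustrative rather than captured output.

```python
import json
from typing import Any, Iterable


def summarize_chat_stream(lines: Iterable[str], accepted_finish=("stop",)) -> dict[str, Any]:
    """Count chunks, detect [DONE], and judge content separately from framing."""
    chunks, done, finish_reason = 0, False, None
    pieces = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            done = True
            break
        chunk = json.loads(payload)
        chunks += 1
        for choice in chunk.get("choices") or []:
            delta = choice.get("delta") or {}
            if isinstance(delta.get("content"), str):
                pieces.append(delta["content"])
            if choice.get("finish_reason"):
                finish_reason = choice["finish_reason"]
    text = "".join(pieces)
    return {
        "chunks": chunks,
        "content_chars": len(text),
        "finish_reason": finish_reason,
        "framing_passed": done and chunks > 0,
        "content_passed": len(text) > 0,
        "finish_passed": finish_reason in accepted_finish,
    }


# Illustrative negative sample: complete envelope, no content, cut off at the cap.
negative_sample = [
    'data: {"choices":[{"delta":{"role":"assistant"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"delta":{},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"delta":{},"finish_reason":"length"}]}',
    "",
    "data: [DONE]",
]
```

For `negative_sample` the summary reports three chunks and passing framing, but `content_passed` and `finish_passed` are both false, which matches the verdict column above.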

Promotion assertion table

Contract area	Required assertion	Observed evidence	Promotion decision
Responses success	200, `response`, `completed`, usable content	Passed	Eligible for a canary
Chat success	200, `chat.completion`, `stop`, usable content	Passed	Eligible for a canary
Authorization failure	401 and parseable error fields	Passed	Client can classify auth failure
Missing input	400, `invalid_request`, `param` present	Passed	Client can classify request failure
Chat streaming	SSE completion plus usable content	Content failed twice	Hold streaming failover
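
The decision column can be derived rather than judged by eye: a route is eligible only when every required assertion passed. This is a minimal sketch with a hypothetical function name and illustrative check names; the decision strings are not part of any CometAPI interface.

```python
def promotion_decision(checks: dict[str, bool]) -> str:
    """Return a canary decision; any single failed assertion holds the route."""
    if not checks:
        return "hold: no evidence"
    failed = sorted(name for name, passed in checks.items() if not passed)
    if failed:
        return "hold: " + ", ".join(failed)
    return "eligible for canary"


# Values mirror the observed streaming verdicts in this guide.
streaming_checks = {
    "transport": True,
    "framing": True,
    "content": False,
    "finish_state": False,
}
print(promotion_decision(streaming_checks))  # prints "hold: content, finish_state"
```

Naming the failed assertions in the result keeps the hold reason visible in logs and review notes.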

Failure modes and boundaries

Stop if a model or endpoint differs from the intended production route.
Do not convert a 200 response into a pass until the required output field is non-empty.
Do not retry 400 or 401 checks to make them green; fix the request or credential boundary.
Do not infer account balance, quota, latency, uptime, or general model quality from this eight-case validation. Published token prices only establish that the conservative token-cost upper bound was below the one-dollar budget; the final billed amount was not inspected.
Keep full prompts, full responses, credentials, and request IDs out of shared evidence.
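
Two of the boundaries above lend themselves to a mechanical guard: evidence applies only to the exact route that was tested, and 400 or 401 checks are not retried. The sketch below uses hypothetical names (`Route`, `evidence_applies`, `retry_allowed_for_check`). Whether any other status is retried is a policy choice this guide does not cover.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    endpoint: str  # for example "/responses" or "/chat/completions"
    model: str


# Only the two statuses this guide rules out; other statuses are out of scope here.
NO_RETRY_STATUSES = frozenset({400, 401})


def evidence_applies(tested: Route, production: Route) -> bool:
    """Evidence transfers only when endpoint and model both match exactly."""
    return tested == production


def retry_allowed_for_check(http_status: int) -> bool:
    """Never rerun a 400 or 401 check to turn it green."""
    return http_status not in NO_RETRY_STATUSES
```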

For the authorization-specific sequence, continue with Check CometAPI Authorization Before Fallback Routing. For the final routing decision, use Build a CometAPI Fallback Evidence Checklist.

Sources checked

CometAPI documentation — accessed 2026-07-15 for the API base and navigation.
Chat Completions reference — accessed 2026-07-15 for request and response fields.
Responses reference — accessed 2026-07-15 for the Responses contract.
CometAPI model catalog — accessed 2026-07-15 before selecting the bounded test models.
Sanitized eight-case contract evidence artifact — accessed 2026-07-15 for bounded status, structure, error-field, usage, and streaming-negative evidence.

Reader next step

Compare CometAPI models, then create or review an API key before running the checks from a controlled client.

FAQ

Does HTTP 200 prove a failover route works?

No. It proves transport acceptance. The output object, completion state, and content assertion must also pass.

Should the streaming route be enabled now?

No. Two bounded calls completed their SSE envelopes but returned no content. Keep streaming failover disabled until a new test produces usable content and the parser assertions pass.