Source pack
This is an in-place refresh of the existing CometAPI fallback runbook URL. The slug is preserved exactly, and this draft does not create a new URL.
| Source | How it is used in this draft |
|---|---|
| Existing CometAPI fallback runbook page being refreshed | Historical context for the refresh target and prior operator intent. Not treated as the primary API contract. |
| CometAPI documentation home | Entry point for verifying the current base URL, authentication guidance, documentation navigation, and account-level documentation. |
| CometAPI text chat API documentation | Primary source to verify the chat completion request contract, response contract, endpoint path, and documented behavior before deployment. |
| CometAPI support help center | Source to verify support and escalation paths when failures cannot be resolved through client-side fallback. |
Source coverage status: pending_review. Exact endpoint paths, auth header formats, model identifiers, rate limits, and billing details must be verified from the linked CometAPI documentation before this runbook is treated as production-final.
Intent brief
- Audience: SREs, platform engineers, backend engineers, and AI application owners who route chat completion requests through CometAPI and need controlled failover behavior.
- User intent: Learn how to design a practical fallback runbook for CometAPI chat completions without assuming undocumented endpoint, auth, rate-limit, model, or billing details.
- Operational job: Decide when to retry, when to fail over to a validated fallback model or provider path, when not to fail over, and what to log so incidents can be reconstructed.
- Out of scope: Current pricing, provider rankings, benchmark claims, exact model availability, or guaranteed uptime claims. Verify those directly in current CometAPI materials and account configuration.
CometAPI chat completions fallback runbook
Last reviewed: 2026-06-06.
Who this is for: This runbook is for operators responsible for chat completion reliability when CometAPI is part of the production request path.
A fallback runbook should not be a blind “try another model” rule. For chat completions, failover changes latency, cost exposure, answer style, safety behavior, and sometimes the contract your downstream code expects. Treat the CometAPI text chat API documentation as the source of truth for the request and response contract, and use the CometAPI documentation home to verify the current API entry points before any production test.
Key takeaways
- Build fallback around classified signals, not a single catch-all exception handler.
- Do not fail over invalid client requests, bad authentication, schema mistakes, or other defects that another model path will not fix.
- Keep primary and fallback paths contract-compatible: prompt format, response parsing, streaming policy, tool use, schema requirements, and token budget must be validated.
- Record every attempt under one trace so the system stores one final selected answer and can explain why earlier attempts were abandoned.
- Verify endpoint paths, auth headers, request fields, response fields, error behavior, rate-limit behavior, and billing assumptions from CometAPI sources before enabling automated fallback.
Concise definition
A chat completions fallback runbook is an operator-approved sequence for detecting request-path degradation, classifying the failure, deciding whether fallback is allowed, calling a pre-validated alternate path, validating the returned answer, and recording the final outcome.
For CometAPI, the runbook should begin with contract verification against the CometAPI text chat API documentation, then move into client-side reliability controls such as timeouts, retry budgets, circuit breakers, and alert routing.
Contract details to verify
Do not copy these values from memory or from another provider’s examples. Verify each item against the linked CometAPI source before rollout.
| Contract area | What to verify before production use | Primary source |
|---|---|---|
| Endpoint paths | Verify the current CometAPI base URL and the documented chat completion path. Do not hard-code an endpoint path from an old client, blog post, or internal note without checking the current docs. | CometAPI text chat API documentation |
| Auth headers | Verify the exact auth header name, scheme, API key placement, and whether any account or organization headers are required. | CometAPI documentation home |
| Request fields | Verify the required model selector field, message payload structure, role/content representation, streaming options, tool options, response-format options, and any unsupported fields. | CometAPI text chat API documentation |
| Response fields | Verify the assistant output location, finish or stop indicators, usage metadata if exposed, request identifiers if exposed, and how streaming responses differ from non-streaming responses. | CometAPI text chat API documentation |
| Error behavior | Verify documented error object shape, status-code semantics, retryable versus non-retryable conditions, and whether upstream provider errors are surfaced distinctly. | CometAPI text chat API documentation |
| Rate-limit or billing assumptions | Verify numeric limits, reset behavior, usage accounting, billing fields, and whether failed, retried, or fallback attempts can create chargeable usage. Do not infer these from response field names alone. | CometAPI support help center |
Monitoring signals that should drive fallback
Use this table as a starting classifier. The action column is intentionally conservative: fallback should protect users from transient degradation, not hide client bugs.
| Signal class | What to inspect | Fallback posture | Operator action |
|---|---|---|---|
| Transport timeout | Client timeout, connection failure, TLS/DNS failure, gateway timeout, or no response body | Usually eligible after a small retry budget, if the request is safe to repeat | Check client timeout, network path, and recent regional failures. Use tuned thresholds, not universal constants. |
| HTTP 5xx-style failure | Server-side or gateway-style failure returned by the API path | Often eligible when the request was otherwise valid | Record status, error body, model path, and trace ID. Open or extend the circuit breaker if failures cluster. |
| Rate-limit-like response | Status, headers, or error body indicating throttling or quota pressure | Eligible only if the fallback path has separately verified capacity and billing behavior | Verify exact rate-limit semantics from CometAPI docs or support materials before automating. |
| Auth or account error | Authentication failure, permission failure, disabled key, or account state issue | Not eligible for ordinary fallback | Page the owning team. A second model call usually repeats the same account-level failure. |
| Client validation error | Invalid request shape, unsupported field, oversize payload, malformed messages, or schema mismatch | Not eligible | Fix the caller. Add preflight validation and contract tests against the current docs. |
| Model or route unavailable | Error indicates the selected model/path cannot currently serve the request | Eligible only if the fallback model/path has passed compatibility tests | Switch to a validated fallback path and notify owners if the condition persists. |
| Latency SLO breach | p95/p99 latency exceeds your own service objective over a tuned window | Sometimes eligible | Prefer circuit-breaker or adaptive routing. Avoid per-request hedging unless duplicate outputs and cost are acceptable. |
| Empty, truncated, or unparsable output | Response returns but cannot satisfy downstream contract | Eligible only after output validation confirms the primary answer is unusable | Retry or fall back with the same prompt contract. Store the invalid sample for debugging. |
| Partial streaming failure | Stream begins, then terminates early or violates parser expectations | Handle carefully; fallback may produce duplicate visible content | Decide whether users can see partial output. For user-visible streams, falling back after the first token may be unsafe. |
Runbook sequence
1. Verify the CometAPI contract before coding fallback
Before enabling fallback logic, confirm the live contract from the CometAPI text chat API documentation:
- base URL and chat endpoint path;
- authentication header format;
- required request fields;
- optional request fields your app uses;
- response body paths your parser depends on;
- error-body shape and status semantics;
- streaming behavior, if your application streams;
- usage, rate-limit, and billing signals, if exposed.
Store these verified values in configuration, not scattered through business logic.
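A minimal sketch of keeping the verified values in one configuration object, assuming they arrive as environment-style key/value pairs. The variable names (`CHAT_BASE_URL` and so on) and the `ChatContract` class are hypothetical, and no CometAPI value is filled in here; every value must come from the current documentation.

```python
from dataclasses import dataclass
from typing import Mapping

# Hypothetical variable names; the values must be copied from the current docs.
REQUIRED_KEYS = (
    "CHAT_BASE_URL",
    "CHAT_PATH",
    "CHAT_AUTH_HEADER_NAME",
    "CHAT_AUTH_HEADER_VALUE_TEMPLATE",
    "CHAT_TIMEOUT_SECONDS",
)


@dataclass(frozen=True)
class ChatContract:
    """Single home for contract values so business logic never hard-codes them."""
    base_url: str
    chat_path: str
    auth_header_name: str
    auth_header_value_template: str  # contains "{key}" where the secret is placed
    timeout_seconds: float

    def url(self) -> str:
        # Join base URL and path with exactly one slash between them.
        return self.base_url.rstrip("/") + "/" + self.chat_path.lstrip("/")


def load_contract(env: Mapping[str, str]) -> ChatContract:
    """Build the contract, refusing to start if any value is missing."""
    missing = [k for k in REQUIRED_KEYS if not env.get(k)]
    if missing:
        raise ValueError("unverified contract values: " + ", ".join(missing))
    return ChatContract(
        base_url=env["CHAT_BASE_URL"],
        chat_path=env["CHAT_PATH"],
        auth_header_name=env["CHAT_AUTH_HEADER_NAME"],
        auth_header_value_template=env["CHAT_AUTH_HEADER_VALUE_TEMPLATE"],
        timeout_seconds=float(env["CHAT_TIMEOUT_SECONDS"]),
    )
```

In a service this would typically be called once at startup, for example `load_contract(os.environ)`, so that a missing value fails the deployment rather than a user request.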
2. Define primary and fallback compatibility
A fallback candidate is not valid just because it can answer text. It must satisfy the same operational contract:
- accepts the same prompt structure or has a tested prompt adapter;
- supports your required response format;
- fits your token budget;
- meets your latency objective under load;
- is approved for the same data-handling and safety requirements;
- produces output your downstream parser can consume.
If any of those are unknown, mark the fallback candidate as not production eligible until validated.
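The compatibility checklist above can be sketched as a record in which "unknown" is a distinct state from "failed". The `FallbackCandidate` class and its field names are hypothetical and simply mirror the six checks listed; this is one possible shape, not a prescribed one.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class FallbackCandidate:
    """One fallback path. None means "not yet validated", not "passed"."""
    name: str
    prompt_compatible: Optional[bool] = None
    response_format_supported: Optional[bool] = None
    fits_token_budget: Optional[bool] = None
    meets_latency_objective: Optional[bool] = None
    data_handling_approved: Optional[bool] = None
    parser_compatible: Optional[bool] = None


def unresolved_checks(candidate: FallbackCandidate) -> List[str]:
    """Names of checks that are unknown or failed."""
    return [
        f.name
        for f in fields(candidate)
        if f.name != "name" and getattr(candidate, f.name) is not True
    ]


def production_eligible(candidate: FallbackCandidate) -> bool:
    """Eligible only when every check has explicitly passed."""
    return not unresolved_checks(candidate)
```

Treating `None` as ineligible means a newly added candidate cannot receive production traffic until someone records a result for each check.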
3. Classify the first failure
On every failed or degraded primary attempt, classify the signal before taking action:
- Retryable transient: timeout, temporary server-side failure, or route unavailability.
- Capacity or quota pressure: rate-limit-like behavior that may or may not be safe to route around.
- Client defect: invalid request, bad payload shape, unsupported field, or parser mismatch.
- Auth or account defect: invalid key, missing permission, account state, or policy block.
- Output defect: response arrived but fails your contract validation.
Only retryable transient failures, plus selected capacity-pressure and output-defect cases, should normally reach fallback. Client defects and auth/account defects should go to owner action, not another model call.
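As an illustration, the five classes can be encoded as an enum with a small classifier. The status-code mapping below uses generic HTTP semantics (401/403, 429, other 4xx, 5xx) and is an assumption: whether the CometAPI path reports these conditions with these codes must be verified against its documentation. The names `FailureClass`, `classify`, and `fallback_allowed` are hypothetical.

```python
from enum import Enum
from typing import Optional


class FailureClass(Enum):
    RETRYABLE_TRANSIENT = "retryable_transient"
    CAPACITY_PRESSURE = "capacity_pressure"
    CLIENT_DEFECT = "client_defect"
    AUTH_OR_ACCOUNT = "auth_or_account"
    OUTPUT_DEFECT = "output_defect"


def classify(
    status: Optional[int], timed_out: bool = False, output_valid: bool = True
) -> Optional[FailureClass]:
    """Map one attempt to a failure class; None means the attempt succeeded.

    status is None when no HTTP response was received at all.
    """
    if timed_out or status is None:
        return FailureClass.RETRYABLE_TRANSIENT
    if status in (401, 403):  # assumed mapping; verify against the docs
        return FailureClass.AUTH_OR_ACCOUNT
    if status == 429:  # assumed mapping; verify against the docs
        return FailureClass.CAPACITY_PRESSURE
    if 400 <= status < 500:
        return FailureClass.CLIENT_DEFECT
    if status >= 500:
        return FailureClass.RETRYABLE_TRANSIENT
    if not output_valid:
        return FailureClass.OUTPUT_DEFECT
    return None


def fallback_allowed(
    failure: Optional[FailureClass], fallback_capacity_verified: bool = False
) -> bool:
    """Client and auth defects never reach fallback; capacity is conditional."""
    if failure in (FailureClass.RETRYABLE_TRANSIENT, FailureClass.OUTPUT_DEFECT):
        return True
    if failure is FailureClass.CAPACITY_PRESSURE:
        return fallback_capacity_verified
    return False
```

Keeping classification separate from the fallback decision lets the same class drive alert routing as well as request routing.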
4. Apply a bounded retry and fallback budget
Use a request-level budget that limits total time, total attempts, and duplicate generation risk. Example values must be tuned to your service objective and cost profile; they are not universal facts.
A conservative policy might be:
- one primary attempt;
- one retry only for clearly transient errors;
- one fallback attempt, and only if the request is still within the user-facing deadline;
- no fallback for non-idempotent tool calls or side effects unless the tool layer has its own idempotency controls.
The important rule is that the budget applies to the whole user request, not separately to each model path.
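A minimal sketch of a request-scoped budget, assuming one instance is created per user request and shared by the primary, retry, and fallback attempts. `RequestBudget` is a hypothetical helper, and the deadline and attempt count are values to tune rather than recommendations.

```python
import time


class RequestBudget:
    """Caps total attempts and total wall-clock time for one user request."""

    def __init__(self, deadline_seconds, max_attempts, clock=time.monotonic):
        self._clock = clock
        self._deadline = clock() + deadline_seconds
        self._attempts_left = max_attempts

    def remaining_seconds(self):
        """Time left before the user-facing deadline, never negative."""
        return max(0.0, self._deadline - self._clock())

    def try_consume(self, min_seconds_needed=0.0):
        """Reserve one attempt if attempts remain and enough time is left."""
        if self._attempts_left <= 0:
            return False
        if self.remaining_seconds() <= min_seconds_needed:
            return False
        self._attempts_left -= 1
        return True
```

Each model path would call `try_consume()` before sending, and could pass `remaining_seconds()` as its client timeout so a slow fallback cannot overrun the deadline.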
5. Execute a documented canary call
Use a sanitized canary that confirms your configured endpoint, auth header, request body, and parser. Replace all placeholders with values verified from the current CometAPI documentation. Do not paste production secrets into terminals or incident tickets.
TRACE_ID="<TRACE_ID_FROM_YOUR_OBSERVABILITY_SYSTEM>"
curl --request POST \
  --url "<COMETAPI_BASE_URL_FROM_DOCS><COMETAPI_CHAT_PATH_FROM_DOCS>" \
  --header "<AUTH_HEADER_FROM_DOCS>" \
  --header "Content-Type: application/json" \
  --max-time "<CLIENT_TIMEOUT_SECONDS_TO_TUNE>" \
  --data-binary @- <<JSON
{
  "<MODEL_FIELD_FROM_DOCS>": "<VALIDATED_MODEL_ID>",
  "<MESSAGES_FIELD_FROM_DOCS>": [
    {
      "<ROLE_FIELD_FROM_DOCS>": "<SYSTEM_OR_DEVELOPER_ROLE_VALUE_FROM_DOCS>",
      "<CONTENT_FIELD_FROM_DOCS>": "Return one sentence confirming the fallback canary path for trace ${TRACE_ID}."
    },
    {
      "<ROLE_FIELD_FROM_DOCS>": "<USER_ROLE_VALUE_FROM_DOCS>",
      "<CONTENT_FIELD_FROM_DOCS>": "Canary request. Do not include secrets."
    }
  ]
}
JSON
Before using this in automation, confirm the request field names, role values, and endpoint values from the CometAPI text chat API documentation.
6. Validate the fallback response before returning it
A fallback answer should pass the same output checks as the primary answer:
- response parsed successfully;
- required content field exists;
- finish or stop condition is acceptable;
- structured output validates, if required;
- safety or policy checks pass;
- answer is not an empty placeholder;
- trace metadata links the final answer to all attempts.
If the fallback response fails validation, return a controlled application-level failure instead of silently presenting an unreliable answer.
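One way to apply the same checks to both paths is a single validator that runs on already-extracted values. This sketch assumes your parser has pulled the assistant text and finish indicator out of the response using field paths verified from the docs; `validate_answer` and its problem labels are hypothetical.

```python
import json
from typing import Callable, Iterable, List, Optional


def validate_answer(
    content: Optional[str],
    finish: Optional[str],
    accepted_finish: Iterable[str],
    require_json: bool = False,
    policy_check: Optional[Callable[[str], bool]] = None,
) -> List[str]:
    """Return a list of problems; an empty list means the answer is usable."""
    if content is None:
        return ["missing_content"]
    problems = []
    if not content.strip():
        problems.append("empty_content")
    if finish not in set(accepted_finish):
        problems.append("unacceptable_finish")
    if require_json:
        try:
            json.loads(content)
        except ValueError:  # json.JSONDecodeError is a subclass of ValueError
            problems.append("structured_output_invalid")
    if policy_check is not None and not policy_check(content):
        problems.append("policy_check_failed")
    return problems
```

Returning every problem instead of stopping at the first gives incident review a complete picture of why an answer was rejected.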
7. Persist one final outcome
For each user request, store one final outcome and the full attempt chain:
- request trace ID;
- primary model/path identifier used by your configuration;
- fallback model/path identifier used by your configuration;
- status and latency for each attempt;
- error class and sanitized error body;
- final selected response source;
- whether a retry or fallback consumed extra budget;
- operator-facing reason code.
This makes incident review possible without exposing prompts or secrets unnecessarily.
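The attempt chain can be sketched as two small records serialized to one structured log line per user request. The class and field names are hypothetical and mirror the list above; prompts and secrets are deliberately absent from the record.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Attempt:
    path_id: str  # your configured model/path identifier
    status: Optional[int]  # None when no response was received
    latency_ms: float
    error_class: Optional[str] = None
    sanitized_error: Optional[str] = None


@dataclass
class RequestOutcome:
    trace_id: str
    attempts: List[Attempt] = field(default_factory=list)
    final_source: Optional[str] = None  # e.g. "primary", "fallback", or None
    reason_code: Optional[str] = None

    def used_extra_budget(self) -> bool:
        """True when a retry or fallback ran in addition to the first attempt."""
        return len(self.attempts) > 1

    def to_log_line(self) -> str:
        """Serialize the final outcome and the full attempt chain as JSON."""
        record = asdict(self)
        record["used_extra_budget"] = self.used_extra_budget()
        return json.dumps(record, sort_keys=True)
```

Emitting the line once, after the final outcome is known, keeps one record per trace rather than one per attempt.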
Practical validation steps before enabling fallback
Run these tests in staging or a controlled canary environment before production rollout.
- Contract check: Compare your configured base URL, chat path, auth header, request fields, and parser paths against the current CometAPI docs.
- Primary success canary: Send a minimal chat completion request through the primary path and validate the response parser.
- Fallback success canary: Send the same safe canary through the fallback path and compare parser compatibility.
- Timeout injection: Force the primary request to exceed your client timeout and confirm the fallback budget behaves as expected.
- Server-error simulation: Stub or simulate a server-side failure and confirm only eligible errors trigger fallback.
- Auth failure simulation: Use a deliberately invalid credential in a non-production environment and confirm fallback does not run.
- Invalid request simulation: Send a malformed request in a safe test and confirm the client reports a contract failure instead of failing over.
- Rate-limit drill: If your account and docs expose rate-limit behavior, verify alerting and fallback rules without assuming undocumented header names.
- Streaming drill: If streaming is enabled, test failures before the first token and after partial output. Define which cases are allowed to fall back.
- Cost and usage review: Reconcile attempt counts against usage or billing records available to your account. If details are unclear, use the CometAPI support help center to confirm.
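Several of these drills can run as stubbed tests before any real traffic is involved. The sketch below assumes a simplified router and hypothetical error-class labels; `run_with_fallback`, `CallFailed`, and `drill` are illustrative names, not part of any CometAPI client.

```python
class CallFailed(Exception):
    """Raised by a model-path call; error_class is one of your own labels."""

    def __init__(self, error_class: str):
        super().__init__(error_class)
        self.error_class = error_class


# Hypothetical labels; align them with your own classifier.
FALLBACK_ELIGIBLE = {"timeout", "transient_server_error"}


def run_with_fallback(primary, fallback):
    """Return (source, answer); re-raise failures that fallback cannot fix."""
    try:
        return "primary", primary()
    except CallFailed as exc:
        if exc.error_class not in FALLBACK_ELIGIBLE:
            raise
    return "fallback", fallback()


def failing_with(error_class: str):
    """Build a stub primary call that always fails with the given class."""
    def call():
        raise CallFailed(error_class)
    return call


def drill(error_class: str) -> bool:
    """Simulate one primary failure and report whether fallback was invoked."""
    invoked = []

    def fallback_stub():
        invoked.append(True)
        return "canary answer"

    try:
        run_with_fallback(failing_with(error_class), fallback_stub)
    except CallFailed:
        pass  # expected for ineligible classes
    return bool(invoked)
```

A staging check could then assert that `drill("timeout")` reports a fallback call while `drill("auth_error")` and `drill("client_validation_error")` do not.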
Decision rules to encode
Use explicit reason codes instead of opaque exception strings.
| Reason code | When to set it | Default action |
|---|---|---|
| primary_timeout | Primary path exceeded the tuned client timeout | Retry or fall back if budget remains |
| primary_transient_error | Primary returned a server-side or gateway-style transient error | Retry once or fall back based on circuit state |
| primary_rate_limited | Primary returned verified rate-limit-like behavior | Fall back only if alternate capacity and billing are verified |
| client_contract_error | Request shape, field, or parser contract is invalid | Do not fall back; fix caller |
| auth_or_account_error | Credential, permission, or account-level issue | Do not fall back; escalate |
| fallback_validation_failed | Fallback returned an unusable or unparsable answer | Return controlled failure and alert |
| circuit_open | Recent failures exceed your tuned breaker threshold | Route to fallback until half-open probe succeeds |
Thresholds such as “two failures” or “three minutes” should be treated as examples to tune against your traffic, SLO, and cost tolerance. They are not CometAPI contract values unless current documentation explicitly says so.
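For the `circuit_open` rule, a minimal consecutive-failure breaker might look like the following. The threshold and cooldown are constructor arguments precisely because they are values to tune; `CircuitBreaker` is a hypothetical helper, and this simplified version lets every request through once the cooldown elapses instead of limiting the half-open state to a single probe.

```python
import time


class CircuitBreaker:
    """Opens after N consecutive failures; allows a probe after a cooldown."""

    def __init__(self, failure_threshold, cooldown_seconds, clock=time.monotonic):
        self._threshold = failure_threshold
        self._cooldown = cooldown_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def allow_primary(self):
        """True when closed, or when the cooldown has elapsed (half-open)."""
        if self._opened_at is None:
            return True
        return self._clock() - self._opened_at >= self._cooldown

    def record_success(self):
        """A successful primary call closes the circuit and resets the count."""
        self._failures = 0
        self._opened_at = None

    def record_failure(self):
        """Count a failure; open (or re-open) once the threshold is reached."""
        self._failures += 1
        if self._failures >= self._threshold:
            self._opened_at = self._clock()
```

When `allow_primary()` returns `False`, the router would set the `circuit_open` reason code and send the request to the validated fallback path.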
Alerting and escalation
Create separate alerts for:
- sustained primary-path failure rate;
- fallback-path failure rate;
- fallback volume above normal baseline;
- auth or account errors;
- parser or schema failures;
- latency SLO breach;
- usage or spend anomaly after retries/fallback;
- streaming interruption rate, if streaming is used.
Escalate documentation or account questions through the CometAPI support help center. Treat the earlier version of this page as a historical operator note only, and rely on the current CometAPI docs for live contract decisions.
FAQ
Should every chat completion error trigger fallback?
No. Fallback is appropriate for selected transient, capacity, route, or output-validation failures. It is usually wrong for invalid requests, broken auth, unsupported fields, or account-level problems.
Should I retry before falling back?
Often, yes, but only within a bounded request budget. A retry can clear a short transient issue; fallback can protect users when the primary path remains unhealthy. Both can increase latency and usage, so record them separately.
Can I fall back after a streaming response has started?
Be careful. Once the user has seen partial output, a fallback answer may duplicate or contradict the visible stream. Define a policy for pre-first-token failure, mid-stream interruption, and post-stream parser failure.
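A streaming policy of this kind can be written down as a small decision function. The sketch below is one conservative choice, not the only reasonable one: it falls back only when the user has seen nothing, and otherwise surfaces an error. The enum and function names are hypothetical.

```python
from enum import Enum


class StreamPhase(Enum):
    BEFORE_FIRST_TOKEN = "before_first_token"
    MID_STREAM = "mid_stream"
    AFTER_STREAM = "after_stream"  # stream completed but the parser rejected it


class StreamAction(Enum):
    FALL_BACK = "fall_back"
    SURFACE_ERROR = "surface_error"


def stream_failure_action(phase: StreamPhase, user_saw_output: bool) -> StreamAction:
    """Decide what to do when a streamed attempt fails in the given phase."""
    if phase is StreamPhase.BEFORE_FIRST_TOKEN:
        # Nothing was produced, so a fallback answer cannot duplicate anything.
        return StreamAction.FALL_BACK
    if not user_saw_output:
        # Output was buffered server-side and can be discarded and replaced.
        return StreamAction.FALL_BACK
    # The user has seen partial or full output; a second answer could
    # duplicate or contradict it, so report a controlled failure instead.
    return StreamAction.SURFACE_ERROR
```

Teams that buffer streams before display can pass `user_saw_output=False` and keep fallback available in all three phases.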
Can I hard-code the CometAPI chat endpoint from an old example?
No. Verify the current base URL and chat path from the CometAPI text chat API documentation before deploying. This refresh intentionally avoids hard-coding endpoint paths that were not quoted in the provided source pack.
What should I log for each fallback event?
Log a trace ID, attempt number, configured model/path identifier, status class, latency, sanitized error body, fallback reason code, final selected response source, and whether the request consumed retry or fallback budget. Do not log secrets or sensitive prompt content unless your data policy allows it.
How do I confirm billing or rate-limit assumptions?
Do not infer them from generic API behavior. Check your account materials, current CometAPI documentation, and the CometAPI support help center for the specific behavior that applies to your usage.
Sources checked
| Source | Access date | Purpose |
|---|---|---|
| Existing CometAPI fallback runbook page being refreshed | 2026-06-06 | Confirmed this is an in-place refresh target and preserved the provided slug rather than creating a new URL. |
| CometAPI documentation home | 2026-06-06 | Used as the entry point to verify current CometAPI documentation, API setup, and account-level guidance. |
| CometAPI text chat API documentation | 2026-06-06 | Used as the primary source to verify chat completion endpoint, request, response, and error-contract details before implementation. |
| CometAPI support help center | 2026-06-06 | Used to identify where operators should verify support, escalation, billing, and rate-limit questions that are not safe to infer. |
If CometAPI is part of your production routing plan, validate the contract against the sources above, run the canary sequence, and only then enable automated fallback. Start with CometAPI when you are ready to evaluate the platform against your own reliability requirements.