Timeouts are product decisions as much as infrastructure settings. A chat UI, a batch enrichment task, and an internal review workflow can all call the same API yet need very different failure behavior.

Start with the user workflow

Decide how long the user or background job can wait before the response is no longer useful. That limit sets the outer timeout budget, and it should be fixed before any provider-specific retry logic is added, so that every retry has to fit inside it.
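
One way to make the outer budget concrete is to give each workflow its own deadline object that downstream code can consult. The sketch below is a minimal illustration in Python; the workflow names, the budget values, and the `Deadline` and `deadline_for` helpers are all assumptions made up for this example, not recommended numbers or an existing API.

```python
import time
from dataclasses import dataclass, field

# Hypothetical budgets in seconds. The values are illustrative only;
# real numbers should come from how long each workflow can usefully wait.
WORKFLOW_BUDGETS = {
    "chat_ui": 10.0,
    "batch_enrichment": 300.0,
    "internal_review": 60.0,
}


@dataclass
class Deadline:
    """Outer time budget for one unit of work, measured on a monotonic clock."""

    budget_s: float
    started_at: float = field(default_factory=time.monotonic)

    def remaining(self) -> float:
        # Seconds left in the budget, never negative.
        elapsed = time.monotonic() - self.started_at
        return max(0.0, self.budget_s - elapsed)

    def expired(self) -> bool:
        return self.remaining() <= 0.0


def deadline_for(workflow: str) -> Deadline:
    # Look up the budget for a workflow and start the clock now.
    return Deadline(WORKFLOW_BUDGETS[workflow])
```

A caller could pass `deadline.remaining()` as the per-request timeout to whatever client it uses, so that no single call can outlive the budget chosen for the workflow.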

Bound retries explicitly

Retries should have a maximum count, a maximum elapsed time, and a clear list of retryable failures. Without those limits, retries multiply cost and add load to a provider that is already struggling, which makes outages harder to recover from.
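
The three limits can be sketched as a small wrapper. This is an illustration under stated assumptions rather than a definitive implementation: `RetryableError`, `call_with_retries`, and the default values are hypothetical names and numbers, and the jittered exponential backoff is one common choice among several.

```python
import random
import time


class RetryableError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. timeouts)."""


def call_with_retries(fn, *, max_attempts=3, max_elapsed_s=10.0,
                      base_delay_s=0.5, retryable=(RetryableError,),
                      sleep=time.sleep, clock=time.monotonic):
    """Call fn() with a bounded number of attempts and bounded elapsed time.

    Only exceptions listed in `retryable` are retried; anything else
    propagates immediately. The last retryable error is re-raised when
    either limit is reached.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    start = clock()
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retryable:
            # Limit 1: maximum attempt count.
            if attempt == max_attempts:
                raise
            # Exponential backoff with jitter to avoid synchronized retries.
            delay = base_delay_s * 2 ** (attempt - 1) * random.uniform(0.5, 1.0)
            # Limit 2: maximum elapsed time, including the upcoming sleep.
            if clock() - start + delay > max_elapsed_s:
                raise
            sleep(delay)
```

The `sleep` and `clock` parameters exist only so the wrapper can be tested without real waiting. In practice `max_elapsed_s` would normally be derived from the outer budget of the workflow rather than set independently.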

Track timeout outcomes

Log whether each request succeeded, was retried, fell back, or stopped, and aggregate those outcomes into counters. The counters are the fastest way to see whether reliability work is improving the real workflow.
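
A minimal sketch of this kind of tracking, using only the Python standard library, might look like the following. The `Outcome` enum, the `record_outcome` helper, and the in-process `Counter` are assumptions for illustration; a production system would more likely send the same labels to its existing metrics backend.

```python
import logging
from collections import Counter
from enum import Enum

logger = logging.getLogger("timeouts")


class Outcome(str, Enum):
    """The four outcomes named in the text above."""

    SUCCEEDED = "succeeded"
    RETRIED = "retried"
    FELL_BACK = "fell_back"
    STOPPED = "stopped"


# In-process counters keyed by (workflow, outcome). Hypothetical stand-in
# for a real metrics client.
outcome_counts = Counter()


def record_outcome(workflow: str, outcome: Outcome) -> None:
    # Count the outcome and emit one structured log line per event.
    outcome_counts[(workflow, outcome.value)] += 1
    logger.info("request outcome workflow=%s outcome=%s", workflow, outcome.value)
```

Keying the counters by workflow as well as outcome keeps the numbers tied to the user-facing behavior, so a rising fallback rate in one workflow is visible even when the overall totals look healthy.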