-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AI gateway] Request timeouts for fallback providers #19391
base: production
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 files reviewed, 1 total issue(s) found.
You can use the Universal endpoint to contact every provider. The payload is expecting an array of message, and each message is an object with the following parameters: | ||
|
||
- `provider` : the name of the provider you would like to direct this message to. Can be OpenAI, workers-ai, or any of our supported providers. | ||
- `endpoint`: the pathname of the provider API you’re trying to reach. For example, on OpenAI it can be `chat/completions`, and for Workers AI this might be [`@cf/meta/llama-3.1-8b-instruct`](/workers-ai/models/llama-3.1-8b-instruct/). See more in the sections that are specific to [each provider](/ai-gateway/providers/). | ||
- `authorization`: the content of the Authorization HTTP Header that should be used when contacting this provider. This usually starts with “Token” or “Bearer”. | ||
- `headers`: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kathayl, I think this is accurate, but fact check me here :)
- `query`: the payload as the provider expects it in their official API. | ||
|
||
## cURL example | ||
|
||
The following example shows a simple setup with a primary model and a [fallback](/ai-gateway/configuration/fallbacks/) option. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wanted to add more cross-links over to fallbacks page
@@ -41,6 +41,10 @@ entries: | |||
general_definition: |- | |||
Header to [bypass caching for a specific request](/ai-gateway/configuration/caching/#skip-cache-cf-aig-skip-cache). | |||
- term: cf-aig-request-timeout |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adds automatically to headers glossary page
|
||
If that fails, then the gateway will timeout and move to the fallback `@cf/meta/llama-3.1-8b-instruct-fast` model. This model has 3000 milliseconds - determined by the request-level `cf-aig-request-timeout` value - to complete the request and provide an answer. | ||
|
||
```bash title="Request" collapse={36-50} {2,11,13-15} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the collapsible bit here is distracting, happy to remove.
Also, switched the example a bit so it made sense to me... but I also could just be hallucinating stuff. Happy to flip it back to the original.
Co-authored-by: hyperlint-ai[bot] <154288675+hyperlint-ai[bot]@users.noreply.github.com>
Deploying cloudflare-docs with Cloudflare Pages
|
Summary
New feature, request timeouts for fallback providers.
Updated several pages in AI gateway docs + changelog entry.