use separate API tokens #37

Closed
maxheld83 opened this issue Dec 4, 2020 · 2 comments


maxheld83 commented Dec 4, 2020

Aside from security best practices, there may be another reason to use separate API tokens for separate people and, especially, services: rate limits.

For Metadata Plus, Crossref states that:

> Rate limiting of the API is primarily on a per access token basis. If a method allows, for example, for 75 requests per rate limit window, then it allows 75 requests per window per access token. This number can depend on the system state and may need to change. If it does, Crossref will publish it in the response headers.
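
Since Crossref publishes the current limit in the response headers, a client could read it at runtime instead of hard-coding it. A minimal sketch in Python, assuming the `X-Rate-Limit-Limit` / `X-Rate-Limit-Interval` headers and the `Crossref-Plus-API-Token` header from the Metadata Plus docs (the token value is a placeholder, not our setup):

```python
import requests

# Cheap probe request: rows=0 returns no records, just headers/metadata.
resp = requests.get(
    "https://api.crossref.org/works",
    params={"rows": 0},
    headers={"Crossref-Plus-API-Token": "Bearer <PLUS_TOKEN>"},
)

# Crossref advertises the current budget in the response headers,
# e.g. "50" requests per "1s" window.
limit = resp.headers.get("X-Rate-Limit-Limit")
interval = resp.headers.get("X-Rate-Limit-Interval")
print(f"{limit} requests per {interval} window")
```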

The problem with us (people) and several services (Azure, GitHub Actions) sharing the same token is that one of these users might accidentally exhaust the rate limit at the expense of another user or machine.
This can easily happen, because it is generally fine to go up to the rate limit on any individual machine or service.
As a result, seemingly unrelated services or other users' queries may break intermittently, which could be quite surprising and hard to debug.

This is unlikely to be an issue initially, but it may well become one eventually, and it should be addressed head-on with at least one token per user and per service.

Depending on our scaling, Azure may even need several tokens, or the Shiny app must ask Azure how many instances are currently running and then divide the rate limit among them accordingly.
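
To make the division concrete, here is a sketch of how one token's budget could be shared across instances (instance-count discovery is stubbed out; asking Azure for it would be an extra API call, which is an assumption here):

```python
import time

def per_instance_budget(total_limit: int, instance_count: int) -> int:
    """Requests per window that each instance may spend.

    Rounds down so the instances together never exceed the shared limit.
    """
    return max(1, total_limit // instance_count)

def throttle(last_request_at: float, total_limit: int,
             interval_s: float, instance_count: int) -> None:
    """Sleep just long enough to keep this instance within its share."""
    min_gap = interval_s / per_instance_budget(total_limit, instance_count)
    wait = last_request_at + min_gap - time.monotonic()
    if wait > 0:
        time.sleep(wait)

# e.g. 50 requests per 1 s window shared by 3 instances
# -> 16 requests/s each, i.e. at least 0.0625 s between requests.
```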

As a (slow) workaround, falling back to the open API might help (#36).
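
For that fallback, the retry logic could look roughly like this (again assuming the `Crossref-Plus-API-Token` header from the Plus docs; the `mailto` is the public API's polite-pool convention, and the helper name and token are hypothetical):

```python
import requests

def get_work(doi: str, plus_token: str | None = None) -> dict:
    """Try the Plus pool first; fall back to the slower public pool on 429."""
    url = f"https://api.crossref.org/works/{doi}"
    if plus_token:
        resp = requests.get(
            url,
            headers={"Crossref-Plus-API-Token": f"Bearer {plus_token}"},
        )
        if resp.status_code != 429:  # 429 = rate limit exhausted
            resp.raise_for_status()
            return resp.json()
    # Public pool; a mailto identifies us for Crossref's "polite" pool.
    resp = requests.get(url, params={"mailto": "us@example.com"})
    resp.raise_for_status()
    return resp.json()
```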

@maxheld83

Another alternative, if licensing or cost prohibit additional tokens, would be (as we previously considered) to upload the dumps into our own database (BigQuery or similar) and run the queries against that.
Then we could administer our own access credentials and rate limits.
However, this would carry quite a lot of overhead, so hopefully we won't need it.
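
Should we ever go that route, loading a dump into BigQuery is itself little code; the overhead is in keeping it current. A sketch using the google-cloud-bigquery client (the bucket, table name, and the assumption that the dump has already been converted to newline-delimited JSON on GCS are all hypothetical):

```python
from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical locations; the Crossref dump would first have to be
# unpacked and converted to newline-delimited JSON on GCS.
source_uri = "gs://our-bucket/crossref-dump/*.ndjson"
table_id = "our-project.crossref.works"

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
    autodetect=True,  # infer the schema from the JSON records
)
load_job = client.load_table_from_uri(source_uri, table_id, job_config=job_config)
load_job.result()  # block until the load job finishes
```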

@maxheld83

Closing this in favor of #233.
