Where is the best layer to add throttling to avoid being marked for abuse?

GitHub publishes guidelines that API clients should follow to avoid hitting the abuse rate limit:

https://developer.github.com/guides/best-practices-for-integrators/#dealing-with-abuse-rate-limits

An application should follow those rules. I think some of them can be enforced in a general, automated way via throttling, so they can be factored out into a library rather than being handled manually in the application layer. Specifically:
1. > Make requests for a single user or client ID serially. Do not make requests for a single user or client ID concurrently.
   
   If we can assume a single `*github.Client` maps to a single user or client ID, then this can be done by using a mutex or semaphore or similar.
2. > If you're making a large number of `POST`, `PATCH`, `PUT`, or `DELETE` requests for a single user or client ID, wait at least one second between each request.
   
   Again, if we can assume a single `*github.Client` maps to a single user or client ID, then this can be done by keeping track of the time of the last `POST`, `PATCH`, `PUT`, or `DELETE` request, and sleeping if necessary so that the next one happens at least one second later.
3. > When you have been limited, wait the number of seconds specified in the `Retry-After` response header.
   
   We can detect an abuse rate limit response (it's documented [here](https://developer.github.com/v3/#abuse-rate-limits)), read the number of seconds specified in the `Retry-After` response header, and not allow any network requests until that time has passed (probably returning a `*RateLimitError`, or an abuse-specific version thereof, right away for all requests).

I see two places where this can be done:
1. Inside the go-github library, in `Client` itself, probably as a configurable option.
2. Outside this library, as a higher-level wrapper layer around go-github.

So far, go-github has been a relatively "low-level" API client: it helps you out with type safety and provides structured output, but aside from that, it maps close to 1:1 onto HTTP requests to the GitHub API.

It has taken some steps toward being a smarter, higher-level client where that could be done transparently, without blocking. For example, after #277, a `RateLimitError` type was added, and after #347, go-github is smart enough to track the rate limit reset time and not make network calls while the rate limit is known to still be exceeded, returning `RateLimitError` right away.

So far it has also been a "return-right-away" API: none of the calls are artificially throttled on the client side, and they just return an error if there's a rate limit problem. So adding throttling to the go-github client itself may be out of place, and it may belong in a higher-level wrapper around go-github instead.

On the other hand, if done as an option that can be controlled, perhaps this is the best place to do these things, since it means more people can use the GitHub API correctly with less work on their side, by default. The user of the github client could make calls concurrently, but they would block and get serialized by the client so as to follow the GitHub API guidelines. It would effectively shift some of the throttling that would otherwise be done on the GitHub server side to the go-github client side.

@willnorris, @gmlewis, what are your thoughts on this? Where is the best place for this functionality?

Related issues: #152, #153, #277, #347, #304.

Where is the best layer to add throttling to avoid being marked for abuse? #431