This is tricky: we'd have to think carefully about stability, and about issuing as few queries as possible (so that results can be consumed lazily, as a generator) while still offering good performance.
We could do something along the lines of...
- A scan request creates a ResultGenerator object with the following attributes:
  a. a job queue with the complete set of continuation tokens
  b. a results queue with a maximum number of elements (say, 10,000)
  c. a kill signal
  d. some number W of threaded workers
- Each worker makes its request and feeds the results into the results queue. If the queue is full, the worker enters a holding pattern, sleeping for short intervals while it waits for room on the queue and checks for the kill signal.
- Values are taken off the results queue by a generator, which yields them to the caller.
- When a worker has no more results to report and there is room on the results queue, it pulls the next continuation token from the job queue and makes the next request.
- When the job queue is exhausted and the workers have reported their remaining results, the process is done.
- On ResultGenerator cleanup, the kill signal is issued, so that garbage collection of the ResultGenerator terminates all query workers.
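To make the idea concrete, here is a rough Python sketch of the steps above, not a finished implementation. Everything beyond the name `ResultGenerator` is an assumption made for illustration: `fetch_page` is a hypothetical callable that takes one continuation token and returns that request's results, and `num_workers`, `max_results`, `poll_interval` and `close()` are made-up names. Instead of sleeping explicitly, the workers use a blocking `put` with a short timeout, which gives the same holding pattern. The worker function deliberately does not hold a reference to the `ResultGenerator`, so the object can actually be garbage collected while the threads are still running. Error handling is left out: if `fetch_page` raises, that worker dies and its results are lost.

```python
import queue
import threading


def _worker(jobs, results, kill, fetch_page, poll_interval):
    """Pull tokens off the job queue and feed results into the results queue."""
    while not kill.is_set():
        try:
            token = jobs.get_nowait()
        except queue.Empty:
            return  # job queue exhausted, this worker is done
        for item in fetch_page(token):
            # Holding pattern: retry the put in short intervals,
            # checking the kill signal between attempts.
            while True:
                if kill.is_set():
                    return
                try:
                    results.put(item, timeout=poll_interval)
                    break
                except queue.Full:
                    pass


class ResultGenerator:
    def __init__(self, tokens, fetch_page, num_workers=4,
                 max_results=10_000, poll_interval=0.05):
        self._kill = threading.Event()                    # c. kill signal
        self._jobs = queue.Queue()                        # a. job queue
        for token in tokens:
            self._jobs.put(token)
        self._results = queue.Queue(maxsize=max_results)  # b. bounded results queue
        self._poll_interval = poll_interval
        # d. W worker threads; the target is a module-level function so the
        # threads do not keep this object alive.
        self._workers = [
            threading.Thread(
                target=_worker,
                args=(self._jobs, self._results, self._kill,
                      fetch_page, poll_interval),
                daemon=True,
            )
            for _ in range(num_workers)
        ]
        for worker in self._workers:
            worker.start()

    def __iter__(self):
        return self

    def __next__(self):
        while True:
            try:
                return self._results.get(timeout=self._poll_interval)
            except queue.Empty:
                if any(w.is_alive() for w in self._workers):
                    continue  # workers are still fetching, keep waiting
                # All workers have exited, so nothing new can arrive;
                # pick up anything that landed after the timeout above.
                try:
                    return self._results.get_nowait()
                except queue.Empty:
                    raise StopIteration from None

    def close(self):
        """Issue the kill signal; workers exit at their next check."""
        self._kill.set()

    def __del__(self):
        self.close()
```

Usage would look like `for row in ResultGenerator(tokens, fetch_page, num_workers=8): ...`. Note that results arrive in whatever order the workers finish, not in token order, and that a caller who stops early can call `close()` explicitly rather than relying on garbage collection timing.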