Search before asking
Motivation
Currently, the FileIO interface can only list all files and directories under a given path in a single call. As a consequence, callers of FileIO such as ObjectRefresh have no choice but to load the entire file listing into memory, which may lead to poor performance and out-of-memory (OOM) errors.
Solution
Introduce a paged list API like the following:
Pair<FileStatus[], String> listFilesPaged(
Path path, boolean recursive, long pageSize, @Nullable String continuationToken)
This should allow implementations to take advantage of the batched list APIs commonly offered by object stores, e.g. ListObjectsV2 with a continuation token.
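To make the intended calling pattern concrete, here is a minimal, self-contained sketch of how a caller might drain such an API one page at a time. It is an illustration only, not the proposed implementation. The names `Page`, `PagedLister`, `InMemoryLister`, and `forEachFile` are hypothetical stand-ins, and plain strings replace `Path` and `FileStatus` so the example runs without any project dependencies. It also assumes that the returned `String` is the token for the next page and that `null` means there are no more pages, which the signature above does not spell out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class PagedListingSketch {

    /** One page of results plus the token for the next page (null when exhausted). */
    static final class Page {
        final List<String> files;
        final String nextToken;

        Page(List<String> files, String nextToken) {
            this.files = files;
            this.nextToken = nextToken;
        }
    }

    /** Hypothetical stand-in for the proposed method; Strings replace Path and FileStatus. */
    interface PagedLister {
        Page listFilesPaged(String path, boolean recursive, long pageSize, String continuationToken);
    }

    /** In-memory lister for illustration: the token is the offset of the next entry. */
    static final class InMemoryLister implements PagedLister {
        private final List<String> all;

        InMemoryLister(List<String> all) {
            this.all = all;
        }

        @Override
        public Page listFilesPaged(String path, boolean recursive, long pageSize, String token) {
            if (pageSize <= 0) {
                throw new IllegalArgumentException("pageSize must be positive");
            }
            // path and recursive are ignored here; a real implementation would use them.
            int start = token == null ? 0 : Integer.parseInt(token);
            int end = (int) Math.min(all.size(), start + pageSize);
            List<String> items = new ArrayList<>(all.subList(start, end));
            String next = end < all.size() ? String.valueOf(end) : null;
            return new Page(items, next);
        }
    }

    /** Visits every file while holding at most one page in memory; returns the page count. */
    static int forEachFile(PagedLister lister, String path, long pageSize, Consumer<String> action) {
        String token = null;
        int pages = 0;
        do {
            Page page = lister.listFilesPaged(path, true, pageSize, token);
            page.files.forEach(action);
            token = page.nextToken;
            pages++;
        } while (token != null);
        return pages;
    }

    public static void main(String[] args) {
        PagedLister lister = new InMemoryLister(List.of("a", "b", "c", "d", "e"));
        List<String> seen = new ArrayList<>();
        int pages = forEachFile(lister, "/warehouse", 2, seen::add);
        System.out.println(pages + " pages, files=" + seen); // prints "3 pages, files=[a, b, c, d, e]"
    }
}
```

With this shape, an object-store implementation could pass the continuation token straight through to its backend's batched list call, while a caller such as ObjectRefresh would process and discard each page before requesting the next.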
Anything else?
No response
Are you willing to submit a PR?