feat(client): Add put_path for lazy file descriptor opening#415
feat(client): Add put_path for lazy file descriptor opening#415
Conversation
Fixes FS-325 Opening tokio::fs::File eagerly via put_file and pushing many such builders into a ManyBuilder causes all file descriptors to be held open simultaneously — easily exhausting macOS's default ulimit of 256. Introduces Session::put_path, which accepts a path and defers the open until the operation is actually submitted. classify() uses tokio::fs::metadata (no fd) to size-check path bodies, and maybe_compress (now async) opens the file at the point of consumption, within the active concurrency window. Also routes Stream bodies through the individual endpoint (unknown size) and applies the batch size check to Buffer bodies.
Semver Impact of This PR🟡 Minor (new features) 📋 Changelog PreviewThis is how your changes will appear in the changelog. New Features ✨
Internal Changes 🔧
🤖 This preview updates automatically when you update the PR. |
| let op = BatchOperation::Insert { | ||
| key, | ||
| metadata, | ||
| body: PutBody::File(file), | ||
| body, | ||
| }; |
There was a problem hiding this comment.
not sure you need the copy. i think you can move the original op argument, or assign a name to the current match case with @:
match op {
insert @ BatchOperation::Insert {
...
} => {
...
if size.is_some_and(|s| s <= MAX_BATCH_PART_SIZE as u64) {
Classified::Batchable(op) // I think you can move the original argument here
} else {
Classified::Individual(insert) // or you can reference a named pattern
}
}
}
There was a problem hiding this comment.
Note that this doesn't copy data. It's a destructure + move on the stack that is fully optimized out by the compiler.
Using insert @ ... is sadly not possible since we move the key out in the error case. The insert variable's type is the enum, not the variant. We would have to bind the individual fields we read with ref like so:
BatchOperation::Insert {
ref key,
ref metadata,
ref body,
}The problem then becomes that we need to clone key in the error case. I had this at first, and to avoid the clone creates even worse code.
clients/rust/src/many.rs
Outdated
There was a problem hiding this comment.
i think this comment is stale now
Opening
tokio::fs::Fileeagerly viaput_fileand pushing many builders into aManyBuilderholds all file descriptors open simultaneously — easily exhausting macOS's default ulimit of 256.Session::put_pathaccepts a path and defers the open until the operation is submitted.classify()usestokio::fs::metadata(no fd) to size-check path bodies;maybe_compress(now async) opens the file at the point of consumption, within the active concurrency window.Also fixes two related issues found along the way:
Streambodies were routed through the batch endpoint despite having unknown size; they now always go individualBufferbodies were not size-checked and could exceedMAX_BATCH_PART_SIZEFixes FS-325