GCS client library migration in Java SDK - part 3 #37900
shunping wants to merge 4 commits into apache:master
Conversation
shunping
commented
Mar 21, 2026
- Add open and create methods for GcsUtilV2.
- Add bridge channel classes to preserve the existing method signatures as much as possible.
Force-pushed from f1b68ae to 67c42da.
Assigning reviewers: R: @ahmedabu98 for label java. Note: If you would like to opt out of this review, comment with one of the available commands.

The PR bot will only process comments in the main thread (not review comments).
cc'ed @BenWhitehead
```java
return new GcsSeekableByteChannel(
    blob.getStorage().reader(blob.getBlobId(), sourceOptions), blob.getSize());
```

I would change this to the following to avoid unnecessary ByteBuffer allocations:

```java
ReadChannel reader = blob.getStorage().reader(blob.getBlobId(), sourceOptions);
// Disable internal buffering, and make the channel non-blocking.
reader.setChunkSize(0);
return new GcsSeekableByteChannel(reader, blob.getSize());
```
```java
if (options.getExpectFileToNotExist()) {
  writeOptionList.add(BlobWriteOption.doesNotExist());
}
```

This should be fleshed out more. If there isn't a precondition[1] present when the writer is created, some internal RPCs won't be able to be retried automatically. An example of the other branch can be seen in this code sample for create-from-array: https://github.com/googleapis/java-storage/blob/ba5daed0c1d306f821cc26549142ff0bcfb80cbb/samples/snippets/src/main/java/com/example/storage/object/UploadObjectFromMemory.java#L53-L64
```java
@Override
public int read(ByteBuffer dst) throws IOException {
  int count = reader.read(dst);
```

After making the channel non-blocking, you'll probably want to add the following change if you need to always fill the provided buffer as much as possible:

```java
int count = StorageChannelUtils.blockingFillFrom(dst, reader);
```
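`StorageChannelUtils.blockingFillFrom` is java-storage's utility; as a stdlib-only illustration of the semantics I take it to provide (keep reading until the buffer is full or EOF, rather than returning after one possibly-partial read), using an in-memory channel:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

public class FillDemo {
  /**
   * Reads from the channel until the buffer is full or EOF.
   * Returns the number of bytes read, or -1 if EOF was hit before any byte.
   */
  static int blockingFill(ByteBuffer dst, ReadableByteChannel channel) throws IOException {
    int total = 0;
    while (dst.hasRemaining()) {
      int n = channel.read(dst);
      if (n == -1) {
        return total == 0 ? -1 : total;
      }
      total += n;
    }
    return total;
  }

  public static void main(String[] args) throws IOException {
    ReadableByteChannel ch =
        Channels.newChannel(new ByteArrayInputStream("hello world".getBytes()));
    ByteBuffer buf = ByteBuffer.allocate(64); // larger than the data, so EOF is reachable
    System.out.println(blockingFill(buf, ch)); // 11
  }
}
```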
```java
assertEquals(0, channel.position());

// Read content into ByteBuffer
ByteBuffer buffer = ByteBuffer.allocate((int) expectedSize);
```
For tests, I generally recommend allocating larger buffers than the actual size of the expected download. By doing so, you ensure that the EOF you receive is where you expect and not that you simply have a common subrange.
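A minimal stdlib sketch of that point, with an in-memory channel standing in for the GCS read channel: oversize the buffer, drain to EOF, and check where EOF landed.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

public class EofCheck {
  /** Drains the channel into the buffer and returns the final position, i.e. the observed size. */
  static int drain(ReadableByteChannel channel, ByteBuffer buffer) throws IOException {
    while (channel.read(buffer) != -1) {
      // keep reading until EOF
    }
    return buffer.position();
  }

  public static void main(String[] args) throws IOException {
    byte[] expected = "payload".getBytes(); // stands in for the downloaded object
    ReadableByteChannel channel = Channels.newChannel(new ByteArrayInputStream(expected));
    // Allocate more than the expected size so a truncated object shows up as
    // position < expected.length instead of silently filling a same-size buffer.
    ByteBuffer buffer = ByteBuffer.allocate(expected.length + 64);
    System.out.println(drain(channel, buffer)); // 7
  }
}
```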
```java
int bytesRead = 0;
while (buffer.hasRemaining()) {
  int read = channel.read(buffer);
  if (read == -1) {
    break;
  }
  bytesRead += read;
}
```

Use our util so that your test is more clearly focused on the logic of the data than the movement of that data:

```java
int bytesRead = StorageChannelUtils.blockingFillFrom(buffer, channel);
```
```java
MessageDigest digest = MessageDigest.getInstance("SHA-256");
digest.update(buffer);
byte[] hashBytes = digest.digest();

// Convert bytes to Hex String
StringBuilder sb = new StringBuilder();
for (byte b : hashBytes) {
  sb.append(String.format("%02x", b));
}
String actualHash = sb.toString();
```
nit: I'd make this a helper method to make the test easier to read, since the test isn't actually testing SHA-256 computation and hex encoding.
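One way the suggested helper could look, using only the JDK; the class and method names are illustrative, not from the PR:

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class HashUtil {
  /** Returns the lowercase hex SHA-256 digest of the buffer's remaining bytes. */
  static String sha256Hex(ByteBuffer buffer) {
    try {
      MessageDigest digest = MessageDigest.getInstance("SHA-256");
      digest.update(buffer);
      StringBuilder sb = new StringBuilder();
      for (byte b : digest.digest()) {
        sb.append(String.format("%02x", b));
      }
      return sb.toString();
    } catch (NoSuchAlgorithmException e) {
      throw new AssertionError("SHA-256 is guaranteed by the JDK", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(sha256Hex(ByteBuffer.wrap("Hello, GCS!".getBytes())));
  }
}
```

The test body then shrinks to `String actualHash = sha256Hex(buffer);`.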
```java
@Test
public void testWriteAndRead() throws IOException {
  final String bucketName = "apache-beam-temp-bucket-12345";
  final GcsPath targetPath = GcsPath.fromComponents(bucketName, "test-object.txt");
```
I would make this object name unique so that another test, or a past test doesn't interfere with the expectation that this test should create a new object.
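A small stdlib sketch of one way to do that, suffixing the object name with a UUID so reruns and concurrent tests can't collide (the helper name is illustrative):

```java
import java.util.UUID;

public class UniqueName {
  /** Appends a random UUID so each test run creates a genuinely new object. */
  static String uniqueObjectName(String base) {
    return base + "-" + UUID.randomUUID();
  }

  public static void main(String[] args) {
    String a = uniqueObjectName("test-object.txt");
    String b = uniqueObjectName("test-object.txt");
    System.out.println(!a.equals(b)); // true: two runs never share a name
  }
}
```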
```java
public void testWriteAndRead() throws IOException {
  final String bucketName = "apache-beam-temp-bucket-12345";
  final GcsPath targetPath = GcsPath.fromComponents(bucketName, "test-object.txt");
  final String content = "Hello, GCS!";
```
The default chunk size for an upload using the writer is 16 MiB. This means that if you create an object smaller than the chunk size, there will be one RPC to create the upload and one RPC to upload and finalize the bytes. If verifying chunking behavior is important, you'll want this content to be larger.
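A sketch of sizing the test payload past that default (16 MiB per the comment above), so the upload actually spans more than one chunk; the class name and 17 MiB figure are illustrative:

```java
import java.util.Arrays;

public class LargePayload {
  static final int DEFAULT_CHUNK_SIZE = 16 * 1024 * 1024; // per the comment above

  /** Builds a deterministic payload strictly larger than one upload chunk. */
  static byte[] multiChunkPayload() {
    byte[] content = new byte[DEFAULT_CHUNK_SIZE + (1 << 20)]; // 17 MiB
    Arrays.fill(content, (byte) 'a');
    return content;
  }

  public static void main(String[] args) {
    System.out.println(multiChunkPayload().length > DEFAULT_CHUNK_SIZE); // true
  }
}
```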
```java
}

// Verify content
assertEquals(content, readContent.toString());
```
nit: I generally prefer asserting bytes rather than strings; that way, non-printing characters or similar-looking characters aren't sources of confusion in any assertion failure.
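A stdlib sketch of comparing at the byte level instead; the variable and helper names are illustrative:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ByteCompare {
  /** Compares an expected string against raw channel output byte-for-byte. */
  static boolean sameBytes(String expected, byte[] actual) {
    return Arrays.equals(expected.getBytes(StandardCharsets.UTF_8), actual);
  }

  public static void main(String[] args) {
    // readBytes stands in for the bytes actually read back from the channel;
    // comparing raw bytes keeps invisible or look-alike characters from
    // masking a mismatch the way a decoded-string assertion can.
    byte[] readBytes = "Hello, GCS!".getBytes(StandardCharsets.UTF_8);
    System.out.println(sameBytes("Hello, GCS!", readBytes)); // true
  }
}
```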