This is needed to run `bb print` to see Split/Splice calls, in order to test bazelbuild/bazel#28437. `bb print` reads the gRPC log directly from what's in this file, so updating this file is sufficient.

@tjgq @fmeum If possible (depending on timing), I would like to include this in the 8 and 9 LTS versions. It seems like 8.6 is close, so maybe we can target 8.7. It doesn't patch cleanly on 8.X, so I'll need to create a separate PR once it's ready; please let me know if you'd like me to do that. Thanks again for all the help with this!

Now anyone can try out chunking using BuildBuddy to test:

1. Sign up for a trial/free account at https://www.buildbuddy.io/
2. Get a token with write access
3. Use the Bazel fork from bazelbuild/bazel#28437
4. Build!

```
USE_BAZEL_VERSION="tyler-french/9.1.0-cdc" bazel build //... \
  --disk_cache= \
  --experimental_remote_cache_chunking \
  --remote_header=x-buildbuddy-cdc-enabled=true \
  --check_direct_dependencies=off \
  --remote_cache=grpcs://remote.buildbuddy.io
```
@tjgq I have seen some flakiness on Windows for:

Thanks for the heads up - I'll see what I can do about it.

tjgq left a comment:
LGTM. I'll import this one myself.

Rebased again

@bazel-io fork 8.7.0

@bazel-io fork 9.1.0

This will be submitted momentarily, but with some changes relative to the state of this PR. The most important modification is that We also noticed two opportunities for optimization (as a followup):
TLDR: This PR enables support for content-defined chunking (FastCDC) for large uploads/downloads to remote cache, saving ~40% on storage and upload bandwidth, and making builds faster by deduplicating similar artifacts across builds.
RELNOTES[NEW]: Added `--experimental_remote_cache_chunking` flag to read and write large blobs to/from the remote cache in chunks. Requires server support.

Motivation
Actions like `GoLink` and `CppLink` produce very large output files that are often similar between builds. A small source change can cause a cache miss, wasting storage, bandwidth, and time on nearly-identical artifacts.

Content-Defined Chunking (CDC) addresses this by splitting files at content-determined cut points. Because cut points are derived from the file content itself, small changes, even ones that shift bytes around, tend to affect only a few chunks. This makes action outputs effectively incremental: even though the action must re-run, the upload, download, and storage costs shrink dramatically.
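
To make the cut-point idea concrete, here is a minimal sketch of a content-defined chunker. It is a simplified gear-hash scheme written for illustration, not the `FastCDCChunker` from this PR: the class name `CdcSketch`, the size limits, the mask, and the seeded gear table are all assumptions chosen for the demo.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

/** Simplified content-defined chunker for illustration; NOT Bazel's FastCDCChunker. */
class CdcSketch {
  // Hypothetical parameters chosen for the demo, not the values used by the PR.
  static final int MIN_SIZE = 1024;
  static final int MAX_SIZE = 16 * 1024;
  // Cut when the top 12 bits of the rolling hash are zero (about 1 in 4096 positions).
  static final long MASK = 0xFFF0_0000_0000_0000L;

  // Per-byte random values for the rolling hash, seeded so runs are reproducible.
  static final long[] GEAR = new long[256];

  static {
    Random random = new Random(42);
    for (int i = 0; i < GEAR.length; i++) {
      GEAR[i] = random.nextLong();
    }
  }

  /** Splits data into chunks whose boundaries depend on the content, not on offsets. */
  static List<byte[]> chunk(byte[] data) {
    List<byte[]> chunks = new ArrayList<>();
    int start = 0;
    while (start < data.length) {
      int end = findCut(data, start);
      chunks.add(Arrays.copyOfRange(data, start, end));
      start = end;
    }
    return chunks;
  }

  /** Returns the exclusive end offset of the chunk that begins at start. */
  static int findCut(byte[] data, int start) {
    int limit = Math.min(data.length, start + MAX_SIZE);
    long hash = 0;
    for (int i = start; i < limit; i++) {
      // Each shift pushes older bytes out, so the hash reflects only the last 64 bytes.
      hash = (hash << 1) + GEAR[data[i] & 0xFF];
      if (i - start + 1 >= MIN_SIZE && (hash & MASK) == 0) {
        return i + 1;
      }
    }
    return limit; // Forced cut at MAX_SIZE or at end of input.
  }

  /** Counts chunks in after whose exact content also appears in before. */
  static int sharedChunks(List<byte[]> before, List<byte[]> after) {
    Set<ByteBuffer> known = new HashSet<>();
    for (byte[] c : before) {
      known.add(ByteBuffer.wrap(c)); // ByteBuffer compares by content.
    }
    int shared = 0;
    for (byte[] c : after) {
      if (known.contains(ByteBuffer.wrap(c))) {
        shared++;
      }
    }
    return shared;
  }

  public static void main(String[] args) {
    byte[] original = new byte[1 << 20];
    new Random(7).nextBytes(original);
    // Insert one byte near the front, shifting every later byte by one position.
    byte[] edited = new byte[original.length + 1];
    System.arraycopy(original, 0, edited, 0, 100);
    edited[100] = 42;
    System.arraycopy(original, 100, edited, 101, original.length - 100);
    List<byte[]> before = chunk(original);
    List<byte[]> after = chunk(edited);
    System.out.println(sharedChunks(before, after) + " of " + after.size() + " chunks unchanged");
  }
}
```

With fixed-size chunks, the one-byte insertion would shift every later boundary and change every later chunk. Here the boundaries follow the content, so the chunker is expected to resynchronize shortly after the edit and reuse the remaining chunks.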
Results
Benchmarked across the last 50 commits of the BuildBuddy repo (server and client on the same host):
Key takeaways:
Additional benefits: better load balancing across distributed clusters (fewer long-running RPCs) and more granular retries on unstable networks.
Try It Out
Anyone can try chunking today using BuildBuddy:
How It Works
Write path:
- Call `FindMissingBlobs` to identify which chunks the server already has.
- Call `SpliceBlob` to register the blob-to-chunks mapping on the server.

Read path:
- Call `SplitBlob` to get the chunk list for this blob.

If `--disk_cache` is enabled, previously downloaded chunks are served locally.
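
The flow above can be sketched against an in-memory stand-in for the remote cache. Everything here is hypothetical and for illustration only: `FakeCache` and its method signatures are not the real gRPC API (only the RPC names come from the description), fixed-size splitting stands in for FastCDC, and a plain map stands in for the disk cache.

```java
import java.io.ByteArrayOutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ChunkedCacheSketch {
  /** Hypothetical in-memory cache; method names mirror the RPCs, signatures are invented. */
  static class FakeCache {
    final Map<String, byte[]> blobs = new HashMap<>();
    final Map<String, List<String>> splices = new HashMap<>();
    int uploads = 0;

    List<String> findMissingBlobs(List<String> digests) {
      List<String> missing = new ArrayList<>();
      for (String d : digests) {
        if (!blobs.containsKey(d)) missing.add(d);
      }
      return missing;
    }

    void upload(String digest, byte[] data) {
      blobs.put(digest, data);
      uploads++;
    }

    void spliceBlob(String blobDigest, List<String> chunkDigests) {
      splices.put(blobDigest, new ArrayList<>(chunkDigests));
    }

    List<String> splitBlob(String blobDigest) {
      return splices.get(blobDigest);
    }

    byte[] download(String digest) {
      return blobs.get(digest);
    }
  }

  static String digest(byte[] data) {
    try {
      StringBuilder hex = new StringBuilder();
      for (byte b : MessageDigest.getInstance("SHA-256").digest(data)) {
        hex.append(String.format("%02x", b & 0xFF));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Stand-in for FastCDC: fixed-size splitting keeps the sketch short. */
  static List<byte[]> split(byte[] data, int size) {
    List<byte[]> chunks = new ArrayList<>();
    for (int i = 0; i < data.length; i += size) {
      chunks.add(Arrays.copyOfRange(data, i, Math.min(data.length, i + size)));
    }
    return chunks;
  }

  /** Write path: upload only the missing chunks, then register the blob-to-chunks mapping. */
  static String write(FakeCache cache, byte[] blob, int chunkSize) {
    Map<String, byte[]> byDigest = new LinkedHashMap<>();
    List<String> order = new ArrayList<>();
    for (byte[] c : split(blob, chunkSize)) {
      String d = digest(c);
      order.add(d);
      byDigest.put(d, c);
    }
    for (String d : cache.findMissingBlobs(new ArrayList<>(byDigest.keySet()))) {
      cache.upload(d, byDigest.get(d));
    }
    String blobDigest = digest(blob);
    cache.spliceBlob(blobDigest, order);
    return blobDigest;
  }

  /** Read path: fetch the chunk list, reuse local chunks, download the rest, concatenate. */
  static byte[] read(FakeCache cache, Map<String, byte[]> localChunks, String blobDigest) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (String d : cache.splitBlob(blobDigest)) {
      byte[] c = localChunks.get(d);
      if (c == null) {
        c = cache.download(d);
        localChunks.put(d, c);
      }
      out.write(c, 0, c.length);
    }
    return out.toByteArray();
  }
}
```

Writing a second blob that differs from the first in a single chunk uploads only that one chunk; the rest are skipped because `findMissingBlobs` reports them as already present.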