[remote exec] record metrics on every image pull #11361

Merged
dan-stowell merged 6 commits into master from bbdan-oci-metrics-dash
Feb 19, 2026

@dan-stowell
Contributor

I find myself wanting to understand success/error counts and latencies for all image pulls before enabling the OCIFetcher experiment. Let's gather those metrics!
Up next: a dashboard for OCI image pulls

dan-stowell and others added 5 commits February 19, 2026 19:07
Add a Prometheus histogram to track container image fetches on executors:
- buildbuddy_remote_execution_image_fetch_duration_usec

Counts are available via the histogram's _count suffix, so a separate
counter is unnecessary.

Labels:
- isolation: container isolation type (podman, oci, firecracker, etc.)
- registry: eTLD+1 of the image registry domain
- status: ok or error
- on_disk: whether the image was already present on the executor
- has_creds: whether credentials were provided for the fetch
- trigger: execution (normal action) or warmup
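
As a rough illustration of the label set above, here is a minimal, standard-library-only sketch. `fetchLabels`, `fakeHistogram`, and `statusLabel` are hypothetical names, and `fakeHistogram` is only a stand-in for a Prometheus histogram vector: it keeps a count and a microsecond sum per label combination, which is enough to show why a separate counter is unnecessary.

```go
package main

import (
	"fmt"
	"time"
)

// fetchLabels mirrors the six labels listed in the commit message.
// The struct itself is hypothetical; the real code presumably passes
// these as Prometheus label values.
type fetchLabels struct {
	Isolation string // e.g. "podman", "oci", "firecracker"
	Registry  string // eTLD+1 of the image registry domain
	Status    string // "ok" or "error"
	OnDisk    bool   // image already present on the executor
	HasCreds  bool   // credentials were provided for the fetch
	Trigger   string // "execution" or "warmup"
}

// fakeHistogram is a stand-in for a histogram vector: one count and one
// sum (in microseconds) per distinct label combination.
type fakeHistogram struct {
	count   map[fetchLabels]uint64
	sumUsec map[fetchLabels]float64
}

func newFakeHistogram() *fakeHistogram {
	return &fakeHistogram{
		count:   map[fetchLabels]uint64{},
		sumUsec: map[fetchLabels]float64{},
	}
}

// observe records one fetch duration, converted to microseconds to match
// the _usec suffix of the metric name.
func (h *fakeHistogram) observe(l fetchLabels, d time.Duration) {
	h.count[l]++
	h.sumUsec[l] += float64(d.Microseconds())
}

// statusLabel maps a fetch error to the "status" label value.
func statusLabel(err error) string {
	if err != nil {
		return "error"
	}
	return "ok"
}

func main() {
	h := newFakeHistogram()
	l := fetchLabels{
		Isolation: "oci",
		Registry:  "docker.io",
		Status:    statusLabel(nil),
		OnDisk:    false,
		HasCreds:  true,
		Trigger:   "execution",
	}
	h.observe(l, 1500*time.Millisecond)
	h.observe(l, 500*time.Millisecond)
	fmt.Printf("count=%d sum_usec=%.0f\n", h.count[l], h.sumUsec[l])
}
```

Because every observation increments the per-label count, success and error totals fall out of the same series by filtering on `status`.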

Instrumentation covers:
1. PullImageIfNecessary - called for every action execution, knows on-disk
   status from IsImageCached() check
2. Warmup path in runner - always records on_disk=false since warmup
   intentionally bypasses the cache check
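
The two instrumentation points can be sketched roughly as follows. This is not the executor's actual code: `imagePuller`, `recorder`, `pullIfNecessary`, and `warmupPull` are hypothetical, reduced names, and the sketch assumes one observation is recorded per call, including when the image is already on disk.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// imagePuller is a hypothetical, reduced view of a container image API.
type imagePuller interface {
	IsImageCached() (bool, error)
	PullImage() error
}

// observation is what the sketch records for each fetch.
type observation struct {
	onDisk  bool
	status  string
	trigger string
	usec    int64
}

// recorder collects observations in memory (a stand-in for the metric).
type recorder struct{ obs []observation }

func (r *recorder) record(onDisk bool, err error, trigger string, d time.Duration) {
	status := "ok"
	if err != nil {
		status = "error"
	}
	r.obs = append(r.obs, observation{onDisk, status, trigger, d.Microseconds()})
}

// pullIfNecessary checks the cache first and pulls only on a miss, so it
// knows the on-disk status at recording time.
func pullIfNecessary(p imagePuller, r *recorder) error {
	start := time.Now()
	cached, err := p.IsImageCached()
	if err == nil && !cached {
		err = p.PullImage()
	}
	r.record(cached, err, "execution", time.Since(start))
	return err
}

// warmupPull skips the cache check, so it always records on_disk=false.
func warmupPull(p imagePuller, r *recorder) error {
	start := time.Now()
	err := p.PullImage()
	r.record(false, err, "warmup", time.Since(start))
	return err
}

// fakePuller is a test double used only for this demonstration.
type fakePuller struct {
	cached  bool
	pullErr error
	pulls   int
}

func (f *fakePuller) IsImageCached() (bool, error) { return f.cached, nil }
func (f *fakePuller) PullImage() error             { f.pulls++; return f.pullErr }

func main() {
	r := &recorder{}
	_ = pullIfNecessary(&fakePuller{cached: true}, r)
	_ = warmupPull(&fakePuller{pullErr: errors.New("registry unavailable")}, r)
	for _, o := range r.obs {
		fmt.Printf("trigger=%s on_disk=%t status=%s\n", o.trigger, o.onDisk, o.status)
	}
}
```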

RegistryETLDPlusOne parses the registry host from image ref strings without
depending on go-containerregistry, using golang.org/x/net/publicsuffix for
eTLD+1 computation.
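
For illustration, manual host extraction of this kind might look like the sketch below. `registryHost` is a hypothetical helper, not the function from this commit. It assumes the usual Docker convention that the first path component names a registry only if it contains a dot or colon or equals `localhost`, and it leaves out the eTLD+1 reduction because `golang.org/x/net/publicsuffix` is outside the standard library.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// registryHost returns the registry host of an image ref, without the port.
// Refs with no explicit registry (including the empty string) fall back to
// docker.io.
func registryHost(ref string) string {
	first, _, hasSlash := strings.Cut(strings.TrimSpace(ref), "/")
	// Without a slash the whole ref is a repository name such as
	// "ubuntu:22.04", so there is no registry component to inspect.
	if !hasSlash || !(strings.ContainsAny(first, ".:") || first == "localhost") {
		return "docker.io"
	}
	// Strip a port if one is present; otherwise keep the host as-is.
	if host, _, err := net.SplitHostPort(first); err == nil {
		return host
	}
	return first
}

func main() {
	refs := []string{
		"ubuntu:22.04",
		"library/ubuntu",
		"gcr.io/project/image:tag",
		"localhost:5000/app",
		"ghcr.io/org/img@sha256:abc",
	}
	for _, ref := range refs {
		fmt.Printf("%-28s -> %s\n", ref, registryHost(ref))
	}
}
```

Note that an empty ref falls through to `docker.io` in this sketch.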

Co-authored-by: Shelley <shelley@exe.dev>
Co-authored-by: Shelley <shelley@exe.dev>
… parsing

Replace manual host extraction logic with gcrname.ParseReference, which
handles implicit docker.io defaults, tags, digests, and port stripping.
The publicsuffix eTLD+1 reduction is retained for the final step.

Empty/unparseable refs now return [UNKNOWN] instead of docker.io, and
IP-address registries continue to return [IP_ADDRESS].

Co-authored-by: Shelley <shelley@exe.dev>
The IP-address / eTLD+1 / [UNKNOWN] logic was duplicated between
httpclient's metricsTransport and oci.RegistryETLDPlusOne. Extract it
as httpclient.HostLabel and call it from both places.
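
A shared helper along these lines could look like the sketch below. It is not the actual `httpclient.HostLabel`: `hostLabel` and `lastTwoLabels` are hypothetical, the eTLD+1 function is passed in so the sketch stays within the standard library, and returning `[UNKNOWN]` when no registrable domain can be found is an assumption. Real code would likely pass `publicsuffix.EffectiveTLDPlusOne`, which has the same `func(string) (string, error)` shape.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

const (
	unknownLabel = "[UNKNOWN]"
	ipLabel      = "[IP_ADDRESS]"
)

// hostLabel reduces a host (optionally with a port) to a low-cardinality
// metric label: a fixed marker for IPs and empty hosts, eTLD+1 otherwise.
func hostLabel(host string, etldPlusOne func(string) (string, error)) string {
	host = strings.TrimSpace(host)
	if h, _, err := net.SplitHostPort(host); err == nil {
		host = h
	}
	host = strings.Trim(host, "[]") // bare bracketed IPv6 literal
	if host == "" {
		return unknownLabel
	}
	if net.ParseIP(host) != nil {
		return ipLabel
	}
	domain, err := etldPlusOne(host)
	if err != nil {
		return unknownLabel // assumption: unparseable domains are unknown
	}
	return domain
}

// lastTwoLabels is a deliberately naive stand-in for a public-suffix
// lookup. It is wrong for multi-label suffixes such as "co.uk".
func lastTwoLabels(host string) (string, error) {
	parts := strings.Split(host, ".")
	if len(parts) < 2 {
		return "", fmt.Errorf("no registrable domain in %q", host)
	}
	return strings.Join(parts[len(parts)-2:], "."), nil
}

func main() {
	for _, h := range []string{"", "10.0.0.1:5000", "[::1]:5000", "registry.example.com"} {
		fmt.Printf("%q -> %s\n", h, hostLabel(h, lastTwoLabels))
	}
}
```

Collapsing IPs and unknown hosts into fixed markers keeps the `registry` label's cardinality bounded, which matters for any metric keyed on user-supplied hostnames.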

Co-authored-by: Shelley <shelley@exe.dev>
Co-authored-by: Shelley <shelley@exe.dev>
@dan-stowell dan-stowell merged commit 9080422 into master Feb 19, 2026
8 of 14 checks passed
@dan-stowell dan-stowell deleted the bbdan-oci-metrics-dash branch February 19, 2026 21:11
2 participants