Skip to content

feat(bigquery-analytics): otel correlation, custom_metadata allowlist, column projection (#312/#320/#321)#10

Open
caohy1988 wants to merge 4 commits into
mainfrom
feat/bqaa-otel-metadata-projection
Open

feat(bigquery-analytics): otel correlation, custom_metadata allowlist, column projection (#312/#320/#321)#10
caohy1988 wants to merge 4 commits into
mainfrom
feat/bqaa-otel-metadata-projection

Conversation

@caohy1988

Copy link
Copy Markdown
Owner

Summary

Implements three BQAA plugin observability controls in one change, all additive and off by default. Validated against google/adk-python main and the refined specs in GoogleCloudPlatform/BigQuery-Agent-Analytics-SDK google#312 / google#320 / google#321.

Branch is based on current main (synced from google/adk-python). Only bigquery_agent_analytics_plugin.py + its test file change.

google#312 — span-level Cloud Trace correlation

  • Captures the ambient OTel span context at row-emission time (trace.get_current_span().get_span_context()), only when is_valid, into attributes.otel.{span_id,trace_id}.
  • span_id / parent_span_id are unchanged — they remain the BQAA-internal execution tree. The stale "OpenTelemetry span ID" schema descriptions are corrected to say so, pointing consumers at attributes.otel.span_id for span-level joins.
  • Documented as a best-effort join key (an unsampled valid span is absent from the Cloud Trace export), not a foreign key. otel_parent_span_id is deferred — the OTel SpanContext does not expose a parent id.
  • No plugin-owned OTel span is created/exported (preserves the spellcheck: response spelling changed google/adk-python#94 no-duplicate-span guarantee).

google#320 — custom_metadata allowlist

  • New custom_metadata_allowlist config: exact keys and explicit a2a:*-style prefixes (a plain key is never treated as a prefix).
  • Allowlisted keys from event.custom_metadata are captured into attributes.custom_metadata.* on every row emitted from the source Event — including AGENT_RESPONSE, which did not read custom_metadata before (the UDR citation case).
  • Runs through the existing safety pipeline: truncation (max_content_length) + sensitive-key redaction + circular-ref handling; truncation flips is_truncated, redaction does not.
  • The built-in a2a:* path (A2A_INTERACTION, typed views) is untouched; generic capture lives under a separate namespace.
  • Query via JSON_QUERY(attributes, '$.custom_metadata."<key>"') (quoted segment handles :/. in keys).

google#321 — physical column projection

  • New payload_column_denylist (denylist-first), scoped to the projectable payload columns content / content_parts / attributes / latency_ms. Listing an identity/correlation column raises a clear ValueError at construction.
  • Applied schema-first: the BQ table schema, Arrow schema, row dict, and auto-schema-upgrade all derive from the projected schema, so they never disagree. Auto-upgrade stays additive (never drops existing columns).
  • Projection-aware views: derived view columns whose SQL references a denied payload column (content, attributes, or latency_ms) are dropped, so view creation never references a missing column.

Schema doc

  • Broadened the is_truncated description to "content or metadata payload was truncated" (and noted redaction does not set it).

Tests

  • 25 new tests covering: allowlist parse + exact/prefix matching, capture (namespace, redaction-no-flag, truncation-flag, non-allowlisted absent, no-source-event, default-noop), denylist validation (ValueError on protected/unknown), construction rejection, schema projection + Arrow consistency, view degradation for attributes/content/latency_ms, and otel capture present/absent by is_valid.
  • Full plugin suite: 287 passed, 6 skipped. isort + pyink clean.

Default behavior is unchanged when none of the three configs are set.

…, column projection

Implements three BQAA plugin observability controls
(GoogleCloudPlatform/BigQuery-Agent-Analytics-SDK#312/google#320/google#321):

- google#312 span-level Cloud Trace correlation: capture the ambient OTel span
  context at row-emission time (only when is_valid) into attributes.otel.*;
  span_id/parent_span_id stay the BQAA-internal execution tree. Corrects the
  stale "OpenTelemetry span ID" schema descriptions. Best-effort join key
  (an unsampled valid span is absent from Cloud Trace), not a foreign key;
  otel_parent_span_id deferred (not derivable from SpanContext alone).

- google#320 custom_metadata allowlist: custom_metadata_allowlist config (exact keys
  + explicit "a2a:*"-style prefixes) captures event.custom_metadata into
  attributes.custom_metadata.* on every row emitted from the source Event
  (including AGENT_RESPONSE, which did not read custom_metadata before),
  through the existing safety pipeline (truncation, sensitive-key redaction,
  circular-ref handling, is_truncated). The built-in a2a:* path is unchanged.

- google#321 physical column projection: payload_column_denylist (denylist-first,
  scoped to content/content_parts/attributes/latency_ms; identity/correlation
  columns are protected and raise ValueError). Applied schema-first so the BQ
  schema, Arrow schema, row dict, and views stay consistent; projection-aware
  views drop derived columns that reference a denied payload column.

Also broadens the is_truncated column description to cover content or metadata
payload truncation. Adds 25 unit tests; full plugin suite green
(287 passed, 6 skipped).
- File-content compliance: assemble the cloud-platform OAuth scope from parts
  so this changed file no longer embeds a bare Google APIs host literal
  (the compliance scan rejects such literals on changed files).
- Schema upgrade vs projection change: _maybe_upgrade_schema now computes the
  missing-field diff BEFORE the version-label early return. self._schema is
  projection-dependent (google#321), so relaxing payload_column_denylist on a table
  whose label still matches must still add the now-desired columns instead of
  skipping the diff.
- attributes denial interaction: reject custom_metadata_allowlist together with
  payload_column_denylist=["attributes"] at construction (the captured payload
  would be silently dropped), skip the attributes.otel write when attributes is
  denied, and document that denying attributes disables otel/custom_metadata.

Adds 5 tests (denylist-relaxed upgrade, current-and-complete no-op, fail-fast
rejection, attributes-denied otel skip). Full plugin suite: 292 passed,
6 skipped. isort + pyink clean.
Content parsing/offload ran before row projection, so denying content_parts
(which holds the offload object reference) could still upload the payload to
GCS with no retained reference -- a payload leak + cost. And denying both
content and content_parts still did the full parse/offload for a row that
keeps neither payload column.

- When content_parts is denied, do not construct the GCS offloader (large /
  binary content is kept inline + truncated instead of uploaded); log a
  warning so the disabled offload is visible.
- When both content and content_parts are denied, skip content parsing
  entirely (no inline summary, no parts, no offload).

Adds 2 tests asserting the storage upload mock is not called for
payload_column_denylist=["content_parts"] and ["content","content_parts"]
with gcs_bucket_name set. Full plugin suite: 294 passed, 6 skipped.
The skip-parse branch assigned content_parts from a bare [] inside tuple
unpacking, which mypy could not infer (var-annotated error on 3.10-3.13).
Annotate content_json/content_parts/parser_truncated before the branch.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant