Skip to content

spellcheck: response spelling changed#94

Closed
selvasudar wants to merge 1 commit into
google:mainfrom
selvasudar:main
Closed

spellcheck: response spelling changed#94
selvasudar wants to merge 1 commit into
google:mainfrom
selvasudar:main

Conversation

@selvasudar

Copy link
Copy Markdown

No description provided.

@google-cla

google-cla Bot commented Apr 11, 2025

Copy link
Copy Markdown

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@selvasudar selvasudar closed this Apr 11, 2025
@selvasudar selvasudar reopened this Apr 11, 2025
@selvasudar selvasudar closed this by deleting the head repository Apr 11, 2025
caohy1988 added a commit to caohy1988/adk-python that referenced this pull request Jun 1, 2026
…le#94)

When Agent Engine telemetry is enabled
(GOOGLE_CLOUD_AGENT_ENGINE_ENABLE_TELEMETRY=true) and Cloud Trace
export is configured on the global OTel tracer provider, the
BigQuery Agent Analytics plugin causes every instrumented operation
to appear as TWO spans in Cloud Trace — the framework's real span
plus a duplicate plugin-owned span. Reported in
GoogleCloudPlatform/BigQuery-Agent-Analytics-SDK#94 (also at
haiyuan-eng-google#94) by a regular BQAA + Agent Engine user.

Root cause
----------
TraceManager.push_span() called tracer.start_span(...) purely as an
ID carrier, so the plugin could populate BigQuery span_id /
parent_span_id columns. The author was aware of one related class
of pitfall (re-parenting the framework's spans — fixed by not
attaching the plugin span to the ambient context, see google#4561), but
that guard does not stop the plugin span from going through the
configured SpanProcessor → BatchSpanProcessor → Cloud Trace exporter.
Once Agent Engine wires the global provider to Cloud Trace, every
push/pop pair is exported.

The plugin already tracked every parent/child relationship on its
own contextvar-backed _SpanRecord stack — the OTel span object was
incidental to correctness.

Fix (scoped to TraceManager methods + _SpanRecord)
-------------------------------------------------
* _SpanRecord no longer holds an OTel span object. It carries
  span_id, trace_id, owns_span, start_time_ns, first_token_time.
* push_span generates a 16-hex span_id directly and inherits
  trace_id by precedence: top of internal stack → ambient OTel span
  → freshly generated 32-hex. No tracer.start_span call.
* attach_current_span extracts trace_id/span_id from the ambient
  span (the existing path the framework already supports) and
  stores them as plain strings — no longer holds the OTel object.
* pop_span / clear_stack drop the .end()/.start_time machinery
  since there is no OTel span to end.
* get_trace_id / get_start_time read directly from the record.

The signatures of push_span / attach_current_span / pop_span /
clear_stack / get_trace_id / get_start_time are unchanged.
Module-level `tracer = trace.get_tracer(...)` is retained for ABI
compat (currently unused by plugin code; can be removed in a
follow-up if no external consumers are identified).

attach_current_span() is otherwise untouched — it only observed
the ambient span; it never created one. That path was already
correct and remains so.

Cross-system correlation
------------------------
BigQuery rows still carry trace_id, inherited from the ambient
Agent Engine / Runner span when present. Joining `agent_events` to
Cloud Trace by trace_id continues to work end-to-end.

Tests
-----
Three existing tests that were directly asserting the bug
(test_otel_integration, test_otel_integration_real_provider,
test_clear_stack_ends_owned_spans) are rewritten as inverted
regression guards:
  * test_push_pop_does_not_call_tracer_start_span
  * test_push_pop_does_not_export_spans_through_real_provider
  * test_clear_stack_does_not_export_spans
Each asserts that the corresponding code path does NOT export an
OTel span via an InMemorySpanExporter wired to a real provider, or
does NOT invoke tracer.start_span via a mock spy.

Four new tests added to lock in the lifecycle / inheritance
contracts the plugin must keep:
  * test_push_span_inherits_ambient_trace_id — when an ambient
    OTel span exists (the Agent Engine pattern), the plugin's
    trace_id matches it.
  * test_llm_request_response_share_span_id_contract — the
    paired LLM_REQUEST / LLM_RESPONSE rows share one span_id (the
    documented BQAA join contract).
  * test_tool_starting_completed_share_span_id_contract — same
    invariant for the tool lifecycle pair.
  * test_streaming_llm_response_shares_span_id_until_final_contract
    — multiple partial LLM_RESPONSE callbacks reuse the same
    span_id and only the final fire pops the span. Prevents a
    future "dedupe" of streaming rows from breaking the contract.

226/229 plugin tests pass (6 skipped for unrelated optional
deps); pyink + isort clean.

Refs:
  - haiyuan-eng-google/BigQuery-Agent-Analytics-SDK#94 (reported by a customer)
  - google#4561 (prior fix for span-hierarchy re-parenting)
  - google#4645 (prior fix for trace_id fracture)
caohy1988 added a commit to caohy1988/adk-python that referenced this pull request Jun 1, 2026
…le#94) (#5)

When Agent Engine telemetry is enabled
(GOOGLE_CLOUD_AGENT_ENGINE_ENABLE_TELEMETRY=true) and Cloud Trace
export is configured on the global OTel tracer provider, the
BigQuery Agent Analytics plugin causes every instrumented operation
to appear as TWO spans in Cloud Trace — the framework's real span
plus a duplicate plugin-owned span. Reported in
GoogleCloudPlatform/BigQuery-Agent-Analytics-SDK#94 (also at
haiyuan-eng-google#94) by a regular BQAA + Agent Engine user.

Root cause
----------
TraceManager.push_span() called tracer.start_span(...) purely as an
ID carrier, so the plugin could populate BigQuery span_id /
parent_span_id columns. The author was aware of one related class
of pitfall (re-parenting the framework's spans — fixed by not
attaching the plugin span to the ambient context, see google#4561), but
that guard does not stop the plugin span from going through the
configured SpanProcessor → BatchSpanProcessor → Cloud Trace exporter.
Once Agent Engine wires the global provider to Cloud Trace, every
push/pop pair is exported.

The plugin already tracked every parent/child relationship on its
own contextvar-backed _SpanRecord stack — the OTel span object was
incidental to correctness.

Fix (scoped to TraceManager methods + _SpanRecord)
-------------------------------------------------
* _SpanRecord no longer holds an OTel span object. It carries
  span_id, trace_id, owns_span, start_time_ns, first_token_time.
* push_span generates a 16-hex span_id directly and inherits
  trace_id by precedence: top of internal stack → ambient OTel span
  → freshly generated 32-hex. No tracer.start_span call.
* attach_current_span extracts trace_id/span_id from the ambient
  span (the existing path the framework already supports) and
  stores them as plain strings — no longer holds the OTel object.
* pop_span / clear_stack drop the .end()/.start_time machinery
  since there is no OTel span to end.
* get_trace_id / get_start_time read directly from the record.

The signatures of push_span / attach_current_span / pop_span /
clear_stack / get_trace_id / get_start_time are unchanged.
Module-level `tracer = trace.get_tracer(...)` is retained for ABI
compat (currently unused by plugin code; can be removed in a
follow-up if no external consumers are identified).

attach_current_span() is otherwise untouched — it only observed
the ambient span; it never created one. That path was already
correct and remains so.

Cross-system correlation
------------------------
BigQuery rows still carry trace_id, inherited from the ambient
Agent Engine / Runner span when present. Joining `agent_events` to
Cloud Trace by trace_id continues to work end-to-end.

Tests
-----
Three existing tests that were directly asserting the bug
(test_otel_integration, test_otel_integration_real_provider,
test_clear_stack_ends_owned_spans) are rewritten as inverted
regression guards:
  * test_push_pop_does_not_call_tracer_start_span
  * test_push_pop_does_not_export_spans_through_real_provider
  * test_clear_stack_does_not_export_spans
Each asserts that the corresponding code path does NOT export an
OTel span via an InMemorySpanExporter wired to a real provider, or
does NOT invoke tracer.start_span via a mock spy.

Four new tests added to lock in the lifecycle / inheritance
contracts the plugin must keep:
  * test_push_span_inherits_ambient_trace_id — when an ambient
    OTel span exists (the Agent Engine pattern), the plugin's
    trace_id matches it.
  * test_llm_request_response_share_span_id_contract — the
    paired LLM_REQUEST / LLM_RESPONSE rows share one span_id (the
    documented BQAA join contract).
  * test_tool_starting_completed_share_span_id_contract — same
    invariant for the tool lifecycle pair.
  * test_streaming_llm_response_shares_span_id_until_final_contract
    — multiple partial LLM_RESPONSE callbacks reuse the same
    span_id and only the final fire pops the span. Prevents a
    future "dedupe" of streaming rows from breaking the contract.

226/229 plugin tests pass (6 skipped for unrelated optional
deps); pyink + isort clean.

Refs:
  - haiyuan-eng-google/BigQuery-Agent-Analytics-SDK#94 (reported by a customer)
  - google#4561 (prior fix for span-hierarchy re-parenting)
  - google#4645 (prior fix for trace_id fracture)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant