Android-studio61 · pull · Apr 28, 2026
diff --git a/docs/integrations/assets/datadog-observability.png b/docs/integrations/assets/datadog-observability.png
diff --git a/docs/integrations/assets/datadog.png b/docs/integrations/assets/datadog.png
diff --git a/docs/integrations/datadog.md b/docs/integrations/datadog.md
@@ -0,0 +1,97 @@
+---
+catalog_title: Datadog
+catalog_description: Develop, evaluate, and monitor LLM applications
+catalog_icon: /integrations/assets/datadog.png
+catalog_tags: ["observability"]
+---
+
+# Datadog Observability for ADK
+
+<div class="language-support-tag">
+    <span class="lst-supported">Supported in:</span>
+    <span class="lst-python">Python</span>
+</div>
+
+[Datadog LLM
+Observability](https://www.datadoghq.com/product/llm-observability/) helps AI
+engineers, data scientists, and application developers quickly develop,
+evaluate, and monitor LLM applications. Confidently improve output quality,
+performance, costs, and overall risk with structured experiments, end-to-end
+tracing across AI agents, and evaluations.
+
+## Overview
+
+Datadog LLM Observability can [automatically instrument and trace your agents
+built on Google
+ADK](https://docs.datadoghq.com/llm_observability/instrumentation/auto_instrumentation?tab=python#google-adk),
+allowing you to:
+
+- **Observe agent executions and interactions** by automatically capturing
+  every agent run, tool call, and code execution within your agents
+- **Capture LLM calls and responses** made with the underlying Google GenAI SDK
+- **Debug issues** using error rates, token usage and cost, and
+  out-of-the-box evaluations on your LLM calls and tool usage
+
+## Prerequisites
+
+Sign up for a [Datadog account](https://www.datadoghq.com/) if you do not have
+one and [get your API
+key](https://docs.datadoghq.com/account_management/api-app-keys/#api-keys).
+
+## Installation
+
+Install the `ddtrace` package:
+
+```bash
+pip install ddtrace
+```
+
+## Setup
+
+### Create an Application using the Google ADK
+
+If you do not have an application using the Google ADK, follow the steps in the
+[ADK Getting Started Guide](https://google.github.io/adk-docs/get-started/) to
+create a sample ADK agent.
+
+### Configure Environment Variables
+
+Set the following environment variables, including an ML Application name. An
+ML Application is a grouping of LLM Observability traces associated with a
+specific LLM-based application. See [ML Application Naming
+Guidelines](https://docs.datadoghq.com/llm_observability/instrumentation/sdk?tab=python#application-naming-guidelines)
+for the constraints on ML Application names.
+
+```shell
+export DD_API_KEY=<YOUR_DD_API_KEY>
+export DD_SITE=<YOUR_DD_SITE>
+export DD_LLMOBS_ENABLED=true
+export DD_LLMOBS_ML_APP=<YOUR_ML_APP_NAME>
+export DD_LLMOBS_AGENTLESS_ENABLED=true
+export DD_APM_TRACING_ENABLED=false  # Only set this if you are not using Datadog APM
+```
+
+Export these variables in your shell before running your application, rather
+than putting them in the agent's `.env` file, so that the `ddtrace-run` command
+below can read them.
+
+### Run Your Application
+
+Once you have configured your environment variables, run your application with
+`ddtrace-run` to start sending traces to Datadog LLM Observability.
+
+```shell
+ddtrace-run adk run my_agent
+```
+
+## Observe
+
+Navigate to the [Datadog LLM Observability Traces
+View](https://app.datadoghq.com/llm/traces) to see the traces generated by your
+application.
+
+![datadog-observability.png](./assets/datadog-observability.png)
+
+## Support and Resources
+- [Datadog LLM Observability](https://www.datadoghq.com/product/llm-observability/)
+- [Datadog Support](https://docs.datadoghq.com/help/)
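Review note on `docs/integrations/datadog.md`: because the page asks readers to export the variables in the shell rather than in `.env`, a small preflight check can catch a forgotten export before `ddtrace-run` starts. The sketch below is only an illustration. `EXPECTED_VARS` and `missing_llmobs_vars` are hypothetical names, not part of `ddtrace` or ADK, and the list simply mirrors the variables shown on the page, leaving out the conditional `DD_APM_TRACING_ENABLED`.

```python
import os

# Variables the new datadog.md page asks readers to export before `ddtrace-run`.
# DD_APM_TRACING_ENABLED is left out because the page marks it as conditional.
EXPECTED_VARS = (
    "DD_API_KEY",
    "DD_SITE",
    "DD_LLMOBS_ENABLED",
    "DD_LLMOBS_ML_APP",
    "DD_LLMOBS_AGENTLESS_ENABLED",
)


def missing_llmobs_vars(env=None):
    """Return the expected variable names that are unset or empty.

    Hypothetical helper for illustration; it is not part of ddtrace or ADK.
    """
    env = os.environ if env is None else env
    return [name for name in EXPECTED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_llmobs_vars()
    if missing:
        print("Export these before running ddtrace-run:", ", ".join(missing))
    else:
        print("Datadog LLM Observability variables are set.")
```

Running this in the same shell that will launch `ddtrace-run adk run my_agent` shows whether the exports are visible to child processes.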
diff --git a/docs/observability/index.md b/docs/observability/index.md
@@ -7,12 +7,11 @@ agents, you may need these features to help debug and diagnose their
 in-process behavior. Basic input and output monitoring is typically
 insufficient for agents with any significant level of complexity.
 
-Agent Development Kit (ADK) provides configurable
-[logging](/observability/logging/)
-functionality for monitoring and debugging agents. However, you may
-need to consider more advanced
-[observability ADK Integrations](/integrations/?topic=observability)
-for monitoring and analysis.
+Agent Development Kit (ADK) provides built-in observability through
+[logging](/observability/logging/), [metrics](/observability/metrics/), and
+[traces](/observability/traces/) to help you monitor and debug your agents.
+However, you may need to consider more advanced [observability ADK
+Integrations](/integrations/?topic=observability) for monitoring and analysis.
 
 !!! tip "ADK Integrations for observability"
     For a list of pre-built observability libraries for ADK, see

diff --git a/docs/observability/logging.md b/docs/observability/logging.md
@@ -61,6 +61,7 @@ The available log levels for the `--log_level` option are:
 | **`ERROR`** | A serious error that prevented an operation from completing. | <ul><li>Failed API calls to external services (e.g., LLM, Session Service).</li><li>Unhandled exceptions during agent execution.</li><li>Configuration errors.</li></ul> |
 
 **Note:** It is recommended to use `INFO` or `WARNING` in production environments. Only enable `DEBUG` when actively troubleshooting an issue, as `DEBUG` logs can be very verbose and may contain sensitive information.
+
 ---
 
 ## Configuring Logging in Go
@@ -91,20 +92,20 @@ import (
 
 func main() {
 	ctx := context.Background()
-	
+
 	// Initialize telemetry with prompt content logging enabled
-	tp, err := telemetry.New(ctx, 
+	tp, err := telemetry.New(ctx,
 		telemetry.WithGenAICaptureMessageContent(true),
 		// Add other options like WithOtelToCloud(true) for GCP export
 	)
 	if err != nil {
 		// handle error
 	}
 	defer tp.Shutdown(ctx)
-	
+
 	// Register as global OTel providers
 	tp.SetGlobalOtelProviders()
-	
+
 	// Your ADK agent code follows...
 }
 ```
@@ -144,15 +145,19 @@ By reading the logger name, you can immediately pinpoint the source of the log a
 
 **Scenario:** Your agent is not producing the expected output, and you suspect the prompt being sent to the LLM is incorrect.
 **Steps:**
+
 1.  **Enable DEBUG Logging:** In your `main.py`, set the logging level to `DEBUG` as shown in the configuration example.
     ```python
     logging.basicConfig(
         level=logging.DEBUG,
         format='%(asctime)s - %(levelname)s - %(name)s - %(message)s'
     )
     ```
+
 2.  **Run Your Agent:** Execute your agent's task as you normally would.
+
 3.  **Inspect the Logs:** Look through the console output for a message from the `google.adk.models.google_llm` logger that starts with `LLM Request:`.
+
     ```log
     ...
     2025-07-10 15:26:13,778 - DEBUG - google_adk.google.adk.models.google_llm - Sending out request, model: gemini-flash-latest, backend: GoogleLLMVariant.GEMINI_API, stream: False
@@ -195,6 +200,7 @@ By reading the logger name, you can immediately pinpoint the source of the log a
     I have rolled a 6 sided die, and the result is 2.
     ...
     ```
+
 4.  **Analyze the Prompt:** By examining the `System Instruction`, `contents`, `functions` sections of the logged request, you can verify:
     -   Is the system instruction correct?
     -   Is the conversation history (`user` and `model` turns) accurate?
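
Review note on `docs/observability/logging.md`: the debugging scenario enables `DEBUG` globally, which also makes every other library verbose. A narrower variant, sketched below with the standard `logging` module, keeps the root logger at `WARNING` and raises only the ADK logger. The `google_adk` prefix is an assumption read off the logger names in the sample output in this hunk.

```python
import logging

# Keep everything else at WARNING so third-party libraries stay quiet.
logging.basicConfig(
    level=logging.WARNING,
    format="%(asctime)s - %(levelname)s - %(name)s - %(message)s",
)

# Raise verbosity only for ADK loggers. The "google_adk" prefix is assumed
# from the logger names shown in the sample log output.
adk_logger = logging.getLogger("google_adk")
adk_logger.setLevel(logging.DEBUG)
```

Child loggers such as `google_adk.google.adk.models.google_llm` inherit the `DEBUG` level, so the request logs still appear.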

diff --git a/docs/observability/metrics.md b/docs/observability/metrics.md
@@ -0,0 +1,93 @@
+# Agent activity metrics
+
+<div class="language-support-tag">
+  <span class="lst-supported">Supported in ADK</span><span class="lst-python">Python v1.32.0</span>
+</div>
+
+Agent Development Kit (ADK) provides built-in, vendor-neutral metrics collection to help you understand the performance, cost, and usage patterns of your agents. While logs provide a detailed narrative of *what* happened, metrics give you aggregated, quantitative data to answer *how often* and *how fast* things are happening.
+
+## Metrics philosophy
+
+ADK's approach to metrics is designed to be lightweight, standardized, and entirely agnostic to your choice of monitoring backend.
+
+*   **OpenTelemetry Semantic Conventions:** ADK implements the OpenTelemetry (OTel) [Semantic Conventions for GenAI](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/gen-ai/gen-ai-metrics.md). This ensures that metrics are recorded under standard, predictable attribute and metric names.
+*   **OTLP Wire Format:** ADK emits data using the standard OTLP format, ensuring that your metrics will seamlessly integrate into any OTel-compatible backend (e.g., Prometheus, Datadog, SigNoz, Google Cloud Monitoring).
+*   **Cost and Performance Focused:** Metrics are significantly less costly and more performant than logs or traces when performing analytics over large swathes of data. ADK tracks the most critical signals for LLM applications: token consumption, request latency, and tool execution reliability.
+*   **Vendor-Neutral Export:** ADK does not lock you into a specific metrics pipeline. You instantiate standard OTel meter providers and export data wherever your infrastructure demands.
+
+---
+
+## Metrics schema
+
+When metrics are enabled, ADK automatically instruments the agent's lifecycle, workflow steps, and tool executions based on the OpenTelemetry GenAI Semantic Conventions. The following core metrics are emitted:
+
+| Metric Name | Type | Description | Key Attributes (Dimensions) |
+| :--- | :--- | :--- | :--- |
+| **`gen_ai.agent.invocation.duration`** | Histogram | The total time taken for an agent to process a prompt and return a response. | `gen_ai.agent.name`, `error.type` |
+| **`gen_ai.tool.execution.duration`** | Histogram | The execution latency of individual tools called by the agent. Useful for spotting slow external APIs. | `gen_ai.tool.name`, `error.type` |
+| **`gen_ai.agent.request.size`** | Histogram | The size or complexity of the incoming request sent to the agent. | `gen_ai.agent.name` |
+| **`gen_ai.agent.response.size`** | Histogram | The size or complexity of the final response generated by the agent. | `gen_ai.agent.name` |
+| **`gen_ai.agent.workflow.steps`** | Histogram | Tracks the number of iterative steps or reasoning loops an agent takes to complete a workflow. | `gen_ai.agent.name` |
+
+---
+
+## Metrics export setup
+
+### Metrics export in ADK Web
+
+If you are running your agent using the `adk web` or `adk api_server` CLI commands, you can configure metrics export through environment variables or a CLI flag.
+
+
+#### OTLP export
+
+To export metrics to an OTLP-compatible backend, set the standard OTel environment variables:
+
+```bash
+export OTEL_EXPORTER_OTLP_METRICS_ENDPOINT="http://your-collector:4318/v1/metrics"
+adk web path/to/your/agents_dir
+```
+
+> **Note:** You can also set the general `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable if you would like to send traces and logs to the same endpoint in addition to metrics.
+
+#### GCP export
+
+To enable metrics export to Google Cloud Monitoring, use the `--otel_to_cloud` flag:
+
+```bash
+adk web --otel_to_cloud path/to/your/agents_dir
+```
+
+### Programmatic metrics export
+
+You can also configure metrics export programmatically in your application code.
+
+#### OTLP export setup
+
+To enable metrics and export them to an OpenTelemetry Collector (or an OTLP-compatible backend) programmatically:
+
+```python
+from google.adk.telemetry.setup import maybe_set_otel_providers
+import os
+
+os.environ["OTEL_EXPORTER_OTLP_METRICS_ENDPOINT"] = "http://your-collector:4318/v1/metrics"
+os.environ["OTEL_SERVICE_NAME"] = "your-adk-agent"
+os.environ["OTEL_RESOURCE_ATTRIBUTES"] = "key1=value1,key2=value2"
+maybe_set_otel_providers()
+```
+
+#### GCP export setup
+
+To export metrics to Google Cloud Monitoring programmatically, use the OpenTelemetry Google Cloud exporter. Here is an example in Python:
+
+```python
+from google.adk.telemetry.google_cloud import get_gcp_exporters
+from google.adk.telemetry.setup import maybe_set_otel_providers
+import os
+
+gcp_exporters = get_gcp_exporters(
+    enable_cloud_metrics=True,
+)
+os.environ["OTEL_SERVICE_NAME"] = "your-adk-agent"
+os.environ["OTEL_RESOURCE_ATTRIBUTES"] = "key1=value1,key2=value2"
+maybe_set_otel_providers([gcp_exporters])
+```
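Review note on `docs/observability/metrics.md`: every metric in the schema table is a Histogram, so a short illustration of what a histogram aggregates may help readers. The sketch below uses only the standard library. It is not ADK or OpenTelemetry code, and the bucket boundaries and sample latencies are invented for the example.

```python
from bisect import bisect_left

# Illustrative bucket upper bounds in seconds. These are assumptions, not the
# boundaries that ADK or OpenTelemetry actually configure.
BOUNDARIES = [0.1, 0.5, 1.0, 5.0]


def bucket_counts(durations, boundaries=BOUNDARIES):
    """Count samples per bucket; each bucket includes its upper bound."""
    counts = [0] * (len(boundaries) + 1)  # the last slot is the overflow bucket
    for value in durations:
        counts[bisect_left(boundaries, value)] += 1
    return counts


# Hypothetical tool latencies, standing in for gen_ai.tool.execution.duration.
samples = [0.05, 0.3, 0.5, 0.7, 2.0, 9.0]
counts = bucket_counts(samples)
print(counts)  # [1, 2, 1, 1, 1]
```

A backend stores only these per-bucket counts rather than each sample, which is why the page can describe metrics as cheaper than logs or traces for analytics over large volumes of data.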
diff --git a/docs/observability/traces.md b/docs/observability/traces.md
@@ -0,0 +1,92 @@
+# Agent activity traces
+
+<div class="language-support-tag">
+  <span class="lst-supported">Supported in ADK</span><span class="lst-python">Python v1.17.0</span><span class="lst-go">Go v1.0.0</span>
+</div>
+
+Agent Development Kit (ADK) provides distributed tracing capabilities to help you visualize the end-to-end journey of a request as it travels through your agent's architecture. While metrics tell you *how long* a process took and logs tell you *what* happened, traces connect these events, showing you exactly *where* the time was spent and the hierarchical relationship between LLM reasoning, tool calls, and external APIs.
+
+## Traces philosophy
+
+ADK's approach to tracing is built on standard protocols to ensure seamless integration with your existing observability stack.
+
+*   **OpenTelemetry Semantic Conventions:** ADK implements the OpenTelemetry (OTel) [Semantic Conventions for GenAI](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/gen-ai/gen-ai-agent-spans.md). This ensures that trace spans and attributes are recorded under standard, predictable names.
+*   **OTLP Wire Format:** ADK emits data using the standard OTLP format, ensuring that your traces will seamlessly integrate into any OTel-compatible backend (e.g., Google Cloud Trace, Jaeger, Grafana Tempo, Datadog).
+*   **Hierarchical Visualization:** Traces are organized into "Spans." An agent run is a root span, which contains child spans for LLM operations, which may in turn contain child spans for tool executions. This creates a clear "waterfall" view of the agent's reasoning loop.
+*   **Context Propagation:** ADK automatically passes trace context across process boundaries, ensuring that if your agent calls an external microservice via a tool, that service's spans are linked to the agent's root trace.
+
+---
+
+## Traces schema
+
+When tracing is enabled, ADK automatically instruments key operations following the OpenTelemetry GenAI Semantic Conventions for Agents. A typical trace waterfall includes the following spans:
+
+| Span Name | Type | Description | Key Attributes |
+| :--- | :--- | :--- | :--- |
+| **[`invoke_agent`](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/gen-ai/gen-ai-agent-spans.md#invoke-agent-client-span)** | Client / Internal Span | Describes GenAI agent invocation over a remote service or locally. Represents the lifecycle of an agent interaction.| `gen_ai.agent.name`, `gen_ai.system` |
+| **[`invoke_workflow`](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/gen-ai/gen-ai-agent-spans.md#invoke-workflow-span)** | Child Span | Describes the invocation of a multi-step agentic workflow. | `gen_ai.workflow.name`, `gen_ai.system`|
+| **[`execute_tool`](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/gen-ai/gen-ai-agent-spans.md#execute-tool-span)**       | Child Span | Represents the execution of a specific tool or function call requested by the GenAI system.| `gen_ai.tool.name`, `gen_ai.system`|
+| **[`generate_content {model.name}`](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/gen-ai/gen-ai-spans.md)** | Internal Span | Represents the invocation of the underlying language model (via the GenAI SDK) to generate content. It tracks the request parameters, response details, and usage metrics. | `gen_ai.operation.name`, `gen_ai.system`, `gen_ai.request.model`, `gen_ai.agent.name`, `gen_ai.conversation.id`, `user.id`, `gen_ai.request.top_p`, `gen_ai.request.max_tokens`, `gen_ai.response.finish_reasons`, `gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens` |
+
+---
+
+## Traces export setup
+
+### Traces export in ADK Web
+
+If you are running your agent using the `adk web` or `adk api_server` CLI commands, you can configure trace export through environment variables or a CLI flag.
+
+#### OTLP export
+
+To export traces to an OTLP-compatible backend, set the standard OTel environment variables:
+
+```bash
+export OTEL_EXPORTER_OTLP_TRACES_ENDPOINT="http://your-collector:4318/v1/traces"
+adk web path/to/your/agents_dir
+```
+
+> **Note:** You can also set the general `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable if you would like to send metrics and logs to the same endpoint in addition to traces.
+
+
+#### GCP export
+
+To enable trace export to Google Cloud Trace, use the `--otel_to_cloud` flag:
+
+```bash
+adk web --otel_to_cloud path/to/your/agents_dir
+```
+
+### Programmatic traces export
+
+You can also configure trace export programmatically in your application code.
+
+#### OTLP export setup
+
+To enable tracing and export spans to an OpenTelemetry Collector programmatically:
+
+```python
+from google.adk.telemetry.setup import maybe_set_otel_providers
+import os
+
+os.environ["OTEL_EXPORTER_OTLP_TRACES_ENDPOINT"] = "http://your-collector:4318/v1/traces"
+os.environ["OTEL_SERVICE_NAME"] = "your-adk-agent"
+os.environ["OTEL_RESOURCE_ATTRIBUTES"] = "key1=value1,key2=value2"
+maybe_set_otel_providers()
+```
+
+#### GCP export setup
+
+To export traces to Google Cloud Trace programmatically, use the OpenTelemetry Google Cloud exporter. Here is an example in Python:
+
+```python
+from google.adk.telemetry.google_cloud import get_gcp_exporters
+from google.adk.telemetry.setup import maybe_set_otel_providers
+import os
+
+gcp_exporters = get_gcp_exporters(
+    enable_cloud_tracing=True,
+)
+os.environ["OTEL_SERVICE_NAME"] = "your-adk-agent"
+os.environ["OTEL_RESOURCE_ATTRIBUTES"] = "key1=value1,key2=value2"
+maybe_set_otel_providers([gcp_exporters])
+```
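Review note on `docs/observability/traces.md`: the "Context Propagation" bullet does not say which header format carries the trace context. Assuming the W3C Trace Context format, which is OpenTelemetry's default propagator (the PR itself does not confirm this for ADK), the `traceparent` header can be sketched as below. `make_traceparent` and `parse_traceparent` are hypothetical helpers for illustration, not ADK or OpenTelemetry APIs.

```python
import re
import secrets

# W3C Trace Context, version 00: version-traceid-parentid-flags, lowercase hex.
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def make_traceparent(trace_id=None, span_id=None, sampled=True):
    """Build a version-00 traceparent header value (hypothetical helper)."""
    trace_id = trace_id or secrets.token_hex(16)  # 16 bytes -> 32 hex chars
    span_id = span_id or secrets.token_hex(8)  # 8 bytes -> 16 hex chars
    return f"00-{trace_id}-{span_id}-{'01' if sampled else '00'}"


def parse_traceparent(header):
    """Return the header's fields as a dict, or None if it is malformed."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    # The specification treats all-zero IDs as invalid.
    if set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    return {
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 1),
    }


header = make_traceparent()
fields = parse_traceparent(header)
```

A downstream service that receives this header starts its spans with the same `trace_id`, which is what links them to the agent's root trace.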