Add observability for script execution

## User Story

**As a** cluster operator, **I want** logging and metrics for script execution **so that** I can monitor and diagnose issues in production.

## Background

The code mode feature (STORY-001) adds server-side Starlark script execution to vMCP via the `execute_tool_script` virtual tool. Operators need visibility into this new execution path to monitor health, diagnose failures, and understand resource consumption. Without observability, script execution is a black box that operators cannot troubleshoot or capacity-plan around.

The vMCP codebase already has a well-established telemetry pattern using OpenTelemetry (see `pkg/vmcp/server/telemetry.go` and `pkg/telemetry/middleware.go`). This story follows those existing patterns to add structured logging and OTel metrics/traces for the script execution engine.

## Scope

### In Scope

- Structured logging for script lifecycle events (start, completion, errors, timeouts, limit violations)
- OTel metrics for script execution (count, duration, error rate, tool calls per script, parallel fan-out)
- OTel tracing spans for script execution and inner tool calls
- Follow existing vMCP telemetry patterns (`telemetryBackendClient` decorator pattern, `MCPHistogramBuckets`, `instrumentationName` conventions)

### Out of Scope

- Custom dashboards or alerting rules (operators bring their own observability stack)
- Metrics for adoption tracking (covered in STORY-003)
- Changes to the telemetry pipeline or collector configuration

## Acceptance Criteria

- [ ] unit: Structured logs are emitted for script start, completion, and errors. Log entries include script hash, session ID, execution duration, tool call count, and error details; script content is NOT logged
- [ ] unit: OTel counter `toolhive_vmcp_script_executions` increments with `status` attribute (`success`/`error`/`timeout`/`step_limit`) on each script execution
- [ ] unit: OTel histogram `toolhive_vmcp_script_duration` records script execution duration in seconds
- [ ] unit: OTel histogram `toolhive_vmcp_script_tool_calls` records the number of inner tool calls per script execution
- [ ] unit: OTel counter `toolhive_vmcp_script_parallel_goroutines` increments by the number of goroutines spawned per `parallel()` call
- [ ] unit: A parent trace span is created for `execute_tool_script` carrying the attributes `script.tool_count`, `script.parallel_used`, and `script.step_count`, with one child span per inner tool call
- [ ] unit: Metrics and traces use the existing `instrumentationName` constant and `MeterProvider`/`TracerProvider` from the vMCP server

## Technical Notes

### Existing Patterns to Follow

The `telemetryBackendClient` in `pkg/vmcp/server/telemetry.go` demonstrates the project's telemetry decorator pattern:
- Metrics are created via `meter.Int64Counter()`, `meter.Float64Histogram()` with descriptive names prefixed by `toolhive_vmcp_`
- Histograms use `telemetry.MCPHistogramBuckets` for bucket boundaries
- The `record()` method pattern creates a span, records start metrics, and returns a deferred cleanup function
- Attributes follow both backward-compat (`tool_name`) and OTel spec (`gen_ai.tool.name`) conventions

### Implementation Approach

1. **Metrics/tracing injection**: Accept `metric.MeterProvider` and `trace.TracerProvider` in the script engine or middleware constructor. Do not create global meters.
2. **Decorator or inline**: Either wrap the script engine with a telemetry decorator (preferred, matches existing pattern) or instrument the engine directly. The decorator approach keeps telemetry concerns separated from execution logic.
3. **Logging**: Use `slog.With()` to attach structured fields. The middleware already has access to the request context for correlation.
4. **Inner tool call spans**: Each tool call dispatched from within a script should create a child span under the script execution span. The existing `telemetryBackendClient.CallTool` may already handle this if inner calls flow through the instrumented backend client.

### Key Files

- `pkg/script/engine.go` -- Starlark execution engine (instrument here or wrap)
- `pkg/script/bridge.go` -- Tool bridge and `parallel()` (goroutine counting)
- `pkg/script/middleware.go` -- HTTP middleware entry point (logging, top-level span)
- `pkg/vmcp/server/telemetry.go` -- Existing telemetry patterns to follow
- `pkg/telemetry/middleware.go` -- `MCPHistogramBuckets` and shared telemetry utilities

### Dependencies

- Depends on STORY-001 (the script execution engine must exist before it can be instrumented)
- Uses `go.opentelemetry.io/otel/metric` and `go.opentelemetry.io/otel/trace` (already in go.mod)

## References

- Prototype PR: https://github.com/stacklok/toolhive/pull/4714 (branch: `jerm-dro/script-middleware-prototype`)
- Existing telemetry: `pkg/vmcp/server/telemetry.go`

