
Possibility to override traceId in datadog receiver #41120

@hlib-kuznetsov

Description

Component(s)

receiver/datadog

Is your feature request related to a problem? Please describe.

Hi!
As far as I can see, restoring the full 128-bit trace ID currently requires caching root metadata (#39654).
My concern is that with a really long-running script, if we receive more than trace_id_cache_size unique trace IDs while it runs, the receiver will no longer be able to restore the full trace ID and we will lose the link to the parent span.

But here is the trick: at least in PHP it is possible to add the tag manually:

$correlationId = \DDTrace\logs_correlation_trace_id();
$correlationIdHigh = substr($correlationId, 0, 16);

\DDTrace\add_global_tag('_dd.p.tid', $correlationIdHigh);
\DDTrace\add_global_tag('otel.trace_id', $correlationId);

\DDTrace\add_distributed_tag('otel.trace_id', $correlationId);
\DDTrace\add_distributed_tag('otel.tid.high', $correlationIdHigh);

From my observations, add_global_tag adds the tag to all child spans, but not to the first span (the one whose _dd.p.tid is set by the extension):

\DDTrace\add_global_tag('_dd.p.tid', $correlationIdHigh);
\DDTrace\add_global_tag('otel.trace_id', $correlationId);

Debug output of a child span with _dd.p.tid set via \DDTrace\add_global_tag('_dd.p.tid', $correlationIdHigh);:

Here's the log
otel-collector-1  | Span #36
otel-collector-1  |     Trace ID       : 8c10b1ae722c274628cf9fe2f4235627
otel-collector-1  |     Parent ID      : 18f5c3e88f648208
otel-collector-1  |     ID             : 788771bdf03c631e
otel-collector-1  |     Name           : Redis.connect
otel-collector-1  |     Kind           : Client
otel-collector-1  |     Start time     : 2025-07-06 10:18:25.187458707 +0000 UTC
otel-collector-1  |     End time       : 2025-07-06 10:18:25.187839441 +0000 UTC
otel-collector-1  |     Status code    : Ok
otel-collector-1  |     Status message :
otel-collector-1  | Attributes:
otel-collector-1  |      -> dd.span.Resource: Str(Redis.connect)
otel-collector-1  |      -> datadog.span.id: Str(8685035467000537886)
otel-collector-1  |      -> datadog.trace.id: Str(2940744878803605031)
otel-collector-1  |      -> component: Str(phpredis)
otel-collector-1  |      -> db.system: Str(redis)
otel-collector-1  |      -> _dd.base_service: Str(symfony)
otel-collector-1  |      -> _dd.p.tid: Str(8c10b1ae722c2746) # not added by the Datadog extension; this is the one from `add_global_tag`
otel-collector-1  |      -> otel.trace_id: Str(8c10b1ae722c274628cf9fe2f4235627) # and this one is from `add_global_tag` too
otel-collector-1  |      -> out.host: Str(172.17.0.1)
otel-collector-1  |      -> out.port: Str(49028)
otel-collector-1  |      -> span.kind: Str(client)

And add_distributed_tag adds the tag, prefixed with _dd.p., to the first span (the one with _dd.p.tid):

Here's the log
otel-collector-1  | ScopeSpans #0
otel-collector-1  | ScopeSpans SchemaURL:
otel-collector-1  | InstrumentationScope Datadog 1.10.0
otel-collector-1  | Span #0
otel-collector-1  |     Trace ID       : 8c10b1ae722c274628cf9fe2f4235627
otel-collector-1  |     Parent ID      : e877e62adfe40b9b
otel-collector-1  |     ID             : 9a9d0e22fdc2bca8
otel-collector-1  |     Name           : symfony.request
otel-collector-1  |     Kind           : Server
otel-collector-1  |     Start time     : 2025-07-06 10:18:25.149190661 +0000 UTC
otel-collector-1  |     End time       : 2025-07-06 10:18:43.034958011 +0000 UTC
otel-collector-1  |     Status code    : Ok
otel-collector-1  |     Status message :
otel-collector-1  | Attributes:
otel-collector-1  |      -> dd.span.Resource: Str(event-stream/subscribe)
otel-collector-1  |      -> sampling.priority: Str(1.000000)
otel-collector-1  |      -> datadog.span.id: Str(11141076596633549992)
otel-collector-1  |      -> datadog.trace.id: Str(2940744878803605031)
otel-collector-1  |      -> _dd.p.otel.trace_id: Str(8c10b1ae722c274628cf9fe2f4235627) # this one comes from `add_distributed_tag`
otel-collector-1  |      -> span.kind: Str(server)
otel-collector-1  |      -> _dd.p.dm: Str(-0)
otel-collector-1  |      -> _dd.p.otel.tid.high: Str(8c10b1ae722c2746) # this one also comes from `add_distributed_tag`
otel-collector-1  |      -> _dd.p.tid: Str(8c10b1ae722c2746) # set by the application; cannot be overridden by `add_distributed_tag`
otel-collector-1  |      -> runtime-id: Str(92e3eb83-f08e-43f7-a24a-0a52d5124e7f)
otel-collector-1  |      -> user_agent.original: Str(Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:140.0) Gecko/20100101 Firefox/140.0)
otel-collector-1  |      -> component: Str(symfony)
otel-collector-1  |      -> symfony.route.action: Str(App\Controller\EventStreamController@subscribe)
otel-collector-1  |      -> http.response.status_code: Str(200)
otel-collector-1  |      -> _dd.parent_id: Str(0000000000000000)
otel-collector-1  |      -> http.request.method: Str(GET)
otel-collector-1  |      -> symfony.route.name: Str(event-stream/subscribe)
otel-collector-1  |      -> process.pid: Double(84)
otel-collector-1  |      -> _sampling_priority_v1: Double(1)
otel-collector-1  |      -> php.compilation.total_time_ms: Double(1673.82)
otel-collector-1  |      -> php.memory.peak_usage_bytes: Double(22235784)
otel-collector-1  |      -> php.memory.peak_real_usage_bytes: Double(23072768)

In other words, if I got it right, with such an addition the receiver should be able to restore the full trace ID even when there is no full trace ID in the cache:

if val, ok := traceIDCache.Get(span.TraceID); ok {
	return val, nil
} else if tid, ok := span.Meta["_dd.p.tid"]; ok {
	// fall back to the high 64 bits carried in the tag instead of the cache
}

But there are two problems:

  • There is a bug with concurrent access to the LRU cache (it seems to be the one from #40557, "datadog receiver causing collector to panic/crash with invalid memory address or nil pointer dereference"):

    See the log
    otel-collector-1  | fatal error: concurrent map read and map write
    otel-collector-1  |
    otel-collector-1  | goroutine 18852 [running]:
    otel-collector-1  | internal/runtime/maps.fatal({0xc17b2da?, 0x79c87ac3b108?})
    otel-collector-1  |     runtime/panic.go:1058 +0x18
    otel-collector-1  | github.com/hashicorp/golang-lru/v2/simplelru.(*LRU[...]).Get(0xd5660c0, 0xbe259c0)
    otel-collector-1  |     github.com/hashicorp/golang-lru/v2@v2.0.7/simplelru/lru.go:72 +0x35
    otel-collector-1  | github.com/open-telemetry/opentelemetry-collector-contrib/receiver/datadogreceiver/internal/translator.traceID64to128(0xc0018602a0, 0xc000df3760)
    otel-collector-1  |     github.com/open-telemetry/opentelemetry-collector-contrib/receiver/datadogreceiver@v0.129.0/internal/translator/traces_translator.go:88 +0x3e
    otel-collector-1  | github.com/open-telemetry/opentelemetry-collector-contrib/receiver/datadogreceiver/internal/translator.ToTraces(0xc000bca680, 0xc001c42f70, 0xc0013752c0, 0xc000df3760)
    otel-collector-1  |     github.com/open-telemetry/opentelemetry-collector-contrib/receiver/datadogreceiver@v0.129.0/internal/translator/traces_translator.go:198 +0xfda
    otel-collector-1  | github.com/open-telemetry/opentelemetry-collector-contrib/receiver/datadogreceiver.(*datadogReceiver).handleTraces(0xc000c930e0, {0xd44e7e0, 0xc00152d980}, 0xc0013752c0)
    otel-collector-1  |     github.com/open-telemetry/opentelemetry-collector-contrib/receiver/datadogreceiver@v0.129.0/receiver.go:285 +0x398
    otel-collector-1  | net/http.HandlerFunc.ServeHTTP(0xc000e75ec0?, {0xd44e7e0?, 0xc00152d980?}, 0x0?)
    otel-collector-1  |     net/http/server.go:2294 +0x29
    
  • It looks a bit like a dirty trick, and I can't be sure the implementation won't change in the future and that this will keep working.

Describe the solution you'd like

It would be nice to have an option to pick the trace ID from a custom tag, like:

otel-collector-1  | Span #36
otel-collector-1  |     Trace ID       : 8c10b1ae722c274628cf9fe2f4235627
otel-collector-1  |     Parent ID      : 18f5c3e88f648208
otel-collector-1  |     ID             : 788771bdf03c631e
otel-collector-1  |     Name           : Redis.connect
otel-collector-1  |     Kind           : Client
otel-collector-1  | Attributes:
otel-collector-1  |      -> otel.trace_id: Str(8c10b1ae722c274628cf9fe2f4235627) # Set in app by user, like with `add_global_tag` or any other way.

Config to pick the trace ID from otel.trace_id:

receivers:
  datadog:
    full_trace_id_tag: otel.trace_id

This would make it possible to pick the full 128-bit trace ID (or override it if we want) without relying on the cache.

Describe alternatives you've considered

No response

Additional context

No response
