Skip to content

CAMEL-15260 - opentracing: fixes header extraction regression#4044

Merged
oscerd merged 4 commits into
apache:masterfrom
jam01:opentracing-fix
Jul 29, 2020
Merged

CAMEL-15260 - opentracing: fixes header extraction regression#4044
oscerd merged 4 commits into
apache:masterfrom
jam01:opentracing-fix

Conversation

@jam01
Copy link
Copy Markdown
Contributor

@jam01 jam01 commented Jul 24, 2020

This fixes a regression introduced with #4001 spotted by @valdar here #4001 (comment)

This change makes the tracer fall back to camel's header extract adapter when a span is not found for the exchange. That's a span that would be there when utilizing instrumentation. So this would work with instrumentation, or without as it worked before the regression. It also introduces an integration test to prevent issues in the future.

@valdar let me know if this satisfies the issue for you and I'll mark the PR as ready.

Copy link
Copy Markdown
Contributor

@oscerd oscerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eventually this needs to go on 3.4.x too, I guess

@jam01
Copy link
Copy Markdown
Contributor Author

jam01 commented Jul 24, 2020

What's the process @oscerd ? Do I point to master and then you'd merge down into 3.4, or should I point to 3.4.x?

@oscerd
Copy link
Copy Markdown
Contributor

oscerd commented Jul 24, 2020

You can open a pr against 3.4.x too, or if you don't have time, I'll backport myself

@valdar
Copy link
Copy Markdown
Member

valdar commented Jul 24, 2020

Hi @jam01 thanks a lot for the effort and the quick response. I was reasoning about your assumptions/conditions:

  1. Camel should only attempt to read from the incoming headers when there's not already an active span for that exchange. This is in case instrumentation already extracted the parent from the headers. Activating the span for the exchange in instrumentation is the part I'm still working on.
    
  2. Only read headers for components that are not internal (seda, direct, vm, etc).
    
  3. Only add the Server tag when.... the component is not internal and no span is active for the exchange. Whether we found a parent in the headers or not.
    

While I understand the principle behind, is the condition number 2 really necessary? I have a real use-case coming from a camel subproject: camel-kafka-connectr where Kafka Connect api and camel are interfaced using exactly direct endpoints as the route starter. So the uber-trace-id is coming from the kafka message consumed by Kafka connect api and passed to camel-kafka-connector. Then camel-kafka-connector crafts an exchange coping the uber-trace-id in the proper header and finally sending it to the direct endpoint using a producerTemplate.

At the end of the day if the exchange is initiated in another route it should already have an active span right?

There would be another way to handle this and still comply to your conditions?

@jam01
Copy link
Copy Markdown
Contributor Author

jam01 commented Jul 24, 2020

Ah I see. So in this case you want the kafka-connector to just fwd over the headers. Acting sort of as a non-trace-participant proxy. You don't care about capturing kafka specific tags like in KafkaSpanDecorator. Your trace would be something like:

[--------------------- trace --------------------------------------]
[-- kafka outbound span --] possibly kafka outbound tags
                                       [--- camel:direct -------] server tag, camel-direct decorator tags

instead of

[--------------------- trace --------------------------------------]
[-- kafka outbound span --] possibly kafka outbound tags
                            [--- camel-kafka inbound span---------] server tag, camel-kafka decorator tags
                                      [--- camel:direct --------] camel-direct decorator tags

I hadn't thought of a scenario like this. I thought of it as an user error to receive an out-of-process request and not:

  • Extract the parent context
  • Create a new Span as its child
  • Call ActiveSpanManager#activate before using a producer termplate.

but that doesn't take into consideration libraries like kafka-connector 😲 where you may not be ready to depend on a Tracing API yet. I also didn't know there were camel "components" that were not Endpoints so I didn't consider that. What's the difference between camel-kafka and camel-kafka-connector?

So that said, I think I'll go ahead and allow internal endpoints to also extract from the headers. Perhaps it's worth noting in the docs that for application developers it is an error not to do the process above when using producer templates.

A little unrelated... Is the connector looking for uber-trace-id specifically? If so it wouldn't support tracers different than Jaeger when using default context propagation. I think the only way to do that "correctly" without relying on opentracing or opentelemetry is to use the standardized W3C header names (proxying like this is a driving force behind the standard), or make the headers to fwd configurable so library users could use any tracer implementation.

@valdar
Copy link
Copy Markdown
Member

valdar commented Jul 26, 2020

@jam01 I will try to answer:

camel-kafka is the camel component that connects to a Kafka broker. camel-kafka-connector is a camel subproject that makes easy to use camel components as Kafka Connect connectors.

The connector is not looking for "uber-tracer-id" it is using the tracer mechanism trough the .extract() method. This has a specific implementation for each tracer, the Jaeger tracer looks for that header, others might look for different ones.

I understand that I can actually do the steps you mentioned BEFORE using the producer template, I was trying to reason about what we get and loose by doing them regardless. I see 3 possible scenarios:

  1. the exchange coming to an internal endpoint has already an active span/context registered in the exchange property (i.e. it is coming from an internal camel routing call and somewhere else during the camel routing a tracing span/context has been activated)

  2. the exchange coming in has no span activated but the tracer.extract() return a context to use (i.e. someone has used a producer template setting the right headers for this to happen)

  3. the exchange coming has no span activated and tracer.extract() does not return a tracing context (i.e. someone has used a producer template without passing a tracing context)

1 is what you handle right now, if we are not in this case then we can fall back to 2 and then 3. In the case of 3, we can either start a new trace or ignore for tracing (I would vote for starting a new trace).

I might miss something but I don't see what we lose or have not considered in acting this way.

@jam01
Copy link
Copy Markdown
Contributor Author

jam01 commented Jul 26, 2020

Gotcha. Thanks for the explanation. With my last commit we'd support 1 -> 2 -> 3 :)

So it'd support whichever approach you decide to take as far as activating the exchage property before executing the producer template or not.

What you gain when activating it is that you can add the metadata that won't be available to camel since there's no Kakfa endpoint/kafka decorator involved. Also an extra kafka-connector Span which may or may not be critical if the connector doesn't take long before fwding to camel.

What you lose... yeah if you already depend on opentracing to extract the context I can't think of anything you'd lose from activating a Span. Unless you wanted to hide kafka-connector from traces.

@valdar
Copy link
Copy Markdown
Member

valdar commented Jul 26, 2020

@jam01 Let's forget for a minute about my use-case and reason more generally about tracing support in camel. Do you see any problems or missing cases in taking this approach?

@jam01
Copy link
Copy Markdown
Contributor Author

jam01 commented Jul 26, 2020

Hmm you know I think there is one minor change I'd like to propose/reintroduce. In scenario 3, do create a Span but only add a server tag if it's not an internal endpoint.

With scenario 2 someone was deliberate enough to fwd the headers, so it feels ok to assume they want camel to appear as the inter process handler.

With scenario 3, we can't make the same assumption. I imagine a scenario where an application dev is using any non-camel http server, has a server Span created but forgets to activate it on the exchange before delegating to camel for a response. We'll end up with 2 independent traces regardless, but if we add a server tag then we have 2 server spans for what's really a single service execution.

@jam01
Copy link
Copy Markdown
Contributor Author

jam01 commented Jul 26, 2020

A TracedProducerTemplate would be good. It could grab whatever is the active Span and activate it on the exchange right before executing.

@jam01
Copy link
Copy Markdown
Contributor Author

jam01 commented Jul 26, 2020

Oh more appropriate use cases for skipping the server tag with internal endpoints:

  • Quartz or other scheduled job that delegates to camel
  • Any application with an in-process interface that delegates to camel ie. a desktop app, command line app

@jam01
Copy link
Copy Markdown
Contributor Author

jam01 commented Jul 28, 2020

@valdar what do you think? I'll mark the PR as ready, unless you still have reservations.

@jam01 jam01 marked this pull request as ready for review July 28, 2020 15:28
@jam01
Copy link
Copy Markdown
Contributor Author

jam01 commented Jul 29, 2020

@rubenvp8510 so this became a little more complicated to merge. I think in order to have the functionality be the same between camel-otel and camel-otracing in the way that camel-tracing handles things we'll need to introduce a SpanContextAdapter which may not be trivial. I haven't looked at how otel implements theirs...

Or just have each implementation do the same operations on their own, should be a minor duplication.

cc: @oscerd

@oscerd
Copy link
Copy Markdown
Contributor

oscerd commented Jul 29, 2020

A minor code duplication is fine

@jam01
Copy link
Copy Markdown
Contributor Author

jam01 commented Jul 29, 2020

OK, I've rebased and setup opentracing to extract the headers correctly, however I can't seem to figure out how to extract an otel SpanContext from the headers...

Following what OpenTelemetryTracer#inject does I tried to do

OpenTelemetry.getPropagators().getHttpTextFormat().extract(Context.current(), sd.getExtractAdapter(exchange.getIn().getHeaders(), encoding), new OpenTelemetryGetter());

but that returns a Context and Span.Builder#setParent(io.opentelemetry.trace.SpanContext) visibly expects a SpanContext

I'll dig down a little more and see if I can figure this out... but not finding a way so far.

@jam01
Copy link
Copy Markdown
Contributor Author

jam01 commented Jul 29, 2020

Figured it out... should be good to go! All opentelemetry and opentracing tests are passing

Copy link
Copy Markdown
Contributor

@oscerd oscerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, lets wait for @valdar

Copy link
Copy Markdown
Member

@valdar valdar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@oscerd
Copy link
Copy Markdown
Contributor

oscerd commented Jul 29, 2020

I guess this will need to be backports to 3.4.x too for the regression, but we don't have camel-tracing common module there @jam01

@jam01
Copy link
Copy Markdown
Contributor Author

jam01 commented Jul 29, 2020

I have the original work saved up. If/when you merge this one I'll open a PR to 3.4.x

@oscerd oscerd merged commit 22d4527 into apache:master Jul 29, 2020
@oscerd
Copy link
Copy Markdown
Contributor

oscerd commented Jul 29, 2020

Merged @jam01

@jam01
Copy link
Copy Markdown
Contributor Author

jam01 commented Jul 29, 2020

OK I'm on it ...

@jam01
Copy link
Copy Markdown
Contributor Author

jam01 commented Jul 29, 2020

oh it seems we're also missing #3919 and #4001 so the functionality in 3.4.x is as it was originally. Would you like me to send a PR that includes all changes?

@oscerd
Copy link
Copy Markdown
Contributor

oscerd commented Jul 29, 2020

If the regression reported from @valdar apply in 3.4.x I believe we need to backport whatever is needed to fix it, but if it's a complex situation or the behavior will change completely, then I don't think it's a good idea, we need to fix the specific case of the regression. @davsclaus FYI

@jam01
Copy link
Copy Markdown
Contributor Author

jam01 commented Jul 29, 2020

It does not apply, the regression never made it to 3.4.x. Just the behavior in 3.4.x will be different than 3.5

@oscerd
Copy link
Copy Markdown
Contributor

oscerd commented Jul 29, 2020

Sorry, I'm having some confusion, from what I understood @valdar had some trouble with 3.4.x, so I thought it was related

@jam01
Copy link
Copy Markdown
Contributor Author

jam01 commented Jul 29, 2020

Oh you're right... I was under the same impression. But my changes never made it to https://github.com/apache/camel/commits/camel-3.4.x/components/camel-opentracing/src/main/java/org/apache/camel/opentracing/OpenTracingTracer.java I guess valdar just noticed the problem when building against 3.5 or just happened to notice the PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants