|
| 1 | +# OTLP over messaging systems |
| 2 | + |
| 3 | +Best practices for transporting OTLP data over a messaging system. |
| 4 | + |
| 5 | +## Motivation |
| 6 | + |
| 7 | +This proposal aims to bring consistency and guidelines to transporting |
| 8 | +OTLP data over messaging systems. A non-exhaustive list of products |
| 9 | +in this category: |
| 10 | + |
| 11 | +* Apache Kafka |
| 12 | +* Apache Pulsar |
| 13 | +* Google Pub/Sub |
| 14 | +* AWS SNS |
| 15 | +* RabbitMQ |
| 16 | + |
| 17 | +In some cases an intermediate messaging system is preferred over |
| 18 | +a direct network connection between the telemetry producer and consumer. |
| 19 | +Reasons for using such an intermediate medium include: |
| 20 | + |
| 21 | +* Protecting against backend failure |
| 22 | +* Security policies |
| 23 | +* Network policies |
| 24 | +* Buffering |
| 25 | + |
| 26 | +A further motivation for a consistent payload definition is that |
| 27 | +OTLP data can be moved from one messaging system to |
| 28 | +another simply by reading the payload from one system and writing it to |
| 29 | +the other, without any transformation. |
| 30 | + |
| 31 | +## Explanation |
| 32 | + |
| 33 | +Because the OTLP payload that is sent over messaging systems is |
| 34 | +well-defined, it’s easy to implement new systems consistently. |
| 35 | +An implementation should at least support `otlp_proto`, meaning that the |
| 36 | +payload is Protobuf serialized from `ExportTraceServiceRequest` for traces, |
| 37 | +`ExportMetricsServiceRequest` for metrics, or `ExportLogsServiceRequest` for |
| 38 | +logs. |
| 39 | + |
| 40 | +Optionally, an implementation could support `otlp_json` or other alternative |
| 41 | +encodings such as Jaeger or Zipkin payloads. Alternative encodings should be selected |
| 42 | +through an `encoding` field in the configuration file. |
| 43 | + |
| 44 | +Users benefit when both an exporter and a receiver are |
| 45 | +implemented in the OpenTelemetry Collector, but SDK developers could |
| 46 | +also implement a language-specific exporter. |
| 47 | + |
| 48 | +For log data, implementors of a receiver are encouraged to also support |
| 49 | +receiving raw data from a topic. This enables scenarios where |
| 50 | +non-OpenTelemetry log producers publish a stream of logs, either |
| 51 | +structured or unstructured, to that topic and the receiver wraps the log |
| 52 | +data in a configurable resource. |
| 53 | + |
| 54 | +## Internal details |
| 55 | + |
| 56 | +The default implementation must implement `otlp_proto`, meaning that the |
| 57 | +payload is Protobuf serialized from `ExportTraceServiceRequest` for traces, |
| 58 | +`ExportMetricsServiceRequest` for metrics, or `ExportLogsServiceRequest` for |
| 59 | +logs. If an implementation supports other encodings, an `encoding` field should |
| 60 | +be added to the configuration to make the encoding switchable. |
| 61 | + |
| 62 | +| value | trace | metric | log | description | |
| 63 | +|---|---|---|---|---| |
| 64 | +| `otlp_proto` (default) | X | X | X | Protobuf serialized from the `Export(Trace/Metrics/Logs)ServiceRequest` | |
| 65 | +| `otlp_json` | X | X | X | proto3 JSON representation of an `Export(Trace/Metrics/Logs)ServiceRequest` | |
| 66 | +| `jaeger_proto` | X | - | - | the payload is deserialized to a single Jaeger proto `Span` | |
| 67 | +| `jaeger_json` | X | - | - | the payload is deserialized to a single Jaeger JSON Span using `jsonpb` | |
| 68 | +| `zipkin_proto` | X | - | - | the payload is deserialized into a list of Zipkin proto spans | |
| 69 | +| `zipkin_json` | X | - | - | the payload is deserialized into a list of Zipkin V2 JSON spans | |
| 70 | +| `zipkin_thrift` | X | - | - | the payload is deserialized into a list of Zipkin Thrift spans | |
| 71 | +| `raw_string` | - | - | X | see `Log specific collector receiver` | |
| 72 | +| `raw_binary` | - | - | X | see `Log specific collector receiver` | |
| 73 | + |
| 74 | +The table above is a non-exhaustive list of possible encodings. Only `otlp_proto` |
| 75 | +is mandatory, and it is the default. |
| 76 | + |
| 77 | +### Log specific collector receiver |
| 78 | + |
| 79 | +Because flow control is the hardest part of an implementation, adding a feature |
| 80 | +for reading raw log events avoids the need for an additional parallel |
| 81 | +implementation to do just that. |
| 82 | + |
| 83 | +The `encoding` field should be used to indicate that the data received is not the |
| 84 | +default `otlp_proto` data, but raw log data. The receiver should construct a valid |
| 85 | +OTLP message for each raw message received. Valid encodings are `raw_string` |
| 86 | +and `raw_binary`, which control the type the data has when set |
| 87 | +in the `body` of the OTLP message. |
| 88 | + |
| 89 | +As raw log data has no resource attached to it, the receiver should add |
| 90 | +a generic resource and instrumentation library message around the raw messages. The |
| 91 | +instrumentation library name should be set to the name of the receiver and the version to |
| 92 | +that of the collector. Defining the exact resource should be done in the pipeline, |
| 93 | +using for example the `resourceprocessor`. |
| 94 | + |
| 95 | +## Prior art and alternatives |
| 96 | + |
| 97 | +The `kafkareceiver` and `kafkaexporter` in the OpenTelemetry Collector |
| 98 | +already implement OTLP transport as described here. The description in this OTEP |
| 99 | +makes both of them compliant without modification. They do not yet |
| 100 | +implement the raw logging support, but this could be added |
| 101 | +without conflicting with the existing implementation. |
| 102 | + |
| 103 | +## Open questions |
| 104 | + |
| 105 | +This proposal doesn't take advantage of the message attributes that some systems |
| 106 | +support. Should it? Would it be useful to look at the CloudEvents |
| 107 | +spec, to leverage its conventions for attributes? |
| 108 | + |
| 109 | +No ordering guarantee can be given, because not all systems support ordering. |
| 110 | +Is that a problem? |
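
To illustrate the raw log wrapping described under "Log specific collector receiver", here is a minimal Python sketch of how a receiver might build one OTLP logs message per raw message. It is not taken from any existing receiver. It assumes the OTLP JSON field names `resourceLogs`, `scopeLogs`, and `logRecords` (current OTLP uses "scope" for what the text calls the instrumentation library), and `RECEIVER_NAME`, `COLLECTOR_VERSION`, and `wrap_raw_log` are hypothetical names chosen for the example.

```python
import base64

# Hypothetical values; a real receiver would use its own name and the
# collector's build version.
RECEIVER_NAME = "examplereceiver"
COLLECTOR_VERSION = "0.0.0"


def wrap_raw_log(payload: bytes, encoding: str) -> dict:
    """Wrap one raw message in a dict shaped like an OTLP JSON logs request."""
    if encoding == "raw_string":
        # The body becomes a string value; invalid UTF-8 is replaced, not rejected.
        body = {"stringValue": payload.decode("utf-8", errors="replace")}
    elif encoding == "raw_binary":
        # proto3 JSON represents bytes fields as base64 text.
        body = {"bytesValue": base64.b64encode(payload).decode("ascii")}
    else:
        raise ValueError(f"unsupported raw encoding: {encoding!r}")

    return {
        "resourceLogs": [
            {
                # Generic, empty resource; the pipeline is expected to fill it in.
                "resource": {"attributes": []},
                "scopeLogs": [
                    {
                        "scope": {
                            "name": RECEIVER_NAME,
                            "version": COLLECTOR_VERSION,
                        },
                        "logRecords": [{"body": body}],
                    }
                ],
            }
        ]
    }
```

Calling `wrap_raw_log(b"hello", "raw_string")` yields a request with a single log record whose body is the string `hello`. The resource is left empty on purpose, since the text suggests defining the exact resource later in the pipeline.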