Reduce GC pressure in CompositeJsonEncoder and *LogstashTcpSocketAppender #461

@brenuart

CompositeJsonEncoder implements the Encoder interface and therefore must return the encoded event as a byte array.
The implementation makes use of an intermediate ByteArrayOutputStream to collect the various parts produced by the formatter and prefix/suffix encoders. When done, the result is returned as a byte array.

A new ByteArrayOutputStream is initialised for every log event. By default it starts with an initial size of about 1 KB (plus the prefix/suffix length) and grows if the formatter produces a larger output. If we are lucky and the initial size is large enough, this process costs two byte array allocations and two memory copies. If the buffer needs to grow, a new, larger one is allocated and the content of the previous one is copied into it, so we end up with three allocations and three copy operations.

This process is repeated for every event and imposes an extra overhead on the garbage collector.
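To make the cost concrete, here is a minimal sketch of the buffering pattern described above. It is not the actual CompositeJsonEncoder code: the class name `BufferedEncodeSketch`, the `format` helper and the `String` event are hypothetical stand-ins, and only the `ByteArrayOutputStream` calls reflect the real JDK API.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class BufferedEncodeSketch {

    // Hypothetical stand-in for the formatter: writes a tiny JSON payload.
    static void format(String message, OutputStream out) throws IOException {
        out.write(("{\"message\":\"" + message + "\"}").getBytes(StandardCharsets.UTF_8));
    }

    // Mirrors the byte[]-returning style: collect into a buffer, then copy out.
    public static byte[] encode(String message, int initialSize) throws IOException {
        // Allocation: the internal buffer, sized up front.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream(initialSize);
        // If the output exceeds initialSize, the stream allocates a larger
        // array and copies the existing content into it.
        format(message, buffer);
        // Allocation + copy: toByteArray() returns a fresh, right-sized array.
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // The caller then copies the bytes once more into its own stream.
        byte[] bytes = encode("hello", 1024);
        System.out.write(bytes, 0, bytes.length);
        System.out.flush();
    }
}
```

Passing a small `initialSize` (for example 4) forces the growth path, which is the three-allocation case.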

Most of the time, the caller will write the output of the Encoder into an output stream. In this case, using an intermediate byte array isn't the most efficient design (well, I know, this is how Logback's Encoder interface is designed :-( ).

But maybe we could do better... I was thinking about introducing a new StreamingEncoder interface similar to this:

import java.io.IOException;
import java.io.OutputStream;

public interface StreamingEncoder<Event> {
  void encode(Event event, OutputStream stream) throws IOException;
}

CompositeJsonEncoder can be easily modified to implement this new interface alongside the existing byte[] encode(event) method. Then we can adapt AbstractLogstashTcpSocketAppender around lines L598-L602 to tell the encoder to write directly into the output stream if it implements the new StreamingEncoder interface.

This would be highly efficient while preserving support for "legacy" encoders.
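A rough sketch of what that dispatch could look like follows. Everything here is illustrative, not the POC code: `LegacyEncoder` is a hypothetical one-method stand-in for Logback's Encoder, `DualEncoder` plays the role of an encoder that implements both styles, and `writeEvent` stands in for the relevant part of the appender.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class StreamingDispatchSketch {

    // Hypothetical stand-in for the byte[]-returning encode method of Logback's Encoder.
    public interface LegacyEncoder<E> {
        byte[] encode(E event);
    }

    // The interface proposed in this issue.
    public interface StreamingEncoder<E> {
        void encode(E event, OutputStream stream) throws IOException;
    }

    // Hypothetical encoder that supports both styles.
    public static class DualEncoder implements LegacyEncoder<String>, StreamingEncoder<String> {
        @Override
        public void encode(String event, OutputStream stream) throws IOException {
            // Simplified: a real formatter would write through its JSON generator.
            stream.write(event.getBytes(StandardCharsets.UTF_8));
        }

        @Override
        public byte[] encode(String event) {
            // The legacy path delegates to the streaming one via a temporary buffer.
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try {
                encode(event, buffer);
            } catch (IOException e) {
                throw new IllegalStateException(e); // not expected for an in-memory stream
            }
            return buffer.toByteArray();
        }
    }

    // Appender side: stream directly when possible, otherwise fall back to byte[].
    @SuppressWarnings("unchecked")
    public static <E> void writeEvent(LegacyEncoder<E> encoder, E event, OutputStream out)
            throws IOException {
        if (encoder instanceof StreamingEncoder) {
            ((StreamingEncoder<E>) encoder).encode(event, out);
        } else {
            out.write(encoder.encode(event));
        }
    }
}
```

With this shape, an encoder that only implements the byte[] method keeps working unchanged, while a dual encoder skips the intermediate array on the appender path.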

I made a first POC with this idea and everything looks OK.
What do you think?
Do you see other areas/classes that could be optimised using a similar technique?
