Allow customizing the number of outputs #5224

@davidovich

Is your feature request related to a problem? Please describe.

There is currently a hard limit on the number of outputs that can be configured.

This limit is not documented at the moment; there is an issue for that.

The current Fluent Bit landing page makes a scalability claim: "1pb Data Throughput Across Thousands Of Sources And Destinations Daily". It also advertises "Dynamic Routing ... Distribute data to multiple destinations with a zero copy strategy". I misinterpreted these statements as covering my use case.

Describe the solution you'd like

A configuration option that allows changing this limit.

Describe alternatives you've considered

Fluentd seems to support many more outputs, but I have not tried implementing it as the footprint seems too high.

Some kind of sharding in fluent-bit would make it possible to keep the number of outputs fixed and keep the limit. This sharding could be managed by the fluent-operator.
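To make the sharding idea concrete, here is a rough sketch of how tenants could be spread over several fluent-bit instances so that no single instance exceeds the output limit. This is not an existing fluent-bit or fluent-operator feature. The names `shard_for`, `plan_shards` and `max_outputs_per_instance` are made up for illustration, and the limit value passed in is a placeholder, not the real one.

```python
import hashlib
import math
from collections import defaultdict


def shard_for(tenant: str, shard_count: int) -> int:
    """Map a tenant name to a shard index with a stable hash.

    hashlib is used instead of the built-in hash() so that the result
    is identical across processes and restarts.
    """
    digest = hashlib.sha256(tenant.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def plan_shards(tenants, max_outputs_per_instance: int):
    """Return (shard_count, {shard_index: [tenant, ...]}).

    max_outputs_per_instance is a hypothetical stand-in for the
    per-instance output limit discussed in this issue.
    """
    if max_outputs_per_instance < 1:
        raise ValueError("max_outputs_per_instance must be >= 1")
    unique = sorted(set(tenants))
    # Lower bound on the number of shards; hashing is not perfectly even,
    # so grow the count until every shard fits under the limit.
    shard_count = max(1, math.ceil(len(unique) / max_outputs_per_instance))
    while True:
        shards = defaultdict(list)
        for tenant in unique:
            shards[shard_for(tenant, shard_count)].append(tenant)
        if all(len(members) <= max_outputs_per_instance for members in shards.values()):
            return shard_count, dict(shards)
        shard_count += 1


if __name__ == "__main__":
    tenants = [f"tenant-{i}" for i in range(1000)]
    count, shards = plan_shards(tenants, max_outputs_per_instance=64)
    print(f"{count} shards, largest holds {max(len(m) for m in shards.values())} outputs")
```

One caveat with this simple modulo scheme: changing the shard count moves many tenants to a different shard. A consistent-hashing variant would reduce that churn, at the cost of more bookkeeping in whatever component manages the shards.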

Additional context

We have an on-premises multi-tenant Kubernetes cluster where an API provides a varying endpoint for bulk ingestion into OpenSearch. Each new tenant on the cluster gets a dedicated OpenSearch tenant. We link the tenant in the k8s cluster to the ingestion endpoint using an output plugin. This endpoint is not known in advance. As tenants are self-serve, the number of outputs might reach the thousands.

Through a cluster management API, we use the fluent-operator to easily drop ClusterOutput documents to manage the binding from tenant-namespaces to OpenSearch endpoints.

I have not seen other solutions as elegant as the plugin system of fluent-bit. There may be an equivalent in fluentd, but overhead matters in a shared cluster setting, so a minimal footprint is a desired quality.

Open question: Allowing more outputs will have a memory impact and other compute side effects. My intuition is that the low footprint of fluent-bit leaves room for more vertical scaling, but I do not know the internals well enough to say whether that assumption holds.
