in_node_exporter_metrics: Increase buffer size to read /proc/stat correctly by piwai · Pull Request #11253 · fluent/fluent-bit

piwai · 2025-12-04T13:38:39Z

Hello,

for weeks we've been experiencing an issue with fluent-bit where the CPU metrics provided by the node exporter, such as node_cpu_seconds_total, would stall for no apparent reason. After a few days, the metrics would "unfreeze" and resume, and we didn't understand why either.

After recompiling fluent-bit with extra debug output, I managed to trace it down to an issue in the ne_utils_file_read_lines() function:

fluent-bit/plugins/in_node_exporter_metrics/ne_utils.c

Line 151 in 712b48d

    
           int ne_utils_file_read_lines(const char *mount, const char *path, struct mk_list *list)

This function uses a 512-byte buffer to read lines, which is insufficient to read /proc/stat entries correctly: the "intr" line can be longer than that.

Most of the time this isn't an issue, except when the line being read has a length that is an exact multiple of (buffer size - 2), i.e. 510 characters here, since fgets() is called with sizeof(line) - 1. In that case the last full chunk ends right before the newline, the next fgets() call returns only "\n", and stripping it produces an empty line. This causes an error further up the call stack and prevents the CPU metrics from being updated. But as soon as some interrupt counter gains an additional digit (e.g. 99 -> 100), the problem disappears: the line gets one character longer, so the final chunk contains a character in addition to the newline.
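To make the length condition concrete, here is a minimal sketch that writes a single line of a given length to a temporary file and reads it back with the same fgets(line, sizeof(line) - 1, f) call and a 512-byte buffer. The helper name count_empty_chunks is hypothetical and only exists for this illustration; it counts how many fgets() calls return nothing but the newline.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, for illustration only (not part of fluent-bit).
 * Writes one line of `line_len` characters plus "\n" to a temporary file,
 * reads it back through a 512-byte buffer with fgets(line, sizeof(line) - 1, f),
 * and returns how many calls returned only "\n" (-1 if tmpfile() fails). */
static int count_empty_chunks(size_t line_len)
{
    char line[512];
    int empty = 0;
    size_t i;
    FILE *f = tmpfile();

    if (f == NULL) {
        return -1;
    }
    for (i = 0; i < line_len; i++) {
        fputc('0', f);
    }
    fputc('\n', f);
    rewind(f);

    /* Each call stores at most 510 characters, so a 1020-character line
     * is consumed by two full chunks and the newline is left on its own. */
    while (fgets(line, sizeof(line) - 1, f)) {
        if (strcmp(line, "\n") == 0) {
            empty++;
        }
    }
    fclose(f);
    return empty;
}

/* Usage: count_empty_chunks(1019) and count_empty_chunks(1021) return 0,
 * while count_empty_chunks(1020) returns 1. */
```

A line of 1019 or 1021 characters leaves at least one character next to the newline in the last chunk, which is why the stall clears by itself once a counter gains a digit.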

Sample /proc/stat file showing the issue, with an intr line of 1020 chars (obtained from an Ubuntu 20.04 LTS host):

$ cat proc_stat.txt
cpu  644441261 250278 193710764 6839564199 150758 0 21261544 4590423 0 0
cpu0 160903415 60997 48404023 1709556132 38581 0 7668248 1152730 0 0
cpu1 160732678 64155 48446004 1710342339 38188 0 5804330 1145684 0 0
cpu2 161798128 63702 48389798 1709541165 35120 0 4468764 1145420 0 0
cpu3 161007038 61422 48470938 1710124561 38867 0 3320201 1146587 0 0
intr 32324448099 29 9 0 0 0 0 3 0 1 0 9724876 32 15 0 0 19295503 0 0 0 0 0 0 0 0 0 2367 0 0 0 9209232 9791754 9997988 8338376 0 1001160064 916738961 0 410030264 436692148 0 652365018 480804600 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
ctxt 150886032456
btime 1745401938
processes 111586863
procs_running 1
procs_blocked 0
softirq 19224384590 0 1384271522 571488 1539519326 9648544 0 137410393 4010803922 21329 3552203474

Sample reproducer script, using a copy of ne_utils_file_read_lines() as main:

$ cat test.c
#include <stdio.h>
#include <string.h>


// copy of ne_utils_file_read_lines() in plugins/in_node_exporter_metrics/ne_utils.c
int main (int argc, char **argv) {
    
    int len;
    int ret;
    FILE *f;
    char line[512];
    // char real_path[2048];

    // mk_list_init(list);

    // /* Check the path starts with the mount point to prevent duplication. */
    // if (strncasecmp(path, mount, strlen(mount)) == 0 &&
    //     path[strlen(mount)] == '/') {
    //     mount = "";
    // }


    if (argc < 2) {
        printf("Usage: %s <file>\n", argv[0]);
        return -1;
    }

    f = fopen(argv[1], "r");
    if (f == NULL) {
        printf("Cannot open %s\n", argv[1]);
        //flb_errno();
        return -1;
    }

    /* Read the content */
    while (fgets(line, sizeof(line) - 1, f)) {
        len = strlen(line);
        if (line[len - 1] == '\n') {
            line[--len] = 0;
            if (len && line[len - 1] == '\r') {
                line[--len] = 0;
            }
        }
        printf("line of %d bytes: %s \n", len, line);
        if (len == 0) {
            printf("!!!!!!!!! ERROR, line has no len, flb_slist_add will fail!!!!!!!!!!!!\n");
        }

        //ret = flb_slist_add(list, line);
        //if (ret == -1) {
        //    fclose(f);
        //    flb_slist_destroy(list);
        //    return -1;
        //}

    }

    fclose(f);
    return 0;
}

Sample output:

$ gcc test.c
$ ./a.out proc_stat.txt
line of 72 bytes: cpu  644441261 250278 193710764 6839564199 150758 0 21261544 4590423 0 0
line of 68 bytes: cpu0 160903415 60997 48404023 1709556132 38581 0 7668248 1152730 0 0
line of 68 bytes: cpu1 160732678 64155 48446004 1710342339 38188 0 5804330 1145684 0 0
line of 68 bytes: cpu2 161798128 63702 48389798 1709541165 35120 0 4468764 1145420 0 0
line of 68 bytes: cpu3 161007038 61422 48470938 1710124561 38867 0 3320201 1146587 0 0
line of 510 bytes: intr 32324448099 29 9 0 0 0 0 3 0 1 0 9724876 32 15 0 0 19295503 0 0 0 0 0 0 0 0 0 2367 0 0 0 9209232 9791754 9997988 8338376 0 1001160064 916738961 0 410030264 436692148 0 652365018 480804600 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
line of 510 bytes:  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
line of 0 bytes:
!!!!!!!!! ERROR, line has no len, flb_slist_add will fail!!!!!!!!!!!!
line of 17 bytes: ctxt 150886032456
line of 16 bytes: btime 1745401938
line of 19 bytes: processes 111586863
line of 15 bytes: procs_running 1
line of 15 bytes: procs_blocked 0
line of 98 bytes: softirq 19224384590 0 1384271522 571488 1539519326 9648544 0 137410393 4010803922 21329 3552203474

The proposed fix is to increase the line buffer size from 512 to 2048 bytes (1024 seems a bit low given that the intr line is already 1020 chars long).
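As a rough check of the headroom this gives, the length condition can be written as a small predicate. This is only a sketch: yields_empty_chunk is a hypothetical helper, and it assumes the read loop keeps calling fgets() with the buffer size minus one, as in the reproducer.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only: returns 1 when a line of
 * `line_len` characters (newline excluded), read in a loop with
 * fgets(buf, buf_size - 1, f), leaves a lone "\n" for the final call. */
static int yields_empty_chunk(size_t line_len, size_t buf_size)
{
    size_t chunk;

    if (buf_size < 3) {
        return 0;   /* degenerate buffer, not meaningful for this check */
    }
    /* fgets(buf, n, f) stores at most n - 1 characters, and n is buf_size - 1 */
    chunk = buf_size - 2;
    return line_len > 0 && line_len % chunk == 0;
}
```

With these assumptions, yields_empty_chunk(1020, 512) is 1 and yields_empty_chunk(1020, 2048) is 0, so the sample file above is read correctly with the larger buffer. The condition is moved rather than removed: yields_empty_chunk(2046, 2048) is still 1.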

Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

[N/A] Example configuration file for the change
Debug log output from testing the change

[N/A] Attached Valgrind output that shows no leaks or memory corruption was found (static buffer)

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

[N/A] Run local packaging test showing all targets (including any new ones) build.
Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

[N/A] Documentation required for this feature

Backporting

Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

Bug Fixes
- Increased the maximum supported line length for file reading, allowing the system to correctly handle longer lines from diverse data sources.
- Improves compatibility and reduces failures/truncation when processing larger or unexpectedly long input lines.


coderabbitai · 2025-12-04T13:39:01Z

Walkthrough

In ne_utils_file_read_lines, the line-read buffer size was increased from 512 to 2048 bytes; file open, line reading, newline trimming, and appending behavior remain unchanged. No public API or function signatures were modified.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Buffer expansion in file reader `plugins/in_node_exporter_metrics/ne_utils.c` | Increased line-read buffer from 512 to 2048 bytes in `ne_utils_file_read_lines`; logic for opening, reading, trimming, and appending lines unchanged. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10–15 minutes

Verify buffer allocation location (stack vs heap) and lifetime
Search for any code that assumes the previous 512-byte limit
Confirm 2048 bytes fits expected metric line sizes

Possibly related PRs

in_node_exporter_metrics: add netstat linux collector #11052 — Netstat collector added there calls ne_utils_file_read_lines; the buffer-size increase (512→2048) directly affects parsing of /proc/net/snmp.

Poem

My nose twitches at the code,
I hop where buffers stretch and grow,
From tiny steps of five-one-two,
To roomy bounds of two-oh-four-oh,
Metrics munching, happy rabbit, go! 🐇✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Title check | ✅ Passed | The title accurately describes the main change: increasing buffer size in the in_node_exporter_metrics plugin from 512 to 2048 bytes to fix /proc/stat reading issues. |


coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

plugins/in_node_exporter_metrics/ne_utils.c (1)

175-190: Consider explicit handling of partial reads for future robustness.

The current logic adds each fgets result as a separate list entry. While the buffer increase makes partial reads unlikely, lines exceeding 2046 characters would still be split across multiple entries. For maximum robustness, consider:

Accumulating chunks until a newline is found, or

Explicitly skipping empty lines (len == 0) before calling flb_slist_add

Given that /proc/stat lines are unlikely to exceed 2KB in practice, this is purely a future-proofing suggestion.
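The first option above, accumulating chunks until a newline is found, could look roughly like the sketch below. read_whole_line is a hypothetical helper written for this illustration, not an existing fluent-bit function, and the heap-allocated return value is an assumption about how a caller would want to consume it.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of fluent-bit: read one complete line of any
 * length by appending fgets() chunks until a newline or EOF is seen.
 * Returns a heap-allocated string without the trailing "\n" or "\r\n"
 * (the caller frees it), or NULL at EOF or on allocation failure. */
static char *read_whole_line(FILE *f)
{
    char chunk[512];
    char *line = NULL;
    char *tmp;
    size_t len = 0;
    size_t n;

    while (fgets(chunk, sizeof(chunk), f)) {
        n = strlen(chunk);
        tmp = realloc(line, len + n + 1);
        if (tmp == NULL) {
            free(line);
            return NULL;
        }
        line = tmp;
        memcpy(line + len, chunk, n + 1);   /* also copies the terminating NUL */
        len += n;
        if (len > 0 && line[len - 1] == '\n') {
            break;                          /* the line is complete */
        }
    }
    if (line == NULL) {
        return NULL;                        /* nothing left to read */
    }

    /* Trim "\n" and an optional preceding "\r", as the existing loop does */
    if (len > 0 && line[len - 1] == '\n') {
        line[--len] = '\0';
        if (len > 0 && line[len - 1] == '\r') {
            line[--len] = '\0';
        }
    }
    return line;
}
```

With this approach a 1020-character line comes back as a single 1020-character string regardless of the chunk size, so no length is special. A line that is genuinely empty in the file still comes back as an empty string, so a caller would still need to decide whether to skip those before adding them to the list.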

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c88c545 and 7fc057f.

📒 Files selected for processing (1)

plugins/in_node_exporter_metrics/ne_utils.c (1 hunks)

🔇 Additional comments (1)

plugins/in_node_exporter_metrics/ne_utils.c (1)

156-156: LGTM! Buffer increase addresses the reported issue.

The buffer size increase from 512 to 2048 bytes directly fixes the bug where /proc/stat's "intr" line (~1020 characters) was causing intermittent stalls. With a 2048-byte buffer, lines up to 2046 characters will be read atomically, providing comfortable headroom for the reported use case.

edsiper · 2025-12-04T23:42:15Z

Thanks for your contribution, please fix your commit subject:

Run python .github/scripts/commit_prefix_check.py
❌ Commit 7fc057f3fc failed:
Missing prefix in commit subject: 'fix(ne_utils): Increase buffer size to read /proc/stat correctly'
Commit prefix validation failed.
Error: Process completed with exit code 1.

it must be prefixed with in_node_exporter_metrics: ...

…rectly

The "intr" entry of proc stat can be larger than 512 chars, and generate errors leading to stalled CPU metrics if it's the wrong length.

Signed-off-by: Pierre-Yves Rofes <3604235+piwai@users.noreply.github.com>

piwai · 2025-12-05T08:29:07Z

@edsiper thanks for the review, I updated the commit title as requested.

edsiper · 2025-12-05T23:46:07Z

thanks!

piwai requested review from cosmo0920 and edsiper as code owners December 4, 2025 13:38

github-actions bot added the docs-required label Dec 4, 2025

coderabbitai bot reviewed Dec 4, 2025


edsiper added this to the Fluent Bit v4.2.1 milestone Dec 4, 2025

piwai temporarily deployed to pr December 4, 2025 23:41 — with GitHub Actions Inactive

piwai temporarily deployed to pr December 5, 2025 00:01 — with GitHub Actions Inactive

piwai force-pushed the master branch from 7fc057f to 4e2d51f Compare December 5, 2025 08:23

piwai force-pushed the master branch from 4e2d51f to 7eef4c3 Compare December 5, 2025 08:25

piwai temporarily deployed to pr December 5, 2025 11:46 — with GitHub Actions Inactive

cosmo0920 changed the title ~~fix(ne_utils): Increase buffer size to read /proc/stat correctly~~ in_node_exporter_metrics: Increase buffer size to read /proc/stat correctly Dec 5, 2025

piwai temporarily deployed to pr December 5, 2025 12:05 — with GitHub Actions Inactive

piwai temporarily deployed to pr December 5, 2025 12:06 — with GitHub Actions Inactive

edsiper merged commit 7ded9ae into fluent:master Dec 5, 2025
59 of 61 checks passed

coderabbitai bot mentioned this pull request Dec 9, 2025

out_kafka: support AWS MSK IAM authentication #11224

Closed

7 tasks

BrewTestBot mentioned this pull request Dec 18, 2025

fluent-bit 4.2.1 Homebrew/homebrew-core#259139

Merged

coderabbitai bot mentioned this pull request Dec 26, 2025

Add length verification for preventing buffer overflow #11319

Open

in_node_exporter_metrics: Increase buffer size to read /proc/stat correctly #11253
edsiper merged 1 commit into fluent:master from piwai:master

2 participants
