otel.process.context_switches metric counts only the context switches of the main thread #36804

@tobiasstarkwayve

Component(s)

receiver/hostmetrics

What happened?

Description

The otel.process.context_switches metric claims to count the context switches of a given process. It reads the count from /proc/<pid>/status (through https://github.com/shirou/gopsutil). On Linux, however, this file reports only the context switches of the lead thread, so presenting the value as the context switches of the whole process is misleading.

Steps to Reproduce

Log into a Linux machine, find a process with multiple threads, and compare the process-level context-switch count with the per-thread counts. A single thread can show more context switches than the count reported for the process. Example:

cat /proc/4491/status /proc/4491/task/4491/status /proc/4491/task/4495/status | grep nonvoluntary
nonvoluntary_ctxt_switches:     9
nonvoluntary_ctxt_switches:     9
nonvoluntary_ctxt_switches:     139

Expected Result

Either the metric description should note that it is correct only for single-threaded processes, or the scraper should iterate over all threads of the process and sum their individual context switches.

Actual Result

The metric reports the context switches of the lead thread as if they were the context switches of the entire process.

Collector version

0.110.0

Environment information

Environment

OS: Ubuntu 20.04
Compiler (if manually compiled):

OpenTelemetry Collector configuration

No response

Log output

No response

Additional context

No response
