[receiver/systemd]: Scrape unit CPU time #44646

Merged
atoulme merged 1 commit into open-telemetry:main from SquidDev:systemd-cpu on Nov 30, 2025
@SquidDev
Contributor

Description

  • Adds support for reading each unit's cgroup statistics.
  • Adds a new systemd.unit.cpu.time metric. It carries a cpu.mode attribute that distinguishes system and user time. Unlike other CPU time metrics (e.g. process.cpu.time), which use seconds, this metric uses microseconds (us), as that is the native unit that systemd works with. Feedback is welcome on whether we should convert this to seconds.

Link to tracking issue

Part of #33532, but does not close it.

Testing

  • There are unit tests with a mocked-out cgroups filesystem.
  • Tested on my local laptop and checked that correct data is produced.

Documentation

None beyond the generated documentation right now.

atoulme pushed a commit that referenced this pull request Nov 30, 2025
#### Description
This promotes the systemd receiver to alpha stability. While this is by
no means feature complete (e.g. per-unit CPU and memory usage, unit
uptime), its current functionality is still useful (for observability
around unit failures), and it's clear there is demand for it (e.g.
#44420).

There are still some open questions about the shape of the exported
metrics, such as whether to represent each unit as a resource or not (as
discussed in #33532, and what units to default to (see #44646), but I
think we'll be better able to answer those once this is used in the
wild.

#### Documentation
Additional examples have been added to the receiver's README.
@atoulme atoulme merged commit b1f6859 into open-telemetry:main Nov 30, 2025
190 checks passed
@github-actions github-actions bot added this to the next release milestone Nov 30, 2025
@otelbot
Contributor

otelbot bot commented Nov 30, 2025

Thank you for your contribution @SquidDev! 🎉 We would like to hear from you about your experience contributing to OpenTelemetry by taking a few minutes to fill out this survey. If you are getting started contributing, you can also join the CNCF Slack channel #opentelemetry-new-contributors to ask for guidance and get help.

@SquidDev SquidDev deleted the systemd-cpu branch November 30, 2025 22:20
atoulme pushed a commit that referenced this pull request Dec 19, 2025
#### Description
When scraping non-service units, the systemd receiver would still
attempt to fetch the unit's control group, which would fail. We now
guard this by checking the unit is actually a service[^1].

[^1]: This is done by checking the unit ends with `.service`. Ideally
we'd check for the presence of the "Service" interface instead, but the
only way to do that is parsing the dbus introspection XML. For better or
worse, just checking the suffix seems to be common practice, and is much
easier.
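
A minimal sketch of the suffix check described in the footnote, written in Go. The helper names `isServiceUnit` and `serviceUnits` are hypothetical, and the receiver's actual guard may be structured differently.

```go
package main

import "strings"

// isServiceUnit reports whether a unit name refers to a service, judged
// only by its ".service" suffix. This is a heuristic: it does not query
// dbus for the presence of the Service interface.
func isServiceUnit(name string) bool {
	return strings.HasSuffix(name, ".service")
}

// serviceUnits returns the subset of names that look like services, so
// that control group lookups can be skipped for every other unit type.
func serviceUnits(names []string) []string {
	var out []string
	for _, n := range names {
		if isServiceUnit(n) {
			out = append(out, n)
		}
	}
	return out
}
```

With this guard, a name such as `dbus.socket` is filtered out before any control group lookup is attempted, while `sshd.service` passes through.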

To match this, we also rename the `systemd.unit.cpu.time` metric to
`systemd.service.cpu.time`. This is a breaking change, but the metric
still has a stability of "development", so I don't believe it needs to go
through the normal notification cycle?

Really sorry about this! I deliberately namespaced the initial
`systemd.unit.state` metric, to allow for separate `systemd.service.*`
metrics, and then entirely forgot about this when implementing #44646.
3 participants