fix(soot): add unique controller names to prevent metric conflicts #1043

Merged: prometherion merged 1 commit into clastix:master from syedazeez337:fix/issue-1025-negative-metrics on Jan 4, 2026

@syedazeez337 (Contributor)

Fixes #1025

Previously, all soot managers (one per TenantControlPlane) created controllers with identical names (e.g., 'clusterrolebinding', 'configmap'). These controllers registered metrics with the same labels in the global Prometheus registry, causing worker count conflicts and negative values when TenantControlPlanes were dynamically created/destroyed.

This fix adds unique controller names by:

  • Adding a ControllerName field to all soot controller structs
  • Using .Named() in SetupWithManager with a unique name per TenantControlPlane (TCP)
  • Generating names as {namespace}-{name}-{controller-type}

Now each TenantControlPlane's controllers have unique metric labels, preventing conflicts and eliminating negative worker counts.
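
The naming scheme can be sketched as follows. This is a minimal illustration, assuming the three parts are joined with hyphens exactly as the pattern above suggests; `controllerName` and the example namespace and names are hypothetical and do not come from the actual diff.

```go
package main

import "fmt"

// controllerName is a hypothetical helper, not taken from the Kamaji source.
// It joins the TenantControlPlane namespace and name with the controller type,
// following the {namespace}-{name}-{controller-type} pattern.
func controllerName(namespace, name, controllerType string) string {
	return fmt.Sprintf("%s-%s-%s", namespace, name, controllerType)
}

func main() {
	// Two TenantControlPlanes that both run a "configmap" controller.
	tcps := []struct{ namespace, name string }{
		{"tenants", "alpha"},
		{"tenants", "beta"},
	}
	for _, tcp := range tcps {
		// With controller-runtime, a string like this would be passed to
		// the builder's Named(...) call inside SetupWithManager.
		fmt.Println(controllerName(tcp.namespace, tcp.name, "configmap"))
	}
	// prints:
	// tenants-alpha-configmap
	// tenants-beta-configmap
}
```

Because the namespace and name pair identifies a single TenantControlPlane, two control planes running the same controller type end up with different names and therefore different metric label values.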

netlify bot commented Dec 29, 2025

Deploy Preview for kamaji-documentation canceled.

🔨 Latest commit: 5127f80
🔍 Latest deploy log: https://app.netlify.com/projects/kamaji-documentation/deploys/69535c92dae7f1000875e714

@syedazeez337 force-pushed the fix/issue-1025-negative-metrics branch from 7eb959e to f5d88fc on December 29, 2025 09:56
@prometherion (Member) left a comment

Well done on this investigation, @syedazeez337. If CI is green, I'm happy to get this merged!

Signed-off-by: Azeez Syed <syedazeez337@gmail.com>
@syedazeez337 force-pushed the fix/issue-1025-negative-metrics branch from f5d88fc to 5127f80 on December 30, 2025 05:01
@syedazeez337 (Contributor, Author)

Hi @prometherion,
I have fixed the one e2e test that was failing because of a naming issue. It should pass now. Let me know if this looks good.

@prometherion merged commit f55df56 into clastix:master on Jan 4, 2026
12 checks passed

Development

Successfully merging this pull request may close these issues.

Incorrect controller_runtime_active_workers metrics values
