[receiver/vcenter] Add vCenter Host metrics (dropped packet rate + capacity)#33646
StefanKurek
left a comment
Just a few small things. Also needs the scraper test updated with new results.
receiver/vcenterreceiver/metrics.go
Do you know how different this number ends up (if at all) from numCpuCores * cpuMhz?
Yeah, I've been trying to figure this out. From what I understand, totalCapacity would be the more accurate metric to measure here. In most cases the two numbers should be very close, with totalCapacity usually a bit lower. This is because numCpuCores might get artificially inflated by logical cores created by hyper-threading. Beyond that, the numbers should be very similar, if not the same.
After talking with Stefan a bit more on this topic, we found that totalCapacity, as reported by these performance metrics, is a bit different from what someone would normally expect (numCpuCores * cpuMhz), which is what the vSphere client shows.
The difference is that the total capacity calculated from numCpuCores and cpuMhz describes how much CPU capacity the host itself has.

The performance metric totalCapacity refers to the total reservation capacity,

and the performance metric reservedCapacity refers to the "used reservation" portion of that total reservation capacity.

In the end, we think the performance metric and the quickstat metric are both very useful, depending on the user's use case and setup, so we will include both.
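To make the distinction concrete, here is a minimal Go sketch of the three quantities discussed in this thread. The `hostCPU` type, its field names, and the numbers are hypothetical and chosen only for illustration; they are not the receiver's actual types or real measurements, and the MHz unit is an assumption.

```go
package main

import "fmt"

// hostCPU holds hypothetical values for one host. The field names mirror
// the terms used in this thread; they are not the receiver's real types.
type hostCPU struct {
	NumCPUCores      int64 // core count reported for the host
	CPUMHz           int64 // clock speed of a single core (assumed MHz)
	TotalCapacityMHz int64 // totalCapacity: total reservation capacity
	ReservedMHz      int64 // reservedCapacity: used reservation
}

// CalculatedCapacityMHz returns numCpuCores * cpuMhz, the figure that
// describes how much CPU capacity the host itself has.
func (h hostCPU) CalculatedCapacityMHz() int64 {
	return h.NumCPUCores * h.CPUMHz
}

// ReservationUsedFraction returns the used reservation as a fraction of
// the total reservation capacity, or 0 when the total is not positive.
func (h hostCPU) ReservationUsedFraction() float64 {
	if h.TotalCapacityMHz <= 0 {
		return 0
	}
	return float64(h.ReservedMHz) / float64(h.TotalCapacityMHz)
}

func main() {
	// Illustrative numbers only.
	h := hostCPU{NumCPUCores: 16, CPUMHz: 2400, TotalCapacityMHz: 36000, ReservedMHz: 9000}

	fmt.Println("calculated capacity (MHz):", h.CalculatedCapacityMHz())
	fmt.Println("total reservation capacity (MHz):", h.TotalCapacityMHz)
	fmt.Println("used reservation fraction:", h.ReservationUsedFraction())
}
```

Keeping the calculated figure and the reservation figures as separate values matches the decision above to report both.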
StefanKurek
left a comment
Should be good to move out of DRAFT. I would still like to know the difference between the total CPU performance metric and the one you could calculate, though.
receiver/vcenterreceiver/client.go
Actually I'd say this PR is blocked until this PR is merged in and you can rebase (because this function for example isn't actually quite right ATM)
I think the description and name of this one have some room for improvement (they still sound a bit confusing).
What do you think about something like vcenter.host.cpu.reserved for the name? I know the performance metric names have "capacity" in them, but the UI equivalents do not. "Capacity" seems to only make sense for "total" and not "used" to me.
The description could then be something like The CPU of the host reserved for use by virtual machines.
Attribute could be changed to cpu_reservation_type with values of total and used. Description could be The type of CPU reservation for the host.
These are all just suggestions, but what do you think @BominRahmani ?
I was actually going back and forth between vcenter.host.cpu.reserved and vcenter.host.cpu.reserve.capacity. I only chose the latter because it felt a bit more verbose. I had also originally thought about switching the cpu_reservation_type values to total and used to match the UI more closely, so I am totally OK with these changes.
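As a rough illustration of the naming agreed on above, the following Go sketch models one metric name, vcenter.host.cpu.reserved, with a cpu_reservation_type attribute taking the values total and used. The types and the `reservedDataPoints` helper are hypothetical, hand-written stand-ins and not the code mdatagen generates; the MHz unit is also an assumption.

```go
package main

import "fmt"

// cpuReservationType is a hypothetical enum for the proposed
// cpu_reservation_type attribute.
type cpuReservationType int

const (
	cpuReservationTotal cpuReservationType = iota
	cpuReservationUsed
)

// String returns the attribute value as it would appear on a data point.
func (t cpuReservationType) String() string {
	switch t {
	case cpuReservationTotal:
		return "total"
	case cpuReservationUsed:
		return "used"
	}
	return "unknown"
}

// dataPoint is a simplified stand-in for a recorded metric data point.
type dataPoint struct {
	Metric          string
	ReservationType string
	ValueMHz        int64
}

// reservedDataPoints builds one data point per reservation type under
// the single metric name proposed in the review.
func reservedDataPoints(totalMHz, usedMHz int64) []dataPoint {
	const name = "vcenter.host.cpu.reserved"
	return []dataPoint{
		{Metric: name, ReservationType: cpuReservationTotal.String(), ValueMHz: totalMHz},
		{Metric: name, ReservationType: cpuReservationUsed.String(), ValueMHz: usedMHz},
	}
}

func main() {
	// Illustrative values only.
	for _, dp := range reservedDataPoints(36000, 9000) {
		fmt.Printf("%s{cpu_reservation_type=%q} %d\n", dp.Metric, dp.ReservationType, dp.ValueMHz)
	}
}
```

Using one metric with an attribute, instead of two separately named metrics, lets total and used be compared directly under the same name.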
djaglowski
left a comment
LGTM, just need checks fixed
Description:
This PR adds the following metrics:
These metrics can be found in the following links respectively:
errorTx and errorRx
reservedCapacity and totalCapacity
Link to tracking Issue: #33607
Testing:
Tested against a live environment to scrape added metrics, and updated golden test files.
Documentation:
Updated documentation through mdatagen.