initial Turin CPU platform #9043
    .set_extended_processor_and_feature_identifiers(Some(leaf))
    .expect("can set leaf 8000_0001h");
in a truly distressing case of the ankle bone being connected to the wrist bone, if PerfCtrExtCore is set and TopologyExtensions is not, Windows Server 2022 sits in a loop at boot. I noticed this while checking out a fix for oxidecomputer/propolis#959, an initial version of which just cleared the TopologyExtensions bit to match discarding leaf 0x8000_001E. Both bits together are fine. Having topology extensions and not six perf counters (as we've had on Milan for a while) is fine. Having neither is fine. Having six perf counters and no topology extensions loops at boot.
I'm a little suspicious there's some relationship between this and the incomplete representation of SMT, so I'm going to set this to a more Milan-like situation where we hide perf counter extensions for now, and omit topology extensions, and then see how this looks with issues like oxidecomputer/propolis#940 sorted out.
edit: these bits are now both cleared, and boy will I feel silly if I've overlooked something here
How does this tie into the definition of 8000_0022 %ebx NumPerfCtrCore?
8000_0022 eax is zero so guests shouldn't care, but if we were filling it in I'd pick 4 without PerfCtrExtCore and 6 with.
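The combinations described in this thread can be written down as a small check. This is only a sketch: the bit positions (TopologyExtensions at bit 22 and PerfCtrExtCore at bit 23 of leaf 8000_0001 ECX) are my reading of the AMD APM, and the function names are hypothetical, not anything in Propolis or Omicron.

```rust
// Leaf 8000_0001 ECX bit positions (assumed from the AMD APM).
const TOPOLOGY_EXTENSIONS: u32 = 1 << 22;
const PERF_CTR_EXT_CORE: u32 = 1 << 23;

/// True for the one combination reported above to loop at boot on
/// Windows Server 2022: six perf counters without topology extensions.
fn loops_at_boot(ecx: u32) -> bool {
    (ecx & PERF_CTR_EXT_CORE) != 0 && (ecx & TOPOLOGY_EXTENSIONS) == 0
}

/// Clear both bits, the more Milan-like arrangement the comment settles on.
fn hide_both(ecx: u32) -> u32 {
    ecx & !(TOPOLOGY_EXTENSIONS | PERF_CTR_EXT_CORE)
}

/// Core counter count one might report if leaf 8000_0022 were filled in:
/// 4 without PerfCtrExtCore and 6 with.
fn num_perf_ctr_core(ecx: u32) -> u32 {
    if (ecx & PERF_CTR_EXT_CORE) != 0 { 6 } else { 4 }
}

fn main() {
    let combos = [
        0,
        TOPOLOGY_EXTENSIONS,
        PERF_CTR_EXT_CORE,
        TOPOLOGY_EXTENSIONS | PERF_CTR_EXT_CORE,
    ];
    for &ecx in combos.iter() {
        println!(
            "{:#010x}: loops_at_boot={} counters={}",
            ecx,
            loops_at_boot(ecx),
            num_perf_ctr_core(ecx)
        );
    }
}
```

Only the third combination trips the check, and `hide_both` always lands on the "neither" case.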
// guests.
const TURIN_V1_CPUID: [CpuidEntry; 25] = [
    cpuid_leaf!(0x0, 0x0000000D, 0x68747541, 0x444D4163, 0x69746E65),
    cpuid_leaf!(0x1, 0x00B00F21, 0x00000800, 0xF6D83203, 0x078BFBFF),
%ecx bit 31 is the bit to indicate hypervisor leaves are present, right?
yeah, funnily this has more words in the APM than PPR..
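As a quick sanity check of the value in the table, the test below confirms bit 31 is set in the leaf 1 ECX word. The constant and function names are made up for illustration.

```rust
/// ECX from the leaf 1 entry in TURIN_V1_CPUID above.
const LEAF1_ECX: u32 = 0xF6D8_3203;

/// Leaf 1 ECX bit 31: set when running under a hypervisor.
const HYPERVISOR: u32 = 1 << 31;

fn hypervisor_present(leaf1_ecx: u32) -> bool {
    (leaf1_ecx & HYPERVISOR) != 0
}

fn main() {
    println!("hypervisor bit set: {}", hypervisor_present(LEAF1_ECX));
}
```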
const TURIN_V1_CPUID: [CpuidEntry; 25] = [
    cpuid_leaf!(0x0, 0x0000000D, 0x68747541, 0x444D4163, 0x69746E65),
    cpuid_leaf!(0x1, 0x00B00F21, 0x00000800, 0xF6D83203, 0x078BFBFF),
    cpuid_leaf!(0x5, 0x00000000, 0x00000000, 0x00000000, 0x00000000),
Is leaf 5 zero here because we don't actually indicate support for monitor / mwait?
yeah, specifically it's zero and in this list because I want to confirm the assembled profile has this leaf zeroed in addition to leaf 1 ECX monitor being clear.
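The consistency being confirmed here could be expressed roughly as below. This is a hypothetical helper, not the check the profile tests actually use; it assumes the MONITOR feature flag is leaf 1 ECX bit 3.

```rust
/// Leaf 1 ECX bit 3: MONITOR/MWAIT support (assumed bit position).
const MONITOR: u32 = 1 << 3;

/// A profile is consistent if it either advertises MONITOR or leaves
/// every register of leaf 5 zeroed.
fn monitor_leaves_consistent(leaf1_ecx: u32, leaf5: [u32; 4]) -> bool {
    (leaf1_ecx & MONITOR) != 0 || leaf5 == [0; 4]
}

fn main() {
    // Values taken from the TURIN_V1_CPUID entries above.
    let ok = monitor_leaves_consistent(0xF6D8_3203, [0, 0, 0, 0]);
    println!("leaf 1 and leaf 5 agree: {}", ok);
}
```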
    cpuid_subleaf!(
        0x7, 0x1, 0x00000030, 0x00000000, 0x00000000, 0x00000000
    ),
    cpuid_subleaf!(
I assume leaf B is left out here because it's dynamically generated?
that's right, Propolis will fill it in with a level 0 and 1 that look the same as Milan (https://github.com/oxidecomputer/propolis/blob/ff52055/lib/propolis/src/cpuid.rs#L370-L378)
        0x7, 0x1, 0x00000030, 0x00000000, 0x00000000, 0x00000000
    ),
    cpuid_subleaf!(
        0xD, 0x0, 0x000000E7, 0x00000980, 0x00000980, 0x00000000
How do we get 980 in %ebx at this state without feeding in the value of %xcr0?
here and D.1 ebx are managed on the read in bhyve, so the value here doesn't have any bearing on a VM. so I arbitrarily picked the largest (and I think most likely) values we'd see in these leaves at runtime.
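For context on what the 0xE7 in %eax covers, here is a small decoder for the low eight XCR0 state-component bits. The names follow the usual architectural numbering. The helper is illustrative only and says nothing about how bhyve computes the size in %ebx.

```rust
/// State-component names for XCR0 / leaf D.0 EAX bits 0 through 7.
const COMPONENTS: [&str; 8] = [
    "x87", "SSE", "AVX", "BNDREGS", "BNDCSR", "opmask", "ZMM_Hi256", "Hi16_ZMM",
];

/// List the components enabled in the low eight bits of `mask`.
fn enabled_components(mask: u32) -> Vec<&'static str> {
    COMPONENTS
        .iter()
        .enumerate()
        .filter(|&(bit, _)| (mask & (1u32 << bit)) != 0)
        .map(|(_, name)| *name)
        .collect()
}

fn main() {
    // EAX and EBX from the leaf D subleaf 0 entry above.
    println!("0xE7 enables {:?}", enabled_components(0xE7));
    println!("advertised save area: {} bytes", 0x980);
}
```

So the profile's mask is x87, SSE, AVX, and the three AVX-512 components, with the two MPX bits clear.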
    cpuid_leaf!(0x80000003, 0x2D6E6972, 0x656B696C, 0x6F725020, 0x73736563),
    cpuid_leaf!(0x80000004, 0x2020726F, 0x20202020, 0x20202020, 0x00202020),
    cpuid_leaf!(0x80000007, 0x00000000, 0x00000000, 0x00000000, 0x00000100),
    cpuid_leaf!(0x80000008, 0x00003030, 0x20000005, 0x00000000, 0x00000000),
Just confirming, we cap %eax at 0x30/0t48 because we don't support virtualizing 5 level paging, right?
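For reference, the two fields in question decode as follows. This sketch assumes the usual layout of leaf 8000_0008 EAX: physical address bits in [7:0] and linear address bits in [15:8].

```rust
/// Split leaf 8000_0008 EAX into (physical address bits, linear address bits).
fn address_sizes(eax: u32) -> (u32, u32) {
    (eax & 0xFF, (eax >> 8) & 0xFF)
}

fn main() {
    // EAX from the leaf 8000_0008 entry above.
    let (phys, linear) = address_sizes(0x0000_3030);
    println!("physical: {} bits, linear: {} bits", phys, linear);
}
```

A host advertising 5-level paging would show 57 linear address bits (0x39) in the second field instead of 48.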
    cpuid_leaf!(0x8000000A, 0x00000000, 0x00000000, 0x00000000, 0x00000000),
    cpuid_leaf!(0x8000001A, 0x0000000A, 0x00000000, 0x00000000, 0x00000000),
    cpuid_leaf!(0x8000001B, 0x00000000, 0x00000000, 0x00000000, 0x00000000),
    cpuid_leaf!(0x8000001C, 0x00000000, 0x00000000, 0x00000000, 0x00000000),
I assume leaf 8000_001e is filled dynamically.
I actually omit 8000_001E entirely (and clear TopologyExtensions), this goes to the kind of awkward semi-SMT situation I want to fix with an early virtual platform change (oxidecomputer/propolis#940), because I think we want to disallow VM shapes with odd vCPU counts. Otherwise Linux for example will assume the 8000_001E leaf is bogus if it indicates an SMT sibling that doesn't exist. Not somewhere I'd love to rely on the grace of guest OSes..
8000_001E with ThreadsPerCore = 0 would be fine even now, but there's no API surface to not have Propolis indicate SMT when filling in CPU topology, so.. out with this leaf for now.
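The odd-vCPU problem can be stated in a couple of lines. This is a sketch with hypothetical names, assuming the ThreadsPerCore field encodes threads per core minus one, which is why a field value of 0 means no SMT.

```rust
/// Value for the 8000_001E ThreadsPerCore field, assumed to be encoded
/// as (threads per core - 1).
fn threads_per_core_field(threads_per_core: u32) -> u32 {
    assert!(threads_per_core >= 1, "a core has at least one thread");
    threads_per_core - 1
}

/// True when some vCPU would be told it has an SMT sibling that does not
/// exist, e.g. 3 vCPUs presented as 2 threads per core.
fn has_missing_sibling(vcpus: u32, threads_per_core: u32) -> bool {
    assert!(threads_per_core >= 1, "a core has at least one thread");
    vcpus % threads_per_core != 0
}

fn main() {
    for &vcpus in [2u32, 3, 4].iter() {
        println!(
            "{} vCPUs, 2 threads/core: missing sibling = {}",
            vcpus,
            has_missing_sibling(vcpus, 2)
        );
    }
}
```

With one thread per core (field value 0) no VM shape can produce a missing sibling, which matches the note that the leaf would be fine in that form.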
    cpuid_leaf!(0x8000001B, 0x00000000, 0x00000000, 0x00000000, 0x00000000),
    cpuid_leaf!(0x8000001C, 0x00000000, 0x00000000, 0x00000000, 0x00000000),
    cpuid_leaf!(0x8000001F, 0x00000000, 0x00000000, 0x00000000, 0x00000000),
    cpuid_leaf!(0x80000021, 0x000D8C47, 0x00000000, 0x00000000, 0x00000000),
This has some new features here in %eax versus what is in RFD 314:
- How did you make the decision for UpperAddressIgnore? Basically it's not part of our EFER support?
- I'm assuming we don't include AutomaticIBRS because we need work on all mitigation passthrough.
- I guess because we don't support SMM there's no point in bit 9.
- PMC2PreciseRetire is not there due to the existing lack of support?
- PrefetchCtlMsr is not there due to not virtualizing the MSR?
- L2TlbSizeX32 is not there because we don't pass through any of the TLB stuff.
- GpOnUserCpuid I'm guessing because we don't virtualize this.
- Why no PREFETCHI support? This doesn't seem to require hypervisor support (bit 19).
- No FP512_DOWNGRADE seems reasonable for now without virtualization.
- Given the pass through of security facts that don't require specific enablement, why not ERAPS on bit 24?
- Similar question on bit 30, SRSO_USER_KERNEL_NO. Given the other NO stuff we have added, seems something worth asking.
UpperAddressIgnore and AutomaticIBRS theoretically should both "just work". If a guest sets them, the bit gets set in the guest's EFER and that's that. I have them clear because I didn't find any OS that would try using UAI (and from the LKML conversations it did not seem like that would change at least there). AutoIBRS is probably fine. I just feel itchy advertising this before at least mentioning it in specialreg.h. And since there'll be a rev to include mitigation MSRs and bits, yeah, I figure it's not that bad to keep it clear at first.
otherwise yeah, missing bits are because there's no support/no MSR/nothing productive for guests. except PREFETCHI, ERAPS, and SRSO_USER_KERNEL_NO, which probably should be set. I'll make sure guests look reasonable with them (though prefetchi I'd tested last week and just didn't set when I'd set movdir)
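To make the bit-by-bit discussion easier to follow, this sketch lists which bits are set in the 8000_0021 %eax value from the table. It also shows what the value would become with the two bits whose positions are given above (ERAPS at bit 24, SRSO_USER_KERNEL_NO at bit 30). The helper name is hypothetical, and the resulting constant is only arithmetic, not a proposed profile value.

```rust
/// EAX from the leaf 8000_0021 entry in TURIN_V1_CPUID.
const TURIN_V1_8000_0021_EAX: u32 = 0x000D_8C47;

/// Bit positions taken from the review comments above.
const ERAPS: u32 = 1 << 24;
const SRSO_USER_KERNEL_NO: u32 = 1 << 30;

/// Return the positions of all set bits, lowest first.
fn set_bits(value: u32) -> Vec<u32> {
    (0u32..32).filter(|&bit| (value & (1u32 << bit)) != 0).collect()
}

fn main() {
    println!("set bits: {:?}", set_bits(TURIN_V1_8000_0021_EAX));
    let with_extra = TURIN_V1_8000_0021_EAX | ERAPS | SRSO_USER_KERNEL_NO;
    println!("with ERAPS and SRSO_USER_KERNEL_NO: {:#010x}", with_extra);
}
```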
    cpuid_leaf!(0x8000001C, 0x00000000, 0x00000000, 0x00000000, 0x00000000),
    cpuid_leaf!(0x8000001F, 0x00000000, 0x00000000, 0x00000000, 0x00000000),
    cpuid_leaf!(0x80000021, 0x000D8C47, 0x00000000, 0x00000000, 0x00000000),
];
I assume that right now we're not including the extended leaf 8000_0026 bits.
right. that would need a smidge of Propolis work and more importantly won't be particularly interesting beyond 8000_001E for the time being.
Given that guest OSes handle these as we expect, the review from rm, and the imminent R17 branch, I'm going to go ahead and include this - if we find something we want different in a Turin CPU profile, this is much, much closer than the "Milan by a different name" that it's replacing. (I'm very cautiously pessimistic about the lack of cache information surviving contact with reality in some way we've not yet seen, for one..)
this is so much simpler than codifying all of the bits describing all of the CPU surface area! what a breath of fresh air!
The feature selection here is the intersection of "PPR says it's there", "useful for guests", "supported in bhyve/Propolis", and "doesn't seem like we're painted into a corner if a future platform changes it." The bits here are also a subset of what I'd seen on a 9365 in a Cosmo.
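As a toy illustration of that intersection, each criterion can be treated as a bitmask over one CPUID register, with a bit advertised only if every mask agrees. The masks below are made-up values, not real PPR or bhyve data, and the function name is hypothetical.

```rust
/// Keep only the feature bits that satisfy all four criteria.
fn select_features(in_ppr: u32, useful: u32, supported: u32, future_safe: u32) -> u32 {
    in_ppr & useful & supported & future_safe
}

fn main() {
    // Made-up masks for a hypothetical four-bit feature register.
    let advertised = select_features(0b1111, 0b0111, 0b0110, 0b1110);
    println!("advertised bits: {:#06b}", advertised);
}
```

Because the result is an intersection, it is by construction a subset of whatever the hardware mask reports.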
While bhyve/Propolis would let guests turn on AutoIBRS, I haven't looked at it in the context of guest OSes much at all (though they do boot when told they're allowed to use AutoIBRS). UAI is in a similar boat but I don't think anyone uses it. So both EFER features are hidden for the time being.
Otherwise, as-is, I've booted Linux, Windows, OmniOS, and FreeBSD with this profile and they seem fine. Linux, for example, omits mentioning caches in /proc/cpuinfo, which makes sense since I've avoided as much cache topology information as I can here. The reasoning for that is discussed more in RFD 314.