[LLVM] Expose Host CPU Feature Detection #14946
junrushao merged 1 commit into apache:main on May 25, 2023
yzh119 approved these changes on May 25, 2023
mei-ye pushed a commit to mei-ye/tvm that referenced this pull request on Jun 1, 2023
A small script that exposes host CPU name, target triple and features:
<details>
```python
import tvm


def main():
    # Look up the global functions registered by the LLVM codegen.
    get_default_target_triple = tvm._ffi.get_global_func("tvm.codegen.llvm.GetDefaultTargetTriple")
    get_process_triple = tvm._ffi.get_global_func("tvm.codegen.llvm.GetProcessTriple")
    get_host_cpu_name = tvm._ffi.get_global_func("tvm.codegen.llvm.GetHostCPUName")
    get_host_cpu_features = tvm._ffi.get_global_func("tvm.codegen.llvm.GetHostCPUFeatures")

    # Query the host.
    target_triple = get_default_target_triple()
    process_triple = get_process_triple()
    host_cpu_name = get_host_cpu_name()
    host_cpu_features = get_host_cpu_features()

    print("target_triple: {}".format(target_triple))
    print("process_triple: {}".format(process_triple))
    print("host_cpu_name: {}".format(host_cpu_name))
    print("host_cpu_features:")
    # The feature map is empty when LLVM fails to detect host features.
    for name, value in host_cpu_features.items():
        print(" {}: {}".format(name, bool(value)))


if __name__ == "__main__":
    main()
```
</details>
Output (AMD CPU):
<details>
```
target_triple: x86_64-unknown-linux-gnu
process_triple: x86_64-unknown-linux-gnu
host_cpu_name: znver2
host_cpu_features:
xsaveopt: True
tsxldtrk: False
sse: True
movdiri: False
mmx: True
pku: False
amx-int8: False
amx-tile: False
rdpid: True
avx512vbmi2: False
cmov: True
widekl: False
f16c: True
bmi: True
gfni: False
avx512cd: False
movdir64b: False
rdseed: True
clwb: True
avx512er: False
avx512f: False
sse4.2: True
avxifma: False
sse2: True
avx512vp2intersect: False
prfchw: True
avx512pf: False
vaes: False
waitpkg: False
amx-bf16: False
prefetchi: False
uintr: False
fxsr: True
bmi2: True
lzcnt: True
avx512vbmi: False
avx512bf16: False
prefetchwt1: False
xsaves: True
movbe: True
rtm: False
pclmul: True
hreset: False
sahf: True
fma4: False
xop: False
vpclmulqdq: False
sgx: False
avx512vnni: False
popcnt: True
xsavec: True
aes: True
avx512vpopcntdq: False
kl: False
avx512bitalg: False
xsave: True
avxvnni: False
raoint: False
clflushopt: True
sse4a: True
avx512bw: False
cx16: True
avxvnniint8: False
amx-fp16: False
cldemote: False
rdrnd: True
ptwrite: False
rdpru: True
avx: True
adx: True
avx512vl: False
pconfig: False
shstk: False
64bit: True
crc32: True
sha: True
cmpccxadd: False
tbm: False
serialize: False
mwaitx: True
avx512ifma: False
avx512fp16: False
clzero: True
avx2: True
cx8: True
fma: True
lwp: False
enqcmd: False
wbnoinvd: True
sse4.1: True
avx512dq: False
ssse3: True
fsgsbase: True
invpcid: False
sse3: True
avxneconvert: False
```
</details>
Note that LLVM does not guarantee that automatic feature detection always succeeds. For newer CPU models combined with older LLVM builds (e.g. an M2 CPU with LLVM 16), the result is usually inaccurate. When CPU feature detection fails, we print a warning message and return an empty dict instead.
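Because an empty dict means "detection failed" rather than "no features", callers may want to tell the two cases apart. The helper below is a minimal sketch, not part of this PR: `enabled_features` is a hypothetical name, and it assumes the value returned by `GetHostCPUFeatures` supports `len()` and `.items()` the way the script above uses it.

```python
from typing import List, Mapping, Optional


def enabled_features(features: Mapping[str, object]) -> Optional[List[str]]:
    """Return the sorted names of enabled features.

    Returns None when the mapping is empty, which this sketch treats as
    "detection failed" (hypothetical helper, not a TVM API).
    """
    if len(features) == 0:
        return None
    # Values may be ints or bools; normalize with bool() as the script does.
    return sorted(name for name, value in features.items() if bool(value))


# Possible usage with the global function exposed by this PR:
#   import tvm
#   features = tvm._ffi.get_global_func("tvm.codegen.llvm.GetHostCPUFeatures")()
#   names = enabled_features(features)
#   if names is None:
#       print("detection failed; query the OS instead")
```

Returning `None` keeps "unknown" separate from an empty list of enabled features, so a caller does not mistake a failed detection for a CPU without extensions.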
To properly detect CPU features on a MacBook, the following system-provided commands are the most accurate:
```bash
sysctl -a machdep.cpu
sysctl -a hw.optional
```
On Linux, it is usually recommended to query the kernel directly:
```bash
cat /proc/cpuinfo
```
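The same information can be read programmatically. The sketch below is an illustration rather than anything this PR adds: `parse_cpuinfo_flags` is a hypothetical helper that extracts the first `flags` line (x86) or `Features` line (Arm) from the text of `/proc/cpuinfo`. Kernel flag names do not always match LLVM's spelling (for example `sse4_2` versus `sse4.2`), so treat the two lists as related, not interchangeable.

```python
from typing import Set


def parse_cpuinfo_flags(cpuinfo_text: str) -> Set[str]:
    """Return the feature flags listed for the first CPU in /proc/cpuinfo text.

    Hypothetical helper: looks for a "flags" (x86) or "Features" (Arm) line.
    Returns an empty set when neither line is present.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() in ("flags", "features"):
            return set(value.split())
    return set()


# Possible usage on a Linux host:
#   with open("/proc/cpuinfo") as f:
#       flags = parse_cpuinfo_flags(f.read())
#   print("avx2" in flags)
```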