ggml : riscv: add xtheadvector support #13720
@ggerganov Gentle ping on this for review when you get a chance.
ggerganov left a comment
We should figure out a way to rework these ifdef branches so the code is more readable. Not for this PR, just a general note that it is something we should do.
I plan to split the arch-dependent implementations in |
Let me first sync Btw, I wonder if we should first start with renaming the "aarch64" misnomer in the codebase. The code in |
Yes, I think it would be a good idea to rename it to something like |
@xctan Sync is complete. We could use some help with reorganizing the source tree, so feel free to help out. About the |
This PR builds upon #12530 to introduce k-quant support for the older RVV v0.7.1 implementation (xtheadvector).
Additionally, it updates zfh extension detection to use the built-in compiler macro, eliminating the need for an extra definition.
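As a rough illustration of compile-time detection through built-in macros, here is a minimal sketch; it is not the code from this PR. `__riscv_zfh` and `__riscv_v` follow the RISC-V C API naming convention for extension test macros. `__riscv_xtheadvector` is assumed to be defined by toolchains that support the T-Head vector extension. The helper names `has_zfh`, `vector_flavour`, and `describe_target` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* True when the compiler targets the zfh (scalar half-precision) extension.
 * The toolchain predefines __riscv_zfh when -march includes zfh, so no
 * extra -D definition from the build system is needed. */
static bool has_zfh(void) {
#if defined(__riscv_zfh)
    return true;
#else
    return false;
#endif
}

/* Name of the vector flavour selected at compile time.
 * Assumption: __riscv_xtheadvector is the macro for RVV 0.7.1 (T-Head). */
static const char *vector_flavour(void) {
#if defined(__riscv_xtheadvector)
    return "xtheadvector";
#elif defined(__riscv_v)
    return "rvv";
#else
    return "scalar";
#endif
}

/* Write a one-line summary into buf and return the snprintf result. */
static int describe_target(char *buf, size_t len) {
    return snprintf(buf, len, "zfh=%d vector=%s",
                    has_zfh() ? 1 : 0, vector_flavour());
}
```

On a non-RISC-V host none of the macros are defined, so the summary reports `zfh=0 vector=scalar`. Checking `__riscv_xtheadvector` before `__riscv_v` is a defensive ordering choice in this sketch, not something the PR states.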
Evaluation
Build instructions
Verification
Test model: gemma-3-4b-it-GGUF, Q4_K_M quantization. The results of llama-perplexity are:
Performance
Using the same model as above on SG2042.