candle-quantized-qwen3-moe

Qwen3 MoE GGUF provides the Qwen3 32B MoE models, developed by Alibaba Cloud, in GGUF format.

Running the example

# Local GGUF file
cargo run --features cuda --example quantized-qwen3-moe --release -- --model /path/Qwen3-30B-A3B-Instruct-2507-Q4_K_M.gguf --prompt "Write a function to count prime numbers up to N."

Models available via the --which argument: 16b_q2k, 16b_q4k, 16b_q6k, 16b_q80; 32b_q2k, 32b_q4k, 32b_q6k, 32b_q80.

# Obtained from Hugging Face
cargo run --features cuda --example quantized-qwen3-moe --release -- --which 32b_q4k --prompt "A train is travelling at 120mph, how far does it travel in 3 minutes 30 seconds?"
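
The --which names listed above appear to follow a size_quantization pattern. The sketch below illustrates that reading only: `parse_which` is a hypothetical helper, not part of the example's main.rs, and the mapping of the suffixes to the GGML quantization names Q2_K, Q4_K, Q6_K and Q8_0 is an assumption based on the naming.

```rust
/// Hypothetical helper (not from main.rs): split a `--which` value such as
/// "32b_q4k" into its size tag and an ASSUMED GGML quantization name.
fn parse_which(which: &str) -> Option<(&str, &'static str)> {
    // Expect exactly "<size>_<quant>", e.g. "16b_q80".
    let (size, quant) = which.split_once('_')?;

    // Only the two sizes listed in this README are accepted.
    if size != "16b" && size != "32b" {
        return None;
    }

    // Assumed mapping from the short suffix to a GGML quantization type.
    let quant = match quant {
        "q2k" => "Q2_K",
        "q4k" => "Q4_K",
        "q6k" => "Q6_K",
        "q80" => "Q8_0",
        _ => return None,
    };
    Some((size, quant))
}

fn main() {
    // The last entry is deliberately not in the README's list.
    for which in ["16b_q2k", "32b_q4k", "32b_q80", "7b_q4k"] {
        match parse_which(which) {
            Some((size, quant)) => println!("{which}: size={size}, quant={quant}"),
            None => println!("{which}: not a listed --which value"),
        }
    }
}
```

Returning an Option keeps unknown names from being silently accepted, so a typo in the argument fails early rather than selecting an unintended model.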
