This folder provides Olive optimization, fine-tuning, quantization, and evaluation recipes for `meta-llama/Llama-3.2-1B-Instruct`.
Each recipe is a self-contained JSON config passed to the Olive CLI: `olive run --config <file>.json`.
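For orientation, the recipes follow Olive's usual config layout: an input model, a dictionary of named passes, and an output directory. The snippet below is a minimal illustrative sketch, not copied from any recipe in this folder; the field names follow Olive's documented config schema as generally used, and the actual files carry many more options.

```json
{
  "input_model": {
    "type": "HfModel",
    "model_path": "meta-llama/Llama-3.2-1B-Instruct"
  },
  "passes": {
    "m": { "type": "ModelBuilder", "precision": "fp16" }
  },
  "output_dir": "models"
}
```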
Install dependencies (make sure you are in the `olive-recipes/meta-llama-Llama-3.2-1B-Instruct/olive` directory):

```bash
python -m pip install -r requirements.txt
```

Typical steps:

```bash
olive run --config qlora.json
```

Output models & adapters are saved under the `output_dir` (default `models/`).
| File | Goal | Main Pass Chain (order) |
|---|---|---|
| `qlora.json` | QLoRA PEFT finetune + export + ORT opt + extract adapters | q (qlora) → m (ModelBuilder fp16) → o (OrtTransformersOptimization fp16) → e (ExtractAdapters) |
| `loha.json` | LoHa finetune + ONNX export + ORT opt + extract | l (loha) → c (OnnxConversion) → o (ORT opt) → e (ExtractAdapters) → m (metadata) |
| `lokr.json` | LoKr finetune + ONNX export + ORT opt + extract | l (lokr) → c (OnnxConversion) → o → e → m |
| `dora.json` | DoRA finetune + ORT opt + extract | d (dora) → m (ModelBuilder fp16) → o → e |
| `rtn.json` | Block-wise RTN quantization (ONNX) | m (ModelBuilder fp16) → q (OnnxBlockWiseRtnQuantization) |
| `hqq.json` | HQQ quantization (ONNX) | m (ModelBuilder fp16) → q (OnnxHqqQuantization) |
| `lmeval.json` | HF (fp16/fp32) evaluation with LMEval | evaluator only |
| `lmeval_onnx.json` | INT4 ModelBuilder + LMEval | mb (ModelBuilder int4) + evaluator |
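The letters in the pass-chain column are the keys of each recipe's `passes` dictionary; Olive runs the passes in the order listed. As an illustration only, the `qlora.json` chain roughly corresponds to a passes block of the following shape. Pass options (training data configs, LoRA hyperparameters, optimization flags) are omitted, and the exact fields may differ from the actual recipe, so treat the files themselves as authoritative.

```json
{
  "passes": {
    "q": { "type": "QLoRA" },
    "m": { "type": "ModelBuilder", "precision": "fp16" },
    "o": { "type": "OrtTransformersOptimization", "float16": true },
    "e": { "type": "ExtractAdapters" }
  }
}
```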
QLoRA training + optimization:

```bash
olive run --config qlora.json
```

LoHa adapter training and export to ONNX:

```bash
olive run --config loha.json
```

HQQ quantization (after the ONNX build inside the pass chain):

```bash
olive run --config hqq.json
```

Run LM evaluation on the HF model:

```bash
olive run --config lmeval.json
```

Evaluate the INT4 ONNX build:

```bash
olive run --config lmeval_onnx.json
```

Clean the cache for a fresh run (example):

```bash
olive run --config qlora.json --clean_cache
```