fix: TypeError when loading base model remotely in convert_lora_to_gguf #17385
Merged
CISC merged 4 commits into ggml-org:master on Nov 20, 2025
Conversation
CISC
reviewed
Nov 19, 2025
Member
CISC
left a comment
Thanks, I was planning to address this, but hadn't gotten around to it yet.
I was thinking of changing this function instead, so that the rest only needs minor changes:

from pathlib import Path
from typing import Any

from transformers import AutoConfig

def load_hparams_from_hf(hf_model_id: str) -> tuple[dict[str, Any], Path | None]:
    from huggingface_hub import try_to_load_from_cache
    # normally, an adapter does not come with the base model config, so we load it via AutoConfig
    config = AutoConfig.from_pretrained(hf_model_id)
    # returns the cached path of config.json as a str on a cache hit, otherwise a non-str value
    cache_dir = try_to_load_from_cache(hf_model_id, "config.json")
    cache_dir = Path(cache_dir).parent if isinstance(cache_dir, str) else None
    return config.to_dict(), cache_dir
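The `isinstance(cache_dir, str)` check matters because `try_to_load_from_cache` only returns a plain string path on a cache hit; on a miss it returns a non-str value, and only a real hit yields a usable local directory. A minimal, standalone sketch of that normalization step (the helper name is illustrative, not from the PR):

```python
from pathlib import Path
from typing import Optional


def cache_path_to_dir(cached: object) -> Optional[Path]:
    # try_to_load_from_cache gives a str path to the cached config.json on a hit,
    # or a non-str sentinel (e.g. None) on a miss; only a hit maps to a directory.
    return Path(cached).parent if isinstance(cached, str) else None


# cache hit: the parent directory of config.json is the local snapshot dir
print(cache_path_to_dir("/cache/snapshots/abc/config.json"))
# cache miss: no local directory, so callers must fall back to remote loading
print(cache_path_to_dir(None))
```

On a miss the caller gets `None`, which is exactly the case the rest of this PR handles by loading tensors remotely instead of from disk.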
Contributor
Author
Hi @CISC, thank you for your guidance :D My implementation introduced additional variables, which was not elegant. I copied this code snippet directly and adjusted the relevant call sites:

def load_hparams_from_hf(hf_model_id: str) -> tuple[dict[str, Any], Path | None]:
    from huggingface_hub import try_to_load_from_cache
    # normally, adapter does not come with base model config, we need to load it from AutoConfig
    config = AutoConfig.from_pretrained(hf_model_id)
    cache_dir = try_to_load_from_cache(hf_model_id, "config.json")
    cache_dir = Path(cache_dir).parent if isinstance(cache_dir, str) else None
    return config.to_dict(), cache_dir

Are there any other parts of the code that need to be adjusted? Related tests:
python convert_lora_to_gguf.py --base Qwen2.5-1.5B-Instruct lora_path
INFO:lora-to-gguf:Loading base model: Qwen2.5-1.5B-Instruct
INFO:hf-to-gguf:gguf: indexing model part 'model.safetensors'
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:lora-to-gguf:Exporting model...
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:Set model quantization version
INFO:hf-to-gguf:Set model tokenizer
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:lora_path/Lora_Path-F16.gguf: n_tensors = 0, total_size = negligible - metadata only
Writing: 0.00byte [00:00, ?byte/s]
INFO:lora-to-gguf:Model successfully exported to lora_path/Lora_Path-F16.gguf
python convert_lora_to_gguf.py --base-model-id Qwen/Qwen2.5-1.5B-Instruct lora_path
INFO:lora-to-gguf:Loading base model from Hugging Face: Qwen/Qwen2.5-1.5B-Instruct
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:lora-to-gguf:Exporting model...
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:Set model quantization version
INFO:hf-to-gguf:Set model tokenizer
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:lora_path/Lora_Path-F16.gguf: n_tensors = 0, total_size = negligible - metadata only
Writing: 0.00byte [00:00, ?byte/s]
INFO:lora-to-gguf:Model successfully exported to lora_path/Lora_Path-F16.gguf
python convert_lora_to_gguf.py lora_path
INFO:lora-to-gguf:Loading base model from Hugging Face: Qwen/Qwen2.5-1.5B-Instruct
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:lora-to-gguf:Exporting model...
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:Set model quantization version
INFO:hf-to-gguf:Set model tokenizer
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:lora_path/Lora_Path-F16.gguf: n_tensors = 0, total_size = negligible - metadata only
Writing: 0.00byte [00:00, ?byte/s]
INFO:lora-to-gguf:Model successfully exported to lora_path/Lora_Path-F16.gguf
CISC
reviewed
Nov 20, 2025
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Contributor
Author
I've resubmitted the code. Related tests:
python convert_lora_to_gguf.py --base Qwen2.5-1.5B-Instruct lora_path
INFO:lora-to-gguf:Loading base model: Qwen2.5-1.5B-Instruct
INFO:hf-to-gguf:gguf: indexing model part 'model.safetensors'
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:lora-to-gguf:Exporting model...
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:Set model quantization version
INFO:hf-to-gguf:Set model tokenizer
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:lora_path/Lora_Path-F16.gguf: n_tensors = 0, total_size = negligible - metadata only
Writing: 0.00byte [00:00, ?byte/s]
INFO:lora-to-gguf:Model successfully exported to lora_path/Lora_Path-F16.gguf
python convert_lora_to_gguf.py --base-model-id Qwen/Qwen2.5-1.5B-Instruct lora_path
INFO:lora-to-gguf:Loading base model from Hugging Face: Qwen/Qwen2.5-1.5B-Instruct
INFO:hf-to-gguf:Using remote model with HuggingFace id: Qwen/Qwen2.5-1.5B-Instruct
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:lora-to-gguf:Exporting model...
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:Set model quantization version
INFO:hf-to-gguf:Set model tokenizer
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:lora_path/Qwen-Qwen2.5-1.5B-Instruct-F16.gguf: n_tensors = 0, total_size = negligible - metadata only
Writing: 0.00byte [00:00, ?byte/s]
INFO:lora-to-gguf:Model successfully exported to lora_path/Qwen-Qwen2.5-1.5B-Instruct-F16.gguf
INFO:lora-to-gguf:Loading base model from Hugging Face: Qwen/Qwen2.5-1.5B-Instruct
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:lora-to-gguf:Exporting model...
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:Set model quantization version
INFO:hf-to-gguf:Set model tokenizer
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:lora_path/Lora_Path-F16.gguf: n_tensors = 0, total_size = negligible - metadata only
Writing: 0.00byte [00:00, ?byte/s]
INFO:lora-to-gguf:Model successfully exported to lora_path/Lora_Path-F16.gguf
CISC
approved these changes
Nov 20, 2025
Anico2
added a commit
to Anico2/llama.cpp
that referenced
this pull request
Jan 15, 2026
…ora_to_gguf (ggml-org#17385) * fix: TypeError when loading base model remotely in convert_lora_to_gguf * refactor: simplify base model loading using cache_dir from HuggingFace * Update convert_lora_to_gguf.py Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> * feat: add remote_hf_model_id to trigger lazy mode in LoRA converter --------- Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
blime4
referenced
this pull request
in blime4/llama.cpp
Feb 5, 2026
…ora_to_gguf (#17385) * fix: TypeError when loading base model remotely in convert_lora_to_gguf * refactor: simplify base model loading using cache_dir from HuggingFace * Update convert_lora_to_gguf.py Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> * feat: add remote_hf_model_id to trigger lazy mode in LoRA converter --------- Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
When loading the base model from Hugging Face, dir_base_model is None, causing a TypeError in index_tensors(). Passes remote_hf_model_id to LoraModel to load tensors from Hugging Face. Related issue:
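The failure mode and the guard described above can be sketched in isolation. This is a hypothetical mock, not the converter's actual code: index_tensors and the remote path live in convert_lora_to_gguf.py, and both function bodies here only illustrate the None-path crash and the branch the PR adds:

```python
from pathlib import Path


def index_tensors(dir_base_model):
    # Path(None) raises TypeError — this is the crash reported in the PR
    # when no local base model directory exists.
    return sorted(Path(dir_base_model).glob("*.safetensors"))


def load_base_tensors(dir_base_model, remote_hf_model_id=None):
    # The fix: when there is no local directory but a Hub id is known,
    # switch to remote (lazy) loading instead of indexing local files.
    if dir_base_model is None and remote_hf_model_id is not None:
        return ("remote", remote_hf_model_id)
    return ("local", index_tensors(dir_base_model))


print(load_base_tensors(None, "Qwen/Qwen2.5-1.5B-Instruct"))
```

Without the guard, the None directory reaches index_tensors() and the conversion aborts with the TypeError this PR fixes.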