Description
Technically, I'm just grabbing `.base_model.model` directly rather than calling `get_base_model()`, but that should have the same effect, since that's all `get_base_model()` does when the active peft config is not a `PromptLearningConfig`, as seen here.
After loading a LLaMA model with a LoRA, like so:

```python
shared.model = PeftModel.from_pretrained(shared.model, Path(f"{shared.args.lora_dir}/{lora_names[0]}"), **params)
```

the `PeftModel` loads fine and everything works as expected. However, I cannot figure out how to get the original model back so that no LoRA is active when I run inference.
The code I'm using is from here:

```python
shared.model.disable_adapter()
shared.model = shared.model.base_model.model
```

This gives me the model back as a `LlamaForCausalLM`, but when I run inference, the LoRA is still applied. I made a couple of test LoRAs so that there would be no question as to whether the LoRA is still loaded. They can be found here: https://huggingface.co/clayshoaf/AB-Lora-Test
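One thing I noticed while digging: in the peft source, `disable_adapter` appears to be defined as a context manager, which (if I'm reading it right) would mean calling it as a plain statement builds the context object but never actually toggles anything. A self-contained sketch of that pattern (the `ToyAdapterModel` class is just my illustration, not a real peft class):

```python
from contextlib import contextmanager

class ToyAdapterModel:
    """Hypothetical stand-in for PeftModel; `adapters_enabled` mimics the LoRA layers' state."""
    def __init__(self):
        self.adapters_enabled = True

    @contextmanager
    def disable_adapter(self):
        # Adapters are only switched off inside the `with` block,
        # and are restored on exit.
        self.adapters_enabled = False
        try:
            yield
        finally:
            self.adapters_enabled = True

model = ToyAdapterModel()

# Calling it as a plain statement creates the context manager but never enters it:
model.disable_adapter()
print(model.adapters_enabled)  # still True -- nothing was toggled

# Entering the context is what actually disables the adapters:
with model.disable_adapter():
    print(model.adapters_enabled)  # False inside the block
print(model.adapters_enabled)  # True again after the block
```

If that reading is correct, it would explain why calling `shared.model.disable_adapter()` as a bare statement leaves the LoRA applied.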
I am digging around right now, and I see this line, `if isinstance(module, LoraLayer):`, from:

```python
def _set_adapter_layers(self, enabled=True):
    for module in self.model.modules():
        if isinstance(module, LoraLayer):
            module.disable_adapters = False if enabled else True
```

So I checked in the program, and if I load a LoRA and run

```python
[module for module in shared.model.base_model.model.modules() if hasattr(module, "disable_adapters")]
```

it returns a bunch of modules of type `Linear8bitLt` (if loaded in 8-bit) or `Linear4bitLt` (if loaded in 4-bit).
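To make sure I understand the `_set_adapter_layers` logic above, here is a self-contained sketch of the same walk-and-toggle pattern (the `ToyLoraLayer` and `ToyModel` names are hypothetical stand-ins I made up, not real peft or torch classes):

```python
class ToyLoraLayer:
    """Hypothetical stand-in for peft's LoraLayer: just holds a disable_adapters flag."""
    def __init__(self):
        self.disable_adapters = False

class ToyModel:
    """Hypothetical container exposing modules() like a torch.nn.Module would."""
    def __init__(self, n_layers=3):
        self._modules = [ToyLoraLayer() for _ in range(n_layers)]

    def modules(self):
        return iter(self._modules)

def set_adapter_layers(model, enabled=True):
    # Same shape as peft's _set_adapter_layers: walk every module and
    # flip the flag only on the LoRA layers.
    for module in model.modules():
        if isinstance(module, ToyLoraLayer):
            module.disable_adapters = not enabled

model = ToyModel()
set_adapter_layers(model, enabled=False)   # disable LoRA: flag becomes True
print(all(m.disable_adapters for m in model.modules()))  # True

set_adapter_layers(model, enabled=True)    # re-enable LoRA: flag back to False
print(any(m.disable_adapters for m in model.modules()))  # False
```

Note that in this scheme "disabling the adapter" means setting `disable_adapters = True` on each layer, since `enabled=False` maps to `disable_adapters = True` in the quoted code.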
Would it work to set those modules' `disable_adapters` value to `True` (since, per the code above, `enabled=False` maps to `disable_adapters = True`)? I don't want to hack around too much in the code, because I don't have a deep enough understanding to be sure I won't break something else in the process.
If that won't work, is there something else that I should be doing?