Reminder
System Info
- llamafactory version: 0.8.4.dev0
- Platform: Linux-5.4.0-146-generic-x86_64-with-glibc2.27
- Python version: 3.11.9
- PyTorch version: 2.4.0+cu121 (GPU)
- Transformers version: 4.43.4
- Datasets version: 2.20.0
- Accelerate version: 0.32.0
- PEFT version: 0.12.0
- TRL version: 0.9.6
- GPU type: NVIDIA A100-PCIE-40GB
- DeepSpeed version: 0.15.0
- Bitsandbytes version: 0.43.3
- vLLM version: 0.5.5
Reproduction
In the current highest supported version of accelerate (0.32.0), loading the 7B model with the following code in my environment results in a speed of "Loading checkpoint shards 00:30, 7.63s/it". However, in version 0.33.0 of accelerate, the loading speed is "Loading checkpoint shards 00:04, 1.14s/it".
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(model_name)
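For reference, the timings above were taken by wrapping the `from_pretrained` call in a simple wall-clock timer; a minimal sketch of that measurement (the helper name `timed` and the commented-out model call are illustrative, not part of LLaMA-Factory):

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Usage (assumes transformers is installed and model_name is set):
# from transformers import AutoModelForCausalLM
# model, seconds = timed(AutoModelForCausalLM.from_pretrained, model_name)
# print(f"Loaded checkpoint in {seconds:.1f}s")
```

Running this once under accelerate 0.32.0 and once under 0.33.0 in the same environment reproduces the gap described above.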
Additionally, when loading the model in web UI chat mode, loading takes about twice as long.
Expected behavior
Please raise the accelerate requirement to version 0.33.0, or let me know of any mistake on my side that could make model loading this slow under accelerate 0.32.0.
Others
No response