llama_pro training of Llama-3.2-1B-Instruct fails with NotImplementedError: Cannot copy out of meta tensor; no data! #6812

@liboaccn

Description

Reminder

  • I have read the above rules and searched the existing issues.

System Info

While running llama_pro model expansion, I get the error below. The model is Llama-3.2-1B-Instruct.

python scripts/llama_pro.py --model_name_or_path ~/data/model/Llama-3.2-1B-Instruct --output_dir /model/pro/llama3-8b-pro --num_expand 8
Add layer 0 copied from layer 0.
Add layer 1 copied from layer 1.
Add layer 2 expanded from layer 1.
Add layer 3 copied from layer 2.
Add layer 4 copied from layer 3.
Add layer 5 expanded from layer 3.
Add layer 6 copied from layer 4.
Add layer 7 copied from layer 5.
Add layer 8 expanded from layer 5.
Add layer 9 copied from layer 6.
Add layer 10 copied from layer 7.
Add layer 11 expanded from layer 7.
Add layer 12 copied from layer 8.
Add layer 13 copied from layer 9.
Add layer 14 expanded from layer 9.
Add layer 15 copied from layer 10.
Add layer 16 copied from layer 11.
Add layer 17 expanded from layer 11.
Add layer 18 copied from layer 12.
Add layer 19 copied from layer 13.
Add layer 20 expanded from layer 13.
Add layer 21 copied from layer 14.
Add layer 22 copied from layer 15.
Add layer 23 expanded from layer 15.
Note that `shard_checkpoint` is deprecated and will be removed in v4.44. We recommend you using split_torch_state_dict_into_shards from huggingface_hub library
Save weights:  67%|████████████████████████████████████████████████████████████████████                                  | 2/3 [00:08<00:04,  4.16s/it]
Traceback (most recent call last):
  File "/home/users/admin/code/LLaMA-Factory/scripts/llama_pro.py", line 132, in <module>
    fire.Fire(block_expansion)
  File "/home/users/admin/miniconda3/lib/python3.9/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/users/admin/miniconda3/lib/python3.9/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/users/admin/miniconda3/lib/python3.9/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/home/users/admin/code/LLaMA-Factory/scripts/llama_pro.py", line 111, in block_expansion
    save_file(shard, os.path.join(output_dir, shard_file), metadata={"format": "pt"})
  File "/home/users/admin/miniconda3/lib/python3.9/site-packages/safetensors/torch.py", line 286, in save_file
    serialize_file(_flatten(tensors), filename, metadata=metadata)
  File "/home/users/admin/miniconda3/lib/python3.9/site-packages/safetensors/torch.py", line 496, in _flatten
    return {
  File "/home/users/admin/miniconda3/lib/python3.9/site-packages/safetensors/torch.py", line 500, in <dictcomp>
    "data": _tobytes(v, k),
  File "/home/users/admin/miniconda3/lib/python3.9/site-packages/safetensors/torch.py", line 422, in _tobytes
    tensor = tensor.to("cpu")
NotImplementedError: Cannot copy out of meta tensor; no data!   
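For context (an explanation, not part of the original report): this error typically means some tensors in the state dict still live on PyTorch's meta device, which records only shape and dtype but allocates no storage. safetensors' save path calls `tensor.to("cpu")` on each entry (see `_tobytes` in the traceback), and that copy is impossible for a meta tensor. A minimal sketch reproducing the failure and one possible workaround; the `empty_like` materialization shown here is an illustrative assumption, not the project's official fix:

```python
import torch

# A meta tensor carries shape/dtype metadata but no actual data.
t = torch.empty(4, 4, device="meta")
assert t.device.type == "meta"

# Same call that safetensors makes in _tobytes; it fails for meta tensors.
try:
    t.to("cpu")
except NotImplementedError as e:
    print(type(e).__name__)  # NotImplementedError

# Workaround sketch (assumption): allocate real, uninitialized storage with
# the same shape/dtype on a concrete device, then fill it from real weights.
materialized = torch.empty_like(t, device="cpu")
assert materialized.device.type == "cpu"
```

In practice this usually points at how the model was loaded (e.g. lazy/offloaded loading leaving layers on the meta device), so the fix belongs at load time rather than at save time.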

Reproduction

python scripts/llama_pro.py --model_name_or_path ~/data/model/Llama-3.2-1B-Instruct --output_dir /model/pro/llama3-8b-pro --num_expand 8

Others

No response

    Labels

solved: This problem has been already solved
