Reminder
System Info
After exporting the reward model, I load it like this:
model = AutoModelForCausalLMWithValueHead.from_pretrained('...')
This raises a warning: no v_head weight is found. This IS expected if you are not resuming PPO training.
Is this normal, and can it be safely ignored? I want to use the saved reward model for inference to output values.
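One way to narrow this down might be to check whether the exported checkpoint contains any value-head entries at all. The sketch below is only an illustration: `find_value_head_keys` is a hypothetical helper, and the `v_head.` prefix and the example key names are assumptions based on how TRL names the value head, not something confirmed for this particular export.

```python
def find_value_head_keys(state_dict_keys, prefix="v_head."):
    """Return the checkpoint keys that appear to belong to the value head.

    `prefix` is an assumption: TRL's value-head wrapper stores the head under
    the attribute name `v_head`, so its parameters are expected to show up
    with keys such as `v_head.summary.weight`.
    """
    return sorted(k for k in state_dict_keys if k.startswith(prefix))


# Illustrative key names only; a real checkpoint would be read with
# torch.load(...) or safetensors and its keys passed in here.
example_keys = [
    "model.embed_tokens.weight",
    "lm_head.weight",
    "v_head.summary.weight",
    "v_head.summary.bias",
]

print(find_value_head_keys(example_keys))
print(find_value_head_keys(["model.embed_tokens.weight", "lm_head.weight"]))
```

If the list comes back empty for the exported files, the value head was presumably not saved with the model, in which case the values produced at inference would come from an untrained head.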
Reproduction
from trl import AutoModelForCausalLMWithValueHead
model = AutoModelForCausalLMWithValueHead.from_pretrained('...')
Expected behavior
No response
Others
No response