What would you like to be added:
Take Mistral as an example: its Hugging Face repository contains not only the chunked (sharded) model weights but also the consolidated model weights. When downloading models from Hugging Face, we should account for this; otherwise we will download two copies of the model weights.
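
As a rough sketch of the idea, the filter below drops consolidated checkpoints from a file listing before download. The pattern list, the helper name `drop_consolidated`, and the example file names are assumptions for illustration, not taken from any specific repository; the real naming convention should be checked per model.

```python
from fnmatch import fnmatch

# Assumed naming convention for consolidated checkpoints (hypothetical list;
# adjust per repository).
CONSOLIDATED_PATTERNS = ["consolidated*.safetensors", "consolidated*.pt"]


def drop_consolidated(filenames, patterns=CONSOLIDATED_PATTERNS):
    """Return the filenames that match none of the consolidated-weight patterns."""
    return [
        name for name in filenames
        if not any(fnmatch(name, pattern) for pattern in patterns)
    ]


# Illustrative listing only; it is not fetched from the Hub.
files = [
    "config.json",
    "consolidated.safetensors",
    "model-00001-of-00003.safetensors",
    "model-00002-of-00003.safetensors",
    "model-00003-of-00003.safetensors",
    "tokenizer.json",
]

kept = drop_consolidated(files)
print(kept)
```

If the download goes through `huggingface_hub.snapshot_download`, the same glob patterns could probably be passed via its `ignore_patterns` argument instead of filtering by hand.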
Why is this needed:
Faster model loading, because the duplicate copy of the weights is no longer downloaded.
Completion requirements:
This enhancement requires the following artifacts:
The artifacts should be linked in subsequent comments.