Support speculative decoding with llama.cpp #240
Closed
Labels
feature — Categorizes issue or PR as related to a new feature.
help wanted — Extra attention is needed.
needs-priority — Indicates a PR lacks a label and requires one.
needs-triage — Indicates an issue or PR lacks a label and requires one.
What would you like to be added:
We already support speculative decoding with vLLM. Now that llama.cpp has added this feature as well, we should support it too; see ggml-org/llama.cpp#10455.
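For context, the feature being requested works roughly as follows: a small, cheap draft model proposes several tokens ahead, and the large target model verifies them in a single pass, keeping the longest accepted prefix. The sketch below is a simplified greedy-decoding illustration of that loop, not llama.cpp's or vLLM's actual implementation; the model callbacks are hypothetical stand-ins.

```python
from typing import Callable, List

def speculative_step(
    prefix: List[int],
    draft_next: Callable[[List[int]], int],   # cheap draft model (stand-in)
    target_next: Callable[[List[int]], int],  # expensive target model (stand-in)
    num_draft: int = 4,
) -> List[int]:
    """One speculative round: draft proposes `num_draft` tokens, target verifies."""
    # 1. Draft model proposes tokens autoregressively.
    proposed = []
    ctx = list(prefix)
    for _ in range(num_draft):
        tok = draft_next(ctx)
        proposed.append(tok)
        ctx.append(tok)

    # 2. Target model checks each proposed token; keep the longest prefix
    #    it agrees with. On the first mismatch, the target's own token
    #    replaces the rejected one and the round ends.
    accepted = []
    ctx = list(prefix)
    for tok in proposed:
        expect = target_next(ctx)
        if expect != tok:
            accepted.append(expect)
            return accepted
        accepted.append(tok)
        ctx.append(tok)

    # All drafts accepted: the target contributes one bonus token,
    # so a fully accepted round yields num_draft + 1 tokens.
    accepted.append(target_next(ctx))
    return accepted
```

When the draft model agrees with the target, each round emits `num_draft + 1` tokens for a single target verification pass, which is where the speedup comes from.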
Why is this needed:
Completion requirements:
This enhancement requires the following artifacts:
The artifacts should be linked in subsequent comments.