Note: This issue was copied from ggml-org#4218
Original Author: @ggerganov
Original Issue Number: ggml-org#4218
Created: 2023-11-25T17:04:06Z
There have been a few reports that grammar sampling can significantly degrade performance.
It would be nice to profile and optimize the implementation - there should be room for improvement.
Already on-going efforts:
- reserve space in `decode_utf8` ggml-org/llama.cpp#4210
- `llama_token_to_piece` when sampling grammars ggml-org/llama.cpp#4213

Probably worth looking into multi-threading the implementation as well.
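To illustrate the kind of optimization #4210 names, here is a minimal, hypothetical sketch of a `decode_utf8`-style helper. It is not the actual llama.cpp implementation; it only demonstrates the technique of pre-reserving the output vector so a hot sampling loop avoids repeated reallocations (worst case, one code point per input byte):

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical sketch: decode a UTF-8 string into code points,
// reserving output capacity up front to avoid reallocations
// (the optimization idea behind ggml-org/llama.cpp#4210).
static std::vector<uint32_t> decode_utf8(const std::string & src) {
    std::vector<uint32_t> result;
    result.reserve(src.size() + 1); // worst case: 1 code point per byte, plus terminator

    size_t pos = 0;
    while (pos < src.size()) {
        const uint8_t first = src[pos];
        int      len = 1;
        uint32_t cp  = first;
        // Determine sequence length and initial code-point bits from the lead byte.
        if      ((first & 0x80) == 0x00) { len = 1; cp = first;        }
        else if ((first & 0xE0) == 0xC0) { len = 2; cp = first & 0x1F; }
        else if ((first & 0xF0) == 0xE0) { len = 3; cp = first & 0x0F; }
        else if ((first & 0xF8) == 0xF0) { len = 4; cp = first & 0x07; }
        // Fold in the continuation bytes (6 payload bits each).
        for (int i = 1; i < len && pos + i < src.size(); ++i) {
            cp = (cp << 6) | (static_cast<uint8_t>(src[pos + i]) & 0x3F);
        }
        result.push_back(cp);
        pos += len;
    }
    result.push_back(0); // sentinel terminator
    return result;
}
```

Since the grammar code calls this per candidate token, skipping the incremental growth of the vector is a cheap, low-risk win; profiling would show whether it is a significant one.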