
[Relax][Web] Add ApplyPresenceAndFrequencyPenalty #16504

Merged
MasterJH5574 merged 1 commit into apache:main from CharlieFRuan:pr-0201-freq-penalty on Feb 1, 2024

Conversation

@CharlieFRuan
Member

This PR adds ApplyPresenceAndFrequencyPenalty() to lm_support.cc and exposes it to Web runtime.

This is essentially the same as applyRepetitionPenalty, except that repeated tokens are penalized in a different way, following https://platform.openai.com/docs/guides/text-generation/frequency-and-presence-penalties.
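
For reference, the OpenAI-style rule subtracts from each candidate token's logit a term proportional to how many times the token has already been generated (frequency penalty), plus a flat term if it has appeared at all (presence penalty). A minimal TypeScript sketch of that rule is below; the names are illustrative, and the actual implementation is the C++ function in lm_support.cc:

```typescript
// Sketch of the OpenAI-style presence/frequency penalty (illustrative names).
function penalizeLogits(
  logits: Float32Array,              // raw scores, one per vocabulary token
  tokenCounts: Map<number, number>,  // token id -> occurrences in the output so far
  presencePenalty: number,
  frequencyPenalty: number,
): void {
  for (const [tokenId, count] of tokenCounts) {
    // The frequency term grows with the repetition count; the presence term
    // is a one-time cost for any token that has appeared at least once.
    logits[tokenId] -= count * frequencyPenalty + (count > 0 ? presencePenalty : 0);
  }
}
```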

Tested end-to-end with WebLLM.

@CharlieFRuan CharlieFRuan changed the title from [Relax][WebGPU] Add ApplyPresenceAndFrequencyPenalty to [Relax][Web] Add ApplyPresenceAndFrequencyPenalty on Feb 1, 2024
@CharlieFRuan
Member Author

cc @tqchen @MasterJH5574

@MasterJH5574 MasterJH5574 merged commit 5c45ae8 into apache:main Feb 1, 2024
CharlieFRuan added a commit to mlc-ai/web-llm that referenced this pull request Feb 15, 2024
This PR adds `GenerationConfig`, which allows per-generation configs.
See `get-started.ts` for its example usage:

```typescript
let genConfig: webllm.GenerationConfig = {
  presence_penalty: 0.5,
  frequency_penalty: 0.5,
  max_gen_len: 20,
  // stop: ["is", "Canada"]  // for demonstration purposes
};

const prompt0 = "What is the capital of Canada?";
const reply0 = await chat.generate(prompt0, generateProgressCallback, 1, genConfig);
```

In addition to the existing fields in `mlc-chat-config.json`, we also
support the OpenAI-like fields `frequency_penalty`, `presence_penalty`, and
`stop`, preparing for the upcoming OpenAI-like APIs.
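
For illustration only, the per-generation fields named above could be typed roughly as follows; this is a sketch, not necessarily the exact `GenerationConfig` interface shipped in WebLLM:

```typescript
// Rough sketch of the per-generation config fields named above.
// The actual GenerationConfig interface in web-llm may differ.
interface GenerationConfig {
  presence_penalty?: number;   // OpenAI-like; flat penalty for any reuse
  frequency_penalty?: number;  // OpenAI-like; scales with repetition count
  max_gen_len?: number;        // cap on the number of generated tokens
  stop?: string[];             // stop strings that end generation early
}
```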

This PR also sets up unit tests; use `npm test` to run them. However,
some work remains to support end-to-end testing (e.g. accessing
WebGPU in a test environment).

All prebuilt WASMs are updated correspondingly in
mlc-ai/binary-mlc-llm-libs#90, since we introduced a
new API in tvmjs's `runtime.ts` via
apache/tvm#16504.

Note that the update of the Llama WASMs is breaking: users
will have to update their WebLLM npm package.
atebites-hub pushed a commit to atebites-hub/web-llm that referenced this pull request Oct 4, 2025