[Web] Separate parallel shard download and iterative shard loading #16650

Merged
tqchen merged 8 commits into apache:main on Mar 14, 2024
Conversation
CharlieFRuan approved these changes on Feb 27, 2024
Contributor:
@DiegoCao It's not the real solution. Definitely HF CDN is not great; it should allow parallel requests without major problems. Real solution:
Contributor:
I can give it a shot tomorrow
tqchen approved these changes on Mar 1, 2024
Member:
@DavidGOrtega Thanks for offering help! We found that the issue was probably not due to downloading the shards in parallel but due to processing the shards in parallel. We managed to keep the parallel downloads. If download issues persist, we'll add parallel downloads in batches as you suggested.
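To picture the distinction described in that comment, here is a minimal TypeScript sketch, not the actual TVMjs code: `processShard` and the other identifiers are hypothetical. The shard fetches all start in parallel, but each shard is deserialized one at a time as its download resolves.

```typescript
// Sketch only: parallel network downloads, iterative (sequential) shard processing.
async function loadShards(shardUrls: string[]): Promise<void> {
  // Start every download immediately; the HTTP requests run in parallel.
  const downloads: Promise<ArrayBuffer>[] = shardUrls.map(async (url) => {
    const resp = await fetch(url);
    if (!resp.ok) {
      throw new Error(`Failed to fetch ${url}: HTTP ${resp.status}`);
    }
    return resp.arrayBuffer();
  });

  // Deserialize the shards one by one instead of processing them all in parallel.
  for (const download of downloads) {
    const buffer = await download;
    processShard(buffer); // hypothetical: copy the shard's params into the cache
  }
}

// Hypothetical placeholder for the per-shard deserialization step.
declare function processShard(buffer: ArrayBuffer): void;
```

The point of the separation is that only the network I/O overlaps; the CPU-side serialization work stays iterative.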
CharlieFRuan added a commit to mlc-ai/web-llm that referenced this pull request on Mar 2, 2024
The new version includes 2 changes:
- Include cache deletion API via #314
- Fix model download/caching issue on TVMjs side via apache/tvm#16650
Member:
need rebase
CharlieFRuan added a commit to mlc-ai/web-llm that referenced this pull request on Mar 12, 2024
Another minor follow-up to version 0.2.24 (or hence to 0.2.25). This PR adds a `try-catch` when loading the **_already-downloaded_** weights, attempting to provide more information to the `exit(1)` error in #322. The only change is TVMJS's commit apache/tvm@b193cbb from apache/tvm#16650
Fix Parallel Download Issue by separating the downloading from the serialization process Co-authored-by: Charlie Ruan <53290280+CharlieFRuan@users.noreply.github.com>
This reverts commit 74dcddd.
tqchen approved these changes on Mar 14, 2024
CharlieFRuan added a commit to mlc-ai/web-llm that referenced this pull request on Mar 14, 2024
Changes in WebLLM:
- Stateful chat completion: #330
- OpenAI's `logit_bias`: #331
- OpenAI's `logprobs` and `top_logprobs`: #333

Changes in TVMjs:
- apache/tvm#16650
  - Fix param download issues (already reflected in 0.2.26, but at the time this PR was not merged yet)
  - Expose `sampleTopPFromProb` to support `logprobs` (new in 0.2.27)
thaisacs pushed a commit to thaisacs/tvm that referenced this pull request on Apr 3, 2024
[Web] Separate parallel shard download and iterative shard loading (apache#16650)

* Fix Parallel Download Issue by separating the downloading from the serialization process (Co-authored-by: Charlie Ruan <53290280+CharlieFRuan@users.noreply.github.com>)
* Fix callback display
* [Web] Support IndexedDB Caching
* Limit max concurrent download to 4 shards
* Try to catch error when loading model to ndarray cache

Co-authored-by: Charlie Ruan <53290280+CharlieFRuan@users.noreply.github.com>
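The "limit max concurrent download to 4 shards" item above can be pictured with a small worker-pool pattern. The sketch below is illustrative only; `downloadWithLimit` and the constant name are made up and this is not the merged code.

```typescript
// Sketch only: cap the number of shard downloads in flight at any one time.
const MAX_CONCURRENT_DOWNLOADS = 4;

async function downloadWithLimit(urls: string[]): Promise<ArrayBuffer[]> {
  const results: ArrayBuffer[] = new Array(urls.length);
  let next = 0;

  // Each worker repeatedly claims the next shard index until none remain,
  // so at most MAX_CONCURRENT_DOWNLOADS requests run concurrently.
  async function worker(): Promise<void> {
    while (next < urls.length) {
      const i = next++;
      const resp = await fetch(urls[i]);
      if (!resp.ok) {
        throw new Error(`Shard ${i} failed with HTTP ${resp.status}`);
      }
      results[i] = await resp.arrayBuffer();
    }
  }

  const workers = Array.from(
    { length: Math.min(MAX_CONCURRENT_DOWNLOADS, urls.length) },
    () => worker(),
  );
  await Promise.all(workers);
  return results;
}
```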
atebites-hub pushed a commit to atebites-hub/web-llm that referenced this pull request on Oct 4, 2025

The new version includes 2 changes:
- Include cache deletion API via mlc-ai#314
- Fix model download/caching issue on TVMjs side via apache/tvm#16650
atebites-hub pushed a commit to atebites-hub/web-llm that referenced this pull request on Oct 4, 2025
Another minor follow-up to version 0.2.24 (or hence to 0.2.25). This PR adds a `try-catch` when loading the **_already-downloaded_** weights, attempting to provide more information to the `exit(1)` error in mlc-ai#322. The only change is TVMJS's commit apache/tvm@b193cbb from apache/tvm#16650
atebites-hub pushed a commit to atebites-hub/web-llm that referenced this pull request on Oct 4, 2025

Changes in WebLLM:
- Stateful chat completion: mlc-ai#330
- OpenAI's `logit_bias`: mlc-ai#331
- OpenAI's `logprobs` and `top_logprobs`: mlc-ai#333

Changes in TVMjs:
- apache/tvm#16650
  - Fix param download issues (already reflected in 0.2.26, but at the time this PR was not merged yet)
  - Expose `sampleTopPFromProb` to support `logprobs` (new in 0.2.27)
This PR addresses the issue in mlc-ai/web-llm#313. We make the following changes:
- Add a `try-catch` when loading shards onto the ndarray cache.

Separately, we add and export an initial IndexedDB caching implementation. A rough sketch of both ideas follows below.
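As a hedged illustration only: `updateNDArrayCache`, the database name, and the store name below are hypothetical stand-ins, not the exported TVMjs API. The first function wraps the per-shard load in a `try-catch` so a failure reports which shard broke; the second shows the bare IndexedDB calls needed to persist a downloaded shard.

```typescript
// 1) Wrap loading of an already-downloaded shard so errors name the shard.
async function loadShardIntoCache(shardName: string, buffer: ArrayBuffer): Promise<void> {
  try {
    updateNDArrayCache(shardName, buffer); // hypothetical deserialization step
  } catch (err) {
    throw new Error(`Failed to load shard "${shardName}" into the ndarray cache: ${err}`);
  }
}
// Hypothetical placeholder for the cache-update call.
declare function updateNDArrayCache(name: string, buffer: ArrayBuffer): void;

// 2) Persist a downloaded shard in IndexedDB so a later reload can skip the network.
function cacheShardInIndexedDB(name: string, buffer: ArrayBuffer): Promise<void> {
  return new Promise((resolve, reject) => {
    const open = indexedDB.open("model-shard-cache", 1); // hypothetical DB name
    open.onupgradeneeded = () => open.result.createObjectStore("shards");
    open.onerror = () => reject(open.error);
    open.onsuccess = () => {
      const tx = open.result.transaction("shards", "readwrite");
      tx.objectStore("shards").put(buffer, name);
      tx.oncomplete = () => {
        open.result.close();
        resolve();
      };
      tx.onerror = () => reject(tx.error);
    };
  });
}
```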