Add retry to dataset loading #10
Merged
rwitten
approved these changes
Apr 21, 2023
A9isha
pushed a commit
that referenced
this pull request
Apr 11, 2024
* Add GitHub Action to run all DAG scripts locally
  Change-Id: I7625c3ed953ce0da2e3ccfb5d4614eba7625b739
* fix requirements.txt path
  Change-Id: I81785543e9b2a77efe369bbd0396e7bef0e4c8e4
* Add BQ dep
  Change-Id: I2b50735c7d72c627e1fd38083b6c3c5b1c9feec3
* fix GHA name
  Change-Id: I347cc18fc0d39ac87fe81c467993e8353e94c5ad
* comment out packages
  Change-Id: Ic14969ffb3350492797bfe7e2b67dde641ee5465
geeningwang
pushed a commit
to geeningwang/maxtext
that referenced
this pull request
Apr 20, 2026
Verified both scan modes on commit 055a4c2 after full env restore:
scan_layers=false: 55.4 ms decode, 123.6 ms prefill (577.5 tok/s)
scan_layers=true: 68.4 ms decode, 121.9 ms prefill (468 tok/s)

Updated:
- env_restore.md: add 2026-04-20 noscan results + summary table
- opt4 plan: add noscan row to benchmark table
- perf optimization: add opt AI-Hypercomputer#9 (reverted) and AI-Hypercomputer#10 rows, update both benchmark sections with 2026-04-20 results
ecnal-cienet
added a commit
that referenced
this pull request
Jun 24, 2026
…ffload (#10)

Both paths need Pathways/TPU-memory infra at runtime, so the external pieces (reshard_pytree via pathwaysutils; move_memory_to_device) are mocked and the test pins our changes:
- #9: scan_layers=False no longer raises and the unscanned policy params are pushed to the inference engine (guard removal).
- #10: optimizer_memory_host_offload runs the device_put/update plumbing and yields the same params as the no-offload step (memory placement, not math).
hsuan-lun-chiang
pushed a commit
that referenced
this pull request
Jun 25, 2026
…ffload (#10)

Both paths need Pathways/TPU-memory infra at runtime, so the external pieces (reshard_pytree via pathwaysutils; move_memory_to_device) are mocked and the test pins our changes:
- #9: scan_layers=False no longer raises and the unscanned policy params are pushed to the inference engine (guard removal).
- #10: optimizer_memory_host_offload runs the device_put/update plumbing and yields the same params as the no-offload step (memory placement, not math).
ecnal-cienet
added a commit
that referenced
this pull request
Jun 25, 2026
…ffload (#10)

Both paths need Pathways/TPU-memory infra at runtime, so the external pieces (reshard_pytree via pathwaysutils; move_memory_to_device) are mocked and the test pins our changes:
- #9: scan_layers=False no longer raises and the unscanned policy params are pushed to the inference engine (guard removal).
- #10: optimizer_memory_host_offload runs the device_put/update plumbing and yields the same params as the no-offload step (memory placement, not math).
We have seen rare, flaky failures in data loading; a simple retry should solve the issue.
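The diff for this PR is not shown in the conversation, so the following is only a minimal sketch of what a simple retry around a flaky load might look like. The names `retry_call` and `make_flaky_loader`, the attempt count, and the backoff schedule are all hypothetical and are not taken from the repository.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def retry_call(
    fn: Callable[[], T],
    max_attempts: int = 3,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
  """Calls fn until it succeeds or max_attempts calls have failed.

  Hypothetical helper: waits base_delay_s, then 2x, 4x, ... between attempts,
  and re-raises the last exception if every attempt fails.
  """
  if max_attempts < 1:
    raise ValueError("max_attempts must be at least 1")
  for attempt in range(1, max_attempts + 1):
    try:
      return fn()
    except Exception:
      if attempt == max_attempts:
        raise  # out of attempts: surface the original error
      sleep(base_delay_s * 2 ** (attempt - 1))
  raise AssertionError("unreachable")


def make_flaky_loader(failures: int):
  """Builds a stand-in loader that fails `failures` times, then succeeds.

  Stands in for a real dataset-loading call, which is assumed to raise on
  transient errors.
  """
  state = {"calls": 0}

  def load() -> List[str]:
    state["calls"] += 1
    if state["calls"] <= failures:
      raise IOError("transient read failure")
    return ["example-0", "example-1"]

  return load, state
```

A caller would wrap the loading call, for example `retry_call(lambda: load_my_dataset(path))`, where `load_my_dataset` is a placeholder for whatever function actually fails intermittently. Catching a narrower exception type than `Exception` is usually preferable once the transient error is known.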