feat(scan_layers): verify scan_layers compatibility from checkpoint metadata#4304

Draft
RexBearIU wants to merge 2 commits into
mainfrom
jackyf/proactive-scan-layers

@RexBearIU RexBearIU commented Jun 30, 2026

Collaborator

Description

Note

This PR is based on #4269.

This PR introduces proactive validation of scan_layers configuration compatibility when loading checkpoint metadata in MaxText.

Key changes:

  • Added a load_checkpoint_metadata helper in checkpointing.py that loads a checkpoint's custom metadata, with a simplified broad-except fallback.
  • Refactored sync_lora_metadata in lora_utils.py to reuse this unified metadata helper.
  • Added a proactive check in the from_pretrained function in model_creation_utils.py that reads the scan_layers value from the checkpoint's custom metadata and asserts that it matches the configuration, preventing silent model mismatches.
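For readers without the diff open, the helper and the check described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the `custom_metadata.json` file name, the empty-dict fallback, the `check_scan_layers` name, and the use of `ValueError` rather than an assert are all hypothetical stand-ins, and the real helper reads metadata through the checkpointing library rather than a plain JSON file.

```python
import json
from pathlib import Path
from typing import Any


def load_checkpoint_metadata(checkpoint_dir: str) -> dict[str, Any]:
  """Returns the checkpoint's custom metadata, or {} if it cannot be read.

  Hypothetical stand-in: a JSON file plays the role of the metadata store.
  """
  try:
    path = Path(checkpoint_dir) / "custom_metadata.json"  # assumed file name
    with open(path, encoding="utf-8") as f:
      metadata = json.load(f)
  except Exception:  # broad fallback: unreadable metadata is treated as absent
    return {}
  return metadata if isinstance(metadata, dict) else {}


def check_scan_layers(config_scan_layers: bool, metadata: dict[str, Any]) -> None:
  """Raises ValueError if the checkpoint records a different scan_layers value."""
  saved = metadata.get("scan_layers")
  if saved is None:
    return  # older checkpoints may not record the value; nothing to verify
  if bool(saved) != bool(config_scan_layers):
    raise ValueError(
        f"scan_layers mismatch: checkpoint was saved with scan_layers={saved}, "
        f"but the config sets scan_layers={config_scan_layers}."
    )
```

In this sketch a caller would run `check_scan_layers(config.scan_layers, load_checkpoint_metadata(path))` before restoring weights, so a mismatch surfaces as an explicit error instead of a silently mis-shaped model. Skipping the check when the key is absent is a design choice of the sketch, meant to keep checkpoints without metadata loadable.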

Tests

Tested this change by running the unit tests in hf_checkpoint_conversion_test.py and model_creation_utils_test.py on the CPU platform:

PYTHONPATH=src JAX_PLATFORMS=cpu pytest tests/unit/hf_checkpoint_conversion_test.py tests/unit/model_creation_utils_test.py

Output: 45 passed, 25 skipped

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

Co-authored-by: Xibin Liu <xibin@google.com>
@RexBearIU RexBearIU changed the base branch from jackyf/lora-ckpt-metadata to main June 30, 2026 10:48
codecov Bot commented Jun 30, 2026

Codecov Report

❌ Patch coverage is 30.00000% with 35 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/maxtext/utils/lora_utils.py | 4.16% | 23 Missing ⚠️ |
| src/maxtext/common/checkpointing.py | 47.05% | 8 Missing and 1 partial ⚠️ |
| ...rc/maxtext/checkpoint_conversion/to_huggingface.py | 0.00% | 3 Missing ⚠️ |

@RexBearIU RexBearIU force-pushed the jackyf/proactive-scan-layers branch from 0451365 to 0330b0a Compare June 30, 2026 11:04