Simplified checkpoint loading #18

Merged: jlamypoirier merged 4 commits into main from common_checkpoint_loading on Oct 23, 2024

@jlamypoirier (Collaborator) commented Oct 22, 2024

✨ Description

Refactored checkpoint loading to make it simpler. The code is still a mess, but it will be more manageable for future work.

Functional changes:

  • Distributed checkpoint loading now determines automatically whether to use the fast (same format) or safe (different format) loading scheme. This means checkpoints will load correctly after a config change mid-training, and in some cases pretrained checkpoint loading will be faster.
  • Added some safety checks in checkpoint configs.
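
For readers unfamiliar with the fast/safe distinction, the automatic selection described above can be pictured roughly as follows. This is only an illustrative sketch, not the code in this PR: `DistributedLayout`, its fields, `choose_loading_scheme`, and the two loader callables are all hypothetical names, and the assumption is that "same format" means the saved and current shard layouts compare equal.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class DistributedLayout:
    """Hypothetical summary of how checkpoint state is sharded (assumed fields)."""

    world_size: int
    tensor_parallel: int
    pipeline_parallel: int


def choose_loading_scheme(saved: DistributedLayout, current: DistributedLayout) -> str:
    """Return "fast" when the saved layout matches the current one, else "safe"."""
    return "fast" if saved == current else "safe"


def load_checkpoint(
    saved: DistributedLayout,
    current: DistributedLayout,
    load_fast: Callable[[], dict],
    load_safe: Callable[[], dict],
) -> Tuple[str, dict]:
    """Dispatch to the matching loader; both loaders are placeholders here."""
    scheme = choose_loading_scheme(saved, current)
    loader = load_fast if scheme == "fast" else load_safe
    return scheme, loader()


# Example: the tensor-parallel size changed mid-training, so the layouts differ
# and the safe (different format) path is selected.
saved_layout = DistributedLayout(world_size=8, tensor_parallel=2, pipeline_parallel=1)
current_layout = DistributedLayout(world_size=8, tensor_parallel=4, pipeline_parallel=1)
scheme, state = load_checkpoint(
    saved_layout,
    current_layout,
    load_fast=lambda: {"path": "fast"},
    load_safe=lambda: {"path": "safe"},
)
```

The point of the sketch is that the caller no longer picks a scheme by hand: an unchanged layout takes the fast path, and any mismatch falls back to the safe one.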

🔍 Type of change

Select all that apply:

  • 🐛 Bug fix (non-breaking change that addresses a specific issue)
  • 🚀 New feature (non-breaking change that adds functionality)
  • ⚠️ Breaking change (a change that could affect existing functionality)
  • 📈 Performance improvement/optimization (improves speed, memory usage, or efficiency)
  • 🛠️ Code refactor (non-functional changes that improve code readability, structure, etc.)
  • 📦 Dependency bump (updates dependencies, including Dockerfile or package changes)
  • 📝 Documentation change (updates documentation, including new content or typo fixes)
  • 🔧 Infrastructure/Build change (affects build process, CI/CD, or dependencies)

@jlamypoirier jlamypoirier marked this pull request as ready for review October 23, 2024 15:57
@jlamypoirier jlamypoirier merged commit a528154 into main Oct 23, 2024
@jlamypoirier jlamypoirier deleted the common_checkpoint_loading branch October 23, 2024 17:00
@jlamypoirier jlamypoirier mentioned this pull request Oct 25, 2024
@tscholak tscholak added this to the 0.2.0 milestone Oct 25, 2024