Apriel SSM/Hybrid #258

Merged
oleksost merged 165 commits into main from oleksiy/apriel-ssm
Jun 12, 2025

@oleksost (Contributor) commented May 9, 2025

✨ Description

This PR makes minor improvements to the SSM/Hybrid classes, adds support for loading and exporting Apriel SSM and hybrid SSM models (including the corresponding modeling.py classes), and adds an embeddings_lr_scale argument.

🔍 Type of change

Select all that apply:

  • 🐛 Bug fix (non-breaking change that addresses a specific issue)
  • 🚀 New feature (non-breaking change that adds functionality)
  • ⚠️ Breaking change (a change that could affect existing functionality)
  • 📈 Performance improvement/optimization (improves speed, memory usage, or efficiency)
  • 🛠️ Code refactor (non-functional changes that improve code readability, structure, etc.)
  • 📦 Dependency bump (updates dependencies, including Dockerfile or package changes)
  • 📝 Documentation change (updates documentation, including new content or typo fixes)
  • 🔧 Infrastructure/Build change (affects build process, CI/CD, or dependencies)

📝 Changes

List the key changes introduced in this PR:

  1. Add modeling.py classes for Apriel SSM and hybrid
  2. Import & Export of Apriel SSM and hybrid models
  3. Added embeddings_lr_scale
  4. Added output_lr_scale
  5. Fixed parsing of lr_schedule when it is provided as a string
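
For readers unfamiliar with per-group learning-rate scaling, items 3 and 4 can be illustrated roughly as follows. This is a minimal sketch and not the code in this PR: only the option names embeddings_lr_scale and output_lr_scale come from the list above, while the LrScales container, the build_param_groups helper, and the "embeddings." / "output." name prefixes are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class LrScales:
    # Hypothetical container; only the two field names come from the PR.
    embeddings_lr_scale: float = 1.0
    output_lr_scale: float = 1.0


def build_param_groups(param_names, base_lr, scales):
    """Group parameter names by their learning-rate scale.

    The name prefixes checked here are assumptions for this sketch,
    not the naming scheme used in the repository.
    """
    by_scale = {}
    for name in param_names:
        if name.startswith("embeddings."):
            scale = scales.embeddings_lr_scale
        elif name.startswith("output."):
            scale = scales.output_lr_scale
        else:
            scale = 1.0  # all other parameters keep the base learning rate
        by_scale.setdefault(scale, []).append(name)
    # One dict per group, similar in shape to optimizer parameter groups.
    return [
        {"lr": base_lr * scale, "params": names}
        for scale, names in by_scale.items()
    ]


groups = build_param_groups(
    ["embeddings.weight", "layers.0.mixer.weight", "output.weight"],
    base_lr=1e-3,
    scales=LrScales(embeddings_lr_scale=0.5, output_lr_scale=2.0),
)
```

Parameters that share a scale end up in the same group, so leaving both scales at 1.0 yields a single group at the base learning rate.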

✅ Checklist

Make sure the following tasks are completed before submitting the PR:

General

  • 📜 I have read and followed the contributing guidelines.
  • 🏷️ I am using a clear and descriptive PR title that summarizes the key change or feature introduced.
  • 🎉 The functionality is complete, and I have tested the changes.
  • 📝 I have updated the documentation if needed.
  • ⚠️ The change does not introduce any new issues (e.g., runtime warnings, type checker errors, linting problems, unhandled edge cases).
  • 🧩 I have commented my code, especially in hard-to-understand areas.

Dependencies and Configuration

  • 🐋 I have updated the Docker configuration or dependencies, if applicable.
  • 🔄 I have ensured compatibility with the existing setup after dependency changes.

Testing

  • 🧪 I have added or updated tests to cover my changes.
  • ✔️ New and existing tests pass locally with my changes.
  • 🚦 I have tested these changes on GPUs and verified training stability.
  • 🏋️ I have tested the changes on realistic training workloads, if applicable.

Performance Impact

  • 📊 I have run benchmarks where applicable to evaluate the performance impact.
  • ✅ The benchmarks show no performance regression.
  • 🚀 The benchmarks indicate a potential performance improvement.
  • ⚠️ The benchmarks indicate a potential performance degradation.
  • 📈 I have provided benchmark results and detailed any performance impact below, if applicable.

📊 Performance Impact Details


🗒️ Additional Notes

@jlamypoirier (Collaborator) left a comment

Is this ready to merge?

@oleksost (Contributor, Author) commented Jun 11, 2025

@jlamypoirier apologies for the delayed reply. Yes, it should be ready. I just need to run the local tests and verify everything is OK, and will merge after that.

@oleksost oleksost merged commit 0e1c23c into main Jun 12, 2025
4 checks passed
@oleksost oleksost deleted the oleksiy/apriel-ssm branch June 12, 2025 03:27
@oleksost oleksost restored the oleksiy/apriel-ssm branch June 12, 2025 03:29
@oleksost oleksost mentioned this pull request Jun 12, 2025
25 tasks
@jlamypoirier jlamypoirier deleted the oleksiy/apriel-ssm branch June 12, 2025 14:53
@oleksost oleksost restored the oleksiy/apriel-ssm branch June 12, 2025 15:46
5 participants