8bit Schedule-Free Optimizer #1950

@idunafu

Feature request

I'd like to request support for the 8-bit version of the Schedule-Free Optimizer.

Motivation

Schedule-Free optimizers are useful because they remove the need for an external learning-rate schedule, but their AdamW variant still carries optimizer-state memory overhead. A related request was raised in the Schedule-Free repository, where the maintainer suggested that projects already maintaining 8-bit optimizers might be a better home for 8-bit Schedule-Free variants. Since bitsandbytes already provides memory-efficient 8-bit optimizers, it seems like a natural place for an 8-bit Schedule-Free AdamW variant.

This would be useful for users who want the training behavior of Schedule-Free AdamW but are constrained by GPU memory.
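As a rough illustration of the overhead in question, here is a back-of-the-envelope sketch of optimizer-state size. It assumes two state tensors per parameter, which is my understanding of the reference Schedule-Free AdamW (the `z` sequence and the second-moment estimate), and it assumes a hypothetical blockwise 8-bit layout with one fp32 scale per block of 256 values. The function name, the block size, and the 10M-parameter model are illustrative only and are not taken from bitsandbytes or from the prototype.

```python
def optimizer_state_bytes(num_params, num_state_tensors=2,
                          bits_per_value=32, block_size=None):
    """Estimate bytes used by optimizer state only (not weights or gradients).

    block_size: if given, add one fp32 scale per block per state tensor,
    as a blockwise-quantized layout would need (assumed layout).
    """
    value_bytes = num_params * num_state_tensors * bits_per_value // 8
    scale_bytes = 0
    if block_size is not None:
        num_blocks = -(-num_params // block_size)  # ceiling division
        scale_bytes = num_blocks * num_state_tensors * 4
    return value_bytes + scale_bytes


n = 10_000_000  # hypothetical model size
fp32_state = optimizer_state_bytes(n)
int8_state = optimizer_state_bytes(n, bits_per_value=8, block_size=256)
print(f"fp32 state: {fp32_state / 2**20:.1f} MiB")
print(f"8-bit state: {int8_state / 2**20:.1f} MiB")
```

This counts optimizer state alone. Weights, gradients, and activations are unchanged, so the reduction in total training memory is smaller than the reduction in state size.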

Your contribution

I made a small experimental prototype and, in limited experiments, observed loss curves similar to those of the official fp32 Schedule-Free AdamW implementation. So far I have only tested small CNNs and ViT-Tiny on CIFAR-10 due to GPU constraints. In those tests, the prototype gave roughly 40% memory reduction in my setup, although throughput was somewhat lower. These results are preliminary and not meant as a broad benchmark.
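For readers unfamiliar with what storing state in 8 bits involves, the following is a minimal pure-Python sketch of blockwise absmax quantization. It is not the prototype's code and not the bitsandbytes kernel. It uses a plain linear mapping to signed 8-bit codes, which is a simplification I am assuming for illustration, and the function names and block size are hypothetical.

```python
def quantize_blockwise(values, block_size=4):
    """Map floats to signed 8-bit codes, with one absmax scale per block."""
    codes, scales = [], []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        # Use 1.0 for an all-zero block to avoid dividing by zero.
        scale = max(abs(v) for v in block) or 1.0
        scales.append(scale)
        codes.extend(round(v / scale * 127) for v in block)
    return codes, scales


def dequantize_blockwise(codes, scales, block_size=4):
    """Recover approximate floats from the codes and per-block scales."""
    return [c / 127 * scales[i // block_size] for i, c in enumerate(codes)]


# Toy state tensor: the second block holds much smaller magnitudes.
state = [0.5, -1.0, 0.25, 0.0, 1e-3, 2e-3, -4e-3, 0.0]
codes, scales = quantize_blockwise(state)
restored = dequantize_blockwise(codes, scales)
max_error = max(abs(a - b) for a, b in zip(state, restored))
print(scales, max_error)
```

Because each block carries its own scale, the small values in the second block are not rounded away by the large values in the first. The per-element rounding error is bounded by that block's scale divided by 254.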

Labels: Optimizers (issues or feature requests relating to optimizers)
