Contrib: FLUX.1-lite-8B-alpha (native FLUX.1 compatibility)#147

Open
jimburtoft wants to merge 3 commits into aws-neuron:main from jimburtoft:contrib/flux1-lite-8b
@jimburtoft
Contributor

Summary

  • FLUX.1-lite-8B-alpha (Freepik) shares the FLUX.1-dev architecture but uses 8 double-stream MMDiT blocks instead of 19. All other components (CLIP + T5-XXL encoders, VAE, scheduler, RoPE) are the same.
  • NxDI's first-party FLUX.1 implementation reads num_layers from the model's config.json at runtime, so FLUX.1-lite works out of the box with no custom modeling code.
  • This contrib provides a standalone generation script, integration tests, and documentation demonstrating native compatibility.
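The config-driven behavior described in the summary can be sketched as follows. This is a minimal illustration, not NxDI's actual code: the helper names and the trimmed config dictionaries are hypothetical, and only the `num_layers` values (19 and 8) come from this PR. The `attention_head_dim` field is included purely as an example of a setting that stays the same.

```python
import json

# Hypothetical, trimmed stand-ins for each model's config.json.
# Only the num_layers values (19 vs 8) are taken from the PR text;
# attention_head_dim is an illustrative shared field.
FLUX_DEV_CONFIG = json.loads('{"num_layers": 19, "attention_head_dim": 128}')
FLUX_LITE_CONFIG = json.loads('{"num_layers": 8, "attention_head_dim": 128}')


def build_double_stream_blocks(config):
    """Return one placeholder per double-stream block, sized from the config."""
    return [f"double_block_{i}" for i in range(config["num_layers"])]


def config_diff(a, b):
    """Return {key: (a_value, b_value)} for every key whose value differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


dev_blocks = build_double_stream_blocks(FLUX_DEV_CONFIG)
lite_blocks = build_double_stream_blocks(FLUX_LITE_CONFIG)
print(len(dev_blocks), len(lite_blocks))
print(config_diff(FLUX_DEV_CONFIG, FLUX_LITE_CONFIG))
```

Because the block count is read from the config rather than hard-coded, the same construction path serves both checkpoints, which is the property this contrib relies on.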

Validation Results (trn2.3xlarge, LNC=2, TP=4)

| Metric | Value |
| --- | --- |
| Resolution | 1024x1024 |
| Inference steps | 25 |
| E2E generation time | 5.91s avg |
| Pipeline steps/sec | 4.23 |
| Backbone forward/sec | 4.49 |
| Compilation time | ~128s |
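
The throughput figures above are consistent with the step count and end-to-end time. A quick arithmetic check, assuming one backbone forward pass per inference step (an assumption; the PR does not state how the backbone rate was measured):

```python
# Values reported in the validation results.
steps = 25
e2e_seconds = 5.91
backbone_fwd_per_sec = 4.49

# Pipeline rate: inference steps divided by end-to-end time.
pipeline_steps_per_sec = steps / e2e_seconds

# ASSUMPTION: exactly one backbone forward pass per inference step.
backbone_seconds = steps / backbone_fwd_per_sec

# Whatever remains would be text encoding, VAE decode, and other overhead.
other_seconds = e2e_seconds - backbone_seconds

print(f"pipeline steps/sec: {pipeline_steps_per_sec:.2f}")
print(f"estimated backbone time: {backbone_seconds:.2f}s")
print(f"estimated non-backbone time: {other_seconds:.2f}s")
```

Under that assumption, the backbone accounts for most of the end-to-end time, with a few tenths of a second left for everything else.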

Checklist

Model Type

  • Diffusion/image generation model

Contribution Contents

  • README with model info, benchmarks, usage instructions
  • src/ directory with generation script
  • test/integration/test_model.py with 3 passing tests
  • Sample output image
  • vLLM integration (N/A -- diffusion model)

Testing

  • All code tested on Neuron hardware (trn2.3xlarge)
  • All numbers in README are measured, not estimated
  • Integration tests pass: smoke test, image generation, timing
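
A timing test of the kind listed above could average latency over several runs after a warm-up. The sketch below is not the code in test/integration/test_model.py; `average_latency` and `fake_generate` are hypothetical names, and the stand-in generator only sleeps so the example runs without Neuron hardware.

```python
import time
from statistics import mean


def average_latency(generate, runs=3, warmup=1):
    """Call generate() `warmup` times untimed, then return the mean of `runs` timed calls."""
    for _ in range(warmup):
        generate()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        generate()
        samples.append(time.perf_counter() - start)
    return mean(samples)


def fake_generate():
    """Hypothetical stand-in for a real image generation call."""
    time.sleep(0.01)


avg = average_latency(fake_generate)
print(f"average latency: {avg:.3f}s")
```

In a real test, `fake_generate` would be replaced by the pipeline call and the result compared against a latency budget.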

SDK Compatibility

  • Neuron SDK 2.29 (DLAMI 20260410)
  • NxD Inference 0.9
  • PyTorch 2.9

FLUX.1-lite-8B-alpha (Freepik) is architecturally identical to FLUX.1-dev
with 8 double-stream blocks instead of 19. NxDI's FLUX.1 implementation
reads num_layers from config.json at runtime, so it works out of the box
with FLUX.1-lite weights -- no custom modeling code needed.

Validated on trn2.3xlarge (LNC=2, TP=4):
- 5.91s per 1024x1024 image (25 steps)
- 4.49 backbone fwd/sec
- ~128s compilation time
- SDK 2.29, NxD Inference 0.9