
Bfloat16 quantization#1229

Merged
mrariden merged 3 commits into main from bfloat16_quantization
Jun 9, 2025

Conversation


mrariden (Collaborator) commented Jun 6, 2025

CPSAM's transformer weights are stored at unnecessarily high precision (32-bit); 16-bit is sufficient for prediction. Switching to 16-bit brings multiple benefits, not least freeing up RAM for loading more images during evaluation and reducing OOM issues.

This PR:

  • Sets the default dtype to bfloat16; this can be changed at CellposeModel instantiation via the use_bfloat16 flag.
  • Reduces model size from 1.2GB to 580MB
  • Reduces runtime by ~20%
  • Retains all segmentation accuracy
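The size and accuracy numbers above follow from the bfloat16 format itself: it keeps float32's full 8-bit exponent (so dynamic range is unchanged) and simply drops the low 16 mantissa bits, halving storage per weight. A stdlib-only sketch of that truncation, for illustration (Cellpose would cast weights via framework dtypes rather than bit-twiddling like this):

```python
import struct

def float32_to_bfloat16_bits(x: float) -> int:
    """Truncate a float32 to its top 16 bits (bfloat16), round-to-nearest-even."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    # Bias the discarded low 16 bits so the kept bits round to nearest even.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF

def bfloat16_bits_to_float32(b: int) -> float:
    """Re-expand bfloat16 bits to float32 by zero-filling the low mantissa bits."""
    (x,) = struct.unpack("<f", struct.pack("<I", b << 16))
    return x

# Same exponent width as float32, so range is preserved; only ~3 decimal
# digits of mantissa precision remain -- enough for inference, and storage
# halves (cf. the 1.2 GB -> 580 MB model-size reduction in this PR).
w = 0.123456789
w16 = bfloat16_bits_to_float32(float32_to_bfloat16_bits(w))
```

Values whose mantissa already fits in 7 bits (1.0, -2.5, powers of two) round-trip exactly; everything else lands within about 0.4% relative error.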

Testing:

  • Notebook verification for API
  • GUI testing
  • CLI testing

mrariden self-assigned this Jun 6, 2025
mrariden merged commit 7e194bc into main Jun 9, 2025
7 checks passed
