perf: use is_coalesced flag in sparse_eye instead of .coalesce() #66

Merged
theo-barfoot merged 2 commits into cai4cai:main from aymuos15:perf/sparse-eye-remove-coalesce
Jan 13, 2026

Conversation

@aymuos15
Contributor

My local benchmarks (written in the style of the repo's) show a nice improvement with this change.

Since the indices are constructed in sorted order with no duplicates,
pass is_coalesced=True to torch.sparse_coo_tensor() instead of
calling .coalesce() after creation. This skips the index sorting and
duplicate merging that .coalesce() performs, which is redundant here,
and provides a speedup.

@aymuos15 force-pushed the perf/sparse-eye-remove-coalesce branch from 4b70e1a to 76556d0 on January 13, 2026 11:58

@aymuos15
Contributor Author

Indices from torch.arange (lines 880, 885) are inherently sorted with no duplicates, so is_coalesced=True is safe.


@theo-barfoot left a comment

Looks good. I have added a comment in the code to clarify: "is_coalesced=True since there are no duplicate indices in an identity matrix; flag available in PyTorch 2.1+"

@theo-barfoot merged commit 3b1e478 into cai4cai:main on Jan 13, 2026
15 of 16 checks passed