Issue 629 - Use a generator for Cells::getAllCacheKeys to improve performance #822
Closed
matt-allan wants to merge 1 commit into PHPOffice:develop from matt-allan:get-all-keys-generator
Using a generator reduces memory usage and improves performance when loading large spreadsheets.
Member
Excellent PR, thanks!
guillaume-ro-fr pushed a commit to guillaume-ro-fr/PhpSpreadsheet that referenced this pull request on Jun 12, 2019
Using a generator reduces memory usage and improves performance when loading large spreadsheets. Closes PHPOffice#822
BlackyTay pushed a commit to BlackyTay/PhpSpreadsheet that referenced this pull request on Aug 8, 2025
Using a generator reduces memory usage and improves performance when loading large spreadsheets. Closes PHPOffice#822
Using a generator reduces memory usage and improves performance
when loading large spreadsheets.
Why this change is needed?
PhpSpreadsheet currently uses a lot of memory when loading large spreadsheets (see #648, #629). All of the coordinates in the spreadsheet are copied into a new array when the method Cells::getAllCacheKeys is called. Since each coordinate is concatenated with another string, a new string is also created in memory for every coordinate.
The result of this method is only ever passed to Psr\SimpleCache\CacheInterface::getMultiple and Psr\SimpleCache\CacheInterface::setMultiple. Since both of these methods accept an iterable, we can use a generator instead. Using a generator means we don't need to build the entire array in memory and can instead return one value at a time as needed.
I benchmarked this with a ~100K row spreadsheet and saw a 16% improvement in run time and a 20% improvement in memory usage. You can view a comparison here.
You can also view the individual profiles here:
This is the benchmark script I used:
I also attached the xlsx if you would like to run the benchmark yourself.
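For readers who want to see the shape of the change, here is a minimal sketch of the array-versus-generator difference described above. It is not the PhpSpreadsheet code: the class name CellKeysSketch, its two method names, and the "sketch." prefix are all hypothetical, chosen only to illustrate the idea of yielding each key instead of collecting them.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for a cell collection; NOT the real PhpSpreadsheet class.
final class CellKeysSketch
{
    /** @var string[] */
    private array $coordinates;

    private string $cachePrefix;

    /** @param string[] $coordinates */
    public function __construct(array $coordinates, string $cachePrefix)
    {
        $this->coordinates = $coordinates;
        $this->cachePrefix = $cachePrefix;
    }

    /**
     * Eager version: every prefixed key is created up front and held
     * in one array for as long as the caller keeps it.
     *
     * @return string[]
     */
    public function allCacheKeysAsArray(): array
    {
        $keys = [];
        foreach ($this->coordinates as $coordinate) {
            $keys[] = $this->cachePrefix . $coordinate;
        }

        return $keys;
    }

    /**
     * Lazy version: each prefixed key is created only when the consumer
     * asks for the next value, so no second full-size array is built.
     */
    public function allCacheKeys(): \Generator
    {
        foreach ($this->coordinates as $coordinate) {
            yield $this->cachePrefix . $coordinate;
        }
    }
}

// Any consumer that accepts an iterable can take the generator directly.
$cells = new CellKeysSketch(['A1', 'B1', 'A2'], 'sketch.');
foreach ($cells->allCacheKeys() as $key) {
    echo $key, PHP_EOL;
}
```

One trade-off to keep in mind: a generator can be iterated only once, so a caller that needs the keys twice has to call the method again or fall back to an array.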