I'll create the issue following the repository's template format:
Bug Report
Describe the current, buggy behavior
The get_file_list() method in src/Dist_Archive_Command.php (lines 493-536) uses RecursiveIteratorIterator, which iterates through every single file in the directory tree, including all files inside directories that should be ignored according to .distignore. This causes severe performance degradation when dealing with large ignored directories like node_modules.
The problem occurs because:
- The RecursiveIteratorIterator descends into every subdirectory, including ignored ones like node_modules/
- For each file (potentially 30,000+ files in node_modules), the iterator:
  - Creates an SplFileInfo object
  - Enters the foreach loop
  - Calculates the relative filepath
  - Calls $this->checker->isPathIgnored() to check if it should be excluded
- Only after all this work does it decide not to include the file in the archive
The check if ( $this->checker->isPathIgnored( $relative_filepath ) ) happens inside the loop for every single item found, rather than preventing descent into ignored directories in the first place.
Describe how other contributors can replicate this bug
- Create a WordPress plugin with a node_modules directory containing 30,000+ files (or any large directory)
- Add node_modules to your .distignore file:
  node_modules
  .git
  vendor
  tests
- Run wp dist-archive . build.zip from the plugin directory
- Observe that the command takes several minutes to complete
- The command iterates through all 30,000+ files in node_modules even though the entire directory is ignored
Describe what you would expect as the correct outcome
The iterator should skip descending into directories that are marked as ignored in .distignore, avoiding the need to iterate through their contents entirely.
For a project with node_modules containing 30,000 files:
- Current behavior: 30,000+ iterations and isPathIgnored() checks, taking several minutes
- Expected behavior: Skip the node_modules directory at the directory level, only iterate through files that need to be processed, completing in seconds
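The difference between the two behaviors can be made concrete by counting ignore checks on a small throwaway tree. This is a rough sketch, not the package's code: the $ignored closure is a hypothetical stand-in for isPathIgnored(), the 200-file fixture is arbitrary, and the SELF_FIRST mode is an assumption about how the tree is walked.

```php
<?php
// Illustrative only: $ignored is a hypothetical stand-in for isPathIgnored().
$root = sys_get_temp_dir() . '/distignore-count-' . uniqid();
mkdir( $root . '/node_modules', 0777, true );
for ( $i = 0; $i < 200; $i++ ) {
    touch( $root . '/node_modules/file' . $i . '.js' );
}
touch( $root . '/plugin.php' );

$checks  = 0;
$ignored = function ( string $relative ) use ( &$checks ): bool {
    $checks++; // Count every ignore check that is performed.
    return 'node_modules' === $relative || 0 === strpos( $relative, 'node_modules/' );
};
$relative_path = function ( SplFileInfo $item ) use ( $root ): string {
    return ltrim( str_replace( '\\', '/', substr( $item->getPathname(), strlen( $root ) ) ), '/' );
};

// Pattern described in this report: visit everything, check inside the loop.
$all = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator( $root, FilesystemIterator::SKIP_DOTS ),
    RecursiveIteratorIterator::SELF_FIRST
);
foreach ( $all as $item ) {
    $ignored( $relative_path( $item ) );
}
$checks_in_loop = $checks;

// Expected behavior: reject the ignored directory before descending into it.
$checks = 0;
$pruned = new RecursiveIteratorIterator(
    new RecursiveCallbackFilterIterator(
        new RecursiveDirectoryIterator( $root, FilesystemIterator::SKIP_DOTS ),
        function ( SplFileInfo $item ) use ( $ignored, $relative_path ) {
            return ! $ignored( $relative_path( $item ) );
        }
    ),
    RecursiveIteratorIterator::SELF_FIRST
);
foreach ( $pruned as $item ) {
    // Only non-ignored entries reach this point.
}
$checks_pruned = $checks;

echo "checks in loop: $checks_in_loop, checks with pruning: $checks_pruned\n";
```

With pruning, the number of checks depends on the entries outside node_modules rather than on the size of node_modules itself.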
Let us know what environment you are running this on
OS: [Various - affects all operating systems]
PHP version: [Various - affects all PHP versions]
WP-CLI version: [Various - affects current versions using this package]
The issue is present in the current implementation regardless of environment.
Provide a possible solution
The solution is to use a RecursiveFilterIterator to filter out ignored directories before the RecursiveIteratorIterator descends into them. This way, the iterator never enters ignored directories, avoiding thousands of unnecessary iterations.
The filter would check at the directory level: "If this is a directory and it's ignored in .distignore, don't descend into it." This is fundamentally different from the current approach which checks every file after already finding it.
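A minimal sketch of that idea, under stated assumptions: it uses RecursiveCallbackFilterIterator (the callback-based subclass of RecursiveFilterIterator) instead of a hand-written subclass, and the function name list_files_pruned() and the $is_ignored callable are hypothetical. $is_ignored stands in for $this->checker->isPathIgnored(), whose exact path format may differ from the "/"-prefixed relative paths used here.

```php
<?php
/**
 * List files under $root, pruning ignored entries before descending.
 * Hypothetical helper for illustration; not part of the package.
 *
 * @param string   $root       Directory to scan.
 * @param callable $is_ignored Receives a "/"-prefixed relative path, returns bool.
 * @return string[] Sorted relative paths of the files that were kept.
 */
function list_files_pruned( string $root, callable $is_ignored ): array {
    $root      = rtrim( $root, '/\\' );
    $directory = new RecursiveDirectoryIterator( $root, FilesystemIterator::SKIP_DOTS );

    // Rejecting a directory here means RecursiveIteratorIterator
    // never asks for that directory's children.
    $filter = new RecursiveCallbackFilterIterator(
        $directory,
        function ( SplFileInfo $item ) use ( $root, $is_ignored ) {
            $relative = str_replace( '\\', '/', substr( $item->getPathname(), strlen( $root ) ) );
            return ! $is_ignored( $relative );
        }
    );

    $files = array();
    foreach ( new RecursiveIteratorIterator( $filter, RecursiveIteratorIterator::SELF_FIRST ) as $item ) {
        if ( $item->isFile() ) {
            $relative = substr( $item->getPathname(), strlen( $root ) );
            $files[]  = ltrim( str_replace( '\\', '/', $relative ), '/' );
        }
    }
    sort( $files );
    return $files;
}

// Demo on a throwaway directory with an ignored node_modules folder.
$root = sys_get_temp_dir() . '/distignore-demo-' . uniqid();
mkdir( $root . '/node_modules/pkg', 0777, true );
mkdir( $root . '/src', 0777, true );
touch( $root . '/plugin.php' );
touch( $root . '/src/main.php' );
touch( $root . '/node_modules/pkg/index.js' );

// Simplistic exact-match ignore list; real .distignore matching is richer.
$ignored = array( '/node_modules', '/.git', '/vendor', '/tests' );
$files   = list_files_pruned(
    $root,
    function ( $path ) use ( $ignored ) {
        return in_array( $path, $ignored, true );
    }
);
print_r( $files ); // Lists plugin.php and src/main.php only.
```

Because the filter runs on directories as well as files, a single rejected directory entry removes its whole subtree from the walk, which is the behavior the report asks for.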
Provide additional context/Screenshots
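This is a well-known performance pitfall with RecursiveIteratorIterator. The iterator "flattens" the directory tree into a single sequence and visits every entry unless the inner iterator is filtered, so a check inside the loop body cannot prevent it from descending into ignored directories.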
Related discussion: The issue was previously mentioned in #81 (comment)
For WordPress plugins with modern build processes (npm packages, composer vendor directories, etc.), this performance issue makes the dist-archive command nearly unusable without manually deleting these directories first, which defeats the purpose of having a .distignore file.
To submit this issue:
Go to https://github.com/wp-cli/dist-archive-command/issues/new and paste the content above. Or would you like me to attempt to create it directly for you?