Skip to content

Improve performance on files whitelist processing #410

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

Konamiman
Copy link
Contributor

The original file whitelisting process works by using a FileWhitelistScoper class that maintains a list of all the whitelisted files. Every time a file is to be prefixed, the list is scanned and if it contains the file, then it's not prefixed. This is inefficient when the list of whitelisted files is big.

This pull request changes the process to use a different approach

  1. First, the whitelisted files list is intersected with the list of files matched by the finders (so that unmatched files aren't whitelisted either).
  2. Then the matched files list is diffed with the whitelisted files list (so that it contains only the list of files to prefix). After that, the file contents are retrieved.
  3. Both files lists, matched and whitelisted, are passed to the AddPrefixCommand class.
  4. The files in the matched list are prefixed as usual.
  5. Whitelisted files are simply copied to the destination directory using fileSystem->copy.

The ConfigurableScoper and FileWhitelistScoper classes are removed since they are no longer needed.

The logger is modified too so that now at the end of the process two separate counts are displayed, one for the prefixed files and
another one for the copied (whitelisted) files.

Improvement on validateOutputDir

The validateOutputDir method asks the user for confirmation in a number of cases (e.g. target directory already exists) but does nothing with the user input. As an additional improvement, this pull request fixes this so that if the user answers "no" then the AddPrefixCommand process is aborted and the application terminates.

TODO: Unit tests.

The original file whitelisting process works by using a
FileWhitelistScoper class that maintains a list of all the whitelisted
files. Every time a file is to be prefixed, the list is scanned and
if it contains the file, then it's not prefixed. This is inefficient
when the list of whitelisted files is big.

This commit changes the process to use a different approach.
First, the whitelisted files are excluded from the list of files
generated by the finders (and whose contents are loaded in memory).
Then both lists are passed to the AddPrefixCommand: the matched files
are processed as usual, the whitelisted ones are simply copied verbatim
using fileSystem->copy.

The logger is modified too so that now at the end of the process
two separate counts are displayed, one for the prefixed files and
another one for the copied (whitelisted) files.
The validateOutputDir asks the user for confirmation in a number of
cases (e.g. target directory already exists) but does nothing with
the user input. This commit fixes this so that if the user answers
"no" then the AddPrefixCommand process is aborted and the application
terminates.
@Konamiman
Copy link
Contributor Author

I need some help with unit tests on that one:

  • ConfigurationTest->test_it_can_create_a_complete_configuration: the generated config file now needs to have a finders section (otherwise the whitelist ends up empty), but if I add one I get Error: Class 'Finder' not found.
  • I can't run Command/AddPrefixCommandTest, I get Symfony\Component\Console\Exception\CommandNotFoundException: Command "init" is not defined.

@theofidry
Copy link
Member

Overall I'm fine with the approach but you need to double check the usage in https://github.com/box-project/box. It's fine to have a BC but what is currently done should remain possible for Box as well, which consumes PHP-Scoper not via the command but via the scoper directly

@Konamiman
Copy link
Contributor Author

Could you please point an example of where this pull request breaks Box?

@theofidry
Copy link
Member

If there is any break I suspect this would happen in https://github.com/box-project/box/blob/master/src/Configuration/Configuration.php#L2829

This change is needed but it's best to address it in a separate
pull request.
@theofidry
Copy link
Member

Closed via #445

@theofidry theofidry closed this May 31, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants