scancode crash when supplying PNG file to --license-policy #3594

Open
@armijnhemel

Description

Sanity checks for --license-policy are missing, so scancode crashes when the supplied file is not a valid policy file (for example, a PNG image).

How To Reproduce

$ ./scancode -l scancode --spdx-tv /tmp/scancode.spdx --license-policy /tmp/tmp8ulw9skr.png 
Setup plugins...
Collect file inventory...
Scan files for: licenses with 1 process(es)...
[####################] 2                  
ERROR: failed to run post-scan plugin: license-policy:
Traceback (most recent call last):
  File "/home/armijn/git/scancode-toolkit/src/scancode/cli.py", line 1084, in run_codebase_plugins
    plugin.process_codebase(codebase, **kwargs)
  File "/home/armijn/git/scancode-toolkit/src/licensedcode/plugin_license_policy.py", line 77, in process_codebase
    if has_policy_duplicates(license_policy):
  File "/home/armijn/git/scancode-toolkit/src/licensedcode/plugin_license_policy.py", line 114, in has_policy_duplicates
    policies = load_license_policy(license_policy_location).get('license_policies', [])
  File "/home/armijn/git/scancode-toolkit/src/licensedcode/plugin_license_policy.py", line 141, in load_license_policy
    conf_content = conf.read()
  File "/usr/lib64/python3.10/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte

WARNING: Files are missing a SHA1 attribute. Incomplete SPDX document created.
Scanning done.
Some files failed to scan properly:
ERROR: failed to run post-scan plugin: license-policy:
Traceback (most recent call last):
  File "/home/armijn/git/scancode-toolkit/src/scancode/cli.py", line 1084, in run_codebase_plugins
    plugin.process_codebase(codebase, **kwargs)
  File "/home/armijn/git/scancode-toolkit/src/licensedcode/plugin_license_policy.py", line 77, in process_codebase
    if has_policy_duplicates(license_policy):
  File "/home/armijn/git/scancode-toolkit/src/licensedcode/plugin_license_policy.py", line 114, in has_policy_duplicates
    policies = load_license_policy(license_policy_location).get('license_policies', [])
  File "/home/armijn/git/scancode-toolkit/src/licensedcode/plugin_license_policy.py", line 141, in load_license_policy
    conf_content = conf.read()
  File "/usr/lib64/python3.10/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte
Summary:        licenses with 1 process(es)
Errors count:   1
Scan Speed:     1.83 files/sec. 
Initial counts: 1 resource(s): 1 file(s) and 0 directorie(s) 
Final counts:   1 resource(s): 1 file(s) and 0 directorie(s) 
Timings:
  scan_start: 2023-11-17T150048.155768
  scan_end:   2023-11-17T150051.949987
  setup_scan:licenses: 3.24s
  setup: 3.24s
  scan: 0.55s
  total: 3.85s
Removing temporary files...done.
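
The traceback shows load_license_policy failing at conf.read() with a UnicodeDecodeError, because the PNG's first byte (0x89) is not valid UTF-8. One possible sanity check is to read the file as bytes and decode it explicitly, so a binary file yields a clear error message instead of a traceback. This is only a sketch under assumptions: `read_policy_text` and `PolicyFileError` are hypothetical names, not part of scancode-toolkit, and the real fix may validate the parsed policy content as well.

```python
from pathlib import Path


class PolicyFileError(ValueError):
    """Hypothetical error type: the policy file is missing or not UTF-8 text."""


def read_policy_text(location):
    """Return the text of the policy file at `location`.

    Raise PolicyFileError with a readable message if the file does not
    exist or cannot be decoded as UTF-8 (e.g. a PNG image).
    """
    path = Path(location)
    if not path.is_file():
        raise PolicyFileError(f"License policy file not found: {location}")

    # Read raw bytes first so that decoding problems are caught here,
    # not deep inside a text-mode read().
    data = path.read_bytes()
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError as e:
        raise PolicyFileError(
            f"License policy file is not valid UTF-8 text: {location}"
        ) from e
```

A caller could catch `PolicyFileError` before the scan starts and report it as a usage error, rather than letting the post-scan plugin fail after the scan has already run.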

System configuration

For bug reports, it really helps us to know:

  • What OS are you running on? (Windows/MacOS/Linux)
  • What version of scancode-toolkit was used to generate the scan file?
  • What installation method was used to install/run scancode? (pip/source download/other)
