Closed
beautifulsoup4 4.13 introduces a breaking change that affects the text processing module at /src/commoncode/text.py (Link), see #4129. Starting with 4.13, as_unicode(s) returns bytes instead of str, which in turn breaks is_markup(location) / is_markup_text(text) in scancode here.
From the Changelog:
- UnicodeDammit.markup is now always a bytestring representing the
original markup (sans BOM), and UnicodeDammit.unicode_markup is
always the converted Unicode equivalent of the original
markup. Previously, UnicodeDammit.markup was treated inconsistently
and would often end up containing Unicode. UnicodeDammit.markup was
not a documented attribute, but if you were using it, you probably
want to switch to using .unicode_markup instead.

If UnicodeDammit(s).unicode_markup is used here instead of UnicodeDammit(s).markup, a unicode string is returned.
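A minimal sketch of what such a change could look like, assuming the conversion goes through UnicodeDammit as described above. The names text_from_dammit and as_unicode_sketch are hypothetical and are not the actual commoncode code; the fallback to .markup and the UTF-8 decode with replacement are assumptions for the case where no converted text is available.

```python
def text_from_dammit(dammit):
    """Return a str from a UnicodeDammit-like object (hypothetical helper)."""
    # .unicode_markup is the documented attribute holding the converted text.
    text = getattr(dammit, "unicode_markup", None)
    if text is None:
        # Assumption: fall back to the raw markup if no converted text exists.
        text = dammit.markup
    if isinstance(text, bytes):
        # Assumption: UTF-8 with replacement is an acceptable last resort.
        # With 4.13+, .markup is always bytes, so this branch covers it.
        text = text.decode("utf-8", errors="replace")
    return text


def as_unicode_sketch(s):
    """Hypothetical stand-in for as_unicode(s); returns str for str or bytes."""
    if isinstance(s, str):
        return s
    # Third-party import (beautifulsoup4), done lazily so that str input
    # does not need it.
    from bs4 import UnicodeDammit
    return text_from_dammit(UnicodeDammit(s))
```

Reading .unicode_markup first should keep the result a str on releases both before and after 4.13, since only .markup changed meaning according to the changelog entry quoted above.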