Closed
Description
The invalid_regex lint incorrectly identifies a valid regular expression involving a Unicode general category as invalid when it is written as a raw string and passed as the argument to regex::bytes::Regex::new.
Minimal example:
extern crate regex;

use regex::bytes::Regex;

fn main() {
    let re = Regex::new(r"\p{C}").unwrap();
    let text = "hello world\0";
    let processed_text = String::from_utf8(re.replace_all(text.as_bytes(), &b""[..]).to_vec()).unwrap();
    println!("{:?}", processed_text);
}
Playground link: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=b4cfb83fe8ffa625e5c5881b05a89dbc
Clippy produces this error:
    Checking playground v0.0.1 (/playground)
error: regex syntax error: Unicode not allowed here
 --> src/main.rs:6:27
  |
6 |     let re = Regex::new(r"\p{C}").unwrap();
  |                           ^^^^^
  |
  = note: `#[deny(clippy::invalid_regex)]` on by default
  = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#invalid_regex

error: aborting due to previous error
Clippy seems to be OK with:
- regex::Regex instead of regex::bytes::Regex
- The same regex written as an ordinary string, "\\p{C}" (EDIT: actually this fails in the same way)
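Until the lint is fixed, one way to sidestep it for this particular example is to avoid the regex altogether. The following is only a sketch using the standard library: strip_control_chars is a hypothetical helper, not part of the original report, and it assumes that only control characters need removing. char::is_control matches just the Cc subcategory, which is narrower than \p{C}, and from_utf8_lossy replaces invalid UTF-8 instead of panicking as the unwrap in the minimal example would.

```rust
/// Hypothetical helper (not from the original report): removes control
/// characters from the input bytes without using a regex.
///
/// Assumption: only the `Cc` subcategory matters. `char::is_control`
/// does not cover the rest of `\p{C}` (Cf, Cs, Co, Cn).
fn strip_control_chars(input: &[u8]) -> String {
    // Invalid UTF-8 is replaced with U+FFFD rather than causing a panic.
    String::from_utf8_lossy(input)
        .chars()
        .filter(|c| !c.is_control())
        .collect()
}

fn main() {
    let text = "hello world\0";
    let processed_text = strip_control_chars(text.as_bytes());
    println!("{:?}", processed_text); // prints "hello world"
}
```

This gives the same result as the minimal example for the input shown, but it is not a drop-in replacement for \p{C} in general.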
Meta
- regex crate v1.3.9
- cargo clippy -V: clippy 0.0.212 (0d0f6b1 2020-09-03)
- rustc -Vv:
  rustc 1.46.0 (04488afe3 2020-08-24)
  binary: rustc
  commit-hash: 04488afe34512aa4c33566eb16d8c912a3ae04f9
  commit-date: 2020-08-24
  host: x86_64-unknown-linux-gnu
  release: 1.46.0
  LLVM version: 10.0