False positive in invalid_regex with unicode class and bytes regex #6005

Closed
@michaelsproul

Description

The invalid_regex lint incorrectly flags a valid regular expression containing a Unicode general category as invalid when the pattern is passed as a raw string to regex::bytes::Regex::new.

Minimal example:

extern crate regex;

use regex::bytes::Regex;

fn main() {
    let re = Regex::new(r"\p{C}").unwrap();
    let text = "hello world\0";
    let processed_text = String::from_utf8(re.replace_all(text.as_bytes(), &b""[..]).to_vec()).unwrap();
    println!("{:?}", processed_text);
}

Playground link: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=b4cfb83fe8ffa625e5c5881b05a89dbc

Clippy produces this error:

Checking playground v0.0.1 (/playground)
error: regex syntax error: Unicode not allowed here
 --> src/main.rs:6:27
  |
6 |     let re = Regex::new(r"\p{C}").unwrap();
  |                           ^^^^^
  |
  = note: `#[deny(clippy::invalid_regex)]` on by default
  = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#invalid_regex

error: aborting due to previous error
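
As a stopgap while the lint misfires, the control-character stripping from the minimal example can be approximated without the regex crate. This is only a sketch and not an equivalent: `strip_control` is a hypothetical helper, and `char::is_control` covers only the `Cc` general category, which is a subset of what `\p{C}` matches.

```rust
/// Hypothetical helper: removes Unicode control characters (general category Cc).
/// Assumption: Cc is sufficient for the use case; `\p{C}` also covers
/// Cf, Cs, Co and Cn, which this function leaves untouched.
fn strip_control(text: &str) -> String {
    text.chars().filter(|c| !c.is_control()).collect()
}

fn main() {
    // Same input as the minimal example: a string with a trailing NUL.
    let processed_text = strip_control("hello world\0");
    println!("{:?}", processed_text);
}
```

If the full `\p{C}` class is needed, this sketch does not replace the regex; it only sidesteps the lint for inputs where control characters are the sole concern.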

Clippy seems to be OK with:

Meta

  • regex crate v1.3.9
  • cargo clippy -V: clippy 0.0.212 (0d0f6b1 2020-09-03)
  • rustc -Vv:
    rustc 1.46.0 (04488afe3 2020-08-24)
    binary: rustc
    commit-hash: 04488afe34512aa4c33566eb16d8c912a3ae04f9
    commit-date: 2020-08-24
    host: x86_64-unknown-linux-gnu
    release: 1.46.0
    LLVM version: 10.0
    

Labels

  • C-bug (Category: Clippy is not doing the correct thing)
