
Do not find erroneous literals when there is a capture group and a character class of five or more #1081

Closed
PeterSP opened this issue Oct 8, 2018 · 4 comments
Labels
duplicate An issue that is duplicative of another.

Comments

@PeterSP
Contributor

PeterSP commented Oct 8, 2018

What version of ripgrep are you using?

ripgrep 0.10.0
-SIMD -AVX (compiled)
+SIMD +AVX (runtime)

How did you install ripgrep?

I used cargo install

What operating system are you using ripgrep on?

Darwin C02TM0LHG8WL 17.7.0 Darwin Kernel Version 17.7.0: Thu Jun 21 22:53:14 PDT 2018; root:xnu-4570.71.2~1/RELEASE_X86_64 x86_64

If this is a bug, what is the actual behavior?

$ echo " a_" | rg ' ([abcde]+_)' --debug
DEBUG|grep_regex::literal|${HOME}/.cargo/registry/src/github.com-1ecc6299db9ec823/grep-regex-0.1.1/src/literal.rs:110: required literal found: " _"
DEBUG|globset|${HOME}/.cargo/registry/src/github.com-1ecc6299db9ec823/globset-0.4.2/src/lib.rs:429: built glob set; 0 literals, 0 basenames, 8 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
$ echo " a_" | rg ' ([abcd]+_)' --debug
DEBUG|grep_regex::literal|${HOME}/.cargo/registry/src/github.com-1ecc6299db9ec823/grep-regex-0.1.1/src/literal.rs:100: required literals found: [Cut( a), Cut( b), Cut( c), Cut( d), Complete( _)]
DEBUG|globset|${HOME}/.cargo/registry/src/github.com-1ecc6299db9ec823/globset-0.4.2/src/lib.rs:429: built glob set; 0 literals, 0 basenames, 8 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
 a_

Or, without all that noise:

$ echo " a_" | rg ' ([abcde]+_)'
$ echo " a_" | rg ' ([abcd]+_)'
 a_

If this is a bug, what is the expected behavior?

What do you think ripgrep should have done?

  1. Not find " _" as a required literal; it occurs in none of the strings the pattern is meant to match.
  2. Find " a_" as a match.
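
To illustrate why the bad literal leads to the missed match, here is a minimal Python sketch of a literal prefilter. It is not ripgrep's implementation: `prefiltered_search` is a hypothetical helper, and the sketch assumes the optimization skips any line that lacks the extracted literal before the regex is ever run.

```python
import re

def prefiltered_search(pattern, required_literal, line):
    """Run the regex only if the line contains the supposedly required literal.

    Hypothetical helper for illustration; not part of ripgrep or grep-regex.
    """
    if required_literal not in line:
        return None  # line is skipped without running the regex
    return re.search(pattern, line)

line = " a_"
pattern = r" ([abcde]+_)"

# The regex on its own matches the line.
plain = re.search(pattern, line)

# With the erroneous literal " _", the prefilter rejects the line first,
# because " a_" does not contain " _" as a substring.
filtered = prefiltered_search(pattern, " _", line)

print(plain is not None, filtered is not None)
```

Under that assumption, any literal that is not a substring of every possible match turns into a false negative, which is the behavior reported above.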
@lespea

lespea commented Oct 8, 2018

Looks like the same issue as #1064. I tested the version of ripgrep with this fixed and it appears to work as expected.

$  echo " a_" | rg ' ([abcde]+_)'; echo " a_" | rg ' ([abcd]+_)'
a_
a_

@PeterSP
Contributor Author

PeterSP commented Oct 8, 2018

It appears that this is a dupe of #1064 (in that it's fixed by the associated change).

$ cargo uninstall ripgrep
$ cargo install ripgrep --git git@github.com:BurntSushi/ripgrep.git
...
$ echo "abg" | rg 'a([bcdef]+g)' --debug
DEBUG|grep_regex::literal|grep-regex/src/literal.rs:110: required literal found: "a"
DEBUG|globset|globset/src/lib.rs:429: built glob set; 0 literals, 0 basenames, 8 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
abg
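
As a rough sanity check on the literal reported in the new debug output, the following Python sketch brute-forces short strings and tests whether a candidate literal is contained in every string the pattern matches. `literal_is_required`, the small alphabets, and the length bound are all assumptions made for illustration; this is unrelated to how grep-regex actually extracts literals.

```python
import re
from itertools import product

def literal_is_required(pattern, literal, alphabet, max_len):
    """Return True if every string over `alphabet` (length 1..max_len) that the
    pattern matches also contains `literal`. Hypothetical helper for illustration."""
    regex = re.compile(pattern)
    for n in range(1, max_len + 1):
        for chars in product(alphabet, repeat=n):
            s = "".join(chars)
            if regex.search(s) and literal not in s:
                return False  # found a matching string that lacks the literal
    return True

# "a" is a sound required literal for the pattern tested after the fix.
print(literal_is_required(r"a([bcdef]+g)", "a", "abg", 4))

# " _" is not a sound required literal for the originally reported pattern.
print(literal_is_required(r" ([abcde]+_)", " _", " a_", 4))
```

The first call prints `True` and the second prints `False`, consistent with "a" being a safe literal and " _" being the erroneous one from the original report.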

@PeterSP PeterSP closed this as completed Oct 8, 2018
@PeterSP
Contributor Author

PeterSP commented Oct 8, 2018

I can still add a regression test for this, but I suspect it's not worth it (as it's nearly identical to the regression test for #1064).

@BurntSushi
Owner

Yeah, I think we are all set here. Thanks for the report!

@BurntSushi BurntSushi added the duplicate label Oct 8, 2018