Add collection querying rule #962

Open
wants to merge 1 commit into
base: master
from

Conversation

lovro-bikic

Adds a rule to complement RuboCop PR rubocop/rubocop#14288.

This rule suggests using Enumerable querying methods rather than expressions with #count to check whether collections meet some criteria.

@pirj
Member

pirj commented Jun 14, 2025

Those two methods are not mutually interchangeable in the general case, are they?
The cop is marked as unsafe.
The two methods differ semantically.
Do we need a guideline to prefer one over the other?

How important is it to have this guideline to get the referenced cop merged?

@lovro-bikic
Author

Good questions! I'll answer what I can from my side.

Those two methods are not mutually interchangeable in the general case, are they?

It depends on the method signature.

With a block, they're mutually interchangeable (e.g. x.count(&:something).positive? == x.any?(&:something)) if you only care about the return value (and as long as #something has no side effects).
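A minimal sketch of the block-form equivalence, assuming a plain Array and side-effect-free predicates. The sample data and variable names are illustrative and do not come from the PR:

```ruby
# Illustrative data (an assumption, not from the PR).
numbers = [1, 2, 3, 4]

# "At least one match": count with a block vs. any?
any_even_via_count = numbers.count(&:even?).positive?
any_even_via_any   = numbers.any?(&:even?)

# "No match": count with a block vs. none?
no_negative_via_count = numbers.count(&:negative?).zero?
no_negative_via_none  = numbers.none?(&:negative?)

# "Exactly one match": count with a block vs. one?
one_four_via_count = numbers.count { |n| n == 4 } == 1
one_four_via_one   = numbers.one? { |n| n == 4 }
```

Each pair evaluates to the same boolean here; only the return value is compared, not how many times the block runs.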

With no arguments, they're mutually interchangeable as long as the collection doesn't include falsy values (this has been noted in the rule).
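The falsy-value caveat can be sketched as follows, assuming a plain Array whose only elements are nil and false (illustrative data, not from the PR):

```ruby
# Illustrative data (an assumption): a collection holding only falsy values.
flags = [nil, false]

# Without a block, count returns the number of elements, truthy or not.
non_empty_via_count = flags.count.positive?  # true, there are two elements

# Without a block, any? asks whether at least one element is truthy.
non_empty_via_any = flags.any?               # false, no element is truthy

# The same divergence shows up for the "empty" check.
empty_via_count = flags.count.zero?          # false
empty_via_none  = flags.none?                # true
```

For collections that may contain nil or false, the argument-less forms therefore answer different questions.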

The two methods differ semantically.

This can be discussed, but assuming users work with truthy-value collections, I believe the semantics are the same, but predicates express them more clearly.

For example, in this Rails report, the cop commonly caught code like Article.where(published: true).count == 0 (check whether there are no published articles), an intention that I think Article.where(published: true).none? makes more obvious and readable.

The cop is marked as unsafe.

I'm not sure what this comment is meant to signify, but there's a fair number of unsafe cops that are linked to style guide rules.

@pirj
Member

pirj commented Jun 14, 2025

With a block, they're mutually interchangeable

Fair enough. However, we can’t guide those who have falsey values in enumerables if they’re not passing a block.

if you only care about the return value (as long as #something doesn't run side effects).

Interesting point. any? returns early, without calling the block on all entries, while count calls it on every entry.
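A small sketch of that difference, assuming a plain Array and a block that only increments a counter as its side effect (names and data are illustrative):

```ruby
# Illustrative data (an assumption, not from the PR).
numbers = [1, 2, 3, 4]

# any? stops at the first element for which the block is truthy.
any_calls = 0
numbers.any? do |n|
  any_calls += 1
  n.even?
end

# count has to visit every element to produce a total.
count_calls = 0
numbers.count do |n|
  count_calls += 1
  n.even?
end

puts "any? invoked the block #{any_calls} times"     # 2 (stops at the first even number)
puts "count invoked the block #{count_calls} times"  # 4 (visits every element)
```

If the block has side effects, swapping count for any? changes how often they happen, even though the boolean result is the same.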

For example, rubocop/rubocop#14288 (comment), the cop commonly caught stuff like Article.where(published: true).count == 0 (check if there are no published articles), the intention which I think Article.where(published: true).none? makes more obvious and human readable

True that. But that’s AR-specific. There’s also performance to consider: rubocop/rails-style-guide#232 (comment)

but there's a fair number of unsafe cops that are linked to style guide rules.

I would guess that’s because the cop can’t reliably detect the type, e.g. Array vs AR::Relation.
In our case, though, it’s not only the cop that is unsafe; there’s some unsafety in the guideline itself, too, specifically the non-block form with enumerables that contain falsey values.

What do you think of reducing the scope of the guideline so that it covers only the block form of count?

My primary concern here is that these guidelines aim to prevent programmers from making mistakes, whereas using any? instead of count may work, but is a time bomb that will blow up when falsey values start appearing in the data.

@bbatsov
Collaborator

bbatsov commented Jun 14, 2025

The guidelines are not laws, but rather suggestions for the most common cases that people might encounter. While there are caveats to be kept in mind, I do think that in most cases using the querying methods is the way to go and most likely people don't use them just because they forget about them (or are not aware of them in the first place).

README.adoc Outdated
@@ -4690,6 +4690,31 @@ ary[..42]
ary[0..42]
----

=== Collection querying [[collection-querying]]

When possible, use https://docs.ruby-lang.org/en/master/Enumerable.html#module-Enumerable-label-Methods+for+Querying[predicate methods from `Enumerable`] rather than expressions with `#count`.
Collaborator

You might also want to add a bit of rationale here (e.g. readability and better performance in some cases)

Author

Added a paragraph that explains readability and performance benefits

[source,ruby]
----
# bad
array.count > 0
Collaborator

Btw, there's also length to consider. ;-)

Author

Yep, updated. Just note that I wouldn't include length/size in the cop, because it would yield too many false positives: Ruby has core classes that implement those methods but aren't Enumerable (e.g. Integer, String, File, etc.).
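A rough illustration of the false-positive concern, using only core classes. The variable names are illustrative; this is not how the cop itself inspects code:

```ruby
# String and Integer respond to size but are not Enumerable,
# so they have no any?/none? to switch to.
string_is_enumerable  = "abc".is_a?(Enumerable)       # false
string_has_size       = "abc".respond_to?(:size)      # true
string_has_any        = "abc".respond_to?(:any?)      # false

integer_is_enumerable = 42.is_a?(Enumerable)          # false
integer_has_size      = 42.respond_to?(:size)         # true

# Array is Enumerable, so the predicate methods are available.
array_is_enumerable   = [1, 2, 3].is_a?(Enumerable)   # true
array_has_any         = [1, 2, 3].respond_to?(:any?)  # true
```

A static check that flagged every `x.size > 0` would suggest `x.any?` even when x is a String, where that method does not exist.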

@bbatsov
Collaborator

bbatsov commented Jun 15, 2025

One more thing - this should probably mention length as well. (https://batsov.com/articles/2014/02/17/the-elements-of-style-in-ruby-number-13-length-vs-size-vs-count/)

In general for me the use of count without a block is a code smell on its own.

@lovro-bikic lovro-bikic force-pushed the collection-querying branch from 97d9d25 to 6769f36 Compare June 15, 2025 20:35