Skip to content

fix(forge test): ensure --rerun only runs failing tests from correct contracts #10753

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

dghelm
Copy link
Contributor

@dghelm dghelm commented Jun 10, 2025

Motivation

Fixes issue #10693 where forge test --rerun incorrectly runs passing tests that have the same name as failing tests in different contracts.

Solution

Implemented qualified test matching with contract-aware filtering while maintaining full backward compatibility:

Key Changes

  1. Extended TestFilter trait with matches_qualified_test(contract_name, test_name) method for context-aware filtering
  2. Enhanced FilterArgs system with qualified_failures field to store specific contract/test combinations
  3. Smart failure persistence that intelligently chooses between qualified format (ContractName_testName) and legacy format (testName1|testName2)
    based on whether multiple contracts are involved
  4. Updated test runner logic in is_matching_test_in_context to use qualified matching when appropriate

Architecture

  • Intelligent format detection: Uses qualified format only when test names are ambiguous across multiple contracts
  • Graceful fallback: Maintains legacy behavior for single-contract scenarios
  • Conservative approach: Minimizes format changes to preserve existing workflows

Files Modified

  • crates/common/src/traits.rs - Extended TestFilter trait
  • crates/forge/src/cmd/test/filter.rs - Added qualified matching logic and fixed test name format consistency
  • crates/forge/src/cmd/test/mod.rs - Enhanced failure persistence and loading with intelligent format selection
  • crates/forge/src/multi_runner.rs - Updated test matching logic to use qualified context
  • crates/forge/tests/cli/test_cmd.rs - Added comprehensive regression test

Backward Compatibility

All existing functionality is preserved:

  • Single-contract scenarios continue using legacy format
  • Existing test failure files work unchanged
  • All CLI arguments and behavior remain the same

Side Effects and Considerations

Potential Side Effects

  1. Test failure file format changes: Files may now contain qualified names (ContractTest_testName) in multi-contract scenarios
  2. Filter matching behavior: Slightly different matching logic when qualified failures are present
  3. Performance impact: Minimal additional processing to detect ambiguous test names

Mitigation

  • Backward compatibility preserved: Legacy format still used for single-contract scenarios
  • Graceful fallback: System falls back to legacy behavior when qualified parsing fails
  • Conservative approach: Only uses qualified format when absolutely necessary

PR Checklist

  • Added Tests
    • Added should_replay_failures_only_correct_contract regression test
    • Verified existing should_replay_failures_only test continues to pass
  • Added Documentation
    • Comprehensive inline code comments explaining the qualified matching logic
    • Updated function documentation for new matches_qualified_test method
  • Breaking changes

This contribution was made to explore the viability of bug-fixing work via Claude Code assistance, as prompted by this Tweet. If it's a mess, please don't spend too much time helping me get it over the line.

Copy link
Collaborator

@grandizzy grandizzy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you for your PR, left some comments

// This is the key test: --rerun should NOT run both contracts with same test name
// Before the fix, this would run 2 tests (both testSameName functions)
// After the fix, this should run only 1 test (only the specific failing one)
cmd.forge_fuse().args(["test", "--rerun"]).assert_failure();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's check the stderr and make sure there's only one test executed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modified test to capture output and assert stdout contains "0 tests passed, 1 failed, 0 skipped (1 total tests)" to ensure exactly one test runs. Does this work?

// Parse qualified pattern format: ContractName_testName or legacy format
if content.contains('_') && !content.contains('|') {
// Single qualified failure
if let Some(pos) = content.rfind('_') {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is correct as tests could have the test_ name format, so in this case a ContractName_test_Name pattern will result in contract_part being ContractName_test and test_part - Name

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call.

Created split_qualified_test_name() function that looks for _test patterns to properly separate contract and test names, with fallback to original behavior.

@@ -42,6 +42,11 @@ pub struct FilterArgs {
/// Only show coverage for files that do not match the specified regex pattern.
#[arg(long = "no-match-coverage", visible_alias = "nmco", value_name = "REGEX")]
pub coverage_pattern_inverse: Option<regex::Regex>,

/// Qualified test failures from --rerun (contract, test) pairs.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should make this pub arg but rather set (or reuse) the rerun flag?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will remove pub in next commit.

Not sure of a better option. The rerun flag is insufficient as a boolean when this is holding internal state that contains the actual list of (contract, test) pairs to rerun.

The flow is:
f--rerun flag → load failure file → populate qualified_failures → use in filtering

As I see it, the alternatives would be:

  1. Pass the data separately through multiple function parameters (messier)
  2. Global state (bad practice)
  3. Reload the file multiple times (inefficient)

@dghelm dghelm requested a review from grandizzy June 11, 2025 18:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

bug(forge test): --rerun runs passing tests of the same name in different contracts
2 participants