Migrate core test to insta, part1 #16324
Thank you @Chen-Yuan-Lai
Hey @Chen-Yuan-Lai! In the PR title you have
but in the PR body you write
Just FYI, if you expect more work on this issue, you may want to change "closes" to "part of", because otherwise GitHub will actually close the issue once this PR is merged.
Thank you! I wrote some comments, please let me know what you think :)
let plan = test_sql(sql).unwrap();
assert_eq!(expected_plan, format!("{plan}"));
assert_snapshot!(
format!("{plan}"),
assert_snapshot triggers format internally, so you can just replace
format!("{plan}"),
with
plan,
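To illustrate the point with the standard library only, here is a minimal sketch. `Plan` and `snapshot_text` are hypothetical stand-ins (they are not part of insta or DataFusion): the helper formats its argument itself, the way the comment above says assert_snapshot does, so pre-formatting at the call site adds nothing.

```rust
use std::fmt;

// Hypothetical stand-in for a plan type; only its Display impl matters here.
struct Plan {
    lines: Vec<String>,
}

impl fmt::Display for Plan {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.lines.join("\n"))
    }
}

// Hypothetical helper standing in for a snapshot macro: it formats the
// value internally, so callers can pass any Display value directly.
fn snapshot_text(value: impl fmt::Display) -> String {
    format!("{value}")
}
```

Under these assumptions, `snapshot_text(format!("{plan}"))` and `snapshot_text(&plan)` produce the same string, so the shorter form is enough.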
Same as above.
Thanks for the comment; the format! call is indeed unnecessary in these cases.
.collect::<Vec<_>>()
.join("\n");
insta::with_settings!({filters => vec![
(r"\d+\.?\d*[µmn]?s", "[TIME]"),
[µmn] - I think this can be somewhat flaky if the test takes more time. Maybe match on []?
Agreed, the current pattern can't match longer time units (e.g. min, h, d). However, a broader pattern might match irrelevant strings. For example, matching on [ ] gives:
metrics=[output_rows=1, elapsed_compute=110.947µs] -> metrics=[TIME]
I'm currently considering adding more time units to the regex pattern:
r"\d+\.?\d*(?:µs|ms|ns|s|min|h)\b"
Or do you have any suggestions for a more precise way to match time-related strings? Many thanks.
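As a rough illustration of what the proposed pattern is meant to catch, here is a std-only sketch that approximates it with a hand-written scanner rather than the regex crate. The names `UNITS`, `duration_len`, and `mask_durations` are hypothetical, and the word-boundary check is a simplification of `\b`; it is not the code used in the PR.

```rust
// Units from the proposed pattern (assumed list; extend as needed).
const UNITS: [&str; 6] = ["µs", "ms", "ns", "min", "s", "h"];

// Returns the byte length of a duration literal (digits, optional
// fraction, unit, then a word boundary) at the start of `s`, if any.
fn duration_len(s: &str) -> Option<usize> {
    let bytes = s.as_bytes();
    let mut i = 0;
    while i < bytes.len() && bytes[i].is_ascii_digit() {
        i += 1;
    }
    if i == 0 {
        return None;
    }
    if i < bytes.len() && bytes[i] == b'.' {
        i += 1;
        while i < bytes.len() && bytes[i].is_ascii_digit() {
            i += 1;
        }
    }
    // Only ASCII bytes were consumed, so `i` is a valid char boundary.
    let rest = &s[i..];
    for unit in UNITS.iter() {
        if let Some(tail) = rest.strip_prefix(*unit) {
            // Simplified stand-in for `\b`: the unit must not be followed
            // by another word character.
            let at_boundary = tail
                .chars()
                .next()
                .map_or(true, |c| !(c.is_alphanumeric() || c == '_'));
            if at_boundary {
                return Some(i + unit.len());
            }
        }
    }
    None
}

// Replaces every duration literal in `input` with "[TIME]".
fn mask_durations(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(c) = rest.chars().next() {
        if let Some(len) = duration_len(rest) {
            out.push_str("[TIME]");
            rest = &rest[len..];
        } else {
            out.push(c);
            rest = &rest[c.len_utf8()..];
        }
    }
    out
}
```

With this sketch, the metrics line from the example keeps `output_rows=1` and only the elapsed time is replaced, which is the behavior the narrower pattern is aiming for.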
Maybe just for this test case we could avoid using insta and use the previous approach of substring matches.
While that approach is also not ideal, it has seemed to work thus far.
Happy with keeping as is, but on this:
"might match irrelevant strings"
you can probably use lookahead https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Lookahead_assertion
you can probably use lookahead https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Lookahead_assertion
Unfortunately, Rust's regex crate does not support lookahead or lookbehind assertions 😢 (ref)
.map(|line| re.replace_all(line, "$1").trim_end().to_string())
.filter(|line| !line.is_empty() && !line.starts_with('+'))
.collect::<Vec<_>>()
.join("\n");
What do you think about just asserting the table itself without modifications? I fear that this code will:
a - make the test a bit more complicated
b - need to be manually maintained when we change the formatting
I tried asserting the raw table, but found that it produced inconsistent trailing whitespace across snapshot test runs. That is why I trimmed the table to keep only the plan content.
Do you have any ideas for keeping the table format consistent in each execution? Thanks.
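For reference, the trailing-whitespace part of that normalization can be sketched on its own with the standard library. `trim_trailing_whitespace` is a hypothetical helper, not a function from the PR, and whether this alone would make the raw table stable across runs is not established in this thread.

```rust
// Hypothetical helper: strips trailing whitespace from every line while
// leaving the rest of the table (borders, padding inside cells) untouched.
fn trim_trailing_whitespace(table: &str) -> String {
    table
        .lines()
        .map(str::trim_end)
        .collect::<Vec<_>>()
        .join("\n")
}
```

Unlike the filtering in the diff above, this keeps the border rows, so the asserted text stays close to the printed table.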
Maybe we can just keep using assert_metrics for this style of test (or perhaps revert the changes from this PR and work on migrating the metrics tests to a nicer style in a separate PR, so we don't delay merging the other parts of this PR).
Sure, I have reverted the change in this PR.
@@ -797,14 +796,16 @@ async fn explain_physical_plan_only() {
let sql = "EXPLAIN select count(*) from (values ('a', 1, 100), ('a', 2, 150)) as t (c1,c2,c3)";
let actual = execute(&ctx, sql).await;
let actual = normalize_vec_for_explain(actual);
After searching for ExplainNormalizer in the codebase, I found that only four test cases use it:
1. test_physical_plan_display_indent (initializes ExplainNormalizer)
2. test_physical_plan_display_indent_multi_children (initializes ExplainNormalizer)
3. explain_logical_plan_only (calls normalize_vec_for_explain)
4. explain_physical_plan_only (calls normalize_vec_for_explain)
Cases 1 and 2 create the physical plan with a fixed number of cores (90000), and cases 3 and 4 don't have the partitioning=RoundRobinBatch() string in the snapshot, so I think this change may not be necessary?
Also, could you please resolve the conflicts?
c3864e3 to 9944c9f
Oh! Sorry, I have corrected the PR body.
f787d3a to a43a6a3
…alyze_baseline_metrics
a43a6a3 to c981e70
Which issue does this PR close?
core tests to insta #15791.
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Yes, I manually tested the before/after changes.
Are there any user-facing changes?
No