Skip to content

Allow return type annotations to use their own parentheses #6436

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Aug 11, 2023

Conversation

charliermarsh
Copy link
Member

@charliermarsh charliermarsh commented Aug 9, 2023

Summary

This PR modifies our logic for wrapping return type annotations. Previously, we always wrapped the annotation in parentheses if it expanded; however, Black only exhibits this behavior when the function parameters is empty (i.e., it doesn't and can't break). In other cases, it uses the normal parenthesization rules, allowing nodes to bring their own parentheses.

For example, given:

def xxxxxxxxxxxxxxxxxxxxxxxxxxxx() -> Set[
    "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
]:
    ...

def xxxxxxxxxxxxxxxxxxxxxxxxxxxx(x) -> Set[
    "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
]:
    ...

Black will format as:

def xxxxxxxxxxxxxxxxxxxxxxxxxxxx() -> (
    Set[
        "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    ]
):
    ...


def xxxxxxxxxxxxxxxxxxxxxxxxxxxx(
    x,
) -> Set[
    "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
]:
    ...

Whereas, prior to this PR, Ruff would format as:

def xxxxxxxxxxxxxxxxxxxxxxxxxxxx() -> (
    Set[
        "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    ]
):
    ...


def xxxxxxxxxxxxxxxxxxxxxxxxxxxx(
    x,
) -> (
    Set[
        "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    ]
):
    ...

Closes #6431.

Test Plan

Before:

  • zulip: 0.99702
  • django: 0.99784
  • warehouse: 0.99585
  • build: 0.75623
  • transformers: 0.99470
  • cpython: 0.75988
  • typeshed: 0.74853

After:

  • zulip: 0.99724
  • django: 0.99791
  • warehouse: 0.99586
  • build: 0.75623
  • transformers: 0.99474
  • cpython: 0.75956
  • typeshed: 0.74857

@charliermarsh charliermarsh marked this pull request as draft August 9, 2023 04:16
@charliermarsh charliermarsh added the formatter Related to the formatter label Aug 9, 2023
@charliermarsh charliermarsh force-pushed the charlie/function-returns-wrap branch from 232dea5 to b06831e Compare August 9, 2023 04:17
@charliermarsh
Copy link
Member Author

This seems to do the right thing, but I'm already anticipating that I'm misunderstanding something around why / how Black makes it decisions, and the relationship to our grouping and parenthesization logic.

@charliermarsh charliermarsh marked this pull request as ready for review August 9, 2023 04:26
@charliermarsh
Copy link
Member Author

(Looks like I have one failing stability test.)

@github-actions
Copy link
Contributor

github-actions bot commented Aug 9, 2023

PR Check Results

Benchmark

Linux

group                                      main                                   pr
-----                                      ----                                   --
formatter/large/dataset.py                 1.01      9.6±0.33ms     4.2 MB/sec    1.00      9.6±0.30ms     4.3 MB/sec
formatter/numpy/ctypeslib.py               1.01  1876.6±62.75µs     8.9 MB/sec    1.00  1850.8±78.02µs     9.0 MB/sec
formatter/numpy/globals.py                 1.02   218.5±12.49µs    13.5 MB/sec    1.00    213.3±8.63µs    13.8 MB/sec
formatter/pydantic/types.py                1.02      4.1±0.19ms     6.3 MB/sec    1.00      4.0±0.14ms     6.4 MB/sec
linter/all-rules/large/dataset.py          1.00     12.5±0.29ms     3.3 MB/sec    1.02     12.7±0.28ms     3.2 MB/sec
linter/all-rules/numpy/ctypeslib.py        1.01      3.4±0.12ms     4.9 MB/sec    1.00      3.3±0.09ms     5.0 MB/sec
linter/all-rules/numpy/globals.py          1.00   479.2±14.32µs     6.2 MB/sec    1.02   487.2±17.72µs     6.1 MB/sec
linter/all-rules/pydantic/types.py         1.00      6.5±0.16ms     3.9 MB/sec    1.02      6.6±0.17ms     3.8 MB/sec
linter/default-rules/large/dataset.py      1.00      6.5±0.14ms     6.2 MB/sec    1.04      6.8±0.18ms     6.0 MB/sec
linter/default-rules/numpy/ctypeslib.py    1.00  1426.6±34.31µs    11.7 MB/sec    1.03  1466.6±67.49µs    11.4 MB/sec
linter/default-rules/numpy/globals.py      1.02    179.9±7.27µs    16.4 MB/sec    1.00    176.2±5.33µs    16.7 MB/sec
linter/default-rules/pydantic/types.py     1.00      3.0±0.06ms     8.6 MB/sec    1.01      3.0±0.07ms     8.6 MB/sec

Windows

group                                      main                                    pr
-----                                      ----                                    --
formatter/large/dataset.py                 1.00     11.6±0.64ms     3.5 MB/sec     1.03     12.0±0.72ms     3.4 MB/sec
formatter/numpy/ctypeslib.py               1.00      2.1±0.13ms     8.1 MB/sec     1.14      2.3±0.09ms     7.1 MB/sec
formatter/numpy/globals.py                 1.00   242.7±28.57µs    12.2 MB/sec     1.12   270.9±26.35µs    10.9 MB/sec
formatter/pydantic/types.py                1.00      4.4±0.28ms     5.8 MB/sec     1.12      4.9±0.20ms     5.2 MB/sec
linter/all-rules/large/dataset.py          1.01     15.6±0.59ms     2.6 MB/sec     1.00     15.4±0.41ms     2.6 MB/sec
linter/all-rules/numpy/ctypeslib.py        1.01      4.3±0.18ms     3.9 MB/sec     1.00      4.2±0.19ms     3.9 MB/sec
linter/all-rules/numpy/globals.py          1.06   568.0±26.72µs     5.2 MB/sec     1.00   537.2±22.34µs     5.5 MB/sec
linter/all-rules/pydantic/types.py         1.00      8.0±0.32ms     3.2 MB/sec     1.01      8.1±0.32ms     3.1 MB/sec
linter/default-rules/large/dataset.py      1.10      9.5±0.63ms     4.3 MB/sec     1.00      8.6±0.29ms     4.7 MB/sec
linter/default-rules/numpy/ctypeslib.py    1.00  1758.3±125.96µs     9.5 MB/sec    1.00  1761.2±62.14µs     9.5 MB/sec
linter/default-rules/numpy/globals.py      1.00   197.6±14.45µs    14.9 MB/sec     1.16   229.6±10.22µs    12.9 MB/sec
linter/default-rules/pydantic/types.py     1.00      3.5±0.18ms     7.3 MB/sec     1.10      3.8±0.20ms     6.6 MB/sec

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting and I wonder how intentional the behavior is. I tried to deduce the behavior from Black's source code. I can confirm that the behavior is different if the function has no arguments, but I haven't yet found the place where the branching happens:

  • Black uses left_hand_split for function definitions (rather than the usual rhs) [source]
  • For empty: It forces the optional parentheses of the return type and applies the default transforms for parenthesized expressions (delimiter_split, standalone_comment_split, rhs). This matches optional_parentheses
  • Non empty: It splits after the ( of the arguments and then applies the default transform for unparenthesized expressions (rhs, hug_power_op). This matches mabye_parenthesize with Parenthesize::IfBreaks

Base automatically changed from charlie/function-signature-ii to main August 9, 2023 12:14
@charliermarsh charliermarsh force-pushed the charlie/function-returns-wrap branch from a44080a to 10aef8e Compare August 9, 2023 20:53
@charliermarsh
Copy link
Member Author

(I need to figure out the instability here, then will re-request a review.)

@charliermarsh
Copy link
Member Author

charliermarsh commented Aug 9, 2023

Having trouble with this, looking for help.

Given:

def double(a: int) -> (
    int # Hello
):
    return 2*a

We first format as:

def double(
    a: int
) -> int:  # Hello
    return 2 * a

This is because we format # Hello as a trailing end-of-line comment on the return annotation, which leads to an expand_parent(), since we always expand parents for trailing comments. But the expanded parent group is the entire parameters and return type, so the parameters breaks in that odd way, I guess?

When we reformat, we get:

def double(a: int) -> int:  # Hello
    return 2 * a

Since the # Hello gets placed as a dangling comment on the function definition via placement.rs.

Is there any way for me to avoid the breaking in the first format? (This does get fixed by #6464 since we don't incorrectly break, but perhaps we won't end up merging that.)

@charliermarsh charliermarsh force-pushed the charlie/function-returns-wrap branch from 10aef8e to 6e263eb Compare August 10, 2023 00:27
@charliermarsh charliermarsh marked this pull request as ready for review August 10, 2023 00:27
Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove the comment formatting changes.

@charliermarsh charliermarsh changed the base branch from main to charlie/break August 10, 2023 12:41
@charliermarsh
Copy link
Member Author

We can certainly remove the comment formatting changes, but it then requires coming up with a strategy for resolving the instability that I documented above, which either requires further deviating from Black or changing the groups in a way that I don’t fully understand.

@MichaReiser
Copy link
Member

MichaReiser commented Aug 10, 2023

One option would be to group the parameters to revert to right to left breaking if we know that the right side has a trailing comment (it is guaranteed to break). Meaning, we then use our previous logic where we first expand the return type (which is guaranteed to expand because of the comment), and only then the parameters.

            let group_function_parameters = item
                .returns
                .as_deref()
                .is_some_and(|return_type| comments.has_trailing_comments(return_type));

            if group_function_parameters {
                write!(f, [group(&item.parameters.format())])?;
            } else {
                write!(f, [item.parameters.format()])?;
            }

@MichaReiser
Copy link
Member

Never, this doesn't work (for the opposite reason):

def xxxxxxxxxxxxxxxxxxxxxxxxxxxx(
    x, b, c, d
) -> (Set[
    "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
] # test
): 
    ...

# Pass 1
def xxxxxxxxxxxxxxxxxxxxxxxxxxxx(x, b, c, d) -> Set[
    "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
]:  # test
    ...

# Pass 2
def xxxxxxxxxxxxxxxxxxxxxxxxxxxx(
    x, b, c, d
) -> Set[
    "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
]:  # test
    ...

@MichaReiser
Copy link
Member

MichaReiser commented Aug 10, 2023

The easiest might be to group parameters if the return type doesn't have its own breakpoint and has a trailing comment. The list should be rather short because we know that we're in a return-type position. Rome has a similar function

https://github.com/rome/tools/blob/2655264565608e29229490101f7abbfa89a35598/crates/rome_js_formatter/src/js/declarations/function_declaration.rs#L258-L297

@charliermarsh
Copy link
Member Author

(Trying this.)

@charliermarsh charliermarsh force-pushed the charlie/function-returns-wrap branch from 6e263eb to 2d8650c Compare August 10, 2023 16:48
@charliermarsh
Copy link
Member Author

Another deviation is that I could decide to preserve the parentheses if a trailing comment is present there, e.g., make this stable formatting:

def double(
    a: int
) -> (
    int  # Hello
):
    pass

(I haven't looked into how difficult this would be, but I assume it would be easier than what I'm trying to do here.)

@MichaReiser
Copy link
Member

Can you push your latest change so that I can play with them myself?

@charliermarsh
Copy link
Member Author

charliermarsh commented Aug 11, 2023

I just pushed. If simpler, you can also remove the should_group_parameters to get the original behavior before that suggestion:

diff --git a/crates/ruff_python_formatter/src/statement/stmt_function_def.rs b/crates/ruff_python_formatter/src/statement/stmt_function_def.rs
index d47825cc2..39025b11c 100644
--- a/crates/ruff_python_formatter/src/statement/stmt_function_def.rs
+++ b/crates/ruff_python_formatter/src/statement/stmt_function_def.rs
@@ -60,15 +60,7 @@ impl FormatNodeRule<StmtFunctionDef> for FormatStmtFunctionDef {
         }
 
         let format_inner = format_with(|f: &mut PyFormatter| {
-            if should_group_function_parameters(
-                &item.parameters,
-                item.returns.as_deref(),
-                f.context(),
-            ) {
-                write!(f, [group(&item.parameters.format())])?;
-            } else {
-                write!(f, [item.parameters.format()])?;
-            }
+            write!(f, [item.parameters.format()])?;
 
             if let Some(return_annotation) = item.returns.as_ref() {
                 write!(f, [space(), text("->"), space()])?;

@charliermarsh charliermarsh force-pushed the charlie/function-returns-wrap branch 3 times, most recently from 3c0b569 to b6eba55 Compare August 11, 2023 16:40
@charliermarsh
Copy link
Member Author

The current version deviates from Black, but it's at least stable... Specifically, we now preserve formatting like:

def double(
    a: int
) -> (
    int  # Hello
):
    pass

I think this is reasonable but perhaps not ideal.

@charliermarsh charliermarsh force-pushed the charlie/function-returns-wrap branch 2 times, most recently from 3d96db4 to a17be4f Compare August 11, 2023 16:52
Parenthesize::Optional | Parenthesize::IfBreaks => needs_parentheses,
Parenthesize::Optional
| Parenthesize::IfBreaks
| Parenthesize::IfBreaksOrIfRequired => needs_parentheses,
};

match needs_parentheses {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit less DRY but I find it much easier to reason about the match when every OptionalParentheses has a single branch.

@charliermarsh charliermarsh marked this pull request as ready for review August 11, 2023 16:53
@charliermarsh
Copy link
Member Author

charliermarsh commented Aug 11, 2023

I think this formatting is arguably more consistent for us, since we format this:

def double(
    a # comment
):
    return 2 * a

As:

def double(
    a,  # comment
):
    return 2 * a

That is, we don't push those out to after the function definition (unlike Black).

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the outcome. It even preserves the authors comment choice:

e.g. I understand that we would keep

def test() -> (
	[a, b ]  # noqa: Ruf01
): pass

@charliermarsh charliermarsh force-pushed the charlie/function-returns-wrap branch from a17be4f to f0b5447 Compare August 11, 2023 18:11
@charliermarsh charliermarsh enabled auto-merge (squash) August 11, 2023 18:18
@charliermarsh charliermarsh merged commit 53246b7 into main Aug 11, 2023
@charliermarsh charliermarsh deleted the charlie/function-returns-wrap branch August 11, 2023 18:19
durumu pushed a commit to durumu/ruff that referenced this pull request Aug 12, 2023
…#6436)

## Summary

This PR modifies our logic for wrapping return type annotations.
Previously, we _always_ wrapped the annotation in parentheses if it
expanded; however, Black only exhibits this behavior when the function
parameters is empty (i.e., it doesn't and can't break). In other cases,
it uses the normal parenthesization rules, allowing nodes to bring their
own parentheses.

For example, given:

```python
def xxxxxxxxxxxxxxxxxxxxxxxxxxxx() -> Set[
    "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
]:
    ...

def xxxxxxxxxxxxxxxxxxxxxxxxxxxx(x) -> Set[
    "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
]:
    ...
```

Black will format as:

```python
def xxxxxxxxxxxxxxxxxxxxxxxxxxxx() -> (
    Set[
        "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    ]
):
    ...


def xxxxxxxxxxxxxxxxxxxxxxxxxxxx(
    x,
) -> Set[
    "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
]:
    ...
```

Whereas, prior to this PR, Ruff would format as:

```python
def xxxxxxxxxxxxxxxxxxxxxxxxxxxx() -> (
    Set[
        "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    ]
):
    ...


def xxxxxxxxxxxxxxxxxxxxxxxxxxxx(
    x,
) -> (
    Set[
        "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    ]
):
    ...
```

Closes astral-sh#6431.

## Test Plan

Before:

- `zulip`: 0.99702
- `django`: 0.99784
- `warehouse`: 0.99585
- `build`: 0.75623
- `transformers`: 0.99470
- `cpython`: 0.75988
- `typeshed`: 0.74853

After:

- `zulip`: 0.99724
- `django`: 0.99791
- `warehouse`: 0.99586
- `build`: 0.75623
- `transformers`: 0.99474
- `cpython`: 0.75956
- `typeshed`: 0.74857
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
formatter Related to the formatter
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Formatter adds unnecessary parentheses around return type annotations
3 participants