-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix, document, and test parser and pretty-printer edge cases related to braced macro calls #119427
Conversation
(rustbot has picked a reviewer for you, use r? to override) |
This comment has been minimized.
This comment has been minimized.
584de16
to
48ac127
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I agree with the idea, but I'm not sure about the implementation. Is there a way to make this nicer?
| TryBlock(..) | ||
| ConstBlock(..) => false, | ||
|
||
MacCall(mac_call) => mac_call.args.delim != Delimiter::Brace, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this needed, if all the call sites are not calling this if e ≈ MacCall
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the commit that adds this MacCall
case inside expr_requires_semi_to_be_stmt
, I change all the call sites to preserve their pre-existing behavior. So this case starts out as not reachable. But after that, one at a time for each call site, I update each call site that should be reaching this, together with adding documentation and tests for that call site.
☔ The latest upstream changes (presumably #120121) made this pull request unmergeable. Please resolve the merge conflicts. |
r? compiler |
#120690 |
This is waiting on me to incorporate Waffle's insightful feedback, and to rebase. |
Give a name to each distinct manipulation of pretty-printer FixupContext There are only 7 distinct ways that the AST pretty-printer interacts with FixupContext: 3 constructors (including Default), 2 transformations, and 2 queries. This PR turns these into associated functions which can be documented with examples. This PR unblocks rust-lang#119427 (comment). In order to improve the pretty-printer's behavior regarding parenthesization of braced macro calls in match arms, which have different grammar than macro calls in statements, FixupContext needs to be extended with 2 new fields. In the previous approach, that would be onerous. In the new approach, all it entails is 1 new constructor (`FixupContext::new_match_arm()`).
Give a name to each distinct manipulation of pretty-printer FixupContext There are only 7 distinct ways that the AST pretty-printer interacts with FixupContext: 3 constructors (including Default), 2 transformations, and 2 queries. This PR turns these into associated functions which can be documented with examples. This PR unblocks rust-lang#119427 (comment). In order to improve the pretty-printer's behavior regarding parenthesization of braced macro calls in match arms, which have different grammar than macro calls in statements, FixupContext needs to be extended with 2 new fields. In the previous approach, that would be onerous. In the new approach, all it entails is 1 new constructor (`FixupContext::new_match_arm()`).
Rollup merge of rust-lang#124191 - dtolnay:fixup, r=compiler-errors Give a name to each distinct manipulation of pretty-printer FixupContext There are only 7 distinct ways that the AST pretty-printer interacts with FixupContext: 3 constructors (including Default), 2 transformations, and 2 queries. This PR turns these into associated functions which can be documented with examples. This PR unblocks rust-lang#119427 (comment). In order to improve the pretty-printer's behavior regarding parenthesization of braced macro calls in match arms, which have different grammar than macro calls in statements, FixupContext needs to be extended with 2 new fields. In the previous approach, that would be onerous. In the new approach, all it entails is 1 new constructor (`FixupContext::new_match_arm()`).
@rustbot ready |
Thanks for the PR, @dtolnay, looks great! However, after looking through all the commits, I think I don't understand enough about the parsing subtleties involved here to review this. r? parser |
|
self.restrictions.contains(Restrictions::STMT_EXPR) | ||
&& !classify::expr_requires_semi_to_be_stmt(e) | ||
&& !classify::expr_requires_comma_to_be_match_arm(e) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is maybe my only nit in the PR: that expr_is_complete
uses expr_requires_comma_to_be_match_arm
"feels" a bit less self-descriptive than it could be, since we use expr_is_complete
for things like parse_expr_dot_or_call_with_
which doesn't really have to do with match arms.
Not really sure how to make this actionable though, since I'm not sure if a rename seems tough. AFAICT this just preserves the old behavior, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, this is preserving old behavior. The exact same behavior that used to be incorrectly called expr_requires_semi_to_be_stmt
is now called expr_requires_comma_to_be_match_arm
.
Regarding naming, one possibility is that instead of distinguishing a "commas in match arms" and "semicolons in statements" case (and the implicit 3rd case, expressions that are neither statement nor match arm, such as a function arg), we could rename things along the following lines:
- "parse the longest possible expression"
- "parse an expression using earlier boundary rule" (for a match arm, range, or whatever else looks at expr_is_complete)
- "parse an expression until the earliest possible boundary" (for a statement)
Another possibility is to omit this specific commit ("Add classify::expr_requires_comma_to_be_match_arm") and keep using match e { MacCall(_) => …, _ => expr_requires_semi_to_be_stmt(e) }
in the cases where expr_requires_semi_to_be_stmt
is not the desired behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more option — instead of having what's currently in the PR:
// compiler/rustc_ast/src/util/classify.rs
pub fn expr_requires_semi_to_be_stmt(e: &ast::Expr) -> bool {
match &e.kind {
If(..)
| Match(..)
| Block(..)
| While(..)
| Loop(..)
| ForLoop { .. }
| TryBlock(..)
| ConstBlock(..) => false,
MacCall(mac_call) => mac_call.args.delim != Delimiter::Brace,
_ => true,
}
}
pub fn expr_requires_comma_to_be_match_arm(e: &ast::Expr) -> bool {
match &e.kind {
MacCall(_) => true,
_ => expr_requires_semi_to_be_stmt(e),
}
}
we could have:
pub fn expr_is_complete(e: &ast::Expr) -> bool {
// this is negative of the old "expr_requires_semi_to_be_stmt"
match &e.kind {
If(..)
| Match(..)
| Block(..)
| While(..)
| Loop(..)
| ForLoop { .. }
| TryBlock(..)
| ConstBlock(..) => true,
MacCall(_) | _ => false,
}
}
pub fn expr_requires_semi_to_be_stmt(e: &ast::Expr) -> bool {
match &e.kind {
MacCall(mac_call) => mac_call.args.delim != Delimiter::Brace,
_ => !expr_is_complete(e),
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer either the final option or the renaming you suggested first above. I'll leave it up to you. Otherwise this PR looks good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🆒
@bors r+ |
This comment has been minimized.
This comment has been minimized.
@bors r=compiler-errors |
☀️ Test successful - checks-actions |
Finished benchmarking commit (8cc6f34): comparison URL. Overall result: no relevant changes - no action needed@rustbot label: -perf-regression Instruction countThis benchmark run did not return any relevant results for this metric. Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 675.636s -> 678.17s (0.38%) |
Inline ExprPrecedence::order into Expr::precedence The representation of expression precedence in rustc_ast has been an obstacle to further improvements in the pretty-printer (continuing from rust-lang#119105 and rust-lang#119427). Previously the operation of *"does this expression have lower precedence than that one"* (relevant for parenthesis insertion in macro-generated syntax trees) consisted of 3 steps: 1. Convert `Expr` to `ExprPrecedence` using `.precedence()` 2. Convert `ExprPrecedence` to `i8` using `.order()` 3. Compare using `<` As far as I can guess, the reason for the separation between `precedence()` and `order()` was so that both `rustc_ast::Expr` and `rustc_hir::Expr` could convert as straightforwardly as possible to the same `ExprPrecedence` enum, and then the more finicky logic performed by `order` could be present just once. The mapping between `Expr` and `ExprPrecedence` was intended to be as straightforward as possible: ```rust match self.kind { ExprKind::Closure(..) => ExprPrecedence::Closure, ... } ``` although there were exceptions of both many-to-one, and one-to-many: ```rust ExprKind::Underscore => ExprPrecedence::Path, ExprKind::Path(..) => ExprPrecedence::Path, ... ExprKind::Match(_, _, MatchKind::Prefix) => ExprPrecedence::Match, ExprKind::Match(_, _, MatchKind::Postfix) => ExprPrecedence::PostfixMatch, ``` Where the nature of `ExprPrecedence` becomes problematic is when a single expression kind might be associated with multiple different precedence levels depending on context (outside the expression) and contents (inside the expression). For example consider what is the precedence of an ExprKind::Closure `$closure`. Well, on the left-hand side of a binary operator it would need parentheses in order to avoid the trailing binary operator being absorbed into the closure body: `($closure) + Rhs`, so the precedence is something lower than that of `+`. But on the right-hand side of a binary operator, a closure is just a straightforward prefix expression like a unary op, which is a relatively high precedence level, higher than binops but lower than method calls: `Lhs + $closure` is fine without parens but `($closure).method()` needs them. But as a third case, if the closure contains an explicit return type, then the precedence is an even higher level than that, never needing parenthesization even in a binop left-hand side or method call: `|| -> bool { false } + Rhs` or `|| -> bool { false }.method()`. You can see that trying to capture all of this resolution about expressions into `ExprPrecedence` violates the intention of `ExprPrecedence` being a straightforward one-to-one correspondence from each AST and HIR `ExprKind` variant. It would be possible to attempt that by doing stuff like `ExprPrecedence::Closure(Side::Leading, ReturnType::No)`, but I don't foresee the original envisioned benefit of the `precedence()`/`order()` distinction being retained in this approach. Instead I want to move toward a model that Syn has been using successfully. In Syn, there is a Precedence enum but it differs from rustc in the following ways: - There are [relatively few variants](https://github.com/dtolnay/syn/blob/2.0.87/src/precedence.rs#L11-L47) compared to rustc's `ExprPrecedence`. For example there is no distinction at the precedence level between returns and closures, or between loops and method calls. - We distinguish between [leading](https://github.com/dtolnay/syn/blob/2.0.87/src/fixup.rs#L293) and [trailing](https://github.com/dtolnay/syn/blob/2.0.87/src/fixup.rs#L309) precedence, taking into account an expression's context such as what token follows it (for various syntactic bail-outs in Rust's grammar, like ambiguities around break-with-value) and how it relates to operators from the surrounding syntax tree. - There are no hardcoded mysterious integer quantities like rustc's `PREC_CLOSURE = -40`. All precedence comparisons are performed via PartialOrd on a C-like enum. This PR is just a first step in these changes. As you can tell from Syn, I definitely think there is value in having a dedicated type to represent precedence, instead of what `order()` is doing with `i8`. But that is a whole separate adventure because rustc_ast doesn't even agree consistently on `i8` being the type for precedence order; `AssocOp::precedence` instead uses `usize` and there are casts in both directions. It is likely that a type called `ExprPrecedence` will re-appear, but it will look substantially different from the one that existed before this PR.
Inline ExprPrecedence::order into Expr::precedence The representation of expression precedence in rustc_ast has been an obstacle to further improvements in the pretty-printer (continuing from rust-lang#119105 and rust-lang#119427). Previously the operation of *"does this expression have lower precedence than that one"* (relevant for parenthesis insertion in macro-generated syntax trees) consisted of 3 steps: 1. Convert `Expr` to `ExprPrecedence` using `.precedence()` 2. Convert `ExprPrecedence` to `i8` using `.order()` 3. Compare using `<` As far as I can guess, the reason for the separation between `precedence()` and `order()` was so that both `rustc_ast::Expr` and `rustc_hir::Expr` could convert as straightforwardly as possible to the same `ExprPrecedence` enum, and then the more finicky logic performed by `order` could be present just once. The mapping between `Expr` and `ExprPrecedence` was intended to be as straightforward as possible: ```rust match self.kind { ExprKind::Closure(..) => ExprPrecedence::Closure, ... } ``` although there were exceptions of both many-to-one, and one-to-many: ```rust ExprKind::Underscore => ExprPrecedence::Path, ExprKind::Path(..) => ExprPrecedence::Path, ... ExprKind::Match(_, _, MatchKind::Prefix) => ExprPrecedence::Match, ExprKind::Match(_, _, MatchKind::Postfix) => ExprPrecedence::PostfixMatch, ``` Where the nature of `ExprPrecedence` becomes problematic is when a single expression kind might be associated with multiple different precedence levels depending on context (outside the expression) and contents (inside the expression). For example consider what is the precedence of an ExprKind::Closure `$closure`. Well, on the left-hand side of a binary operator it would need parentheses in order to avoid the trailing binary operator being absorbed into the closure body: `($closure) + Rhs`, so the precedence is something lower than that of `+`. But on the right-hand side of a binary operator, a closure is just a straightforward prefix expression like a unary op, which is a relatively high precedence level, higher than binops but lower than method calls: `Lhs + $closure` is fine without parens but `($closure).method()` needs them. But as a third case, if the closure contains an explicit return type, then the precedence is an even higher level than that, never needing parenthesization even in a binop left-hand side or method call: `|| -> bool { false } + Rhs` or `|| -> bool { false }.method()`. You can see that trying to capture all of this resolution about expressions into `ExprPrecedence` violates the intention of `ExprPrecedence` being a straightforward one-to-one correspondence from each AST and HIR `ExprKind` variant. It would be possible to attempt that by doing stuff like `ExprPrecedence::Closure(Side::Leading, ReturnType::No)`, but I don't foresee the original envisioned benefit of the `precedence()`/`order()` distinction being retained in this approach. Instead I want to move toward a model that Syn has been using successfully. In Syn, there is a Precedence enum but it differs from rustc in the following ways: - There are [relatively few variants](https://github.com/dtolnay/syn/blob/2.0.87/src/precedence.rs#L11-L47) compared to rustc's `ExprPrecedence`. For example there is no distinction at the precedence level between returns and closures, or between loops and method calls. - We distinguish between [leading](https://github.com/dtolnay/syn/blob/2.0.87/src/fixup.rs#L293) and [trailing](https://github.com/dtolnay/syn/blob/2.0.87/src/fixup.rs#L309) precedence, taking into account an expression's context such as what token follows it (for various syntactic bail-outs in Rust's grammar, like ambiguities around break-with-value) and how it relates to operators from the surrounding syntax tree. - There are no hardcoded mysterious integer quantities like rustc's `PREC_CLOSURE = -40`. All precedence comparisons are performed via PartialOrd on a C-like enum. This PR is just a first step in these changes. As you can tell from Syn, I definitely think there is value in having a dedicated type to represent precedence, instead of what `order()` is doing with `i8`. But that is a whole separate adventure because rustc_ast doesn't even agree consistently on `i8` being the type for precedence order; `AssocOp::precedence` instead uses `usize` and there are casts in both directions. It is likely that a type called `ExprPrecedence` will re-appear, but it will look substantially different from the one that existed before this PR.
Rollup merge of rust-lang#133140 - dtolnay:precedence, r=fmease Inline ExprPrecedence::order into Expr::precedence The representation of expression precedence in rustc_ast has been an obstacle to further improvements in the pretty-printer (continuing from rust-lang#119105 and rust-lang#119427). Previously the operation of *"does this expression have lower precedence than that one"* (relevant for parenthesis insertion in macro-generated syntax trees) consisted of 3 steps: 1. Convert `Expr` to `ExprPrecedence` using `.precedence()` 2. Convert `ExprPrecedence` to `i8` using `.order()` 3. Compare using `<` As far as I can guess, the reason for the separation between `precedence()` and `order()` was so that both `rustc_ast::Expr` and `rustc_hir::Expr` could convert as straightforwardly as possible to the same `ExprPrecedence` enum, and then the more finicky logic performed by `order` could be present just once. The mapping between `Expr` and `ExprPrecedence` was intended to be as straightforward as possible: ```rust match self.kind { ExprKind::Closure(..) => ExprPrecedence::Closure, ... } ``` although there were exceptions of both many-to-one, and one-to-many: ```rust ExprKind::Underscore => ExprPrecedence::Path, ExprKind::Path(..) => ExprPrecedence::Path, ... ExprKind::Match(_, _, MatchKind::Prefix) => ExprPrecedence::Match, ExprKind::Match(_, _, MatchKind::Postfix) => ExprPrecedence::PostfixMatch, ``` Where the nature of `ExprPrecedence` becomes problematic is when a single expression kind might be associated with multiple different precedence levels depending on context (outside the expression) and contents (inside the expression). For example consider what is the precedence of an ExprKind::Closure `$closure`. Well, on the left-hand side of a binary operator it would need parentheses in order to avoid the trailing binary operator being absorbed into the closure body: `($closure) + Rhs`, so the precedence is something lower than that of `+`. But on the right-hand side of a binary operator, a closure is just a straightforward prefix expression like a unary op, which is a relatively high precedence level, higher than binops but lower than method calls: `Lhs + $closure` is fine without parens but `($closure).method()` needs them. But as a third case, if the closure contains an explicit return type, then the precedence is an even higher level than that, never needing parenthesization even in a binop left-hand side or method call: `|| -> bool { false } + Rhs` or `|| -> bool { false }.method()`. You can see that trying to capture all of this resolution about expressions into `ExprPrecedence` violates the intention of `ExprPrecedence` being a straightforward one-to-one correspondence from each AST and HIR `ExprKind` variant. It would be possible to attempt that by doing stuff like `ExprPrecedence::Closure(Side::Leading, ReturnType::No)`, but I don't foresee the original envisioned benefit of the `precedence()`/`order()` distinction being retained in this approach. Instead I want to move toward a model that Syn has been using successfully. In Syn, there is a Precedence enum but it differs from rustc in the following ways: - There are [relatively few variants](https://github.com/dtolnay/syn/blob/2.0.87/src/precedence.rs#L11-L47) compared to rustc's `ExprPrecedence`. For example there is no distinction at the precedence level between returns and closures, or between loops and method calls. - We distinguish between [leading](https://github.com/dtolnay/syn/blob/2.0.87/src/fixup.rs#L293) and [trailing](https://github.com/dtolnay/syn/blob/2.0.87/src/fixup.rs#L309) precedence, taking into account an expression's context such as what token follows it (for various syntactic bail-outs in Rust's grammar, like ambiguities around break-with-value) and how it relates to operators from the surrounding syntax tree. - There are no hardcoded mysterious integer quantities like rustc's `PREC_CLOSURE = -40`. All precedence comparisons are performed via PartialOrd on a C-like enum. This PR is just a first step in these changes. As you can tell from Syn, I definitely think there is value in having a dedicated type to represent precedence, instead of what `order()` is doing with `i8`. But that is a whole separate adventure because rustc_ast doesn't even agree consistently on `i8` being the type for precedence order; `AssocOp::precedence` instead uses `usize` and there are casts in both directions. It is likely that a type called `ExprPrecedence` will re-appear, but it will look substantially different from the one that existed before this PR.
Inline ExprPrecedence::order into Expr::precedence The representation of expression precedence in rustc_ast has been an obstacle to further improvements in the pretty-printer (continuing from rust-lang#119105 and rust-lang#119427). Previously the operation of *"does this expression have lower precedence than that one"* (relevant for parenthesis insertion in macro-generated syntax trees) consisted of 3 steps: 1. Convert `Expr` to `ExprPrecedence` using `.precedence()` 2. Convert `ExprPrecedence` to `i8` using `.order()` 3. Compare using `<` As far as I can guess, the reason for the separation between `precedence()` and `order()` was so that both `rustc_ast::Expr` and `rustc_hir::Expr` could convert as straightforwardly as possible to the same `ExprPrecedence` enum, and then the more finicky logic performed by `order` could be present just once. The mapping between `Expr` and `ExprPrecedence` was intended to be as straightforward as possible: ```rust match self.kind { ExprKind::Closure(..) => ExprPrecedence::Closure, ... } ``` although there were exceptions of both many-to-one, and one-to-many: ```rust ExprKind::Underscore => ExprPrecedence::Path, ExprKind::Path(..) => ExprPrecedence::Path, ... ExprKind::Match(_, _, MatchKind::Prefix) => ExprPrecedence::Match, ExprKind::Match(_, _, MatchKind::Postfix) => ExprPrecedence::PostfixMatch, ``` Where the nature of `ExprPrecedence` becomes problematic is when a single expression kind might be associated with multiple different precedence levels depending on context (outside the expression) and contents (inside the expression). For example consider what is the precedence of an ExprKind::Closure `$closure`. Well, on the left-hand side of a binary operator it would need parentheses in order to avoid the trailing binary operator being absorbed into the closure body: `($closure) + Rhs`, so the precedence is something lower than that of `+`. But on the right-hand side of a binary operator, a closure is just a straightforward prefix expression like a unary op, which is a relatively high precedence level, higher than binops but lower than method calls: `Lhs + $closure` is fine without parens but `($closure).method()` needs them. But as a third case, if the closure contains an explicit return type, then the precedence is an even higher level than that, never needing parenthesization even in a binop left-hand side or method call: `|| -> bool { false } + Rhs` or `|| -> bool { false }.method()`. You can see that trying to capture all of this resolution about expressions into `ExprPrecedence` violates the intention of `ExprPrecedence` being a straightforward one-to-one correspondence from each AST and HIR `ExprKind` variant. It would be possible to attempt that by doing stuff like `ExprPrecedence::Closure(Side::Leading, ReturnType::No)`, but I don't foresee the original envisioned benefit of the `precedence()`/`order()` distinction being retained in this approach. Instead I want to move toward a model that Syn has been using successfully. In Syn, there is a Precedence enum but it differs from rustc in the following ways: - There are [relatively few variants](https://github.com/dtolnay/syn/blob/2.0.87/src/precedence.rs#L11-L47) compared to rustc's `ExprPrecedence`. For example there is no distinction at the precedence level between returns and closures, or between loops and method calls. - We distinguish between [leading](https://github.com/dtolnay/syn/blob/2.0.87/src/fixup.rs#L293) and [trailing](https://github.com/dtolnay/syn/blob/2.0.87/src/fixup.rs#L309) precedence, taking into account an expression's context such as what token follows it (for various syntactic bail-outs in Rust's grammar, like ambiguities around break-with-value) and how it relates to operators from the surrounding syntax tree. - There are no hardcoded mysterious integer quantities like rustc's `PREC_CLOSURE = -40`. All precedence comparisons are performed via PartialOrd on a C-like enum. This PR is just a first step in these changes. As you can tell from Syn, I definitely think there is value in having a dedicated type to represent precedence, instead of what `order()` is doing with `i8`. But that is a whole separate adventure because rustc_ast doesn't even agree consistently on `i8` being the type for precedence order; `AssocOp::precedence` instead uses `usize` and there are casts in both directions. It is likely that a type called `ExprPrecedence` will re-appear, but it will look substantially different from the one that existed before this PR.
Review note: this is a deceptively small PR because it comes with 145 lines of docs and 196 lines of tests, and only 25 lines of compiler code changed. However, I recommend reviewing it 1 commit at a time because much of the effect of the code changes is non-local i.e. affecting code that is not visible in the final state of the PR. I have paid attention that reviewing the PR one commit at a time is as easy as I can make it. All of the code you need to know about is touched in those commits, even if some of those changes disappear by the end of the stack.
This is a follow-up to #119105. One case that is not relevant to
-Zunpretty=expanded
, but which came up as I'm porting #119105 and #118726 intosyn
's printer andprettyplease
's printer where it is relevant, and is also relevant to rustc'sstringify!
, is statement boundaries in the vicinity of braced macro calls.Rustc's AST pretty-printer produces invalid syntax for statements that begin with a braced macro call:
Before this PR: output is not valid Rust syntax.
fn main() { m! {} + 1; }
After this PR: valid syntax.
fn main() { (m! {}) + 1; }