Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for ES2024 Regex v flag #1218

Merged
merged 3 commits into from
May 23, 2023
Merged

Conversation

ota-meshi
Copy link
Contributor

This PR adds support for the regex v flag.

close #1216

@marijnh marijnh merged commit 9909161 into acornjs:master May 23, 2023
@ota-meshi ota-meshi deleted the regexp-v branch May 23, 2023 08:30
pp.regexp_classSetExpression = function(state) {
let nextMayContainStrings = false
if (this.regexp_eatClassSetRange(state))
;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use brackets, not a semicolon, for empty branches.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed that.
8dd203e

this.switchN = false
this.pos = 0
this.lastIntValue = 0
this.lastStringValue = ""
this.lastAssertionIsQuantifiable = false
this.lastMayContainStrings = false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would using return values on the set-expression-related methods, rather than a flag, be a workable way to track this? Looking at the code around this, tracking this value with a property seems verbose and difficult to see through.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will try to make that change. It will take some time, so please wait.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed that. d09dc4e

return true
}
if (state.switchV) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the idea behind this block? Does regexp_nonEmptyClassRanges as called through regexp_classContents not consume all the set content? What does the comment that says "Unreachable since ..." mean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This process is to issue the same "Invalid character in character class" message as V8.
Without this process, broken expressions such as [A&&] will display the message "Unterminated character class".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acorn as a whole does not try to mimic the precise error messages from browsers. I think it would be preferable to have less code and complexity (also here) rather than V8-aligned error messages.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your comment. I will remove those processes that have become too complicated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const start = state.pos
if (this.regexp_eatClassSetCharacter(state)) {
const left = state.lastIntValue
if (state.eat(0x2D /* - */) && this.regexp_eatClassSetCharacter(state)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here too the side effect of eat seems to 'leak' when the condition is false.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't quite understand. What does leak mean? I thought we could get that status back with state.pos = start at L1060. Could you tell me what else I need to do?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad—I forgot to delete this comment after spotting the pos reset. Still not a fan of this pattern though, see my other comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I understand that you don't like state.pos = start. But for this part, I don't think it makes sense to do it any other way.
The way I think without eat is to do:

pp.regexp_eatClassSetRange = function(state) {
  const start = state.pos
  if (this.regexp_eatClassSetCharacter(state)) {
    const left = state.lastIntValue
    if (state.current() === 0x2D /* - */) {
      state.advance()
      if (this.regexp_eatClassSetCharacter(state)) {
        const right = state.lastIntValue
        if (left !== -1 && right !== -1 && left > right) {
          state.raise("Range out of order in character class")
        }
        return true
      }
    }
    state.pos = start
  }
  return false
}

state.current()+state.advance() is consequently the same as eat(), and state.pos = start cannot be removed yet. If you know a better way, could you please advise?

const start = pos
// https://tc39.es/ecma262/#prod-ClassIntersection
while (
state.eat(0x26 /* & */) && state.eat(0x26 /* & */) &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we maybe add some lookahead or multi-character eat mechanism to RegExpValidationState? This repeated side-effecting with eat only to restore the position after seems both hard to read and error-prone.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a function that eats multiple characters and changed it to use it.

0659703

@marijnh
Copy link
Member

marijnh commented May 23, 2023

Ignore the 'merged' message above. I'm not sure how, but I somehow hit merge without intending to. I've backed the patches out again. See my comments for feedback.

Apparently you can't re-open PRs, but consider this to still be open.

@ota-meshi ota-meshi restored the regexp-v branch May 23, 2023 08:54
@ota-meshi
Copy link
Contributor Author

ota-meshi commented May 23, 2023

Thank you for your review! Should I open another PR?
I opened new PR #1219

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature Request: ES2024 Regex v flag
2 participants