-
Notifications
You must be signed in to change notification settings - Fork 743
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid backtracing in ceylon interpolation regex #1773
Conversation
FYI @dblessing |
8705cbb
to
14d7962
Compare
More Details(The following examples can be reproduced on a gitlab.com wiki page) A high CPU load can be caused using the following ceylon regex: ```ceylon
"```````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````` This ReDos example in the description of this issue renders in less than a second with the fix in this PR - it was 40+ seconds before. This fix was based on the suggestions in the "Quantifiers in Sequence" section of this page: https://www.regular-expressions.info/redos.html I also tested the following valid ceylon markdown, and it still seems to work the same with the fix: ```ceylon
print("Hello, ``person.firstName`` ``person.lastName``, the time is ``Time()``.");
print("1 + 1 = ``1 + 1``");
``` |
Thank you for your contribution @thewoolleyman ❤️ We would love to see this backtracking issue addressed in rouge ASAP. I have added some suggestions for your consideration. Please let me know what you think 🙏🏼 . |
lib/rouge/lexers/ceylon.rb
Outdated
@@ -51,7 +51,7 @@ class Ceylon < RegexLexer | |||
|
|||
rule %r("(\\\\|\\"|[^"])*"), Literal::String | |||
rule %r('\\.'|'[^\\]'|'\\\{#[0-9a-fA-F]{4}\}'), Literal::String::Char | |||
rule %r(".*``.*``.*"', String::Interpol | |||
rule %r("[^`]*``[^`]*``[^`]*"', String::Interpol |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There seems to be a flaw with the initial rule which is caused by this line and the next being mangled into one regex (notice the extraneous single quote '
and missing closing bracket). For better or worse, that bug might not expose the backtracking issue as it is reported 🤔
Given that we addressed the wrapping issue above:
Using the reported string, I have verified on https://regex101.com/debugger
".*``.*``.*"
-> Match 1 failed in 106 086 step(s)"[^`]*``[^`]*``[^`]*"
-> Match 1 failed in 95 step(s)
Could we add the string interpolation example in the spec/visual/samples/ceylon
for future reference?
diff --git spec/visual/samples/ceylon spec/visual/samples/ceylon
index b6d7a122..20cf4608 100644
--- spec/visual/samples/ceylon
+++ spec/visual/samples/ceylon
@@ -32,3 +32,6 @@ module com.acme.test "1.0.0" {
}
package com.example;
+
+print("Hello, ``person.firstName`` ``person.lastName``, the time is ``Time()``.");
+print("1 + 1 = ``1 + 1``");
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, I've added this change in the latest push.
lib/rouge/lexers/ceylon.rb
Outdated
rule %r("[^`]*``[^`]*``[^`]*"', String::Interpol | ||
rule %r(\.)([a-z_]\w*)) do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: What do you think about addressing both the wrapping and backtracking issue?
rule %r("[^`]*``[^`]*``[^`]*"', String::Interpol | |
rule %r(\.)([a-z_]\w*)) do | |
rule %r("[^`]*``[^`]*``[^`]*"), Literal::String::Interpol | |
rule %r((\.)([a-z_]\w*)) do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually for interpolation it's best to try and push a separate state, so you can highlight the contents of the interpolation as well. It can check for the closing interpolation tokens and then mixin the main expression state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jneen. Great point. In fact, I have a follow-up commit locally to move this string interpolation to state
(similar to how the ruby
lexer deal with this). Once this merges, I can create another PR to address interpolation more appropriately. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tancnle I've fixed the wrapping as suggested in the latest push.
14d7962
to
8a54d0b
Compare
8a54d0b
to
2a12a5d
Compare
@tancnle thanks for the prompt response! I've made the two suggested changes to my commit, and pushed a rebased branch. Let me know if there's anything else needed on my end. |
Thanks @thewoolleyman. The PR looks good to me 🚀 |
Fix backtracing in ceylon interpolation regex.
Addresses the third remaining Regex Denial Of Service vulnerability reported in https://hackerone.com/reports/1283484. The other two mentioned in that report have already been addressed as part of #1690.