resolve abnormal urls in compliance with rfc3986 #1482

Merged
merged 1 commit into from
Jul 9, 2021

morokosi
Contributor

Some webpages contain abnormal URLs, i.e., relative path references with more ".." segments than there are hierarchical levels in the base URL. Currently, such extra segments are not resolved and remain as-is. However, according to https://tools.ietf.org/html/rfc3986#section-5.4.2, the extra segments should be removed.
This PR fixes relative URL resolution for such references and adds test cases from RFC 3986.
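
To illustrate the behavior the RFC asks for, here is a minimal sketch of dot-segment removal on an already merged absolute path. It is not the code in this PR: the class and method names (DotSegments, removeDotSegments) are hypothetical, and it uses a simple segment stack rather than the buffer-based procedure in RFC 3986 section 5.2.4, assuming the input path starts with "/".

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper for illustration only; not part of jsoup.
class DotSegments {
    // Removes "." and ".." segments from an absolute path (one starting with "/").
    // A ".." with nothing left to pop is dropped, which is the "abnormal" case
    // described in RFC 3986 section 5.4.2.
    static String removeDotSegments(String path) {
        Deque<String> out = new ArrayDeque<>();
        String[] segments = path.split("/", -1); // segments[0] is "" for an absolute path
        for (int i = 1; i < segments.length; i++) {
            String seg = segments[i];
            boolean last = i == segments.length - 1;
            if (seg.equals(".")) {
                if (last) out.addLast("");           // keep the trailing slash
            } else if (seg.equals("..")) {
                if (!out.isEmpty()) out.removeLast(); // go up one level if possible
                if (last) out.addLast("");           // keep the trailing slash
            } else {
                out.addLast(seg);
            }
        }
        return "/" + String.join("/", out);
    }

    public static void main(String[] args) {
        // Base path /b/c/d;p merged with ../../../g gives /b/c/../../../g
        System.out.println(removeDotSegments("/b/c/../../../g"));
        System.out.println(removeDotSegments("/b/c/../../../../g"));
    }
}
```

Both calls in main have one or two more ".." segments than the base path has levels; under the RFC examples the extra ones are discarded instead of being kept in the result.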

@jhy jhy merged commit 8db724e into jhy:master Jul 9, 2021
@jhy
Owner

jhy commented Jul 9, 2021

Cool, thanks!

@jhy jhy added this to the 1.14.1 milestone Jul 9, 2021
@jhy jhy added the improvement label Jul 9, 2021
jhy added a commit that referenced this pull request Jul 9, 2021
@morokosi morokosi deleted the fix-resolve-abnormal-urls branch July 9, 2021 11:17
@jhy
Owner

jhy commented Jul 5, 2024

BTW @morokosi, for your interest: the original pattern could cause a StackOverflowError (SOE) on a nasty URL with a ton of /../ segments.

I fixed it by changing the original ^/((\\.{1,2}/)+) to ^/(?>(?>\\.\\.?/)+) (both written as Java string literals). The SOE was in the {1,2} quantifier matching and the ( capturing group.
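
For readers comparing the two patterns, here is a small sketch that applies each to a path and replaces the leading run of dot segments with a single "/". The class and method names (LeadingDots, strip) are hypothetical, and the replaceFirst call is an assumption about how such a pattern might be used, not a copy of the jsoup code. The (?>...) groups are atomic (independent) groups, which do not capture and do not keep backtracking state once matched.

```java
import java.util.regex.Pattern;

// Hypothetical comparison harness for illustration only; not part of jsoup.
class LeadingDots {
    // Original pattern from this PR: capturing groups plus a {1,2} quantifier.
    static final Pattern ORIGINAL = Pattern.compile("^/((\\.{1,2}/)+)");
    // Revised pattern: atomic, non-capturing groups.
    static final Pattern ATOMIC = Pattern.compile("^/(?>(?>\\.\\.?/)+)");

    // Replaces a leading run of "./" or "../" segments with a single "/".
    static String strip(Pattern pattern, String path) {
        return pattern.matcher(path).replaceFirst("/");
    }

    public static void main(String[] args) {
        String path = "/../../g";
        System.out.println(strip(ORIGINAL, path));
        System.out.println(strip(ATOMIC, path));
    }
}
```

On short inputs the two patterns should give the same result; the difference described above only shows up when the run of dot segments is very long.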

#2165
