Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decode unicode escapes without decoding %25. #3434

Merged
merged 2 commits into from
Feb 23, 2015
Merged

Decode unicode escapes without decoding %25. #3434

merged 2 commits into from
Feb 23, 2015

Conversation

braddunbar
Copy link
Collaborator

Fixes #3426.

The Problem

By using decodeURI to decode unicode escapes, %25 is also decoded. When included in a url parameter, this creates an invalid escape sequence and causes decodeURIComponent to throw.

Sidebar: Should invalid sequences cause the router to throw in the first place? What else can it do?

The Solution

Double escape %25 as %2525 before using decodeURI.

Is this another step down a crazy rabbit hole of URI escaping?

Maybe? I'm pretty happy with the outcome though. I like that we can continue to remove escapes if we come across more problems of this nature. Further, adding it as History#decodeFragment means users can customize it to patch things easily when needed.

I'm going to do a bit more research and try to figure out why the browser escapes unicode in the first place. It effectively prevents us from knowing what was escaped by the user and what was escaped by the browser. 😕

@braddunbar
Copy link
Collaborator Author

For some added fun, PhantomJS displays a behavior not found in any actual browser (the tests pass in all the browsers I could find to test). 😒

@@ -1440,6 +1440,11 @@
return path === this.root && !this.getSearch();
},

// Decode unicode escapes without decoding `%25`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a bit more prose here for future-usses who might look at this quizzically? It's always nice to have a bit more context in comments than just looking at the code would give us.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, any reason this has to be on the Router prototype instead of living as a standalone helper function?

edit: duh, missed the sentence about overriding. Necessary though? What else would need to override this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a bit more prose here for future-usses who might look at this quizzically?

Happy to!

Also, any reason this has to be on the Router prototype instead of living as a standalone helper function?

I started out with a standalone helper but then I started to think what would need to be done if someone finds another problematic escape sequence. With the helper, you'd need to override #getPath and #navigate, which could get ugly really fast. This way, it's a one liner to replace #decodeFragment.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in faea842. It's hard to strike a balance between succinctness and completeness so let me know if I should tweak it! 😃

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

excellent. thanks!

@macgyver
Copy link

macgyver commented Jan 5, 2015

Works in Chrome 41, Firefox 33, Safari 8, Safari for iOS 7.1, and IE8-10 (the results of extractParams are properly decoded strings when the url is a percent-encoded string, even when including %25)

@jashkenas
Copy link
Owner

Looks fine to me -- but it would be nice to get someone familiar with Browser-unicode-escaping to chime in ... if such a unicorn exists.

@akre54
Copy link
Collaborator

akre54 commented Jan 5, 2015

praps @mathiasbynens can help per usual?

@braddunbar
Copy link
Collaborator Author

FYI, PhantomJS is failing because it decodes the entire url, including the fragment and all the percent-encoded characters. This is not behavior exhibited by any browser and I'm reticent to add a fix for it. If we can't trust it to behave like a browser then what good does it do us?

@braddunbar braddunbar mentioned this pull request Jan 7, 2015
@akre54
Copy link
Collaborator

akre54 commented Feb 17, 2015

I'd like to merge this but I'm weary of breaking every Travis build from now on. Do we have a plan around this?

@braddunbar
Copy link
Collaborator Author

My thoughts exactly! I'd love some feedback regarding that. PhantomJS is a nice smoke test but I'm not sure what good it does if it's giving false negatives via behavior not exhibited in any real browser.

@akre54
Copy link
Collaborator

akre54 commented Feb 17, 2015

We've had numerous problems with PhantomJS in the past, and it seems due to slow development most of these bugs are here to stay. @megawac created a pull removing PhantomJS from Underscore in jashkenas/underscore#2054, I wonder if he'll be kind enough to do us the favor here too.

@megawac
Copy link
Collaborator

megawac commented Feb 17, 2015

Was planning on filing an issue with it later today actually about some of
the things that would be involved

On Tue, Feb 17, 2015 at 5:48 PM, Adam Krebs notifications@github.com
wrote:

We've had numerous problems with PhantomJS in the past, and it seems due
to slow development most of these bugs are here to stay. @megawac
https://github.com/megawac created a pull removing PhantomJS from
Underscore in jashkenas/underscore#2054
jashkenas/underscore#2054, I wonder if he'll be
kind enough to do us the favor here too.


Reply to this email directly or view it on GitHub
#3434 (comment).

@akre54
Copy link
Collaborator

akre54 commented Feb 17, 2015

Excellent. Looking forward.

@jridgewell
Copy link
Collaborator

I was working on an alternative solution in my url-encoding branch, but never got polished it into a PR. It's working on Chrome, Safari, Firefox, and IE6 (only IE vm I have). I can have it ready tonight, if you'd like.

@jridgewell
Copy link
Collaborator

Also: this is failing in IE6 IE6 Failure

I've found two issues:

  1. The percent encoded "?", and turned that into a percent decoded location.search:

    location = {
      href: "http://example.com/myyjä/foo%20%25%3F%2f%40%25%20bar",
      hash: "",
      host: "example.com:80",
      search: "?/@% bar",
      fragment: "",
      pathname: "/myyjä/foo%20%",
      protocol: "http:"
    }
  2. The pathname is already decoded, so trying to fragment.replace(/%25/g, '%2525') never finds anything.

@jridgewell
Copy link
Collaborator

Follow up: it's because the anchor tag parser will decode the pathname. Putting it into the hash, then starting up with {pushState: true}, prevents the decode, but requires a bit of boilerplate.

That's not how IE6's actual location behaves. If you were to start a server that responds to any path and navigate to myyjä/foo%20%25%3F%2f%40%25%20bar, location.pathname would not be decoded.

@jashkenas
Copy link
Owner

@braddunbar Any thoughts on @jridgewell's last couple of comments?

@jridgewell
Copy link
Collaborator

I fixed it by setting location.pathname directly. IE6's actual location object doesn't decode the pathname.

@jashkenas
Copy link
Owner

@jridgewell So this is good to merge in your opinion?

@jridgewell
Copy link
Collaborator

I think so.

@jashkenas
Copy link
Owner

I wouldn't bet money on this change not invoking some hellacious old-IE or old-Firefox decoding problem down the road. But hey — here's to optimism.

jashkenas added a commit that referenced this pull request Feb 23, 2015
Decode unicode escapes without decoding `%25`.
@jashkenas jashkenas merged commit e109f6d into jashkenas:master Feb 23, 2015
@braddunbar braddunbar deleted the decode-fragment branch February 23, 2015 16:49
@akre54
Copy link
Collaborator

akre54 commented Feb 24, 2015

It looks like this test is failing in phantomjs, Android 4.0.4 and Safari 5 on Win7.

https://travis-ci.org/jashkenas/backbone/builds/51918113

@jashkenas
Copy link
Owner

I wouldn't bet money on this change not invoking some hellacious old-IE or old-Firefox decoding problem down the road.

Do I win a prize? ;)

@akre54
Copy link
Collaborator

akre54 commented Feb 24, 2015

I dunno. Is 12 hours enough to count as "down the road"?

@braddunbar
Copy link
Collaborator Author

Haha! PhantomJS also failed so my first guess is that it's the same as IE6 and we can get around it by setting pathname directly. I'll take a look later on!

@jashkenas
Copy link
Owner

For what it's worth, this patch and new test are currently failing in IE9.

@braddunbar
Copy link
Collaborator Author

Hmm…looks ok on BrowserStack.

screen shot 2015-05-13 at 5 45 52 pm

@jashkenas
Copy link
Owner

It does, but it breaks on my local VM. Go figure.

@jridgewell
Copy link
Collaborator

All tests are passing on mine. Master on IE8 XP

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

URIError parsing url Params encoded with encodeURIComponent
6 participants