Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

httpserver/all: Clean up and standardize request URL handling #1633

Merged
merged 7 commits into from
May 2, 2017

Conversation

mholt
Copy link
Member

@mholt mholt commented Apr 29, 2017

1. What issue(s) is this pull request related to?

I'm hoping this will address issues with #1624, #1584, #1582, #256 (by not breaking it), and others.

2. What does this change do, exactly?

The HTTP server now always creates a context value on the request which
is a copy of the request's URL struct. It should not be modified by
middlewares, but it is safe to get the value out of the request and make
changes to it locally-scoped. Thus, the value in the context always
stores the original request URL information as it was received. Any
rewrites that happen will be to the request's URL field directly.

The HTTP server no longer cleans /sanitizes the request URL. It made too
many strong assumptions and ended up making a lot of middleware more
complicated, including upstream proxying (and fastcgi). To alleviate
this complexity, we no longer change the request URL. Middlewares are
responsible to access the disk safely by using http.Dir or, if not
actually opening files, they can use httpserver.SafePath().

3. What documentation changes (if any) go along with this PR?

Nothing of note, except instructions for middleware authors:

Middlewares should NOT do this:

os.Open(filepath.Join(siteRoot, filepath.FromSlash(r.URL.Path)))

But this is safe:

jailedDisk := http.Dir(siteRoot)
jailedDisk.Open(r.URL.Path)

If not opening files, you can use SafePath():

httpserver.SafePath(siteRoot, r.URL.Path)

But SafePath() is good for when you don't need to open the file.

4. Checklist

  • I have written tests and verified that they fail without my change
  • I have squashed any insignificant commits
  • This change has comments for package types, values, and functions or non-obvious lines of code
  • I am willing to help maintain this change if there are issues with it later

I'm considering having SafePath() take the entire *http.Request rather than just a string (so right now you have to pass req.URL.Path) - but it's not a big deal.

This is a big change, but the highlights are:

  • The only context key relevant to path rewriting is httpserver.OriginalURLCtxKey ("original_url") which stores a copy of the original url.URL value before any middlewares even see the request.
  • The HTTP server no longer sanitizes paths.
  • http.Dir or httpserver.SafePath() should be used by middlewares that want to access the disk.
  • Improved the handling in the static file server as well.
  • Overall, this should lead to much simpler code and fewer problems. (I hope. Please test.)

The HTTP server now always creates a context value on the request which
is a copy of the request's URL struct. It should not be modified by
middlewares, but it is safe to get the value out of the request and make
changes to it locally-scoped. Thus, the value in the context always
stores the original request URL information as it was received. Any
rewrites that happen will be to the request's URL field directly.

The HTTP server no longer cleans /sanitizes the request URL. It made too
many strong assumptions and ended up making a lot of middleware more
complicated, including upstream proxying (and fastcgi). To alleviate
this complexity, we no longer change the request URL. Middlewares are
responsible to access the disk safely by using http.Dir or, if not
actually opening files, they can use httpserver.SafePath().

I'm hoping this will address issues with #1624, #1584, #1582, and others.
@mholt mholt added this to the 0.10.1 milestone Apr 29, 2017
@mholt mholt requested review from abiosoft, tobya and tw4452852 April 29, 2017 22:55
@abiosoft: I still can't figure out exactly what this is for. 😅
if !strings.HasSuffix(r.URL.Path, "/") {
staticfiles.RedirectToDir(w, r)
return 0, nil
u := r.Context().Value(httpserver.OriginalURLCtxKey).(url.URL)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This maybe not what we want. If rewrite occurred before, r.URL.Path contains the result and we should use it. Otherwise rewrite middleware will disfunction when combining with browser.

Copy link
Member Author

@mholt mholt Apr 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good catch. This is tricky. The reason I chose to get the path of the original URL is because we end up causing a redirect with it; the redirect is external, but the rewrite is internal.

I wonder... should we check for a trailing slash using r.URL.Path and then issue the redirect by modifying the original path stored in the context? (What do you think is most correct?)

I can go back to just using the rewritten path (r.URL.Path) - I haven't heard any issues about it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the original logic is okay.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tw4452852 Alrighty -- updated.

for _, ext := range e.Extensions {
if resourceExists(e.Root, urlpath+ext) {
_, err := os.Stat(httpserver.SafePath(e.Root, urlpath) + ext)
if err == nil {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am nitpicking, but I guess this should be fine

if _, err := os.Stat(httpserver.SafePath(e.Root, urlpath) + ext); err == nil {
	r.URL.Path = urlpath + ext
	break
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer the two-line version in this case because there's already a lot of logic in that first line. Thanks though!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure 😄

Copy link
Collaborator

@tobya tobya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of minor notes

if u.Path == "" {
u.Path = "/"
}
if u.Path[len(u.Path)-1] != '/' {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason not to use strings.HasSuffix here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤷‍♂️ Beats me, the standard library does it this way. Maybe a single byte comparison is faster than needing do all the logic of strings.HasSuffix.

if !strings.HasSuffix(r.URL.Path, "/") {
RedirectToDir(w, r)
// ensure there is a trailing slash
if u.Path[len(u.Path)-1] != '/' {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar here, any reason not to use strings.HasSuffix

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above: no particular reason. Std lib does it this way. /shrug

@tobya
Copy link
Collaborator

tobya commented Apr 30, 2017

I think this provides a much more straightforward way of dealing with OriginalURLCtxKey which is great.

@mholt mholt mentioned this pull request May 1, 2017
//
// If opening a file, use http.Dir instead.
func SafePath(siteRoot, reqPath string) string {
if filepath.Separator != '/' {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filepath.ToSlash helps?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, indeed. Updated!

@mholt
Copy link
Member Author

mholt commented May 1, 2017

@tw4452852 Feel free to approve (or request changes to) the PR when you have a chance. I can get this out later today if so. (Will probably do so anyway.) Thanks so much for your time!

@tw4452852
Copy link
Collaborator

LGTM.

@mholt mholt merged commit d5371af into master May 2, 2017
@mholt mholt deleted the preserve-uris branch May 2, 2017 05:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants