httpserver/all: Clean up and standardize request URL handling #1633

mholt · 2017-04-29T22:55:08Z

1. What issue(s) is this pull request related to?

I'm hoping this will address issues with #1624, #1584, #1582, #256 (by not breaking it), and others.

2. What does this change do, exactly?

The HTTP server now always creates a context value on the request which
is a copy of the request's URL struct. Middlewares should not modify the
stored value, but it is safe to get the value out of the request and
change that local copy. Thus, the value in the context always
stores the original request URL information as it was received. Any
rewrites that happen will be to the request's URL field directly.

The HTTP server no longer cleans/sanitizes the request URL. Doing so made too
many strong assumptions and ended up making a lot of middleware more
complicated, including upstream proxying (and fastcgi). To alleviate
this complexity, we no longer change the request URL. Middlewares are
responsible for accessing the disk safely by using http.Dir or, if not
actually opening files, they can use httpserver.SafePath().

3. What documentation changes (if any) go along with this PR?

Nothing of note, except instructions for middleware authors:

Middlewares should NOT do this:

os.Open(filepath.Join(siteRoot, filepath.FromSlash(r.URL.Path)))

But this is safe:

jailedDisk := http.Dir(siteRoot)
jailedDisk.Open(r.URL.Path)

If not opening files, you can use SafePath():

httpserver.SafePath(siteRoot, r.URL.Path)

Prefer http.Dir whenever you actually open the file; SafePath() is for when you only need the path.

4. Checklist

I have written tests and verified that they fail without my change
I have squashed any insignificant commits
This change has comments for package types, values, and functions or non-obvious lines of code
I am willing to help maintain this change if there are issues with it later

I'm considering having SafePath() take the entire *http.Request rather than just a string (so right now you have to pass req.URL.Path) - but it's not a big deal.

This is a big change, but the highlights are:

The only context key relevant to path rewriting is httpserver.OriginalURLCtxKey ("original_url") which stores a copy of the original url.URL value before any middlewares even see the request.
The HTTP server no longer sanitizes paths.
http.Dir or httpserver.SafePath() should be used by middlewares that want to access the disk.
Improved the handling in the static file server as well.
Overall, this should lead to much simpler code and fewer problems. (I hope. Please test.)

@abiosoft

@abiosoft: I still can't figure out exactly what this is for. 😅

tw4452852 · 2017-04-30T02:43:58Z

caddyhttp/browse/browse.go

-	if !strings.HasSuffix(r.URL.Path, "/") {
-		staticfiles.RedirectToDir(w, r)
-		return 0, nil
+	u := r.Context().Value(httpserver.OriginalURLCtxKey).(url.URL)


This may not be what we want. If a rewrite occurred earlier, r.URL.Path contains the result and we should use it. Otherwise the rewrite middleware will malfunction when combined with browse.

Yeah, good catch. This is tricky. The reason I chose to get the path of the original URL is because we end up causing a redirect with it; the redirect is external, but the rewrite is internal.

I wonder... should we check for a trailing slash using r.URL.Path and then issue the redirect by modifying the original path stored in the context? (What do you think is most correct?)

I can go back to just using the rewritten path (r.URL.Path) - I haven't heard any issues about it.

I think the original logic is okay.

@tw4452852 Alrighty -- updated.

elcore · 2017-04-30T09:31:26Z

caddyhttp/extensions/ext.go

 		for _, ext := range e.Extensions {
-			if resourceExists(e.Root, urlpath+ext) {
+			_, err := os.Stat(httpserver.SafePath(e.Root, urlpath) + ext)
+			if err == nil {


I am nitpicking, but I guess this should be fine:

if _, err := os.Stat(httpserver.SafePath(e.Root, urlpath) + ext); err == nil {
	r.URL.Path = urlpath + ext
	break
}

I prefer the two-line version in this case because there's already a lot of logic in that first line. Thanks though!

tobya

Just a couple of minor notes

tobya · 2017-04-30T09:38:19Z

caddyhttp/browse/browse.go

+	if u.Path == "" {
+		u.Path = "/"
+	}
+	if u.Path[len(u.Path)-1] != '/' {


Is there a reason not to use strings.HasSuffix here?

🤷‍♂️ Beats me, the standard library does it this way. Maybe a single byte comparison is faster than needing to do all the logic of strings.HasSuffix.

tobya · 2017-04-30T09:40:25Z

caddyhttp/staticfiles/fileserver.go

-		if !strings.HasSuffix(r.URL.Path, "/") {
-			RedirectToDir(w, r)
+		// ensure there is a trailing slash
+		if u.Path[len(u.Path)-1] != '/' {


Similar here: any reason not to use strings.HasSuffix?

Same as above: no particular reason. Std lib does it this way. /shrug
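
For reference, the two trailing-slash checks discussed in these review threads can be compared side by side. This is a small sketch with hypothetical helper names (endsWithSlashByte, endsWithSlashSuffix). It assumes the byte-indexing variant is preceded by the empty-path guard shown in the browse.go diff, because indexing an empty string would panic.

```go
package main

import (
	"fmt"
	"strings"
)

// endsWithSlashByte mirrors the diff: default an empty path to "/" and then
// compare the last byte directly. Without the guard, p[len(p)-1] would panic
// on an empty string.
func endsWithSlashByte(p string) bool {
	if p == "" {
		p = "/"
	}
	return p[len(p)-1] == '/'
}

// endsWithSlashSuffix is the strings.HasSuffix variant raised in review. It
// needs no guard, but reports false for an empty path.
func endsWithSlashSuffix(p string) bool {
	return strings.HasSuffix(p, "/")
}

func main() {
	for _, p := range []string{"/docs", "/docs/", "/", ""} {
		fmt.Printf("%q byte=%v suffix=%v\n", p, endsWithSlashByte(p), endsWithSlashSuffix(p))
	}
}
```

The two checks agree for every non-empty path; they differ only for the empty path, where the guarded byte check treats it as "/".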

tobya · 2017-04-30T09:43:03Z

I think this provides a much more straightforward way of dealing with OriginalURLCtxKey, which is great.

tw4452852 · 2017-05-01T01:23:33Z

caddyhttp/httpserver/server.go

+//
+// If opening a file, use http.Dir instead.
+func SafePath(siteRoot, reqPath string) string {
+	if filepath.Separator != '/' {


Would filepath.ToSlash help here?

Yep, indeed. Updated!

mholt · 2017-05-01T16:18:47Z

@tw4452852 Feel free to approve (or request changes to) the PR when you have a chance. I can get this out later today if so. (Will probably do so anyway.) Thanks so much for your time!

tw4452852 · 2017-05-02T02:40:49Z

LGTM.

mholt added this to the 0.10.1 milestone Apr 29, 2017

mholt requested review from abiosoft, tobya and tw4452852 April 29, 2017 22:55

staticfiles: Fix test on Windows

cb65972

mholt mentioned this pull request Apr 29, 2017

Don't sanitize path when it contains escaped parts #1616

Closed

tw4452852 reviewed Apr 30, 2017

elcore reviewed Apr 30, 2017

tobya reviewed Apr 30, 2017

mholt added 2 commits April 30, 2017 08:17

Use (potentially) changed URL for browse redirects, as before

fbdef9f

Merge branch 'master' into preserve-uris

40c7f6d

mholt mentioned this pull request May 1, 2017

Encoded slash in URL #1582

Closed

tw4452852 reviewed May 1, 2017

mholt added 3 commits May 1, 2017 07:39

Use filepath.ToSlash, clean up a couple proxy test cases

805cb39

Oops, fix variable name

4256f37

Merge branch 'master' into preserve-uris

606dd9e

tw4452852 approved these changes May 2, 2017

mholt merged commit d5371af into master May 2, 2017

mholt deleted the preserve-uris branch May 2, 2017 05:11

mholt mentioned this pull request Jun 7, 2017

Redirect loop when rewriting from directory URI to file URI #1706

Closed

mholt mentioned this pull request Sep 8, 2017

Multiple forward slashes in paths #1859

Closed
