-
Notifications
You must be signed in to change notification settings - Fork 39.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Percent-encode illegal characters in user.Info.Extra keys #65799
Conversation
/ok-to-test |
FYI @mikedanese @liggitt I'm encoded %s
unencoded %s
|
@@ -160,6 +161,14 @@ func allHeaderValues(h http.Header, headerNames []string) []string { | |||
return ret | |||
} | |||
|
|||
func unescapeExtraKey(encodedKey string) string { | |||
key, err := url.PathUnescape(encodedKey) // Decode %-encoded bytes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why does PathUnescape work here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PathEscape is under-aggressive for our needs, but PathUnescape blindly converts %-encoded bytes back to regular bytes: https://golang.org/src/net/url/url.go?s=5075:5118#L173
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you show some examples where url.PathEscape
/url.QueryEscape
are lacking?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PathEscape is under-aggressive for our needs
that's unfortunate... what characters does it not escape for us?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like it's "@", "=", and ":" are unreserved in path components but forbidden in header keys. I'll add explicit tests for that.
QueryEscape is lacking because we want to maximize compatibility with (i.e. minimize escaping of) existing legal header key strings. We don't want to be over-aggressivley escaping legal strings (which unnecessarily garbles legal keys keys being sent from a new client to an old API server).
I set up a playground to test the interactions between new/old clients/servers with various escaping/unescaping algorithms:
https://play.golang.org/p/eors6oEbRpT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, can we offload bulk of escaping to Path/QueryEscape and post-process the output to catch any extra characters we care about?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does QueryEscape's aggressive escaping break anything?
The '+' -> ' '
seems problematic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The '+' -> ' ' seems problematic
That would be for Unescape, right? QueryEscape is the other way around, which seems fine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does QueryEscape's aggressive escaping break anything?
I'm very skeptical about having to maintain complex encoding logic for the sake of making headers less garbled/readable.
Yes, anyone currently using non-alphanumeric strings and mismatched client/server versions.
Also, can we offload bulk of escaping to Path/QueryEscape and post-process the output to catch any extra characters we care about?
Only if it's acceptable to permanently garble keys that are percent-encoded by the caller.
Double-encoding seems more complicated/fragile than simply whitelisting all header chars.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The '+' -> ' ' seems problematic
That would be for Unescape, right? QueryEscape is the other way around, which seems fine?
If it's acceptable to additionally break old clients sending '+'-containing keys to new API servers, sure.
func unescapeExtraKey(encodedKey string) string { | ||
key, err := url.PathUnescape(encodedKey) // Decode %-encoded bytes. | ||
if err != nil { | ||
return encodedKey // Always record extra strings, even if malformed/unencoded. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The why of this comment is not obvious to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the discussion in the original issue, it seemed like it was decided to always record these extra values (even when 'malformed'), so I made it a no-op on error and added test cases for strings like "foo%xxbar" which can't be PathUnescaped.
return fmt.Sprintf("%%%x", b) | ||
} | ||
|
||
func percentEncodeRune(r rune) string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A rune is always a byte so you can make this just be
return fmt.Sprintf("%%%x", byte(r))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rune
s are int32
s, but if we are OK with restricting extra key strings to the ascii subset of utf-8, that'd be OK. I deliberately encoded entire runes here, rather than bytes, to avoid extra-garbled wire-format strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, for some reason I thought go strings were UTF-8. Let's encode the whole int32.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Go strings are utf-8, but utf-8 runes can be 1-4 bytes long ;)
func headerEscape(key string) string { | ||
encoded := "" | ||
for _, r := range key { | ||
encoded += percentEncodeIfIllegal(r) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this require a lot of allocations? Should we use a byte buffer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
potentially...
I deliberately chose string concatenation because bytes.Buffer.WriteFoo() returns an error, and I thought that it would be weird to do error handling when such a case does not arise from string concatenation (which I assume panic
s when malloc or something fails)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Scratch that, the docs are explicit in that Write(Byte|Rune|String) always returns a nil error. I'll use the buffer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use strings.Builder
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
groupHeaders: []string{"X-Remote-Group"}, | ||
extraPrefixHeaders: []string{"X-Remote-Extra-"}, | ||
requestHeaders: http.Header{ | ||
"X-Remote-User": {"Bob"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a test specifically with the +
character, to make sure that round trips correctly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added one in the roundtripper test, but I can do so here, too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
b := key[i] | ||
if shouldEscape(b) { | ||
buf.WriteByte('%') | ||
buf.WriteByte("0123456789abcdef"[b>>4]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs an explanatory comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
|
||
func headerKeyEscape(key string) string { | ||
buf := strings.Builder{} | ||
buf.Grow(headerKeyEscapedLen(key)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just write to buf without pre-scanning key to allocate memory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize that this is a micro-optimization, but it was done in response to feedback earlier about repetitive string concatenations being malloc-heavy.
@@ -422,3 +422,130 @@ func (rt *debuggingRoundTripper) RoundTrip(req *http.Request) (*http.Response, e | |||
func (rt *debuggingRoundTripper) WrappedRoundTripper() http.RoundTripper { | |||
return rt.delegatedRoundTripper | |||
} | |||
|
|||
func isLegalHeaderKey(key string) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is no longer used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
/retest |
func shouldEscape(b byte) bool { | ||
// url.PathUnescape() returns an error if any '%' is not followed by two | ||
// hexadecimal digits, so we'll intentionally encode it. | ||
return !legalHeaderByte(b) || b == '%' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about removing %
from legalHeaderKeyBytes
and remove this func?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what I originally had.
I feel that explicitly declaring '%' as header-key-legal, referencing the relevant RFC and appropriated code, and then explicitly encoding it anyway is more readable and less surprising.
for i := 0; i < len(key); i++ { | ||
b := key[i] | ||
if shouldEscape(b) { | ||
// %-encode bytes that should be escaped. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I meant explain how the magic string indexing and shifting below works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
if shouldEscape(b) { | ||
// %-encode bytes that should be escaped: | ||
// https://tools.ietf.org/html/rfc3986#section-2.1 | ||
buf.WriteByte('%') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer fmt.Fprintf(buf, "%%%x", b)
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order to ensure a proper encoding, it'd have to be fmt.Fprintf(buf, "%%%x%x", b>>4, b&15):
https://play.golang.org/p/gVydKNv7qq-
FWIW the current scheme is how things are done in the golang stdlib: https://golang.org/src/net/url/url.go?s=7512:7544#L304
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
LGTM but someone else should comment on how breaking this is. |
@liggitt @mikedanese ping |
Can you squash? This looks good to me. /approve |
Signed-off-by: Jake Sanders <jsand@google.com>
squashed |
/test pull-kubernetes-e2e-kops-aws |
@liggitt ping |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dekkagaijin, liggitt, mikedanese The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/test all Tests are more than 96 hours old. Re-running tests. |
thanks for all the work on this. I updated the release note, please take a look. would you mind opening a doc PR against the impersonation and requestheader documentation describing the safe set of characters to use, and against the webhook authenticator doc referencing the safe set of characters to use up through |
@liggitt thanks, will do |
Automatic merge from submit-queue (batch tested with PRs 66225, 66648, 65799, 66630, 66619). If you want to cherry-pick this change to another branch, please follow the instructions here. |
just noticed the doc PR referenced 1.11.2, but this hasn't been picked back to 1.11.x, has it? |
@liggitt whoops... had a few changes in flight and only cherry picked one of them. I'll revise and send you the PR |
…65799-upstream-release-1.11 Automatic merge from submit-queue. Automated cherry pick of #65799: Escape illegal characters in remote extra keys Cherry pick of #65799 on release-1.11. #65799: Escape illegal characters in remote extra keys ```release-note action required: the API server and client-go libraries have been fixed to support additional non-alpha-numeric characters in UserInfo "extra" data keys. Both should be updated in order to properly support extra data containing "/" characters or other characters disallowed in HTTP headers. ```
…65799-upstream-release-1.9 Automated cherry pick of #65799: Escape illegal characters in remote extra keys
This percent-encodes characters in
X-Remote-Extra-
andImpersonate-Extra-
keys which aren't valid for header names per RFC 7230 (plus "%" to avoid breaking keys which contain them). The API server then blindly unescapes these keys.Reviewer note:
Old clients sending keys which were
%
-escaped by the user will have their values unescaped by new API servers. New clients sending keys containing illegal characters (or "%") to old API servers will not have their values unescaped. This version skew incompatibility is a compromise discussed in #63682.Fixes #63682
PTAL @mikedanese
Release note: