-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce SessionState size better with MessagePack + LZ4 #632
Reduce SessionState size better with MessagePack + LZ4 #632
Conversation
@NickMeves It would be good to see some comparison of compressed vs non-compressed sessions to justify the forced LZ4 compression, do you have any notes to show how much benefit we gain from this? |
@JoelSpeed Here is the data I got in testing a few months back with my earlier POC. I think this data is still valid with this iteration of the design:
I also tested configurable compression levels since our compress:decompress ratio is 1:many -- so I was hoping using compression level 9 (slowest but most space savings in theory) would be useful to our execution profile. But it saved like 10 bytes vs lowest level, so that was a dud 😄 |
If I recall, for the hot second I was base64 decoding the segments in the 3 tokens types if able to (header & payload usually) to yield better compression with like fields inside the base64 data got it down to like 2400 characters. So very minor improvement from the LZ4 of base64 tokens that got us to ~2700. LZ4 does pretty well even on the base64 tokens since they all have such similar headers and payload leading characters that they turn into the same runs of base64 encoded data. JWT headers especially. |
I currently have the compression only happening on cookie sessions -- I made it an encode/decode parameter and didn't do it in the Redis store (since large sessions don't really matter there) |
Ok, sounds like an approximate 25% saving, sounds good to me! Do you want me to review this PR as is or wait until it's ready? |
You can start taking a peek -- I think everything that's here is decently ready. I just need to look at how much damage I did to the Another things I'm curious your thoughts @JoelSpeed - Do we support legacy JSON sessions trickling in from V5 & V6.0 versions? Or do we rip the bandaid and only deal with messagepack moving forward and force a user reauthentication? I think most the complexity in this PR is from supporting a JSON fallback and support Redis's former custom encryption logic. |
I will have to take a look and see how complex it is, ideally we would have migration logic so that we don't have to consider this a breaking change, but if it's particularly complex (or happens to land with a bunch of other breaking changes), then we may decide to drop it 🤔 |
9fee41c
to
b0a1d64
Compare
@JoelSpeed I did a big cleanup of Tests are passing now. I just want to leave it in draft for a bit longer until I have a chance to do some manual testing against a dummy httpbin to do some analysis on the cookies. |
And I see you nearly did the same thing here after the fact 😄 Great minds think alike! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Added a few suggestions but nothing major I don't think
Please make sure to manually test the upgrades for this to see if they work, preferably for cookie and redis if you can
A lot of the test changes look very familiar, we are definitely going to have some clashes with some of my PRs 😅 Do you have time to review #577, I'd like to get that in and I think that might change some bits here
pkg/apis/sessions/session_state.go
Outdated
// Compress the packed encoding | ||
// The Compress:Decompress ratio is 1:Many. LZ4 gives fastest decompress speeds | ||
buf := new(bytes.Buffer) | ||
zw := lz4.NewWriter(nil) | ||
zw.Header = lz4.Header{ | ||
BlockMaxSize: 65536, | ||
CompressionLevel: 0, | ||
} | ||
zw.Reset(buf) | ||
|
||
reader := bytes.NewReader(packed) | ||
_, err = io.Copy(zw, reader) | ||
if err != nil { | ||
return nil, err | ||
} | ||
err = zw.Close() | ||
if err != nil { | ||
return nil, err | ||
} | ||
|
||
compressed, err := ioutil.ReadAll(buf) | ||
if err != nil { | ||
return nil, err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be nice to extract this to a function called something like compress
? (or alternate because that would clash with the function parameter) Would make the flow of the function nicer I think
This function could look like
// Marshal to MessagePack
packed, err := msgpack.Marshal(s)
if err != nil {
return nil, err
}
if compress {
packed, err = compress(packed)
if err != nil {
return nil, err
}
}
return c.Encrypt(packed)
pkg/apis/sessions/session_state.go
Outdated
// LZ4 Decompress | ||
reader := bytes.NewReader(decrypted) | ||
buf := new(bytes.Buffer) | ||
zr := lz4.NewReader(nil) | ||
zr.Reset(reader) | ||
_, err = io.Copy(buf, zr) | ||
if err != nil { | ||
return nil, err | ||
} | ||
|
||
packed, err = ioutil.ReadAll(buf) | ||
if err != nil { | ||
return nil, err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise, would be good to see a decompress helper so the if statement can shrink down to
if compressed {
packed, err = decompress(packed)
if err != nil {
return nil, err
}
}
pkg/apis/sessions/session_state.go
Outdated
// Backward compatibility with using unencrypted Email or User | ||
// Decryption errors will leave original string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think we should drop this now, since we are dropping support for unencrypted sessions in the next release? Everyone should be upgrading via v6 so this should be redundant I think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's do it! -- I have no qualms about using this as the opportunity to purge unencrypted sessions. We should roll this & #575 together then (after V6 gives a rollout window of encrypted sessions + SHA256 HMAC)
That would likely make the forced reauth for stragglers cleanest and result in no potential errors since their session will be tossed out with the SHA1 cookie failing to validate.
if tc.Error { | ||
for name, tc := range testCases { | ||
t.Run(name, func(t *testing.T) { | ||
var err error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May as well declare this inline on 205 _, err := ...
@@ -353,3 +274,24 @@ func TestIntoEncryptAndIntoDecrypt(t *testing.T) { | |||
}) | |||
} | |||
} | |||
|
|||
func compareSessionStates(t *testing.T, left *SessionState, right *SessionState) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice extraction to a helper 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Those pesky Time based fields made a pure object compare not work as I had hoped.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than left, right, we should call these expected and got? Or expected and actual? That would tie up with the assert right? I think in assert the parameter order is t testing.T, expected interface{}, actual interface{}
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At one point I use this helper to compare if taking the encode/decode route with 2 different paths still results in sessions states that are equal:
compareSessionStates(t, decoded, decodedCompressed)
That is the only reason I didn't use the expected terminology. But I suppose I am expecting the decoded without compression to equal the decoded with compression -- so I can switch to that naming convention
pkg/sessions/redis/redis_store.go
Outdated
|
||
// legacyV5LoadSessionFromTicket loads the session based on the ticket value | ||
// This falls uses V5 style encryption of Base64 + AES CFB | ||
func (store *SessionStore) legacyV5LoadSessionFromTicket(ctx context.Context, value string) (*sessions.SessionState, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, I think I need to circle back on this one later. I don't think I have test coverage for this legacy fallback.
4ef36b4
to
6fec3df
Compare
@NickMeves can you do a rebase on latest master and fix the CI issue? I'd like to see the tests passing on this. Gonna give another review now anyway |
pkg/apis/sessions/session_state.go
Outdated
if err != nil || !utf8.ValidString(*s) { | ||
return nil, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the decryption succeeded but the string isn't valid UTF8, err is nil here so we return nil, nil
, I don't think that's the intention, perhaps this should be split into two if statements?
pkg/apis/sessions/session_state.go
Outdated
compressed, err := ioutil.ReadAll(buf) | ||
if err != nil { | ||
return nil, err | ||
} | ||
|
||
return compressed, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you're not embellishing the error message with any extra detail, you could just return ioutil.ReadAll(buf)
pkg/apis/sessions/session_state.go
Outdated
func DecodeSessionState(data []byte, c encryption.Cipher, compressed bool) (*SessionState, error) { | ||
decrypted, err := c.Decrypt(data) | ||
if err != nil { | ||
return nil, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be worth embellishing the errors with some extra data so we can tell where the errors came from, eg:
return nil, err | |
return nil, fmt.Errorf("error decrypting data: %v, err") |
I would suggest doing that for the three error returns in this method as a minimum, preferably also in the encode too
pkg/apis/sessions/session_state.go
Outdated
payload, err := ioutil.ReadAll(buf) | ||
if err != nil { | ||
return nil, err | ||
} | ||
|
||
return payload, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above, could just return, WDYT?
for name, ss := range testCases { | ||
t.Run(name, func(t *testing.T) { | ||
encoded, err := ss.EncodeSessionState(c, false) | ||
assert.NoError(t, err) | ||
encodedCompressed, err := ss.EncodeSessionState(c, true) | ||
assert.NoError(t, err) | ||
assert.GreaterOrEqual(t, len(encoded), len(encodedCompressed)) | ||
|
||
decoded, err := DecodeSessionState(encoded, c, false) | ||
assert.NoError(t, err) | ||
decodedCompressed, err := DecodeSessionState(encodedCompressed, c, true) | ||
assert.NoError(t, err) | ||
|
||
compareSessionStates(t, decoded, decodedCompressed) | ||
compareSessionStates(t, decoded, &ss) | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth possibly checking that trying to decode with a different cipher results in an error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we might be sufficient covered already by these tests in the cipher section, what do you think?
oauth2-proxy/pkg/encryption/cipher_test.go
Line 161 in c4cf15f
func TestGCMtoCFBErrors(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the middle encryption layer in these tests is sorta passed over and we never explicitly test changing the cipher breaks the functions (vs our compression size & different output checks in the middle stage).
I'll give it a whirl adding it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ended up doing this -- and took the opportunity to test with all valid AES secret lengths and ran the tests through all our potential ciphers. Before I was using the const secret & just the AES-CFB cipher.
@@ -353,3 +274,24 @@ func TestIntoEncryptAndIntoDecrypt(t *testing.T) { | |||
}) | |||
} | |||
} | |||
|
|||
func compareSessionStates(t *testing.T, left *SessionState, right *SessionState) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than left, right, we should call these expected and got? Or expected and actual? That would tie up with the assert right? I think in assert the parameter order is t testing.T, expected interface{}, actual interface{}
?
6fec3df
to
5f9d5a5
Compare
5f9d5a5
to
a4ee514
Compare
a4ee514 adds the unit tests for the legacy redis session decoding. I tried to get the common test cases unified under |
a4ee514
to
cc77e9b
Compare
370c27a
to
fce8439
Compare
For the codeclimate issues - I want to fix those in a separate PR that moves the persistent store serialization + cookie ticket logic to a central place so all stores can use it (and just need to worry about how to Save/Load/Del key + bytes) |
@JoelSpeed This is ready for the finish line when you have time for a final look. Ran the following interactive tests with the docker-compose environment to confirm the legacy fallback worked as planned: Cookie v5.1.0 settings that didn't result in tokens being sent (ie no cookie_refresh or pass_authorization_header) caused an error that resulted in a reauthentication (as per our discussion in this PR to remove support for that legacy case) As far as compression with the docker-compose contrib environment: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logic all looks good, left some stylistic comments
- [#632](https://github.com/oauth2-proxy/oauth2-proxy/pull/632) There is backwards compatibility to sessions from v5 | ||
- Any unencrypted sessions from before v5 that only contained a Username & Email will trigger a reauthentication | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this count as a breaking change or an important note? It's not breaking from v6.0.0 is it? Would this require a major version bump or minor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll move to Important Notes
-- it is only semi-breaking (is a reauth breaking...?) from a subset of v5 users that had a valid v6 cookie-secret
but didn't have the 4 options that triggered a cipher to be set. It will cause a reauthentication for those JSON sessions that come in with only an unencrypted user/email.
All other v5 use cases had options that had full sessions that would be handled with the fallback or other issues that other v6 changes forced them to re-authentication anyway
pkg/apis/sessions/session_state.go
Outdated
func (s *SessionState) EncodeSessionState(c encryption.Cipher, compress bool) ([]byte, error) { | ||
packed, err := msgpack.Marshal(s) | ||
if err != nil { | ||
return nil, errors.Wrap(err, "error marshalling session state to msgpack") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use the built in errors package if you are going to wrap errors, preferred style is fmt.Errorf("<descr>: %w", err)
. Is there any reason to wrap this error? Are we unwrapping it later somewhere?
pkg/apis/sessions/session_state.go
Outdated
var ss SessionState | ||
err = msgpack.Unmarshal(packed, &ss) | ||
if err != nil { | ||
return nil, errors.Wrap(err, "error unmarshalling data to session state") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment re error wrapping as above
pkg/apis/sessions/session_state.go
Outdated
reader := bytes.NewReader(payload) | ||
_, err := io.Copy(zw, reader) | ||
if err != nil { | ||
return nil, errors.Wrap(err, "error copying lz4 stream to buffer") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above re error wrap
pkg/apis/sessions/session_state.go
Outdated
} | ||
err = zw.Close() | ||
if err != nil { | ||
return nil, errors.Wrap(err, "error closing lz4 writer") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above re error wrap
@@ -1,6 +1,9 @@ | |||
package redis | |||
package redis_test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a benefit to making this part of the test package? I never really understood why this helps, I assume the compiler ignores all the test code anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm gonna revisit the legacy_helper -- I was thinking trying to consolidate the legacy test cases into 1 area would help DRY up the code... but it caused more headaches than I think its worth for tests we'll likely fully deprecate in V7 and rip out.
It caused circular imports -- which made me need to change the package of the test code that consumed them to *_test
to avoid the circular import compile error.
) | ||
|
||
func TestLegacyV5DecodeSession(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the rest of this test suite uses Ginkgo and Gomega, have you considered using either of those?
You can use gomega assertions withiout Ginkgo by initialising a new Gomega
g := NewWithT(t)
err := ...
g.Expect(err).ToNot(HaveOccurred())
t.Run("", func (t *testing.T) {
// Use a subtest gomega to fail just the subtest
gs := NewWithT(t)
...
})
Not a blocker, but I would prefer the consistency
pkg/sessions/tests/legacy_helpers.go
Outdated
ciphertext := make([]byte, len(value)) | ||
block, err := aes.NewCipher(ticket.Secret) | ||
if err != nil { | ||
return nil, fmt.Errorf("error initiating cipher block %s", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return nil, fmt.Errorf("error initiating cipher block %s", err) | |
return nil, fmt.Errorf("error initiating cipher block: %v", err) |
pkg/sessions/tests/legacy_helpers.go
Outdated
@@ -0,0 +1,102 @@ | |||
package tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these only used for redis? Should we keep them in the redis package if so?
oauthproxy_test.go
Outdated
if result == hmacauth.ResultNoSignature { | ||
w.Write([]byte("no signature received")) | ||
_, err = w.Write([]byte("no signature received")) | ||
} else if result == hmacauth.ResultMatch { | ||
w.Write([]byte("signatures match")) | ||
_, err = w.Write([]byte("signatures match")) | ||
} else if result == hmacauth.ResultMismatch { | ||
w.Write([]byte("signatures do not match:" + | ||
_, err = w.Write([]byte("signatures do not match:" + | ||
"\n received: " + headerSig + | ||
"\n computed: " + computedSig)) | ||
} else { | ||
panic("Unknown result value: " + result.String()) | ||
panic("unknown result value: " + result.String()) | ||
} | ||
if err != nil { | ||
panic(err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could make this a switch that sets the value of a string, then handle the error writing afterwards, WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll give it a quick whirl! But I think this will get a much bigger overhaul when I tackle all this with this issue: #648
Thinking back, I feel like a few PRs in the past have added headers to the SignatureHeaders yet didn't mention that this would be a breaking change requiring any users of this functionality to update how the backend code validated the signature.
So I'm really starting to think no one uses this -- or any few users are very advanced and know to check the code itself to see what headers they need to hmacauth on the backend (since we don't document it).
ad06a21
to
1595cae
Compare
@JoelSpeed This commit takes care of the circular import issue and gets us back to the old way where our session & redis tests can be in those packages and have access to unit test the internal methods again: Only downside is it sits in the This feels like the least bad option while still DRYing up these repeated legacy V5 tests? |
03e3fcf
to
4a9c9af
Compare
There's a linter issue, need to make the type it's complaining about public, which is ok as this will be temporary. Other than that, and the changelog needing a conflict resolution, all looks good, let's get those fixed and get this merged! |
Assumes ciphers are now mandatory per oauth2-proxy#414. Cookie & Redis sessions can fallback to V5 style JSON in error cases. TODO: session_state.go unit tests & new unit tests for Legacy fallback scenarios.
More aggressively assert.NoError on all validation.Validate(opts) calls to enforce legal options in all our tests. Add additional NoError checks wherever error return values were ignored.
Invalid CFB decryptions can result in garbage data that 1/100 times might cause message pack unmarshal to not fail and instead return an empty session. This adds more rigor to make sure legacy sessions cause appropriate errors.
Refactor common legacy JSON test cases to a legacy helpers area under session store tests.
Placing these helpers under the sessions pkg removed all the circular import uses in housing it under the session store area.
4a9c9af
to
dd19d53
Compare
@JoelSpeed Ready to roll! 🎉 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
The MessagePack + Compression portion of overhauling sessions encoding. (#523)
Description
(Controversial) Makes the compression stage mandatory and always happens. I think we either do LZ4 compression or we don't, no need for optional command line flag bloat.
This assumes ciphers are mandatory moving forward: with #414
Encode/Decode codepath itself is rather light. Most of the added code is supporting detecting a legacy V5 JSON session coming in and handling the fallback.
This PR also streamlines the RedisSessionStore rogue encryption of sessions to use the GCMCipher that is now available.
Motivation and Context
Lots of users having issues with large cookie sessions, particularly with Azure.
This moves to encoding with MessagePack to give a binary encoding format (JSON required the encrypted cookie to be Base64 encoded to be a valid string for JSON, increasing cookie size by 33%)
Additionally, it moves encryption to be on the full session state object once - instead of on a per field basis. This saves the 16 bytes where all encrypted fields store the initialization vector for each field.
LZ4 compression selected due to its super fast decompress performance. Since we likely mint a session cookie once, and then constantly read it (and lz4 decompress) for its lifespan -- lz4 had the best profile of various compression algorithms.
How Has This Been Tested?
Unit tests added for session state.
NOTE: leaving this PR in draft until I address
oauth2proxy_test.go
issues. Lots of unit tests rans without acipher
since that was a previously allowed codepath. This PR firmly removes the ability to have no cipher in SessionState code (matching what our options validator now does as well). Just need to update the affected unit tests.Checklist: