-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
crl provider: Static and FileWatcher provider implementations #6670
Conversation
…g connections based on CRL check outcomes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks really good! Have a few questions/comments
@@ -594,14 +619,14 @@ func (s) TestClientServerHandshake(t *testing.T) { | |||
// server custom check fails | |||
{ | |||
desc: "Client sets peer cert, reload root function with verifyFuncGood; Server sets bad custom check; mutualTLS", | |||
clientCert: []tls.Certificate{cs.ClientCert1}, | |||
clientGetRoot: getRootCAsForClient, | |||
clientCert: []tls.Certificate{cs.ClientCert3}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we name the certs for what they are rather than just being numbered?
e.g. if ClientCert3
is a cert that is revoked, just name is ClientRevokedCert
or something like that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to leave it as is for this PR since it fits the current pattern - security/advancedtls/internal/testutils/testutils.go
Definitely something to address during 'Unifying API' effort (will be a separate PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should continue to add bad code that we plan to refactor later just to keep with a bad existing pattern - let's go ahead and name these variables with something meaningful rather than making more work for ourselves down the road.
In fact, I'd argue the current pattern isn't much of a pattern - it's just lazy variable naming. For there to be a pattern there would need to be some meaning to Cert1
, Cert2
, etc
clientVerifyFunc: clientVerifyFuncGood, | ||
clientVType: CertVerification, | ||
clientExpectHandshakeError: true, | ||
serverMutualTLS: true, | ||
serverCert: []tls.Certificate{cs.ServerCert1}, | ||
serverGetRoot: getRootCAsForServer, | ||
serverCert: []tls.Certificate{cs.ServerCert3}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment about variable naming
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to leave it as is for this PR since it fits the current pattern - security/advancedtls/internal/testutils/testutils.go
Definitely something to address during 'Unifying API' effort (will be a separate PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See other thread
@@ -594,14 +619,14 @@ func (s) TestClientServerHandshake(t *testing.T) { | |||
// server custom check fails | |||
{ | |||
desc: "Client sets peer cert, reload root function with verifyFuncGood; Server sets bad custom check; mutualTLS", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see you changed this test case and it's not a new one - just curious as to why you changed it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good catch, I was experimenting with some chains and accidentally submitted (it doesn't influence the outputs). Will revert it back
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -704,6 +729,37 @@ func (s) TestClientServerHandshake(t *testing.T) { | |||
Cache: cache, | |||
}, | |||
}, | |||
// Client: set valid credentials with the revocation config | |||
// Server: set valid credentials with the revocation config | |||
// Expected Behavior: success, because non of the certificate chains sent in the connection are revoked |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: non
-> none
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You fixed the wrong non
:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
security/advancedtls/crl_provider.go
Outdated
failCounter := 0 | ||
for _, file := range files { | ||
filePath := fmt.Sprintf("%s/%s", p.opts.CRLDirectory, file.Name()) | ||
err := p.addCRL(filePath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need some thread safety around these updates, or we need to use a backing map type that is thread safe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
for _, tt := range tests { | ||
t.Run(tt.desc, func(t *testing.T) { | ||
for _, c := range tt.certs { | ||
crl, err := p.CRL(c) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also check that the correct CRL is returned, rather than just that a non-nil value is returned?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point but I'd prefer to keep it very simple here and do detailed checks on integration test level
// and the validation of Options configuration. The configurations include empty | ||
// one, non existing CRLDirectory, invalid RefreshDuration, and the correct one. | ||
func TestFileWatcherCRLProviderConfig(t *testing.T) { | ||
if _, err := MakeFileWatcherCRLProvider(Options{}); err == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can split the if
cases here into separate tests, or make this a Table Test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to have 3 separate test methods - to check config, CRL api, and directory scanning. I think config one can also be fit to Table test pattern (like the other 2) but I think it's a little bit easier to read it in the current format
// and the validation of Options configuration. The configurations include empty | ||
// one, non existing CRLDirectory, invalid RefreshDuration, and the correct one. | ||
func TestFileWatcherCRLProviderConfig(t *testing.T) { | ||
if _, err := MakeFileWatcherCRLProvider(Options{}); err == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And let's check that the errors
being returned are the expected errors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at how tests are organized for credentials/tls/certprovider/pemfile/watcher_test.go
I think it's an overkill for the config tests.
// intermediate chains, as well as a chain without CRL for issuer, and check | ||
// that it’s correctly processed. Additionally, we also check if number of | ||
// invocations of custom callback is correct. | ||
func TestFileWatcherCRLProvider(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a test verifying the async refreshing goroutine works?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not yet, it'll require playing with the timers so I'm planning to have a separate PR for that (like e2e test)
security/advancedtls/crl_provider.go
Outdated
@@ -136,9 +135,13 @@ func (o *FileWatcherOptions) validate() error { | |||
return fmt.Errorf("advancedtls: CRLDirectory %v is not readable: %v", o.CRLDirectory, err) | |||
} | |||
// Checks related to RefreshDuration. | |||
if o.RefreshDuration < time.Minute { | |||
if o.RefreshDuration <= 0 { | |||
grpclogLogger.Warningf("RefreshDuration is not set or negative: provided value %v, default value %v will be used.", o.RefreshDuration, defaultCRLRefreshDuration) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably not a warning to use the default value...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM except the last nit. And also investigating the race test failure.
security/advancedtls/crl_provider.go
Outdated
@@ -136,7 +136,7 @@ func (o *FileWatcherOptions) validate() error { | |||
} | |||
// Checks related to RefreshDuration. | |||
if o.RefreshDuration <= 0 { | |||
grpclogLogger.Warningf("RefreshDuration is not set or negative: provided value %v, default value %v will be used.", o.RefreshDuration, defaultCRLRefreshDuration) | |||
grpclogLogger.Infof("RefreshDuration is not set or negative: provided value %v, default value %v will be used.", o.RefreshDuration, defaultCRLRefreshDuration) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I don't think this necessarily needs to be logged at all? Especially in the case where you're just using the default, anything being logged would be spam. Maybe if you had a negative number? But who would even do such a thing? IMO:
if dur == 0 {
// use default, silently
} else if dur < 1m {
// use 1m and info or warning that we're using 1m instead of the illegal value they used
}
See https://github.com/grpc/grpc-go/blob/master/Documentation/log_levels.md for some guidance we try to follow regarding this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
const nonCRLFilesUnderCRLDirectory = 5 | ||
nonCRLFilesSet := make(map[string]struct{}) | ||
//var callbackMutex sync.Mutex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Delete?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -191,9 +197,11 @@ func (s) TestFileWatcherCRLProvider(t *testing.T) { | |||
} | |||
}) | |||
} | |||
nonCRLMutex.Lock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does calling p.Close
here, before the check, not remove the need for the mutex? If not, why not? I would think that would eliminate the concurrency concerns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's a very good point - current p.Close
impl waits for gouroutines to end. I tested locally and -race
is green
@@ -141,7 +141,6 @@ func (s) TestFileWatcherCRLProvider(t *testing.T) { | |||
t.Parallel() | |||
const nonCRLFilesUnderCRLDirectory = 5 | |||
nonCRLFilesSet := make(map[string]struct{}) | |||
//var callbackMutex sync.Mutex | |||
var nonCRLMutex sync.Mutex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So can you delete this now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, there is no need to protect this local to the test map anymore. We do the read once provider's goroutine is done
p.scanMutex.Lock() | ||
defer p.scanMutex.Unlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anything similar here? We really shouldn't need mutexes while testing conditions, or else they might be flaky if we lose a race.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I'm not sure - I need the same instance of Provider for all the table tests, and I need to make the check cmp.Diff(len(p.crls), tt.expectedEntries)
before the next test case scan, that's why I use mutex from p.ScanCRLDirectory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see, you're taking the mutex from the thing you're testing.
Can you call p.CRL()
instead of directly accessing p.crls
then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's doable - I can provide a list of filenames to be scanned ([]string{"2.crl", "3.crl"}
and list of CRLs to be found in the map by calling p.CRL()
Some downside - the current approach ensures the map contains 'exactly x' entries (len(p.crls)
) but the new one will be about 'at least x' entries. Is that ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right, CRL()
takes a parameter and you'd need to call it multiple times.
This is OK, then, though I'd prefer stricter validation, i.e. checking the contents of the map, or at least the keys, and not just the length.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did the refactoring to the stricter validation and I think it looks better now (plus no more calling mutexes inside tests)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gtcooke94 might want to take another pass after all the changes, but LGTM for me.
p.scanMutex.Lock() | ||
defer p.scanMutex.Unlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right, CRL()
takes a parameter and you'd need to call it multiple times.
This is OK, then, though I'd prefer stricter validation, i.e. checking the contents of the map, or at least the keys, and not just the length.
// are not independent from each other. | ||
func (s) TestFileWatcherCRLProviderDirectoryScan(t *testing.T) { | ||
sourcePath := testdata.Path("crl") | ||
targetPath := createTmpDir(t) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we remove this directory once the tests are finished? Or are all the Go versions we support recent enough for us to use https://godoc.corp.google.com/pkg/testing#T.TempDir?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, added the 'remove dir' approach.
) * rename certificateListExt to CRL * CRLProvider file * Add CRLProvider to RevocationConfig * Beginning refactor of CRL handling * Shell of StaticCRLProvider * basic static crl provider test * use loadCRL helper * refactor of CRL loading * Table tests * Table tests * Add tests with Static CRL provider * New certs to be used for CRL tests. Added test for passing and failing connections based on CRL check outcomes * Main functionality of File Watcher (Directory) CRL provider * Refactor async go routine, validate() func, add unit tests * Custom error callback, related unit tests * Error callback test improvement * Comments for StaticCRLProvider * Comments for public API * go mod tidy * Comments for tests * Fix vet errors * Change Static provider behavior to match C Core, address other PR comments * Data race fix * Test helper fn change * Address PR comments * Address PR comments (part 2) * Migration from context to channel for controlling crl reloading goroutine * Align in-memory CRL updates during directory scan to C++ behavior * Improve comments for ScanCRLDirectory * Base test case for Scan CRL Directory file manipulations * full set of cases for CRL directory content manipulation * Add comment for table test structure * Fix for go.mod and go.sum * Empty directoru workaround * Delete deprecated crl functionality * Restoring deprecated crl files * Fit to grpctest.Tester pattern * Update readme for crl provider tests * Address PR comments * Revert "Restoring deprecated crl files" This reverts commit 5643760. * Revert "Resolve conflicts with upstream - deletion of deprecated crl" This reverts commit e013064, reversing changes made to 21f4301. Revert deletion * Update link for gRFC proposal * Address PR comments * Address PR comments part 1 * Address PR comments part 2 * Address PR comments part 3 * Fix for go.mod and go.sum * Fix comment typo * Fix for gRFC tag * Add more details to CRL api godoc comments. * Address PR comments * Address PR comments * Delete crl_deprecated.go and crl_deprecated_test.go * Delete testdate/crl/provider/filewatcher directory and .gitignore under it * Race test fix * Address PR comments * Address PR comments * Refactor directory reloader test from checking size of crl map to querying individual entries approach * Add extra case for RefreshDuration config test * Update cpmment for table test structure * Unexport scan scanCRLDirectory, drop related mutex, update the comments * Update API comments, clear tmp dir after the tests --------- Co-authored-by: Gregory Cooke <gregorycooke@google.com>
Part of CRL Provider effort. The functionality in this PR includes a built in Static and FileWatcher provider implementations. It also gives users a tool to implement their own providers
RELEASE NOTES: N/A