Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kubeadm: support certificate chain validation #97266

Merged

Conversation

robbiemcmichael
Copy link
Contributor

@robbiemcmichael robbiemcmichael commented Dec 13, 2020

What type of PR is this?

/kind bug

What this PR does / why we need it:

Fixes an issue where some kubeadm phases fail if a certificate file contains a certificate chain with one or more intermediate CA certificates. This two-tier CA configuration is required to allow a short expiry to be used on the CA used to sign certificates without resorting to frequent manual rotations of the root CA.

The proposal is to change the validation algorithm from requiring that a certificate was signed directly by the root CA to requiring that there is a valid certificate chain back to the root CA. The fix itself is quite simple, most of the complexity is in the PKI helper functions required to unit test this change properly.

Which issue(s) this PR fixes:

Fixes the first part of kubernetes/kubeadm#2360

Special notes for your reviewer:

N/A

Does this PR introduce a user-facing change?:

kubeadm: add support for certificate chain validation. When using kubeadm in external CA mode, this allows an intermediate CA to be used to sign the certificates. The intermediate CA certificate must be appended to each signed certificate for this to work correctly.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

N/A

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/bug Categorizes issue or PR as related to a bug. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Dec 13, 2020
@k8s-ci-robot
Copy link
Contributor

Welcome @robbiemcmichael!

It looks like this is your first PR to kubernetes/kubernetes 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/kubernetes has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot
Copy link
Contributor

Hi @robbiemcmichael. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-priority Indicates a PR lacks a `priority/foo` label and requires one. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. area/kubeadm sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Dec 13, 2020
@robbiemcmichael robbiemcmichael marked this pull request as ready for review December 13, 2020 15:01
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 13, 2020
}

cert := certs[0]
intermediates := certs[1:]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, orders in chains are not guaranteed, thus this feels applicable to your use case only.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if a chain contains no intermediates, this line will panic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, orders in chains are not guaranteed, thus this feels applicable to your use case only.

I can't find anything explicit in RFC 5280 or RFC 7468 to say that the leaf certificate must be first. However, I can see that the crypto/tls library makes this assumption in its certificate verification logic as well an explicit comment elsewhere that the leaf certificate should be first.

Since I can't find an explicit statement in the standards I'm going to have to fall back on the argument that if it's good enough for Golang's standard TLS implementation then it's good enough for the certificate verification helpers in kubeadm. The Kubernetes components will be using the TLS implementation anyway, so it makes sense that the kubeadm phases which perform certificate verification behave the same way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if a chain contains no intermediates, this line will panic.

It shouldn't panic, if there's only a single certificate then intermediates will be an empty slice. Here's a quick example:

package main

import "fmt"

func main() {
    x := []int{1}
    fmt.Printf("Head: %d\n", x[0])
    fmt.Printf("Tail: %v\n", x[1:])
}

This produces:

Head: 1
Tail: []

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RFC5286 7.4.2 describes the chain order:

The sender's certificate MUST come first in the list. Each following certificate MUST directly certify the one preceding it. Because certificate validation requires that root keys be distributed independently, the self-signed certificate that specifies the root certificate authority MAY be omitted from the chain, under the assumption that the remote end must already possess it in order to validate it in any case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It shouldn't panic,

my mistake.

RFC5286 7.4.2 describes the chain order:

thanks for the info!

@@ -556,32 +625,39 @@ func NewPrivateKey(keyType x509.PublicKeyAlgorithm) (crypto.Signer, error) {
}

// NewSignedCert creates a signed certificate using the given CA certificate and key
func NewSignedCert(cfg *CertConfig, key crypto.Signer, caCert *x509.Certificate, caKey crypto.Signer) (*x509.Certificate, error) {
func NewSignedCert(isCA bool, cfg *CertConfig, key crypto.Signer, caCert *x509.Certificate, caKey crypto.Signer) (*x509.Certificate, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this bool should be the last arg ideally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change made in this commit: 6f90532

t.Errorf(
"failed to create intermediate CA cert and key with an error: %v",
err,
)
Copy link
Member

@neolit123 neolit123 Dec 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this t.Errorf does not need to span on multiple lines.

(EDIT: applies to a number of instances bellow too)

Copy link
Contributor Author

@robbiemcmichael robbiemcmichael Dec 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All requested formatting changes have been made in this commit: d43e524

The original style that I went with is more consistent with the rest of the existing code (see this example) but I also prefer the shortened version as long as we're not going for complete consistency.

if err != nil {
t.Errorf(
"failed to write cert bundle: %v",
err,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

return errors.Wrapf(err, "certificate %s is not signed by corresponding CA", l.uxName)
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should exclude whitespace changes from the PR.


if _, err := cert.Verify(verifyOptions); err != nil {
return err
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this change doesn't seem backwards compatible to clusters created with older versions of kubeadm.
calling VerifyCertChain would now result in an error if the intermediate pool is empty, correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is backwards compatible since the Verify function doesn't require the intermediate pool to contain any certificates, it only needs to contain the certificates required to construct a valid chain. No additional certificates are required when the leaf certificate is signed directly by the root CA, so this function will succeed.

I've also only changed the verification logic and haven't changed anything to do with the way that kubeadm generates certificates, so the standard setup will still be where the root CA directly signs the leaf certificate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added an extra test case to prove that this is backwards compatible: 511f498

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, this should work fine AFAIK.

@@ -79,14 +80,29 @@ func NewCertificateAuthority(config *CertConfig) (*x509.Certificate, crypto.Sign
return cert, key, nil
}

// NewIntermediateCertificateAuthority creates new certificate and private key for an intermediate certificate authority
func NewIntermediateCertificateAuthority(parentCert *x509.Certificate, parentKey crypto.Signer, config *CertConfig) (*x509.Certificate, crypto.Signer, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

config should be the first argument.

maybe the better API here is to have a single NewCertificateAuthority function that decides on self-signed vs signed-by-parent depending on a nil parentCert, but having a separate function is fine too.

Copy link
Contributor Author

@robbiemcmichael robbiemcmichael Dec 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since creating an intermediate CA is an uncommon use case (it's only used in the tests for the certificate chain verification) I think it's preferable not to force callers of NewCertificateAuthority (of which there are quite a few) to have to specify a nil certificate and key.

Copy link
Member

@neolit123 neolit123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robbiemcmichael
thank you for the PR, i think the change is not necessarily bad, but i have concerns on how this plays with existing setups of kubeadm. have you though about what happens during upgrade on cert renew?

also unless i'm wrong the fact that the intermediate CA follows a leaf in your setup is not standardized, is it?

for the above and similar reasons, that is why we ask contributors to write a detailed proposal (KEP):
https://github.com/kubernetes/enhancements/tree/master/keps

/cc @randomvariable @fabriziopandini

@neolit123
Copy link
Member

/cc @wallrj
PTAL too

@k8s-ci-robot
Copy link
Contributor

@neolit123: GitHub didn't allow me to request PR reviews from the following users: wallrj.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @wallrj
PTAL too

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@neolit123
Copy link
Member

/priority backlog

@k8s-ci-robot k8s-ci-robot added priority/backlog Higher priority than priority/awaiting-more-evidence. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Dec 16, 2020
@neolit123
Copy link
Member

neolit123 commented Dec 16, 2020

in the OP, under:

Does this PR introduce a user-facing change?:

please add the following instead of NONE:

kubeadm: add support for ... <describe the feature with a couple of sentences>

this will add a release note for 1.21.

@@ -79,14 +80,29 @@ func NewCertificateAuthority(config *CertConfig) (*x509.Certificate, crypto.Sign
return cert, key, nil
}

// NewIntermediateCertificateAuthority creates new certificate and private key for an intermediate certificate authority
func NewIntermediateCertificateAuthority(parentCert *x509.Certificate, parentKey crypto.Signer, config *CertConfig) (*x509.Certificate, crypto.Signer, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change made in this commit:

latest commit still has config as the last function argument.

should be:

NewIntermediateCertificateAuthority(config *CertConfig, parentCert *x509.Certificate, parentKey crypto.Signer)...

Copy link
Contributor Author

@robbiemcmichael robbiemcmichael Dec 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, the comment about the change being made was intended for the isCA bool being moved to be the last argument. I've edited the comments so that it makes sense now.

As for moving config to the last argument, I've gone with this ordering for consistency with th existing function NewCertAndKey. Let me know if you would prefer it to be left as is for consistency, be different or whether I should update both to move config to the last argument.

Copy link
Member

@neolit123 neolit123 Dec 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, let's leave the ordering as is, for consistency with NewCertAndKey.
the pkiutils package has no API guaranties yet (public usage), but we may have to review some of these semantics at some point.

serial, err := cryptorand.Int(cryptorand.Reader, new(big.Int).SetInt64(math.MaxInt64))
if err != nil {
return nil, err
}
if len(cfg.CommonName) == 0 {
return nil, errors.New("must specify a CommonName")
}
if len(cfg.Usages) == 0 {
if !isCA && len(cfg.Usages) == 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i wonder if we should just remove the mandatory len(cfg.Usages) > 0 here?
i believe currently we are passing either x509.ExtKeyUsageClientAuth or x509.ExtKeyUsageServerAuth to this function via config.

Copy link
Contributor Author

@robbiemcmichael robbiemcmichael Dec 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The config is passed through from NewCertAndKey so it's probably cleaner if I just perform this check there instead.

Let me know if you prefer the version in this commit otherwise I'm happy to remove it entirely: 464418c

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The config is passed through from NewCertAndKey so it's probably cleaner if I just perform this check there instead.

agreed.

KeyUsage: keyUsage,
ExtKeyUsage: cfg.Usages,
BasicConstraintsValid: isCA,
IsCA: isCA,
Copy link
Member

@neolit123 neolit123 Dec 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://golang.org/pkg/crypto/x509/

// BasicConstraintsValid indicates whether IsCA, MaxPathLen,
// and MaxPathLenZero are valid.

the documentation is a bit vague on what "valid" means, but should we assume BasicConstraintsValid should be always bound to isCA?

e.g. here it's not:

tmpl.BasicConstraintsValid = true
tmpl.IsCA = false

can we pass true always to it and an explicit value to MaxPathLen and MaxPathLenZero=false?

i'm assuming this is where it's checked:
https://github.com/golang/go/blob/master/src/crypto/x509/verify.go#L721-L726

Copy link
Contributor Author

@robbiemcmichael robbiemcmichael Dec 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just did it this way to make sure I didn't make any changes to the existing code which signs leaf certificates. My interpretation of the docs is that it would probably work to set BasicConstraintsValid to true. We shouldn't need to explicitly set MaxPathLen and MaxPathLenZero since the default values of 0 and false respectively are "interpreted as MaxPathLen not being set" according the docs.

If you're happy for this change to be made to the leaf certificate signing then it's in this commit: 081104c

If you want to play it safe then I can revert the commit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my vote goes to set it always to true even for non-CA certificates.
BasicConstraintsValid seems like a Golang wrapper over the Basic Constraints extension, to conform checks in crypto/x509/verify.go and it doesn't seem like a breaking change.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Dec 17, 2020
@robbiemcmichael
Copy link
Contributor Author

Thanks for the review again @neolit123, I think I've made the changes or otherwise responded to all your comments but let me know if I've missed anything.

please keep the commits squashed to 2.

implementation
tests

I've just been using the extra commits to aid people in seeing the changes during the review process. I'll separate it into these two commits once you're happy that I've addressed all your comments.

Copy link
Member

@neolit123 neolit123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robbiemcmichael i went over the change again and it SGTM.
since @randomvariable is OK with this too, we can proceed with merging once you squash the commits.

thanks!
/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: neolit123, robbiemcmichael

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 21, 2020
@neolit123
Copy link
Member

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 21, 2020
Fixes an issue where some kubeadm phases fail if a certificate file
contains a certificate chain with one or more intermediate CA
certificates. The validation algorithm has been changed from requiring
that a certificate was signed directly by the root CA to requiring that
there is a valid certificate chain back to the root CA.
@robbiemcmichael robbiemcmichael force-pushed the kubeadm-validate-cert-chains branch from 625491f to 9022f24 Compare December 24, 2020 15:51
@robbiemcmichael
Copy link
Contributor Author

@neolit123 thanks again for the review, commits have been squashed and separated into implementation and tests as requested.

Copy link
Member

@neolit123 neolit123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 24, 2020
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 24, 2020
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@k8s-ci-robot k8s-ci-robot merged commit 88a05df into kubernetes:master Dec 24, 2020
@k8s-ci-robot k8s-ci-robot added this to the v1.21 milestone Dec 24, 2020
@robbiemcmichael robbiemcmichael deleted the kubeadm-validate-cert-chains branch January 5, 2021 02:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubeadm cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. priority/backlog Higher priority than priority/awaiting-more-evidence. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants