
Secure node -> master communication #3168

Closed · davidopp opened this issue Dec 29, 2014 · 26 comments
Labels: area/nodecontroller, area/security, priority/important-soon, sig/cluster-lifecycle

@davidopp (Member)

In some hosting environments/configurations, the network traffic between node and master may traverse the public Internet. As a result, we'd like to secure the communication between the node components (e.g. kubelet and proxy) and the master. To avoid the complexity of securing the kubelet API, we'd like to secure the node -> master communication, but not the reverse. This simplification has a downside: it means all communication between the kubelet and the master would have to be initiated by the kubelet. For example, we'd have to change health checks to be initiated by the kubelet, which in turn raises the question of how to do flow control (e.g., how the master applies backpressure when it becomes overloaded).
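
A minimal sketch of what kubelet-initiated reporting with backpressure could look like; the `/nodeStatus` path, payload, and intervals here are hypothetical illustrations, not the actual kubelet API:

```go
// Hypothetical sketch only: the /nodeStatus path, payload, and intervals
// are illustrative, not the real kubelet API.
package node

import (
	"bytes"
	"log"
	"net/http"
	"time"
)

// backoff doubles the reporting interval, capped at five minutes.
func backoff(d time.Duration) time.Duration {
	if d *= 2; d > 5*time.Minute {
		return 5 * time.Minute
	}
	return d
}

// reportStatus pushes node status to the master and slows down whenever the
// master signals overload with 429 Too Many Requests.
func reportStatus(masterURL string, payload []byte) {
	const nominal = 10 * time.Second
	interval := nominal
	for {
		resp, err := http.Post(masterURL+"/nodeStatus", "application/json", bytes.NewReader(payload))
		switch {
		case err != nil:
			log.Printf("status report failed: %v", err)
			interval = backoff(interval)
		case resp.StatusCode == http.StatusTooManyRequests:
			resp.Body.Close()
			interval = backoff(interval) // master is shedding load
		default:
			resp.Body.Close()
			interval = nominal // healthy path: back to the nominal period
		}
		time.Sleep(interval)
	}
}
```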

@davidopp (Member, Author)

As part of this, we should harden the master against properly authenticated but malformed requests (in other words, make sure the master gracefully handles bugs in the node components).

@erictune (Member)

To ensure we handle malformed requests, we should have a fuzz test that runs against the master API, including the endpoints used by the kubelets.
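
As an illustration, such a test could hammer the relevant endpoints with random bodies and assert the server never answers 5xx or drops the connection; the endpoint paths and server address below are placeholders:

```go
package apiserver_test

import (
	"bytes"
	"math/rand"
	"net/http"
	"testing"
)

func TestMalformedRequests(t *testing.T) {
	// Placeholder paths; a real test would enumerate the endpoints the
	// kubelets actually hit.
	endpoints := []string{"/api/v1beta1/pods", "/api/v1beta1/minions"}
	for i := 0; i < 1000; i++ {
		body := make([]byte, rand.Intn(4096))
		rand.Read(body)
		for _, ep := range endpoints {
			resp, err := http.Post("http://127.0.0.1:8080"+ep, "application/json", bytes.NewReader(body))
			if err != nil {
				t.Fatalf("connection error on %s: %v", ep, err)
			}
			resp.Body.Close()
			if resp.StatusCode >= 500 {
				t.Errorf("%s returned %d for a malformed body", ep, resp.StatusCode)
			}
		}
	}
}
```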

To handle a software fault where one or more kubelets are sending requests
at a fairly high rate, we should implement per-source-IP rate limits. This
should be easy to implement.
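
One way to sketch that limit is as middleware in front of the master API, using golang.org/x/time/rate; the limits below are arbitrary and eviction of idle entries is left out for brevity:

```go
// Sketch of a per-source-IP rate limit in front of the master API.
package middleware

import (
	"net"
	"net/http"
	"sync"

	"golang.org/x/time/rate"
)

type ipLimiter struct {
	mu       sync.Mutex
	limiters map[string]*rate.Limiter
}

func (l *ipLimiter) get(ip string) *rate.Limiter {
	l.mu.Lock()
	defer l.mu.Unlock()
	lim, ok := l.limiters[ip]
	if !ok {
		lim = rate.NewLimiter(rate.Limit(10), 20) // 10 req/s, burst 20, per source IP
		l.limiters[ip] = lim
	}
	return lim
}

// rateLimit wraps an API handler and answers 429 once a source IP exceeds
// its allowance, which also gives kubelets a clean backpressure signal.
func rateLimit(next http.Handler) http.Handler {
	l := &ipLimiter{limiters: map[string]*rate.Limiter{}}
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		ip, _, _ := net.SplitHostPort(r.RemoteAddr)
		if !l.get(ip).Allow() {
			http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
			return
		}
		next.ServeHTTP(w, r)
	})
}
```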

Defending against very high aggregate request rates (where the master burns lots of CPU just refusing connection requests) is harder, and I think we should leave it out of scope for now.


@erictune (Member)

This is closely related to #2483 and to discussion in pull #846. I'll leave it up to @davidopp to decide whether this issue is a duplicate or not.

@alex-mohr (Contributor)

I'd like to leave this open as specifically tracking secure master <-> kubelet communication and farm out other independent parts to separate issues (fuzz testing, (D)DoS prevention, and the wide variety of issues in #2483).

@alex-mohr alex-mohr assigned roberthbailey and unassigned j3ffml Feb 4, 2015
@alex-mohr alex-mohr added priority/important-soon and sig/cluster-lifecycle labels and removed priority/backlog label Feb 4, 2015
@bgrant0607 (Member)

#156 is a proposal to make apiserver and controllers not contact kubelet. I am now in favor of that proposal.

@j3ffml (Contributor) commented Feb 13, 2015

I think we still need a secure channel from apiserver to kubelet to do things like exec in a container and stream container logs.

@bgrant0607 (Member)

Proxy/bastion cases. Fair enough.

@derekwaynecarr (Member)

/cc @liggitt @deads2k

@erictune (Member)

Let's continue discussion of "exec in container and stream container logs" on #156

@roberthbailey (Contributor)

@liggitt @a-robinson

Here is my plan to get us to "Static Clustering" as defined in clustering.md:

  1. Configure the kubelet to use HTTPS
    • Generate a self-signed certificate on the kubelet and bind to the port using TLS (a minimal sketch follows this list).
    • The master will connect over HTTPS without verifying the certificate.
  2. Modify the kubelet to register itself with the master
    • For GCE/GKE, the master location is currently passed as the name of the master VM which is resolved into the internal IP of the master. For non-salt based platforms this is currently TBD until I investigate further.
    • The kubelet will pass its public key and location to the master, which will be stored persistently in etcd. The master can either use the current means to accept the kubelet into the cluster or implement the "insecure-always-approve" policy (as it will not yet be able to verify the certificate provided by the kubelet).
    • The master will now connect over HTTPS and verify that the certificate for the kubelet has the correct public key. Note that the kubelet will still not verify the master.
  3. Distribute certs to the master/nodes during cluster creation
    • For GCE/GKE this can be through the GCE metadata server or using gcloud ssh. For other providers this can be via ssh or another side channel.
    • For GCE we will use a single “cluster” certificate for all of the nodes instead of generating a separate certificate for each node to support managed instance groups.
    • Instead of generating self signed certificates, the certificates on the master and nodes will now be signed by the same certificate authority.
    • When the kubelet registers with the master, the master can verify that both the certificate provided during registration and the client certificate in the TLS handshake are signed by a known CA (the same one that signed the master certificate). The kubelet can verify that the server certificate provided by the master was signed by a known CA.
    • When the master connects to the kubelet over HTTPS the kubelet can verify that the client certificate provided by the master is signed by a known CA (the same one that signed the kubelet certificate) and the master can verify that the server certificate provided by the kubelet was signed by a known CA.
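
For step 1 above, a minimal sketch of generating a self-signed certificate in-process and binding the kubelet's port with TLS; key type, lifetime, and subject are arbitrary choices for illustration:

```go
// Minimal sketch of step 1, assuming the kubelet's standard port.
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"log"
	"math/big"
	"net/http"
	"time"
)

func main() {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		log.Fatal(err)
	}
	tmpl := x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "kubelet"},
		NotBefore:    time.Now(),
		NotAfter:     time.Now().Add(365 * 24 * time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature | x509.KeyUsageKeyEncipherment,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	}
	// Self-signed: the template is both subject and issuer.
	der, err := x509.CreateCertificate(rand.Reader, &tmpl, &tmpl, &key.PublicKey, key)
	if err != nil {
		log.Fatal(err)
	}
	srv := &http.Server{
		Addr: ":10250", // kubelet port
		TLSConfig: &tls.Config{
			Certificates: []tls.Certificate{{
				Certificate: [][]byte{der},
				PrivateKey:  key,
			}},
		},
	}
	// Empty file args: the cert comes from TLSConfig above.
	log.Fatal(srv.ListenAndServeTLS("", ""))
}
```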

@zmerlynn zmerlynn added this to the v1.0 milestone Feb 26, 2015
@smarterclayton (Contributor)

@liggitt I think this is the best place to describe the things we were talking about w.r.t. securing the kubelet.

@liggitt (Member) commented Mar 5, 2015

KubeletConfig{}/NewMainKubelet() already take a client.Client used to call the master API. That already provides a way to give the kubelet the master CA and credentials to use against the API (cert, token, etc).

I'd like to start by plumbing TLS, server cert, and server key options from the KubeletConfig down to the server start. First stab at that is here: #5104

Note that there are still places that assume http and 10250 (like minion.ResourceLocation) that need to know the particular scheme and port for a given node. I think that means the node API object should probably contain that info in addition to the hostname.
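
Roughly the shape of that plumbing, with illustrative field and flag names (not necessarily those used in #5104):

```go
// Illustrative plumbing only; field and flag names are guesses.
package main

import (
	"flag"
	"log"
	"net/http"
)

// KubeletServerConfig carries the TLS options from flag parsing down to the
// server start.
type KubeletServerConfig struct {
	Address       string
	EnableTLS     bool
	TLSCertFile   string // server certificate (self-signed for now)
	TLSPrivateKey string // matching private key
}

func (c *KubeletServerConfig) ListenAndServe(h http.Handler) error {
	s := &http.Server{Addr: c.Address, Handler: h}
	if c.EnableTLS {
		return s.ListenAndServeTLS(c.TLSCertFile, c.TLSPrivateKey)
	}
	return s.ListenAndServe()
}

func main() {
	cfg := &KubeletServerConfig{}
	flag.StringVar(&cfg.Address, "address", ":10250", "address to listen on")
	flag.BoolVar(&cfg.EnableTLS, "tls", false, "serve over HTTPS")
	flag.StringVar(&cfg.TLSCertFile, "tls-cert-file", "", "path to the server certificate")
	flag.StringVar(&cfg.TLSPrivateKey, "tls-private-key-file", "", "path to the server key")
	flag.Parse()
	log.Fatal(cfg.ListenAndServe(http.NotFoundHandler()))
}
```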

@liggitt (Member) commented Mar 5, 2015

Before doing "the kubelet will pass its public key and location to the master", could we update the Node API object to keep track of the location (scheme, host, port)? That seems like a much smaller change that would be a prerequisite for what is described in #3168 (comment), but it would enable manually registering nodes using https or alternate ports (and would also let us fix minion.ResourceLocation).
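
A hypothetical shape for that addition to the Node object; these field names are illustrative only, not any released API version:

```go
// Hypothetical field names for illustration.
package api

// KubeletEndpoint records how the master should reach a node's kubelet, so
// callers like minion.ResourceLocation can stop assuming http and 10250.
type KubeletEndpoint struct {
	Scheme string `json:"scheme"` // "http" or "https"
	Host   string `json:"host"`   // hostname or IP
	Port   int    `json:"port"`   // e.g. 10250
}

// Node is trimmed to the fields relevant to this discussion.
type Node struct {
	Name     string          `json:"name"`
	Endpoint KubeletEndpoint `json:"endpoint"`
}
```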

@roberthbailey (Contributor)

@liggitt I'm finally getting some time to work on this issue, and it seems like you've gotten a bit of a start for me. Thanks!

The list items above were a bit hand-wavy and weren't meant to represent consecutive PRs, but rather the general plan forward, knowing that it will need to be tweaked as I dig into the details of each step. For your specific question: when the node registers with the master, we can certainly store a bit of extra information about how to contact it, including the port. (I hadn't considered storing the scheme, since I'd assumed everything would need to move to https, but we can discuss that when we get there.)

@alex-mohr alex-mohr assigned cjcullen and unassigned roberthbailey Jun 3, 2015
@alex-mohr alex-mohr added priority/critical-urgent label and removed priority/important-soon label Jun 3, 2015
@alex-mohr (Contributor)

Update: @cjcullen is actively driving the work to secure the master -> kubelet communication for the proxy -- CJ, can you please mention this issue for the relevant PRs so we can track them as they land? (And thanks also to @brendandburns for helping out!)

@timothysc (Member)

cc @rrati

@cjcullen (Member) commented Jun 9, 2015

SSH proxy code is in. The only remaining question is whether to leave 10250 open to the public internet (in which case we'd still need to secure it) or close it off and only listen inside the cluster.
@a-robinson @vishh
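
For reference, the SSH-tunnel approach amounts to something like the following on the master side: dial the node over SSH and forward each kubelet request through the tunnel, so 10250 never has to be exposed publicly. This is an illustrative sketch, not the actual proxy code; auth setup is elided:

```go
// Illustrative only: auth setup is elided and names are made up.
package tunnel

import (
	"net"
	"net/http"

	"golang.org/x/crypto/ssh"
)

// kubeletClient returns an HTTP client whose connections ride an SSH tunnel
// to the node, reaching a kubelet that only listens inside the cluster.
func kubeletClient(nodeAddr string, sshCfg *ssh.ClientConfig) (*http.Client, error) {
	conn, err := ssh.Dial("tcp", net.JoinHostPort(nodeAddr, "22"), sshCfg)
	if err != nil {
		return nil, err
	}
	transport := &http.Transport{
		Dial: func(network, addr string) (net.Conn, error) {
			// Ignore addr: every request is forwarded through the SSH
			// connection to the kubelet on the node's loopback interface.
			return conn.Dial("tcp", "127.0.0.1:10250")
		},
	}
	return &http.Client{Transport: transport}, nil
}
```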

@roberthbailey (Contributor)

We've decided not to leave 10250 open to the internet, so I'm bumping this to the v1.0-post milestone (since we still want to add better security to the kubelet's http endpoint, just not for 1.0).

@roberthbailey roberthbailey modified the milestones: v1.0-post, v1.0 Jun 9, 2015
@roberthbailey roberthbailey added priority/important-soon label and removed priority/critical-urgent label Jun 23, 2015
@bgrant0607 bgrant0607 removed this from the v1.0-post milestone Jul 24, 2015
@wingedkiwi

I'm trying to follow the status of this issue. Is the master -> kubelet connection still insecure?

@roberthbailey (Contributor)

The master -> kubelet connection is not secure enough to be run across the internet. The master currently only connects to the kubelet for proxying user requests that need to be forwarded to the kubelet (e.g. hitting the /api/v1/proxy/ endpoint or using kubectl exec / kubectl logs). In most cases, this is done over a local secure network. For GKE, we use ssh tunnels to securely put packets onto the cluster's network without exposing the kubelet's web server to the internet.

Remaining work: the kubelet needs to serve its https endpoint with a certificate signed by the cluster CA. Right now it uses a self-signed cert for its web server, even though it uses a client certificate signed by the CA as its credential to authenticate to the master. The master needs a client certificate signed by the cluster CA to present to the kubelet when connecting.
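
A sketch of what that end state could look like on both sides, assuming the CA bundle lands at a placeholder path like /var/lib/kubelet/ca.crt:

```go
// Sketch only; the CA bundle path and ports are placeholders for however
// the cluster CA material actually gets distributed.
package kubelettls

import (
	"crypto/tls"
	"crypto/x509"
	"net/http"
	"os"
)

// newServer configures the kubelet's HTTPS endpoint to demand a client
// certificate signed by the cluster CA. The CA-signed serving cert and key
// are supplied later via ListenAndServeTLS(certFile, keyFile).
func newServer(handler http.Handler) (*http.Server, error) {
	caPEM, err := os.ReadFile("/var/lib/kubelet/ca.crt") // placeholder path
	if err != nil {
		return nil, err
	}
	pool := x509.NewCertPool()
	pool.AppendCertsFromPEM(caPEM)
	return &http.Server{
		Addr:    ":10250",
		Handler: handler,
		TLSConfig: &tls.Config{
			ClientCAs:  pool,
			ClientAuth: tls.RequireAndVerifyClientCert, // reject masters without a CA-signed cert
		},
	}, nil
}

// masterTransport is the mirror image on the master: trust the cluster CA
// for the kubelet's serving cert and present a CA-signed client cert.
func masterTransport(caPool *x509.CertPool, clientCert tls.Certificate) *http.Transport {
	return &http.Transport{
		TLSClientConfig: &tls.Config{
			RootCAs:      caPool,
			Certificates: []tls.Certificate{clientCert},
		},
	}
}
```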

We are also looking at moving the proxying functionality out of the master, which would entirely remove the need for the master to connect to the kubelet, making the work to secure the kubelet unnecessary. I'm not sure whether that will land first or if it's still worth trying to secure the master -> kubelet communications.

@smarterclayton (Contributor)

The proxying function still has to go somewhere, so I would assume that has to be secured? :) Is there an issue for the move out?


@roberthbailey (Contributor)

There's been quite a bit of discussion internally (not necessarily conclusive one way or the other), but I don't know if there's an issue open... looking.

@roberthbailey (Contributor)

Looks like there are (at least) two: #10209 and #3481.

@roberthbailey (Contributor)

I've created #11816 to discuss securing the master -> node communication, so I'm going to close this issue in favor of that one.
