Initial design doc for load-balancers & public-port services #7447
Conversation
On Tue, Apr 28, 2015 at 9:56 AM, Justin Santa Barbara wrote: ...

Cool. I like.

I'd vote for moving publicIP's from spec to status (and if necessary ...).

It breaks my heart, but agree with limiting scope for v1.0. Adding it ... I'll read your doc today, and comment there too :-)

> I can imagine that we might want to configure a system-pod that runs ...
@@ -0,0 +1,278 @@
# Load balancing & publicly accessible services
should be docs/design/public_ips.md or similar.
I spent some time thinking about the general service status issue. I think we've got two parts of the story for services mostly thought through. There are a bunch of ways a service can be pointed to:

Today in Kube there's no way to find out what the DNS name of your service is - we could define it by convention, but it's not discoverable by tools or end users. In OpenShift we have a route: one or more DNS names (plus port or path or protocol) that themselves are located on an IP or a stable DNS name. We point to the service, but there's no way the service can describe the different ways it is currently being pointed to. Why not create an ingress list? It's possibly more varied than endpoints, and it could be a separate resource, but at least right now it feels more like:
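Something like the following, say (a rough Go sketch only; the type and field names here are illustrative, not settled API):

```go
package sketch

// ServiceIngressPoint describes one way a service is currently reachable:
// a cluster IP, a host port, a cloud load balancer, an OpenShift-style
// route, etc. All names are illustrative.
type ServiceIngressPoint struct {
	Kind     string   // e.g. "cluster-ip", "host-port", "load-balancer", "route"
	DNSNames []string // stable DNS names pointing at this entry point, if any
	IPs      []string // IPs pointing at this entry point, if any
	Port     int      // port traffic arrives on
	Protocol string   // e.g. "tcp", "http"
	Path     string   // for HTTP-style routes
}

// ServiceStatus would then carry the full ingress list, registered and
// announced by the controllers that own the underlying objects (cloud load
// balancers, routes, future floating IPs, ...).
type ServiceStatus struct {
	Ingress []ServiceIngressPoint
}
```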
It's a fairly annoying problem that a client can't find out which internal endpoint to talk to, or discover external paths. Exposing some sort of structured object like this (as service status or as a separate resource) would allow clients to determine how to talk to them.

And to be clear, what is above is a sketch of the discovery side of services - actual objects (GCE API load balancer, OpenShift Route, future FloatingIP object) would still exist and have controllers that registered/announced themselves into the service.
1. API server assigns a hostPort (if the load balancer needs one), accepts and persists
1. LB Controller wakes up, allocates a load-balancer, and `POST`s `/api/v1/namespaces/foo/services/bar` `Service{ Status{ publicIps: [ "1.2.3.4" ] }}`
1. Service REST sets `Service.Status.publicIps = [ "1.2.3.4" ]`
1. kube-proxy wakes up and sets iptables to receive on 1.2.3.4
Except when the load balancer allocates a DNS name, and it needs to listen on a host port? As written, this seems muddled together with the public case, rather than one being nested in the other.
Yes, I was definitely muddled here. Making loadbalancer => public simplifies this, I think. I wrote out the two main cases (GCE, AWS) separately; they use very different models.
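For illustration, the LB-controller step in the flow above might look roughly like this (the client and cloud-provider interfaces are hypothetical stand-ins, not the real Kubernetes APIs):

```go
package sketch

// Service is a trimmed-down stand-in for the real object; only the fields
// this sketch needs are shown.
type Service struct {
	Namespace, Name string
	Status          struct{ PublicIPs []string }
}

// ServiceClient is a hypothetical API client: List reads services,
// UpdateStatus writes status back (the POST to
// /api/v1/namespaces/<ns>/services/<name> in the steps above).
type ServiceClient interface {
	List() ([]Service, error)
	UpdateStatus(svc Service) error
}

// LoadBalancerProvider wraps the cloudprovider: it creates (or finds) a
// balancer for the service and returns the public IPs it is reachable on.
type LoadBalancerProvider interface {
	EnsureLoadBalancer(namespace, name string) ([]string, error)
}

// syncOnce is one pass of the control loop: allocate balancers where needed
// and publish the resulting IPs into Service.Status, so kube-proxy (and
// anything else watching) can react.
func syncOnce(c ServiceClient, lb LoadBalancerProvider) error {
	services, err := c.List()
	if err != nil {
		return err
	}
	for _, svc := range services {
		// A real controller would also check that the service actually
		// asked for a load balancer; omitted here for brevity.
		if len(svc.Status.PublicIPs) > 0 {
			continue // already published
		}
		ips, err := lb.EnsureLoadBalancer(svc.Namespace, svc.Name)
		if err != nil {
			return err
		}
		svc.Status.PublicIPs = ips
		if err := c.UpdateStatus(svc); err != nil {
			return err
		}
	}
	return nil
}
```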
@smarterclayton I agree that figuring out the ingress method for a service is too complicated right now. I like the very structured method you suggested, though I don't think I'll have enough time to implement it. I think the two main use cases here are an external program consuming a service, and a service published through a load-balancer for general Internet usage.
Definitely something to improve in V2. What I suspect we'll end up doing in V1 or V1.1 is publishing some helpful records into skydns, and setting visibility==public on skydns.
The two things I think we have to do for 1.0 are: ...
@quinton-hoole I don't think I understand "move portalIP allocation to allow multi-master setups".

I believe this could be done synchronously, by essentially storing the currently-in-memory allocation pool in etcd. We could pack the data into one key and use a compare-and-swap, or we could have one key per portalIP.

In one of my PRs I created a pool helper class (pool_allocator.go); I had in the back of my mind that we would likely change the mechanism and wanted to avoid copying & pasting.

Is this the problem you're talking about, and do you think we could use etcd in this way?
I'll assume you meant me. Basically yes. We've talked about it in 6 or 7 different issues - the current thought is a CAS from a controller, because we don't want to waste time solving contention issues right now (we want to get it out of the crit path faster). If you want to tackle it (in whatever way you want), please do - it's on the short list of issues from the Red Hat side for 1.0, but we haven't gotten to it yet.
Oops, sorry Clayton (& Quinton!). That'll teach me to read threads on my phone... I don't think moving to etcd is necessary for this particular piece of work? But it sounds like if I allocate a public-port, I should use something like that utility class, so that we can more easily replace it with an etcd-backed pool for multi-master.
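For illustration, the single-key compare-and-swap variant mentioned above might look roughly like this (the KV interface is a stand-in rather than the real etcd client, and the pool shape is not the actual pool_allocator.go):

```go
package sketch

import (
	"encoding/json"
	"errors"
)

// KV is a minimal key-value store with versioned compare-and-swap,
// standing in for etcd.
type KV interface {
	Get(key string) (value []byte, version uint64, err error)
	// CompareAndSwap writes value only if the key is still at oldVersion.
	CompareAndSwap(key string, value []byte, oldVersion uint64) error
}

// pool is the whole allocation pool packed into one key.
type pool struct {
	Free      []string // unallocated portal IPs (or public ports)
	Allocated []string
}

// Allocate takes one item from the shared pool, retrying if another
// apiserver raced us and the compare-and-swap failed.
func Allocate(kv KV, key string) (string, error) {
	for retries := 0; retries < 10; retries++ {
		raw, version, err := kv.Get(key)
		if err != nil {
			return "", err
		}
		var p pool
		if err := json.Unmarshal(raw, &p); err != nil {
			return "", err
		}
		if len(p.Free) == 0 {
			return "", errors.New("pool exhausted")
		}
		item := p.Free[0]
		p.Free = p.Free[1:]
		p.Allocated = append(p.Allocated, item)
		updated, err := json.Marshal(&p)
		if err != nil {
			return "", err
		}
		if err := kv.CompareAndSwap(key, updated, version); err != nil {
			continue // lost the race; re-read and try again
		}
		return item, nil
	}
	return "", errors.New("too much contention allocating from pool")
}
```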
1. **namespace** The service is only accessible to pods belonging to the same namespace. This is similar
to _cluster_, but with additional firewall rules.
1. **dmz** The service is accessible to pods that are in the DMZ. This is similar to _cluster_,
Could someone point me to prior discussion on dmz? I don't understand why this isn't namespace-scoped visibility, plus another mechanism to restrict access to services outside the namespace, which seems like something that belongs in SecurityContext.
Given how much the implementation drifted from this, I think we should nix this doc, and instead update services.md to detail the final implementation. Killing this PR.
FAO: @thockin @smarterclayton @quinton-hoole @ArtfulCoder
(Probably of interest also to @a-robinson @benmccann and @bakins)
Initial design doc for public-port services & load-balancers.
I have tried to synthesize all the discussion in #6910, but have also tried to keep this to something that we can hope to achieve in the V1 timeframe.

A big simplification came from @a-robinson pointing out that the cloudprovider implementation can return the IPs that are associated with a load balancer (and that this is a good change long-term for reconciling failed create operations). That moves the reported IPs from Spec to Status, which then means we don't need to address the object-update-scoping problems now. I think.
I kept publicIPs more-or-less as-is, but they move from Spec to Status. Does this address your concerns with them @thockin? And then do we need loadbalancerIPs?
There has been some discussion of whether we should rename portalIP to something more self-evident. I think that is unrelated (although my vote is for clusterIP). While we're talking naming, I think I prefer publicPort to hostPort. And is sidecar-pod the right word?
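For concreteness, a rough sketch of where the fields discussed here might land (the names follow the discussion above and are not final):

```go
package sketch

// ServiceSpec holds what the user asks for.
type ServiceSpec struct {
	ClusterIP  string // today's portalIP; "clusterIP" is one proposed rename
	PublicPort int    // externally reachable port ("publicPort" vs "hostPort" is still open)

	// Whether a cloud load-balancer should be created for this service.
	CreateExternalLoadBalancer bool
}

// ServiceStatus holds what the system reports back: the IPs on which the
// service is actually reachable, filled in by the load-balancer controller /
// cloudprovider rather than requested in Spec.
type ServiceStatus struct {
	PublicIPs []string
}
```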
Also not in-scope: adding phases / lifecycle events to services.
I can imagine that we might want to configure a system-pod that runs haproxy & listens on ports 80 & 443. This would be very useful for Digital Ocean. I think this would be specially configured, and would bypass the API checks. I don't think we need to do anything for that just yet, so also out-of-scope.