
Services v2 (ip-per-service) #1402

Merged: 6 commits, Oct 16, 2014
Conversation

@thockin (Member) commented Sep 22, 2014

DO NOT COMMIT. Needs test updates and a new e2e test. Some FIXMEs still. 95% complete.

Add support for IP-per-service behavior. This creates virtual IP addresses for each service, and uses iptables to route traffic "magically".
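For intuition, a minimal sketch of the kind of NAT rules involved, assuming a dedicated chain; the chain name, addresses, and ports below are invented for illustration and are not the PR's actual rules:

# Hypothetical: send traffic for a service's virtual IP (10.0.0.10:80)
# to a local proxy listening on a randomly chosen port (here 40123).
iptables -t nat -N KUBE-PORTALS
iptables -t nat -A PREROUTING -j KUBE-PORTALS
iptables -t nat -A OUTPUT -j KUBE-PORTALS
iptables -t nat -A KUBE-PORTALS -p tcp -d 10.0.0.10 --dport 80 -j REDIRECT --to-ports 40123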

The first few commits are relatively independent, and could be extracted to different PRs, if you feel like making me suffer.

See #1107 for background.

@thockin force-pushed the services_v2 branch 13 times, most recently from f23d4f6 to fa5e46d on September 29, 2014 20:45
@thockin (Member, Author) commented Sep 29, 2014

Anyone who was looking at this - I think it is almost done. I need to do better unit and e2e testing of the final solution, and there are a couple of FIXMEs left (any salt experts want to help?).

I am happy to start entertaining review comments at this time.

@thockin (Member, Author) commented Sep 29, 2014

@eyakubovich We should consider how this affects flannel

@thockin force-pushed the services_v2 branch 2 times, most recently from bcb4538 to 0d8b76d on September 30, 2014 04:21
@eyakubovich commented:

At first look I think it should be OK. If flannel is executed with --ip-masq, it installs a masquerade rule for traffic going from a flannel network to the outside world (same stuff Docker does). This PR is all about DNAT while flannel is doing SNAT so they should not conflict.
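For illustration, the two rule families side by side; all addresses are hypothetical:

# flannel with --ip-masq: SNAT (masquerade) for traffic leaving the overlay network
iptables -t nat -A POSTROUTING -s 10.244.0.0/16 ! -d 10.244.0.0/16 -j MASQUERADE
# this PR: rewrite the destination of traffic aimed at a service's virtual IP
iptables -t nat -A PREROUTING -p tcp -d 10.0.0.10 --dport 80 -j DNAT --to-destination 10.244.1.5:8080

The first matches on source address at POSTROUTING, the second on destination address at PREROUTING, so they act on different packets at different points and should not conflict.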


In docs/services.md:

+Services detailed diagram
+
+## Shortcomings
+
+Part of the service specification is a createExternalLoadBalancer flag,
+which tells the master to make an external load balancer that points to the
+service. In order to do this today, the service proxy must answer on a known
+(i.e. not random) port. In this case, the service port is promoted to the
+proxy port. This means that it is still possible for users to collide with
+each other's services or with other pods. We expect most services will not
+set this flag, mitigating the exposure.
+
+We expect that using iptables for portals will work at small scale, but will
+not scale to large clusters with thousands of services. We hope that, by that
+time, we will have moved to a model wherein each pod declares which services
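For concreteness, a sketch of a service definition carrying this flag; aside from createExternalLoadBalancer itself, the field names are assumptions about the v1beta1-era API shape:

# Hypothetical v1beta1-style service; only createExternalLoadBalancer
# is taken from the text above, the rest is assumed.
cat > frontend-service.json <<'EOF'
{
  "id": "frontend",
  "kind": "Service",
  "apiVersion": "v1beta1",
  "port": 80,
  "selector": { "name": "frontend" },
  "createExternalLoadBalancer": true
}
EOF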
Member:

This is one possible solution, but I don't think it's a foregone conclusion that we'll go that way.

Member:

If we intercepted DNS resolutions, could we lazily set up iptables rules at DNS resolution time?

Member Author:

We could make that optimization if we decided it was important to keep global portals, no pre-declarations, and scaling was actually a problem. IMO it's too clever by half, but it's worth keeping in our pocket, I suppose.

Member:

I agree that we shouldn't do the optimization now. My point was that requiring users to declare services used isn't the only solution, and I have concerns about the impact that would have on ease of use and use-case coverage.

Contributor:

Maybe we should sketch out what the user flow for both user-directed services and global services looks like, and determine whether we want both or just one. We should also discuss how the ordering problems, as implemented today for global services and pods, affect actual user interaction.


Member Author:

Can you explain a bit more what you want to see? I'm too close to it, I think.


@thockin changed the title from "WIP: Services v2 (ip-per-service)" to "Services v2 (ip-per-service)" on Oct 3, 2014
@thockin force-pushed the services_v2 branch 2 times, most recently from 77d42a5 to 690dba9 on October 3, 2014 20:12
@thockin (Member, Author) commented Oct 3, 2014

I know this is an enormous PR - I can break it up into smaller PRs, though the commits are pretty clean.

I'd really appreciate someone looking at at least the first few commits, though I think it is functionally complete modulo some improved testing.

@smarterclayton do you have some iptables comprehension? Any other volunteers?

@bgrant0607 (Member):

@thockin I can take a look.

@thockin (Member, Author) commented Oct 3, 2014

I welcome all comers :) The first 4 or 5 commits are self-contained and not too subtle.


@@ -96,9 +96,9 @@ There are 4 ways that a container manifest can be provided to the Kubelet:

### Kubernetes Proxy

-Each node also runs a simple network proxy. This reflects `services` as defined in the Kubernetes API on each node and can do simple TCP stream forwarding or round robin TCP forwarding across a set of backends.
+Each node also runs a simple network proxy. This reflects `services` (see [here](https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/serices.md) for more details) as defined in the Kubernetes API on each node and can do simple TCP and UDP stream forwarding (round robin) across a set of backends.
Member:

FYI, I have found that relative references work better than absolute ones -- docs/....md. Also, you have a typo: serices

Member Author:

Fixed in next push
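As an aside, the round-robin forwarding mentioned in the hunk above amounts to cycling an index over the backend list; a toy shell sketch with invented endpoints:

# Toy round-robin backend selection; the endpoints are made up.
backends=(10.244.1.5:8080 10.244.2.7:8080 10.244.3.2:8080)
i=0
next_backend() {
  echo "${backends[i % ${#backends[@]}]}"
  i=$((i + 1))
}
next_backend   # prints 10.244.1.5:8080
next_backend   # prints 10.244.2.7:8080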

# $3: service port
function wait_for_service_down() {
  for i in $(seq 1 20); do
    $(ssh-to-node "${test_node}" "
Contributor:

Same as above...

Member Author:

back at you :)
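For context, the excerpt above is truncated by the review view; a hedged sketch of what such a polling helper might look like, where ssh-to-node, test_node, and the curl-based reachability check are assumptions:

# Sketch only: poll until the service stops answering. $1 and $2 are
# assumed to be the service name and IP; only $3 (port) is documented above.
function wait_for_service_down() {
  local ip="$2" port="$3"
  for i in $(seq 1 20); do
    if ! ssh-to-node "${test_node}" "curl -s --connect-timeout 1 ${ip}:${port}/" >/dev/null 2>&1; then
      return 0  # no longer reachable
    fi
    sleep 2
  done
  return 1  # still reachable after 20 attempts
}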

@thockin (Member, Author) commented Oct 15, 2014

e2e test passes on GCE every time. It still fails sometimes on vagrant - failing to connect in places that just shouldn't fail. I give up and am punting it to the Vagrant folks. @derekwaynecarr @pweil- - your ball.

@derekwaynecarr (Member):

I am fine merging with the current behavior and we can separately address the consistency of the results.

There has been some talk about looking into a libvirt-based VM to improve performance over VirtualBox for Linux-based devs, or kernel devs in general.


@thockin (Member, Author) commented Oct 16, 2014

As far as I can tell this PR is done.

@thockin force-pushed the services_v2 branch 2 times, most recently from 41aef32 to 9b5162c on October 16, 2014 02:55
@pweil- (Contributor) commented Oct 16, 2014

I agree with @derekwaynecarr. We can work on the specific env issues separately. LGTM

@thockin (Member, Author) commented Oct 16, 2014

Bombs away!

@thockin added a commit referencing this pull request on Oct 16, 2014: Services v2 (ip-per-service)
@thockin merged commit 3436775 into kubernetes:master on Oct 16, 2014
@thockin mentioned this pull request on Oct 16, 2014
@thockin deleted the services_v2 branch on October 17, 2014 06:17
pietern pushed a commit to pietern/kubernetes that referenced this pull request Oct 29, 2014
jbeda pushed a commit to jbeda/kubernetes that referenced this pull request Oct 30, 2014
In preparation for kubernetes#1402.

(cherry picked from commit fa24fac)
spothanis pushed a commit to spothanis/kubernetes that referenced this pull request Mar 24, 2015
In preparation for kubernetes#1402.

(cherry picked from commit fa24fac)
gnufied pushed a commit to gnufied/kubernetes that referenced this pull request Nov 14, 2022
…-openshift-master

Bug OCPBUGS-2991: Disable expansion in SC, if driver does not support it