Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better ExternalName support #46332

Merged
merged 11 commits into from
Oct 27, 2023

Conversation

howardjohn
Copy link
Member

@howardjohn howardjohn commented Aug 4, 2023

Github won't let me use my old PR (#37365) so making a new one...
Fixes #37331

Example config:

apiVersion: v1
kind: Service
metadata:
  name: b-ext
  namespace: echo
spec:
  ports:
  - name: tcp
    port: 80
    targetPort: 18080
  type: ExternalName
  externalName: b.echo.svc.cluster.local

Kubernetes behavior: b-ext creates a CNAME record for b. Port is not relevant.

Istio (today): Create a DNS cluster for each port, which ends up resolving to Cluster IP. Match on Host header (http), SNI (tls), or just port (TCP) (meaning it captures all traffic on that port).

Issues:

  • Port matters in Istio but not Kubernetes
  • We get minimal functionality because we only have the cluster IP. No load balancing, auto mtls, etc
  • TCP doesn't work well since there is no VIP - instead we capture ALL traffic on the configured port

This PR changes the implementation to more closely align with Kubernetes.

ExternalName becomes a new "Alias" service type internally. An alias simply adds extra matching logic to existing Services. For example, above we would add a match for b-ext everywhere we match for b. This works for HTTP and TLS; for TCP it is already not needed as it works today (envoy will see traffic to b and b-ext as the same). If there was no Service/ServiceEntry defined at all for the target externalName, then we would end up just passing through.

Note that port from ExternalName is not used at all, just like in Kubernetes.

If we have a destination in VirtualService referring to an ExternalName service explicitly, this will now no longer work unless they have also defined the referenced externalName as a concrete service. For example:

apiVersion: v1
kind: Service
metadata:
  name: google
spec:
  type: ExternalName
  externalName: google.com

If I forward to this in a VS, it will work IFF I have a ServiceEntry defined for google.com. As such, it is generally not recommended to do this (you could just point to google.com directly), but can be useful when use third party Ingress configs, etc. Other Ingress implementations have extremely inconsistent support for ExternalName -- I couldn't find two implementations that behaved the same way -- so there isn't much of a standard to follow here.

One major difference in functionality with this change is that the ExternalName service is now just an alias, not its own "thing" (Envoy cluster). This means we cannot apply different rules for it. However, this can be done with standard DR subsets.

TODO:

  • Add tests. Critical ones are Sidecar scoping
  • Add feature flag to retain old behavior
  • Add release note
  • How should VS behave? if I bind to the alias service, does it apply or not?

@istio-testing
Copy link
Collaborator

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@istio-testing istio-testing added do-not-merge/work-in-progress Block merging of a PR because it isn't ready yet. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 4, 2023
@howardjohn
Copy link
Member Author

/test all

1 similar comment
@howardjohn
Copy link
Member Author

/test all

@howardjohn howardjohn force-pushed the pilot/service-entry-alias branch 2 times, most recently from b9448bf to 8637065 Compare August 4, 2023 22:21
@howardjohn howardjohn force-pushed the pilot/service-entry-alias branch 2 times, most recently from ab57ba5 to 2696ac3 Compare August 14, 2023 20:19
@istio-testing istio-testing added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 14, 2023
@howardjohn
Copy link
Member Author

/test all

@howardjohn howardjohn force-pushed the pilot/service-entry-alias branch 2 times, most recently from 0de2b95 to df12c75 Compare August 14, 2023 23:33
@howardjohn
Copy link
Member Author

/test all

@howardjohn howardjohn force-pushed the pilot/service-entry-alias branch from df12c75 to 25cf713 Compare August 15, 2023 16:02
@howardjohn
Copy link
Member Author

/test all

@howardjohn howardjohn force-pushed the pilot/service-entry-alias branch from 25cf713 to 936d646 Compare August 15, 2023 17:44
@howardjohn
Copy link
Member Author

/test all

@linsun
Copy link
Member

linsun commented Aug 16, 2023

If we have a destination in VirtualService referring to an ExternalName service explicitly, this will now no longer work unless they have also defined the referenced externalName as a concrete service.

What do you mean by referenced externlName? The first example with ports declared?

@howardjohn
Copy link
Member Author

If we have a destination in VirtualService referring to an ExternalName service explicitly, this will now no longer work unless they have also defined the referenced externalName as a concrete service.

What do you mean by referenced externlName? The first example with ports declared?

apiVersion: v1
kind: Service
metadata:
  name: b-ext
  namespace: echo
spec:
  ports:
  - name: tcp
    port: 80
    targetPort: 18080
  type: ExternalName
  externalName: b.com

if I reference b-ext as a VS destination: If b.com is a ServiceEntry, this works. If its not, it will not work.

@istio-testing istio-testing added the needs-rebase Indicates a PR needs to be rebased before being merged label Aug 31, 2023
@howardjohn howardjohn force-pushed the pilot/service-entry-alias branch from 936d646 to 9361515 Compare September 15, 2023 20:12
@istio-testing istio-testing removed the needs-rebase Indicates a PR needs to be rebased before being merged label Sep 15, 2023
@howardjohn
Copy link
Member Author

/ok-to-test

@istio-testing istio-testing added the ok-to-test Set this label allow normal testing to take place for a PR not submitted by an Istio org member. label Sep 19, 2023
@howardjohn howardjohn force-pushed the pilot/service-entry-alias branch from 9361515 to 690c921 Compare September 19, 2023 16:00
@howardjohn
Copy link
Member Author

/test all

@howardjohn
Copy link
Member Author

I can see this pr can resolve the hijack traffic to given port in external name service. But not sure how it resolves

ingress #30547

We have an explicit test that this works: see routing.go gateway test under externalNameCases. I have also manually tested their setup and knative (they used knative; knative makes extensive use of ExternalName) and it worked.

DNS floods like

The DNS floods are caused by the old code creating STRICT_DNS clusters for the services. This results in Svc*Ports per svc*Proxies/TTL RPS.

If you have 30s ttl, 1000 external name svc with 2 ports each, and 2000 proxies, that is 133k QPS. We have seen this level of DNS traffic caused by ExternalName in real world production clusters (obviously not for long, as the DNS server cost was astronomical, so they dropped Istio...).

This all happens even under zero traffic.

This PR drops this to zero additional overhead by Istio. While users will still trigger DNS queries in their applications, this is not a source of this level of DNS traffic, and is something already present in existing users clusters.

@howardjohn howardjohn force-pushed the pilot/service-entry-alias branch from 1c3e530 to f8a5a02 Compare October 25, 2023 22:35
@hzxuzhonghu
Copy link
Member

ExternalName service does not work well for tls https because of host, this is documented in k8s doc. There is no need for istio to define the correct behavior, actually it is difficult to define which one is right.

For DNS floods, if we intercept the DNS resolve from envoy to local istio DNS proxy, then it is more simple. #47567

And I would also suggest adding a knob for external name service support. So users can completely disable it

@howardjohn
Copy link
Member Author

ExternalName service does not work well for tls https because of host, this is documented in k8s doc. There is no need for istio to define the correct behavior, actually it is difficult to define which one is right.

Agree, we should not invent new behavior here, just keep existing k8s behavior (as this PR does)

For DNS floods, if we intercept the DNS resolve from envoy to local istio DNS proxy, then it is more simple. #47567

This does not fully resolve the issue, only where they are using trivial aliasing to other services (pointing to a real external service like google.com , as is most common, will not work this way) and using the experimental DNS proxy feature. Additionally, it requires dramatic changes to the DNS architecture which I don't think are really feasible given it introduces a circular dependency.

Again, DNS flooding is only 1 of 5+ reasons to do this PR. So while it's nice we could mitigate the issue in some cases without this PR, I don't believe it's a compelling reason to not move forward with this PR.

And I would also suggest adding a knob for external name service support. So users can completely disable it

I don't mind that, but this PR is still critical. ExternalName is a core v1 Kubernetes feature, and Istiod must support it reasonably out of the box. The current poor support has repeatedly come up as one of the biggest blockers.

@howardjohn
Copy link
Member Author

@hzxuzhonghu can you take another look so we can keep moving forward? I am worried this will mo d the release back if we keep discussing at 1 message per day

@hzxuzhonghu
Copy link
Member

@howardjohn
For users who can not accept the behavior change and also want to resolve the issues, what should we do?
And there are no other matters, except the resolveServiceAliases. Can you add a benchmark for it, so we can evaluate how long it takes for 1000, 5000, 10000 services.

@howardjohn
Copy link
Member Author

howardjohn commented Oct 26, 2023 via email

@howardjohn
Copy link
Member Author

@hzxuzhonghu added benchmarks. 100 and 5k ExternalName services tested:

name                                    old time/op        new time/op        delta
InitPushContext/externalname-100-8             108µs ±14%         135µs ±15%  +24.81%  (p=0.000 n=8+10)
InitPushContext/externalname-5000-8           12.9ms ±10%        11.4ms ± 9%  -12.06%  (p=0.000 n=10+9)
RouteGeneration/externalname-100-8             400µs ± 2%           7µs ± 2%  -98.34%  (p=0.000 n=9+8)
RouteGeneration/externalname-5000-8           32.6ms ±12%         0.0ms ±15%  -99.94%  (p=0.000 n=10+9)
ClusterGeneration/externalname-100-8          1.22ms ± 3%        0.03ms ±32%  -97.41%  (p=0.000 n=9+10)
ClusterGeneration/externalname-5000-8         66.0ms ± 3%         0.0ms ±16%  -99.93%  (p=0.000 n=9+10)
ListenerGeneration/externalname-100-8          105µs ± 2%         138µs ±24%  +31.60%  (p=0.000 n=9+10)
ListenerGeneration/externalname-5000-8        1.39ms ± 7%        0.75ms ±17%  -45.88%  (p=0.000 n=9+9)

name                                    old alloc/op       new alloc/op       delta
InitPushContext/externalname-100-8            68.8kB ± 1%       114.1kB ± 0%  +65.78%  (p=0.000 n=10+10)
InitPushContext/externalname-5000-8           8.02MB ± 5%        8.21MB ± 5%     ~     (p=0.182 n=10+9)
RouteGeneration/externalname-100-8             368kB ± 0%           5kB ± 0%  -98.67%  (p=0.000 n=10+10)
RouteGeneration/externalname-5000-8           17.3MB ± 0%         0.0MB ± 0%  -99.97%  (p=0.000 n=10+10)
ClusterGeneration/externalname-100-8           574kB ± 0%          16kB ± 0%  -97.21%  (p=0.000 n=9+10)
ClusterGeneration/externalname-5000-8         28.5MB ± 0%         0.0MB ± 0%  -99.94%  (p=0.000 n=10+8)
ListenerGeneration/externalname-100-8         64.2kB ± 0%        64.2kB ± 0%   +0.00%  (p=0.009 n=9+9)
ListenerGeneration/externalname-5000-8         143kB ± 0%         143kB ± 0%     ~     (p=0.123 n=9+10)

name                                    old allocs/op      new allocs/op      delta
InitPushContext/externalname-100-8               579 ± 0%           401 ± 0%  -30.74%  (p=0.000 n=10+10)
InitPushContext/externalname-5000-8            25.2k ± 0%         15.4k ± 0%  -39.16%  (p=0.000 n=10+9)
RouteGeneration/externalname-100-8             2.80k ± 0%         0.05k ± 0%  -98.21%  (p=0.000 n=10+10)
RouteGeneration/externalname-5000-8             135k ± 0%            0k ± 0%  -99.96%  (p=0.000 n=10+10)
ClusterGeneration/externalname-100-8           9.23k ± 0%         0.22k ± 0%  -97.64%  (p=0.000 n=10+10)
ClusterGeneration/externalname-5000-8           457k ± 0%            0k ± 0%  -99.95%  (p=0.000 n=10+10)
ListenerGeneration/externalname-100-8            630 ± 0%           630 ± 0%     ~     (all equal)
ListenerGeneration/externalname-5000-8         5.53k ± 0%         5.53k ± 0%     ~     (all equal)

name                                    old kb/msg         new kb/msg         delta
RouteGeneration/externalname-100-8              41.0 ± 0%           0.4 ± 0%  -99.00%  (p=0.000 n=10+10)
RouteGeneration/externalname-5000-8            2.07k ± 0%         0.00k ± 0%  -99.98%  (p=0.000 n=10+10)
ClusterGeneration/externalname-100-8            59.2 ± 0%           1.8 ± 0%  -97.01%  (p=0.000 n=10+10)
ClusterGeneration/externalname-5000-8          2.93k ± 0%         0.00k ± 0%  -99.94%  (p=0.000 n=10+10)
ListenerGeneration/externalname-100-8           14.7 ± 0%          14.7 ± 0%     ~     (all equal)
ListenerGeneration/externalname-5000-8          14.7 ± 0%          14.7 ± 0%     ~     (all equal)

name                                    old resources/msg  new resources/msg  delta
RouteGeneration/externalname-100-8              1.00 ± 0%          1.00 ± 0%     ~     (all equal)
RouteGeneration/externalname-5000-8             1.00 ± 0%          1.00 ± 0%     ~     (all equal)
ClusterGeneration/externalname-100-8             105 ± 0%             5 ± 0%  -95.24%  (p=0.000 n=10+10)
ClusterGeneration/externalname-5000-8          5.00k ± 0%         0.01k ± 0%  -99.90%  (p=0.000 n=10+10)
ListenerGeneration/externalname-100-8           3.00 ± 0%          3.00 ± 0%     ~     (all equal)
ListenerGeneration/externalname-5000-8          3.00 ± 0%          3.00 ± 0%     ~     (all equal)

So in init PushContext (which I think is where your concerns were) there is a slight increase for a small number of ExternalName. if we start to have a lot, its actually better.

Here is zero externalNames (no diff):

name                    old time/op    new time/op    delta
InitPushContext/http-8     142µs ±13%     141µs ±12%   ~     (p=0.931 n=9+9)

name                    old alloc/op   new alloc/op   delta
InitPushContext/http-8    89.0kB ± 0%    89.1kB ± 1%   ~     (p=0.188 n=9+10)

name                    old allocs/op  new allocs/op  delta
InitPushContext/http-8       671 ± 0%       671 ± 0%   ~     (all equal)

The big difference is in XDS gen, which is incredibly more efficient in this PR (since we generate way way less XDS)

So overall I think this will be a big improvement in performance for most cases

@hzxuzhonghu
Copy link
Member

Great test, i have one doubt why InitPushContext have a better performance?

@howardjohn
Copy link
Member Author

Great test, i have one doubt why InitPushContext have a better performance?

There is another codepath we hit for the service otherwise that was slow. I forget what function exactly

@costinm
Copy link
Contributor

costinm commented Oct 27, 2023 via email

@costinm
Copy link
Contributor

costinm commented Oct 27, 2023 via email

@hzxuzhonghu
Copy link
Member

@costinm As your comment, In a k8s env, it seems there is no need to create a k8s service at all. We can have a discussion via slack though.

Copy link
Member

@hzxuzhonghu hzxuzhonghu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@howardjohn Thanks, imo, we can do some more code cleanup later

@istio-testing istio-testing merged commit 65d15b6 into istio:master Oct 27, 2023
1 check passed
@istio-testing
Copy link
Collaborator

In response to a cherrypick label: new pull request created: #47613

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cherrypick/release-1.20 Set this label on a PR to auto-merge it to the release-1.20 branch ok-to-test Set this label allow normal testing to take place for a PR not submitted by an Istio org member. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Better support ExternalName
10 participants