Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial ambient dual stack support #51606

Merged
merged 4 commits into from
Jun 24, 2024

Conversation

howardjohn
Copy link
Member

@howardjohn howardjohn commented Jun 17, 2024

This has a few changes:

  • pass all pod IPs down WDS
  • Pass all service VIPs down WDS
  • Add a new IPFamilies field to service. This is used for two things:
    1. DNS lookups for headless services MUST not return unsupported families. Without this field, it is impossible to know which workload IPs to filter
    2. When we select a workload IP after resolving a service, we must pick a supported family. This is possible to do without for non-headless services (by inferring from the VIPs field), but easier to be explicit.
  • Enable IPv6 in the CNI by default. This should work in all cases, even on IPv4 only clusters. Note pure IPv4 nodes (no kernel support for ipv6 at all) is already broken on ambient (Issue trying ambient mode on an ipv4-only k8s cluster ztunnel#1131)

Corresponding ztunnel change: istio/ztunnel#1161

@istio-testing
Copy link
Collaborator

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@istio-testing istio-testing added do-not-merge/work-in-progress Block merging of a PR because it isn't ready yet. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jun 17, 2024
@bleggett
Copy link
Contributor

bleggett commented Jun 17, 2024

Ah ok - single stack IPv6 worked in ambient (I think we got at least one successful test) but we still have gotchas/bugs around "things might have more than one IP".

(hopefully not too many of those)

@bleggett
Copy link
Contributor

bleggett commented Jun 17, 2024

I would rather not use dualstack terminology in ambient tho - we shouldn't need to "support" dualstack. We should support V4 and V6 (together or independently, and probably tested and configured together and/or independently).

If we have bugs when support for both is enabled, that's probably just A Bug, and not really "support for a feature" we lack.

@howardjohn
Copy link
Member Author

TBH I disagree. Its totally different to have 2 IPs vs 1.

But also this is a PR not user facing, so we can have the docs folks bikeshed it when the time comes 🙂

@bleggett
Copy link
Contributor

TBH I disagree. Its totally different to have 2 IPs vs 1.

It is, my point is "the lack of support for more than one IP" is just a bug, especially in a K8S context.

But also this is a PR not user facing, so we can have the docs folks bikeshed it when the time comes 🙂

It's fine as long as we have tests for either/or/both - I'm less concerned with users here and more concerned with how the language we use affects how we view the problem.

@istio-testing istio-testing added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jun 18, 2024
@howardjohn howardjohn added the do-not-merge/hold Block automatic merging of a PR. label Jun 18, 2024
@howardjohn
Copy link
Member Author

/test all

Copy link
Contributor

@louiscryan louiscryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems safe from an API/upgrade/back-compat perspective.

@@ -193,6 +193,17 @@ func (a *index) constructService(svc *v1.Service, w *Waypoint) *workloadapi.Serv
Mode: workloadapi.LoadBalancing_STRICT,
}
}
ipPolicy := workloadapi.IPFamilies_AUTOMATIC
if len(svc.Spec.IPFamilies) == 2 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems potentially a little brittle, can users do silly things like put v1.IPv4Protocol in the list twice ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No the validation is pretty strict. We could make it more robust anyways though

@howardjohn howardjohn added the release-notes-none Indicates a PR that does not require release notes. label Jun 18, 2024
@howardjohn
Copy link
Member Author

/retest

@howardjohn howardjohn marked this pull request as ready for review June 19, 2024 00:22
@howardjohn howardjohn requested review from a team as code owners June 19, 2024 00:22
@istio-testing istio-testing removed the do-not-merge/work-in-progress Block merging of a PR because it isn't ready yet. label Jun 19, 2024
cips = []string{svc.Spec.ClusterIP}
}
for _, cip := range cips {
if cip != "" && cip != v1.ClusterIPNone {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add ut for this function

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#51665 adds some basic ones, will make it easier to add new ones. if we can merge that it will help and Ill update this PR with a test

@@ -155,7 +155,20 @@ func (a *index) podWorkloadBuilder(
if (!IsPodRunning(p) && !IsPodPending(p)) || p.Spec.HostNetwork {
return nil
}
podIP, err := netip.ParseAddr(p.Status.PodIP)
k8sPodIPs := p.Status.PodIPs
if len(k8sPodIPs) == 0 && p.Status.PodIP != "" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question, when Status.PodIPs can be empty? I guess we donot need to check it since have checked Running status before

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mostly defense in depth against weird Pods, shouldn't really happen. In particular our tests often just set podIp and not PodIps.

But any Pod can be arbitrarily modified by user, so we cannot assume it's valid

Copy link
Contributor

@bleggett bleggett Jun 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also IIRC the k8s docs say that if PodIPs is non-empty, the first entry in the list is PodIP - they don't declare any guarantees that PodIPs will be non-empty. I think this is mostly for back compat with old K8S versions though. So yeah better to check.

@@ -81,6 +79,20 @@ message Service {
// Note: this applies only to connecting directly to the workload; when waypoints are used, the waypoint's load_balancing
// configuration is used.
LoadBalancing load_balancing = 8;

// IP families provides configuration about the IP families this service supports.
IPFamilies ip_families = 9;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really needed, can infer from addresses

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfortunately for headless services there is no vip. If it wasn't for that I wouldn't have added it probably .

see istio/ztunnel#1152

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is headless processed in ztunnel now, it has no ip associated on service.

It should be matched by a workload ip. Since workload has the ip, can we infer from it. Anyway, i am not sure how you want ztunnel to do with dual stack

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is for DNS. The linked ztunnel PR has the ztunnel implementation and tests

@@ -193,6 +193,17 @@ func (a *index) constructService(svc *v1.Service, w *Waypoint) *workloadapi.Serv
Mode: workloadapi.LoadBalancing_STRICT,
}
}
ipFamily := workloadapi.IPFamilies_AUTOMATIC
Copy link
Contributor

@bleggett bleggett Jun 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QQ - are we emulating the k8s ipfamily or ipfamilypolicy API here, or both?

It seems like mostly this is just emulating ipfamilypolicy semantics? Do we plan to hardcode the behavior, or additionally expose ipfamily style APIs for the relevant resources we have in Istio, to match k8s?

"dualstack" is just two problems AFAICT:

  • should you assign more than 1 IP to (some resource with IPs) -> (k8s ipfamilypolicy), I think you said we will probably always assign both if V6 enabled.
  • can you handle more than 1 IP assigned to (some resource with IPs).
  • if you can, how do you priority-order the list of IPs on that resource (k8sipfamily ordering semantics)

The only bit we really care about as A Problem is the ordering of that IP list for non-K8S/Istio-only resources, right? Doesn't matter whether that list contains 1 V4 and 1 V6, 2 V4s, or 5 V6s, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ipFamilyPolicy. This is actually mostly about joining two resources (Service and Pod), and which IPs you should join. Since k8s already pre-filters the IPs (and assigns the IPs), its not like we need to do "which IPs do we assign" , etc

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would we need to do that for stuff like SEs eventually, though?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We really only need this for headless services selecting pods, which there is no SE equivalent. So I don't think we really do

@@ -155,7 +155,20 @@ func (a *index) podWorkloadBuilder(
if (!IsPodRunning(p) && !IsPodPending(p)) || p.Spec.HostNetwork {
return nil
}
podIP, err := netip.ParseAddr(p.Status.PodIP)
k8sPodIPs := p.Status.PodIPs
if len(k8sPodIPs) == 0 && p.Status.PodIP != "" {
Copy link
Contributor

@bleggett bleggett Jun 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also IIRC the k8s docs say that if PodIPs is non-empty, the first entry in the list is PodIP - they don't declare any guarantees that PodIPs will be non-empty. I think this is mostly for back compat with old K8S versions though. So yeah better to check.

@howardjohn howardjohn removed the do-not-merge/hold Block automatic merging of a PR. label Jun 24, 2024
@istio-testing istio-testing merged commit dc132ea into istio:master Jun 24, 2024
28 checks passed
luksa pushed a commit to luksa/istio that referenced this pull request Oct 14, 2024
* upstream/master:
  Cleanup adsc (istio#51678)
  Initial support for network gateways in ambient krt (istio#51578)
  kclient: use internal indexes instead of external (istio#51576)
  Automator: update ztunnel@master in istio/istio@master (istio#51592)
  Initial ambient dual stack support (istio#51606)
  ambient: support explicit EndpointSlice endpoints (istio#51591)
  align the labels in the pull request tpl to the issue's (istio#51641)
  Automator: update proxy@master in istio/istio@master (istio#51686)
  Don't reset the jwks refresh ticker on URI fetch errors (istio#51637)
  ambient: properly mark pod as not ready when it is shutting down (istio#51650)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-notes-none Indicates a PR that does not require release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants