
Update workload network when ns network label has changed #47574

Merged · 11 commits · Nov 22, 2023
14 changes: 14 additions & 0 deletions pilot/pkg/serviceregistry/kube/controller/ambientindex.go
@@ -955,6 +955,20 @@ func (c *Controller) AdditionalPodSubscriptions(
return shouldSubscribe
}

// syncAllWorkloadsForAmbient refreshes all ambient workloads.
func (c *Controller) syncAllWorkloadsForAmbient() {
Contributor:

I believe this needs to process k8s gateway resources since Waypoints currently have a network in their key.

Member Author:

Should this be related to the render logic of the waypoint template? What I've noticed is that when we set values.global.network and render the waypoint template, only the ISTIO_META_NETWORK environment variable is set; the topology.istio.io/network label is not. This results in the network being evaluated differently for waypoints than for sidecars.

// 1. check the pod/workloadEntry label
if nw := labels[label.TopologyNetwork.Name]; nw != "" {
return network.ID(nw)
}
// 2. check the system namespace labels
if nw := c.networkFromSystemNamespace(); nw != "" {
return nw
}

If the waypoint has the same behavior as sidecar injection, its network will not be affected by this PR. Otherwise, this PR will change the waypoint's network even when the ISTIO_META_NETWORK environment variable is set (from topology.istio.io/network), which is incorrect.
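The precedence in the snippet quoted above can be sketched as a standalone helper. This is an illustrative sketch, not the controller's actual method; `resolveNetwork` and `topologyNetworkLabel` are hypothetical names (the constant's value mirrors `label.TopologyNetwork.Name`):

```go
package main

import "fmt"

// topologyNetworkLabel mirrors label.TopologyNetwork.Name from istio.io/api;
// the value is reproduced here so the sketch stands alone.
const topologyNetworkLabel = "topology.istio.io/network"

// resolveNetwork sketches the lookup order discussed above: a workload's own
// topology.istio.io/network label wins, and the system namespace's label is
// only a fallback.
func resolveNetwork(workloadLabels map[string]string, systemNamespaceNetwork string) string {
	// 1. check the pod/workloadEntry label
	if nw := workloadLabels[topologyNetworkLabel]; nw != "" {
		return nw
	}
	// 2. fall back to the system namespace label
	return systemNamespaceNetwork
}

func main() {
	fmt.Println(resolveNetwork(map[string]string{topologyNetworkLabel: "nw1"}, "nw2")) // pod label wins: nw1
	fmt.Println(resolveNetwork(nil, "nw2"))                                            // fallback: nw2
}
```

With this ordering, a workload that was stamped with the label at injection time is immune to later system-namespace label changes, which is exactly the asymmetry the thread is debating for waypoints.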

Contributor:

Good point. Right now it's really broken. We inject that network label on the Pod, but the waypoint logic looks at the k8s Gateway which likely won't have the network label.

Contributor:

I have observed at least one case where a new Istiod processes the Waypoint/Gateway resource first, then the namespace. This results in the ambientindex storing /waypointIP when it should be network/waypointIP.

Member Author:

> This results in the ambientindex storing /waypointIP when it should be network/waypointIP.

This makes sense based on my tests with the waypoint network. If we can inject the network label similarly to sidecar injection, then regardless of whether the gateway resource is processed first, the waypoint's network key will not be affected. I'll take a look to see whether we can handle this in the template or the deployment controller.
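The fix proposed here could be sketched as follows. `applyNetworkLabel` is a hypothetical helper, not Istio's deployment controller code; it only illustrates the idea of stamping the resolved network onto the waypoint's pod labels the way sidecar injection does, so the waypoint's network key is stable regardless of resource-processing order:

```go
package main

import "fmt"

// topologyNetworkLabel mirrors label.TopologyNetwork.Name; the value is
// reproduced here so the sketch stands alone.
const topologyNetworkLabel = "topology.istio.io/network"

// applyNetworkLabel stamps the resolved network onto a pod-template label set.
// An explicitly set label is never overwritten, and empty networks are skipped,
// so a user-provided topology.istio.io/network always wins.
func applyNetworkLabel(podLabels map[string]string, network string) map[string]string {
	if podLabels == nil {
		podLabels = map[string]string{}
	}
	if _, ok := podLabels[topologyNetworkLabel]; !ok && network != "" {
		podLabels[topologyNetworkLabel] = network
	}
	return podLabels
}

func main() {
	fmt.Println(applyNetworkLabel(nil, "nw1")) // label stamped with nw1
}
```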

Member Author (@hanxiaop, Nov 1, 2023):

> So would we edit the deployment on network change, causing a rollout? Network change should be very rare/close to one-time, so I would be okay with this.

This wouldn't. Originally, when the network changed, we had to roll out the deployment to push the new network config; now we can dynamically push the update to ztunnel without a rollout.

> Have to be careful that we don't trigger the deployment when istiod hasn't determined the network properly on startup.

I'll verify this.

Member:

Any update on verifying this, @hanxiaop?

Member Author (@hanxiaop, Nov 20, 2023):

@linsun This PR looks good. It works only when the istio-system network label changes, not during startup, and the traffic behavior aligns with the description provided.
However, I do see some issues with the ztunnel network setup during testing.

Member Author (@hanxiaop, Nov 20, 2023):

There is one issue with this PR where the ztunnel network may be inaccurately represented in the ztunnel config dump: since we reprocess all workloads after the network label changes, the ztunnel pod's network might also be incorrectly altered, even if its network has already been set. Without this PR, the ztunnel network could already become inaccurate whenever the pod was updated (e.g. annotation updates), since both paths share the same syncPod logic. Note that this won't affect ztunnel's actual traffic behavior, since its network is configured via environment variables during installation and never changes; only the workload config dump shows the incorrect network value.

With some research, I think this is not directly related to this PR. However, we need to address these issues:

  1. We currently lack a uniform method for setting the ztunnel network during installation, so the network in ztunnel's workload info is altered incorrectly.
  2. The ztunnel behavior for network="" differs from that of proxies. For proxies, network="" implies the network is reachable. For ztunnel, it simply means the network is the empty string; if it does not exactly match the workload's network, the traffic will be unrecognized.

@stevenctl may need to confirm whether these are actually problems. The first one is related to this PR.
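The asymmetry described in point 2 can be made concrete with two hypothetical comparison helpers. These are illustrative sketches of the described semantics, not Istio or ztunnel code (ztunnel itself is written in Rust):

```go
package main

import "fmt"

// proxyNetworksMatch sketches the sidecar-style comparison described above,
// where an empty network is treated as "reachable from anywhere".
func proxyNetworksMatch(a, b string) bool {
	return a == "" || b == "" || a == b
}

// ztunnelNetworksMatch sketches the stricter ztunnel-style comparison, where
// "" is an ordinary value and must match exactly.
func ztunnelNetworksMatch(a, b string) bool {
	return a == b
}

func main() {
	fmt.Println(proxyNetworksMatch("", "nw1"))   // true: "" acts as a wildcard for proxies
	fmt.Println(ztunnelNetworksMatch("", "nw1")) // false: ztunnel requires an exact match
}
```

Under the stricter rule, a workload whose recorded network drifts from ztunnel's own (e.g. after the reprocessing issue above) would no longer match, which is why the traffic becomes "unrecognized".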

Member:

For ztunnel, the network can currently only be read from the NETWORK environment variable.

if c.ambientIndex != nil {
var namespaces []string
if c.opts.DiscoveryNamespacesFilter != nil {
namespaces = c.opts.DiscoveryNamespacesFilter.GetMembers().UnsortedList()
}
for _, ns := range namespaces {
pods := c.podsClient.List(ns, klabels.Everything())
c.ambientIndex.HandleSelectedNamespace(ns, pods, c)
}
}
}

func workloadNameAndType(pod *v1.Pod) (string, workloadapi.WorkloadType) {
objMeta, typeMeta := kubeutil.GetDeployMetaFromPod(pod)
switch typeMeta.Kind {
@@ -986,8 +986,10 @@ func newAmbientTestServer(t *testing.T, clusterID cluster.ID, networkID network.
CRDs: []schema.GroupVersionResource{gvr.KubernetesGateway},
ConfigController: cfg,
MeshWatcher: mesh.NewFixedWatcher(&meshconfig.MeshConfig{RootNamespace: systemNS}),
NetworksWatcher: mesh.NewFixedNetworksWatcher(nil),
ClusterID: clusterID,
ConfigCluster: true,
SystemNamespace: systemNS,
})
controller.network = networkID
pc := clienttest.Wrap(t, controller.podsClient)
2 changes: 2 additions & 0 deletions pilot/pkg/serviceregistry/kube/controller/fake.go

@@ -52,6 +52,7 @@ type FakeControllerOptions struct {
SkipRun bool
ConfigController model.ConfigStoreController
ConfigCluster bool
SystemNamespace string
}

type FakeController struct {
@@ -92,6 +93,7 @@ func NewFakeControllerWithOptions(t test.Failer, opts FakeControllerOptions) (*F
MeshServiceController: meshServiceController,
ConfigCluster: opts.ConfigCluster,
ConfigController: opts.ConfigController,
SystemNamespace: opts.SystemNamespace,
}
c := NewController(opts.Client, options)
meshServiceController.AddRegistry(c)
5 changes: 5 additions & 0 deletions pilot/pkg/serviceregistry/kube/controller/network.go

@@ -157,6 +157,11 @@ func (c *Controller) onNetworkChange() {
log.Errorf("one or more errors force-syncing endpoints: %v", err)
}
c.reloadNetworkGateways()
// This ensures ambient workloads are updated dynamically, aligning them with the current network settings.
// With this, pods do not need to restart when the network configuration changes.
if features.EnableAmbientControllers {
c.syncAllWorkloadsForAmbient()
}
}

// reloadMeshNetworks will read the mesh networks configuration to setup
79 changes: 79 additions & 0 deletions pilot/pkg/serviceregistry/kube/controller/network_test.go

@@ -37,7 +37,9 @@ import (
"istio.io/istio/pkg/kube/kclient"
"istio.io/istio/pkg/kube/kclient/clienttest"
"istio.io/istio/pkg/test"
"istio.io/istio/pkg/test/util/assert"
"istio.io/istio/pkg/test/util/retry"
"istio.io/istio/pkg/util/sets"
)

func TestNetworkUpdateTriggers(t *testing.T) {
@@ -229,3 +231,80 @@ func addMeshNetworksFromRegistryGateway(t *testing.T, c *FakeController, watcher
},
}})
}

func TestSyncAllWorkloadsFromAmbient(t *testing.T) {
test.SetForTest(t, &features.EnableAmbientControllers, true)

s := newAmbientTestServer(t, testC, "")

s.addPods(t, "127.0.0.1", "pod1", "sa1", map[string]string{"app": "a"}, nil, true, corev1.PodRunning)
s.assertAddresses(t, s.addrXdsName("127.0.0.1"), "pod1")

s.addPods(t, "127.0.0.2", "pod2", "sa2", map[string]string{"app": "a"}, nil, true, corev1.PodRunning)
s.assertAddresses(t, s.addrXdsName("127.0.0.2"), "pod2")

createOrUpdateNamespace(t, s.controller, testNS, "")
createOrUpdateNamespace(t, s.controller, systemNS, "")

expectWorkloadNetwork := func(t *testing.T, c *FakeController, network string) {
podNames := sets.New[string]("pod1", "pod2")
addresses := c.ambientIndex.All()
for _, addr := range addresses {
wl := addr.GetWorkload()
if wl == nil {
continue
}
if !podNames.Contains(wl.Name) {
continue
}
assert.Equal(t, network, addr.GetWorkload().Network)
}
}

t.Run("change namespace network to nw1", func(t *testing.T) {
createOrUpdateNamespace(t, s.controller, systemNS, "nw1")
retry.UntilSuccessOrFail(t, func() error {
if s.controller.networkFromSystemNamespace() != "nw1" {
return fmt.Errorf("network not updated")
}
return nil
}, retry.Timeout(5*time.Second), retry.Delay(10*time.Millisecond))
expectWorkloadNetwork(t, s.controller, "nw1")
})

t.Run("change namespace network to nw2", func(t *testing.T) {
createOrUpdateNamespace(t, s.controller, systemNS, "nw2")
retry.UntilSuccessOrFail(t, func() error {
if s.controller.networkFromSystemNamespace() != "nw2" {
return fmt.Errorf("network not updated")
}
return nil
}, retry.Timeout(5*time.Second), retry.Delay(10*time.Millisecond))
expectWorkloadNetwork(t, s.controller, "nw2")
})

t.Run("manually change namespace network to ns3, and update meshNetworks", func(t *testing.T) {
s.controller.setNetworkFromNamespace(&corev1.Namespace{
ObjectMeta: metav1.ObjectMeta{
Name: systemNS,
Labels: map[string]string{
label.TopologyNetwork.Name: "ns3",
},
},
})
addMeshNetworksFromRegistryGateway(t, s.controller, s.controller.meshNetworksWatcher)
expectWorkloadNetwork(t, s.controller, "ns3")
})
}

func createOrUpdateNamespace(t *testing.T, c *FakeController, name, network string) {
namespace := &corev1.Namespace{
ObjectMeta: metav1.ObjectMeta{
Name: name,
Labels: map[string]string{
label.TopologyNetwork.Name: network,
},
},
}
clienttest.Wrap(t, c.namespaces).CreateOrUpdate(namespace)
}