Commit
* taint controller and unit tests
* command line, leader election and makefile configuration for taint controller
* e2e tests on taint controller
* go module updates, cni taint controller deployment and manifests updates:
  * fixed compatibility issue in e2e test and fixed leader election authentication issue
  * defined communication media in operator
  * addressed review comments from the istio community; adopted cache for restful api; removed ginkgo
  * updated manifest configmaps to the updated yaml packages; formatted e2e_taint tests
  * Update manifests/charts/istio-cni/templates/clusterrolebinding.yaml (Co-authored-by: stewartbutler <stewart@ethosnet.net>)
  * Update cni/test/e2e-taint/e2e_taint_test.go (Co-authored-by: stewartbutler <stewart@ethosnet.net>)
  * Update manifests/charts/istio-cni/templates/configmap-cni.yaml (Co-authored-by: stewartbutler <stewart@ethosnet.net>)
  * Update manifests/charts/istio-cni/values.yaml (Co-authored-by: stewartbutler <stewart@ethosnet.net>)
  * reshaped leader-election time naming, fixed election time issues and typos
* leader election graceful shut down
* add restart logic to leader election for recovery
* reformat the code according to review
* Update cni/pkg/taint/README.md (Co-authored-by: John Howard <howardjohn@google.com>)
  * fixed lint and markdown lint issues raised by the istio community
  * changed configmap settings to a list and updated end-to-end test docs
* moved end to end tests to integration tests
* make gen
* removed end to end tests for current cases
ZhiHanZ authored Aug 14, 2020
1 parent 52d5fe5 · commit ca6ce00
Showing 19 changed files with 2,453 additions and 329 deletions.
@@ -0,0 +1,2 @@
The `istio-cni-taint` binary can be run as a standalone command line tool
or as a daemon.
@@ -0,0 +1,224 @@
// Copyright 2020 Istio Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

// A simple binary, run either as a one-shot command line tool or as a daemon,
// that taints nodes whose critical daemonset pods (such as istio-cni) are not
// yet ready, preventing other pods from being scheduled before istio-cni can
// program them.

package main

import (
	"context"
	"os"
	"os/signal"
	"syscall"
	"time"

	"github.com/google/uuid"
	"github.com/spf13/cobra"
	"github.com/spf13/pflag"
	"github.com/spf13/viper"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	client "k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/client-go/tools/leaderelection"
	"k8s.io/client-go/tools/leaderelection/resourcelock"

	"istio.io/istio/cni/pkg/taint"
	"istio.io/pkg/log"
)

// ControllerOptions groups the command line configuration of the controller.
type ControllerOptions struct {
	RunAsDaemon  bool           `json:"run_as_daemon"`
	TaintOptions *taint.Options `json:"taint_options"`
}

// Lease lock used for leader election when running as a daemon.
const (
	LeaseLockName      = "istio-taint-lock"
	LeaseLockNamespace = "kube-system"
)

var (
	loggingOptions = log.DefaultOptions()

	rootCmd = &cobra.Command{
		Use:          "taint controller",
		Short:        "taint controller command line interface",
		SilenceUsage: true,
		Long: `The Istio CNI taint controller monitors the readiness of the critical labels defined in a configmap.
It can be run as a standalone command line tool or as a daemon.
As a command line tool it checks the readiness of the critical labels once and taints any node
on which some of them are not ready. As a daemon it runs as a Kubernetes controller,
continuously reconciling node taints against the readiness of the critical labels.
`,
		PersistentPreRunE: configureLogging,
		Run: func(cmd *cobra.Command, args []string) {
			// Parse args to settings
			options := parseFlags()
			clientSet, err := clientSetup()
			if err != nil {
				log.Fatalf("Could not construct clientSet: %s", err)
			}
			taintSetter, err := taint.NewTaintSetter(clientSet, options.TaintOptions)
			if err != nil {
				log.Fatalf("Could not construct taint setter: %s", err)
			}
			logCurrentOptions(taintSetter, options)
			tc, err := taint.NewTaintSetterController(taintSetter)
			if err != nil {
				log.Fatalf("Fatal error constructing taint controller: %+v", err)
			}
			if !options.RunAsDaemon {
				// One-shot mode: check every node once and exit.
				nodeReadinessCheck(tc)
				return
			}
			id := uuid.New().String()
			stopCh := make(chan struct{})
			lock := &resourcelock.LeaseLock{
				LeaseMeta: metav1.ObjectMeta{
					Name:      LeaseLockName,
					Namespace: LeaseLockNamespace,
				},
				Client: clientSet.CoordinationV1(),
				LockConfig: resourcelock.ResourceLockConfig{
					Identity: id,
				},
			}
			leaderElectionCallbacks := leaderelection.LeaderCallbacks{
				OnStartedLeading: func(ctx context.Context) {
					// Once elected leader, taint all candidate nodes first to prevent the race
					// condition, then run the controller until the context is cancelled
					// (graceful shutdown).
					tc.RegistTaints()
					tc.Run(ctx.Done())
				},
				OnStoppedLeading: func() {
					// When leadership is lost, log it; the surrounding loop restarts the election.
					log.Infof("leader lost: %s", id)
				},
				OnNewLeader: func(identity string) {
					// We are notified when a new leader is elected.
					if identity == id {
						// This instance just got the lock.
						return
					}
					log.Infof("new leader elected: %s", identity)
				},
			}
			func(stopCh <-chan struct{}) {
				for {
					ctx, cancel := context.WithCancel(context.Background())
					ch := make(chan os.Signal, 1)
					signal.Notify(ch, os.Interrupt, syscall.SIGTERM)
					go func() {
						<-ch
						log.Info("Received termination, signaling shutdown")
						cancel()
					}()
					leaderelection.RunOrDie(ctx, leaderelection.LeaderElectionConfig{
						Lock:            lock,
						ReleaseOnCancel: true,
						LeaseDuration:   60 * time.Second,
						RenewDeadline:   15 * time.Second,
						RetryPeriod:     5 * time.Second,
						Callbacks:       leaderElectionCallbacks,
					})
					select {
					case <-stopCh:
						return
					default:
						// Leader election exited unexpectedly; cancel the context and retry.
						cancel()
						log.Errorf("leader election lost unexpectedly, restarting")
					}
				}
			}(stopCh)
		},
	}
)

// parseFlags parses command line options into a ControllerOptions struct.
func parseFlags() (options *ControllerOptions) {
	// Configmap selection and run-mode flags.
	pflag.String("configmap-namespace", "kube-system", "the namespace of the critical pod definition configmap")
	pflag.String("configmap-name", "single", "the name of the critical pod definition configmap")
	pflag.Bool("run-as-daemon", true, "Controller will run in a loop")
	pflag.Bool("help", false, "Print usage information")

	pflag.Parse()
	if err := viper.BindPFlags(pflag.CommandLine); err != nil {
		log.Fatalf("Error parsing command line args: %+v", err)
	}

	if viper.GetBool("help") {
		pflag.Usage()
		os.Exit(0)
	}

	viper.SetEnvPrefix("TAINT")
	viper.AutomaticEnv()
	// Pull runtime args into structs
	options = &ControllerOptions{
		RunAsDaemon: viper.GetBool("run-as-daemon"),
		TaintOptions: &taint.Options{
			ConfigmapName:      viper.GetString("configmap-name"),
			ConfigmapNamespace: viper.GetString("configmap-namespace"),
		},
	}

	return
}

// Set up a Kubernetes client using kubeconfig (or in-cluster config if no file is provided)
func clientSetup() (clientset *client.Clientset, err error) {
	loadingRules := clientcmd.NewDefaultClientConfigLoadingRules()
	configOverrides := &clientcmd.ConfigOverrides{}
	kubeConfig := clientcmd.NewNonInteractiveDeferredLoadingClientConfig(loadingRules, configOverrides)
	config, err := kubeConfig.ClientConfig()
	if err != nil {
		return
	}
	clientset, err = client.NewForConfig(config)
	return
}

// Log human-readable output describing the current filter and option selection
func logCurrentOptions(ts *taint.Setter, options *ControllerOptions) {
	if options.RunAsDaemon {
		log.Infof("Controller Option: Running as a Daemon.")
	}
	for _, cs := range ts.Configs() {
		log.Infof("ConfigSetting %s", cs)
	}
}

// Check every node and taint any node that is not ready.
func nodeReadinessCheck(tc *taint.Controller) {
	nodes := tc.ListAllNode()
	for _, node := range nodes {
		err := tc.ProcessNode(node)
		if err != nil {
			log.Fatalf("error: %+v in node %v", err.Error(), node.Name)
		}
	}
}

func configureLogging(_ *cobra.Command, _ []string) error {
	if err := log.Configure(loggingOptions); err != nil {
		return err
	}
	return nil
}

func main() {
	if err := rootCmd.Execute(); err != nil {
		os.Exit(-1)
	}
}
@@ -0,0 +1,135 @@
# Istio Node Readiness Controller

This package adds a node readiness taint to prevent a race condition during
the installation of critical daemonset pods.

## How it works

The controller loads a user-defined configmap that lists critical labels and
their namespaces. It then watches all nodes and all pods carrying those
critical labels in those namespaces. If some of the critical pods on a node
are not ready, the controller taints the node; once all critical pods on that
node are ready, it removes the taint so that non-critical pods can be
scheduled there.
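
Whether a node is currently tainted can be inspected with standard `kubectl`
commands, for example (illustrative; `<node-name>` is a placeholder and
`NodeReadiness` is the taint key used later in this document):

```bash
# show the taints currently set on a node
kubectl get node <node-name> -o jsonpath='{.spec.taints}'
```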

## What it will and will not do

This package complements the repair controller: the repair controller by
itself cannot prevent daemonset failures, and when the istio-cni daemonset
becomes unready it cannot install iptables rules for new pods, which
reintroduces the race condition. It therefore **must work together with the
istio-cni-repair controller**.

It also supports a much more general form of node readiness checking: users
can define their own configmaps for more complex readiness checks.
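
For example, a user could list an additional critical daemonset in the same
config format shown below; this entry is purely illustrative:

```yaml
- name: kube-proxy
  selector: k8s-app=kube-proxy
  namespace: kube-system
```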

## How to use

### create a configmap to define critical labels and their namespaces

The configmap defines the namespace and label selector for each critical pod.
By default it should live in the `kube-system` namespace under the name
`node.readiness` so that the controller can find it automatically.
An example configmap layout:

```bash
./
  configs/
    config
```

The `config` file contains a list of critical pod definitions:

```yaml
- name: istio-cni
  selector: app=istio
  namespace: kube-system
```

Create the configmap from this file with `kubectl`; one possible command is
sketched below, followed by a sample of the resulting configmap.
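
A minimal sketch, assuming the `configs/config` layout above; the configmap
name is illustrative and must match whatever name the controller is configured
to read (see the `--configmap-name` flag), and the `app: istio-cni` label in
the sample below would have to be added separately, e.g. with `kubectl label`:

```bash
# create the configmap from the local config file
kubectl create configmap istio-cni-taint \
  --namespace kube-system \
  --from-file=config=configs/config
```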

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: "istio-cni-taint"
  namespace: "kube-system"
  labels:
    app: istio-cni
data:
  config: |
    - name: istio-cni
      selector: app=istio
      namespace: kube-system
```

Multiple labels in one selector are also supported:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: "istio-cni-taint"
  namespace: "kube-system"
  labels:
    app: istio-cni
data:
  config: |
    - name: istio-cni
      selector: app=istio, app=istio-cni
      namespace: kube-system
```

### configure the critical pods and add the node readiness toleration to them

```yaml
kind: DaemonSet
metadata:
  name: istio-critical-pod
  labels:
    app: istio
spec:
  # ...more...
  template:
    metadata:
      labels:
        app: istio
    spec:
      tolerations:
      - key: NodeReadiness
        operator: Exists
        effect: NoSchedule
```

### build the binary

The command line interface lives in `cni/cmd/istio-cni-taint/main.go`. Build it with:

```bash
make istioctl
```

This generates the binary for the command-line controller.

### run the command line interface for debugging and tests

Find the `istio-cni-taint` binary in your output directory and run it to start the controller:

```bash
istio-cni-taint
```
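
Based on the flags declared in `main.go` above, the controller can also be
pointed at a specific configmap or run as a one-shot readiness check instead
of a daemon, for example:

```bash
# check node readiness once against the example configmap, then exit
istio-cni-taint \
  --configmap-name=istio-cni-taint \
  --configmap-namespace=kube-system \
  --run-as-daemon=false
```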

If you want to customize a node's readiness taint, you can taint the node yourself:

```bash
kubectl taint nodes <node-name> NodeReadiness:NoSchedule
```

You also need to set the kubelet's `--register-with-taints` option so that newly
added nodes come up with the readiness taint already in place:

```bash
kubelet --register-with-taints=NodeReadiness:NoSchedule
```
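
If the kubelet is configured through a `KubeletConfiguration` file rather than
command line flags, the equivalent setting is `registerWithTaints`; a sketch
(field availability depends on your Kubernetes version):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
registerWithTaints:
- key: NodeReadiness
  effect: NoSchedule
```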