Performance improvements to concurrent Pod creation, and deletion operations
Levovar committed Sep 17, 2019
1 parent ef80ace commit c49b5d9
Showing 6 changed files with 23 additions and 13 deletions.
8 changes: 4 additions & 4 deletions README.md
@@ -176,7 +176,7 @@ buildah bud -t webhook:latest integration/docker/webhook
```
builds the respective containers. Afterwards these containers can be directly integrated into a running Kubernetes cluster!
## Deployment
The method of deploying the whole DANM suite into a Kubernetes cluster is the following.
The method of deploying the whole DANM suite into a Kubernetes cluster is the following.
**1A. Extend the Kubernetes API with the DanmNet and DanmEp CRD objects for a simplified network management experience by executing the following command from the project's root directory:**
```
kubectl create -f integration/crds/lightweight
@@ -586,7 +586,7 @@ Network administrators can configure NetworkType: NetworkID mappings into the Te
Thus it becomes guaranteed that the tenant user's network will use the right CNI configuration file during Pod creation!
#### List of validation rules
##### DanmNet
Every CREATE, and PUT DanmNet operation is subject to the following validation rules:
Every CREATE, and ~~PUT~~ (see [https://github.com/nokia/danm/issues/144](https://github.com/nokia/danm/issues/144)) DanmNet operation is subject to the following validation rules:

1. spec.Options.Cidr must be supplied in a valid IPv4 CIDR notation
2. all gateway addresses belonging to an entry of spec.Options.Routes shall be in the defined IPv4 CIDR
@@ -608,7 +608,7 @@ Every CREATE, and PUT DanmNet operation is subject to the following validation r

Not complying with any of these rules results in the denial of the provisioning operation.
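
Rules 1 and 2 above can be sketched in Go roughly as follows. This is an illustration only, not DANM's webhook code: the function name `validateCidrAndRoutes` and the representation of spec.Options.Routes as a destination-to-gateway map are assumptions made for the example.

```go
package main

import (
	"fmt"
	"net"
)

// validateCidrAndRoutes is a hypothetical helper illustrating rules 1 and 2.
// routes is assumed to map a destination subnet to its gateway address.
func validateCidrAndRoutes(cidr string, routes map[string]string) error {
	// Rule 1: the CIDR must parse, and it must be an IPv4 network.
	ip, ipnet, err := net.ParseCIDR(cidr)
	if err != nil || ip.To4() == nil {
		return fmt.Errorf("%q is not a valid IPv4 CIDR", cidr)
	}
	// Rule 2: every route gateway must fall inside the defined CIDR.
	for dst, gw := range routes {
		gwIP := net.ParseIP(gw)
		if gwIP == nil || !ipnet.Contains(gwIP) {
			return fmt.Errorf("gateway %q of route %q is outside %s", gw, dst, cidr)
		}
	}
	return nil
}

func main() {
	routes := map[string]string{"192.168.1.0/24": "10.0.1.1"}
	// The gateway is outside 10.0.0.0/24, so the check reports an error.
	fmt.Println(validateCidrAndRoutes("10.0.0.0/24", routes))
}
```

A validating webhook would reject the request when such a check returns a non-nil error, which corresponds to the denial described above.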
##### TenantNetwork
Every CREATE, and PUT TenantNetwork operation is subject to the DanmNet validation rules no. 1-9, 11, 12.
Every CREATE, and ~~PUT~~ (see [https://github.com/nokia/danm/issues/144](https://github.com/nokia/danm/issues/144)) TenantNetwork operation is subject to the DanmNet validation rules no. 1-9, 11, 12.
In addition TenantNetwork provisioning has the following extra rules:

1. spec.Options.Vlan cannot be provided
@@ -622,7 +622,7 @@ Every DELETE TenantNetwork operation is subject to the DanmNet validation rule n

Not complying with any of these rules results in the denial of the provisioning operation.
##### ClusterNetwork
Every CREATE, and PUT ClusterNetwork operation is subject to the DanmNet validation rules no. 1-11, 13-14.
Every CREATE, and ~~PUT~~ (see [https://github.com/nokia/danm/issues/144](https://github.com/nokia/danm/issues/144)) ClusterNetwork operation is subject to the DanmNet validation rules no. 1-11, 13-14.

Every DELETE ClusterNetwork operation is subject to the DanmNet validation rule no.15.

8 changes: 5 additions & 3 deletions cmd/danm/danm.go
@@ -11,9 +11,11 @@ import (
func main() {
var err error
f, err := os.OpenFile("/var/log/danm.log", os.O_RDWR | os.O_CREATE | os.O_APPEND, 0640)
if err == nil {
log.SetOutput(f)
defer f.Close()
if err != nil {
log.Println("ERROR: cannot create log file, because:" + err.Error())
}
defer f.Close()
log.SetOutput(f)
log.SetFlags(log.LstdFlags | log.Lmicroseconds)
skel.PluginMain(metacni.CreateInterfaces, metacni.GetInterfaces, metacni.DeleteInterfaces, datastructs.SupportedCniVersions, "")
}
3 changes: 2 additions & 1 deletion integration/manifests/webhook/webhook.yaml
@@ -44,7 +44,8 @@ webhooks:
# Configure your pre-generated certificate matching the details of your environment
caBundle: ${CA_BUNDLE}
rules:
- operations: ["CREATE","UPDATE"]
# UPDATE IS TEMPORARILY REMOVED DUE TO:https://github.com/nokia/danm/issues/144
- operations: ["CREATE"]
apiGroups: ["danm.k8s.io"]
apiVersions: ["v1"]
resources: ["danmnets","clusternetworks","tenantnetworks"]
3 changes: 2 additions & 1 deletion pkg/netcontrol/netcontrol.go
@@ -20,6 +20,7 @@ const(
TenantNetworkKind = "TenantNetwork"
ClusterNetworkKind = "ClusterNetwork"
)

// NetWatcher represents an object watching the K8s API for changes in all three network management API paths
// Upon the reception of a notification it handles the related VxLAN/VLAN/RT creation/deletions on the host
type NetWatcher struct {
@@ -342,7 +343,7 @@ func PutNetwork(danmClient danmclientset.Interface, dnet *danmtypes.DanmNet) (bo
return wasResourceAlreadyUpdated, errors.New("can't refresh network object because it has an invalid type:" + dnet.TypeMeta.Kind)
}
if err != nil {
if strings.Contains(err.Error(),datastructs.OptimisticLockErrorMsg) {
if strings.Contains(err.Error(), datastructs.OptimisticLockErrorMsg) {
wasResourceAlreadyUpdated = true
return wasResourceAlreadyUpdated, nil
}
11 changes: 8 additions & 3 deletions pkg/syncher/syncher.go
@@ -3,12 +3,17 @@ package syncher
import (
"errors"
"fmt"
"strconv"
"strings"
"sync"
"time"
"github.com/containernetworking/cni/pkg/types/current"
)

const (
MaximumAllowedTime = 3000
)

type cniOpResult struct {
CniName string
OpResult error
@@ -39,8 +44,8 @@ func (synch *Syncher) PushResult(cniName string, opRes error, cniRes *current.Re
}

func (synch *Syncher) GetAggregatedResult() error {
//Time-out Pod creation if a plugin did not provide result within 10 seconds
for i := 0; i < 1000; i++ {
//Time-out Pod creation if plugins did not provide results within the configured timeframe
for i := 0; i < MaximumAllowedTime; i++ {
if synch.ExpectedNumOfResults > len(synch.CniResults) {
time.Sleep(10 * time.Millisecond)
continue
@@ -50,7 +55,7 @@ func (synch *Syncher) GetAggregatedResult() error {
}
return nil
}
return errors.New("CNI operation timed-out after 10 seconds")
return errors.New("CNI operation timed-out after " + strconv.Itoa(MaximumAllowedTime) + " seconds")
}

func (synch *Syncher) wasAnyOperationErroneous() bool {
3 changes: 2 additions & 1 deletion test/uts/syncher_test/syncher_test.go
@@ -7,8 +7,9 @@ import (
"github.com/containernetworking/cni/pkg/types/current"
"github.com/nokia/danm/pkg/syncher"
)

const (
timeout = 10
timeout = syncher.MaximumAllowedTime/100
)

type result struct {