Report InstanceID for vSphere Cloud Provider as UUID obtained from product_serial file #59519

Merged

@abrarshivani abrarshivani commented Feb 8, 2018

What this PR does / why we need it:
vSphere Cloud Provider cannot find the nodes for VMs created on vSphere 6.5. Kubelet reads SystemUUID from the file /sys/class/dmi/id/product_uuid, and vSphere Cloud Provider uses this UUID as the VM identifier when it looks up node information in vCenter. vCenter 6.5 doesn't recognize these UUIDs, so the nodes are not found.

The UUID in the file /sys/class/dmi/id/product_serial is recognized by vCenter, but Kubelet doesn't report it. Therefore, this PR reports InstanceID as the UUID read from /sys/class/dmi/id/product_serial.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #58927

Special notes for your reviewer:
Internally reviewed here: vmware-archive#452

Tested:
Launched a K8s cluster using kubeadm (used an Ubuntu VM compatible with vSphere version 6.5).
Note: Installed Ubuntu from ISO.
Observed the following:

Master
> cat /sys/class/dmi/id/product_uuid
743F0E42-84EA-A2F9-7736-6106BB5DBF6B

> cat /sys/class/dmi/id/product_serial
VMware-42 0e 3f 74 ea 84 f9 a2-77 36 61 06 bb 5d bf 6b

Node
> cat /sys/class/dmi/id/product_uuid
956E0E42-CC9D-3D89-9757-F27CEB539B76

> cat /sys/class/dmi/id/product_serial
VMware-42 0e 6e 95 9d cc 89 3d-97 57 f2 7c eb 53 9b 76
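
In the output above, product_serial carries the same 16 bytes as product_uuid, but the first three groups of product_uuid appear byte-swapped. A minimal Go sketch of turning the serial string into a canonical UUID follows. The helper name `uuidFromProductSerial` and the constant `serialPrefix` are illustrative assumptions, not necessarily the code merged in this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// serialPrefix is the prefix seen in product_serial on the VMs above.
const serialPrefix = "VMware-"

// uuidFromProductSerial is a hypothetical helper: it converts the contents of
// /sys/class/dmi/id/product_serial into 8-4-4-4-12 UUID form.
func uuidFromProductSerial(serial string) (string, error) {
	s := strings.TrimSpace(serial)
	if !strings.HasPrefix(s, serialPrefix) {
		return "", fmt.Errorf("serial %q does not start with %q", serial, serialPrefix)
	}
	// Drop the prefix, then the spaces and the dash that separate the hex bytes.
	hex := strings.NewReplacer(" ", "", "-", "").Replace(s[len(serialPrefix):])
	if len(hex) != 32 {
		return "", fmt.Errorf("expected 32 hex digits, got %d in %q", len(hex), serial)
	}
	return fmt.Sprintf("%s-%s-%s-%s-%s",
		hex[0:8], hex[8:12], hex[12:16], hex[16:20], hex[20:32]), nil
}

func main() {
	// Serial taken from the Master output above.
	id, err := uuidFromProductSerial("VMware-42 0e 3f 74 ea 84 f9 a2-77 36 61 06 bb 5d bf 6b")
	if err != nil {
		panic(err)
	}
	fmt.Println(id) // 420e3f74-ea84-f9a2-7736-6106bb5dbf6b
}
```

The sketch only reformats the string; it does not read the file or validate that the characters are hexadecimal.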

With this fix, the controller manager was able to find the nodes.
Controller manager logs:

{"log":"I0205 22:43:00.106416       1 nodemanager.go:183] Found node ubuntu-node as vm=VirtualMachine:vm-95 in vc=10.161.120.115 and datacenter=vcqaDC\n","stream":"stderr","time":"2018-02-05T22:43:00.421010375Z"}

Release note:

vSphere Cloud Provider supports VMs provisioned on vSphere v1.6.5

@k8s-ci-robot added the release-note, size/M, and cncf-cla: yes labels Feb 8, 2018
@k8s-ci-robot added the approved label Feb 8, 2018
@abrarshivani
Contributor Author

/assign @divyenpatel
/assign @BaluDontu

@divyenpatel
Member

/lgtm

@k8s-ci-robot added the lgtm label Feb 8, 2018
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abrarshivani, divyenpatel

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these OWNERS Files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-github-robot

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit c0a337d into kubernetes:master Feb 8, 2018
k8s-github-robot pushed a commit that referenced this pull request Feb 24, 2018
…-upstream-release-1.9

Automatic merge from submit-queue.

Automated cherry pick of #59519: Report InstanceID for vSphere Cloud Provider as UUID obtained from product_serial file

Cherry pick of #59519 on release-1.9.

#59519: Report InstanceID for vSphere Cloud Provider as UUID obtained from product_serial file
rhockenbury pushed a commit to rhockenbury/kubernetes that referenced this pull request Mar 26, 2019
Certain versions of vSphere do not have the same value for product_uuid
and product_serial. This mimics the change in kubernetes#59519.

Fixes kubernetes#74888
rjaini added a commit to msazurestackworkloads/kubernetes that referenced this pull request Apr 15, 2019
* test: remove k8s.io/apiextensions-apiserver from framework

There are two reasons why this is useful:

1. less code to vendor into external users of the framework

The following dependencies become obsolete due to this change (from `dep`):

(8/23) Removed unused project github.com/grpc-ecosystem/go-grpc-prometheus
(9/23) Removed unused project github.com/coreos/etcd
(10/23) Removed unused project github.com/globalsign/mgo
(11/23) Removed unused project github.com/go-openapi/strfmt
(12/23) Removed unused project github.com/asaskevich/govalidator
(13/23) Removed unused project github.com/mitchellh/mapstructure
(14/23) Removed unused project github.com/NYTimes/gziphandler
(15/23) Removed unused project gopkg.in/natefinch/lumberjack.v2
(16/23) Removed unused project github.com/go-openapi/errors
(17/23) Removed unused project github.com/go-openapi/analysis
(18/23) Removed unused project github.com/go-openapi/runtime
(19/23) Removed unused project sigs.k8s.io/structured-merge-diff
(20/23) Removed unused project github.com/go-openapi/validate
(21/23) Removed unused project github.com/coreos/go-systemd
(22/23) Removed unused project github.com/go-openapi/loads
(23/23) Removed unused project github.com/munnerz/goautoneg

2. works around kubernetes#75338
   which currently breaks vendoring

Some recent changes to crd_util.go must now be pulling in the broken
k8s.io/apiextensions-apiserver packages, because it was still working
in revision 2e90d92 (as demonstrated by
https://github.com/intel/pmem-CSI/tree/586ae281ac2810cb4da6f1e160cf165c7daf0d80).

* update Bazel files

* test: fix golint warnings in crd_util.go

Because the code was moved, golint is now active. Because users of the
code must adapt to the new location of the code, it makes sense to
also change the API at the same time to address the style comments
from golint ("struct field ApiGroup should be APIGroup", same for
ApiExtensionClient).

* fix race condition issue for smb mount on windows

change var name

* stop vsphere cloud provider from spamming logs with `failed to patch IP`
Fixes: kubernetes#75236

* Remove reference to USE_RELEASE_NODE_BINARIES.

This variable was used for development purposes and was accidentally
introduced in
kubernetes@f0f7829.

This is its only use in the tree:
https://github.com/kubernetes/kubernetes/search?q=USE_RELEASE_NODE_BINARIES&unscoped_q=USE_RELEASE_NODE_BINARIES

* Clear conntrack entries on 0 -> 1 endpoint transition with externalIPs

As part of the endpoint creation process, conntrack entries are cleared when going
from 0 -> 1 endpoints. This prevents an existing conntrack entry from blocking traffic
to the service. Currently the system ignores the existence of the service's external IP
addresses, which exposes that errant behavior.

This adds the externalIP addresses of UDP services to the list of conntrack entries that
get cleared, allowing traffic to flow.

Signed-off-by: Jacob Tanenbaum <jtanenba@redhat.com>

* Move to golang 1.12.1 official image

We used 1.12.0 plus a hack to download the 1.12.1 binaries because we were in a rush
on Friday and the images had not been published at that time. Let's remove
the hack now and republish the kube-cross image.

Change-Id: I3ffff3283b6ca755320adfca3c8f4a36dc1c2b9e

* fix-kubeadm-init-output

* Mark audit e2e tests as flaky

* Bump kube-cross image to 1.12.1-2

* Restore username and password kubectl flags

* build/gci: bump CNI version to 0.7.5

* Add/Update CHANGELOG-1.14.md for v1.14.0-rc.1.

* Restore machine readability to the print-join-command output

The output of `kubeadm token create --print-join-command` should be
usable by batch scripts. This issue was pointed out in:

kubernetes/kubeadm#1454

* bump required minimum go version to 1.12.1 (strings package compatibility)

* Bump go-openapi/jsonpointer and go-openapi/jsonreference versions

xref: kubernetes#75653

Signed-off-by: Jorge Alarcon Ochoa <alarcj137@gmail.com>

* Kubernetes version v1.14.1-beta.0 openapi-spec file updates

* Add/Update CHANGELOG-1.14.md for v1.14.0.

* 1.14 release notes fixes

* Do not delete existing VS and RS when starting

* Update Cluster Autoscaler version to 1.14.0

No changes since 1.14.0-beta.2
Changelog: https://github.com/kubernetes/autoscaler/releases/tag/cluster-autoscaler-1.14.0

* Fix Windows to read VM UUIDs from serial numbers

Certain versions of vSphere do not have the same value for product_uuid
and product_serial. This mimics the change in kubernetes#59519.

Fixes kubernetes#74888

* godeps: update vmware/govmomi to v0.20 release

* vSphere: add token auth support for tags client

SAML auth support for the vCenter rest API endpoint came to govmomi
a bit after Zone support came to vSphere Cloud Provider.

Fixes kubernetes#75511

* vsphere: govmomi rest API simulator requires authentication

* gce: configure: validate SA has storage scope

If the VM SA doesn't have storage scope associated, don't use the
token in the curl request or the request will fail with 403.

* fix-external-etcd

* Update gcp images with security patches

[stackdriver addon] Bump prometheus-to-sd to v0.5.0 to pick up security fixes.
[fluentd-gcp addon] Bump fluentd-gcp-scaler to v0.5.1 to pick up security fixes.
[fluentd-gcp addon] Bump event-exporter to v0.2.4 to pick up security fixes.
[fluentd-gcp addon] Bump prometheus-to-sd to v0.5.0 to pick up security fixes.
[metadata-proxy addon] Bump prometheus-to-sd to v0.5.0 to pick up security fixes.

* kubeadm: fix "upgrade plan" not working without k8s version

If the k8s version argument passed to "upgrade plan" is missing
the logic should perform the following actions:
- fetch a "stable" version from the internet.
- if that fails, fall back to the local client version.

Currently the logic fails because the cfg.KubernetesVersion is
defaulted to the version of the existing cluster, which
then causes an early exit without any upgrade suggestions.

See app/cmd/upgrade/common.go::enforceRequirements():
  configutil.FetchInitConfigurationFromCluster(..)

Fix that by passing the explicit user value that can also be "".
This will then make the "offline getter" treat it as an explicit
desired upgrade target.

In the future it might be best to invert this logic:
- if no user k8s version argument is passed - default to the kubeadm
version.
- if labels are passed (e.g. "stable"), fetch a version from the
internet.

* Disable GCE agent address management on Windows nodes.

With this metadata key set, "GCEWindowsAgent: GCE address manager
status: disabled" will appear in the VM's serial port output during
boot.

Tested:
PROJECT=${CLOUDSDK_CORE_PROJECT} KUBE_GCE_ENABLE_IP_ALIASES=true NUM_WINDOWS_NODES=2 NUM_NODES=2 KUBERNETES_NODE_PLATFORM=windows go run ./hack/e2e.go -- --up
cluster/gce/windows/smoke-test.sh

cat > iis.yaml <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: iis
  labels:
    app: iis
spec:
  containers:
  - image: mcr.microsoft.com/windows/servercore/iis
    imagePullPolicy: IfNotPresent
    name: iis-server
    ports:
    - containerPort: 80
      protocol: TCP
  nodeSelector:
    beta.kubernetes.io/os: windows
  tolerations:
  - effect: NoSchedule
    key: node.kubernetes.io/os
    operator: Equal
    value: windows1809
EOF

kubectl create -f iis.yaml
kubectl expose pod iis --type=LoadBalancer --name=iis
kubectl get services
curl http://<service external IP address>

* kube-aggregator: bump openapi aggregation log level

* Explicitly flush headers when proxying

* fix-kubeadm-upgrade-12-13-14

* GCE/Windows: disable stackdriver logging agent

The logging service could not be stopped at times, causing node startup
failures. Disable it until the issue is fixed.

* Finish saving test results on failure

The conformance image should be saving its results
regardless of the results of the tests. However,
with errexit set, when ginkgo gets test failures
it exits 1 which prevents saving the results
for Sonobuoy to pick up.

Fixes: kubernetes#76036

* Avoid panic in cronjob sorting

This change handles the case where the ith cronjob may have its start
time set to nil.

Previously, the Less method could cause a panic in case the ith
cronjob had its start time set to nil, but the jth cronjob did not. It
would panic when calling Before on a nil StartTime.

* Removed cleanup for non-current kube-proxy modes in newProxyServer()

* Deprecated --cleanup-ipvs flag in kube-proxy

* Fixed old function signature in kube-proxy tests.

* Revert "Deprecated --cleanup-ipvs flag in kube-proxy"

This reverts commit 4f1bb2b.

* Revert "Fixed old function signature in kube-proxy tests."

This reverts commit 29ba1b0.

* Fixed --cleanup-ipvs help text

* Fix empty array expansion error in cluster/gce/util.sh

Empty array expansion causes "unbound variable" error in
bash 4.2 and bash 4.3.
rjaini added a commit to msazurestackworkloads/kubernetes that referenced this pull request May 22, 2019
* test: remove k8s.io/apiextensions-apiserver from framework

There are two reasons why this is useful:

1. less code to vendor into external users of the framework

The following dependencies become obsolete due to this change (from `dep`):

(8/23) Removed unused project github.com/grpc-ecosystem/go-grpc-prometheus
(9/23) Removed unused project github.com/coreos/etcd
(10/23) Removed unused project github.com/globalsign/mgo
(11/23) Removed unused project github.com/go-openapi/strfmt
(12/23) Removed unused project github.com/asaskevich/govalidator
(13/23) Removed unused project github.com/mitchellh/mapstructure
(14/23) Removed unused project github.com/NYTimes/gziphandler
(15/23) Removed unused project gopkg.in/natefinch/lumberjack.v2
(16/23) Removed unused project github.com/go-openapi/errors
(17/23) Removed unused project github.com/go-openapi/analysis
(18/23) Removed unused project github.com/go-openapi/runtime
(19/23) Removed unused project sigs.k8s.io/structured-merge-diff
(20/23) Removed unused project github.com/go-openapi/validate
(21/23) Removed unused project github.com/coreos/go-systemd
(22/23) Removed unused project github.com/go-openapi/loads
(23/23) Removed unused project github.com/munnerz/goautoneg

2. works around kubernetes#75338
   which currently breaks vendoring

Some recent changes to crd_util.go must now be pulling in the broken
k8s.io/apiextensions-apiserver packages, because it was still working
in revision 2e90d92 (as demonstrated by
https://github.com/intel/pmem-CSI/tree/586ae281ac2810cb4da6f1e160cf165c7daf0d80).

* update Bazel files

* test: fix golint warnings in crd_util.go

Because the code was moved, golint is now active. Because users of the
code must adapt to the new location of the code, it makes sense to
also change the API at the same time to address the style comments
from golint ("struct field ApiGroup should be APIGroup", same for
ApiExtensionClient).

* fix race condition issue for smb mount on windows

change var name

* stop vsphere cloud provider from spamming logs with `failed to patch IP`
Fixes: kubernetes#75236

* Remove reference to USE_RELEASE_NODE_BINARIES.

This variable was used for development purposes and was accidentally
introduced in
kubernetes@f0f7829.

This is its only use in the tree:
https://github.com/kubernetes/kubernetes/search?q=USE_RELEASE_NODE_BINARIES&unscoped_q=USE_RELEASE_NODE_BINARIES

* Clear conntrack entries on 0 -> 1 endpoint transition with externalIPs

As part of the endpoint creation process, conntrack entries are cleared when going
from 0 -> 1 endpoints. This prevents an existing conntrack entry from blocking traffic
to the service. Currently the system ignores the existence of the service's external IP
addresses, which exposes that errant behavior.

This adds the externalIP addresses of UDP services to the list of conntrack entries that
get cleared, allowing traffic to flow.

Signed-off-by: Jacob Tanenbaum <jtanenba@redhat.com>

* Move to golang 1.12.1 official image

We used 1.12.0 plus a hack to download the 1.12.1 binaries because we were in a rush
on Friday and the images had not been published at that time. Let's remove
the hack now and republish the kube-cross image.

Change-Id: I3ffff3283b6ca755320adfca3c8f4a36dc1c2b9e

* fix-kubeadm-init-output

* Mark audit e2e tests as flaky

* Bump kube-cross image to 1.12.1-2

* Restore username and password kubectl flags

* build/gci: bump CNI version to 0.7.5

* Add/Update CHANGELOG-1.14.md for v1.14.0-rc.1.

* Restore machine readability to the print-join-command output

The output of `kubeadm token create --print-join-command` should be
usable by batch scripts. This issue was pointed out in:

kubernetes/kubeadm#1454

* bump required minimum go version to 1.12.1 (strings package compatibility)

* Bump go-openapi/jsonpointer and go-openapi/jsonreference versions

xref: kubernetes#75653

Signed-off-by: Jorge Alarcon Ochoa <alarcj137@gmail.com>

* Kubernetes version v1.14.1-beta.0 openapi-spec file updates

* Add/Update CHANGELOG-1.14.md for v1.14.0.

* 1.14 release notes fixes

* Add flag to enable strict ARP

* Do not delete existing VS and RS when starting

* Update Cluster Autoscaler version to 1.14.0

No changes since 1.14.0-beta.2
Changelog: https://github.com/kubernetes/autoscaler/releases/tag/cluster-autoscaler-1.14.0

* Fix Windows to read VM UUIDs from serial numbers

Certain versions of vSphere do not have the same value for product_uuid
and product_serial. This mimics the change in kubernetes#59519.

Fixes kubernetes#74888

* godeps: update vmware/govmomi to v0.20 release

* vSphere: add token auth support for tags client

SAML auth support for the vCenter rest API endpoint came to govmomi
a bit after Zone support came to vSphere Cloud Provider.

Fixes kubernetes#75511

* vsphere: govmomi rest API simulator requires authentication

* gce: configure: validate SA has storage scope

If the VM SA doesn't have storage scope associated, don't use the
token in the curl request or the request will fail with 403.

* fix-external-etcd

* Update gcp images with security patches

[stackdriver addon] Bump prometheus-to-sd to v0.5.0 to pick up security fixes.
[fluentd-gcp addon] Bump fluentd-gcp-scaler to v0.5.1 to pick up security fixes.
[fluentd-gcp addon] Bump event-exporter to v0.2.4 to pick up security fixes.
[fluentd-gcp addon] Bump prometheus-to-sd to v0.5.0 to pick up security fixes.
[metadata-proxy addon] Bump prometheus-to-sd to v0.5.0 to pick up security fixes.

* kubeadm: fix "upgrade plan" not working without k8s version

If the k8s version argument passed to "upgrade plan" is missing
the logic should perform the following actions:
- fetch a "stable" version from the internet.
- if that fails, fall back to the local client version.

Currently the logic fails because the cfg.KubernetesVersion is
defaulted to the version of the existing cluster, which
then causes an early exit without any upgrade suggestions.

See app/cmd/upgrade/common.go::enforceRequirements():
  configutil.FetchInitConfigurationFromCluster(..)

Fix that by passing the explicit user value that can also be "".
This will then make the "offline getter" treat it as an explicit
desired upgrade target.

In the future it might be best to invert this logic:
- if no user k8s version argument is passed - default to the kubeadm
version.
- if labels are passed (e.g. "stable"), fetch a version from the
internet.

* Disable GCE agent address management on Windows nodes.

With this metadata key set, "GCEWindowsAgent: GCE address manager
status: disabled" will appear in the VM's serial port output during
boot.

Tested:
PROJECT=${CLOUDSDK_CORE_PROJECT} KUBE_GCE_ENABLE_IP_ALIASES=true NUM_WINDOWS_NODES=2 NUM_NODES=2 KUBERNETES_NODE_PLATFORM=windows go run ./hack/e2e.go -- --up
cluster/gce/windows/smoke-test.sh

cat > iis.yaml <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: iis
  labels:
    app: iis
spec:
  containers:
  - image: mcr.microsoft.com/windows/servercore/iis
    imagePullPolicy: IfNotPresent
    name: iis-server
    ports:
    - containerPort: 80
      protocol: TCP
  nodeSelector:
    beta.kubernetes.io/os: windows
  tolerations:
  - effect: NoSchedule
    key: node.kubernetes.io/os
    operator: Equal
    value: windows1809
EOF

kubectl create -f iis.yaml
kubectl expose pod iis --type=LoadBalancer --name=iis
kubectl get services
curl http://<service external IP address>

* kube-aggregator: bump openapi aggregation log level

* Explicitly flush headers when proxying

* fix-kubeadm-upgrade-12-13-14

* GCE/Windows: disable stackdriver logging agent

The logging service could not be stopped at times, causing node startup
failures. Disable it until the issue is fixed.

* Finish saving test results on failure

The conformance image should be saving its results
regardless of the results of the tests. However,
with errexit set, when ginkgo gets test failures
it exits 1 which prevents saving the results
for Sonobuoy to pick up.

Fixes: kubernetes#76036

* Avoid panic in cronjob sorting

This change handles the case where the ith cronjob may have its start
time set to nil.

Previously, the Less method could cause a panic in case the ith
cronjob had its start time set to nil, but the jth cronjob did not. It
would panic when calling Before on a nil StartTime.

* Removed cleanup for non-current kube-proxy modes in newProxyServer()

* Deprecated --cleanup-ipvs flag in kube-proxy

* Fixed old function signature in kube-proxy tests.

* Revert "Deprecated --cleanup-ipvs flag in kube-proxy"

This reverts commit 4f1bb2b.

* Revert "Fixed old function signature in kube-proxy tests."

This reverts commit 29ba1b0.

* Fixed --cleanup-ipvs help text

* Check for required name parameter in dynamic client

The Create, Delete, Get, Patch, Update and UpdateStatus
methods in the dynamic client all expect the name
parameter to be non-empty, but did not validate this
requirement, which could lead to a panic. Add explicit
checks to these methods.
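
An explicit check of this kind can be sketched in Go as follows. The `client` and `object` types and the error value are hypothetical and much simpler than the real dynamic client; the point is only that an empty name is rejected up front instead of causing a failure later.

```go
package main

import (
	"errors"
	"fmt"
)

// object and client are hypothetical, simplified stand-ins (assumptions).
type object struct{ Name string }

type client struct{ store map[string]object }

// errNameRequired is returned when a method is called with an empty name.
var errNameRequired = errors.New("name is required")

// Get validates the name before doing any work with it.
func (c *client) Get(name string) (object, error) {
	if len(name) == 0 {
		return object{}, errNameRequired
	}
	obj, ok := c.store[name]
	if !ok {
		return object{}, fmt.Errorf("%q not found", name)
	}
	return obj, nil
}

func main() {
	c := &client{store: map[string]object{"web": {Name: "web"}}}
	if _, err := c.Get(""); err != nil {
		fmt.Println("error:", err) // error: name is required
	}
}
```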

* Fix empty array expansion error in cluster/gce/util.sh

Empty array expansion causes "unbound variable" error in
bash 4.2 and bash 4.3.

* Improve volume operation metrics

* Add e2e tests

* ensuring that logic is checking for differences in listener

* Kubernetes version v1.14.2-beta.0 openapi-spec file updates

* Delete only unscheduled pods if node doesn't exist anymore.

* Add/Update CHANGELOG-1.14.md for v1.14.1.

* Use Node-Problem-Detector v0.6.3 on GCI

* proxy: Take into account exclude CIDRs while deleting legacy real servers

* kubeadm: Don't error out on join with --cri-socket override

In the case where newControlPlane is true we don't go through
getNodeRegistration() and initcfg.NodeRegistration.CRISocket is empty.
This forces DetectCRISocket() to be called later on, and if there is more than
one CRI installed on the system, it will error out, while asking for the user
to provide an override for the CRI socket. Even if the user provides an
override, the call to DetectCRISocket() can happen too early and thus ignore it
(while still erroring out).
However, if newControlPlane == true, initcfg.NodeRegistration is not used at
all and it's overwritten later on.
Thus it's necessary to supply some default value, that will avoid the call to
DetectCRISocket() and as initcfg.NodeRegistration is discarded, setting
whatever value here is harmless.

Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>

* Bump coreos/go-semver

The https://github.com/coreos/go-semver/ dependency has formally released
v0.3.0 at commit e214231b295a8ea9479f11b70b35d5acf3556d9b.  This is the
commit point we've been using, but the hack/verify-godeps.sh script
notices the discrepancy and causes the ci-kubernetes-verify job to fail.

Fixes: kubernetes#76526

Signed-off-by: Tim Pepper <tpepper@vmware.com>

* Fix Azure SLB support for multiple backend pools

Azure VM and vmssVM support multiple backend pools for the same SLB, but
not for different LBs.

* Restore metrics-server use of IP addresses

This preference list is used to pick the preferred address field from the k8s
node object. It was introduced in metrics-server 0.3 and changed the default
behaviour to use DNS instead of IP addresses. It was merged into k8s
1.12 and caused a breaking change by introducing a dependency on DNS
configuration.

* refactor detach azure disk retry operation

* move disk lock process to azure cloud provider

fix comments

fix import keymux check error

add unit test for attach/detach disk funcs

* Fix concurrent map access in Portworx create volume call

Fixes kubernetes#76340

Signed-off-by: Harsh Desai <harsh@portworx.com>

* Fix race condition between actual and desired state in kubelet volume manager

This PR fixes issue kubernetes#75345. The fix changes how the volume is checked in the
actual state when validating whether it can be removed from the desired state: a volume can be removed from the desired state only if it is already mounted in the actual state.
The case where mounting always fails still works, because the
check also validates whether the pod still exists in the pod manager. If
the mount fails, the pod can still be removed from the pod manager, so the
volume can also be removed from the desired state.

* fix validation message: apiServerEndpoints -> apiServerEndpoint

* add shareName param in azure file storage class

skip create azure file if it exists

* Update Cluster Autoscaler to 1.14.2

* Create the "internal" firewall rule for kubemark master.

This is equivalent to the "internal" firewall rule that is created for
the regular masters.
The main reason for doing it is to allow prometheus scraping metrics
from various kubemark master components, e.g. kubelet.

Ref. kubernetes/perf-tests#503

* fix disk list corruption issue

* Restrict builds to officially supported platforms

Prior to this change, including windows/amd64 in KUBE_BUILD_PLATFORMS
would, for example, attempt to build the server binaries/tars/images for
Windows, which is not supported. This can break downstream build steps.

* Fix verify godeps failure

github.com/evanphx/json-patch added a new tag at the same sha this
morning: https://github.com/evanphx/json-patch/releases/tag/v4.2.0

This confused godeps. This PR updates our file to match godeps
expectation.

Fixes issue 77238

* Upgrade Stackdriver Logging Agent addon image from 1.6.0 to 1.6.8.

* Test kubectl cp escape

* Properly handle links in tar

* Bump debian-iptables versions to v11.0.2.

* os exit when option is true

* Pin GCE Windows node image to 1809 v20190312.

This is to work around
kubernetes#76666.

* Update the dynamic volume limit in GCE PD

Currently GCE PD supports a maximum of 128 disks attached to a node for all
machine types except shared-core. This PR brings the limit up to
date.

Change-Id: Id9dfdbd24763b6b4138935842c246b1803838b78

* Use consistent imageRef during container startup

* Replace vmss update API with instance-level update API

commit

* Clean up code that is no longer required

* Add unit tests

* Upgrade compute API to version 2019-03-01

* Update vendors

* Fix issues because of rebase

* Pick up security patches for fluentd-gcp-scaler by upgrading to version 0.5.2

* Short-circuit quota admission rejection on zero-delta updates

* Accept admission request if resource is being deleted

* Error when etcd3 watch finds delete event with nil prevKV

* Bump addon-manager to v9.0.1 - Rebase image on debian-base:v1.0.0.

* Remove terminated pod from summary api.

Signed-off-by: Lantao Liu <lantaol@google.com>

* Expect the correct object type to be removed

* check if Memory is not nil for container stats

* Update to go 1.12.4

* Update to go 1.12.5

* Some remaining fixes.
rjaini added a commit to msazurestackworkloads/kubernetes that referenced this pull request Jun 20, 2019
* test: remove k8s.io/apiextensions-apiserver from framework

There are two reasons why this is useful:

1. less code to vendor into external users of the framework

The following dependencies become obsolete due to this change (from `dep`):

(8/23) Removed unused project github.com/grpc-ecosystem/go-grpc-prometheus
(9/23) Removed unused project github.com/coreos/etcd
(10/23) Removed unused project github.com/globalsign/mgo
(11/23) Removed unused project github.com/go-openapi/strfmt
(12/23) Removed unused project github.com/asaskevich/govalidator
(13/23) Removed unused project github.com/mitchellh/mapstructure
(14/23) Removed unused project github.com/NYTimes/gziphandler
(15/23) Removed unused project gopkg.in/natefinch/lumberjack.v2
(16/23) Removed unused project github.com/go-openapi/errors
(17/23) Removed unused project github.com/go-openapi/analysis
(18/23) Removed unused project github.com/go-openapi/runtime
(19/23) Removed unused project sigs.k8s.io/structured-merge-diff
(20/23) Removed unused project github.com/go-openapi/validate
(21/23) Removed unused project github.com/coreos/go-systemd
(22/23) Removed unused project github.com/go-openapi/loads
(23/23) Removed unused project github.com/munnerz/goautoneg

2. works around kubernetes#75338
   which currently breaks vendoring

Some recent changes to crd_util.go must now be pulling in the broken
k8s.io/apiextensions-apiserver packages, because it was still working
in revision 2e90d92 (as demonstrated by
https://github.com/intel/pmem-CSI/tree/586ae281ac2810cb4da6f1e160cf165c7daf0d80).

* update Bazel files

* test: fix golint warnings in crd_util.go

Because the code was moved, golint is now active. Because users of the
code must adapt to the new location of the code, it makes sense to
also change the API at the same time to address the style comments
from golint ("struct field ApiGroup should be APIGroup", same for
ApiExtensionClient).

* fix race condition issue for smb mount on windows

change var name

* stop vsphere cloud provider from spamming logs with `failed to patch IP`
Fixes: kubernetes#75236

* Remove reference to USE_RELEASE_NODE_BINARIES.

This variable was used for development purposes and was accidentally
introduced in
kubernetes@f0f7829.

This is its only use in the tree:
https://github.com/kubernetes/kubernetes/search?q=USE_RELEASE_NODE_BINARIES&unscoped_q=USE_RELEASE_NODE_BINARIES

* Clear conntrack entries on 0 -> 1 endpoint transition with externalIPs

As part of the endpoint creation process, conntrack entries are cleared when going
from 0 -> 1 endpoints. This prevents an existing conntrack entry from blocking traffic
to the service. Currently the system ignores the existence of the service's external IP
addresses, which exposes that errant behavior.

This adds the externalIP addresses of UDP services to the list of conntrack entries that
get cleared, allowing traffic to flow.

Signed-off-by: Jacob Tanenbaum <jtanenba@redhat.com>

* Move to golang 1.12.1 official image

We used 1.12.0 plus a hack to download the 1.12.1 binaries because we were in a rush
on Friday and the images had not been published at that time. Let's remove
the hack now and republish the kube-cross image.

Change-Id: I3ffff3283b6ca755320adfca3c8f4a36dc1c2b9e

* fix-kubeadm-init-output

* Mark audit e2e tests as flaky

* Bump kube-cross image to 1.12.1-2

* Restore username and password kubectl flags

* build/gci: bump CNI version to 0.7.5

* Add/Update CHANGELOG-1.14.md for v1.14.0-rc.1.

* Restore machine readability to the print-join-command output

The output of `kubeadm token create --print-join-command` should be
usable by batch scripts. This issue was pointed out in:

kubernetes/kubeadm#1454

* bump required minimum go version to 1.12.1 (strings package compatibility)

* Bump go-openapi/jsonpointer and go-openapi/jsonreference versions

xref: kubernetes#75653

Signed-off-by: Jorge Alarcon Ochoa <alarcj137@gmail.com>

* Kubernetes version v1.14.1-beta.0 openapi-spec file updates

* Add/Update CHANGELOG-1.14.md for v1.14.0.

* 1.14 release notes fixes

* Add flag to enable strict ARP

* Do not delete existing VS and RS when starting

* Update Cluster Autoscaler version to 1.14.0

No changes since 1.14.0-beta.2
Changelog: https://github.com/kubernetes/autoscaler/releases/tag/cluster-autoscaler-1.14.0

* Fix Windows to read VM UUIDs from serial numbers

Certain versions of vSphere do not have the same value for product_uuid
and product_serial. This mimics the change in kubernetes#59519.

Fixes kubernetes#74888
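
A minimal Go sketch of the serial-to-UUID conversion this entry relies on, assuming the serial has the `VMware-` prefixed form shown in the PR description. The function name `uuidFromSerial` is hypothetical and not taken from the Kubernetes source, and how the serial is obtained on Windows is not shown here.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// uuidFromSerial is a hypothetical helper: it turns a serial such as
// "VMware-42 0e 3f 74 ea 84 f9 a2-77 36 61 06 bb 5d bf 6b" into the
// dashed UUID form "420e3f74-ea84-f9a2-7736-6106bb5dbf6b".
func uuidFromSerial(serial string) (string, error) {
	const prefix = "VMware-"
	s := strings.TrimSpace(serial)
	if !strings.HasPrefix(s, prefix) {
		return "", errors.New("serial does not start with " + prefix)
	}
	// Drop the prefix, then the spaces and the dash between the byte groups.
	s = strings.TrimPrefix(s, prefix)
	s = strings.ReplaceAll(s, " ", "")
	s = strings.ReplaceAll(s, "-", "")
	if len(s) != 32 {
		return "", fmt.Errorf("expected 32 hex characters, got %d", len(s))
	}
	// Re-insert dashes in the usual 8-4-4-4-12 layout.
	return fmt.Sprintf("%s-%s-%s-%s-%s", s[0:8], s[8:12], s[12:16], s[16:20], s[20:32]), nil
}

func main() {
	id, err := uuidFromSerial("VMware-42 0e 3f 74 ea 84 f9 a2-77 36 61 06 bb 5d bf 6b")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(id) // prints 420e3f74-ea84-f9a2-7736-6106bb5dbf6b
}
```

Note that the result differs from the product_uuid value reported for the same VM in the PR description (743F0E42-84EA-A2F9-7736-6106BB5DBF6B), which is the mismatch the fix works around.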

* godeps: update vmware/govmomi to v0.20 release

* vSphere: add token auth support for tags client

SAML auth support for the vCenter rest API endpoint came to govmomi
a bit after Zone support came to vSphere Cloud Provider.

Fixes kubernetes#75511

* vsphere: govmomi rest API simulator requires authentication

* gce: configure: validate SA has storage scope

If the VM SA doesn't have storage scope associated, don't use the
token in the curl request or the request will fail with 403.

* fix-external-etcd

* Update gcp images with security patches

[stackdriver addon] Bump prometheus-to-sd to v0.5.0 to pick up security fixes.
[fluentd-gcp addon] Bump fluentd-gcp-scaler to v0.5.1 to pick up security fixes.
[fluentd-gcp addon] Bump event-exporter to v0.2.4 to pick up security fixes.
[fluentd-gcp addon] Bump prometheus-to-sd to v0.5.0 to pick up security fixes.
[metadata-proxy addon] Bump prometheus-to-sd to v0.5.0 to pick up security fixes.

* kubeadm: fix "upgrade plan" not working without k8s version

If the k8s version argument passed to "upgrade plan" is missing
the logic should perform the following actions:
- fetch a "stable" version from the internet.
- if that fails, fall back to the local client version.

Currently the logic fails because the cfg.KubernetesVersion is
defaulted to the version of the existing cluster, which
then causes an early exit without any upgrade suggestions.

See app/cmd/upgrade/common.go::enforceRequirements():
  configutil.FetchInitConfigurationFromCluster(..)

Fix that by passing the explicit user value that can also be "".
This will then make the "offline getter" treat it as an explicit
desired upgrade target.

In the future it might be best to invert this logic:
- if no user k8s version argument is passed - default to the kubeadm
version.
- if labels are passed (e.g. "stable"), fetch a version from the
internet.

* Disable GCE agent address management on Windows nodes.

With this metadata key set, "GCEWindowsAgent: GCE address manager
status: disabled" will appear in the VM's serial port output during
boot.

Tested:
PROJECT=${CLOUDSDK_CORE_PROJECT} KUBE_GCE_ENABLE_IP_ALIASES=true NUM_WINDOWS_NODES=2 NUM_NODES=2 KUBERNETES_NODE_PLATFORM=windows go run ./hack/e2e.go -- --up
cluster/gce/windows/smoke-test.sh

cat > iis.yaml <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: iis
  labels:
    app: iis
spec:
  containers:
  - image: mcr.microsoft.com/windows/servercore/iis
    imagePullPolicy: IfNotPresent
    name: iis-server
    ports:
    - containerPort: 80
      protocol: TCP
  nodeSelector:
    beta.kubernetes.io/os: windows
  tolerations:
  - effect: NoSchedule
    key: node.kubernetes.io/os
    operator: Equal
    value: windows1809
EOF

kubectl create -f iis.yaml
kubectl expose pod iis --type=LoadBalancer --name=iis
kubectl get services
curl http://<service external IP address>

* kube-aggregator: bump openapi aggregation log level

* Explicitly flush headers when proxying

* fix-kubeadm-upgrade-12-13-14

* GCE/Windows: disable stackdriver logging agent

The logging service could not be stopped at times, causing node startup
failures. Disable it until the issue is fixed.

* Finish saving test results on failure

The conformance image should be saving its results
regardless of the results of the tests. However,
with errexit set, when ginkgo gets test failures
it exits 1 which prevents saving the results
for Sonobuoy to pick up.

Fixes: kubernetes#76036

* Avoid panic in cronjob sorting

This change handles the case where the ith cronjob may have its start
time set to nil.

Previously, the Less method could cause a panic in case the ith
cronjob had its start time set to nil, but the jth cronjob did not. It
would panic when calling Before on a nil StartTime.
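
The nil handling described above can be sketched in Go roughly as follows. This is a simplified illustration, not the actual Kubernetes code: the `job` type, the `byStartTime` sorter and `demoOrder` are hypothetical names, and a plain `*time.Time` stands in for the real start-time field.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// job is a hypothetical stand-in; only the name and start time matter here.
type job struct {
	name      string
	startTime *time.Time
}

// byStartTime sorts jobs by start time, placing jobs without one last.
type byStartTime []job

func (o byStartTime) Len() int      { return len(o) }
func (o byStartTime) Swap(i, j int) { o[i], o[j] = o[j], o[i] }

func (o byStartTime) Less(i, j int) bool {
	a, b := o[i].startTime, o[j].startTime
	switch {
	case a == nil && b == nil:
		return o[i].name < o[j].name // neither started: fall back to name
	case a == nil:
		return false // the ith job has no start time: it sorts after j
	case b == nil:
		return true // only the jth job has no start time
	case a.Equal(*b):
		return o[i].name < o[j].name
	default:
		return a.Before(*b) // safe: both pointers are non-nil here
	}
}

// demoOrder sorts three sample jobs and returns their names in order.
func demoOrder() []string {
	early := time.Unix(100, 0)
	late := time.Unix(200, 0)
	jobs := []job{
		{name: "unstarted", startTime: nil},
		{name: "late", startTime: &late},
		{name: "early", startTime: &early},
	}
	sort.Sort(byStartTime(jobs))
	names := make([]string, 0, len(jobs))
	for _, j := range jobs {
		names = append(names, j.name)
	}
	return names
}

func main() {
	fmt.Println(demoOrder()) // prints [early late unstarted]
}
```

Checking both pointers before calling `Before` is what removes the panic; the choice to sort unstarted jobs last is an assumption of this sketch.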

* Removed cleanup for non-current kube-proxy modes in newProxyServer()

* Deprecated --cleanup-ipvs flag in kube-proxy

* Fixed old function signature in kube-proxy tests.

* Revert "Deprecated --cleanup-ipvs flag in kube-proxy"

This reverts commit 4f1bb2b.

* Revert "Fixed old function signature in kube-proxy tests."

This reverts commit 29ba1b0.

* Fixed --cleanup-ipvs help text

* Check for required name parameter in dynamic client

The Create, Delete, Get, Patch, Update and UpdateStatus
methods in the dynamic client all expect the name
parameter to be non-empty, but did not validate this
requirement, which could lead to a panic. Add explicit
checks to these methods.

* Fix empty array expansion error in cluster/gce/util.sh

Empty array expansion causes "unbound variable" error in
bash 4.2 and bash 4.3.

* Improve volume operation metrics

* Add e2e tests

* ensuring that logic is checking for differences in listener

* Kubernetes version v1.14.2-beta.0 openapi-spec file updates

* Delete only unscheduled pods if node doesn't exist anymore.

* Add/Update CHANGELOG-1.14.md for v1.14.1.

* Use Node-Problem-Detector v0.6.3 on GCI

* proxy: Take into account exclude CIDRs while deleting legacy real servers

* kubeadm: Don't error out on join with --cri-socket override

In the case where newControlPlane is true we don't go through
getNodeRegistration() and initcfg.NodeRegistration.CRISocket is empty.
This forces DetectCRISocket() to be called later on, and if there is more than
one CRI installed on the system, it will error out, while asking for the user
to provide an override for the CRI socket. Even if the user provides an
override, the call to DetectCRISocket() can happen too early and thus ignore it
(while still erroring out).
However, if newControlPlane == true, initcfg.NodeRegistration is not used at
all and it's overwritten later on.
Thus it's necessary to supply some default value, that will avoid the call to
DetectCRISocket() and as initcfg.NodeRegistration is discarded, setting
whatever value here is harmless.

Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>

* Bump coreos/go-semver

The https://github.com/coreos/go-semver/ dependency has formally released
v0.3.0 at commit e214231b295a8ea9479f11b70b35d5acf3556d9b.  This is the
commit point we've been using, but the hack/verify-godeps.sh script
notices the discrepancy and causes ci-kubernetes-verify job to fail.

Fixes: kubernetes#76526

Signed-off-by: Tim Pepper <tpepper@vmware.com>

* Fix Azure SLB support for multiple backend pools

Azure VM and vmssVM support multiple backend pools for the same SLB, but
not for different LBs.

* Restore metrics-server using of IP addresses

This preference list is used to pick the preferred field from the k8s
node object. It was introduced in metrics-server 0.3 and changed the default
behaviour to use DNS instead of IP addresses. It was merged into k8s
1.12 and caused a breaking change by introducing a dependency on DNS
configuration.

* refactor detach azure disk retry operation

* move disk lock process to azure cloud provider

fix comments

fix import keymux check error

add unit test for attach/detach disk funcs

* Fix concurrent map access in Portworx create volume call

Fixes kubernetes#76340

Signed-off-by: Harsh Desai <harsh@portworx.com>

* Fix race condition between actual and desired state in kubelet volume manager

This PR fixes issue kubernetes#75345. The fix changes how the volume is checked in
the actual state when validating whether it can be removed from the desired state:
a volume can be removed from the desired state only if it is already mounted in the
actual state.
This still works when mounting always fails, because the check also validates
whether the pod still exists in the pod manager. When mounting fails, the pod
should be removable from the pod manager, so the volume can also be removed
from the desired state.

* fix validation message: apiServerEndpoints -> apiServerEndpoint

* add shareName param in azure file storage class

skip create azure file if it exists

* Update Cluster Autoscaler to 1.14.2

* Create the "internal" firewall rule for kubemark master.

This is equivalent to the "internal" firewall rule that is created for
the regular masters.
The main reason for doing it is to allow prometheus scraping metrics
from various kubemark master components, e.g. kubelet.

Ref. kubernetes/perf-tests#503

* fix disk list corruption issue

* Restrict builds to officially supported platforms

Prior to this change, including windows/amd64 in KUBE_BUILD_PLATFORMS
would, for example, attempt to build the server binaries/tars/images for
Windows, which is not supported. This can break downstream build steps.

* Fix verify godeps failure

github.com/evanphx/json-patch added a new tag at the same sha this
morning: https://github.com/evanphx/json-patch/releases/tag/v4.2.0

This confused godeps. This PR updates our file to match godeps
expectation.

Fixes issue 77238

* Upgrade Stackdriver Logging Agent addon image from 1.6.0 to 1.6.8.

* Test kubectl cp escape

* Properly handle links in tar

* Bump debian-iptables versions to v11.0.2.

* os exit when option is true

* Pin GCE Windows node image to 1809 v20190312.

This is to work around
kubernetes#76666.

* Update the dynamic volume limit in GCE PD

Currently GCE PD supports a maximum of 128 disks attached to a node for all
machine types except shared-core. This PR brings the limit up to
date.

Change-Id: Id9dfdbd24763b6b4138935842c246b1803838b78

* Use consistent imageRef during container startup

* Replace vmss update API with instance-level update API

commit

* Clean up code that is not required any more

* Add unit tests

* Upgrade compute API to version 2019-03-01

* Update vendors

* Fix issues because of rebase

* Pick up security patches for fluentd-gcp-scaler by upgrading to version 0.5.2

* Short-circuit quota admission rejection on zero-delta updates

* Accept admission request if resource is being deleted

* Error when etcd3 watch finds delete event with nil prevKV

* Bump addon-manager to v9.0.1 - Rebase image on debian-base:v1.0.0.

* Remove terminated pod from summary api.

Signed-off-by: Lantao Liu <lantaol@google.com>

* Expect the correct object type to be removed

* check if Memory is not nil for container stats

* Fix eviction dry-run

* Update k8s-dns-node-cache image version

This revised image resolves kubernetes dns#292 by updating the image from `k8s-dns-node-cache:1.15.2` to `k8s-dns-node-cache:1.15.2`

* Update to go 1.12.4

* Update to go 1.12.5

* fix incorrect prometheus metrics

fix left incorrect metrics

* In GuaranteedUpdate, retry on any error if we are working with stale data

* BoundServiceAccountTokenVolume: fix InClusterConfig

* Don't create a RuntimeClassManager without a KubeClient

* Kubernetes version v1.14.3-beta.0 openapi-spec file updates

* Add/Update CHANGELOG-1.14.md for v1.14.2.

* fix CVE-2019-11244: `kubectl --http-cache=<world-accessible dir>` creates world-writeable cached schema files

* Upgrade Azure network API version to 2018-07-01

* Update godeps

* Terminate watchers when watch cache is destroyed

* honor overridden tokenfile, add InClusterConfig override tests

* Don't use mapfile as it isn't bash 3 compatible

* fix unbound array variable

* fix unbound variable release.sh

* Don't use declare -g in build

* Check KUBE_SERVER_PLATFORMS existence

When compiling kubectl on a platform other than
linux/amd64, we need to check whether the KUBE_SERVER_PLATFORMS
array is empty before assigning it.

The example command is:
make WHAT=cmd/kubectl KUBE_BUILD_PLATFORMS="darwin/amd64 windows/amd64"

* Backport of kubernetes#78137: godeps: update vmware/govmomi to v0.20.1

Cannot cherry-pick kubernetes#78137 (go mod vs godep)

Includes fix for SAML token auth with vSphere and zones API

Issue kubernetes#77360

See also: kubernetes#75742

* fix: failed to close kubelet->API connections on heartbeat failure

* Revert "Use consistent imageRef during container startup"

This reverts commit 26e3c86.

* fix azure retry issue when return 2XX with error

fix comments

* Disable graceful termination for udp
rjaini added a commit to msazurestackworkloads/kubernetes that referenced this pull request Jul 11, 2019
* test: remove k8s.io/apiextensions-apiserver from framework

There are two reasons why this is useful:

1. less code to vendor into external users of the framework

The following dependencies become obsolete due to this change (from `dep`):

(8/23) Removed unused project github.com/grpc-ecosystem/go-grpc-prometheus
(9/23) Removed unused project github.com/coreos/etcd
(10/23) Removed unused project github.com/globalsign/mgo
(11/23) Removed unused project github.com/go-openapi/strfmt
(12/23) Removed unused project github.com/asaskevich/govalidator
(13/23) Removed unused project github.com/mitchellh/mapstructure
(14/23) Removed unused project github.com/NYTimes/gziphandler
(15/23) Removed unused project gopkg.in/natefinch/lumberjack.v2
(16/23) Removed unused project github.com/go-openapi/errors
(17/23) Removed unused project github.com/go-openapi/analysis
(18/23) Removed unused project github.com/go-openapi/runtime
(19/23) Removed unused project sigs.k8s.io/structured-merge-diff
(20/23) Removed unused project github.com/go-openapi/validate
(21/23) Removed unused project github.com/coreos/go-systemd
(22/23) Removed unused project github.com/go-openapi/loads
(23/23) Removed unused project github.com/munnerz/goautoneg

2. works around kubernetes#75338
   which currently breaks vendoring

Some recent changes to crd_util.go must now be pulling in the broken
k8s.io/apiextensions-apiserver packages, because it was still working
in revision 2e90d92 (as demonstrated by
https://github.com/intel/pmem-CSI/tree/586ae281ac2810cb4da6f1e160cf165c7daf0d80).

* update Bazel files

* test: fix golint warnings in crd_util.go

Because the code was moved, golint is now active. Because users of the
code must adapt to the new location of the code, it makes sense to
also change the API at the same time to address the style comments
from golint ("struct field ApiGroup should be APIGroup", same for
ApiExtensionClient).

* fix race condition issue for smb mount on windows

change var name

* stop vsphere cloud provider from spamming logs with `failed to patch IP`
Fixes: kubernetes#75236

* Remove reference to USE_RELEASE_NODE_BINARIES.

This variable was used for development purposes and was accidentally
introduced in
kubernetes@f0f7829.

This is its only use in the tree:
https://github.com/kubernetes/kubernetes/search?q=USE_RELEASE_NODE_BINARIES&unscoped_q=USE_RELEASE_NODE_BINARIES

* Clear conntrack entries on 0 -> 1 endpoint transition with externalIPs

As part of the endpoint creation process when going from 0 -> 1 conntrack entries
are cleared. This is to prevent an existing conntrack entry from preventing traffic
to the service. Currently the system ignores the existence of the service's external IP
addresses, which exposes that errant behavior.

This adds the externalIP addresses of UDP services to the list of conntrack entries that
get cleared, allowing traffic to flow.

Signed-off-by: Jacob Tanenbaum <jtanenba@redhat.com>

* Move to golang 1.12.1 official image

We used 1.12.0 plus a hack to download the 1.12.1 binaries, as we were in a rush
on Friday and the images had not been published at that time. Let's remove
the hack now and republish the kube-cross image.

Change-Id: I3ffff3283b6ca755320adfca3c8f4a36dc1c2b9e

* fix-kubeadm-init-output

* Mark audit e2e tests as flaky

* Bump kube-cross image to 1.12.1-2

* Restore username and password kubectl flags

* build/gci: bump CNI version to 0.7.5

* Add/Update CHANGELOG-1.14.md for v1.14.0-rc.1.

* Restore machine readability to the print-join-command output

The output of `kubeadm token create --print-join-command` should be
usable by batch scripts. This issue was pointed out in:

kubernetes/kubeadm#1454

* bump required minimum go version to 1.12.1 (strings package compatibility)

* Bump go-openapi/jsonpointer and go-openapi/jsonreference versions

xref: kubernetes#75653

Signed-off-by: Jorge Alarcon Ochoa <alarcj137@gmail.com>

* Kubernetes version v1.14.1-beta.0 openapi-spec file updates

* Add/Update CHANGELOG-1.14.md for v1.14.0.

* 1.14 release notes fixes

* Add flag to enable strict ARP

* Do not delete existing VS and RS when starting

* Update Cluster Autoscaler version to 1.14.0

No changes since 1.14.0-beta.2
Changelog: https://github.com/kubernetes/autoscaler/releases/tag/cluster-autoscaler-1.14.0

* Fix Windows to read VM UUIDs from serial numbers

Certain versions of vSphere do not have the same value for product_uuid
and product_serial. This mimics the change in kubernetes#59519.

Fixes kubernetes#74888

* godeps: update vmware/govmomi to v0.20 release

* vSphere: add token auth support for tags client

SAML auth support for the vCenter rest API endpoint came to govmomi
a bit after Zone support came to vSphere Cloud Provider.

Fixes kubernetes#75511

* vsphere: govmomi rest API simulator requires authentication

* gce: configure: validate SA has storage scope

If the VM SA doesn't have storage scope associated, don't use the
token in the curl request or the request will fail with 403.

* fix-external-etcd

* Update gcp images with security patches

[stackdriver addon] Bump prometheus-to-sd to v0.5.0 to pick up security fixes.
[fluentd-gcp addon] Bump fluentd-gcp-scaler to v0.5.1 to pick up security fixes.
[fluentd-gcp addon] Bump event-exporter to v0.2.4 to pick up security fixes.
[fluentd-gcp addon] Bump prometheus-to-sd to v0.5.0 to pick up security fixes.
[metadata-proxy addon] Bump prometheus-to-sd to v0.5.0 to pick up security fixes.

* kubeadm: fix "upgrade plan" not working without k8s version

If the k8s version argument passed to "upgrade plan" is missing
the logic should perform the following actions:
- fetch a "stable" version from the internet.
- if that fails, fall back to the local client version.

Currently the logic fails because the cfg.KubernetesVersion is
defaulted to the version of the existing cluster, which
then causes an early exit without any upgrade suggestions.

See app/cmd/upgrade/common.go::enforceRequirements():
  configutil.FetchInitConfigurationFromCluster(..)

Fix that by passing the explicit user value that can also be "".
This will then make the "offline getter" treat it as an explicit
desired upgrade target.

In the future it might be best to invert this logic:
- if no user k8s version argument is passed - default to the kubeadm
version.
- if labels are passed (e.g. "stable"), fetch a version from the
internet.

* Disable GCE agent address management on Windows nodes.

With this metadata key set, "GCEWindowsAgent: GCE address manager
status: disabled" will appear in the VM's serial port output during
boot.

Tested:
PROJECT=${CLOUDSDK_CORE_PROJECT} KUBE_GCE_ENABLE_IP_ALIASES=true NUM_WINDOWS_NODES=2 NUM_NODES=2 KUBERNETES_NODE_PLATFORM=windows go run ./hack/e2e.go -- --up
cluster/gce/windows/smoke-test.sh

cat > iis.yaml <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: iis
  labels:
    app: iis
spec:
  containers:
  - image: mcr.microsoft.com/windows/servercore/iis
    imagePullPolicy: IfNotPresent
    name: iis-server
    ports:
    - containerPort: 80
      protocol: TCP
  nodeSelector:
    beta.kubernetes.io/os: windows
  tolerations:
  - effect: NoSchedule
    key: node.kubernetes.io/os
    operator: Equal
    value: windows1809
EOF

kubectl create -f iis.yaml
kubectl expose pod iis --type=LoadBalancer --name=iis
kubectl get services
curl http://<service external IP address>

* kube-aggregator: bump openapi aggregation log level

* Explicitly flush headers when proxying

* fix-kubeadm-upgrade-12-13-14

* GCE/Windows: disable stackdriver logging agent

The logging service could not be stopped at times, causing node startup
failures. Disable it until the issue is fixed.

* Finish saving test results on failure

The conformance image should be saving its results
regardless of the results of the tests. However,
with errexit set, when ginkgo gets test failures
it exits 1 which prevents saving the results
for Sonobuoy to pick up.

Fixes: kubernetes#76036

* Avoid panic in cronjob sorting

This change handles the case where the ith cronjob may have its start
time set to nil.

Previously, the Less method could cause a panic in case the ith
cronjob had its start time set to nil, but the jth cronjob did not. It
would panic when calling Before on a nil StartTime.

* Removed cleanup for non-current kube-proxy modes in newProxyServer()

* Deprecated --cleanup-ipvs flag in kube-proxy

* Fixed old function signature in kube-proxy tests.

* Revert "Deprecated --cleanup-ipvs flag in kube-proxy"

This reverts commit 4f1bb2b.

* Revert "Fixed old function signature in kube-proxy tests."

This reverts commit 29ba1b0.

* Fixed --cleanup-ipvs help text

* Check for required name parameter in dynamic client

The Create, Delete, Get, Patch, Update and UpdateStatus
methods in the dynamic client all expect the name
parameter to be non-empty, but did not validate this
requirement, which could lead to a panic. Add explicit
checks to these methods.

* Fix empty array expansion error in cluster/gce/util.sh

Empty array expansion causes "unbound variable" error in
bash 4.2 and bash 4.3.

* Improve volume operation metrics

* Add e2e tests

* ensuring that logic is checking for differences in listener

* Kubernetes version v1.14.2-beta.0 openapi-spec file updates

* Delete only unscheduled pods if node doesn't exist anymore.

* Add/Update CHANGELOG-1.14.md for v1.14.1.

* Use Node-Problem-Detector v0.6.3 on GCI

* proxy: Take into account exclude CIDRs while deleting legacy real servers

* kubeadm: Don't error out on join with --cri-socket override

In the case where newControlPlane is true we don't go through
getNodeRegistration() and initcfg.NodeRegistration.CRISocket is empty.
This forces DetectCRISocket() to be called later on, and if there is more than
one CRI installed on the system, it will error out, while asking for the user
to provide an override for the CRI socket. Even if the user provides an
override, the call to DetectCRISocket() can happen too early and thus ignore it
(while still erroring out).
However, if newControlPlane == true, initcfg.NodeRegistration is not used at
all and it's overwritten later on.
Thus it's necessary to supply some default value, that will avoid the call to
DetectCRISocket() and as initcfg.NodeRegistration is discarded, setting
whatever value here is harmless.

Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>

* Bump coreos/go-semver

The https://github.com/coreos/go-semver/ dependency has formally released
v0.3.0 at commit e214231b295a8ea9479f11b70b35d5acf3556d9b.  This is the
commit point we've been using, but the hack/verify-godeps.sh script
notices the discrepancy and causes ci-kubernetes-verify job to fail.

Fixes: kubernetes#76526

Signed-off-by: Tim Pepper <tpepper@vmware.com>

* Fix Azure SLB support for multiple backend pools

Azure VM and vmssVM support multiple backend pools for the same SLB, but
not for different LBs.

* Restore metrics-server using of IP addresses

This preference list is used to pick the preferred field from the k8s
node object. It was introduced in metrics-server 0.3 and changed the default
behaviour to use DNS instead of IP addresses. It was merged into k8s
1.12 and caused a breaking change by introducing a dependency on DNS
configuration.

* refactor detach azure disk retry operation

* move disk lock process to azure cloud provider

fix comments

fix import keymux check error

add unit test for attach/detach disk funcs

* Fix concurrent map access in Portworx create volume call

Fixes kubernetes#76340

Signed-off-by: Harsh Desai <harsh@portworx.com>

* Fix race condition between actual and desired state in kubelet volume manager

This PR fixes issue kubernetes#75345. The fix changes how the volume is checked in
the actual state when validating whether it can be removed from the desired state:
a volume can be removed from the desired state only if it is already mounted in the
actual state.
This still works when mounting always fails, because the check also validates
whether the pod still exists in the pod manager. When mounting fails, the pod
should be removable from the pod manager, so the volume can also be removed
from the desired state.

* fix validation message: apiServerEndpoints -> apiServerEndpoint

* add shareName param in azure file storage class

skip create azure file if it exists

* Update Cluster Autoscaler to 1.14.2

* Create the "internal" firewall rule for kubemark master.

This is equivalent to the "internal" firewall rule that is created for
the regular masters.
The main reason for doing it is to allow prometheus scraping metrics
from various kubemark master components, e.g. kubelet.

Ref. kubernetes/perf-tests#503

* fix disk list corruption issue

* Restrict builds to officially supported platforms

Prior to this change, including windows/amd64 in KUBE_BUILD_PLATFORMS
would, for example, attempt to build the server binaries/tars/images for
Windows, which is not supported. This can break downstream build steps.

* Fix verify godeps failure

github.com/evanphx/json-patch added a new tag at the same sha this
morning: https://github.com/evanphx/json-patch/releases/tag/v4.2.0

This confused godeps. This PR updates our file to match godeps
expectation.

Fixes issue 77238

* Upgrade Stackdriver Logging Agent addon image from 1.6.0 to 1.6.8.

* Test kubectl cp escape

* Properly handle links in tar

* Bump debian-iptables versions to v11.0.2.

* os exit when option is true

* Pin GCE Windows node image to 1809 v20190312.

This is to work around
kubernetes#76666.

* Update the dynamic volume limit in GCE PD

Currently GCE PD supports a maximum of 128 disks attached to a node for all
machine types except shared-core. This PR brings the limit up to
date.

Change-Id: Id9dfdbd24763b6b4138935842c246b1803838b78

* Use consistent imageRef during container startup

* Replace vmss update API with instance-level update API

commit

* Clean up code that is not required any more

* Add unit tests

* Upgrade compute API to version 2019-03-01

* Update vendors

* Fix issues because of rebase

* Pick up security patches for fluentd-gcp-scaler by upgrading to version 0.5.2

* Short-circuit quota admission rejection on zero-delta updates

* Accept admission request if resource is being deleted

* Error when etcd3 watch finds delete event with nil prevKV

* Bump addon-manager to v9.0.1 - Rebase image on debian-base:v1.0.0.

* Remove terminated pod from summary api.

Signed-off-by: Lantao Liu <lantaol@google.com>

* Expect the correct object type to be removed

* check if Memory is not nil for container stats

* Fix eviction dry-run

* Update k8s-dns-node-cache image version

This revised image resolves kubernetes dns#292 by updating the image from `k8s-dns-node-cache:1.15.2` to `k8s-dns-node-cache:1.15.2`

* Update to go 1.12.4

* Update to go 1.12.5

* Bump ip-masq-agent version to v2.3.0

* fix incorrect prometheus metrics

fix left incorrect metrics

* In GuaranteedUpdate, retry on any error if we are working with stale data

* BoundServiceAccountTokenVolume: fix InClusterConfig

* Don't create a RuntimeClassManager without a KubeClient

* Kubernetes version v1.14.3-beta.0 openapi-spec file updates

* Add/Update CHANGELOG-1.14.md for v1.14.2.

* fix CVE-2019-11244: `kubectl --http-cache=<world-accessible dir>` creates world-writeable cached schema files

* Upgrade Azure network API version to 2018-07-01

* Update godeps

* Terminate watchers when watch cache is destroyed

* honor overridden tokenfile, add InClusterConfig override tests

* Don't use mapfile as it isn't bash 3 compatible

* fix unbound array variable

* fix unbound variable release.sh

* Don't use declare -g in build

* Check KUBE_SERVER_PLATFORMS existence

When compiling kubectl on a platform other than
linux/amd64, we need to check whether the KUBE_SERVER_PLATFORMS
array is empty before assigning it.

The example command is:
make WHAT=cmd/kubectl KUBE_BUILD_PLATFORMS="darwin/amd64 windows/amd64"

* Backport of kubernetes#78137: godeps: update vmware/govmomi to v0.20.1

Cannot cherry-pick kubernetes#78137 (go mod vs godep)

Includes fix for SAML token auth with vSphere and zones API

Issue kubernetes#77360

See also: kubernetes#75742

* fix: failed to close kubelet->API connections on heartbeat failure

* Revert "Use consistent imageRef during container startup"

This reverts commit 26e3c86.

* fix azure retry issue when return 2XX with error

fix comments

* Disable graceful termination for udp

* cherry pick of 017f57a, had to do a very simple merge of BUILD

* Fix memory leak from not closing hcs container handles

* Fix volume mount tests issue for windows

For Windows nodes, security context is disabled. This PR fixes a bug so
that fsGroup will not be applied to pods that run on Windows nodes.

Change-Id: Id9870416d2ad8ef791b3b4896d6747a2adbada2f

* Kubernetes version v1.14.4-beta.0 openapi-spec file updates

* Add/Update CHANGELOG-1.14.md for v1.14.3.

* Fix kubectl apply skew test with extra properties

* fix: update vm if detach a non-existing disk

fix gofmt issue

* picked up extra unnecessary dep in merge

at least verify build thinks it's unnecessary

* Move CSIDriver Lister to the controller

* Fix incorrect procMount defaulting

* vSphere: allow SAML token delegation

Issue kubernetes#77360

* Use any host that mounts the datastore to create Volume

This change also makes zones work per datacenter and cleans up dummy VMs.
There can be multiple datastores found for a given name, because the datastore name is
unique only within a datacenter. So this commit returns a list of datastores
for a given datastore name in the FindDatastoreByName() method. The callers are
responsible for handling or finding the right datastore to use among those returned.

* ipvs: fix string check for IPVS protocol during graceful termination

Signed-off-by: Andrew Sy Kim <kiman@vmware.com>

* fix flexvol stuck issue due to corrupted mnt point

fix comments about PathExists

fix comments

revert change in PathExists func

* Avoid the default server mux

* Ignore cgroup pid support if related feature gates are disabled

* kubelet: retry pod sandbox creation when containers were never created

If kubelet never gets past sandbox creation (i.e., never attempted to
create containers for a pod), it should retry the sandbox creation on
failure, regardless of the restart policy of the pod.

* Default resourceGroup should be used when value of annotation azure-load-balancer-resource-group is empty string

* fix kubelet can not delete orphaned pod directory when the kubelet's root directory symbolically links to another device's directory

* Allow unit test to pass on machines without ipv6

* Fix AWS DHCP option set domain names causing garbled InternalDNS or Hostname addresses on Node

* Fix closing of dirs in doSafeMakeDir

This fixes the issue where "childFD" from syscall.Openat is assigned to
a local variable inside the for loop, instead of the correct one in the
function scope. As a result, when the function later tries to close the "childFD"
in the function scope, it is equal to "-1" instead of the correct
value.
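
A small Go sketch of the shadowing pattern described above. It does not use real file descriptors: `open` is a hypothetical stand-in for syscall.Openat, and `shadowed` and `fixed` are illustrative names rather than functions from the Kubernetes source.

```go
package main

import "fmt"

// open is a hypothetical stand-in for syscall.Openat; it returns a fake descriptor.
func open(i int) (int, error) { return 100 + i, nil }

// shadowed reproduces the bug: ":=" inside the loop declares a new childFD,
// so the function-scope variable keeps its initial value of -1.
func shadowed(n int) int {
	childFD := -1
	for i := 0; i < n; i++ {
		childFD, err := open(i) // new loop-local childFD shadows the outer one
		if err != nil {
			return -1
		}
		_ = childFD
	}
	return childFD
}

// fixed assigns to the function-scope variable with "=" instead.
func fixed(n int) int {
	childFD := -1
	for i := 0; i < n; i++ {
		var err error
		childFD, err = open(i) // updates the outer childFD
		if err != nil {
			return -1
		}
	}
	return childFD
}

func main() {
	fmt.Println(shadowed(3)) // prints -1
	fmt.Println(fixed(3))    // prints 102
}
```

Any cleanup that closes the function-scope descriptor only sees the right value in the `fixed` variant.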

* There are various reasons that the HPA will decide not to change the
current scale. Two important ones are when missing metrics might
change the direction of scaling, and when the recommended scale is
within tolerance of the current scale.

The way that ReplicaCalculator signals its desire to not change the
current scale is by returning the current scale. However the current
scale is from scale.Status.Replicas and can be larger than
scale.Spec.Replicas (e.g. during Deployment rollout with configured
surge). This causes a positive feedback loop because
scale.Status.Replicas is written back into scale.Spec.Replicas,
further increasing the current scale.

This PR fixes the feedback loop by plumbing the replica count from
spec through horizontal.go and replica_calculator.go so the calculator
can punt with the right value.
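
The feedback loop can be illustrated with a deliberately simplified Go sketch. The function `desiredReplicas` and its formula are hypothetical and far simpler than the real ReplicaCalculator; the only property carried over from the text is that the calculator returns the count it was given when it decides not to scale.

```go
package main

import (
	"fmt"
	"math"
)

// desiredReplicas is a hypothetical, simplified calculator. When the usage
// ratio is within tolerance of 1.0 it "punts" by returning the replica count
// it was handed, so the caller must hand it the right count.
func desiredReplicas(current int32, usageRatio, tolerance float64) int32 {
	if math.Abs(1.0-usageRatio) <= tolerance {
		return current
	}
	return int32(math.Ceil(usageRatio * float64(current)))
}

func main() {
	specReplicas := int32(10)   // what the user or HPA asked for
	statusReplicas := int32(13) // temporarily higher, e.g. surge during a rollout

	// Passing the status count: the "no change" answer is 13, which would be
	// written back to spec and grow the workload.
	fmt.Println(desiredReplicas(statusReplicas, 1.0, 0.1)) // prints 13

	// Passing the spec count: the "no change" answer really is no change.
	fmt.Println(desiredReplicas(specReplicas, 1.0, 0.1)) // prints 10
}
```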

* edit google dns hostname
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

vSphere cloud provider fails to attach a volume due to Unable to find VM by UUID
5 participants