Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mark openstack/cinder csi migrations are on by default #98538

Conversation

dims
Copy link
Member

@dims dims commented Jan 28, 2021

Signed-off-by: Davanum Srinivas davanum@gmail.com

What type of PR is this?

/kind feature

What this PR does / why we need it:

Which issue(s) this PR fixes:

Related to kubernetes/enhancements#1489

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

ACTION REQUIRED: OpenStack Cinder CSI migration is on by default, Clinder CSI driver must be installed on clusters on OpenStack for Cinder volumes to work.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jan 28, 2021
@k8s-ci-robot
Copy link
Contributor

@dims: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jan 28, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dims

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 28, 2021
@dims
Copy link
Member Author

dims commented Jan 28, 2021

/assign @gnufied @xing-yang

@dims dims force-pushed the mark-openstack/cinder-csi-migration-as-ga branch from e205a94 to 95d03cc Compare January 28, 2021 19:10
@dims
Copy link
Member Author

dims commented Jan 28, 2021

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 28, 2021
@dims dims force-pushed the mark-openstack/cinder-csi-migration-as-ga branch from 95d03cc to cc1c284 Compare January 28, 2021 20:30
@gnufied
Copy link
Member

gnufied commented Jan 28, 2021

cc @jsafrane @msau42

@Jiawei0227
Copy link
Contributor

I think let's not rush to turn this on by GA this release. Firstly, let's wait for #98243 to get merged.

Instead, we can target to turn it on by default and remain it as beta in this release. At the same time, there is a list of criteria that each driver should follow to get it to GA. Let's check all the requirement has been met and then do the flip change.

@dims
Copy link
Member Author

dims commented Jan 28, 2021

@Jiawei0227 the openstack csi driver and the external cloud provider has been stable and working for a looooong time now. i just don't understand why we are dragging this.

https://github.com/kubernetes/kubernetes/pulls?q=is%3Apr+author%3Adims+openstack+in%3Asubject

@msau42
Copy link
Member

msau42 commented Jan 28, 2021

@dims @jsafrane has started doing testing of openstack csi migration and has already found bugs. We're putting together a list of concrete testing criteria to ensure that we have adequate coverage and confidence in the feature.

@dims
Copy link
Member Author

dims commented Jan 28, 2021

@jsafrane would you like to take this over? ( cc @gnufied )

@jsafrane
Copy link
Member

@dims, I can help with the storage parts, however, there are some steps related more to @kubernetes/cloud-provider-openstack-maintainers. As sig-storage we created https://docs.google.com/document/d/1j6uF9sOoGCQ9nNFVrdvCWyn77o9k3-758qsEj2K3BAY/edit# with the necessary steps. For OpenStack, I can't provide this:

  • Making sure that the most common tools to install/upgrade Kubernetes on OpenStack install the CSI driver too. I don't even know what are the tools. Note that simple documentation "if you want Cinder CSI driver, enable XYZ" is not enough. The driver must be installed when a cluster is being upgraded to a version that has CSI migration enabled, the upgraded cluster gets broken otherwise.
  • Add missing tests to CI. For OpenStack I miss topology tests, already found Fix translation of Cinder storage classess to CSI #98311.
  • On the other hand, we don't need to support migration in Windows containers, because in-tree Cinder did not declare that it supports Windows.
  • Some larger scale testing. CI can't cover everything. Since OpenStack is basically a Lego box that everyone assembles differently, report from multiple vendors would help too.

@dims
Copy link
Member Author

dims commented Jan 29, 2021

@dims, I can help with the storage parts, however, there are some steps related more to @kubernetes/cloud-provider-openstack-maintainers. As sig-storage we created https://docs.google.com/document/d/1j6uF9sOoGCQ9nNFVrdvCWyn77o9k3-758qsEj2K3BAY/edit# with the necessary steps. For OpenStack, I can't provide this:

  • Making sure that the most common tools to install/upgrade Kubernetes on OpenStack install the CSI driver too. I don't even know what are the tools. Note that simple documentation "if you want Cinder CSI driver, enable XYZ" is not enough. The driver must be installed when a cluster is being upgraded to a version that has CSI migration enabled, the upgraded cluster gets broken otherwise.

This we will do by documenting in cluster-api-provider-openstack and updating docs.

There is a bunch of CI jobs in openlab, #98311 would have been caught there. So i am not worried.

  • On the other hand, we don't need to support migration in Windows containers, because in-tree Cinder did not declare that it supports Windows.

Agree.

  • Some larger scale testing. CI can't cover everything. Since OpenStack is basically a Lego box that everyone assembles differently, report from multiple vendors would help too.

We can't mandate and we can't wait. Other than RH, most of the other folks have been using extrernal cloud provider + CSI AFAIK. So we have to lead the way. If we have sufficient docs and make enough noise in openstack dev mailing list. that's enough.

thanks,
dims

@jsafrane
Copy link
Member

This we will do by documenting in cluster-api-provider-openstack and updating docs.

As I wrote above, documentation may not be enough. Whatever tool a cluster admin uses to install or upgrade Kubernetes on OpenStack, this tool must install the CSI driver by default, otherwise the cluster is broken. Users don't read docs. I still haven't heard what's the most favorite tool to use on OpenStack. kops? kubeadm? Heat templates? Maybe they're all already fine, but someone must check it.

Add missing tests to CI. For OpenStack I miss topology tests, already found #98311.

There is a bunch of CI jobs in openlab, #98311 would have been caught there. So i am not worried.

No, openlab runs CSI migration tests and it did not catch this. I did not dig deeper, but the CI needs some fixes to enable topology. Does it even have multiple Cinder zones?

@xing-yang
Copy link
Contributor

typo in the release note: my default -> by default

CSIMigrationOpenStack: {Default: false, PreRelease: featuregate.Beta}, // Off by default (requires OpenStack Cinder CSI driver)
CSIMigrationOpenStackComplete: {Default: false, PreRelease: featuregate.Alpha},
CSIMigrationOpenStack: {Default: true, PreRelease: featuregate.Beta},
CSIMigrationOpenStackComplete: {Default: true, PreRelease: featuregate.Beta},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@k8s-ci-robot
Copy link
Contributor

@dims: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 30, 2021
@ramineni
Copy link
Contributor

ramineni commented Feb 1, 2021

No, openlab runs CSI migration tests and it did not catch this. I did not dig deeper, but the CI needs some fixes to enable topology. Does it even have multiple Cinder zones?

@jsafrane @dims we have checked this. It does run Topology tests , looks like default manifests of external-provisioner doesnt have Topology=true feature gate enabled, thats the reason it couldn't catch it . We are working on to fix it. Thanks.

And also, kops from release 1.19 , do install cinder driver by default , https://kops.sigs.k8s.io/releases/1.19-notes/#openstack-cinder-plugin

@jsafrane
Copy link
Member

jsafrane commented Feb 1, 2021

kops from release 1.19 , do install cinder driver by default

Thanks for confirmation! Is there any other widely used installation tool?

And also, kops from release 1.19 , do install cinder driver by default , https://kops.sigs.k8s.io/releases/1.19-notes/#openstack-cinder-plugin

It's not only about running topology tests and enabling topology in the provisioner. There are few tests than actually need to have a cluster with Cinder with multiple zones. Can openlab provide this? Since I ran into #98311, it seems to me that nobody has tested it before.

@ramineni
Copy link
Contributor

ramineni commented Feb 2, 2021

kops from release 1.19 , do install cinder driver by default

Thanks for confirmation! Is there any other widely used installation tool?

kubespray also do support openstack installation I suppose . But I dont have enough data to say which is widely used.

And also, kops from release 1.19 , do install cinder driver by default , https://kops.sigs.k8s.io/releases/1.19-notes/#openstack-cinder-plugin

It's not only about running topology tests and enabling topology in the provisioner. There are few tests than actually need to have a cluster with Cinder with multiple zones. Can openlab provide this? Since I ran into #98311, it seems to me that nobody has tested it before.

@jsafrane currently tests are not running with multi AZ. Though #98311 should be caught with signle AZ as well. we have already discussed and @chrigl is working with openlab to explore the possibility to enable for multi AZ and modify scripts accordingly.

@jsafrane
Copy link
Member

For the record, I filed couple of issues that need to be fixed before we can enable this feature by default: https://github.com/kubernetes/cloud-provider-openstack/issues/created_by/jsafrane

(another view: https://github.com/orgs/kubernetes-csi/projects/40)

@dims dims force-pushed the mark-openstack/cinder-csi-migration-as-ga branch from 30d30ac to 90256bd Compare March 8, 2021 17:36
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
@dims dims force-pushed the mark-openstack/cinder-csi-migration-as-ga branch from 90256bd to b74354a Compare March 8, 2021 18:03
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 8, 2021
@jsafrane
Copy link
Member

jsafrane commented Mar 8, 2021

This is the first CSI driver that enables CSI migration. The plan is to have it in beta for two next releases to get feedback from users and have GA in 1.23 at the earliest.

/lgtm
/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 8, 2021
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 8, 2021
@jsafrane
Copy link
Member

jsafrane commented Mar 8, 2021

/milestone v1.21

@k8s-ci-robot k8s-ci-robot added this to the v1.21 milestone Mar 8, 2021
@k8s-ci-robot k8s-ci-robot added release-note-action-required Denotes a PR that introduces potentially breaking changes that require user action. and removed release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Mar 8, 2021
@jsafrane
Copy link
Member

jsafrane commented Mar 8, 2021

/retest

@k8s-ci-robot k8s-ci-robot merged commit 7c70213 into kubernetes:master Mar 8, 2021
@msau42
Copy link
Member

msau42 commented Mar 8, 2021

@dims can you update the PR title to reflect that this is turning on the feature by default?

@gnufied
Copy link
Member

gnufied commented Mar 17, 2021

This PR should also note that - cloudprovider_openstack_api_request_duration_seconds and cloudprovider_openstack_api_request_errors will be renamed to openstack_api_request_duration_seconds and openstack_api_request_errors_total` - so as if there are any dashboards that use old metrics can be updated.

@dims dims changed the title Mark openstack/cinder csi migrations as GA Mark openstack/cinder csi migrations are on by default Mar 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. release-note-action-required Denotes a PR that introduces potentially breaking changes that require user action. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants