fix device name change issue for azure disk #60346
Conversation
/kind bug
Does this mean that if the user specified `cachingmode`, it still takes effect?
@feiskyer Yes, if the customer specifies `cachingmode`, that setting is honored; this PR only changes the default.
Is there any performance penalty from changing this? Is there any documentation about this? I think caching should achieve much better performance than none.
@feiskyer Here is the Azure Disk Caching doc. The host cache setting depends on the workload, so we should let customers decide the host cache setting if they really want to enable it for their workload, while by default we should set None, otherwise the disk could become inaccessible. From the doc on host cache settings: With Premium storage accounts, all disks can have caching enabled, but only OS disks can have Read/Write; data disks can be set only to ReadOnly or None. For write-intensive I/O, it's recommended to set caching to None on the additional data disks.
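For illustration, a minimal sketch of pinning the host cache setting explicitly through a StorageClass, assuming the in-tree `kubernetes.io/azure-disk` provisioner; the class name and `skuName` below are illustrative, not from this PR:

```yaml
# Hypothetical StorageClass that pins cachingmode explicitly instead of
# relying on the in-tree default; name and skuName are illustrative.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azure-disk-nocache
provisioner: kubernetes.io/azure-disk
parameters:
  skuName: Premium_LRS
  kind: Managed
  # Data disks support only ReadOnly or None; None is the safe choice
  # for write-intensive workloads per the Azure doc quoted above.
  cachingmode: None
```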
@andyzhangx Thanks. Changing the default to None is reasonable.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: andyzhangx, feiskyer. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
/test all [submit-queue is verifying that this PR is safe to merge]
@andyzhangx: The following test failed, say `/retest` to rerun them:
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
Automatic merge from submit-queue (batch tested with PRs 60346, 60135, 60289, 59643, 52640). If you want to cherry-pick this change to another branch, please follow the instructions here.
Removing label
…0346-upstream-release-1.9 Automatic merge from submit-queue. Automated cherry pick of #60346: fix device name change issue for azure disk
Cherry pick of #60346 on release-1.9.
#60346: fix device name change issue for azure disk
**Release note**:
```
fix disk unavailable issue when mounting multiple azure disks due to dev name change
```
What this PR does / why we need it:
fix device name change issue for azure disk: from v1.7, the default host cache setting changed from None to ReadWrite, while the default host cache setting in the Azure portal is None.
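For reference, a minimal sketch of the field whose default is being changed: an inline `azureDisk` volume can set `cachingMode` explicitly (the disk name, URI, and pod below are placeholders; when `cachingMode` is omitted, the in-tree plugin applies the default this PR changes back to None):

```yaml
# Hypothetical pod with an inline azureDisk volume; diskName and diskURI
# are placeholders, not real resources.
apiVersion: v1
kind: Pod
metadata:
  name: azure-disk-demo
spec:
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: data
      mountPath: /mnt/data
  volumes:
  - name: data
    azureDisk:
      diskName: mydisk.vhd
      diskURI: https://<storage-account>.blob.core.windows.net/vhds/mydisk.vhd
      # Omitting cachingMode falls back to the plugin default.
      cachingMode: None
```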
Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged):
Fixes #60344, #57444
also fixes the following issues:
Azure/acs-engine#1918
Azure/AKS#201
Special notes for your reviewer:
From v1.7, the default host cache setting changed from None to ReadWrite. This can lead to device name changes after attaching multiple disks to an Azure VM, ultimately making disks inaccessible from pods.
For example: a statefulset with 8 replicas (each with an azure disk) on one node will always fail. From my observation, adding the 6th data disk always triggers a dev name change, and some pods can no longer access their data disks after that; a repro sketch follows.
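A minimal repro sketch, assuming an azure-disk StorageClass named `managed-premium` (name illustrative) that leaves `cachingmode` unset so the in-tree default applies; all other names are placeholders:

```yaml
# Hypothetical repro: 8 replicas pinned to one node, each with its own
# dynamically provisioned azure disk. With the old ReadWrite default,
# attaching the later disks (around the 6th) can shift /dev names.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: myvm
spec:
  serviceName: myvm
  replicas: 8
  selector:
    matchLabels:
      app: myvm
  template:
    metadata:
      labels:
        app: myvm
    spec:
      nodeSelector:
        kubernetes.io/hostname: <node-name>  # pin all replicas to one node
      containers:
      - name: app
        image: busybox
        command: ["sleep", "3600"]
        volumeMounts:
        - name: data
          mountPath: /mnt/data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: managed-premium  # illustrative azure-disk class
      resources:
        requests:
          storage: 1Gi
```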
I have verified this fix on v1.8.4.
Without this PR on one node (dev name changes):
With this PR on one node (no dev name change):
In the following, `myvm-0` and `myvm-1` are crashing due to the dev name change; after replacing the controller manager with this fix, the `myvm2-x` pods work well.
Release note:
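```
fix disk unavailable issue when mounting multiple azure disks due to dev name change
```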
/assign @karataliu
/sig azure
@feiskyer could you mark it for the v1.10 milestone?
@brendandburns @khenidak @rootfs @jdumars FYI
Since it's a critical bug, I will cherry pick this fix to v1.7-v1.9. Note that v1.6 does not have this issue since the default `cachingmode` is None.