Improve the workflow of xfs devices filesystem check and mount #132
/assign @jsafrane |
/assign @gnufied |
/assign @jingxu97 |
/unassign @gnufied |
/assign @saad-ali |
/unassign @jsafrane |
/assign @thockin |
/unassign @saad-ali |
/assign @dims |
approve in principle ... please get a LGTM from sig-storage folks |
mount/mount_linux.go
Outdated
defer os.RemoveAll(target)

klog.V(4).Infof("Attempting to mount disk %s at %s", source, target)
if err := mounter.Interface.Mount(source, target, "", []string{"defaults"}); err != nil {
Why not use the target field given in the mount function as the mount location rather than a tempdir? Could a kubelet/driver crash leave the volume mounted in the tempdir and never get cleaned up?
I think the target might not exist, or might not be available for mounting, for various reasons. A tempdir is safer for a temporary mount and unmount, and we make sure it is deleted in the end.
But you are right, it is a risk if the caller crashes between mount and unmount.
On second thought, it is the responsibility of the caller of formatAndMount to ensure that the target exists and is available, so I think we can use the target field here.
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: 27149chen The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/assign @gnufied |
/assign @jsafrane
Jan, are you more familiar with XFS?
Signed-off-by: Lou <luogj@cn.ibm.com>
mostly lgtm. @27149chen Do you know why we are seeing filesystem corruption on volumes restored from snapshots? Is that because snapshots were taken in inconsistent state? Is there work tracked somewhere to fix the root cause of the problem too? |
@gnufied yes. |
}
}()
klog.V(4).Infof("Attempting to mount disk %s at %s", source, target)
if err := mounter.Interface.Mount(source, target, "", []string{"defaults"}); err != nil {
From man-page:
In this situation, the log can be replayed by mounting and
immediately unmounting the filesystem on the same class of machine
that crashed. Please make sure that the machine's hardware is
reliable before replaying to avoid compounding the problems.
Are we being too aggressive in automatically fixing errors here? Can this make problems worse somehow? Should this be configurable? If we merge this PR as it is, it will become the new default even for in-tree Kubernetes drivers, so we have to be careful.
@gnufied do you mean that the node might be bad or not of the same class, so there is a potential risk?
@gnufied, could you continue your review please? I asked a question above; could you reply to it? Thanks. |
@27149chen I am hoping someone who knows XFS better than I do has a chance to review this PR before it gets merged. To me it looks good, but my XFS knowledge is lacking, and this is a big enough change that it could affect all of k8s, including in-tree drivers (which are strictly in maintenance mode). |
@gnufied, please add your lgtm label if it looks good to you. |
@27149chen: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Hi - I'm the upstream xfsprogs maintainer, and have been an XFS developer for nearly 20 years. The problem you're trying to solve here was recently brought to my attention. It's my understanding that all of this effort to run Anything else, honestly, is simply not going to work, and will consistently lead to data loss. |
@sandeen thank you very much for your professional comment, it really helps a lot.
|
There is no reason to fsck/repair any metadata journaling filesystem before every mount. This is what the metadata log is for - it ensures consistency after a crash/power loss/etc. For journaling filesystems, fsck/repair tools only need to be used after filesystem corruption has been detected, or if for some reason you need to verify filesystem integrity prior to some administrative operation. (For example, ext4 recommends a full e2fsck to validate the filesystem before doing a resize, because resize can be a very invasive, possibly risky operation if corruption is encountered during the operation.)
If you are properly snapshotting the filesystem, xfs_repair won't be needed. xfs_repair should be exception activity, rare enough that it would be done manually when intervention is required. i.e. something goes wrong (disk bit flip, admin error, code bug, etc), xfs notices the corruption and shuts down, the administrator notices the error, and runs xfs_repair. Please understand the difference: xfs_repair will put a filesystem back into a consistent state. But it will not fully recover the prior filesystem state if you do something like take a non-atomic snapshot of the device.
XFS has runtime checks - both CRC verification as well as structure integrity validation - on every metadata read and write from disk. If it finds an error, in most cases it will shut down the filesystem. In general there should be no need to check the filesystem while it's mounted.
|
@sandeen, thank you for your explanation. My understanding is that we don't need to run fsck/xfs_repair before mounting in our code. We should mount the device directly: if the mount succeeds, we don't need to do anything else, and if it fails, we should return an error and ask the user to fix it manually. Am I right? |
Apologies for the late reply, I didn't see the notification about your question. That would be my suggestion, yes. |
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
@fejta-bot: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
This PR improves the method checkAndRepairXfsFilesystem, which is used before mounting an xfs device.
The changes are as follows:
Background:
References: