Automate operations to be done after a node is removed from cluster #87
Comments
Hey @kmova! I came across this via the LFX-Mentorship repo. I would love to get involved and apply for the same! I have previous experience working on open source projects (participated in GSoC'21 under CERN) and would love to continue learning! I had a question regarding the LFX mentorship application: do we need to draft a proposal for the project?
Can we reproduce this problem at will?
It would be good to provide reproduction steps, along with the error messages and the state of the system when this issue happens.
I faced the same issue while testing Jiva. If the disk on the node backing the local PV gets removed or re-installed, the new local mount path comes up with an empty formatted volume, and the replica pod remains in a pending state, complaining that the local path doesn't exist.
@niladrih, do you have any update on this issue? Happy to help; the issue is also easy to demo using k3d.
For applications that are deployed with high availability and can recover/rebuild the data from the lost node, persistentVolumeClaimRetentionPolicy seems like a possible solution. It is in beta (k8s v1.27 onwards), and it would need the cluster admin/app operator to delete the Pod that was scheduled on the lost node.
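For reference, here is a minimal sketch of where that field sits in a StatefulSet spec, written as a Python dict that can be dumped to JSON. The helper name `retention_policy_fragment` and the choice of `Delete` for `whenScaled` are assumptions made for illustration, not something proposed in this thread; the field accepts `Retain` (the default) or `Delete` for each of `whenDeleted` and `whenScaled`.

```python
import json

# Values accepted by the StatefulSet persistentVolumeClaimRetentionPolicy field.
ALLOWED_POLICIES = {"Retain", "Delete"}


def retention_policy_fragment(when_deleted="Retain", when_scaled="Retain"):
    """Build a StatefulSet spec fragment (hypothetical helper, for illustration)."""
    for value in (when_deleted, when_scaled):
        if value not in ALLOWED_POLICIES:
            raise ValueError(f"unsupported retention policy: {value!r}")
    return {
        "spec": {
            "persistentVolumeClaimRetentionPolicy": {
                "whenDeleted": when_deleted,
                "whenScaled": when_scaled,
            }
        }
    }


# Assumed example: keep PVCs when the StatefulSet is deleted,
# but remove them when it is scaled down.
fragment = retention_policy_fragment(when_scaled="Delete")
print(json.dumps(fragment, indent=2))
```

Whether this policy is enough on its own for the lost-node case is not settled here; as noted above, someone would still need to delete the Pod that was scheduled on the lost node.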
Assigning this to milestone v4.3 with design as the scope.
Describe the problem/challenge you have
When a node running the stateful pod with Local PV goes out of the cluster, the pod gets into a pending state and remains in a pending state. The administrator or the automated operator will have to run some manual steps to bring the pod back online. The operations to be performed may vary depending on the way storage is connected to the nodes. However, a few options are common across different stateful operators. The general actions to be performed are:
[...] delete, delete the PV
[...] retain, remove references [...]
Describe the solution you'd like
A Kubernetes operator that can be launched into the cluster with a ConfigMap(s) that can specify:
Anything else you would like to add:
It should be possible to either run this operator independently or embed this controller into other stateful operators.