mon: Remove extra mon from quorum before taking down pod #14667
56e2458 to 55029b8
return errors.Wrap(err, "failed to update cluster rbd bootstrap peer token")
}

// We remove the mon pod last so that if there is some disconnect
We remove the mon pod last
Can this be reframed??
Yes I'll clarify the comment
logger.Infof("there is an extra mon deployment that is not needed and not in quorum")
for _, deploy := range deployments {
	monName := deploy.Labels[controller.DaemonIDLabel]
controller.DaemonIDLabel
How do we decide which mon is the extra one using this label?
I'll clarify the comments. Basically, if we find an extra mon deployment that is not in the ceph mon quorum, we can delete the extra mon deployment.
How is it identified as an extra mon using this expression?
monName := deploy.Labels[controller.DaemonIDLabel]
That's the name of the mon daemon, found in a running deployment. The loop below compares it against each mon in the desired list of mons. If no mon of the same name is found there, the mon is considered extra and is removed on line 1090.
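The check described in this reply can be sketched roughly as follows. This is a minimal illustration, not the operator's actual code: the `deployment` type, the `extraMons` helper, and the literal label value assigned to `daemonIDLabel` are all assumptions made for the example, standing in for the Kubernetes Deployment objects and `controller.DaemonIDLabel` used in the PR.

```go
package main

// daemonIDLabel stands in for controller.DaemonIDLabel.
// The literal value here is an assumption for illustration only.
const daemonIDLabel = "ceph_daemon_id"

// deployment is a hypothetical, trimmed-down stand-in for a
// Kubernetes Deployment; only the labels matter for this check.
type deployment struct {
	Labels map[string]string
}

// extraMons returns the mon names found on running deployments that
// do not appear in the desired list of mons.
func extraMons(deployments []deployment, desired []string) []string {
	want := make(map[string]struct{}, len(desired))
	for _, name := range desired {
		want[name] = struct{}{}
	}

	var extra []string
	for _, deploy := range deployments {
		monName, ok := deploy.Labels[daemonIDLabel]
		if !ok {
			// Not a mon deployment, nothing to compare.
			continue
		}
		if _, found := want[monName]; !found {
			extra = append(extra, monName)
		}
	}
	return extra
}
```

With running deployments labeled a, b, c, and d and a desired list of a, b, and c, the helper reports only d as extra. The real change additionally confirms that the extra mon is not in the Ceph mon quorum before deleting its deployment, which this sketch leaves out.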
When removing a mon from quorum, there is a race condition that can result in mon quorum being lost, at least temporarily. The mon pod was being deleted first, and then the mon was removed from quorum. If any other mon went down between the time the pod of the bad mon was deleted and the time the mon was removed from quorum, there might not be sufficient quorum to complete the removal, and the operator would be stuck.

For example, there could be 4 mons temporarily due to the timing of upgrading K8s nodes, where mons may be taken down for some number of minutes. Say a new mon is started while the down mon also comes back up. Now the operator sees it can remove the 4th mon from quorum, so it starts to remove it. Now say another mon goes down on another node that is being updated or otherwise drained. Since the 4th mon pod was deleted and another mon is down, there are only two mons remaining in quorum, but 3 mons are required in quorum when there are 4 mons. Therefore, the quorum is stuck until the third mon comes back up.

The solution is to first remove the extra mon from quorum before taking down the mon pod.

Signed-off-by: Travis Nielsen <tnielsen@redhat.com>
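The arithmetic in the 4-mon example can be checked with a small model. This is only a sketch of the reasoning, assuming quorum means a strict majority of the mons listed in the monmap; the function names `quorumSize`, `hasQuorum`, `podFirst`, and `quorumFirst` are hypothetical and do not come from the operator code.

```go
package main

// quorumSize returns the strict majority needed among n mons in the monmap.
func quorumSize(n int) int {
	return n/2 + 1
}

// hasQuorum reports whether `up` running mons form a majority of the
// `inMap` mons currently listed in the monmap.
func hasQuorum(inMap, up int) bool {
	return up >= quorumSize(inMap)
}

// podFirst models the old order: the extra mon's pod is deleted, and
// another mon goes down before the extra mon leaves the monmap.
func podFirst(mons int) bool {
	up := mons - 1 // extra mon pod deleted, monmap still lists all mons
	up--           // another mon goes down during a node drain
	return hasQuorum(mons, up)
}

// quorumFirst models the new order: the extra mon leaves the monmap
// while every mon is still up, and only then is its pod deleted.
func quorumFirst(mons int) bool {
	inMap := mons - 1 // extra mon removed from the monmap first
	up := mons - 1    // then its pod is deleted
	up--              // another mon goes down during a node drain
	return hasQuorum(inMap, up)
}
```

For 4 mons, `podFirst(4)` leaves 2 mons up against a required 3, so quorum is lost and the removal cannot complete. `quorumFirst(4)` leaves 2 mons up against a required 2 of the remaining 3, so the cluster keeps quorum through the same failure.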
55029b8 to 8987d26
lgtm
mon: Remove extra mon from quorum before taking down pod (backport #14667)