mon: Remove extra mon from quorum before taking down pod #14667

travisn · 2024-08-30T18:19:20Z

When removing a mon from quorum, there is a race condition that can result in mon quorum being lost, at least temporarily. Previously, the mon pod was deleted first, and only then was the mon removed from quorum. If any other mon went down between the time the pod of the bad mon was deleted and the time the mon was removed from quorum, there might not be sufficient quorum to complete the removal, and the operator would be stuck.

For example, there could be 4 mons temporarily due to timing of upgrading K8s nodes where mons may be taken down for some number of minutes. Say a new mon is started while the down mon also comes back up. Now the operator sees it can remove the 4th mon from quorum, so it starts to remove it. Now say another mon goes down on another node that is being updated or otherwise drained. Since the 4th mon pod was deleted and another mon is down, there are only two mons remaining in quorum, but 3 mons are required in quorum when there are 4 mons. Therefore, the quorum is stuck until the third mon comes back up.

The solution is to first remove the extra mon from quorum before taking down the mon pod.
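The quorum arithmetic in the example above can be checked with a minimal sketch. It assumes a simple majority rule (more than half of the mons in the monitor map must be up), which matches the "3 of 4" figure in the description. The helper names `quorumSize` and `hasQuorum` are hypothetical and are not part of the Rook code base.

```go
package main

import "fmt"

// quorumSize returns the number of mons that must be up to form a
// majority of a monitor map with `members` entries (hypothetical helper).
func quorumSize(members int) int {
	return members/2 + 1
}

// hasQuorum reports whether `up` running mons form a majority of `members`.
func hasQuorum(members, up int) bool {
	return up >= quorumSize(members)
}

func main() {
	// Old order: the 4th mon's pod is deleted while it is still a member.
	// If another mon then goes down, only 2 of 4 members are up, so the
	// command to remove the 4th mon cannot complete.
	fmt.Println(hasQuorum(4, 2)) // false

	// New order: the extra mon is removed from quorum first, while 3 of 4
	// members are still up, leaving 3 members. If another mon then goes
	// down, 2 of 3 members are up and quorum holds.
	fmt.Println(hasQuorum(4, 3)) // true
	fmt.Println(hasQuorum(3, 2)) // true
}
```

The point of the reordering is that the membership shrinks while a majority is still available, so a later outage is measured against 3 members rather than 4.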

Checklist:

Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide.
Reviewed the developer guide on Submitting a Pull Request
Pending release notes updated with breaking and/or notable changes for the next minor release.
Documentation has been updated, if necessary.
Unit tests have been added, if necessary.
Integration tests have been added, if necessary.

parth-gr · 2024-09-04T14:43:25Z

pkg/operator/ceph/cluster/mon/health.go

+		return errors.Wrap(err, "failed to update cluster rbd bootstrap peer token")
+	}
+
+	// We remove the mon pod last so that if there is some disconnect


We remove the mon pod last

Can this be reframed??

Yes I'll clarify the comment

parth-gr · 2024-09-04T14:45:24Z

pkg/operator/ceph/cluster/mon/mon.go

+
+	logger.Infof("there is an extra mon deployment that is not needed and not in quorum")
+	for _, deploy := range deployments {
+		monName := deploy.Labels[controller.DaemonIDLabel]


controller.DaemonIDLabel
How do we decide which mon is the extra one using this label?

I'll clarify the comments. Basically, if we find an extra mon deployment that is not in the ceph mon quorum, we can delete the extra mon deployment.

How is it identified as an extra mon using this expression?
monName := deploy.Labels[controller.DaemonIDLabel]

That's the name of the mon daemon, found in a running deployment. Then if the mon daemon of the same name is not found in the loop below comparing against each mon in the desired list of mons, it will be considered extra and needs to be removed on line 1090.
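The matching described in this reply can be sketched roughly as follows. This is a simplified illustration, not the code in `mon.go`: the `deployment` type, the `extraMons` function, and the `daemonIDLabel` constant with its value are stand-ins assumed for the example. The real code reads the label from Kubernetes Deployment objects through `controller.DaemonIDLabel`.

```go
package main

import "fmt"

// daemonIDLabel is an assumed stand-in for controller.DaemonIDLabel.
const daemonIDLabel = "ceph_daemon_id"

// deployment is a hypothetical, stripped-down view of a mon Deployment
// that keeps only its labels.
type deployment struct {
	Labels map[string]string
}

// extraMons returns the names of mon daemons that have a running
// deployment but do not appear in the desired list of mons.
func extraMons(deployments []deployment, desired []string) []string {
	want := make(map[string]bool, len(desired))
	for _, name := range desired {
		want[name] = true
	}

	var extra []string
	for _, deploy := range deployments {
		// The label value is the mon daemon name, e.g. "a" or "d".
		monName := deploy.Labels[daemonIDLabel]
		if !want[monName] {
			extra = append(extra, monName)
		}
	}
	return extra
}

func main() {
	deployments := []deployment{
		{Labels: map[string]string{daemonIDLabel: "a"}},
		{Labels: map[string]string{daemonIDLabel: "b"}},
		{Labels: map[string]string{daemonIDLabel: "c"}},
		{Labels: map[string]string{daemonIDLabel: "d"}},
	}
	fmt.Println(extraMons(deployments, []string{"a", "b", "c"})) // [d]
}
```

Only mons reported as extra by a check like this are candidates for removal, and per this PR they are removed from quorum before their pod is taken down.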

When removing a mon from quorum, there is a race condition that can result in mon quorum being lost, at least temporarily. Previously, the mon pod was deleted first, and only then was the mon removed from quorum. If any other mon went down between the time the pod of the bad mon was deleted and the time the mon was removed from quorum, there might not be sufficient quorum to complete the removal, and the operator would be stuck.

For example, there could be 4 mons temporarily due to timing of upgrading K8s nodes where mons may be taken down for some number of minutes. Say a new mon is started while the down mon also comes back up. Now the operator sees it can remove the 4th mon from quorum, so it starts to remove it. Now say another mon goes down on another node that is being updated or otherwise drained. Since the 4th mon pod was deleted and another mon is down, there are only two mons remaining in quorum, but 3 mons are required in quorum when there are 4 mons. Therefore, the quorum is stuck until the third mon comes back up.

The solution is to first remove the extra mon from quorum before taking down the mon pod.

Signed-off-by: Travis Nielsen <tnielsen@redhat.com>

parth-gr

lgtm

travisn added the backport-release-1.15 label Aug 30, 2024

travisn requested a review from sp98 August 30, 2024 18:19

travisn force-pushed the remove-mon-race branch from 56e2458 to 55029b8 Compare August 30, 2024 18:29

sp98 approved these changes Sep 4, 2024

sp98 reviewed Sep 4, 2024

parth-gr reviewed Sep 4, 2024

travisn force-pushed the remove-mon-race branch from 55029b8 to 8987d26 Compare September 4, 2024 19:06

parth-gr approved these changes Sep 5, 2024

travisn merged commit 772a4fa into rook:master Sep 5, 2024
54 checks passed

mergify bot mentioned this pull request Sep 5, 2024

mon: Remove extra mon from quorum before taking down pod (backport #14667) #14692

Merged

6 tasks

mergify bot added a commit that referenced this pull request Sep 5, 2024

Merge pull request #14692 from rook/mergify/bp/release-1.15/pr-14667

917bb01

mon: Remove extra mon from quorum before taking down pod (backport #14667)

travisn deleted the remove-mon-race branch October 4, 2024 19:36

travisn mentioned this pull request Oct 4, 2024

mon: Do not remove extra mon in middle of failover #14805

Merged

6 tasks

mergify bot mentioned this pull request Oct 7, 2024

mon: Do not remove extra mon in middle of failover (backport #14805) #14814

Merged

6 tasks
