Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move image registry operator to control plane #1643

Merged
merged 1 commit into from
Aug 5, 2022

Conversation

csrwng
Copy link
Contributor

@csrwng csrwng commented Aug 3, 2022

What this PR does / why we need it:
Stops CVO from reconciling the cluster image registry operator into the
guest cluster and enables the control plane operator to reconcile it on
the management cluster side.

This change is needed to make it possible to clean up resources created
by the guest cluster. Having the image registry operator on the control
plane side allows it to keep running after worker nodes have been
removed from the cluster. On the guest cluster, changing the operator
config state to "Removed" tells the operator to remove the S3 bucket
(or other cloud storage) it created when starting up.

Which issue(s) this PR fixes (optional, use fixes #<issue_number>(, fixes #<issue_number>, ...) format, where issue_number might be a GitHub issue, or a Jira story:
Fixes #HOSTEDCP-486

Checklist

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes unit tests.

@openshift-ci openshift-ci bot requested review from alvaroaleman and enxebre August 3, 2022 13:52
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 3, 2022
Copy link
Contributor

@alvaroaleman alvaroaleman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of small nits but looks good. E2E will tell if it actually is good :)

}

switch params.platform {
case hyperv1.AWSPlatform:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so on azure this isn't going to work initially since there are no creds? If so, lets maybe create a small reminder jira issue to fix that up later?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The creds for azure are already on the guest cluster since the hcco puts them there. The operator should read them as usual. We have a special case here for AWS because we need a token minter instead of using the token request volume in the podspec

sleep 2
done

echo "{{ .WorkerNamespace }}" > "{{ .TokenDir }}/namespace"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: We can get the namespace through the downwards api and that in turn would make the template variables all static and allow us to render it in an init or var declaration

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually all vars are static... I will change to generating this in an init func.

}

/*
Command: []string{"/usr/bin/control-plane-operator", "token-minter"},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leftover?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, will cleanup

svc.Spec.Selector = selectorLabels()
}

func ReconcileServiceMonitor(sm *prometheusoperatorv1.ServiceMonitor, clusterID string, metricsSet metrics.MetricsSet) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we use a podmonitor, we do not need the service

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true, I'll convert

cp "/etc/certificate/ca/ca.crt" "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"
export KUBERNETES_SERVICE_HOST=kube-apiserver
export KUBERNETES_SERVICE_PORT=$KUBE_APISERVER_SERVICE_PORT
/usr/bin/cluster-image-registry-operator \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit to stop running the shell:

Suggested change
/usr/bin/cluster-image-registry-operator \
exec /usr/bin/cluster-image-registry-operator \

targetPort: metrics
tlsConfig:
ca:
secret:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for my understanding, does this mean that whatever does the scraping needs permissions to read secrets? I think it would be good if we could avoid that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we already require that in other service monitors (kas, kcm, etc) because we authenticate with the KAS via mtls to access the metrics endpoint.

@csrwng csrwng force-pushed the move-registry-operator branch 2 times, most recently from 78f0e91 to 90b3f3a Compare August 3, 2022 15:28
@csrwng
Copy link
Contributor Author

csrwng commented Aug 3, 2022

@alvaroaleman thx for the review. Addressed your comments.

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 3, 2022
@openshift-ci-robot
Copy link

/retest-required

Remaining retests: 2 against base HEAD 0b95b70 and 8 for PR HEAD 90b3f3a in total

@openshift-ci-robot
Copy link

/retest-required

Remaining retests: 1 against base HEAD 0b95b70 and 7 for PR HEAD 90b3f3a in total

@openshift-ci-robot
Copy link

/retest-required

Remaining retests: 0 against base HEAD 0b95b70 and 6 for PR HEAD 90b3f3a in total

@alvaroaleman
Copy link
Contributor

/retest-required

@openshift-ci-robot
Copy link

/retest-required

Remaining retests: 2 against base HEAD 8b84dee and 5 for PR HEAD 90b3f3a in total

@openshift-ci-robot
Copy link

/retest-required

Remaining retests: 1 against base HEAD 8b84dee and 4 for PR HEAD 90b3f3a in total

@openshift-ci-robot
Copy link

/retest-required

Remaining retests: 2 against base HEAD 260f0fe and 3 for PR HEAD 90b3f3a in total

@openshift-ci-robot
Copy link

/retest-required

Remaining retests: 1 against base HEAD 260f0fe and 2 for PR HEAD 90b3f3a in total

@alvaroaleman
Copy link
Contributor

/hold
this doesn't look like flaking

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 4, 2022
@csrwng
Copy link
Contributor Author

csrwng commented Aug 4, 2022

taking a look

Stops CVO from reconciling the cluster image registry operator into the
guest cluster and enables the control plane operator to reconcile it on
the management cluster side.

This change is needed to make it possible to clean up resources created
by the guest cluster. Having the image registry operator on the control
plane side allows it to keep running after worker nodes have been
removed from the cluster. On the guest cluster, changing the operator
config state to "Removed" tells the operator to remove the S3 bucket
(or other cloud storage) it created when starting up.
@csrwng csrwng force-pushed the move-registry-operator branch from 90b3f3a to d529f6d Compare August 4, 2022 20:54
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Aug 4, 2022
@csrwng
Copy link
Contributor Author

csrwng commented Aug 4, 2022

Fixed the upgrade path by adding the operator deployment to the set of manifests to delete and modifying the name of the cleanup manifest so that the cvo applies it first.

Copy link
Contributor

@alvaroaleman alvaroaleman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/hold
pending e2e succeeding

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 4, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Aug 4, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alvaroaleman, csrwng

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [alvaroaleman,csrwng]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@relyt0925
Copy link
Contributor

/hold cc @hasueki (I think this is likely only going into 4.12: but still something that should be accounted for).

@csrwng
Copy link
Contributor Author

csrwng commented Aug 5, 2022

/retest-required

@alvaroaleman
Copy link
Contributor

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 5, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Aug 5, 2022

@csrwng: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-ci openshift-ci bot merged commit 8ef6453 into openshift:main Aug 5, 2022
@hasueki hasueki mentioned this pull request Aug 8, 2022
4 tasks
@csrwng csrwng deleted the move-registry-operator branch July 14, 2023 19:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants