Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate builds tests away from using sample imagestreams since samples will be optional #26909

Merged
merged 1 commit into from
Mar 17, 2022

Conversation

bparees
Copy link
Contributor

@bparees bparees commented Mar 15, 2022

Resolving some of the failures seen in:
https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.11-e2e-aws-no-capabilities/1503711956505202688

which runs our e2e but with optional capabilities (in this case, samples operator)
disabled.

Note, in a few cases we used "foo:latest" as the imagestream reference which had the nice property that it protected us as the imagestream version (e.g. nodejs:16) moved on to, say, nodejs:17. None of these tests really care about the underlying image they are using, but it does mean the tests will no longer be automatically picking up newer versions of images as they become available, for better or worse.

@openshift-ci openshift-ci bot requested review from gabemontero and knobunc March 15, 2022 23:15
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 15, 2022
@bparees
Copy link
Contributor Author

bparees commented Mar 15, 2022

@adambkaplan @gabemontero @coreydaley ptal, we need to move these tests away from relying on the samples operator asap so we can bring online our new e2e job that tests the cluster when optional capabilities are disabled

there will be a few more of these, but these were the easiest ones.

@wking
Copy link
Member

wking commented Mar 15, 2022

For context on samples becoming optional, see openshift/cluster-samples-operator#414 and openshift/enhancements#922

Copy link
Contributor

@gabemontero gabemontero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you need to update https://github.com/openshift/origin/blob/master/test/extended/util/image/image.go#L15-L71 per the process defined at https://github.com/openshift/origin/tree/master/test/extended/util/image @bparees ?

Otherwise the changes here are OK. I'm assuming you have already scanned for DockerImage from's with image-registry.openshift-image-registry.svc:5000 or from's of type ImageStream/ImageStreamTag and these are the hits you got.

@bparees
Copy link
Contributor Author

bparees commented Mar 16, 2022

Don't you need to update https://github.com/openshift/origin/blob/master/test/extended/util/image/image.go#L15-L71 per the process defined at https://github.com/openshift/origin/tree/master/test/extended/util/image @bparees ?

hrrrmm.. i'd forgotten about that. That's going to be a problem since we're not going to mirror redhat registry content to that quay repo. Though i'll also note that these tests would never have worked in a disconnected environment anyway, unless someone mirrored the content so the imagestreams could import it. i'll need to think about what we need to do here. It's also slightly complicated by the cases where the image pullspec is embedded in a yaml file, not in code where we can easily invoke LocationFor.

I'll have to see if i can dig up centos versions of these images that we can mirror, and then sort out what to do about the dynamic LocationFor substituting in the yaml files.

/hold

Otherwise the changes here are OK. I'm assuming you have already scanned for DockerImage from's with image-registry.openshift-image-registry.svc:5000 or from's of type ImageStream/ImageStreamTag and these are the hits you got.

No, we actually have an e2e job running already w/ the samples operator disabled, so this was based on which tests failed in that job, linked in the PR description (followed by me digging into the failures to confirm they were related to not having the imagestreams/samples-operator present)

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 16, 2022
@gabemontero
Copy link
Contributor

Don't you need to update https://github.com/openshift/origin/blob/master/test/extended/util/image/image.go#L15-L71 per the process defined at https://github.com/openshift/origin/tree/master/test/extended/util/image @bparees ?

hrrrmm.. i'd forgotten about that. That's going to be a problem since we're not going to mirror redhat registry content to that quay repo. Though i'll also note that these tests would never have worked in a disconnected environment anyway, unless someone mirrored the content so the imagestreams could import it. i'll need to think about what we need to do here. It's also slightly complicated by the cases where the image pullspec is embedded in a yaml file, not in code where we can easily invoke LocationFor.

pretty sure that we discovered during #26149 that Clayton's "bindata related libraries" handle the embedded yaml stuff. Somewhere in that bindata loading code he has a LocationFor.

and I thought there was some mirroring of sample content going on, but I'm less sure about that one .... pretty sure about the bindata / LocationFor stuff.

I'll have to see if i can dig up centos versions of these images that we can mirror, and then sort out what to do about the dynamic LocationFor substituting in the yaml files.

/hold

Otherwise the changes here are OK. I'm assuming you have already scanned for DockerImage from's with image-registry.openshift-image-registry.svc:5000 or from's of type ImageStream/ImageStreamTag and these are the hits you got.

No, we actually have an e2e job running already w/ the samples operator disabled, so this was based on which tests failed in that job, linked in the PR description (followed by me digging into the failures to confirm they were related to not having the imagestreams/samples-operator present)

That's probably OK at this point. I vaguely recall the switch from docker.io being more complicated / iterative, but its possible the groundwork from that and Clayton's prior overhaul toward achieving disconnected "made things simpler".

Also, I'm assuming the image eco suite is off on purpose since samples are disabled. That simplifies things too.

I'm sure you all are good to go wrt this.

@bparees
Copy link
Contributor Author

bparees commented Mar 16, 2022

hmmm, i'm not seeing anything that would be that aware. It would have to understanding what part of the yaml contained an image reference that needed to be LocationFor'd.

I think things like this are simply bypassing the LocationFor logic and pulling directly from the location indicated (which will work, as long as the cluster under test is not disconnected):

name: quay.io/redhat-developer/test-build-simples2i:1.2

The fact that this PR mostly passed testing:
https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/26909/pull-ci-openshift-origin-master-e2e-gcp-builds/1503875911261360128

despite the fact that images like registry.redhat.io/ubi8/php-74:latest are not currently mirrored into quay.io/openshift/community-e2e-images proves the point, i think.

If anything was doing automated LocationFor substitution of those yaml values, the test would have ended up referencing the image on quay.io/openshift/community-e2e-images and then failing to pull the image, but the test passed.

Using imagestream values, or pullspecs that effectively point to imagestreams in the cluster, somewhat dodges the problem by giving the cluster an opportunity to redirect where the image is actually coming from. So that is being lost by my changes here. But i think there are deeper issues w/ the fact that fundamentally these tests(before or after my changes) do not align w/ the claimed rule that all images e2e's depend on, live in that community repository.

i'll keep digging.

@bparees
Copy link
Contributor Author

bparees commented Mar 16, 2022

there also seems to be at least some tests that are not actually conformant to this rule today:

out, err = oc.Run("new-build").Args("registry.access.redhat.com/ubi8/ruby-27", "https://github.com/openshift/ruby-hello-world.git").Output()

@bparees
Copy link
Contributor Author

bparees commented Mar 16, 2022

Another example of existing embedded refs to external images that are not mirrored into community-e2e, so clearly is not being substituted at test time (and would break for any disconnected cluster):

FROM registry.access.redhat.com/ubi8/ruby-27

so i think the tldr here is there is a ton of cleanup that needs to be done if the build tests are going to fully conform to that policy and i'm not inclined to use this PR to drive that cleanup. We would need to:

  1. get all the ref'd images mirrored into community-e2e (not clear that's even an option given these are redhat registry product images), or rework the tests to use alternate images
  2. introduce logic that substitutes the image refs into the fixtures, which again, i am skeptical can be done generically given that variety of places that a pullspec can appear (that embedded Dockerfile one being a good example), which means the tests consuming the fixtures would have to load the fixture, replace the image ref, and then use the fixture

@bparees
Copy link
Contributor Author

bparees commented Mar 16, 2022

I spoke with Derek about this and the "e2es must consume mirrored images from community-e2e" policy is specific to conformance e2e tests. I'll double check that none of these are marked conformance and if that's true, i think we're ok to proceed

@bparees
Copy link
Contributor Author

bparees commented Mar 16, 2022

yeah none of these are [Conformance] tests so i think we're ok. @gabemontero you want to give it the lgtm?

/hold cancel

@gabemontero
Copy link
Contributor

hmmm, i'm not seeing anything that would be that aware. It would have to understanding what part of the yaml contained an image reference that needed to be LocationFor'd.

I think things like this are simply bypassing the LocationFor logic and pulling directly from the location indicated (which will work, as long as the cluster under test is not disconnected):

name: quay.io/redhat-developer/test-build-simples2i:1.2

The fact that this PR mostly passed testing: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/26909/pull-ci-openshift-origin-master-e2e-gcp-builds/1503875911261360128

despite the fact that images like registry.redhat.io/ubi8/php-74:latest are not currently mirrored into quay.io/openshift/community-e2e-images proves the point, i think.

If anything was doing automated LocationFor substitution of those yaml values, the test would have ended up referencing the image on quay.io/openshift/community-e2e-images and then failing to pull the image, but the test passed.

Using imagestream values, or pullspecs that effectively point to imagestreams in the cluster, somewhat dodges the problem by giving the cluster an opportunity to redirect where the image is actually coming from. So that is being lost by my changes here. But i think there are deeper issues w/ the fact that fundamentally these tests(before or after my changes) do not align w/ the claimed rule that all images e2e's depend on, live in that community repository.

i'll keep digging.

Yeah I could be misrembering, and trying to retrace the history in that PR, and looking at the code again, I'm not finding anything definitive. Can't shake the recollection though ...

... that said, given your other comments, about the other "danglers" and the multitude of ways images can be references with build, I'm fine with bypassing additions to https://github.com/openshift/origin/blob/master/test/extended/util/image/image.go#L15-L71 in this PR if that is your preference.

/lgtm

@gabemontero
Copy link
Contributor

of course github webhooks are unhealthy today so that lgtm ^^ may take a while to register @bparees

@bparees
Copy link
Contributor Author

bparees commented Mar 16, 2022

attempting to clarify the policy here:
#26912

@bparees
Copy link
Contributor Author

bparees commented Mar 16, 2022

of course github webhooks are unhealthy today so that lgtm ^^ may take a while to register @bparees

thanks, i'll bash it into place if need be

@bparees
Copy link
Contributor Author

bparees commented Mar 16, 2022

/retest

@bparees bparees added lgtm Indicates that a PR is ready to be merged. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Mar 16, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 16, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bparees, gabemontero

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

1 similar comment
@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@bparees
Copy link
Contributor Author

bparees commented Mar 16, 2022

/retest

@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Mar 16, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 16, 2022

New changes are detected. LGTM label has been removed.

@bparees bparees added the lgtm Indicates that a PR is ready to be merged. label Mar 16, 2022
@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

15 similar comments
@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Mar 17, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 17, 2022

New changes are detected. LGTM label has been removed.

@bparees bparees added the lgtm Indicates that a PR is ready to be merged. label Mar 17, 2022
@gabemontero
Copy link
Contributor

/lgtm

even though @bparees added it as well

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@bparees
Copy link
Contributor Author

bparees commented Mar 17, 2022

/retest

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 17, 2022

@bparees: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-merge-robot openshift-merge-robot merged commit 257f410 into openshift:master Mar 17, 2022
@bparees bparees deleted the disableoptionals branch April 14, 2022 13:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants