Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First iteration and common utils for Jupyter web app CD #5663

Merged

Conversation

kimwnasptd
Copy link
Member

This is part of our effort to use the Optional-test infra for 1.3 going forward #5482

This PR introduces the following assumptions:

  1. The produced Argo Workflow will not try to push the produced image to a registry [ --no-push Kaniko arg] if the code is not run in the upstream AWS cluster [ LOCAL_TESTING=True env var ]. This will make it easier for more people to run the code in their Kubernetes cluster with Argo installed
  2. We expect some workflows to not have an exit dag. This is the case for now at least, where the CD only builds our public images and pushes them to the registry
  3. We will be using gcr.io/kaniko-project/executor:v1.5.0. This is a newer version from what's used from KFServing
  4. The produced images will use the SHA of the PR as the tag, which is what KFServing does

@davidspek in the end while working on trying out Kaniko I ended up writing code for it. So let's use this PR to test the first iteration only with JWA and then once we ensure it works lets use your PR #5640 to add workflows for the other components as well.

@PatrickXYS I have two questions:

  1. Can we use the public.ecr.aws/j1r0q0g6/jupyter-web-app yet, or do you still need to do some setup?
  2. What kind of credentials do I need to inject to the workflow in order for the Kaniko container to have permissions to push to the public AWS ECR? Is it just the docker-config [ and if yes, where should I mount it ] ?

cc @kubeflow/wg-notebooks-leads
/cc @PatrickXYS

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
@kimwnasptd
Copy link
Member Author

Also while working on the PR I made the workflow to be a presubmit job so that I can easily check the produced workflows. After we've confirmed the PR works as expected I'll move the workflow to be a postsubmit job, before we merge.

Looking at the produced Argo Workflow everything goes as expected:

  1. The image's tag is the SHA of the PR
  2. The Kaniko container fails because it doesn't has enough permissions

https://argo.kubeflow-testing.com/workflows/kubeflow-test-infra/kubeflow-kubeflow-presubmit-jwa-build-5663-64a400b-3968-932b?tab=workflow&nodeId=kubeflow-kubeflow-presubmit-jwa-build-5663-64a400b-3968-932b-4173484142&sidePanel=logs%3Akubeflow-kubeflow-presubmit-jwa-build-5663-64a400b-3968-932b-4173484142%3Amain

@PatrickXYS
Copy link
Member

Can we use the public.ecr.aws/j1r0q0g6/jupyter-web-app yet, or do you still need to do some setup?

This is done

What kind of credentials do I need to inject to the workflow in order for the Kaniko container to have permissions to push to the public AWS ECR? Is it just the docker-config [ and if yes, where should I mount it ] ?

You can follow kaniko official doc https://github.com/GoogleContainerTools/kaniko#pushing-to-amazon-ecr, we've already set up credentials and configs as configmap and secret stored in the cluster. So all you have to do is to specify those configurations in your argo workflow.

@davidspek
Copy link
Contributor

It looks like the volume mount:

- name: aws-secret
  mountPath: /root/.aws/

Is already in the pod spec, so you'd think that it would be able to push to the registry. I've only looked quickly, but I don't see the accompanying volumes: mapping.

@kimwnasptd
Copy link
Member Author

@PatrickXYS

This is done

That's awesome, thanks!

You can follow kaniko official doc https://github.com/GoogleContainerTools/kaniko#pushing-to-amazon-ecr, we've already set up credentials and configs as configmap and secret stored in the cluster. So all you have to do is to specify those configurations in your argo workflow.

Thanks for the link! Once you posted it I remembered I had glanced over that section in the README but had completely forgotten about it. I'll add the docker-config and the aws-secret.

@davidspek

It looks like the volume mount:

name: aws-secret
mountPath: /root/.aws/

Is already in the pod spec, so you'd think that it would be able to push to the registry. I've only looked quickly, but I don't see the accompanying volumes: mapping.

@davidspek I can see it in prow's logs http://prow.kubeflow-testing.com/view/s3/aws-kubeflow-jenkins/pr-logs/pull/kubeflow_kubeflow/5663/kubeflow-kubeflow-presubmit/1367813324770643968#1:build-log.txt%3A318. Did you had something else on your mind?

@davidspek
Copy link
Contributor

@kimwnasptd Yeah I figured that wouldn't be the case, I was looking in the wrong place. Isn't the aws secret already mounted then?

@kimwnasptd
Copy link
Member Author

Yeap, so we only need to add the docker-config volume/volumeMount so that the docker process will have permissions to push to public ECR [ @PatrickXYS correct me if anything I say is off ]

@davidspek
Copy link
Contributor

@kimwnasptd Yes I think that is all.

@kimwnasptd
Copy link
Member Author

The workflow successfully build the image and pushed it. We have the first public image with the new Optional-test infra folks! 🥳

This is supper exciting!

https://argo.kubeflow-testing.com/workflows/kubeflow-test-infra/kubeflow-kubeflow-presubmit-jwa-build-5663-02f77a1-9840-696f?tab=workflow&nodeId=kubeflow-kubeflow-presubmit-jwa-build-5663-02f77a1-9840-696f-2146802049

@kimwnasptd
Copy link
Member Author

I can also see it in the AWS ECR Gallery https://gallery.ecr.aws/j1r0q0g6/jupyter-web-app.

Changing the tag per @davidspek's comments, moving the workflow to postsubmits and we should be ready to go!

@davidspek
Copy link
Contributor

@kimwnasptd Once this PR has been merged I will rework my PR for all the (other) images using the code you created here.

@davidspek
Copy link
Contributor

davidspek commented Mar 5, 2021

@PatrickXYS Related to this PR, is there a way to setup image scanning with some form of GitHub integration (a badge, an issue, anything that can display the results really)? https://docs.aws.amazon.com/AmazonECR/latest/userguide/image-scanning.html

Maybe it can be part of the argo workflow and it outputs the scan result as an artifact?

@PatrickXYS
Copy link
Member

Related to this PR, is there a way to setup image scanning with some form of GitHub integration (a badge, an issue, anything that can display the results really)

This is customer-faced bonus feature, currently ECR don't support that nor our test-infra.

The badge / issue stuff, normally they're supported by official product, since test-infra is self-hosted, we do not plan to add it on server side.

Feel free to do so in argo workflow, it won't be hard as long as you get familiar with those Github APIs it should be fine.

Note: ECR image scanning is designed for security check

Copy link
Member

@PatrickXYS PatrickXYS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve
/hold

Feel free to unhold if you feel it's good to go

@davidspek
Copy link
Contributor

@kimwnasptd Could you rename commit b53cc02. It should be postsubmit job :). For later reference it might otherwise be confusing.

@kimwnasptd
Copy link
Member Author

kimwnasptd commented Mar 5, 2021

@davidspek I pushed two commits that moved the workflow to be a postsubmit job. I had explicitly made it a presubmit in the beginning so we could check/run the produced Argo Workflow while I was adding extra commits.

But now that we can confirm it works I made it to be a postsubmit job, so we should be all set
Just saw the typo. Force pushing

@davidspek
Copy link
Contributor

@kimwnasptd Cool!

@kimwnasptd
Copy link
Member Author

@Bobgy, at your convenience, could you merge this PR? We've also added a postsubmit workflow to prow_config.yaml, for building the Jupyter web app image after a PR is merged.

We are super excited for this

@davidspek
Copy link
Contributor

@kimwnasptd First thing tomorrow morning I'll rework my PR following this one for all the other components (it's getting a bit late now). This way, only 1 more PR will need to be merged by @Bobgy to get all the image building up and running.

@kimwnasptd kimwnasptd changed the title WIP: First iteration and common utils for Jupyter web app CD First iteration and common utils for Jupyter web app CD Mar 5, 2021
@kimwnasptd
Copy link
Member Author

/hold cancel

@davidspek
Copy link
Contributor

@kimwnasptd Did the force push go through? I still see the mistake in the commit hash. If it is being difficult it isn't a blocker of course.

/lgtm

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
@kimwnasptd kimwnasptd force-pushed the feature-kimwnasptd-kaniko-cd branch from 4581f9e to 7d66326 Compare March 5, 2021 20:30
@davidspek
Copy link
Contributor

/lgtm

@Bobgy
Copy link
Contributor

Bobgy commented Mar 5, 2021

/approve

@google-oss-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Bobgy, kimwnasptd, PatrickXYS

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-robot google-oss-robot merged commit 764ce33 into kubeflow:master Mar 5, 2021
@kimwnasptd kimwnasptd deleted the feature-kimwnasptd-kaniko-cd branch March 6, 2021 08:25
@davidspek
Copy link
Contributor

@kimwnasptd I just noticed 2 things. First thing, the Dockerfile location is the location within the build context according to the Kaniko docs (though it doesn't seem to matter). But more importantly, when building the JWA with the make file, it copies the Dockerfile.dockerignore to the context directory (this is also done for the TWA). However, this isn't done with the current build process so I'm not sure if that will cause a problem or larger images.

juliusvonkohout pushed a commit to juliusvonkohout/kubeflow that referenced this pull request Mar 7, 2021
* ci(utils): Add helpsrs for creating a kaniko task

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* ci(utils): Don't always add an exit dag

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* cd(jwa): Add CD workflow for building jwa

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* ci(utils): Add docker-config

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* cicd: Use base-commit[0:8] for the image tag

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* cd(config): Add a config file with list of images

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* Make CD workflow to be a postsubmit job

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Subreptivus pushed a commit to equinor/kubeflow that referenced this pull request Mar 10, 2021
* ci(utils): Add helpsrs for creating a kaniko task

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* ci(utils): Don't always add an exit dag

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* cd(jwa): Add CD workflow for building jwa

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* ci(utils): Add docker-config

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* cicd: Use base-commit[0:8] for the image tag

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* cd(config): Add a config file with list of images

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* Make CD workflow to be a postsubmit job

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants