-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
e2e: existing_arrikto: initial e2e test #4154
e2e: existing_arrikto: initial e2e test #4154
Conversation
/retest |
9201469
to
526fd5a
Compare
ea977a8
to
d22854b
Compare
@yanniszark There is hard coding for cluster name, zone, in kubeflow/testing/kfctl/delete_existing_cluster.py Lines 20 to 22 in 0c244a3
that should match with kubeflow/testing/kfctl/create_existing_cluster.sh Lines 16 to 18 in 0c244a3
By making creation also a step, it would be possible to remove the hard coding also. |
@nrchakradhar currently, this is a dirty approach to get it working. For the separate DAG step, I was actually thinking of including deletion in the kfctl_delete step instead of making a new step. |
3e5af2c
to
4f92110
Compare
/retest |
5b69308
to
2062950
Compare
/retest |
1 similar comment
/retest |
bf6a724
to
29d2a8f
Compare
0af6407
to
d687f30
Compare
/retest |
@jlewi I'm getting failed tests from the kfctl-go-build-deploy test. /retest |
@@ -33,7 +33,13 @@ local runPath = srcDir + "/testing/workflows/run.sh"; | |||
local kfCtlPath = srcDir + "/bootstrap/bin/kfctl"; | |||
local kubeConfig = testDir + "/kfctl_test/.kube/kubeconfig"; | |||
|
|||
// Name for the Kubeflow app. | |||
// cluster_creation_script specifies the script to run in order to create |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will we move to the argo_client/tekton_client rather than jsonnet in a future PR ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the issue is #4189
/retest |
@jlewi could you lgtm again now that I have rebased? |
testing/kfctl/kf_is_ready_test.py
Outdated
"pytorch-operator", | ||
"katib-controller", | ||
"tensorboard", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we testing tensorboard? That shouldn't even be deployed.
@yanniszark The changes to kfctl_is_ready_test.py seem a bit risky right now. If the test is flaky adding more tests to it is only likely to exacerbate the problem. Can we pull the changes to kfctl_is_ready_test.py into a separate PR? I think before we increase coverage of is_read_test we need to split it up; if we test more things in a single test the test is just going to become more unreliable. |
@yanniszark I think an easy thing to do would be break it up into multiple tests corresponding to different related groups of servicess e.g. def test_kfserving def test_gcp That should hopefully give us finer grained signal about what's flaky. We can also easily use annotations to disable one subset or another. But I'd suggest we do that in a separate PR so we don't keep blocking the arrikto E2E test. |
Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>
Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>
Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>
Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>
Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>
Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>
Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>
Related: googleapis/google-api-python-client#299 Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>
Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>
Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>
552b5fb
to
538e267
Compare
@jlewi as asked I reverted the extra coverage of the kf_is_ready test. Can you please comment on the issue with your suggestions on implementation? Some questions I had on the approach you proposed were:
If the presubmit passes we should be good to go, based on your latest review. |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jlewi The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
* kfctl: existing_arrikto: initial e2e test Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com> * fix deployment name and review comment Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com> * fix bad syntax with extend Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com> * increase timeout for service ip discovery Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com> * update deployments to reflect latest manifests Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com> * skip knative for existing_arrikto Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com> * bump machine size and increase timeout Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com> * python script: fix error in delete step Related: googleapis/google-api-python-client#299 Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com> * be more conservative in adding new test Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com> * revert checklist Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>
* kfctl: existing_arrikto: initial e2e test Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com> * fix deployment name and review comment Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com> * fix bad syntax with extend Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com> * increase timeout for service ip discovery Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com> * update deployments to reflect latest manifests Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com> * skip knative for existing_arrikto Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com> * bump machine size and increase timeout Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com> * python script: fix error in delete step Related: googleapis/google-api-python-client#299 Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com> * be more conservative in adding new test Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com> * revert checklist Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>
Initial E2E test for existing_arrikto.
This should ensure that nothing breaks with new PRs.
In the near future, we should move this to a pyfunc.
Signed-off-by: Yannis Zarkadas yanniszark@arrikto.com
This change is