fix get_logs pod_names type and iteration blocking #1280

Merged
2 commits merged into kubeflow:master from Windfarer:fix_log
Jun 28, 2021

Contributor

@Windfarer Windfarer commented Jun 2, 2021

There are two issues in the current code.

  1. pod_names is not subscriptable
    The method self.get_pod_names returns a set, so the subsequent pod_names[index] lookup, which fetches a pod's name by index, raises TypeError: 'set' object is not subscriptable. We should convert pod_names to a list.

  2. Iterating over multiple generators (the return values of watch.stream()) may block on one generator
    We iterate over these streams in turn, but when one pod produces no new log output, the iteration blocks on that pod's stream and cannot continue reading the other pods' streams. To fix this, we create a queue for every stream and start one thread per stream, decoupling log production from consumption, so that we can iterate over all the streams without blocking.
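
The second fix can be illustrated with a minimal, self-contained sketch. This is not the code merged in this PR: fake_stream is a hypothetical stand-in for the generator returned by watch.stream(), and pump, interleave, and _DONE are names invented for the example. It assumes only the Python standard library.

```python
import queue
import threading
import time

_DONE = object()  # sentinel: marks the end of one stream (hypothetical name)


def fake_stream(pod_name, lines, delay):
    """Hypothetical stand-in for watch.stream(): yields log lines slowly."""
    for i in range(lines):
        time.sleep(delay)
        yield f"{pod_name}: line {i}"


def pump(stream, out_queue):
    """Producer: drain one stream into its own queue, then signal completion."""
    try:
        for line in stream:
            out_queue.put(line)
    finally:
        out_queue.put(_DONE)


def interleave(streams):
    """Consumer: poll every queue without blocking and yield (index, line)."""
    queues = [queue.Queue() for _ in streams]
    for stream, q in zip(streams, queues):
        # One daemon thread per stream, so a quiet pod only stalls its own thread.
        threading.Thread(target=pump, args=(stream, q), daemon=True).start()

    finished = [False] * len(streams)
    while not all(finished):
        idle = True
        for index, q in enumerate(queues):
            if finished[index]:
                continue
            try:
                item = q.get_nowait()
            except queue.Empty:
                continue  # nothing new from this pod; move on to the next one
            idle = False
            if item is _DONE:
                finished[index] = True
            else:
                yield index, item
        if idle:
            time.sleep(0.01)  # avoid a busy loop while all pods are quiet


# pod-b is slow, but pod-a's lines are still delivered promptly.
streams = [fake_stream("pod-a", 3, 0.01), fake_stream("pod-b", 1, 0.3)]
received = list(interleave(streams))
```

With a plain nested loop over the generators, a slow or silent pod would hold up every pod after it; here the slow pod only delays its own queue.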

I've tested the changed code on my Kubeflow cluster, and it works well.

@aws-kf-ci-bot
Contributor

Hi @Windfarer. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@coveralls

coveralls commented Jun 2, 2021

Coverage remained the same at 71.429% when pulling dcae3b1 on Windfarer:fix_log into c095f7a on kubeflow:master.

Member

@terrytangyuan terrytangyuan left a comment

Could you explain why the fix is needed and update PR title/description?

@Windfarer
Contributor Author

Windfarer commented Jun 2, 2021

Could you explain why the fix is needed and update PR title/description?

Sorry for the lack of detail in the description. The method self.get_pod_names returns a set, so the subsequent pod_names[index] lookup, which fetches a pod's name by index, raises an error. I also found another issue (mainly about iterating over these streams) while testing this fix, so I converted this PR to a draft. I'll fix all the issues, update the description, and re-open this PR later.
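
For illustration, the set-indexing error described above can be reproduced in a few lines. This is a sketch, not the project's code: get_pod_names here is a hypothetical stand-in that returns hard-coded, made-up pod names instead of querying a cluster.

```python
def get_pod_names():
    """Hypothetical stand-in for self.get_pod_names(); the names are invented."""
    return {"mnist-worker-0", "mnist-worker-1", "mnist-ps-0"}


pod_names = get_pod_names()

# Indexing a set fails, because sets are unordered and define no __getitem__.
try:
    pod_names[0]
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Converting to a list makes index-based access valid.
pod_names = list(pod_names)
first_pod = pod_names[0]
```

Because a set has no defined order, list(pod_names) gives an arbitrary ordering; sorted(pod_names) is an alternative if a stable order matters.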

@Windfarer Windfarer changed the title from "fix get_logs pod_names type" to "fix get_logs pod_names type and iteration blocking" on Jun 3, 2021
@Windfarer
Contributor Author

@terrytangyuan I've updated the commits and the description; this PR is ready for review now.

Member

@gaocegege gaocegege left a comment

/ok-to-test

Member

@gaocegege gaocegege left a comment

Generally LGTM

Sorry for the late review.

/assign @terrytangyuan

@aws-kf-ci-bot
Contributor

@Windfarer: The following test failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
kubeflow-tf-operator-presubmit dcae3b1 link /test kubeflow-tf-operator-presubmit

Full PR test history. Your PR dashboard.

Member

@terrytangyuan terrytangyuan left a comment

Thanks!

/lgtm
/approve

@google-oss-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: terrytangyuan
To complete the pull request process, please assign jinchihe after the PR has been reviewed.
You can assign the PR to them by writing /assign @jinchihe in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@gaocegege
Member

The CI fails because of #1277

I am merging it manually.

@gaocegege gaocegege merged commit 0c41b27 into kubeflow:master Jun 28, 2021

6 participants