-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Doc] Customize KubeRay container commands #41651
[Doc] Customize KubeRay container commands #41651
Conversation
2a51fdd
to
6698130
Compare
6698130
to
d6e1be9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Just a few comments and questions
## Current KubeRay operator behavior for container commands | ||
* The current behavior for container commands is not finalized, and **may be updated in the future**. | ||
* See [code](https://github.com/ray-project/kuberay/blob/47148921c7d14813aea26a7974abda7cf22bbc52/ray-operator/controllers/ray/common/pod.go#L301-L326) for more details. | ||
Starting with KubeRay v1.1.0, if users add the annotation `ray.io/overwrite-container-cmd: "true"` to a RayCluster, KubeRay respects the container `command` and `args` as provided by the users, without including any generated command. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Starting with KubeRay v1.1.0, if users add the annotation `ray.io/overwrite-container-cmd: "true"` to a RayCluster, KubeRay respects the container `command` and `args` as provided by the users, without including any generated command. | |
Starting with KubeRay v1.1.0, if users add the annotation `ray.io/overwrite-container-cmd: "true"` to a RayCluster, KubeRay respects the container `command` and `args` as provided by the users, without including the generated `ray start` command. |
Nit. Just to be a bit more explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would also be helpful to include the definition/explanation of KUBERAY_GEN_RAY_START_CMD
here, in the text, instead of just in the code comment (users might not expect to see a key definition in the code comment instead of the main text)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not precise enough. The generated commands include ulimit
and ray start
commands. I add more details about both of them.
|
||
* `metadata.annotations.ray.io/overwrite-container-cmd: "true"`: This annotation tells KubeRay to respect the container `command` and `args` as provided by the users, without including any generated command. | ||
|
||
* `ulimit -n 65536`: This command is necessary to avoid Ray scalability issues caused by running out of file descriptors. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot (and the reader might be curious), why didn't we just make ulimit -n 65536 part of the KUBERAY_GEN_RAY_START_CMD? Is it because some users might want to delete the ulimit part?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there's an interesting explanation for the above question it might be helpful to write it in the doc, maybe here or at the line below where we say "Note that this environment variable doesn't include the ulimit
command."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why didn't we just make ulimit -n 65536 part of the KUBERAY_GEN_RAY_START_CMD? Is it because some users might want to delete the ulimit part?
Some users want to run some custom commands between ulimit
and ray start
. ray-project/kuberay#1560 (comment) is an example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right, I forgot. Makes sense
@@ -63,91 +113,3 @@ Currently, for timing (1), we can set the container's `Command` and `Args` in Ra | |||
# Args: | |||
# echo 123 456 && ulimit -n 65536; ray start --head --dashboard-host=0.0.0.0 --num-cpus=1 --block --metrics-export-port=8080 --memory=2147483648 | |||
``` | |||
|
|||
|
|||
## Timing 2: After `ray start` (RayCluster is ready) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to think a bit to remember that the reason we don't need this section anymore is because we can use Part 1 to just directly add post-start commands after the KUBERAY_GEN_RAY_START_COMMAND.
I think it would help to make this really obvious and explicit in the doc for part 1, maybe just with a quick sentence or example. Or maybe just put a post start command in the existing example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an over-explanation for me. I will update it only if you feel strongly about its necessity.
Co-authored-by: Archit Kulkarni <architkulkarni@users.noreply.github.com> Signed-off-by: Kai-Hsun Chen <kaihsun@apache.org>
Why are these changes needed?
Update doc for ray-project/kuberay#1704.
Related issue number
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/
under thecorresponding
.rst
file.