Add EMR cluster master node hostname to info logs for convenience. #2007
Also removes superfluous setting of `_emr_job_start`.
Hmm, printing the hostname might be more distracting than useful. Good catch on
Can I ask why you think this is distracting / not useful? Very frequently when a job fails, it is useful to ssh into the master. This log line skips the extra step of accessing the EMR UI to look up the master node's IP.
Oh, I see. I usually ssh into the master node with Basically, what I'm trying to avoid is making it harder to find the cluster ID (which is really, really useful information) by putting it right next to the hostname, which is much longer and only useful in specific cases. It sounds like you really need the hostname after the job has failed, not when we first join the cluster. And it's a little odd to provide it for existing pooled clusters only.
Ah, I see. I tend not to use the ssh command because 1) it means fewer API calls that could be throttled and 2) I often port forward to access the resource manager UI.
What about putting it on a different log line then?
Ya, I thought about that, but for new clusters we can't access the IP until the cluster has finished provisioning.
A different log line would be fine. How about I integrate printing the master node's address into the code that checks the cluster's status after the job has been submitted?
Sure, something along those lines sounds good to me. I can also code that up if you would like.
Sure, that would be super! Something like adding a method that logs the cluster's hostname if we haven't already, and adding it to Also don't forget that the runner can give up on a cluster and relaunch with a different one (see |
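A minimal sketch of the idea discussed above: log the master node's hostname at most once per cluster, skip it while the cluster is still provisioning, and log again if the runner relaunches on a different cluster. This is not mrjob's actual code. `HostnameLogger`, `log_master_hostname_once`, and the `lookup_hostname` callable are hypothetical names used only for illustration.

```python
import logging

log = logging.getLogger(__name__)


class HostnameLogger:
    """Hypothetical helper (not part of mrjob): logs a cluster's master
    node hostname at most once per cluster ID."""

    def __init__(self, lookup_hostname):
        # lookup_hostname(cluster_id) is an assumed callable that returns
        # the hostname, or None while the cluster is still provisioning.
        self._lookup_hostname = lookup_hostname
        self._logged_cluster_ids = set()

    def log_master_hostname_once(self, cluster_id):
        """Log the hostname if we haven't already for this cluster.

        Returns True only if a hostname was logged on this call.
        """
        if cluster_id in self._logged_cluster_ids:
            return False  # already logged for this cluster

        hostname = self._lookup_hostname(cluster_id)
        if not hostname:
            return False  # not available yet; retry on the next status check

        log.info('  master node is %s', hostname)
        self._logged_cluster_ids.add(cluster_id)
        return True
```

Keying on the cluster ID rather than a single boolean flag means a relaunch on a new cluster gets its own log line, and calling the method from a status-polling loop keeps it on a separate line from the cluster ID.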
Thank you for all the pull requests, by the way. :)
And my pleasure! We use mrjob quite heavily in our offline batch processing; just enjoying helping the tool evolve.
Sorry to put this off for so long. Will try to get this into the next release.
No worries, thanks David!
Superseded by #2074