Add EMR cluster master node hostname to info logs for convenience. #2007

aaronabf · 2019-02-28T20:49:57Z

Also removes superfluous setting of `_emr_job_start`.

coyotemarin · 2019-02-28T20:56:31Z

Hmm, printing the hostname might be more distracting than useful. Good catch on `_emr_job_start` and `_create_cluster()`. Going to fork this branch and then merge it.

aaronabf · 2019-02-28T21:04:31Z

Can I ask why you think this is distracting / not useful?

Very frequently when a job fails, it is useful to ssh into the master. This log line skips the extra step of opening the EMR UI to look up the master node's IP.

coyotemarin · 2019-02-28T21:13:28Z

Oh, I see. I usually ssh into the master node with `aws emr ssh --cluster-id <cluster id>`, so the hostname is kind of redundant.

Basically what I'm trying to avoid is making it harder to find the cluster ID (which is really, really useful information) by putting it right next to the hostname, which is much longer and only useful in specific cases.

It sounds like you really need the hostname after the job has failed, not when we first join the cluster. And it's a little odd to provide it for existing pooled clusters only.

aaronabf · 2019-02-28T21:18:47Z

> I usually ssh into the master node with `aws emr ssh --cluster-id <cluster id>`, so the hostname is kind of redundant.

Ah, I see. I tend not to use the ssh command because 1) it means fewer API calls that could be throttled and 2) I often port forward to access the resource manager UI.

> Basically what I'm trying to avoid is making it harder to find the cluster ID

What about putting it on a different log line then?

> And it's a little odd to provide it for existing pooled clusters only.

Yeah, I thought about that, but for new clusters we can't access the IP until the cluster has finished provisioning.
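
For context on the port-forwarding workflow mentioned above, here is a minimal sketch of how the master node's hostname could be turned into an ssh tunnel command. It is only an illustration: the function name `ssh_tunnel_command` is hypothetical, the `hadoop` login user and port 8088 (the usual YARN ResourceManager web UI port) are assumptions about a typical EMR setup, and the key path is whatever key pair the cluster was launched with.

```python
def ssh_tunnel_command(master_dns, key_path,
                       local_port=8088, remote_port=8088, user='hadoop'):
    """Build an ssh command that forwards a local port to the master node.

    Hypothetical helper for illustration only; the default user and port
    are assumptions about a typical EMR cluster, not values from this PR.
    """
    return [
        'ssh',
        '-i', key_path,   # private key for the cluster's EC2 key pair
        '-N',             # do not run a remote command, just forward
        '-L', '%d:localhost:%d' % (local_port, remote_port),
        '%s@%s' % (user, master_dns),
    ]
```

With the hostname already in the logs, the resulting list can be passed straight to `subprocess.call()` without a separate lookup in the EMR UI.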

coyotemarin · 2019-02-28T21:21:50Z

A different log line would be fine.

How about I integrate printing the master node's address into the code that checks the cluster's status after the job has been submitted?

aaronabf · 2019-02-28T21:28:42Z

Sure, something along those lines sounds good to me. I can also code that up if you would like.

coyotemarin · 2019-02-28T21:35:17Z

Sure, that would be super! Something like adding a method that logs the cluster's hostname if we haven't already, and adding it to _wait_for_steps_to_complete() and _wait_for_step_to_complete()?

Also don't forget that the runner can give up on a cluster and relaunch with a different one (see _relaunch()).
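
A rough sketch of the "log it once, but again after a relaunch" idea described above, assuming nothing about mrjob's real internals. The class `MasterAddressLogger` and the `fetch_master_dns` callable are hypothetical names introduced for illustration; in practice the lookup would be whatever the runner already uses to describe the cluster.

```python
import logging

log = logging.getLogger(__name__)


class MasterAddressLogger:
    """Log each cluster's master node address at most once.

    Hypothetical helper for illustration; not mrjob's actual code.
    """

    def __init__(self, fetch_master_dns):
        # fetch_master_dns(cluster_id) returns the master's public DNS
        # name, or None while the cluster is still provisioning.
        self._fetch_master_dns = fetch_master_dns
        self._logged_cluster_ids = set()

    def log_once(self, cluster_id):
        """Return True if this call logged the address."""
        if cluster_id in self._logged_cluster_ids:
            return False  # already logged for this cluster

        dns = self._fetch_master_dns(cluster_id)
        if not dns:
            return False  # not available yet; retry on the next poll

        log.info('  master node is %s', dns)
        self._logged_cluster_ids.add(cluster_id)
        return True
```

Tracking logged cluster IDs in a set, rather than a single boolean flag, means a relaunch onto a different cluster logs the new master's address without any extra reset step.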

coyotemarin · 2019-02-28T21:36:06Z

Thank you for all the pull requests, by the way. :)

aaronabf · 2019-02-28T22:05:10Z

_wait_for_step*_to_complete sounds good to me!

And my pleasure! We use mrjob quite heavily in our offline batch processing; just enjoying helping the tool evolve.

coyotemarin · 2019-05-30T04:15:54Z

Sorry to put this off for so long. Will try to get this into the next release.

aaronabf · 2019-05-30T19:45:58Z

No worries, thanks David!

coyotemarin · 2019-05-31T23:44:14Z

Superseded by #2074

Add EMR cluster master node hostname to info logs for convenience.

b1061a8

Also removes superfluous setting of `_emr_job_start`.

coyotemarin added this to the v0.6.10 milestone May 31, 2019

coyotemarin mentioned this pull request May 31, 2019

log public dns of master node #2074

Merged

coyotemarin closed this May 31, 2019

aaronabf deleted the upstream/master-ip branch May 31, 2019 23:49

coyotemarin removed this from the v0.6.10 milestone Jul 19, 2019
