
Spark example #5602

Merged
merged 1 commit into from Mar 30, 2015

Conversation

@mattf (Contributor) commented Mar 18, 2015

No description provided.

@googlebot

Thanks for your pull request.

It looks like this may be your first contribution to a Google open source project, in which case you'll need to sign a Contributor License Agreement (CLA) at https://cla.developers.google.com/.

If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check the information on your CLA or see this help article on setting the email on your git commits.

Once you've done that, please reply here to let us know. If you signed the CLA as a corporation, please let us know the company's name.

@satoshi75nakamoto (Contributor)

@mattf — These are Spark workers running in Spark's standalone mode, correct?

@mattf (Contributor, Author) commented Mar 18, 2015

@preillyme that's correct

@jayunit100 (Member)

https://github.com/mattf/docker-spark/blob/master/worker/start.sh <-- I snooped around and found the Dockerfiles you're using :) ... I think you'll want your containers to not tail the logs as their final line, since that would dupe kube into thinking the actual Spark process is running (even though it may have failed, and you're just tailing stale log files). It's not a show stopper, but you might want to mention it as a warning in your README, because if a container fails it would be confusing for kube.
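
(For readers skimming the thread: one common way to avoid the stale-tail problem is to exec the worker process in the foreground so it becomes the container's main process and the container exits when Spark does. The start.sh below is only an illustrative sketch under that assumption, not the script from mattf/docker-spark; SPARK_HOME and the spark-master service name are assumed.)

#!/bin/bash
# Illustrative sketch only -- not the actual worker/start.sh from mattf/docker-spark.
# Assumes SPARK_HOME is set in the image and spark-master resolves to the master service.

# exec makes the worker the container's foreground process (PID 1), so if it dies
# the container exits and kube can restart it, instead of idling on a `tail -f`
# over stale log files.
exec "${SPARK_HOME}/bin/spark-class" org.apache.spark.deploy.worker.Worker \
  spark://spark-master:7077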

@jayunit100 (Member)

Also, does this work when master_ip != service_ip? IIRC we had that Akka issue where the reverse resolution got rejected by the master. Update: ah, never mind, I see what you're doing, http://mail-archives.apache.org/mod_mbox/incubator-spark-commits/201501.mbox/%3Ced1ca66694004cda980b9f9cfd74379c@git.apache.org%3E . Great idea!
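
(Background for this exchange: Spark's Akka transport at the time rejected messages whose destination hostname didn't match the address the master advertised, which bites when pods connect through a service IP. One common workaround is sketched below; it is an assumption for illustration, not necessarily what this PR's images do, and SPARK_HOME plus the spark-master service name are assumed.)

#!/bin/bash
# Illustrative sketch of one workaround -- not necessarily the approach in this PR.
# Map the service name to the pod's own IP inside the master container, so the
# master can bind locally while advertising "spark-master" in its URLs; workers
# resolve the same name through the Kubernetes service, so the hostnames match
# even when master_ip != service_ip.
echo "$(hostname -i) spark-master" >> /etc/hosts
exec "${SPARK_HOME}/bin/spark-class" org.apache.spark.deploy.master.Master \
  --ip spark-master --port 7077 --webui-port 8080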

@timothysc (Member)

Hats off to @mattf and Armbrust on the Akka trick.

@jayunit100 (Member)

TL;DR: Tested briefly on a cluster of 20 nodes, and it works...

  • Killing all the slaves doesn't bring the cluster down; they come back online and seem to work okay.
  • From my tests, killing the master (and restarting it) can generally cause some problems. I didn't dig too deep, but something to note as a good next iteration is confirming the resiliency of the local_ip trick, whether you periodically need to reboot all slaves, and so on. I was running on a semi-unstable cluster, so nothing conclusive.

In any case, it's a great patch! Please just add the links to your Dockerfiles repo to the README as well!

  • The test I ran was MASTER=spark://spark-master:7077 spark-submit pi.py, similar to what was in the README. Output:
sh-4.2# MASTER=spark://spark-master:7077 spark-submit pi.py 
15/03/18 22:32:08 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[Stage 0:>                                                                                                                                                                                                          (0 + 0) / 2]15/03/18 22:32:46 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
15/03/18 22:33:01 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
15/03/18 22:33:16 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

[Stage 0:>                                                                                                                                                                                                          (0 + 0) / 2]15/03/18 22:33:31 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
Pi is roughly 3.142080   

@pires (Contributor) commented Mar 19, 2015

LGTM, just need pointers to the Dockerfiles and other sources used in the example!

@mattf (Contributor, Author) commented Mar 19, 2015

@pires I used the Docker Hub auto-build system so that all the build information is trivially discoverable without having to duplicate it in the example. There's no extra effort required, and nothing about the build is hidden or concealed.

@pires (Contributor) commented Mar 19, 2015

@mattf but you could point to your repo in the documentation, much like I did in the examples/hazelcast.

@mattf (Contributor, Author) commented Mar 19, 2015

@pires how's that?

@pires (Contributor) commented Mar 19, 2015

@mattf never mind, it's there. Thanks.

@cjcullen added cla: yes and removed cla: no labels Mar 19, 2015
@cjcullen (Member)

@mattf this is great. Can you squash commits?

@satoshi75nakamoto (Contributor)

I agree with and 👍 what @timothysc said. Well done, @mattf!

  • mention the spark cluster is standalone
  • add detailed master & worker instructions
  • add method to get master status
  • add links option for master status
  • add links option for worker status
  • add example use of cluster
  • add source location

@mattf (Contributor, Author) commented Mar 27, 2015

@cjcullen I had to force it; how's that? Generally I dislike losing history. What's the motivation?

@cjcullen added cla: yes and removed cla: no labels Mar 30, 2015
@cjcullen (Member)

Motivation is just to keep the commit history as clean (readable) as possible on master. This looks great. Thanks.

cjcullen added a commit that referenced this pull request Mar 30, 2015
@cjcullen merged commit 31324a0 into kubernetes:master Mar 30, 2015
@ikehz (Contributor) commented Apr 1, 2015

@mattf This is great, thanks for making this contribution. I think it would also be great to get documentation on Spark's end about this, maybe linked to from their Cluster Mode Overview. Is that something you've considered or are pursuing?

Thanks again for your contributions.

@mattf (Contributor, Author) commented Apr 3, 2015

@ihmccreery I wasn't planning on it. You should feel free to. If you do, ping me and I'll help review the PR.
