
Spark example #5602

Merged
merged 1 commit into from Mar 30, 2015

Conversation

@mattf (Contributor) commented Mar 18, 2015

No description provided.

@googlebot

Thanks for your pull request.

It looks like this may be your first contribution to a Google open source project, in which case you'll need to sign a Contributor License Agreement (CLA) at https://cla.developers.google.com/.

If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check the information on your CLA or see this help article on setting the email on your git commits.

Once you've done that, please reply here to let us know. If you signed the CLA as a corporation, please let us know the company's name.

@satoshi75nakamoto (Contributor)

@mattf — These are Spark workers running in Spark's standalone mode, correct?

@mattf (Contributor, Author) commented Mar 18, 2015

@preillyme that's correct

@jayunit100 (Member)

https://github.com/mattf/docker-spark/blob/master/worker/start.sh <-- I snooped around and found the Dockerfiles you're using :) ... I think you'll want your containers to not tail the logs as their final line, since that would dupe kube into thinking the actual Spark process is running (even though it may have failed, and you're just tailing stale log files). It's not a show stopper, but you might want to mention it as a warning in your README, because if a container fails it would be confusing for kube.
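
(For readers skimming the thread: one common way to avoid the stale-tail problem is to exec the worker process in the foreground so it becomes the container's main process and the container exits when Spark does. The start.sh below is only an illustrative sketch under that assumption, not the script from mattf/docker-spark; SPARK_HOME and the spark-master service name are assumed.)

#!/bin/bash
# Illustrative sketch only -- not the actual worker/start.sh from mattf/docker-spark.
# Assumes SPARK_HOME is set in the image and spark-master resolves to the master service.

# exec makes the worker the container's foreground process (PID 1), so if it dies
# the container exits and kube can restart it, instead of idling on a `tail -f`
# over stale log files.
exec "${SPARK_HOME}/bin/spark-class" org.apache.spark.deploy.worker.Worker \
  spark://spark-master:7077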

@jayunit100 (Member)

Also, does this work when master_ip != service_ip? IIRC we had that Akka issue where the reverse resolution got rejected by the master. Update: ah, never mind, I see what you're doing, http://mail-archives.apache.org/mod_mbox/incubator-spark-commits/201501.mbox/%3Ced1ca66694004cda980b9f9cfd74379c@git.apache.org%3E . Great idea!
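
(Background for this exchange: Spark's Akka transport at the time rejected messages whose destination hostname didn't match the address the master advertised, which bites when pods connect through a service IP. One common workaround is sketched below; it is an assumption for illustration, not necessarily what this PR's images do, and SPARK_HOME plus the spark-master service name are assumed.)

#!/bin/bash
# Illustrative sketch of one workaround -- not necessarily the approach in this PR.
# Map the service name to the pod's own IP inside the master container, so the
# master can bind locally while advertising "spark-master" in its URLs; workers
# resolve the same name through the Kubernetes service, so the hostnames match
# even when master_ip != service_ip.
echo "$(hostname -i) spark-master" >> /etc/hosts
exec "${SPARK_HOME}/bin/spark-class" org.apache.spark.deploy.master.Master \
  --ip spark-master --port 7077 --webui-port 8080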

@timothysc (Member)

Hats off to @mattf and Armbrust on the Akka trick.

@jayunit100 (Member)

TL;DR: Tested briefly on a cluster of 20 nodes, and it works...

  • Killing all the slaves doesn't bring the cluster down; they come back online and seem to work okay.
  • From my tests, killing the master (and restarting it) can generally cause some problems. I didn't dig too deep, but something to note as a good next iteration is confirming the resiliency of the local_ip trick, whether you periodically need to reboot all slaves, and so on. I was running on a semi-unstable cluster, so nothing conclusive.

In any case, it's a great patch! Please just add the links to your Dockerfiles repo to the README as well!

  • The test I ran was MASTER=spark://spark-master:7077 spark-submit pi.py, similar to what was in the README. Output:
sh-4.2# MASTER=spark://spark-master:7077 spark-submit pi.py 
15/03/18 22:32:08 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[Stage 0:>                                                                                                                                                                                                          (0 + 0) / 2]15/03/18 22:32:46 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
15/03/18 22:33:01 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
15/03/18 22:33:16 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

[Stage 0:>                                                                                                                                                                                                          (0 + 0) / 2]15/03/18 22:33:31 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
Pi is roughly 3.142080   

@pires (Contributor) commented Mar 19, 2015

LGTM, just need pointers to the Dockerfiles and other sources used in the example!

@mattf (Contributor, Author) commented Mar 19, 2015

@pires I used the Docker Hub auto-build system so that all the build information is trivially discoverable without having to duplicate it in the example. There's no extra effort required, and nothing about the build is hidden or concealed.

@pires (Contributor) commented Mar 19, 2015

@mattf but you could point to your repo in the documentation, much like I did in the examples/hazelcast.

@mattf (Contributor, Author) commented Mar 19, 2015

@pires how's that?

@pires (Contributor) commented Mar 19, 2015

@mattf never mind, it's there. Thanks.

@cjcullen added cla: yes and removed cla: no labels Mar 19, 2015
@cjcullen (Member)

@mattf this is great. Can you squash commits?

@satoshi75nakamoto (Contributor)

I agree with and 👍 what @timothysc said. Well done, @mattf!

  • mention the spark cluster is standalone
  • add detailed master & worker instructions
  • add method to get master status
  • add links option for master status
  • add links option for worker status
  • add example use of cluster
  • add source location

@mattf (Contributor, Author) commented Mar 27, 2015

@cjcullen I had to force it; how's that? Generally I dislike losing history. What's the motivation?

@cjcullen added cla: yes and removed cla: no labels Mar 30, 2015
@cjcullen (Member)

Motivation is just to keep the commit history as clean (readable) as possible on master. This looks great. Thanks.

cjcullen added a commit that referenced this pull request Mar 30, 2015
@cjcullen merged commit 31324a0 into kubernetes:master Mar 30, 2015
@ikehz (Contributor) commented Apr 1, 2015

@mattf This is great, thanks for making this contribution. I think it would also be great to get documentation on Spark's end about this, maybe linked to from their Cluster Mode Overview. Is that something you've considered or are pursuing?

Thanks again for your contributions.

@mattf (Contributor, Author) commented Apr 3, 2015

@ihmccreery I wasn't planning on it. You should feel free to. If you do, ping me and I'll help review the PR.
