Does this Library work on distributed system #11

Open
arjunktr opened this issue Apr 12, 2018 · 1 comment

Comments

@arjunktr

Hi,

Does this library work on a distributed processing engine like Spark?

Thanks!

@pishen
Collaborator

pishen commented Apr 12, 2018

It may work when you are only querying an existing index for nearest neighbors, but I haven't tried it.
Please refer to spotify/annoy for more details:

Why is this useful? If you want to find nearest neighbors and you have many CPU's, you only need the RAM to fit the index once. You can also pass around and distribute static files to use in production environment, in Hadoop jobs, etc.

You can also check Ann4s, which claims that it can work with Spark.
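
To make the "build once, distribute a static file" idea from the Annoy quote more concrete, here is a minimal sketch in plain Python. It does not use Annoy or this library's API: `build_index`, `nearest`, `DIM`, and the flat float32 file layout are all hypothetical stand-ins, and the search is a brute-force scan rather than Annoy's tree-based approximate search. The only point it illustrates is that a read-only, memory-mapped file can be opened independently by any number of worker processes.

```python
import mmap
import struct

DIM = 3  # vector dimensionality (illustrative assumption)
RECORD = struct.Struct(f"<{DIM}f")  # one little-endian float32 vector per record


def build_index(path, vectors):
    """Write every vector to a flat binary file once (stand-in for building an index)."""
    with open(path, "wb") as f:
        for v in vectors:
            f.write(RECORD.pack(*v))


def nearest(path, query, k=1):
    """Memory-map the file read-only and return the ids of the k closest vectors.

    Brute-force squared Euclidean distance; a real ANN index would avoid the full scan.
    """
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        count = len(mm) // RECORD.size
        scored = []
        for i in range(count):
            v = RECORD.unpack_from(mm, i * RECORD.size)
            dist = sum((a - b) ** 2 for a, b in zip(v, query))
            scored.append((dist, i))
    return [i for _, i in sorted(scored)[:k]]
```

In a distributed job, one step would call `build_index` once and ship the resulting file to every worker; each worker would then call `nearest` against its local copy. Whether this library's index files behave the same way under Spark is exactly what remains untested in this thread.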
