Supplying a large number to a distributed object store weighting consumes large amounts of ram #6988

Open
hexylena opened this issue Nov 8, 2018 · 1 comment
Labels
area/framework help wanted also "hacktoberfest", beginner friendly set of issues kind/feature

Comments

@hexylena
Member

hexylena commented Nov 8, 2018

Say you're configuring your distributed object_store_conf.xml file and you want to be really sure that data only goes to one backend, but you don't trust your eyeballs that the other backends say weight="0". So you give the backend you want a very large weight, like weight="1000000000".

Unfortunately, this creates that many duplicate entries for the backend in memory, consuming ~30+ GB of RAM very quickly.

This is an easy "bug" to fix and would make a good beginner issue. We just need to refactor https://github.com/galaxyproject/galaxy/blob/dev/lib/galaxy/objectstore/__init__.py#L615-L619 to use a weighted random choice, rather than building an array with each backend repeated according to its weight and randomly choosing from that.
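To make the difference concrete, here is a minimal sketch of the two approaches. It is not the actual Galaxy code: the function names and the plain dict of backend id to weight are hypothetical stand-ins, and weights are assumed to be non-negative. The first function mirrors the expand-then-choose idea described above; the second keeps only one running total per backend, so its memory use grows with the number of backends rather than with the sum of the weights.

```python
import bisect
import random


def pick_backend_expanded(weights, rng=random):
    """Expand-then-choose: one list entry per unit of weight.

    Memory grows with the SUM of the weights, which is what blows up
    for weight="1000000000".
    """
    expanded = []
    for backend_id, weight in weights.items():
        expanded.extend([backend_id] * weight)
    return rng.choice(expanded)


def pick_backend_weighted(weights, rng=random):
    """Weighted choice via cumulative totals.

    Memory grows with the NUMBER of backends, independent of weight size.
    Assumes all weights are non-negative.
    """
    ids = []
    cumulative = []
    total = 0
    for backend_id, weight in weights.items():
        total += weight
        ids.append(backend_id)
        cumulative.append(total)
    if total <= 0:
        raise ValueError("at least one backend needs a positive weight")
    # Draw a point in [0, total) and pick the first backend whose
    # cumulative total is strictly greater than that point.
    point = rng.random() * total
    return ids[bisect.bisect_right(cumulative, point)]
```

With hypothetical weights such as `{"files1": 1000000000, "files2": 0}`, `pick_backend_weighted` always returns `"files1"` while storing only two integers, and a backend with weight 0 is never selected.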

hexylena added the help wanted also "hacktoberfest", beginner friendly set of issues label Aug 15, 2019
@hexylena
Member Author

Apparently a weighted random choice is in the stdlib as soon as we're py3-only :)

https://docs.python.org/3.7/library/random.html#random.choices
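For illustration, a small sketch of how `random.choices` (available since Python 3.6) could do the selection. The backend ids, the weights list, and the `pick_backend` helper are hypothetical and not taken from the Galaxy code base.

```python
import random

# Hypothetical backend ids and weights, as they might be read from
# object_store_conf.xml.
backend_ids = ["files1", "files2", "files3"]
weights = [1000000000, 0, 0]


def pick_backend(ids, weights, rng=random):
    """Pick one backend id with probability proportional to its weight."""
    # random.choices returns a list of k picks, so k=1 gives a
    # single-element list and [0] unwraps it.
    return rng.choices(ids, weights=weights, k=1)[0]


chosen = pick_backend(backend_ids, weights)
```

Because `random.choices` works from the weights directly, a very large weight costs no more memory than a small one.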
