Put zarr, tileDB and hub benchmarks in one file #534
Conversation
Locust summary
Git references: Initial 0128704, Terminal d76b956
Changes: benchmarks/benchmark_tiledb_zarr_hub.py
Codecov Report
@@ Coverage Diff @@
## master #534 +/- ##
=======================================
Coverage 88.46% 88.46%
=======================================
Files 52 52
Lines 3745 3745
=======================================
Hits 3313 3313
Misses 432 432
Continue to review the full report at Codecov.
Good job! Could you rename the file to
Yeah, I changed it to the more appropriate name.
@DebadityaPal Great! I'll review it soon.
@DebadityaPal I'll defer to @haiyangdeperci but I see duplicate code that can be factored out. For example:

    with Timer("Time"):
        ...

@mynameisvinn I have removed the duplicate code. Thanks for pointing it out.
@DebadityaPal Much better! There is nothing better than concise, well-organized code :-)
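A reusable timing helper like the `Timer` used in the snippet above can be written once as a context manager, so every backend's benchmark shares the same timing logic. This is a minimal sketch; only the `Timer("Time")` usage appears in the thread, so the implementation below is an assumption:

```python
import time
from contextlib import contextmanager

@contextmanager
def Timer(label):
    # Measure wall-clock time for the enclosed block and print it once on exit.
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        print(f"{label}: {elapsed:.4f} s")

# Usage: each benchmarked section differs only in the code under `with`.
with Timer("Time"):
    sum(range(1_000_000))
```

With this, the per-backend benchmark bodies no longer need to repeat the start/stop/print boilerplate, which is the deduplication suggested above.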
Is it absolutely necessary to put the entire dataset in memory, as below:
ds["image"].compute().reshape(ds.shape[0], -1),
The machine we are using for benchmarks has 160 GB of memory. This step requires 330 GiB of memory.
MemoryError: Unable to allocate 330. GiB for an array with shape (1803460, 256, 256, 3) and data type uint8
I've noted some other inefficiencies and I am working on them as well but this one is a blocker.
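One way to avoid materializing the full array with `ds["image"].compute().reshape(...)` is to read and flatten the data in fixed-size batches, so only one batch is resident at a time. A minimal sketch, assuming `ds["image"]` supports numpy-style slicing; the `flattened_batches` helper, the `batch_size` value, and the small stand-in array are all illustrative:

```python
import numpy as np

def flattened_batches(images, batch_size=1024):
    # Yield arrays of shape (batch, height * width * channels) one batch
    # at a time instead of reshaping the entire dataset in memory at once.
    n = images.shape[0]
    for start in range(0, n, batch_size):
        batch = np.asarray(images[start:start + batch_size])
        yield batch.reshape(batch.shape[0], -1)

# Stand-in for ds["image"]: 10 small uint8 images.
images = np.zeros((10, 8, 8, 3), dtype=np.uint8)
rows = sum(b.shape[0] for b in flattened_batches(images, batch_size=4))
```

Peak memory is then bounded by roughly `batch_size * 256 * 256 * 3` bytes per batch rather than the full ~330 GiB.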
@DebadityaPal I am still having some memory issues after the patch. For instance:
The current changes resolve the memory-crash issue. The whole run should take less than 15 GB of RAM now.
@DebadityaPal Great! Thanks for the update. I'll check it out right away |
A preliminary attempt at standardizing the benchmarks, along with some optimizations. Linked to #529