Loader optimization: Fix incorrect in-place sort of chunks by partition #116

Merged: 2 commits merged into main on May 20, 2022


@lossyrob lossyrob commented May 17, 2022

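The in-place sort was being done on an ephemeral list, so the sorted result was discarded and the chunks were never actually ordered by partition. Because `itertools.groupby` only groups consecutive items with the same key, the unsorted input caused it to emit many small groups as it streamed through the iterator, which made ingests slow for large sets of items with a mix of partitions.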
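For illustration, here is a minimal sketch of this class of bug and its fix. It is not the loader's actual code: the `(partition, item_id)` tuples and the `partition_of`, `group_buggy`, and `group_fixed` names are hypothetical stand-ins chosen for the example.

```python
from itertools import groupby

# Hypothetical stand-in for loader items: (partition, item_id) pairs
# arriving with partitions interleaved.
items = [("2022_w01", 1), ("2022_w02", 2), ("2022_w01", 3), ("2022_w02", 4)]


def partition_of(item):
    """Grouping key: the partition name."""
    return item[0]


def group_buggy(items):
    # list(items) builds a temporary copy; .sort() sorts that copy in place
    # and returns None, so the sorted data is thrown away immediately.
    list(items).sort(key=partition_of)
    # groupby only merges *consecutive* equal keys, so unsorted input
    # yields one small group per run of identical partitions.
    return [(k, [i for _, i in g]) for k, g in groupby(items, key=partition_of)]


def group_fixed(items):
    # sorted() returns a new list that we keep and pass on to groupby.
    ordered = sorted(items, key=partition_of)
    return [(k, [i for _, i in g]) for k, g in groupby(ordered, key=partition_of)]


buggy = group_buggy(items)
fixed = group_fixed(items)
print(len(buggy))  # 4 groups, one per item
print(len(fixed))  # 2 groups, one per partition
```

With interleaved partitions the buggy version degenerates to one group per item, which in a loader would presumably mean one write per tiny group instead of one per partition.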

I'm seeing a significant performance increase for large item ingests with many partitions: with weekly partitions on about 4 months of Sentinel 2 data, 50K-item group ingests dropped from 1000s to 40s.

@lossyrob lossyrob marked this pull request as draft May 17, 2022 00:47
@lossyrob lossyrob changed the title Fix incorrect in-place sort Loader optimization: Fix incorrect in-place sort of chunks by partition May 17, 2022
@lossyrob lossyrob marked this pull request as ready for review May 17, 2022 02:59
@lossyrob lossyrob requested a review from bitner May 17, 2022 03:01
@bitner bitner merged commit 3baae42 into main May 20, 2022