Improve Memory Efficiency for find_label_issues_batched method #853

Closed
elisno opened this issue Sep 25, 2023 · 3 comments
Labels
enhancement New feature or request help-wanted We need your help to add this, but it may be more challenging than a "good first issue"

Comments

@elisno
Member

elisno commented Sep 25, 2023

Summary

This issue tracks the proposal to improve the memory efficiency of the find_label_issues_batched method.

Problem

Users with large segmentation datasets have run into memory issues because the current implementation of find_label_issues_batched requires loading large datasets into memory.

Suggested Solution

At a high level, the proposed solution is to make the method more memory-efficient by iterating over images instead of points and by calling flatten_and_preprocess internally.
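For illustration only, here is a rough NumPy sketch of what "iterating over images instead of points" could look like as a standalone helper. It is not cleanlab code: the name iter_flattened_images and the array shapes (labels as (N, H, W), pred_probs as (N, K, H, W)) are assumptions made for this example, and it does not call or reproduce flatten_and_preprocess. The point is that only one image is flattened at a time, so if the inputs are memory-mapped (for example via np.load with mmap_mode="r"), the full dataset would not need to sit in memory.

```python
import numpy as np


def iter_flattened_images(labels, pred_probs):
    """Yield (flat_labels, flat_pred_probs) for one image at a time.

    Hypothetical helper, assumed shapes:
      labels:     (N, H, W)    integer class per pixel
      pred_probs: (N, K, H, W) class probabilities per pixel
    """
    if len(labels) != len(pred_probs):
        raise ValueError("labels and pred_probs must contain the same number of images")
    for image_labels, image_probs in zip(labels, pred_probs):
        num_classes = image_probs.shape[0]
        # (K, H, W) -> (H, W, K) -> (H*W, K): one row of class probabilities per pixel.
        flat_probs = np.moveaxis(image_probs, 0, -1).reshape(-1, num_classes)
        # (H, W) -> (H*W,): pixel order matches the rows of flat_probs.
        flat_labels = np.asarray(image_labels).reshape(-1)
        yield flat_labels, flat_probs


# Small synthetic example: 4 images, 3 classes, 8x8 pixels.
rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=(4, 8, 8))
pred_probs = rng.random((4, 3, 8, 8))
pred_probs /= pred_probs.sum(axis=1, keepdims=True)

for flat_labels, flat_probs in iter_flattened_images(labels, pred_probs):
    # Each batch covers a single image: 64 labels and a (64, 3) probability array.
    assert flat_labels.shape == (64,)
    assert flat_probs.shape == (64, 3)
```

Each yielded pair has the per-example layout (labels of length M, pred_probs of shape (M, K)) that a generic, non-image routine could consume batch by batch. How such batches would be fed into any particular method is left open here.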

Related Context

Issue #842 discusses the memory issues faced by a user when using the find_label_issues method on large datasets for segmentation.

elisno added the needs triage, enhancement, and help-wanted labels, then removed the needs triage label, on Sep 25, 2023
@jwmueller
Member

jwmueller commented Oct 5, 2023

Sorry, this GH issue is mis-specified; our small team is working on specifying it properly. The methods in cleanlab/experimental/label_issues_batched.py have nothing to do with images and should not be altered specifically for image data.

In the meantime, please refer to this issue:
#842

and try to find a fix outside of the label_issues_batched.py methods.

Apologies for the confusion!

@jwmueller
Member

We will reopen this once it is properly specified.

@jwmueller
Member

New issue to track this problem is here:
#863
