[#860] Adding Spurious Correlation feature #1140
cleanlab/datalab/datalab.py
Outdated
@@ -635,3 +636,64 @@ def load(path: str, data: Optional[Dataset] = None) -> "Datalab":
        load_message = f"Datalab loaded from folder: {path}"
        print(load_message)
        return datalab

    def _spurious_correlation(
        self, properties_of_interest: Optional[List[str]] = None
For now, let's omit this argument in Datalab._spurious_correlation().
Remember to remove the parameter in the docstring as well.
        odd_aspect_ratio_score 0.900000
        """
        try:
            issues = self.get_issues()
Please add a validation step that ensures the issues dataframe has all the relevant (image-specific) scores.
If it doesn't, an error with a helpful message should be raised.
Added a validation step here to check that all vision/image issues are present in the correlations dataframe.
To be clear, the issues dataframe should be validated, not the correlations_df.
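For illustration, the requested check could look roughly like the sketch below. It is not the code in this PR: the helper name `validate_image_scores` and the tuple `REQUIRED_IMAGE_SCORES` are hypothetical, and only `odd_aspect_ratio_score` is taken from the docstring excerpt above; the other column names are assumed placeholders. The function takes the column names of the issues dataframe (for example `issues.columns`) rather than the dataframe itself, so it runs without pandas.

```python
from typing import Iterable, List, Sequence

# Hypothetical list of required image-score columns. Only
# "odd_aspect_ratio_score" appears in this PR's docstring excerpt;
# the other names are assumed placeholders.
REQUIRED_IMAGE_SCORES: Sequence[str] = (
    "dark_score",
    "blurry_score",
    "odd_aspect_ratio_score",
)


def validate_image_scores(
    columns: Iterable[str],
    required: Sequence[str] = REQUIRED_IMAGE_SCORES,
) -> None:
    """Raise ValueError if any required score column is absent.

    `columns` is meant to be the column index of the issues dataframe
    (e.g. `issues.columns`); a plain list of names works as well.
    """
    present = set(columns)
    # Preserve the order of `required` so the error message is stable.
    missing: List[str] = [name for name in required if name not in present]
    if missing:
        raise ValueError(
            "The issues dataframe is missing image-specific score columns: "
            f"{', '.join(missing)}. Run the image issue checks before "
            "computing spurious correlations."
        )
```

Under these assumptions, the check would be called on the result of `self.get_issues()` before any correlations are computed, so a missing score surfaces as a readable error instead of a `KeyError` further down.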
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1140      +/-   ##
==========================================
- Coverage   95.90%   95.82%   -0.08%
==========================================
  Files          81       82       +1
  Lines        6050     6102      +52
  Branches      996     1071      +75
==========================================
+ Hits         5802     5847      +45
- Misses        148      151       +3
- Partials      100      104       +4
☔ View full report in Codecov by Sentry.
Co-authored-by: Pratham Savaliya <no-reply>
…rrelation() method
…ISION_ISSUES and removed optional parameter in method and docstring
…d vision scores in transformed datasets
193543f to 2b01057
Summary
spurious_correlation.py module in cleanlab/datalab/internal location.
_spurious_correlation in Datalab class that uses an instance of SpuriousCorrelations class.
Links to Relevant Issues or Conversations