-
Notifications
You must be signed in to change notification settings - Fork 786
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds Null Issue Manager #856
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #856 +/- ##
==========================================
+ Coverage 96.70% 96.82% +0.12%
==========================================
Files 65 66 +1
Lines 5091 5229 +138
Branches 875 905 +30
==========================================
+ Hits 4923 5063 +140
+ Misses 86 85 -1
+ Partials 82 81 -1
☔ View full report in Codecov by Sentry. |
Thanks for the contribution @abhijitpal1247! To make this easier to review, can you:
Thanks, we can get somebody to review this PR once those basics are provided. After your code structure has been verified to be good, we'll ask you to add a few more unit tests in order to get the codecov of this new code up to 100%. |
@jwmueller embeddings: outputs: report:
issues:
summary:
info:
P.S.: Converted dataframes to dict and presented here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The quality score for each row should be the fraction of features which are not null in that row.
The rows that are marked as is_null_issue
should ONLY be those rows which are 100% null values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work @abhijitpal1247!
I suggest you extract the core logic of computing the issue masks and scores into a separate method (see comment).
I also suggest you compute additional information in collect_info
.
Finally, I suggest you split the "parametrized" embeddings into separate test classes. One without null values, another with some null values.
@elisno will leave this for your review. The main criteria to ensure are simply that:
|
@elisno I have implemented the suggested changes. Let me know if I am missing out on something. |
@elisno , I have added the changes. Please review it. As we are nearing to the end of hacktoberfest, I would love to get this merge too. |
As a quality score, high values mean there are proportionally fewer NaN values in a given row.
Hey @jwmueller any updates 😅 We are very close to the end of hacktoberfest. Let me know if you need any additional info that can help you out. |
Hi @abhijitpal1247, I'm pushing a few changes in a bit before we can merge this today! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wonderful work @abhijitpal1247!
Thank you for adding this issue manager!
closes #810
Made a basic IssueManager for handling null values.