-
Notifications
You must be signed in to change notification settings - Fork 602
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add rdkit_2d featurizer #877
Merged
Merged
Changes from 1 commit
Commits
Show all changes
19 commits
Select commit
Hold shift + click to select a range
b677fac
add rdkit_2d featurizer
KnathanM a620924
Add test and docs for rdkit features
KnathanM 88408c0
improve docs for molecule featurizers
KnathanM 64d4c2f
address comments
KnathanM 1f958ff
allow featurizer to return nans
KnathanM ebef8e4
add v1 rdkit molecule featurizers
KnathanM 965e461
Merge branch 'main' into rdkit_2d_features
KnathanM def761f
fix merge errors
KnathanM f45d914
update documentation for rdkit descriptors
KnathanM f83dadc
lint
KnathanM 7b0b8a7
address comments
KnathanM e24cf6d
Merge branch 'main' into rdkit_2d_features
KnathanM 5b80ec7
fix missing feature when n_heavyatoms = 0
KnathanM 44900b0
Apply suggestions from code review
KnathanM f3769cd
fix typo
KnathanM c5b89ca
Merge branch 'main' into rdkit_2d_features
shihchengli 7e5208a
require descriptastorus
KnathanM 870e9c0
remove descriptastorus skip in tests
KnathanM 3717cec
Merge branch 'main' into rdkit_2d_features
KnathanM File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
lint
- Loading branch information
commit f83dadc5df33269bc7ec78be95ea80856115eb28
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the difference between
RDKit2DFeaturizer
andV1RDKit2DFeaturizer
? Should we just keep one of them to avoid confusion?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems the new one has 10 more descriptors. Maybe we can just keep one of them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we had to keep just one, I would recommend it being the V1RDKit2DFeaturizer because if someone is trying to use a v1 model that used rdkit features, they would need that featurizer.
I added a new version because of the extra descriptors that have been added since descriptastorus. It could be nice to let people get these more descriptors. My intention is that all new models would use RDKit2DFeaturizer and that the V1 version would only be used for models that were trained in v1.
Let me know if you feel strongly, or we can bring it up in the next meeting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am okay with keeping both of them. However, in the long term, I don't think people will care about V1RDKit2DFeaturizer anymore, as it will become a legacy we used in the past. If someone wants to do a benchmark with this smaller set, they should take responsibility for generating those features correctly via a notebook and provide them as additional features.