feat: add example stubs (3) #12801
feat: add example stubs (2)
spacy/training/corpus.py
Outdated
class ReaderProtocol(Protocol):
    def __call__(self, nlp: "Language") -> Iterable[Example]:
        pass
@adrianeboyd: regarding your comment saying that `Iterator` was correct here.
The `ReaderProtocol` introduced in this PR matches how the `create_X_reader` methods were typed before this PR, namely as returning a `Callable[["Language"], Iterable[Example]]`. So introducing `ReaderProtocol` here does not change anything.
There was, however, an inconsistency between the typing of these `create_X_reader` methods and the implementations of the `XCorpus.__call__` methods, the latter returning `Iterator[Example]`. As an `Iterator` is also an `Iterable`, we can make the types consistent by typing the latter as `Iterable` too.
That said, I'm fine with taking this contribution out of this PR and having this one focus only on adding the `example.pyi`.
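For illustration, a minimal sketch of the compatibility argument above. The `Example` class and the `object` annotation for `nlp` are stand-ins (assumptions) so the snippet runs without spaCy, and the two corpus classes are hypothetical:

```python
from typing import Iterable, Iterator, List, Protocol


class Example:
    """Stand-in for spacy.training.Example (assumption, for illustration only)."""

    def __init__(self, text: str) -> None:
        self.text = text


class ReaderProtocol(Protocol):
    # `object` replaces "Language" here so the sketch has no spaCy dependency.
    def __call__(self, nlp: object) -> Iterable[Example]: ...


class GeneratorCorpus:
    """Hypothetical reader whose __call__ is annotated with the narrower Iterator."""

    def __call__(self, nlp: object) -> Iterator[Example]:
        for text in ("a", "b"):
            yield Example(text)


class ListCorpus:
    """Hypothetical reader that returns a list, which is Iterable but not an Iterator."""

    def __call__(self, nlp: object) -> List[Example]:
        return [Example("a"), Example("b")]


def read_all(reader: ReaderProtocol) -> List[str]:
    # Only iteration is used, so any Iterable[Example] return value works.
    return [eg.text for eg in reader(None)]
```

Both classes structurally satisfy the protocol, because each return type is a subtype of `Iterable[Example]`.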
I think it would make sense to move this to a separate PR. I still think that `Iterable[Example]` should be `Iterator[Example]` throughout since `Iterator[Example]` is the correct type?
Why is `Iterator` the correct type? You could also argue that it's nice to define a `ReaderProtocol` that is more generic/permissive in case you want to implement a different reader.
Anyway, I don't want this discussion to hold up the PR and the unrelated changes & improvements of adding the `example.pyi`, so I'm reverting those changes.
Removing the changes in `corpus.py` makes `mypy` break on the CI. It's unclear to me why this incompatibility between `Iterable` and `Iterator` only shows up when adding `example.pyi`; I guess because otherwise the statements were left unchecked. So that's why Basile had these as part of this PR in the first place.
So, let's make a final decision and make things consistent. I felt like typing to `Iterable` is the most generic and least breaking because you can still return an iterator, while narrowing down to `Iterator` for all readers might be more breaking. What's your counterargument for typing everything as `Iterator`, @adrianeboyd?
I think what isn't breaking is to widen argument types or to narrow return types? This is widening the return type, which could potentially break code if someone was indeed counting on it being an `Iterator`. I don't think that we're using it as an `Iterator` anywhere in our code or projects or examples, but I'm not even 100% sure.
For the `*Corpus` classes modified here, the correct type is `Iterator[Example]` and I don't see why this needs to be modified?
For the readers the `ReaderProtocol` as proposed seems fine to me.
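A small sketch of how widening a return type can break callers. The two reader functions are hypothetical and use `str` instead of `Example` to stay self-contained:

```python
from typing import Iterable, Iterator


def read_as_iterator() -> Iterator[str]:
    # Narrow return type: callers may rely on next().
    yield from ("a", "b")


def read_as_iterable() -> Iterable[str]:
    # Wider return type: returning a list is now allowed,
    # and a list is Iterable but not an Iterator.
    return ["a", "b"]


first = next(read_as_iterator())  # works: generators are iterators

try:
    # Code written against the Iterator annotation breaks here;
    # a type checker would also flag this call.
    next(read_as_iterable())
    failed = False
except TypeError:
    failed = True

# Callers that only assume Iterable need to call iter() first.
first_safe = next(iter(read_as_iterable()))
```

Code that only loops over the result with `for` is unaffected either way; only callers using `next()` or other iterator-specific behaviour would notice.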
Ok, looking at this again with fresh eyes, it appears that all the `Iterator`/`Iterable` changes were made chasing a red herring. `mypy` started failing, and rightfully so, after introducing `example.pyi` because there was a wrongly typed return type `Iterable[Doc]` in `spacy.PlainTextCorpus.v1`, which should have been `Iterable[Example]`. Fixing that makes the CI green with no other edits needed.
We might still want to introduce the `ReaderProtocol` (I thought it was nice too), but let's do that in a separate PR to keep the changes minimal here, so we can get this merged, hopefully soonish.
@property
def ents(self) -> Sequence[Span]: ...
@ents.setter
def ents(self, value: Sequence[Span]) -> None: ...
The setter should also accept a wider range of values including tuples, so this is incorrect in many ways. I don't really think we should encourage the use of tuples here, but the general bug/constraint from mypy is pretty limiting.
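A rough sketch of the mismatch, under two assumptions: that "tuples" means plain `(label, start, end)` entity tuples, and that the mypy constraint referred to is the setter's value type being tied to the getter's type. `Span` and `DocLike` below are hypothetical stand-ins, not spaCy classes:

```python
from typing import List, NamedTuple, Sequence


class Span(NamedTuple):
    """Hypothetical stand-in for spacy.tokens.Span."""

    label: str
    start: int
    end: int


class DocLike:
    """Hypothetical container used only for this sketch; not spaCy's Doc."""

    def __init__(self) -> None:
        self._ents: List[Span] = []

    @property
    def ents(self) -> Sequence[Span]:
        return tuple(self._ents)

    @ents.setter
    def ents(self, value: Sequence[Span]) -> None:
        # Annotated to match the getter, but the runtime logic also
        # accepts plain (label, start, end) tuples and normalizes them.
        self._ents = [Span(*item) for item in value]


doc = DocLike()
doc.ents = [Span("ORG", 0, 1)]  # matches the annotation
doc.ents = [("ORG", 0, 1)]  # works at runtime, but a type checker would reject it
```

The second assignment shows the gap: the annotation is narrower than what the setter accepts at runtime.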
Third time's a charm (maybe): PR following up on #12679 after GitHub sync & auth issues.
Description
This PR adds a stubs file for `spacy.training.example`. It also fixes a few typing-related issues.
As this is targeting `develop`, the history will look bad until the branches are synced. Done
Types of change
enhancement
Checklist