Update spaCy for thinc 8.0.0 #4920

Merged: 254 commits, Jan 29, 2020
Changes from 1 commit
f61b441
Add load_from_config function
honnibal Dec 20, 2019
df99312
Add train_from_config script
honnibal Dec 20, 2019
c3f55b0
Merge configs and expose via spacy.config
ines Dec 20, 2019
1164c0a
Fix script
ines Dec 21, 2019
418383c
Suggest create_evaluation_callback
honnibal Dec 21, 2019
86d6f86
Hard-code for NER
honnibal Dec 21, 2019
c3fb7ab
Fix errors
honnibal Dec 21, 2019
4930405
Register command
honnibal Dec 21, 2019
29b86ab
Add TODO
honnibal Dec 21, 2019
fd7f044
Update train-from-config todos
honnibal Dec 21, 2019
6491a22
Fix imports
honnibal Dec 21, 2019
a42a9b0
Allow delayed setting of parser model nr_class
honnibal Dec 21, 2019
14d165a
Get train-from-config working
honnibal Dec 21, 2019
fb195bb
Tidy up and fix scores and printing
ines Dec 21, 2019
ea1e08c
Hide traceback if cancelled
ines Dec 21, 2019
7184e49
Fix weighted score formatting
ines Dec 21, 2019
a0b21b2
Fix score formatting
ines Dec 21, 2019
a038f08
Make output_path optional
ines Dec 21, 2019
8ad9304
Add Tok2Vec component
ines Dec 21, 2019
f318024
Tidy up and add tok2vec_tensors
ines Dec 21, 2019
1c18b3c
Add option to copy docs in nlp.update
honnibal Dec 21, 2019
febda25
Copy docs in nlp.update
honnibal Dec 21, 2019
d14f0d5
Adjust nlp.update() for set_annotations
honnibal Dec 21, 2019
d09f1a3
Don't shuffle pipes in nlp.update, decruft
honnibal Dec 21, 2019
f92e180
Support set_annotations arg in component update
honnibal Dec 21, 2019
5655601
Support set_annotations in parser update
honnibal Dec 21, 2019
16303ee
Add get_gradients method
honnibal Dec 21, 2019
b6d5d7a
Add get_gradients to parser
honnibal Dec 21, 2019
f2e907c
Merge branch 'develop' into feature/config
ines Dec 22, 2019
ab19240
Update errors.py
ines Dec 22, 2019
51d75d5
Fix problems caused by merge
ines Dec 22, 2019
a1c69a7
Add _link_components method in nlp
honnibal Dec 22, 2019
77c5215
Add concept of 'listeners' and ControlledModel
honnibal Dec 22, 2019
4b6e0f4
Merge branch 'feature/config' of https://github.com/explosion/spaCy i…
ines Dec 22, 2019
c50a5c4
Support optional attributes arg in ControlledModel
honnibal Dec 22, 2019
c2b675c
Try having tok2vec component in pipeline
honnibal Dec 22, 2019
cb16c9f
Fix tok2vec component
honnibal Dec 22, 2019
4afd803
Fix config
honnibal Dec 22, 2019
964c001
Merge branch 'feature/config' of https://github.com/explosion/spaCy i…
honnibal Dec 22, 2019
c36f268
Fix tok2vec
honnibal Dec 22, 2019
a22c98d
Update for Example
honnibal Dec 22, 2019
4bd4e6a
Update for Example
honnibal Dec 22, 2019
b80d910
Update config
honnibal Dec 22, 2019
4c8c409
Add eg2doc util
honnibal Dec 22, 2019
03cc457
Update and add schemas/types
ines Dec 23, 2019
703eefe
Update schemas
ines Dec 23, 2019
aec6323
Fix nlp.update
honnibal Dec 23, 2019
7bd7e4a
Fix tagger
honnibal Dec 23, 2019
7f4587e
Remove hacks from train-from-config
honnibal Dec 23, 2019
efb532a
Remove hard-coded config str
honnibal Dec 23, 2019
b68de39
Calculate loss in tok2vec component
honnibal Dec 23, 2019
52d550e
Merge config changes
honnibal Dec 23, 2019
9a4d86a
Tidy up and use function signatures instead of models
ines Dec 23, 2019
19b36ea
Support union types for registry models
ines Dec 23, 2019
a236c7f
Minor cleaning in Language.update
honnibal Dec 23, 2019
b728ef6
Make ControlledModel specifically Tok2VecListener
honnibal Dec 23, 2019
7a06529
Fix train_from_config
honnibal Dec 23, 2019
0a65889
Fix tok2vec
honnibal Dec 23, 2019
c1fe156
Tidy up
ines Dec 23, 2019
eccbbbc
Add function for bilstm tok2vec
honnibal Dec 23, 2019
559eb27
Resolve conflict
honnibal Dec 23, 2019
330cc89
Fix type
ines Dec 23, 2019
92bc4b9
Merge
honnibal Dec 23, 2019
7edb25e
Fix syntax
honnibal Dec 23, 2019
4e1ffc2
Fix pytorch optimizer
honnibal Dec 23, 2019
62d3296
Add example configs
honnibal Dec 24, 2019
1a22f0a
Update for thinc describe changes
honnibal Dec 27, 2019
a9e4ecb
Update for Thinc changes
honnibal Dec 27, 2019
17328b6
Update for dropout/sgd changes
honnibal Dec 27, 2019
a235149
Update for dropout/sgd changes
honnibal Dec 27, 2019
0d9db10
Unhack gradient update
honnibal Dec 27, 2019
eb2fb38
Merge remote-tracking branch 'upstream/develop' into develop
svlandeg Dec 30, 2019
33e0412
Merge remote-tracking branch 'upstream/develop' into develop
svlandeg Jan 2, 2020
5dbf6c1
Work on refactoring _ml
honnibal Jan 5, 2020
6c91746
Remove _ml.py module
honnibal Jan 5, 2020
ab1c794
WIP upgrade cli scripts for thinc
honnibal Jan 5, 2020
9dbb5d6
Move some _ml stuff to util
honnibal Jan 5, 2020
4985e88
Import link_vectors from util
honnibal Jan 5, 2020
060c3e4
Update train_from_config
honnibal Jan 5, 2020
8b3958e
Import from util
honnibal Jan 5, 2020
c29923a
Import from util
honnibal Jan 5, 2020
736feb4
Temporarily add ml.component_models module
honnibal Jan 5, 2020
2841bd4
Move ml methods
honnibal Jan 5, 2020
a663f14
Move typedefs
honnibal Jan 5, 2020
1a34229
Update load vectors
honnibal Jan 5, 2020
aba16f8
Update gitignore
honnibal Jan 5, 2020
6b72c55
Move imports
honnibal Jan 6, 2020
22c6b9e
Add PrecomputableAffine
honnibal Jan 6, 2020
d811baa
Fix imports
honnibal Jan 6, 2020
69f09d3
Fix imports
honnibal Jan 6, 2020
69ce720
Fix imports
honnibal Jan 6, 2020
65b1cfd
Fix missing imports
honnibal Jan 6, 2020
6b82b29
Update CLI scripts
honnibal Jan 6, 2020
ed002bc
Update spacy.language
honnibal Jan 6, 2020
9916657
Add stubs for building the models
honnibal Jan 6, 2020
4f8dc13
Update model definition
honnibal Jan 6, 2020
e894b7c
Update create_default_optimizer
honnibal Jan 6, 2020
4618b5b
Fix import
honnibal Jan 6, 2020
17da780
Fix comment
honnibal Jan 6, 2020
e8088eb
Update imports in tests
honnibal Jan 6, 2020
5f630a5
Update imports in spacy.cli
honnibal Jan 6, 2020
b2a8e71
Fix import
honnibal Jan 6, 2020
86138ae
fix obsolete thinc imports
svlandeg Jan 6, 2020
11d5e85
update srsly pin
svlandeg Jan 6, 2020
1e3cbf4
from thinc to ml_datasets for example data such as imdb
svlandeg Jan 6, 2020
a852cb4
update ml_datasets pin
svlandeg Jan 7, 2020
44aee40
using STATE.vectors
svlandeg Jan 7, 2020
7af38dc
small fix
svlandeg Jan 7, 2020
18dfca6
fix Sentencizer.pipe
svlandeg Jan 7, 2020
38635c2
black formatting
svlandeg Jan 7, 2020
9a1954f
rename Affine to Linear as in thinc
svlandeg Jan 7, 2020
74c0923
set validate explicitely to True
svlandeg Jan 7, 2020
73f9411
rename with_square_sequences to with_list2padded
svlandeg Jan 7, 2020
5f94104
rename with_flatten to with_list2array
svlandeg Jan 7, 2020
6fa9699
chaining layernorm
svlandeg Jan 7, 2020
3768a63
small fixes
svlandeg Jan 7, 2020
fc270e5
revert Optimizer import
svlandeg Jan 7, 2020
f031c58
build_nel_encoder with new thinc style
svlandeg Jan 7, 2020
71b3d3f
fixes using model's get and set methods
svlandeg Jan 7, 2020
9f98f0b
Tok2Vec in component models, various fixes
svlandeg Jan 8, 2020
169ea91
fix up legacy tok2vec code
svlandeg Jan 8, 2020
3de81fd
add model initialize calls
svlandeg Jan 8, 2020
eada223
add in build_tagger_model
svlandeg Jan 9, 2020
2d8a315
small fixes
svlandeg Jan 9, 2020
958628a
setting model dims
svlandeg Jan 9, 2020
c2e7eb8
fixes for ParserModel
svlandeg Jan 9, 2020
68827e9
various small fixes
svlandeg Jan 9, 2020
0225552
initialize thinc Models
svlandeg Jan 9, 2020
5358963
fixes
svlandeg Jan 9, 2020
cd0a9e5
consistent naming of window_size
svlandeg Jan 10, 2020
a86e5b3
fixes, removing set_dropout
svlandeg Jan 10, 2020
1e12251
work around Iterable issue
svlandeg Jan 10, 2020
79186f1
remove legacy tok2vec
svlandeg Jan 10, 2020
2d7f37e
util fix
svlandeg Jan 10, 2020
899e799
fix forward function of tok2vec listener
svlandeg Jan 10, 2020
0a60d81
more fixes
svlandeg Jan 10, 2020
8c1b7ee
trying to fix PrecomputableAffine (not succesful yet)
svlandeg Jan 10, 2020
ddde967
alloc instead of allocate
svlandeg Jan 10, 2020
78a8b54
add morphologizer
svlandeg Jan 10, 2020
841a485
rename residual
svlandeg Jan 13, 2020
ea1a39d
rename fixes
svlandeg Jan 13, 2020
a46d7b4
Fix predict function
honnibal Jan 13, 2020
7c5d037
Update parser and parser model
honnibal Jan 13, 2020
14eb39f
fixing few more tests
svlandeg Jan 13, 2020
406d4a0
Fix precomputable affine
honnibal Jan 13, 2020
e6f4457
Update component model
honnibal Jan 13, 2020
a3637c4
Update parser model
honnibal Jan 13, 2020
4202968
Merge branch 'feature/config' of https://github.com/svlandeg/spaCy in…
honnibal Jan 13, 2020
f8c758f
Merge remote-tracking branch 'upstream/feature/config' into feature/c…
svlandeg Jan 13, 2020
d67ef42
Move backprop padding to own function, for test
honnibal Jan 13, 2020
8327638
Update test
honnibal Jan 13, 2020
f234d02
Fix p. affine
honnibal Jan 13, 2020
4ac7e20
Update NEL
honnibal Jan 13, 2020
5bc212a
Merge remote-tracking branch 'upstream/feature/config' into feature/c…
svlandeg Jan 13, 2020
b8f7da9
build_bow_text_classifier and extract_ngrams
svlandeg Jan 13, 2020
3127f22
Fix parser init
honnibal Jan 13, 2020
8237c07
Fix test add label
honnibal Jan 13, 2020
f5a1f56
add build_simple_cnn_text_classifier
svlandeg Jan 13, 2020
1ef9b0d
Merge remote-tracking branch 'upstream/feature/config' into feature/c…
svlandeg Jan 13, 2020
3644367
Fix parser init
honnibal Jan 13, 2020
08219e0
Set gpu off by default in example
honnibal Jan 13, 2020
edd6ae0
Fix tok2vec listener
honnibal Jan 13, 2020
2042443
Fix parser model
honnibal Jan 13, 2020
2ec3d1e
Merge remote-tracking branch 'upstream/feature/config' into feature/c…
svlandeg Jan 13, 2020
144c26a
Small fixes
honnibal Jan 13, 2020
ca937bc
small fix for PyTorchLSTM parameters
svlandeg Jan 13, 2020
e54df17
revert my_compounding hack (iterable fixed now)
svlandeg Jan 13, 2020
fdddf8a
fix biLSTM
svlandeg Jan 13, 2020
8eb55de
Fix uniqued
honnibal Jan 13, 2020
7a4d584
Merge remote-tracking branch 'upstream/feature/config' into feature/c…
svlandeg Jan 14, 2020
9cd9a27
PyTorchRNNWrapper fix
svlandeg Jan 14, 2020
756771a
small fixes
svlandeg Jan 14, 2020
6652113
use helper function to calculate cosine loss
svlandeg Jan 14, 2020
715fb78
small fixes for build_simple_cnn_text_classifier
svlandeg Jan 14, 2020
3a9f962
putting dropout default at 0.0 to ensure the layer gets built
svlandeg Jan 15, 2020
524fb6e
using thinc util's set_dropout_rate
svlandeg Jan 15, 2020
be6a759
moving layer normalization inside of maxout definition to optimize dr…
svlandeg Jan 15, 2020
8d1d249
temp debugging in NEL
svlandeg Jan 15, 2020
96c04ac
fixed NEL model by using init defaults !
svlandeg Jan 15, 2020
2229511
fixing after set_dropout_rate refactor
svlandeg Jan 15, 2020
a09e2cc
proper fix
svlandeg Jan 15, 2020
04fa038
fix test_update_doc after refactoring optimizers in thinc
svlandeg Jan 15, 2020
764c358
Add CharacterEmbed layer
honnibal Jan 15, 2020
6e18bda
Construct tagger Model
honnibal Jan 15, 2020
4e47803
Add missing import
honnibal Jan 15, 2020
96a8b7b
Remove unused stuff
honnibal Jan 15, 2020
1cac2f1
Work on textcat
honnibal Jan 15, 2020
beadcb3
fix test (again :)) after optimizer refactor
svlandeg Jan 15, 2020
80fb8d6
Merge remote-tracking branch 'upstream/feature/config' into feature/c…
svlandeg Jan 15, 2020
8512bc6
fixes to allow reading Tagger from_disk without overwriting dimensions
svlandeg Jan 15, 2020
1685d85
don't build the tok2vec prematuraly
svlandeg Jan 16, 2020
2aced6a
fix CharachterEmbed init
svlandeg Jan 16, 2020
40f5251
CharacterEmbed fixes
svlandeg Jan 16, 2020
531b645
Fix CharacterEmbed architecture
svlandeg Jan 16, 2020
459a035
Merge remote-tracking branch 'upstream/develop' into develop
svlandeg Jan 17, 2020
923c48b
Merge remote-tracking branch 'upstream/develop' into feature/config
svlandeg Jan 17, 2020
b7c8fde
fix imports
svlandeg Jan 17, 2020
0ff02c3
renames from latest thinc update
svlandeg Jan 19, 2020
11fbff3
one more rename
svlandeg Jan 19, 2020
a815571
add initialize calls where appropriate
svlandeg Jan 19, 2020
fea47eb
fix parser initialization
svlandeg Jan 19, 2020
018d671
Update Thinc version
ines Jan 19, 2020
f7458a1
Fix errors, auto-format and tidy up imports
ines Jan 20, 2020
f4a677c
Fix validation
ines Jan 20, 2020
1afdaee
fix if bias is cupy array
svlandeg Jan 20, 2020
b20c424
revert for now
svlandeg Jan 20, 2020
5cf5deb
ensure it's a numpy array before running bp in ParserStepModel
svlandeg Jan 20, 2020
7aebbd3
no reason to call require_gpu twice
svlandeg Jan 20, 2020
b1f5b1d
use CupyOps.to_numpy instead of cupy directly
svlandeg Jan 20, 2020
0668d10
fix initialize of ParserModel
svlandeg Jan 20, 2020
329c191
remove unnecessary import
svlandeg Jan 20, 2020
e1c6e74
fixes for CosineDistance
svlandeg Jan 20, 2020
4c10143
fix device renaming
svlandeg Jan 20, 2020
7f3822e
use refactored loss functions (Thinc PR 251)
svlandeg Jan 21, 2020
02d4e07
overfitting test for tagger
svlandeg Jan 22, 2020
b762bb8
experimental settings for the tagger: avoid zero-init and subword nor…
svlandeg Jan 22, 2020
4126c5a
clean up tagger overfitting test
svlandeg Jan 22, 2020
ec3b4c2
use previous default value for nP
svlandeg Jan 22, 2020
59a32e1
remove toy config
svlandeg Jan 22, 2020
ceacbe8
bringing layernorm back (had a bug - fixed in thinc)
svlandeg Jan 22, 2020
e762eb8
revert setting nP explicitly
svlandeg Jan 23, 2020
d9822a6
remove setting default in constructor
svlandeg Jan 23, 2020
8634f48
restore values as they used to be
svlandeg Jan 23, 2020
bc9c8ac
add overfitting test for NER
svlandeg Jan 23, 2020
43f354d
add overfitting test for dep parser
svlandeg Jan 23, 2020
621154a
add overfitting test for textcat
svlandeg Jan 23, 2020
df0b8ec
fixing init for linear (previously affine)
svlandeg Jan 23, 2020
9ed1aae
larger eps window for textcat
svlandeg Jan 23, 2020
0ef9083
ensure doc is not None
svlandeg Jan 23, 2020
b8c7c37
Merge branch 'develop' into feature/config
svlandeg Jan 23, 2020
36a1553
Merge remote-tracking branch 'origin/fix/gold-training' into feature/…
svlandeg Jan 23, 2020
1a0dd90
Merge sentencizer changes
honnibal Jan 23, 2020
2ddc554
Require newer thinc
honnibal Jan 23, 2020
c0137eb
Make float check vaguer
honnibal Jan 23, 2020
245d6dc
Slop the textcat overfit test more
honnibal Jan 24, 2020
2c4f859
Fix textcat test
honnibal Jan 24, 2020
e1467ba
Fix exclusive classes for textcat
honnibal Jan 24, 2020
9b22180
Merge branch 'feature/config' of https://github.com/svlandeg/spaCy in…
svlandeg Jan 24, 2020
43b7ddd
Merge remote-tracking branch 'upstream/feature/config' into feature/c…
svlandeg Jan 24, 2020
8e49265
fix after renaming of alloc methods
svlandeg Jan 27, 2020
f3cf970
fixing renames and mandatory arguments (staticvectors WIP)
svlandeg Jan 27, 2020
36cff9f
upgrade to thinc==8.0.0.dev3
svlandeg Jan 27, 2020
0acbe67
refer to vocab.vectors directly instead of its name
svlandeg Jan 27, 2020
7d71cbf
rename alpha to learn_rate
svlandeg Jan 27, 2020
792b73b
adding hashembed and staticvectors dropout
svlandeg Jan 27, 2020
8f52a27
upgrade to thinc 8.0.0.dev4
svlandeg Jan 28, 2020
10840cf
add name back to avoid warning W020
svlandeg Jan 28, 2020
2b8076b
thinc dev4
svlandeg Jan 28, 2020
598f401
update srsly
svlandeg Jan 28, 2020
4678188
using thinc 8.0.0a0 !
svlandeg Jan 28, 2020
Temporarily add ml.component_models module
honnibal committed Jan 5, 2020
commit 736feb40fe77b46865577690e2481cea93a7692d
28 changes: 28 additions & 0 deletions spacy/ml/component_models.py
@@ -0,0 +1,28 @@
def build_text_classifier(*args, **kwargs):
    raise NotImplementedError


def build_simple_cnn_text_classifier(*args, **kwargs):
    raise NotImplementedError


def build_bow_text_classifier(*args, **kwargs):
    raise NotImplementedError


def build_nel_encoder(*args, **kwargs):
    raise NotImplementedError


def masked_language_model(*args, **kwargs):
    raise NotImplementedError


def Tok2Vec(*args, **kwargs):
    raise NotImplementedError

get_cossim_loss = None
zero_init = None
PrecomputableAffine = None
flatten = None
PrecomputableAffine = None
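
The stubs above keep the names importable while the refactor is in progress: a function stub fails only when called, and a `None` placeholder fails only when used. A minimal sketch of that behaviour follows. It redefines one stub locally rather than importing from spaCy, and `try_build` and the `nr_class=2` argument are hypothetical, added only for illustration; they are not part of this commit.

```python
def build_text_classifier(*args, **kwargs):
    # Local copy of one stub from the diff: defining it succeeds,
    # calling it raises NotImplementedError.
    raise NotImplementedError


# Module-level placeholder, as in the diff: the name exists but is None.
zero_init = None


def try_build(builder, *args, **kwargs):
    """Hypothetical helper: return (model, None) or (None, reason)."""
    try:
        return builder(*args, **kwargs), None
    except NotImplementedError:
        return None, f"{builder.__name__} is still a stub"


model, reason = try_build(build_text_classifier, nr_class=2)
print(model, reason)
```

Code that imports these names keeps loading, so import errors elsewhere can be fixed first; the remaining work shows up as `NotImplementedError` at call time.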