spaCy tokenizer benchmark
=========================

Introduction
------------

Quick-and-dirty scripts to measure the performance of the spaCy tokenizer.

Prepare the data
----------------

    $ download_corpus.sh
    $ python3 transform_corpus.py

Run the benchmark
-----------------

    $ python3 benchmark_spacy.py
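
The contents of `benchmark_spacy.py` are not reproduced here. As a rough illustration only, a tokenizer timing loop might look like the sketch below. The names `benchmark_tokenizer` and `make_tokenizer` and the sample sentences are hypothetical and are not taken from the scripts above. The sketch assumes a blank English pipeline from `spacy.blank("en")`, which needs no model download, and falls back to `str.split` when spaCy is not installed so that the harness still runs.

```python
import time


def benchmark_tokenizer(tokenize, texts, repeats=3):
    """Time `tokenize` over `texts` and keep the fastest of `repeats` runs.

    `tokenize` is any callable that maps a string to a sized sequence of
    tokens (a spaCy Doc or a plain list both work with len()).
    """
    best = float("inf")
    n_tokens = 0
    for _ in range(repeats):
        start = time.perf_counter()
        n_tokens = sum(len(tokenize(text)) for text in texts)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    rate = n_tokens / best if best > 0 else float("inf")
    return {"tokens": n_tokens, "seconds": best, "tokens_per_second": rate}


def make_tokenizer():
    """Return spaCy's tokenizer if available, else whitespace splitting."""
    try:
        import spacy
    except ImportError:
        # Assumption: a whitespace split is only a stand-in so the
        # sketch runs without spaCy; it is not comparable to spaCy.
        return str.split
    nlp = spacy.blank("en")  # tokenizer-only pipeline, no trained model
    return nlp.tokenizer


if __name__ == "__main__":
    # Hypothetical sample input; a real run would read the prepared corpus.
    sample = [
        "This is a short sentence.",
        "Tokenizers split text into words and punctuation.",
    ] * 1000
    stats = benchmark_tokenizer(make_tokenizer(), sample)
    print(f"{stats['tokens']} tokens in {stats['seconds']:.4f} s "
          f"({stats['tokens_per_second']:.0f} tokens/s)")
```

Taking the best of several runs reduces noise from caching and other processes; reporting tokens per second rather than raw time makes results comparable across corpora of different sizes.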