SAPPHIRE is a simple monolingual phrase aligner based on word embeddings.
SAPPHIRE depends only on pre-trained word embeddings.
Therefore, it is easily transferable to specific domains and different languages.
This library is designed for a pre-trained fastText model, but the model is easy to replace.
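An aligner built on word embeddings mainly needs a word-to-vector lookup, which is why the model can be swapped. The following is a minimal sketch of that idea, not SAPPHIRE's actual API: `TOY_VECTORS`, `get_vector`, and `similarity_matrix` are hypothetical names, and the vectors are made-up values. With fastText, a method such as `model.get_word_vector` could presumably play the role of the lookup.

```python
import math

# Toy embeddings standing in for a real pre-trained model (made-up values).
TOY_VECTORS = {
    "cat": [1.0, 0.0],
    "kitten": [0.8, 0.6],
    "car": [0.0, 1.0],
}


def get_vector(word):
    """Hypothetical lookup; unknown words map to a zero vector."""
    return TOY_VECTORS.get(word, [0.0, 0.0])


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


def similarity_matrix(src_words, trg_words, lookup=get_vector):
    """Word-by-word similarities; rows follow src_words, columns trg_words."""
    return [[cosine(lookup(s), lookup(t)) for t in trg_words] for s in src_words]


matrix = similarity_matrix(["cat", "car"], ["kitten", "car"])
```

Replacing the model then amounts to passing a different `lookup` callable, as long as it returns vectors of a consistent dimension.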
- Python 3.5 or newer
- NumPy & SciPy
- fasttext
- Install requirements
  After cloning this repository, go to the root directory and run
  $ pip install -r requirements.txt
  or
  $ pipenv install
- Download a pre-trained fastText model or prepare your own fastText model.
$ curl -O https://dl.fbaipublicfiles.com/fasttext/vectors-english/wiki-news-300d-1M-subword.bin.zip
$ unzip wiki-news-300d-1M-subword.bin.zip
- Move the pre-trained model to the model directory.
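The move step can also be scripted. This is a small sketch using only the standard library; `move_model` is a hypothetical helper, and it assumes the target directory is named `model` and sits in the current working directory.

```python
import shutil
from pathlib import Path


def move_model(model_file, model_dir="model"):
    """Move model_file into model_dir, creating the directory if needed."""
    src = Path(model_file)
    if not src.is_file():
        raise FileNotFoundError("model file not found: {}".format(src))
    dest_dir = Path(model_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / src.name
    # str() keeps this compatible with older Python 3 releases.
    shutil.move(str(src), str(dest))
    return dest
```

For example, `move_model("wiki-news-300d-1M-subword.bin")` would place the file under `model/`, assuming the unzipped archive produced a file with that name.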
import sa