[update] Update README.md

- Add instructions on how to use SAPPHIRE with Docker.
- Fix the "Usage" section according to changes in other files.

m-yoshinaka · Sep 16, 2020 · commit 9d0f6d9 · 1 parent c0716b1
Showing 1 changed file with 43 additions and 24 deletions.
diff --git a/README.md b/README.md
@@ -3,6 +3,7 @@
 **SAPPHIRE** is a simple monolingual phrase aligner based on word embeddings.
 
 We explain the details of SAPPHIRE in the following paper.
+[[PDF]](https://www.aclweb.org/anthology/2020.lrec-1.847.pdf)
 ```
 @inproceedings{yoshinaka-etal-2020,
     author      = {Yoshinaka, Masato and Kajiwara, Tomoyuki and Arase, Yuki},
@@ -19,8 +20,8 @@ We explain the details of SAPPHIRE in the following paper.
 
 SAPPHIRE depends only on a pre-trained word embedding.
 Therefore, it is easily transferable to specific domains and different languages.
-This library is designed for a pre-trained model of [fastText](https://fasttext.cc/).
-But it is easy to replace the model.
+This tool is designed for a pre-trained model of [fastText](https://fasttext.cc/).
+(Of course, it is easy to replace the word embedding.)
 
 
 ## Requirements
@@ -30,27 +31,37 @@ But it is easy to replace the model.
 - fasttext
 
 
-## Installation (for fastText version)
+## Installation
 
-1. Install requirements
-After cloning this repository, go to the root directory and install requirements.
+1. Download the pre-trained model of fastText
+(or prepare your own fastText model) and move it to the *model* directory.
 ```
-$ pip install -r requirements.txt
+$ curl -O https://dl.fbaipublicfiles.com/fasttext/vectors-english/wiki-news-300d-1M-subword.bin.zip
+$ unzip wiki-news-300d-1M-subword.bin.zip
+$ mv wiki-news-300d-1M-subword.bin model/
 ```
 
-2. Install SAPPHIRE
-Installation with `develop` option allows you to change the parameters and add scripts for other word representations.
+### Docker
+1. Build the Docker image:
 ```
-$ python setup.py develop
+$ docker build -t sapphire .
+```
+2. Run a container:
+```
+$ docker run -it --rm -v ${PWD}/model:/work/model sapphire:latest /bin/bash
+# python
+>>> from sapphire import Sapphire
 ```
 
-
-3. Download the pre-trained model of fastText (or prepare your model of fastText) and move it to *model* directory.
+### Local installation
+1. Install requirements:
 ```
-$ curl -O https://dl.fbaipublicfiles.com/fasttext/vectors-english/wiki-news-300d-1M-subword.bin.zip
-$ unzip wiki-news-300d-1M-subword.bin.zip
-$ mkdir model
-$ mv wiki-news-300d-1M-subword.bin model/
+$ pip install -r requirements.txt
+```
+2. Install SAPPHIRE using the `develop` option
+(which allows you to add scripts for other word representations):
+```
+$ python setup.py develop
 ```
 
 
@@ -60,20 +71,28 @@ $ mv wiki-news-300d-1M-subword.bin model/
 ```
 $ python run_sapphire.py model/wiki-news-300d-1M-subword.bin
 ```
-To stop SAPPHIRE, enter `EXIT` when inputting a sentence.
+To stop SAPPHIRE, press `Ctrl-C` when prompted for a sentence.
 
 ### Usage of the SAPPHIRE module
 ```
+>>> import fasttext
 >>> from sapphire import Sapphire
->>> aligner = Sapphire()
+>>> model = fasttext.FastText.load_model(path_to_your_model)
+>>> aligner = Sapphire(model)
+```
+To change the hyper-parameters,
 ```
-After preparing a **tokenized** sentence pair (`tokenized_sentence_a: list` and `tokenized_sentence_b: list`),
+>>> aligner.set_params(lambda_=0.6, delta=0.6, alpha=0.01, hungarian=False)
 ```
->>> result = aligner.align(tokenized_sentence_a, tokenized_sentence_b)
->>> alignment = result.top_alignment[0][0]
->>> print(alignment)
+After preparing a **tokenized** sentence pair
+(`tokenized_sentence_a: list` and `tokenized_sentence_b: list`),
+```
+>>> _, alignment = aligner.align(tokenized_sentence_a, tokenized_sentence_b)
+>>> alignment
 [(1, 3, 2, 3), (8, 9, 5, 6), (13, 13, 8, 8), (27, 27, 9, 9)]
 ```
-phrase pair <img src="https://render.githubusercontent.com/render/math?math={(x, y)}"> : 
-<img src="https://render.githubusercontent.com/render/math?math={(x_\text{start}, x_\text{end}, y_\text{start}, y_\text{end})}">
-  \# 1-indexed alignment
+
+- Phrase pair <img src="https://render.githubusercontent.com/render/math?math={(x,y)}">
+is represented as
+<img src="https://render.githubusercontent.com/render/math?math={(x_\text{start},x_\text{end},y_\text{start},y_\text{end})}">.
+- Outputs of SAPPHIRE are 1-indexed alignments.
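
The 1-indexed `(x_start, x_end, y_start, y_end)` format described above can be turned back into readable phrase pairs with a few lines of Python. The following is a minimal sketch, not part of SAPPHIRE: the helper name `alignment_to_phrases` is hypothetical, it assumes both the start and end indices are inclusive (as the sample output `(13, 13, 8, 8)` suggests), and the sentences and alignment in the usage lines are made-up illustration data rather than real aligner output.

```python
from typing import List, Tuple

# One aligned phrase pair: (x_start, x_end, y_start, y_end), 1-indexed.
Alignment = Tuple[int, int, int, int]


def alignment_to_phrases(
    tokens_a: List[str],
    tokens_b: List[str],
    alignment: List[Alignment],
) -> List[Tuple[str, str]]:
    """Convert 1-indexed, inclusive phrase spans into pairs of phrase strings.

    Hypothetical helper for illustration; assumes end indices are inclusive.
    """
    pairs = []
    for x_start, x_end, y_start, y_end in alignment:
        # Shift the 1-indexed start to Python's 0-indexed slicing;
        # the inclusive end index equals the exclusive slice bound.
        phrase_a = " ".join(tokens_a[x_start - 1:x_end])
        phrase_b = " ".join(tokens_b[y_start - 1:y_end])
        pairs.append((phrase_a, phrase_b))
    return pairs


# Made-up example data (not produced by SAPPHIRE).
tokens_a = "the quick brown fox jumps".split()
tokens_b = "a fast brown fox leaps".split()
example_alignment = [(2, 3, 2, 3), (5, 5, 5, 5)]

for phrase_a, phrase_b in alignment_to_phrases(tokens_a, tokens_b, example_alignment):
    print(f"{phrase_a!r} <-> {phrase_b!r}")
```

In practice, the `alignment` value returned by `aligner.align(...)` could be passed in place of `example_alignment`, together with the same tokenized sentences that were given to the aligner.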