add int8 bert model (onnx#481)
* add int8 bert model

Signed-off-by: mengniwa <mengni.wang@intel.com>

* update readme

Signed-off-by: mengniwa <mengni.wang@intel.com>

Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
mengniwang95 and wenbingl authored Nov 12, 2021
1 parent 5f7b9ca commit 2063d79
Showing 5 changed files with 50 additions and 5 deletions.
43 changes: 38 additions & 5 deletions text/machine_comprehension/bert-squad/README.md
@@ -10,10 +10,17 @@ BERT (Bidirectional Encoder Representations from Transformers) applies Transform

## Model

|Model |Download |Download (with sample test data)| ONNX version |Opset version|
| ------------- | ------------- | ------------- | ------------- | ------------- |
|BERT-Squad| [416 MB](model/bertsquad-8.onnx) | [385 MB](model/bertsquad-8.tar.gz) | 1.3 | 8|
|BERT-Squad| [416 MB](model/bertsquad-10.onnx) | [384 MB](model/bertsquad-10.tar.gz) | 1.5 | 10|
|Model |Download |Download (with sample test data)| ONNX version |Opset version| Accuracy|
| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
|BERT-Squad| [416 MB](model/bertsquad-8.onnx) | [385 MB](model/bertsquad-8.tar.gz) | 1.3 | 8| |
|BERT-Squad| [416 MB](model/bertsquad-10.onnx) | [384 MB](model/bertsquad-10.tar.gz) | 1.5 | 10| |
|BERT-Squad| [416 MB](model/bertsquad-12.onnx) | [384 MB](model/bertsquad-12.tar.gz) | 1.9 | 12| 80.67171|
|BERT-Squad-int8| [119 MB](model/bertsquad-12-int8.onnx) | [101 MB](model/bertsquad-12-int8.tar.gz) | 1.9 | 12| 80.43519|
> Compared with the fp32 BERT-Squad, BERT-Squad-int8 has a relative accuracy drop of 0.29% and a 1.81x performance improvement.
>
> Note that performance depends on the test hardware.
>
> Performance data was collected on an Intel® Xeon® Platinum 8280 Processor with 1 socket and 4 cores per instance (1s 4c), CentOS Linux 8.3, and a batch size of 1.
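
The 0.29% figure quoted above can be checked from the accuracy column of the table. The following is a minimal sketch, assuming the drop ratio is the relative difference between the fp32 and int8 accuracies; the helper name `relative_drop_percent` is hypothetical. The size ratio is computed only as an extra illustration from the listed download sizes.

```python
# Accuracy values copied from the Model table (opset 12 rows).
FP32_ACCURACY = 80.67171
INT8_ACCURACY = 80.43519

# Download sizes in MB of the two opset-12 .onnx files, also from the table.
FP32_SIZE_MB = 416
INT8_SIZE_MB = 119


def relative_drop_percent(baseline, quantized):
    """Relative drop of `quantized` versus `baseline`, in percent."""
    return (baseline - quantized) / baseline * 100.0


accuracy_drop = relative_drop_percent(FP32_ACCURACY, INT8_ACCURACY)
size_ratio = FP32_SIZE_MB / INT8_SIZE_MB

print(f"relative accuracy drop: {accuracy_drop:.2f}%")
print(f"model size reduction: {size_ratio:.1f}x")
```

The 1.81x performance figure cannot be derived this way, since it depends on measurements taken on the hardware described above.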
Dependencies
* [tokenization.py](dependencies/tokenization.py)
@@ -110,13 +117,39 @@ Metric is Exact Matching (EM) of 80.7, computed over SQuAD v1.1 dev data, for th
## Training
The model was fine-tuned on the SQuAD-1.1 dataset. See [BertTutorial.ipynb](https://github.com/onnx/tensorflow-onnx/blob/master/tutorials/BertTutorial.ipynb) for more information on converting the model from TensorFlow to ONNX and on fine-tuning.

## Quantization
BERT-Squad-int8 is obtained by quantizing the BERT-Squad model (opset=12). We use [Intel® Neural Compressor](https://github.com/intel/neural-compressor) with the onnxruntime backend to perform quantization. See the [instructions](https://github.com/intel-innersource/frameworks.ai.lpot.intel-lpot/blob/master/examples/onnxrt/onnx_model_zoo/bert-squad/readme.md) for how to use Intel® Neural Compressor for quantization.
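
As a rough illustration of what int8 quantization means numerically, the following is a minimal sketch of generic affine (asymmetric) quantization in plain Python. It is not Intel® Neural Compressor's implementation, which this README does not describe; the function names `quantize_int8` and `dequantize_int8` and the sample values are hypothetical and chosen only for this example.

```python
def quantize_int8(values):
    """Map a list of floats to int8 codes using one scale and zero point."""
    # The representable range must include 0.0 so that zero maps exactly.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    # 255 steps span the int8 range [-128, 127]; fall back to 1.0 if all zeros.
    scale = (hi - lo) / 255.0 or 1.0
    zero_point = round(-128 - lo / scale)
    # Round to the nearest step and clamp into the int8 range.
    codes = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return codes, scale, zero_point


def dequantize_int8(codes, scale, zero_point):
    """Recover approximate float values from int8 codes."""
    return [(c - zero_point) * scale for c in codes]


# Hypothetical sample values standing in for a small weight tensor.
weights = [-1.0, 0.0, 0.5, 2.0]
codes, scale, zero_point = quantize_int8(weights)
restored = dequantize_int8(codes, scale, zero_point)

print("int8 codes:", codes)
print("restored:  ", restored)
```

Each value is stored in one byte instead of four, which is consistent with the smaller int8 download in the table, at the cost of a small rounding error per value.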

### Environment
onnx: 1.9.0
onnxruntime: 1.8.0
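
To confirm that a local environment matches the versions listed above, a small check like the one below may help. This is a sketch using only the Python standard library; the helper name `check_environment` is hypothetical, and it assumes the packages are installed under the distribution names `onnx` and `onnxruntime` (a variant build installed under another name would be reported as missing).

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in the Environment section.
EXPECTED = {"onnx": "1.9.0", "onnxruntime": "1.8.0"}


def check_environment(expected=EXPECTED):
    """Return {package: (expected_version, installed_version_or_None, matches)}."""
    report = {}
    for name, want in expected.items():
        try:
            have = version(name)
        except PackageNotFoundError:
            have = None  # the package is not installed in this environment
        report[name] = (want, have, have == want)
    return report


if __name__ == "__main__":
    for name, (want, have, ok) in check_environment().items():
        print(f"{name}: expected {want}, found {have}, match={ok}")
```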

### Prepare model
```shell
wget https://github.com/onnx/models/raw/master/text/machine_comprehension/bert-squad/model/bertsquad-12.onnx
```

### Quantize model
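```bash
# --input_model takes the path to the model as a *.onnx file.
# (A comment after a trailing backslash would break the line continuation.)
bash run_tuning.sh --input_model=/path/to/model \
                   --output_model=/path/to/model_tune \
                   --dataset_location=/path/to/SQuAD/dataset \
                   --config=bert.yaml
```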

## References
* **BERT** Model from the paper [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/abs/1810.04805)

* [BERT Tutorial](https://github.com/onnx/tensorflow-onnx/blob/master/tutorials/BertTutorial.ipynb)

* [Intel® Neural Compressor](https://github.com/intel/neural-compressor)

## Contributors
[Kundana Pillari](https://github.com/kundanapillari)
* [Kundana Pillari](https://github.com/kundanapillari)
* [mengniwang95](https://github.com/mengniwang95) (Intel)
* [airMeng](https://github.com/airMeng) (Intel)
* [ftian1](https://github.com/ftian1) (Intel)
* [hshen14](https://github.com/hshen14) (Intel)

## License
Apache 2.0
Git LFS file not shown
Git LFS file not shown
3 changes: 3 additions & 0 deletions text/machine_comprehension/bert-squad/model/bertsquad-12.onnx
Git LFS file not shown
Git LFS file not shown
