Init paddle-nlp #2112

Merged
merged 211 commits into from
Apr 22, 2019
Changes from 1 commit
Commits
a55440a
init paddle-nlp tools for QA test
chenbjin Apr 5, 2019
c4fc5ea
Fix paragraph extraction bug
Apr 8, 2019
76636fc
Update download links
Apr 8, 2019
13ffb7b
first update LAC README.md
Halfish Apr 8, 2019
d8a536c
rename EmoTect as emotion_detection
chenbjin Apr 8, 2019
797b0d2
download data from bos
Halfish Apr 8, 2019
8cfae7f
Update README.md
Halfish Apr 8, 2019
34c3cc7
Rename project
Apr 8, 2019
b5afc34
second add code
zhangyimi Apr 8, 2019
4df27c1
Merge branch 'paddle-nlp' of github.com:PaddlePaddle/models into padd…
zhangyimi Apr 8, 2019
742d7b9
modify downloads.sh for lac
Halfish Apr 8, 2019
c51ed7e
Merge branch 'paddle-nlp' of https://github.com/PaddlePaddle/models i…
Halfish Apr 8, 2019
c6dcf81
rename LAC to lexical_analysis
Halfish Apr 8, 2019
4a3e311
update lac readme
Halfish Apr 8, 2019
b0ec7ab
Update README.md
zhangyimi Apr 8, 2019
5a55634
Update README.md
zhangyimi Apr 8, 2019
db027f9
Update README.md
zhangyimi Apr 8, 2019
068b6c8
add struct.jpg
zhangyimi Apr 8, 2019
8276d8d
Merge branch 'paddle-nlp' of github.com:PaddlePaddle/models into padd…
zhangyimi Apr 8, 2019
97b3b92
Update README.md
zhangyimi Apr 8, 2019
7b687e0
Update README.md
zhangyimi Apr 8, 2019
c6bc41e
update README
chenbjin Apr 8, 2019
bbef065
Update README.md
zhangyimi Apr 8, 2019
c779830
update emotion_detection README
chenbjin Apr 8, 2019
6c131a3
add download_data.sh and download_model.sh
chenbjin Apr 8, 2019
06ef80e
first commit ADE
luluxing3 Apr 9, 2019
0c16b2f
dialogue_model_toolkit_update
0YuanZhang0 Apr 9, 2019
c3db25b
Merge branch 'paddle-nlp' of https://github.com/PaddlePaddle/models i…
0YuanZhang0 Apr 9, 2019
4958bc4
update emotion_detection model bos url
chenbjin Apr 9, 2019
cb059cd
Merge branch 'paddle-nlp' of https://github.com/PaddlePaddle/models i…
chenbjin Apr 9, 2019
979c307
update README
luluxing3 Apr 9, 2019
47e175e
Merge branch 'paddle-nlp' of github.com:PaddlePaddle/models into padd…
luluxing3 Apr 9, 2019
9c21762
update readme
0YuanZhang0 Apr 9, 2019
7100e3d
Merge branch 'paddle-nlp' of https://github.com/PaddlePaddle/models i…
0YuanZhang0 Apr 9, 2019
87a2068
update readme
0YuanZhang0 Apr 9, 2019
7f336d5
update download file
0YuanZhang0 Apr 9, 2019
35e6fff
first commit DAM
luluxing3 Apr 10, 2019
2038575
Merge branch 'paddle-nlp' of github.com:PaddlePaddle/models into padd…
luluxing3 Apr 10, 2019
b085ea7
add readme
luluxing3 Apr 10, 2019
cf1d166
fix readme
luluxing3 Apr 10, 2019
dea9ea5
fix readme
luluxing3 Apr 10, 2019
6895fe0
fix readme
luluxing3 Apr 10, 2019
342dc37
fix readme
luluxing3 Apr 10, 2019
8e48e42
fix readme
luluxing3 Apr 10, 2019
d50c211
rename
luluxing3 Apr 10, 2019
2b36c9e
rename again
luluxing3 Apr 10, 2019
6751e15
1. add gradient_clip for ernie_lac
Apr 10, 2019
5b1a309
fix download.sh
zhangyimi Apr 10, 2019
db451cb
Merge branch 'paddle-nlp' of github.com:PaddlePaddle/models into padd…
zhangyimi Apr 10, 2019
44c1c13
Rename MRC task
Apr 10, 2019
4458d12
Merge branch 'paddle-nlp' of https://github.com/PaddlePaddle/models i…
Apr 10, 2019
99d8362
fix logger
zhangyimi Apr 10, 2019
e37ff4c
Merge branch 'paddle-nlp' of github.com:PaddlePaddle/models into padd…
zhangyimi Apr 10, 2019
8ba34f2
fix to douban
luluxing3 Apr 11, 2019
0519e9a
Merge branch 'paddle-nlp' of github.com:PaddlePaddle/models into padd…
luluxing3 Apr 11, 2019
d9de355
fix final
luluxing3 Apr 11, 2019
51b1b56
update readme
luluxing3 Apr 11, 2019
cc4544c
update readme
luluxing3 Apr 11, 2019
b41076e
update readme
luluxing3 Apr 11, 2019
e38a9c7
fix batch is null
luluxing3 Apr 11, 2019
9ee6b63
fix typo
luluxing3 Apr 11, 2019
11fd719
fix typo
luluxing3 Apr 11, 2019
5f43165
fix typo
luluxing3 Apr 11, 2019
bb0d090
update ernie config
chenbjin Apr 11, 2019
f7a6ba7
update readme
chenbjin Apr 11, 2019
0436dd3
add AI platform url in readme
chenbjin Apr 11, 2019
a5809d0
update readme subtitlestyle
chenbjin Apr 11, 2019
beae8c6
update
ChinaLiuHao Apr 11, 2019
670b513
Update README.md
ChinaLiuHao Apr 11, 2019
6254425
Update README.md
ChinaLiuHao Apr 11, 2019
72a5e5c
update
ChinaLiuHao Apr 11, 2019
5d902f2
Create README.md
ChinaLiuHao Apr 11, 2019
10fafd3
Update README.md
ChinaLiuHao Apr 11, 2019
8adfb4e
Update README.md
ChinaLiuHao Apr 11, 2019
23cce9a
Update README.md
ChinaLiuHao Apr 11, 2019
15acc84
Update README.md
ChinaLiuHao Apr 11, 2019
0552725
Update README.md
ChinaLiuHao Apr 11, 2019
20bcc96
Update README.md
ChinaLiuHao Apr 11, 2019
15ee96f
Update README.md
ChinaLiuHao Apr 11, 2019
5661bed
Update README.md
ChinaLiuHao Apr 11, 2019
71ec1c8
Update README.md
ChinaLiuHao Apr 11, 2019
40c5902
Update README.md
ChinaLiuHao Apr 11, 2019
1c9827e
Update README.md
ChinaLiuHao Apr 11, 2019
7366794
Update README.md
ChinaLiuHao Apr 11, 2019
0aecd5d
Update README.md
ChinaLiuHao Apr 11, 2019
4511504
Update README.md
ChinaLiuHao Apr 11, 2019
5966ba6
Update README.md
ChinaLiuHao Apr 11, 2019
9480e9c
update batch size
luluxing3 Apr 11, 2019
dcb334e
Merge branch 'paddle-nlp' of github.com:PaddlePaddle/models into padd…
luluxing3 Apr 11, 2019
e6c39f2
adapt to samll data size
luluxing3 Apr 11, 2019
e22a6f2
update ERNIE bcebos url
chenbjin Apr 12, 2019
32bbb7b
add language model
Aurelius84 Apr 12, 2019
16e7e6a
Merge pull request #2035 from Aurelius84/paddle-nlp
phlrain Apr 12, 2019
d87cdec
modify readme
0YuanZhang0 Apr 12, 2019
38f8581
update
ChinaLiuHao Apr 12, 2019
5ee4b0f
update
ChinaLiuHao Apr 12, 2019
b695f7c
Update README.md
ChinaLiuHao Apr 12, 2019
2172734
Update README.md
ChinaLiuHao Apr 12, 2019
7af4cd1
fix readme
luluxing3 Apr 12, 2019
29e112a
fix max_step, update run.sh and run_ernie.sh
chenbjin Apr 14, 2019
69084b8
add finetuned model for lac
Halfish Apr 15, 2019
22e627f
fix bug
Halfish Apr 15, 2019
5097081
Update README.md
Halfish Apr 15, 2019
a1400bf
update
ChinaLiuHao Apr 15, 2019
ebad1a5
Update README.md
zhangyimi Apr 15, 2019
30847e5
add ERNIE pretrained model, and update README
chenbjin Apr 15, 2019
ebfd297
update readme
0YuanZhang0 Apr 16, 2019
b2569d8
add CPU
luluxing3 Apr 16, 2019
cecb2a1
Merge branch 'paddle-nlp' of github.com:PaddlePaddle/models into padd…
luluxing3 Apr 16, 2019
38418ef
update infer in run.sh and run_ernie.sh
chenbjin Apr 16, 2019
d483d9f
Update README.md
ChinaLiuHao Apr 16, 2019
b821d2f
Update README.md
ChinaLiuHao Apr 16, 2019
8522379
Delete test.py
phlrain Apr 16, 2019
012ef61
fix bug
Halfish Apr 16, 2019
f66d3ab
Merge branch 'paddle-nlp' of https://github.com/PaddlePaddle/models i…
Halfish Apr 16, 2019
c3c3343
fix run.sh infer bug & add ernie infer code
Halfish Apr 16, 2019
983f16e
fix cpu mode
0YuanZhang0 Apr 17, 2019
fca3892
Update README.md
phlrain Apr 17, 2019
082f5f7
fix bug for python3
Halfish Apr 17, 2019
a9099f7
Merge branch 'paddle-nlp' of https://github.com/PaddlePaddle/models i…
Halfish Apr 17, 2019
da5d70a
fix CPU and GPU diff result bug
Halfish Apr 17, 2019
2517416
Update README.md
phlrain Apr 17, 2019
5ebf80d
update readme
0YuanZhang0 Apr 17, 2019
744c282
Update run_classifier.py
ChinaLiuHao Apr 17, 2019
cf230b1
Update README.md
Halfish Apr 17, 2019
30ca2d0
Update README.md
zhangyimi Apr 18, 2019
da68ae1
Update README.md
Halfish Apr 18, 2019
4aa709f
Update README.md
zhangyimi Apr 18, 2019
6f4dbe2
Update README.md
zhangyimi Apr 18, 2019
852a73f
Update run.sh
ChinaLiuHao Apr 18, 2019
0e4adaf
Update run_ernie.sh
ChinaLiuHao Apr 18, 2019
bca29e4
modify dir
0YuanZhang0 Apr 18, 2019
2148a45
Merge branch 'paddle-nlp' of https://github.com/PaddlePaddle/models i…
0YuanZhang0 Apr 18, 2019
2fa3a0b
Update README.md
zhangyimi Apr 18, 2019
1182d81
modify dir too
luluxing3 Apr 18, 2019
2a249a7
modify path
luluxing3 Apr 18, 2019
88ec130
Update README.md
ChinaLiuHao Apr 19, 2019
b381ddf
Merge branch 'develop' into paddle-nlp
chenbjin Apr 19, 2019
e29dfeb
PaddleNLP modules backup to old/, rm links-LAC,Senta,SimNet
chenbjin Apr 19, 2019
b3b23bc
mv all modules out of paddle-nlp, rm Senta, auto_dialog_eval, deep_match
chenbjin Apr 19, 2019
803ee68
mv models/classify to models/classification, models/seq_lab to models…
chenbjin Apr 19, 2019
ed829f0
update readme for models/classification
chenbjin Apr 19, 2019
e21a38d
update sentiment_classification and rm README
chenbjin Apr 19, 2019
ca50abd
Add Transformer into paddle-nlp
guoshengCS Apr 19, 2019
56fdfa0
change seq_lab to sequence labeling
Halfish Apr 19, 2019
c1e93b5
Rename old as unarchived in PaddleNLP
guoshengCS Apr 19, 2019
39488d3
Merge pull request #2097 from guoshengCS/paddle-nlp-transformer-new
guoshengCS Apr 19, 2019
e2d9df3
add LARK
Apr 19, 2019
83421c0
Update README, add paddlehub
chenbjin Apr 19, 2019
559572d
add paddlehub
chenbjin Apr 19, 2019
fb12c1a
Add tmp readme
Apr 21, 2019
0c318b0
Update README.md
ChinaLiuHao Apr 22, 2019
3580532
Update README.md
zhangyimi Apr 22, 2019
c24850e
Update README.md
zhangyimi Apr 22, 2019
0c963e9
Update README.md
zhangyimi Apr 22, 2019
bdef090
Update run_ernie.sh
ChinaLiuHao Apr 22, 2019
cb27955
Update run_ernie.sh
ChinaLiuHao Apr 22, 2019
7b53029
Update README.md
zhangyimi Apr 22, 2019
d84cc3c
Update run_ernie_classifier.py
ChinaLiuHao Apr 22, 2019
88b3338
Update README.md
ChinaLiuHao Apr 22, 2019
2a7e141
Update README.md
ChinaLiuHao Apr 22, 2019
cc6f24f
Update run.sh
ChinaLiuHao Apr 22, 2019
d019336
Update run_ernie_classifier.py
ChinaLiuHao Apr 22, 2019
0eaa525
update
Apr 22, 2019
749f6e4
fix chunk_evaluator bug
Halfish Apr 22, 2019
54aa0be
change names
Apr 22, 2019
88a8e02
Update README
Apr 22, 2019
787dd2b
add gitmodules
Apr 22, 2019
df4eafe
Merge branch 'paddle-nlp' of https://github.com/PaddlePaddle/models i…
Apr 22, 2019
fbdb7c6
add install code
Apr 22, 2019
54a8b8a
Update README.md
ChinaLiuHao Apr 22, 2019
3085952
Update README.md
zhangyimi Apr 22, 2019
c93c0fb
Update README.md
zhangyimi Apr 22, 2019
ad89c3d
Update README.md
zhangyimi Apr 22, 2019
c5bde10
Update README.md
zhangyimi Apr 22, 2019
396b2e9
Update README.md
zhangyimi Apr 22, 2019
aa6ba97
Update README.md
Apr 22, 2019
4bcb823
Update README.md
ChinaLiuHao Apr 22, 2019
2dc7a42
Update README.md
ChinaLiuHao Apr 22, 2019
149131c
Update READMEs
Apr 22, 2019
728ab29
Update README.md
Halfish Apr 22, 2019
00305da
Update README.md
Halfish Apr 22, 2019
6561d9d
Update README.md
Halfish Apr 22, 2019
6723417
Update README.md
ChinaLiuHao Apr 22, 2019
5a13676
Update README.md
ChinaLiuHao Apr 22, 2019
4ed856d
Update README.md
Halfish Apr 22, 2019
2e00487
Update README.md
Halfish Apr 22, 2019
49d3a17
README
0YuanZhang0 Apr 22, 2019
37f2a13
Update README.md
ChinaLiuHao Apr 22, 2019
6abc646
update emotion_detection README
chenbjin Apr 22, 2019
c027583
Update README.md
ChinaLiuHao Apr 22, 2019
67be0fc
Update README.md
ChinaLiuHao Apr 22, 2019
922a35f
Update README.md
ChinaLiuHao Apr 22, 2019
d1e4e12
Update README.md
zhangyimi Apr 22, 2019
9b69767
Update README.md
ChinaLiuHao Apr 22, 2019
27a65fe
Update README.md
ChinaLiuHao Apr 22, 2019
e471b0e
Update README.md
ChinaLiuHao Apr 22, 2019
8db94fe
Update README.md
ChinaLiuHao Apr 22, 2019
391c555
Update README.md
zhangyimi Apr 22, 2019
cddc96f
update REAME, add finetune doc
chenbjin Apr 22, 2019
9e50d44
update emotion_detection readme
chenbjin Apr 22, 2019
f91724e
change run.sh
zhangyimi Apr 22, 2019
de9cded
Merge branch 'paddle-nlp' of https://github.com/PaddlePaddle/models i…
zhangyimi Apr 22, 2019
bf27ccd
Update README.md
zhangyimi Apr 22, 2019
700c3d4
Update the link in fluid dir
Apr 22, 2019
89afda0
Merge branch 'paddle-nlp' of upstream into paddle-nlp
Apr 22, 2019
4229c33
update readme
0YuanZhang0 Apr 22, 2019
f178186
Merge branch 'paddle-nlp' of https://github.com/PaddlePaddle/models i…
0YuanZhang0 Apr 22, 2019
b07f15f
update README for markdown style
chenbjin Apr 22, 2019
d291bcb
Update README.md
ChinaLiuHao Apr 22, 2019
26ab375
Update README.md
Apr 22, 2019
fix max_step, update run.sh and run_ernie.sh
chenbjin committed Apr 14, 2019
commit 29e112ab6d9788512c63d53ffa7375e475154093
10 changes: 6 additions & 4 deletions PaddleNLP/paddle-nlp/emotion_detection/README.md
@@ -2,9 +2,9 @@

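Dialogue emotion detection (Emotion Detection, EmoTect for short) focuses on identifying users' emotions in intelligent dialogue scenarios. Given user text from such a scenario, it automatically determines the emotion category of the text and gives a corresponding confidence score. The emotion categories are positive, negative, and neutral.

-Dialogue emotion detection applies to chat, customer service, and other scenarios. It helps companies better gauge dialogue quality and improve the user interaction experience of their products, and it can also be used to analyze customer service quality and reduce the cost of manual quality inspection. Try it online via[AI Open Platform - Dialogue Emotion Detection](http://ai.baidu.com/tech/nlp_apply/emotion_detection).
+Dialogue emotion detection applies to chat, customer service, and other scenarios. It helps companies better gauge dialogue quality and improve the user interaction experience of their products, and it can also be used to analyze customer service quality and reduce the cost of manual quality inspection. Try it online via [AI Open Platform - Dialogue Emotion Detection](http://ai.baidu.com/tech/nlp_apply/emotion_detection).

-For results, we evaluated on Baidu's in-house test sets (chit-chat and customer service) and the nlpcc2014 Weibo emotion dataset; the results are shown in the table below
+For results, we evaluated on Baidu's in-house test sets (chit-chat and customer service) and the nlpcc2014 Weibo emotion dataset; the results are shown in the table below. In addition, we have open-sourced a model that Baidu trained on massive data; after fine-tuning on chat dialogue corpora, this model achieves better results.

| Model | Chit-chat | Customer service | Weibo |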
| :------| :------ | :------ | :------ |
@@ -47,11 +47,11 @@ sh run.sh eval
```shell
sh run.sh train
```
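-After training completes, you can modify the init_checkpoint parameter in ```run.sh``` to run model evaluation and prediction
+After training completes, you can modify the init_checkpoint parameter in ```run.sh``` to select the model from the best step for evaluation and prediction

#### Model prediction

-Based on the pre-trained model, you can run prediction on a new dataset (infer.tsv) to obtain the model's predictions and probabilities
+Run prediction on a new dataset (infer.tsv) to obtain the model's predictions and the probability of each label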
```shell
sh run.sh infer
```
@@ -78,6 +78,7 @@ sh run.sh infer
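Examples of the data used for training, prediction, and evaluation are shown below. The data has two columns separated by a tab ('\t'): the first column is the emotion class (0 for negative, 1 for neutral, 2 for positive), and the second column is Chinese text tokenized with spaces. Files are utf8-encoded.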
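
A minimal sketch of reading this format, assuming the files are tab-separated as described and may begin with a `label` / `text_a` header row. The helper name `read_examples` and the `LABEL_NAMES` mapping are illustrative only and are not part of the repository.

```python
# Hypothetical helper for illustration; not code from the repository.
LABEL_NAMES = {0: "negative", 1: "neutral", 2: "positive"}


def read_examples(lines):
    """Parse tab-separated lines into (label_id, tokens) pairs."""
    examples = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        label, text = line.split("\t", 1)
        if label == "label":
            continue  # skip the header row
        # The text column is already tokenized with single spaces.
        examples.append((int(label), text.split(" ")))
    return examples


sample = [
    "label\ttext_a\n",
    "1\t我 有事 等会儿 就 回来 和 你 聊\n",
    "2\t我 见到 你 很高兴 谢谢 你 帮 我\n",
]
for label_id, tokens in read_examples(sample):
    print(LABEL_NAMES[label_id], len(tokens))
```

To read a real file, the same function can be given an open file handle, for example `open(path, encoding="utf8")`, where `path` is whatever location the data was downloaded to.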

```text
+label text_a
0 谁 骂人 了 ? 我 从来 不 骂人 , 我 骂 的 都 不是 人 , 你 是 人 吗 ?
1 我 有事 等会儿 就 回来 和 你 聊
2 我 见到 你 很高兴 谢谢 你 帮 我
@@ -131,6 +132,7 @@ TASK_DATA_PATH=./data
```
sh run_ernie.sh train
```
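+After training completes, you can modify the init_checkpoint parameter in ```run_ernie.sh``` to select the model from the best step for evaluation and prediction
For detailed training, evaluation, and prediction configuration, see ```run_ernie.sh```

## How to contribute code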
2 changes: 1 addition & 1 deletion PaddleNLP/paddle-nlp/emotion_detection/run.sh
@@ -47,7 +47,7 @@ infer() {
--batch_size 32 \
--data_dir ${DATA_PATH} \
--vocab_path ${VOCAB_PATH} \
- --init_checkpoint ./save_models/textcnn/step_785/ \
+ --init_checkpoint ${CKPT_PATH}/step_785/ \
--config_path ./config.json
}

2 changes: 1 addition & 1 deletion PaddleNLP/paddle-nlp/emotion_detection/run_classifier.py
@@ -182,7 +182,7 @@ def main(args):
epoch=args.epoch)

num_train_examples = processor.get_num_examples(phase="train")
- max_train_steps = args.epoch * num_train_examples // args.batch_size
+ max_train_steps = args.epoch * num_train_examples // args.batch_size + 1

print("Num train examples: %d" % num_train_examples)
print("Max train steps: %d" % max_train_steps)
2 changes: 1 addition & 1 deletion PaddleNLP/paddle-nlp/emotion_detection/run_ernie.sh
@@ -51,7 +51,7 @@ infer() {
--verbose true \
--do_infer true \
--batch_size 32 \
- --init_checkpoint ${MODEL_PATH}/params \
+ --init_checkpoint ${CKPT_PATH}/step_943 \
--infer_set ${TASK_DATA_PATH}/infer.tsv \
--vocab_path ${MODEL_PATH}/vocab.txt \
--max_seq_len 64 \
@@ -185,7 +185,7 @@ def main(args):

num_train_examples = reader.get_num_examples(args.train_set)

- max_train_steps = args.epoch * num_train_examples // args.batch_size // dev_count
+ max_train_steps = args.epoch * num_train_examples // args.batch_size // dev_count + 1

print("Device count: %d" % dev_count)
print("Num train examples: %d" % num_train_examples)
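
The effect of the `+ 1` in the two `max_train_steps` changes can be checked with a small sketch. The function below is a hypothetical stand-alone helper, not code from the repository; it reproduces the integer arithmetic of both scripts, with `dev_count=1` corresponding to the single-device formula in `run_classifier.py`. The `plus_one` flag is an assumption added here only to compare the old and new behavior.

```python
import math


def max_train_steps(epoch, num_train_examples, batch_size, dev_count=1, plus_one=True):
    """Mirror the step-count arithmetic from the diff (illustrative helper)."""
    # Floor division discards any final partial batch.
    steps = epoch * num_train_examples // batch_size // dev_count
    return steps + 1 if plus_one else steps


# 3 epochs over 100 examples with batch size 32: 300 / 32 = 9.375 batches.
print(max_train_steps(3, 100, 32, plus_one=False))  # 9, the partial batch is dropped
print(max_train_steps(3, 100, 32))                  # 10
print(math.ceil(3 * 100 / 32))                      # 10, same as the new formula here

# When the total divides evenly, "+ 1" gives one more step than the ceiling.
print(max_train_steps(2, 64, 32))                   # 5
print(math.ceil(2 * 64 / 32))                       # 4
```

When the example count is not a multiple of the batch size, the new formula matches the ceiling; when it divides evenly, the bound is one step larger. Whether that extra step matters depends on how the training loop uses `max_train_steps`, which this diff does not show.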