This repository has been archived by the owner on Nov 16, 2023. It is now read-only.

Staging to master #506

Merged: 29 commits, from staging into master, on Dec 3, 2019.

Commits (29 total; the file diffs below show the changes from 2 of them)
342bf36
update mlflow version to match the other azureml versions
miguelgfierro Nov 20, 2019
e91b9ef
Update generate_conda_file.py
miguelgfierro Nov 21, 2019
00d9ca0
added temporary
miguelgfierro Nov 21, 2019
9311727
Merge pull request #483 from microsoft/miguel/temporary
miguelgfierro Nov 22, 2019
f928b0d
Merge pull request #481 from microsoft/miguelgfierro-patch-1
Nov 25, 2019
2f9bfad
doc: update github url references
Nov 25, 2019
c8abcbe
docs: update nlp recipes references
Nov 25, 2019
99d00d4
Minor bug fix for text classification of multi languages notebook
kehuangms Nov 25, 2019
d71de4a
remove bert and xlnet notebooks
saidbleik Nov 25, 2019
3d7c037
Merge pull request #490 from microsoft/emawa/docs/update-nlp-references
saidbleik Nov 25, 2019
c3528d5
Merge pull request #493 from microsoft/kehuan
saidbleik Nov 25, 2019
7df12d8
Merge pull request #494 from microsoft/transformers2
saidbleik Nov 25, 2019
b0dc696
remove obsolete tests and links
saidbleik Nov 26, 2019
0b4b256
Add missing tmp directories.
hlums Nov 26, 2019
a39143f
fix import error and max_nodes for the cluster
daden-ms Nov 27, 2019
e578682
Merge pull request #497 from microsoft/transformers2
miguelgfierro Nov 27, 2019
bc41256
Merge pull request #499 from microsoft/daden/issue496
miguelgfierro Nov 27, 2019
d13cce1
Minor edits.
hlums Nov 27, 2019
6c2ab2a
Attempt to fix test device error.
hlums Nov 27, 2019
4b13b9d
Temporarily pin transformers version
hlums Nov 27, 2019
3e72fb0
Remove gpu tags temporarily
hlums Nov 27, 2019
40ae2b7
Test whether device error also occurs for SequenceClassifier.
hlums Nov 27, 2019
321032e
Revert temporary changes.
hlums Nov 27, 2019
3bb5cce
Revert temporary changes.
hlums Nov 27, 2019
857ce5c
Merge pull request #498 from microsoft/hlu/fix_temp_directories
miguelgfierro Nov 28, 2019
25b6643
Merge pull request #500 from microsoft/hlu/temporary_test_fix
miguelgfierro Nov 28, 2019
75e6eb9
update: major release version to 2.0.0
Dec 3, 2019
3921507
Merge pull request #505 from microsoft/emawa/update-feature-release-n…
saidbleik Dec 3, 2019
afbd86a
Merge branch 'master' into staging
miguelgfierro Dec 3, 2019
README.md: 2 changes (2 additions, 0 deletions)

@@ -85,6 +85,8 @@ The following is a list of related repositories that we like and think are useful
|[AzureML-BERT](https://github.com/Microsoft/AzureML-BERT)|End-to-end recipes for pre-training and fine-tuning BERT using Azure Machine Learning service.|
|[MASS](https://github.com/microsoft/MASS)|MASS: Masked Sequence to Sequence Pre-training for Language Generation.|
|[MT-DNN](https://github.com/namisan/mt-dnn)|Multi-Task Deep Neural Networks for Natural Language Understanding.|
|[UniLM](https://github.com/microsoft/unilm)|Unified Language Model Pre-training.|



## Build Status
examples/text_classification/README.md: 3 changes (0 additions, 3 deletions)

@@ -19,8 +19,5 @@ The following summarizes each notebook for Text Classification. Each notebook pr…
|Notebook|Environment|Description|Dataset|
|---|---|---|---|
|[BERT for text classification on AzureML](tc_bert_azureml.ipynb) |Azure ML|A notebook which walks through fine-tuning and evaluating pre-trained BERT model on a distributed setup with AzureML. |[MultiNLI](https://www.nyu.edu/projects/bowman/multinli/)|
|[XLNet for text classification with MNLI](tc_mnli_xlnet.ipynb)|Local| A notebook which walks through fine-tuning and evaluating a pre-trained XLNet model on a subset of the MultiNLI dataset|[MultiNLI](https://www.nyu.edu/projects/bowman/multinli/)|
|[BERT for text classification of Hindi BBC News](tc_bbc_bert_hi.ipynb)|Local| A notebook which walks through fine-tuning and evaluating a pre-trained BERT model on Hindi BBC news data|[BBC Hindi News](https://github.com/NirantK/hindi2vec/releases/tag/bbc-hindi-v0.1)|
|[BERT for text classification of Arabic News](tc_dac_bert_ar.ipynb)|Local| A notebook which walks through fine-tuning and evaluating a pre-trained BERT model on Arabic news articles|[DAC](https://data.mendeley.com/datasets/v524p5dhpj/2)|
|[Text Classification of MultiNLI Sentences using Multiple Transformer Models](tc_mnli_transformers.ipynb)|Local| A notebook which walks through fine-tuning and evaluating a number of pre-trained transformer models|[MultiNLI](https://www.nyu.edu/projects/bowman/multinli/)|
|[Text Classification of Multi Language Datasets using Transformer Model](tc_multi_languages_transformers.ipynb)|Local|A notebook which walks through fine-tuning and evaluating a pre-trained transformer model for multiple datasets in different languages|[MultiNLI](https://www.nyu.edu/projects/bowman/multinli/) <br> [BBC Hindi News](https://github.com/NirantK/hindi2vec/releases/tag/bbc-hindi-v0.1) <br> [DAC](https://data.mendeley.com/datasets/v524p5dhpj/2)|
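These notebooks are parameterized, and the integration tests in this PR drive them with papermill, then read metrics back with scrapbook. Below is a minimal sketch of that flow outside the test harness; the notebook path and the QUICK_RUN/USE_DATASET parameters mirror the test further down, but the exact parameter set accepted by any given notebook is an assumption here.

import papermill as pm
import scrapbook as sb

# Execute a parameterized copy of the notebook. Parameter names mirror the
# integration test below; whether other notebooks accept the same set is assumed.
pm.execute_notebook(
    "tc_multi_languages_transformers.ipynb",
    "output.ipynb",
    kernel_name="python3",
    parameters={"QUICK_RUN": True, "USE_DATASET": "dac"},
)

# Metrics the notebook recorded with scrapbook come back as a plain dict.
result = sb.read_notebook("output.ipynb").scraps.data_dict
print(result.get("precision"), result.get("f1"))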
tests/integration/test_notebooks_text_classification.py: 50 changes (2 additions, 48 deletions)

@@ -37,50 +37,6 @@ def test_tc_mnli_transformers(notebooks, tmp):
    assert pytest.approx(result["f1"], 0.89, abs=ABS_TOL)


@pytest.mark.gpu
@pytest.mark.integration
def test_tc_dac_bert_ar(notebooks, tmp):
    notebook_path = notebooks["tc_dac_bert_ar"]
    pm.execute_notebook(
        notebook_path,
        OUTPUT_NOTEBOOK,
        kernel_name=KERNEL_NAME,
        parameters=dict(
            NUM_GPUS=1,
            DATA_FOLDER=tmp,
            BERT_CACHE_DIR=tmp,
            MAX_LEN=175,
            BATCH_SIZE=16,
            NUM_EPOCHS=1,
            TRAIN_SIZE=0.8,
            NUM_ROWS=8000,
            RANDOM_STATE=0,
        ),
    )
    result = sb.read_notebook(OUTPUT_NOTEBOOK).scraps.data_dict
    assert pytest.approx(result["accuracy"], 0.871, abs=ABS_TOL)
    assert pytest.approx(result["precision"], 0.865, abs=ABS_TOL)
    assert pytest.approx(result["recall"], 0.852, abs=ABS_TOL)
    assert pytest.approx(result["f1"], 0.845, abs=ABS_TOL)


@pytest.mark.gpu
@pytest.mark.integration
def test_tc_bbc_bert_hi(notebooks, tmp):
    notebook_path = notebooks["tc_bbc_bert_hi"]
    pm.execute_notebook(
        notebook_path,
        OUTPUT_NOTEBOOK,
        kernel_name=KERNEL_NAME,
        parameters=dict(NUM_GPUS=1, DATA_FOLDER=tmp, BERT_CACHE_DIR=tmp, NUM_EPOCHS=1),
    )
    result = sb.read_notebook(OUTPUT_NOTEBOOK).scraps.data_dict
    assert pytest.approx(result["accuracy"], 0.71, abs=ABS_TOL)
    assert pytest.approx(result["precision"], 0.25, abs=ABS_TOL)
    assert pytest.approx(result["recall"], 0.28, abs=ABS_TOL)
    assert pytest.approx(result["f1"], 0.26, abs=ABS_TOL)
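For context on where the `result` values in these tests come from: each notebook records metrics with scrapbook's `glue`, which is what `sb.read_notebook(...).scraps.data_dict` reads back. A minimal sketch of the notebook side, with metric names taken from the assertions above and placeholder values standing in for the real computation:

import scrapbook as sb

# Stand-ins for metrics computed earlier in the notebook (assumed names and
# placeholder values, not real results).
accuracy, precision, recall, f1 = 0.87, 0.86, 0.85, 0.84

# Each sb.glue call persists a value in the executed notebook's output so the
# test harness can retrieve it via scraps.data_dict.
sb.glue("accuracy", accuracy)
sb.glue("precision", precision)
sb.glue("recall", recall)
sb.glue("f1", f1)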


@pytest.mark.integration
@pytest.mark.azureml
@pytest.mark.gpu
@@ -118,6 +74,7 @@ def test_tc_bert_azureml(
    if os.path.exists("outputs"):
        shutil.rmtree("outputs")


@pytest.mark.gpu
@pytest.mark.integration
def test_multi_languages_transformer(notebooks, tmp):
@@ -126,10 +83,7 @@ def test_multi_languages_transformer(notebooks, tmp):
        notebook_path,
        OUTPUT_NOTEBOOK,
        kernel_name=KERNEL_NAME,
-       parameters={
-           "QUICK_RUN": True,
-           "USE_DATASET": "dac"
-       },
+       parameters={"QUICK_RUN": True, "USE_DATASET": "dac"},
    )
    result = sb.read_notebook(OUTPUT_NOTEBOOK).scraps.data_dict
    assert pytest.approx(result["precision"], 0.94, abs=ABS_TOL)
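A side note on the assertion idiom used throughout this file: in `pytest.approx(result["f1"], 0.89, abs=ABS_TOL)` the second positional argument is the relative tolerance, and because the returned approx object is truthy, a bare `assert` on it cannot fail (at least in pytest releases contemporary with this PR). The conventional form compares the actual value to `pytest.approx(expected)` with `==`; the sketch below shows that form and is not part of this PR's changes.

import pytest

ABS_TOL = 0.05  # assumed value; the module's real tolerance is defined elsewhere

def check_metrics(result):
    # pytest.approx wraps the expected value; the == comparison applies the tolerance.
    assert result["accuracy"] == pytest.approx(0.94, abs=ABS_TOL)
    assert result["f1"] == pytest.approx(0.89, abs=ABS_TOL)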