Commit
update entity numbers, remove multilingual entity models (#6215)
ninggao authored Feb 22, 2021
1 parent 516f8d2 commit 3542ce4
Showing 2 changed files with 5 additions and 45 deletions.
26 changes: 5 additions & 21 deletions Orchestrator/docs/NLRModels.md
@@ -52,16 +52,6 @@ This is a yet another high quality EN-only base model for entity extraction.
It is a 12-layer pretrained pretrained [Transformer][7] model optimized for conversation.
Its architecture is pretrained for example-based use ([KNN][3]), thus it can be used out of box.

### pretrained.20210105.microsoft.dte.00.12.bert_example_ner_multilingual.onnx (experimental)
This is a high quality multilingual base model for entity extraction.
It is a 12-layer pretrained pretrained [Transformer][7] model optimized for conversation.
Its architecture is pretrained for example-based use ([KNN][3]), thus it can be used out of box.

### pretrained.20210105.microsoft.dte.00.12.tulr_example_ner_multilingual.onnx (experimental)
This is a high quality multilingual base model for entity extraction.
It is a 12-layer pretrained pretrained [Transformer][7] model optimized for conversation.
Its architecture is pretrained for example-based use ([KNN][3]), thus it can be used out of box.

### pretrained.20210205.microsoft.dte.00.06.bert_example_ner.en.onnx (experimental)
This is a high quality EN-only base model for entity extraction. It's smaller and faster than its 12-layer alternative.
It is a 6-layer pretrained pretrained [Transformer][7] model optimized for conversation.
@@ -72,16 +62,6 @@ This is a high quality EN-only base model for entity extraction. It's smaller an
It is a 6-layer pretrained pretrained [Transformer][7] model optimized for conversation.
Its architecture is pretrained for example-based use ([KNN][3]), thus it can be used out of box.

### pretrained.20210205.microsoft.dte.00.06.bert_example_ner_multilingual.onnx (experimental)
This is a high quality multilingual base model for entity extraction. It's smaller and faster than its 12-layer alternative.
It is a 6-layer pretrained pretrained [Transformer][7] model optimized for conversation.
Its architecture is pretrained for example-based use ([KNN][3]), thus it can be used out of box.

### pretrained.20210205.microsoft.dte.00.06.tulr_example_ner_multilingual.onnx (experimental)
This is a high quality multilingual base model for entity extraction. It's smaller and faster than its 12-layer alternative.
It is a 6-layer pretrained pretrained [Transformer][7] model optimized for conversation.
Its architecture is pretrained for example-based use ([KNN][3]), thus it can be used out of box.

## Models Evaluation
For a more quantitative comparison analysis of the different models see the following performance characteristics.

@@ -136,13 +116,17 @@ For a more quantitative comparison analysis of the different models see the foll
| ------------------------------------------------------------ | ---------- | ------ | ----------------------- | --------------- |
| pretrained.20210205.microsoft.dte.00.06.bert_example_ner.en.onnx | BERT | 6 | ~ 23 ms | 259M |
| pretrained.20210205.microsoft.dte.00.12.bert_example_ner.en.onnx | BERT | 12 | ~ 40 ms | 427M |
| pretrained.20210218.microsoft.dte.00.06.bert_example_ner.en.onnx | BERT | 6 | ~ 23 ms | 259M |
| pretrained.20210218.microsoft.dte.00.12.bert_example_ner.en.onnx | BERT | 12 | ~ 40 ms | 425M |

- The following table shows how accurate is each model relative to provided training sample size using [Snips NLU][4] system, evaluated by **macro-average-F1**.

| Training samples per entity type | 10 | 20 | 50 | 100 | 200 |
| ------------------------------------------------------------ | ----- | ----- | ----- | ----- | ----- |
| pretrained.20210205.microsoft.dte.00.06.bert_example_ner.en.onnx | 0.662 | 0.678 | 0.680 | 0.684 | 0.674 |
| pretrained.20210205.microsoft.dte.00.06.bert_example_ner.en.onnx | 0.615 | 0.636 | 0.647 | 0.661 | 0.665 |
| pretrained.20210205.microsoft.dte.00.12.bert_example_ner.en.onnx | 0.637 | 0.658 | 0.684 | 0.698 | 0.702 |
| pretrained.20210218.microsoft.dte.00.06.bert_example_ner.en.onnx | 0.637 | 0.658 | 0.673 | 0.686 | 0.684 |
| pretrained.20210218.microsoft.dte.00.12.bert_example_ner.en.onnx | 0.661 | 0.664 | 0.670 | 0.685 | 0.681 |
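
The accuracy table above reports **macro-average-F1**. As a rough illustration of what that metric means, the sketch below computes an F1 score per entity type and then takes the unweighted mean. The evaluation script behind the table is not part of this diff, so the entity type names, the counts, and the helper names `f1` and `macro_f1` are all hypothetical.

```python
def f1(tp, fp, fn):
    """F1 for one entity type from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(counts):
    """Unweighted mean of per-type F1, so rare entity types count as much as common ones."""
    scores = [f1(tp, fp, fn) for tp, fp, fn in counts.values()]
    return sum(scores) / len(scores)


# Hypothetical (tp, fp, fn) counts per entity type, for illustration only.
counts = {
    "city": (8, 2, 2),
    "date": (5, 5, 0),
    "artist": (0, 0, 4),
}

score = macro_f1(counts)
print(round(score, 3))
```

Because every entity type contributes equally, a type the model never predicts correctly (here `artist`) pulls the average down regardless of how few examples it has.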



24 changes: 0 additions & 24 deletions Orchestrator/v0.2/nlr_versions.json
@@ -41,18 +41,6 @@
"description": "Bot Framework SDK release 4.10 - Multilingual ONNX V1.4 12-layer per-token intent base model",
"minSDKVersion": "4.10.0"
},
"pretrained.20210105.microsoft.dte.00.12.bert_example_ner_multilingual.onnx": {
"releaseDate": "01/05/2021",
"modelUri": "https://models.botframework.com/models/dte/onnx/pretrained.20210105.microsoft.dte.00.12.bert_example_ner_multilingual.onnx.zip",
"description": "(experimental) Bot Framework SDK release 4.10 - Multilingual ONNX V1.4 12-layer per-token entity base model",
"minSDKVersion": "4.10.0"
},
"pretrained.20210105.microsoft.dte.00.12.tulr_example_ner_multilingual.onnx": {
"releaseDate": "01/05/2021",
"modelUri": "https://models.botframework.com/models/dte/onnx/pretrained.20210105.microsoft.dte.00.12.tulr_example_ner_multilingual.onnx.zip",
"description": "(experimental) Bot Framework SDK release 4.10 - Multilingual ONNX V1.4 12-layer per-token entity base model",
"minSDKVersion": "4.10.0"
},
"pretrained.20210205.microsoft.dte.00.06.bert_example_ner.en.onnx": {
"releaseDate": "02/05/2021",
"modelUri": "https://models.botframework.com/models/dte/onnx/pretrained.20210205.microsoft.dte.00.06.bert_example_ner.en.onnx.zip",
@@ -70,18 +58,6 @@
"modelUri": "https://models.botframework.com/models/dte/onnx/pretrained.20210205.microsoft.dte.00.06.unicoder_multilingual.onnx.zip",
"description": "Bot Framework SDK release 4.10 - Multilingual ONNX V1.4 6-layer per-token intent base model",
"minSDKVersion": "4.10.0"
},
"pretrained.20210205.microsoft.dte.00.06.bert_example_ner_multilingual.onnx": {
"releaseDate": "02/05/2021",
"modelUri": "https://models.botframework.com/models/dte/onnx/pretrained.20210205.microsoft.dte.00.06.bert_example_ner_multilingual.onnx.zip",
"description": "(experimental) Bot Framework SDK release 4.10 - Multilingual ONNX V1.4 6-layer per-token entity base model",
"minSDKVersion": "4.10.0"
},
"pretrained.20210205.microsoft.dte.00.06.tulr_example_ner_multilingual.onnx": {
"releaseDate": "02/05/2021",
"modelUri": "https://models.botframework.com/models/dte/onnx/pretrained.20210205.microsoft.dte.00.06.tulr_example_ner_multilingual.onnx.zip",
"description": "(experimental) Bot Framework SDK release 4.10 - Multilingual ONNX V1.4 6-layer per-token entity base model",
"minSDKVersion": "4.10.0"
}
}
}
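
For readers who want to inspect entries of this shape programmatically, here is a minimal sketch that summarizes two of the entries removed by this commit. The field names and values are copied from the diff above. The top-level key that wraps these entries in `nlr_versions.json` is not visible here, so only the inner name-to-metadata mapping is modeled. The `summarize` helper is hypothetical, and reading `releaseDate` as month/day/year is an assumption based on the dates embedded in the model names.

```python
from datetime import datetime

# Two entries copied from the removed lines of this diff.
removed_entries = {
    "pretrained.20210105.microsoft.dte.00.12.bert_example_ner_multilingual.onnx": {
        "releaseDate": "01/05/2021",
        "modelUri": "https://models.botframework.com/models/dte/onnx/pretrained.20210105.microsoft.dte.00.12.bert_example_ner_multilingual.onnx.zip",
        "description": "(experimental) Bot Framework SDK release 4.10 - Multilingual ONNX V1.4 12-layer per-token entity base model",
        "minSDKVersion": "4.10.0",
    },
    "pretrained.20210205.microsoft.dte.00.06.tulr_example_ner_multilingual.onnx": {
        "releaseDate": "02/05/2021",
        "modelUri": "https://models.botframework.com/models/dte/onnx/pretrained.20210205.microsoft.dte.00.06.tulr_example_ner_multilingual.onnx.zip",
        "description": "(experimental) Bot Framework SDK release 4.10 - Multilingual ONNX V1.4 6-layer per-token entity base model",
        "minSDKVersion": "4.10.0",
    },
}


def summarize(name, entry):
    """Hypothetical helper: flatten one manifest entry into a small summary dict."""
    # Assumes MM/DD/YYYY, which matches the YYYYMMDD stamp in each model name.
    released = datetime.strptime(entry["releaseDate"], "%m/%d/%Y").date()
    return {
        "name": name,
        "released": released.isoformat(),
        "experimental": entry["description"].startswith("(experimental)"),
        "min_sdk": tuple(int(part) for part in entry["minSDKVersion"].split(".")),
    }


summaries = [summarize(name, entry) for name, entry in removed_entries.items()]
for item in sorted(summaries, key=lambda s: s["released"]):
    print(item["released"], item["name"], item["min_sdk"])
```

Parsing `minSDKVersion` into a tuple of integers makes version comparisons numeric, so that `4.10.0` sorts after `4.9.0` rather than before it as it would under plain string comparison.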
