diff --git a/README.md b/README.md
index 3b219a1..5e3a545 100644
--- a/README.md
+++ b/README.md
@@ -1,36 +1,47 @@
-# Run HuggingFace NER (NLP) Model as Java using ONNX
+# Run HuggingFace NER (NLP) Model on Java using ONNX Runtime and DJL
+An NLP (Natural Language Processing) Java application that detects `names`, `organizations`, and `locations` in text by running Hugging Face's [XLM-RoBERTa NER model](https://huggingface.co/xlm-roberta-large-finetuned-conll03-english) using [ONNX Runtime](https://onnxruntime.ai/docs/get-started/with-java.html) and [Deep Java Library](https://djl.ai/).
-## Download files
+
+
+## Installation:
+Open the project folder in a Java IDE with Gradle support (`Recommended: IntelliJ IDEA Community`) and build the project.
+
+
+### Requirements:
+1. Java Development Kit JDK version: 11
+2. Gradle version 7+
+
+### Download files
These files are required to run the project
1. ONNX model
2. `tokenizer.json` file
-**Convert the model**
+### Convert the model to ONNX
-To convert HuggingFace NER model to ONNX Open this [Google Colaboratory Notebook](https://colab.research.google.com/drive/1kZx9XOnExVfPoAGHhHRUrdQnioiLloBW#revisionId=0BwKss6yztf4KS0NKaWRiQjc0RGRvQkd6ZFp3OUFhR1lTclBNPQ) run the code as below shown image and follow all the steps
+To convert the HuggingFace NER model to ONNX, open this [Google Colaboratory Notebook](https://colab.research.google.com/drive/1kZx9XOnExVfPoAGHhHRUrdQnioiLloBW#revisionId=0BwKss6yztf4KS0NKaWRiQjc0RGRvQkd6ZFp3OUFhR1lTclBNPQ), run the code as shown in the image below, and follow all the steps.
-
+
-(the code for above purpose is also saved in jupyter notebook in the file `convert Huggingface model to ONNX.ipynb`. you can run the code using [Jupyter notebook](https://jupyter.org/install))
+(The code for this conversion is also saved in the Jupyter notebook `convert Huggingface model to ONNX.ipynb`, which you can run using [Jupyter Notebook](https://jupyter.org/install).)
-Tokenzer file `tokenizer.json` was taken from this [huggingface repo](https://huggingface.co/xlm-roberta-large-finetuned-conll03-english)
-Download the `tokenizer.json` from the [link](https://huggingface.co/xlm-roberta-large-finetuned-conll03-english/raw/main/tokenizer.json) and save in `raw-files` directory
+After running either of the above, your ONNX model will be saved in the `onnx/` folder.
+### Download tokenizer.json
+The tokenizer file `tokenizer.json` was taken from this [huggingface repo](https://huggingface.co/xlm-roberta-large-finetuned-conll03-english).
+Download `tokenizer.json` from this [link](https://huggingface.co/xlm-roberta-large-finetuned-conll03-english/raw/main/tokenizer.json).
-## Installation:
-Open Project folder in Java IDE (`Recommended: IntelliJ IDE`) with gradle support and Build the project
+**Move files**
+Copy the files created in the two steps above into the `raw-files` directory, as shown in the image below.
+
-### Requirements:
-1. Java Development Kit JDK version: 11
-2. Gradle version 7+
-### Building project
+## Building project
Build the project using This button

diff --git a/convert Huggingface model to ONNX.ipynb b/convert Huggingface model to ONNX.ipynb
index 0dad0f3..e533854 100644
--- a/convert Huggingface model to ONNX.ipynb
+++ b/convert Huggingface model to ONNX.ipynb
@@ -1,18 +1,4 @@
{
- "nbformat": 4,
- "nbformat_minor": 0,
- "metadata": {
- "colab": {
- "provenance": []
- },
- "kernelspec": {
- "name": "python3",
- "display_name": "Python 3"
- },
- "language_info": {
- "name": "python"
- }
- },
"cells": [
{
"cell_type": "markdown",
@@ -20,7 +6,7 @@
"id": "xJckl99IHePQ"
},
"source": [
- "## Hugging Face to ONNX\n"
+ "## Hugging Face Model to ONNX\n"
]
},
{
@@ -31,7 +17,7 @@
},
"outputs": [],
"source": [
- "!pip install -q transformers[onnx] transformers[sentencepiece] torch"
+ "%pip install -q transformers[onnx] transformers[sentencepiece] torch"
]
},
{
@@ -46,19 +32,10 @@
},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
- "Framework not requested. Using torch to export to ONNX.\n",
- "Downloading: 100% 852/852 [00:00<00:00, 758kB/s]\n",
- "Downloading: 100% 2.24G/2.24G [00:57<00:00, 39.2MB/s]\n",
- "Downloading: 100% 5.07M/5.07M [00:00<00:00, 45.9MB/s]\n",
- "Downloading: 100% 9.10M/9.10M [00:00<00:00, 59.7MB/s]\n",
- "Using framework PyTorch: 1.12.1+cu113\n",
- "Overriding 1 configuration item(s)\n",
- "\t- use_cache -> False\n",
"Validating ONNX model...\n",
- "tcmalloc: large alloc 1073741824 bytes == 0xb3b2000 @ 0x7f47b38afb6b 0x7f47b38cf379 0x7f46ce7575fe 0x7f46cea041e4 0x7f46ce81397c 0x7f46ce813daa 0x7f46ce82b8f1 0x7f46ce82dea9 0x7f46ce38cf26 0x7f46ce358820 0x7f46cea382be 0x5d746e 0x5d813c 0x4ff515 0x49caa1 0x55e571 0x5d7cf1 0x49ca7c 0x55e571 0x5d7cf1 0x5d9487 0x586306 0x5d808f 0x560200 0x55e571 0x5d7cf1 0x49ec69 0x5d7c18 0x49ec69 0x55e571 0x55ef23\n",
"\t-[✓] ONNX model output names match reference model ({'logits'})\n",
"\t- Validating ONNX Model output \"logits\":\n",
"\t\t-[✓] (3, 9, 8) matches (3, 9, 8)\n",
@@ -72,43 +49,27 @@
" --feature=token-classification \\\n",
" --model=xlm-roberta-large-finetuned-conll03-english onnx/"
]
+ }
+ ],
+ "metadata": {
+ "colab": {
+ "provenance": []
},
- {
- "cell_type": "code",
- "source": [
- "!zip -r model.zip onnx"
- ],
- "metadata": {
- "id": "RrFoM3zndFZ0"
- },
- "execution_count": null,
- "outputs": []
+ "kernelspec": {
+ "display_name": "Python 3.8.0 64-bit",
+ "language": "python",
+ "name": "python3"
},
- {
- "cell_type": "markdown",
- "source": [
- "Now download the model file as shown in the below image \n",
- "\n",
- "
\n",
- "\n",
- "\n",
- ""
- ],
- "metadata": {
- "id": "q0atoqIldM6m"
- }
+ "language_info": {
+ "name": "python",
+ "version": "3.8.0"
},
- {
- "cell_type": "markdown",
- "source": [
- "and after extracting it copy onnx folder in to `raw-files` directory in the Java Project Home directory. \n",
- "
\n",
- "\n",
- ""
- ],
- "metadata": {
- "id": "TSVrlfjlka0F"
+ "vscode": {
+ "interpreter": {
+ "hash": "437c6a4c2ab8ad564298253974ee7794b68bcbeea462aed8eaaa05b6d7c57f73"
}
}
- ]
-}
\ No newline at end of file
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
diff --git a/images/model-location.png b/images/model-location.png
new file mode 100644
index 0000000..07e7e7e
Binary files /dev/null and b/images/model-location.png differ