Merge pull request #202 from mlrun/0.9.x-dev

0.9.x dev
mlrun · Dec 13, 2021 · 3107de7
2 parents fad845a + a3adca5
commit 3107de7
Showing 9 changed files with 467 additions and 57 deletions.
diff --git a/README.md b/README.md
@@ -52,7 +52,7 @@ To run the MLRun demos, first do the following:
 ## Mask Detection Demo
 
 The [Mask detection](./mask-detection/README.md) demo is a three-notebook demo in which we:
-1. **Train and evaluate** a model for detecting whether a person is wearing a mask in an image using Tensorflow.Keras or PyTorch (coming soon).
+1. **Train and evaluate** a model for detecting whether a person is wearing a mask in an image using Tensorflow.Keras or PyTorch.
 2. **Serve** the model as a serverless function behind an HTTP endpoint.
 3. Write an **automatic pipeline** where we download a dataset of images, train and evaluate, optimize the model (using ONNX) and serve it.
 

diff --git a/mask-detection/1-training-and-evaluation.ipynb b/mask-detection/1-training-and-evaluation.ipynb
@@ -20,7 +20,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Before we continue, we need to setup some requirements:"
+    "Before we continue, we need to install MLRun and the framework of choice (comment out the framework you are not using and uncomment the one you wish to use):"
    ]
   },
   {
@@ -30,8 +30,14 @@
    "outputs": [],
    "source": [
     "!pip install mlrun\n",
-    "!pip install -U tensorflow==2.4.1\n",
-    "!pip install -U typing-extensions"
+    "!pip install -U typing-extensions\n",
+    "\n",
+    "########## For TF.Keras: ##########\n",
+    "!pip install -U tensorflow==2.4.4\n",
+    "\n",
+    "########## For PyTorch:  ##########\n",
+    "# !pip install -U torch==1.10\n",
+    "# !pip install -U torchvision==0.11.1"
    ]
   },
   {
@@ -90,7 +96,7 @@
     "\n",
     "### 2.1. Import a Function\n",
     "\n",
-    "We will download the images using `open_archive` - a function from MLRun's functions marketplace. We will import the fucntion using `mlrun.import_function` and describe it to get the function's documentation:"
+    "We will download the images using `open_archive` - a function from MLRun's functions marketplace. We will import the function using `mlrun.import_function` and describe it to get the function's documentation:"
    ]
   },
   {
@@ -412,7 +418,8 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "framework = \"tf-keras\""
+    "framework = \"tf-keras\"\n",
+    "# framework = \"pytorch\""
    ]
   },
   {
@@ -421,20 +428,58 @@
    "source": [
     "### TF.Keras\n",
     "\n",
-    "The code is taken from the python file [training-and-evaluation.py](tf-keras/training-and-evaluation.py). It is classic and straightforward, we: \n",
+    "The code is taken from the python file [training-and-evaluation.py](tf-keras/training-and-evaluation.py), which is classic and straightforward. We: \n",
     "1. Use `_get_datasets` to get the training and validation datasets (on evaluation - the evaluation dataset).\n",
-    "2. Use `_get_model` to build our classifier - simple transfer learning from MobileNetV2.\n",
+    "2. Use `_get_model` to build our classifier - simple transfer learning from MobileNetV2 (`keras.applications`).\n",
     "3. Call `train` to train the model.\n",
     "4. Call `evaluate` to evaluate the model.\n",
     "\n",
-    "Taking this code one step further is **MLRun**'s framework for `tf.keras`: \n",
+    "Taking this code one step further is **MLRun**'s framework for `tf.keras`:\n",
     "\n",
     "```python\n",
     "# Apply MLRun's interface for tf.keras:\n",
     "mlrun_tf_keras.apply_mlrun(model=model, context=context, ...)\n",
     "```\n",
     "\n",
-    "With just one line of code, it seamlessly provides:\n",
+    "With just one line of code, it seamlessly provides **automatic logging** (for both MLRun and Tensorboard) and **distributed training** by wrapping the `fit` and `evaluate` methods of `tf.keras.Model`.\n",
+    "\n",
+    "In addition, in the `evaluate` method code, we use the `TFKerasModelHandler` class. This class supports loading, saving and logging `tf.keras` models with ease, enabling easy versioning of the model and its results, artifacts and custom objects."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### PyTorch\n",
+    "\n",
+    "The code is taken from the python file [training-and-evaluation.py](pytorch/training-and-evaluation.py), which is classic and straightforward. We:\n",
+    "1. Use `_get_datasets` to get the training and validation datasets (on evaluation - the evaluation dataset). The function initializes a `MaskDetectionDataset` to handle our images.\n",
+    "2. Initialize our `MaskDetector` classifier class - a simple transfer learning from MobileNetV2 (`torchvision.models`).\n",
+    "3. Call `train` to train the model.\n",
+    "4. Call `evaluate` to evaluate the model.\n",
+    "\n",
+    "Taking this code one step further is **MLRun**'s framework for `torch`:\n",
+    "\n",
+    "```python\n",
+    "import mlrun.frameworks.pytorch as mlrun_torch\n",
+    "```\n",
+    "\n",
+    "`mlrun_torch` provides what we call \"shortcut functions\" for using PyTorch with ease:\n",
+    "* `train` - Training a model.\n",
+    "* `evaluate` - Evaluating a model.\n",
+    "\n",
+    "Both functions enable **automatic logging** (for both MLRun and Tensorboard) and **distributed training** by simply passing the following parameters: `auto_log: bool` and `use_horovod: bool`.\n",
+    "\n",
+    "In addition, you can choose to use our classes directly:\n",
+    "* `PyTorchMLRunInterface` - the interface for training, evaluating and predicting with a PyTorch model. Our code is highly generic and should fit any type of model.\n",
+    "* `CallbackHandler` - if you wish to use your own training code, you can get automatic logging by using our callback mechanism through this class.\n",
+    "* `PyTorchModelHandler` - supports loading, saving and logging `torch` models with ease, enabling easy versioning of the model and its results, artifacts and custom objects."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "Both **TF.Keras** and **PyTorch** have the same features regarding MLRun's automatic logging and distributed training orchestration:\n",
     "* **Automatic logging**: auto-log your training and model to both **Tensorboard** and **MLRun**. Additional settings can be passed onto this method to gain extra logging capabilities, like:\n",
     "  * Weights histograms and distributions\n",
     "  * Weights statistics\n",
@@ -443,22 +488,25 @@
     "  * Logging frequency and more\n",
     "* **Distributed training with Horovod**: Horovod will be initialized and used automatically if the MLRun Function's `kind` attribute is equal to `\"mpijob\"`, there won't be any additional changes needed to the original code! More on that later in [section 6](#section_6)\n",
     "\n",
-    "In addition, in the `evaluate` method code, we use the `mlrun.frameworks.tf_keras.TFKerasModelHandler` class. This class supports loading, saving and logging `tf.keras` models with ease, enabling easy versioning of the model and his results, artifacts and custom objects.\n",
-    "\n",
     "We suggest reading the documentation for further use, or like in this example, use the default settings."
-   ]
+   ],
+   "metadata": {
+    "collapsed": false
+   }
   },
   {
    "cell_type": "markdown",
-   "metadata": {},
    "source": [
     "<a id=\"section_4\"></a>\n",
     "## 4. Create the MLRun Function\n",
     "\n",
     "We will use MLRun's `mlrun.code_to_function` to create a MLRun Function from our code in the above mentioned python file. Notice our MLRun Function will have two handlers: `train` and `evaluate`.\n",
     "\n",
     "We wish to run the training first as a Job, so we will set the `kind` parameter to `\"job\"`."
-   ]
+   ],
+   "metadata": {
+    "collapsed": false
+   }
   },
   {
    "cell_type": "code",
@@ -1082,7 +1130,7 @@
     "<a id=\"section_6\"></a>\n",
     "## 6. Run Distributed Training Using Horovod\n",
     "\n",
-    "Now we can see the second benefit of MLRun, we can **distribute** our model **training** across **multiple workers** (i.e., perform distributed training), assign **GPUs**, and more. We don't need to bother with Dockerfiles or K8s YAML configuration files — MLRun does all of this for us. All is needed to be done, is create our function with `kind=\"mpijob\"`.\n",
+    "Now we can see the second benefit of MLRun: we can **distribute** our model **training** across **multiple workers** (i.e., perform distributed training), assign **GPUs**, and more. We don't need to bother with Dockerfiles or K8s YAML configuration files, because MLRun does all of this for us. We will simply create our function with `kind=\"mpijob\"`.\n",
     "\n",
     "> **Notice**: for this demo, in order to use GPUs in training, set the `use_gpu` variable to `True`. This will later assign the required configurations to use the GPUs and pass the correct image to support GPUs (image with CUDA libraries)."
    ]
@@ -1149,7 +1197,7 @@
    "source": [
     "Call run, and notice each epoch is shorter as we now have 2 workers instead of 1. As the 2 workers will print a lot of outputs we would rather wait for completion and then show the results. For that, we will pass `watch=False` and use the run objects function `wait_for_completion` and `show`. \n",
     "\n",
-    "In order to see the logs, you are welcome to go into the UI by clicking the blue hyperlink \"<span style=\"color:blue\">**click here**</span>\" after running the function and see the logs there:"
+    "To see the logs, you can go into the UI by clicking the blue hyperlink \"<span style=\"color:blue\">**click here**</span>\" after running the function:"
    ]
   },
   {
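
The install cell in this notebook switches frameworks by commenting and uncommenting `pip` lines. As a minimal sketch of the same choice driven by the `framework` variable, the pins from the hunk above can be kept in one mapping. The helper name `requirements_for` and the two constants are hypothetical and are not part of the notebook or of MLRun.

```python
from typing import Dict, List

# Version pins copied from the notebook's install cell in this diff.
COMMON_REQUIREMENTS: List[str] = ["mlrun", "typing-extensions"]
FRAMEWORK_REQUIREMENTS: Dict[str, List[str]] = {
    "tf-keras": ["tensorflow==2.4.4"],
    "pytorch": ["torch==1.10", "torchvision==0.11.1"],
}


def requirements_for(framework: str) -> List[str]:
    """Return the packages to install for the selected framework (hypothetical helper)."""
    if framework not in FRAMEWORK_REQUIREMENTS:
        raise ValueError(
            f"Unknown framework {framework!r}; "
            f"expected one of {sorted(FRAMEWORK_REQUIREMENTS)}"
        )
    # Build a new list so the module-level constants are never mutated.
    return COMMON_REQUIREMENTS + FRAMEWORK_REQUIREMENTS[framework]


if __name__ == "__main__":
    for name in FRAMEWORK_REQUIREMENTS:
        print(name, "->", " ".join(requirements_for(name)))
```

Failing early on an unknown value keeps a typo in `framework` from silently installing nothing.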

diff --git a/mask-detection/2-serving.ipynb b/mask-detection/2-serving.ipynb
@@ -70,7 +70,8 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "framework = \"tf-keras\""
+    "framework = \"tf-keras\"\n",
+    "# framework = \"pytorch\""
    ]
   },
   {
@@ -90,6 +91,25 @@
     "4. `post-process` - Parse the prediction probabilities and wrap them in a dictionary with which to respond."
    ]
   },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "### PyTorch\n",
+    "\n",
+    "The code is taken from the python file [serving.py](pytorch/serving.py). Our data will go through the following steps:\n",
+    "1. `resize` - Read the image from the URL and resize it to 224x224.\n",
+    "2. `preprocess` - Use `torchvision.transforms` to normalize the images for MobileNetV2.\n",
+    "3. `mlrun.frameworks.pytorch.PyTorchModelServer` - Infer the inputs through the model and return the predictions. It can be imported from:\n",
+    "    ```python\n",
+    "    from mlrun.frameworks.pytorch import PyTorchModelServer\n",
+    "    ```\n",
+    "    This class can be inherited and its pre-process, post-process, predict and explain methods can be overridden. In this demo, we will be using the defaults to showcase the topology feature of our serving functions.\n",
+    "4. `postprocess` - Parse the prediction probabilities and wrap them in a dictionary with which to respond."
+   ],
+   "metadata": {
+    "collapsed": false
+   }
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -217,10 +237,17 @@
     "# Set the topology and get the graph object:\n",
     "graph = serving_function.set_topology(\"flow\", engine=\"async\")\n",
     "\n",
+    "# Choose the ModelServer according to the selected framework:\n",
+    "model_server_class = (\n",
+    "    \"mlrun.frameworks.tf_keras.TFKerasModelServer\"\n",
+    "    if framework == \"tf-keras\"\n",
+    "    else \"mlrun.frameworks.pytorch.PyTorchModelServer\"\n",
+    ")\n",
+    "\n",
     "# Build the serving graph:\n",
     "graph.to(handler=\"resize\", name=\"resize\")\\\n",
     "     .to(handler=\"preprocess\", name=\"preprocess\")\\\n",
-    "     .to(class_name=\"mlrun.frameworks.tf_keras.TFKerasModelServer\", name=\"mask_detector\", model_path=project.get_artifact_uri(\"mask_detector\"))\\\n",
+    "     .to(class_name=model_server_class, name=\"detect_mask\", model_path=project.get_artifact_uri(\"mask_detector\"))\\\n",
     "     .to(handler=\"postprocess\", name=\"postprocess\").respond()\n",
     "\n",
     "# Plot to graph:\n",
@@ -559,17 +586,8 @@
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
    "version": "3.7.6"
-  },
-  "pycharm": {
-   "stem_cell": {
-    "cell_type": "raw",
-    "metadata": {
-     "collapsed": false
-    },
-    "source": []
-   }
   }
  },
  "nbformat": 4,
  "nbformat_minor": 4
-}
+}
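
The conditional expression added to the serving graph cell falls through to the PyTorch server for any value other than `"tf-keras"`, including a misspelled one. A minimal sketch of a stricter selection, assuming only the two class paths that appear in this diff, is an explicit lookup table. The names `MODEL_SERVER_CLASSES` and `model_server_class_for` are hypothetical.

```python
# Class paths copied from the serving notebook in this diff.
MODEL_SERVER_CLASSES = {
    "tf-keras": "mlrun.frameworks.tf_keras.TFKerasModelServer",
    "pytorch": "mlrun.frameworks.pytorch.PyTorchModelServer",
}


def model_server_class_for(framework: str) -> str:
    """Return the model server class path for a framework (hypothetical helper)."""
    try:
        return MODEL_SERVER_CLASSES[framework]
    except KeyError:
        # Surface typos instead of silently picking a default server.
        raise ValueError(
            f"Unsupported framework {framework!r}; "
            f"expected one of {sorted(MODEL_SERVER_CLASSES)}"
        ) from None


if __name__ == "__main__":
    print(model_server_class_for("pytorch"))
```

The returned string can then be passed as `class_name` in the graph step, exactly as `model_server_class` is in the notebook.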
diff --git a/mask-detection/3-automatic-pipeline.ipynb b/mask-detection/3-automatic-pipeline.ipynb
@@ -6,7 +6,7 @@
    "source": [
     "# Mask Detection Demo - Automatic Pipeline (3 / 3)\n",
     "\n",
-    "The following example demonstrates how to package a project and how to run an automatic pipeline for training, evaluating, optimizing and serving the mask detection model using our saved MLRun functions from the previous notebooks.\n",
+    "The following example demonstrates how to package a project and how to run an automatic pipeline to train, evaluate, optimize and serve the mask detection model using our saved MLRun functions from the previous notebooks.\n",
     "\n",
     "1. [Set up the project](#section_1)\n",
     "2. [Write and save the workflow](#section_2)\n",
@@ -79,7 +79,8 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "framework = \"tf-keras\""
+    "framework = \"tf-keras\"\n",
+    "# framework = \"pytorch\""
    ]
   },
   {
@@ -209,7 +210,7 @@
     "):\n",
     "    # Get our project object:\n",
     "    project = mlrun.get_current_project()\n",
-    "    \n",
+    "\n",
     "    # Write down the ONNX requirements:\n",
     "    onnx_requirements = [\n",
     "        \"onnx~=1.10.1\",\n",
@@ -276,8 +277,8 @@
     "        handler=\"to_onnx\",\n",
     "        name=\"optimizing\",\n",
     "        params={\n",
-    "            \"model_name\": 'mask_detector',\n",
     "            \"model_path\": training_run.outputs['mask_detector'],\n",
+    "            \"onnx_model_name\": 'onnx_mask_detector'\n",
     "        },\n",
     "        outputs=[\"onnx_mask_detector\"],\n",
     "    ).after(build_condition)\n",
@@ -309,7 +310,7 @@
     "    # Build the serving graph:\n",
     "    graph.to(handler=\"resize\", name=\"resize\")\\\n",
     "         .to(handler=\"preprocess\", name=\"preprocess\")\\\n",
-    "         .to(\"mlrun.frameworks.onnx.ONNXModelServer\", \"onnx_mask_detector\", model_path=project.get_artifact_uri(\"onnx_mask_detector\"))\\\n",
+    "         .to(class_name=\"mlrun.frameworks.onnx.ONNXModelServer\", name=\"onnx_mask_detector\", model_path=project.get_artifact_uri(\"onnx_mask_detector\"))\\\n",
     "         .to(handler=\"postprocess\", name=\"postprocess\").respond()\n",
     "    # Set the desired requirements:\n",
     "    serving_function.with_requirements(requirements=onnx_requirements)\n",
@@ -321,7 +322,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Note that after running the cell above, the `workflow.py` file is created. Saving your workflow to file allows you to use it if run the project from a different environment.\n",
+    "Note that after running the cell above, the `workflow.py` file is created. Saving your workflow to file allows you to run the project from a different environment.\n",
     "\n",
     "In order to take this project with the functions we set and the workflow we saved over to a different environemnt, first set the workflow to the project. The workflow can be set using `project.set_workflow`. After setting it, we will save the project by calling `project.save`. When loaded, it can be run from another environment from both code and from cli. For more information regarding saving and loading a MLRun project, see the [documentation](https://docs.mlrun.org/en/latest/projects/overview.html)."
    ]
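
In the workflow hunks above, the string `onnx_mask_detector` appears three times: as the `onnx_model_name` parameter, in the step's `outputs`, and in `project.get_artifact_uri(...)` for the serving graph. Assuming these are meant to stay in sync (the diff does not say so explicitly), a sketch that derives all of them from one constant could look like this. The helper `onnx_step_arguments` and the placeholder URI are hypothetical.

```python
ONNX_MODEL_NAME = "onnx_mask_detector"  # Single source for the artifact name.


def onnx_step_arguments(trained_model_uri: str) -> dict:
    """Build keyword arguments for the `to_onnx` step (hypothetical helper)."""
    return {
        "handler": "to_onnx",
        "name": "optimizing",
        "params": {
            "model_path": trained_model_uri,
            "onnx_model_name": ONNX_MODEL_NAME,
        },
        # The logged output reuses the same name as the parameter.
        "outputs": [ONNX_MODEL_NAME],
    }


if __name__ == "__main__":
    # "<trained-model-uri>" stands in for training_run.outputs['mask_detector'].
    print(onnx_step_arguments("<trained-model-uri>"))
```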

diff --git a/mask-detection/README.md b/mask-detection/README.md
@@ -5,8 +5,7 @@ In the following demo we will demonstrate how to use MLRun to create a mask dete
 
 ### Key Technologies:
 
-* Either [**TF.Keras**](https://www.tensorflow.org/api_docs/python/tf/keras) to train and evaluate the model,
-* or [**PyTorch**](https://pytorch.org/) (Will be added soon)
+* Either [**TF.Keras**](https://www.tensorflow.org/api_docs/python/tf/keras) or [**PyTorch**](https://pytorch.org/) to train and evaluate the model
 * [**Horovod**](https://horovod.ai/) to run distributed training
 * [**ONNX**](https://onnx.ai/) to optimize and accelerate the model's performance
 * [**Nuclio**](https://nuclio.io/) to create a high-performance serverless Serving function

diff --git a/mask-detection/pytorch/serving.py b/mask-detection/pytorch/serving.py
@@ -0,0 +1,79 @@
+import urllib.request
+from typing import Dict, List, Union
+
+import numpy as np
+import torchvision
+from PIL import Image
+
+
+def resize(event: Dict) -> List[Image.Image]:
+    """
+    Read image URLs into PIL images and resize them to MobileNetV2 standard size of 224x224.
+
+    :param event: A dictionary with the image URLs at the 'data_url' key.
+
+    :returns: A list of all the resized images as PIL images.
+    """
+    # Read the images urls passed:
+    images_urls = event["data_url"]
+
+    # Initialize an empty list for the resized images:
+    resized_images = []
+
+    # Go through the images urls and read and resize them:
+    for image_url in images_urls:
+        # Get the image:
+        urllib.request.urlretrieve(image_url, "temp.png")
+        image = Image.open("temp.png").convert("RGB")  # Ensure 3 channels for Normalize
+        # Resize it:
+        image = image.resize((224, 224))
+        # Collect it:
+        resized_images.append(image)
+
+    return resized_images
+
+
+def preprocess(images: List[Image.Image]) -> Dict[str, List[np.ndarray]]:
+    """
+    Run the given images through MobileNetV2 preprocessing so they will be ready to be inferred through the mask
+    detection model.
+
+    :param images: A list of images to preprocess.
+
+    :returns: A dictionary for the PyTorchModelServer, with the preprocessed images in the 'inputs' key.
+    """
+    # Prepare the transforms composition:
+    transforms_composition = torchvision.transforms.Compose(
+        [
+            torchvision.transforms.ToTensor(),
+            torchvision.transforms.Normalize(
+                mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
+            ),
+        ]
+    )
+
+    # Apply the transforms:
+    preprocessed_images = [np.expand_dims(transforms_composition(image).numpy(), 0) for image in images]
+    preprocessed_images = [np.vstack(preprocessed_images)]
+
+    return {"inputs": preprocessed_images}
+
+
+def postprocess(model_response: dict) -> Dict[str, Union[int, float]]:
+    """
+    Read the predicted classes probabilities response from the PyTorchModelServer and parse them into a dictionary with
+    the results.
+
+    :param model_response: The PyTorchModelServer response with the predicted probabilities.
+
+    :returns: A dictionary with the parsed prediction.
+    """
+    # Read the prediction from the model:
+    prediction = np.squeeze(model_response["outputs"])
+
+    # Parse and return:
+    return {
+        "class": int(np.argmax(prediction)),
+        "with_mask": float(prediction[0]),
+        "without_mask": float(prediction[1]),
+    }
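
`resize` accepts a list of URLs, while `postprocess` squeezes the response and reads `prediction[0]` and `prediction[1]` as scalars, which fits a request carrying a single image. As a rough sketch of how several images per request might be handled, the function below returns one result per row. It assumes `model_response["outputs"]` is a list of `[with_mask, without_mask]` rows, which this diff does not confirm for `PyTorchModelServer`; the name `postprocess_batch` is hypothetical, and the sketch uses plain Python lists rather than NumPy.

```python
from typing import Dict, List, Union


def postprocess_batch(model_response: dict) -> List[Dict[str, Union[int, float]]]:
    """Parse one prediction per image (hypothetical variant of `postprocess`).

    Assumes each row of model_response["outputs"] is [with_mask, without_mask].
    """
    results = []
    for row in model_response["outputs"]:
        with_mask, without_mask = float(row[0]), float(row[1])
        results.append(
            {
                # Index of the larger score; ties resolve to 0, like np.argmax.
                "class": 0 if with_mask >= without_mask else 1,
                "with_mask": with_mask,
                "without_mask": without_mask,
            }
        )
    return results


if __name__ == "__main__":
    print(postprocess_batch({"outputs": [[0.9, 0.1], [0.2, 0.8]]}))
```

Returning a list changes the response shape, so any client that expects the single dictionary produced by `postprocess` would need to be updated as well.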