Merge pull request #202 from mlrun/0.9.x-dev
0.9.x dev
aviaIguazio authored Dec 13, 2021
2 parents fad845a + a3adca5 commit 3107de7
Showing 9 changed files with 467 additions and 57 deletions.
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
@@ -52,7 +52,7 @@ To run the MLRun demos, first do the following:
## Mask Detection Demo

The [Mask detection](./mask-detection/README.md) demo is a 3 notebooks demo where we:
1. **Train and evaluate** a model for detecting whether a person is wearing a mask in an image using Tensorflow.Keras or PyTorch (coming soon).
1. **Train and evaluate** a model for detecting whether a person is wearing a mask in an image using Tensorflow.Keras or PyTorch.
2. **Serve** the model as a serverless function in a http endpoint.
3. Write an **automatic pipeline** where we download a dataset of images, train and evaluate, optimize the model (using ONNX) and serve it.

80 changes: 64 additions & 16 deletions mask-detection/1-training-and-evaluation.ipynb
@@ -20,7 +20,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Before we continue, we need to setup some requirements:"
"Before we continue, we need to install MLRun and the framework of choice (comment and uncomment the framework you wish to use):"
]
},
{
@@ -30,8 +30,14 @@
"outputs": [],
"source": [
"!pip install mlrun\n",
"!pip install -U tensorflow==2.4.1\n",
"!pip install -U typing-extensions"
"!pip install -U typing-extensions\n",
"\n",
"########## For TF.Keras: ##########\n",
"!pip install -U tensorflow==2.4.4\n",
"\n",
"########## For PyTorch: ##########\n",
"# !pip install -U torch==1.10\n",
"# !pip install -U torchvision==0.11.1"
]
},
{
@@ -90,7 +96,7 @@
"\n",
"### 2.1. Import a Function\n",
"\n",
"We will download the images using `open_archive` - a function from MLRun's functions marketplace. We will import the fucntion using `mlrun.import_function` and describe it to get the function's documentation:"
"We will download the images using `open_archive` - a function from MLRun's functions marketplace. We will import the function using `mlrun.import_function` and describe it to get the function's documentation:"
]
},
{
@@ -412,7 +418,8 @@
"metadata": {},
"outputs": [],
"source": [
"framework = \"tf-keras\""
"framework = \"tf-keras\"\n",
"# framework = \"pytorch\""
]
},
{
@@ -421,20 +428,58 @@
"source": [
"### TF.Keras\n",
"\n",
"The code is taken from the python file [training-and-evaluation.py](tf-keras/training-and-evaluation.py). It is classic and straightforward, we: \n",
"The code is taken from the python file [training-and-evaluation.py](tf-keras/training-and-evaluation.py), which is classic and straightforward. We: \n",
"1. Use `_get_datasets` to get the training and validation datasets (on evaluation - the evaluation dataset).\n",
"2. Use `_get_model` to build our classifier - simple transfer learning from MobileNetV2.\n",
"2. Use `_get_model` to build our classifier - simple transfer learning from MobileNetV2 (`keras.applications`).\n",
"3. Call `train` to train the model.\n",
"4. Call `evaluate` to evaluate the model.\n",
"\n",
"Taking this code one step further is **MLRun**'s framework for `tf.keras`: \n",
"Taking this code one step further is **MLRun**'s framework for `tf.keras`:\n",
"\n",
"```python\n",
"# Apply MLRun's interface for tf.keras:\n",
"mlrun_tf_keras.apply_mlrun(model=model, context=context, ...)\n",
"```\n",
"\n",
"With just one line of code, it seamlessly provides:\n",
"With just one line of code, it seamlessly provides **automatic logging** (for both MLRun and Tensorboard) and **distributed training** by wrapping the `fit` and `evaluate` methods of `tf.keras.Model`.\n",
"\n",
"In addition, in the `evaluate` method code, we use the `TFKerasModelHandler` class. This class supports loading, saving and logging `tf.keras` models with ease, enabling easy versioning of the model and its results, artifacts and custom objects."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### PyTorch\n",
"\n",
"The code is taken from the python file [training-and-evaluation.py](pytorch/training-and-evaluation.py), which is classic and straightforward. We:\n",
"1. Use `_get_datasets` to get the training and validation datasets (on evaluation - the evaluation dataset). The function initializes a `MaskDetectionDataset` to handle our images.\n",
"2. Initialize our `MaskDetector` classifier class - a simple transfer learning from MobileNetV2 (`torchvision.models`).\n",
"3. Call `train` to train the model.\n",
"4. Call `evaluate` to evaluate the model.\n",
"\n",
"Taking this code one step further is **MLRun**'s framework for `torch`:\n",
"\n",
"```python\n",
"import mlrun.frameworks.pytorch as mlrun_torch\n",
"```\n",
"\n",
"`mlrun_torch` provides what we call \"shortcut functions\" for using PyTorch with ease:\n",
"* `train` - Training a model.\n",
"* `evaluate` - Evaluating a model.\n",
"\n",
"Both functions enable **automatic logging** (for both MLRun and Tensorboard) and **distributed training** by simply passing the following parameters: `auto_log: bool` and `use_horovod: bool`.\n",
"\n",
"In addition, you can choose to use our classes directly:\n",
"* `PyTorchMLRunInterface` - the interface for training, evaluating and predicting a PyTorch model. Our code is highly generic and should fit for any type of model.\n",
"* If you wish to use your own training code, to get automatic logging you will simply need to use our callback mechanism with `CallbackHandler`.\n",
"* `PyTorchModelHandler` - supports loading, saving and logging `torch` models with ease, enabling easy versioning of the model and its results, artifacts and custom objects."
]
},
{
"cell_type": "markdown",
"source": [
"Both **TF.Keras** and **PyTorch** have the same features regarding MLRun's automatic logging and distributed training orchestration:\n",
"* **Automatic logging**: auto-log your training and model to both **Tensorboard** and **MLRun**. Additional settings can be passed onto this method to gain extra logging capabilities, like:\n",
" * Weights histograms and distributions\n",
" * Weights statistics\n",
@@ -443,22 +488,25 @@
" * Logging frequency and more\n",
"* **Distributed training with Horovod**: Horovod will be initialized and used automatically if the MLRun Function's `kind` attribute is equal to `\"mpijob\"`, there won't be any additional changes needed to the original code! More on that later in [section 6](#section_6)\n",
"\n",
"In addition, in the `evaluate` method code, we use the `mlrun.frameworks.tf_keras.TFKerasModelHandler` class. This class supports loading, saving and logging `tf.keras` models with ease, enabling easy versioning of the model and his results, artifacts and custom objects.\n",
"\n",
"We suggest reading the documentation for further use, or like in this example, use the default settings."
]
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"section_4\"></a>\n",
"## 4. Create the MLRun Function\n",
"\n",
"We will use MLRun's `mlrun.code_to_function` to create a MLRun Function from our code in the above mentioned python file. Notice our MLRun Function will have two handlers: `train` and `evaluate`.\n",
"\n",
"We wish to run the training first as a Job, so we will set the `kind` parameter to `\"job\"`."
]
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
@@ -1082,7 +1130,7 @@
"<a id=\"section_6\"></a>\n",
"## 6. Run Distributed Training Using Horovod\n",
"\n",
"Now we can see the second benefit of MLRun, we can **distribute** our model **training** across **multiple workers** (i.e., perform distributed training), assign **GPUs**, and more. We don't need to bother with Dockerfiles or K8s YAML configuration files — MLRun does all of this for us. All is needed to be done, is create our function with `kind=\"mpijob\"`.\n",
"Now we can see the second benefit of MLRun, we can **distribute** our model **training** across **multiple workers** (i.e., perform distributed training), assign **GPUs**, and more. We don't need to bother with Dockerfiles or K8s YAML configuration files — MLRun does all of this for us. We will simply create our function with `kind=\"mpijob\"`.\n",
"\n",
"> **Notice**: for this demo, in order to use GPUs in training, set the `use_gpu` variable to `True`. This will later assign the required configurations to use the GPUs and pass the correct image to support GPUs (image with CUDA libraries)."
]
@@ -1149,7 +1197,7 @@
"source": [
"Call run, and notice each epoch is shorter as we now have 2 workers instead of 1. As the 2 workers will print a lot of outputs we would rather wait for completion and then show the results. For that, we will pass `watch=False` and use the run objects function `wait_for_completion` and `show`. \n",
"\n",
"In order to see the logs, you are welcome to go into the UI by clicking the blue hyperlink \"<span style=\"color:blue\">**click here**</span>\" after running the function and see the logs there:"
"To see the logs, you can go into the UI by clicking the blue hyperlink \"<span style=\"color:blue\">**click here**</span>\" after running the function:"
]
},
{
42 changes: 30 additions & 12 deletions mask-detection/2-serving.ipynb
@@ -70,7 +70,8 @@
"metadata": {},
"outputs": [],
"source": [
"framework = \"tf-keras\""
"framework = \"tf-keras\"\n",
"# framework = \"pytorch\""
]
},
{
@@ -90,6 +91,25 @@
"4. `post-process` - Parse the prediction probabilities and wrap them in a dictionary with which to respond."
]
},
{
"cell_type": "markdown",
"source": [
"### PyTorch\n",
"\n",
"The code is taken from the python file [serving.py](pytorch/serving.py). Our data will go through the following structure:\n",
"1. `resize` - Read the URL into an array and resize it to 224x224.\n",
"2. `preprocess` - Use `torchvision.transforms` to normalize the images for MobileNetV2.\n",
"3. `mlrun.frameworks.pytorch.PyTorchModelServer` - Infer the inputs through the model and return the predictions. It can be imported from:\n",
" ```python\n",
" from mlrun.frameworks.pytorch import PyTorchModelServer\n",
" ```\n",
" This class can be inherited and its pre-process, post-process, predict and explain methods can be overridden. In this demo, we will be using the defaults to showcase the topology feature of our serving functions.\n",
"4. `post-process` - Parse the prediction probabilities and wrap them in a dictionary with which to respond."
],
"metadata": {
"collapsed": false
}
},
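Review note: the `preprocess` step in this cell relies on `torchvision.transforms.ToTensor` followed by `Normalize` with the ImageNet channel statistics used in `pytorch/serving.py`. As a rough illustration of the arithmetic only, here is a plain-Python sketch (no torchvision). `normalize_pixel` is a hypothetical helper introduced for this note and is not part of the demo:

```python
# ImageNet channel statistics, copied from pytorch/serving.py in this commit.
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb):
    """Hypothetical helper: mimic ToTensor + Normalize for one 8-bit RGB pixel.

    ToTensor scales 8-bit values to [0, 1]; Normalize then applies
    (value - mean) / std independently per channel.
    """
    return [(value / 255.0 - mean) / std for value, mean, std in zip(rgb, MEAN, STD)]


white = normalize_pixel([255, 255, 255])
black = normalize_pixel([0, 0, 0])
print(white)
print(black)
```

The real transform applies the same per-channel formula to every pixel of the 224x224 image tensor.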
{
"cell_type": "markdown",
"metadata": {},
@@ -217,10 +237,17 @@
"# Set the topology and get the graph object:\n",
"graph = serving_function.set_topology(\"flow\", engine=\"async\")\n",
"\n",
"# Choose the ModelServer according to the selected framework:\n",
"model_server_class = (\n",
" \"mlrun.frameworks.tf_keras.TFKerasModelServer\"\n",
" if framework == \"tf-keras\"\n",
" else \"mlrun.frameworks.pytorch.PyTorchModelServer\"\n",
")\n",
"\n",
"# Build the serving graph:\n",
"graph.to(handler=\"resize\", name=\"resize\")\\\n",
" .to(handler=\"preprocess\", name=\"preprocess\")\\\n",
" .to(class_name=\"mlrun.frameworks.tf_keras.TFKerasModelServer\", name=\"mask_detector\", model_path=project.get_artifact_uri(\"mask_detector\"))\\\n",
" .to(class_name=model_server_class, name=\"detect_mask\", model_path=project.get_artifact_uri(\"mask_detector\"))\\\n",
" .to(handler=\"postprocess\", name=\"postprocess\").respond()\n",
"\n",
"# Plot to graph:\n",
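Review note: the conditional in this hunk falls back to the PyTorch server for any value other than `"tf-keras"`, so a typo in `framework` would silently select PyTorch. One possible alternative, sketched under the assumption that only the two frameworks named in this commit are supported, is an explicit lookup. `MODEL_SERVERS` and `select_model_server` are hypothetical names, not part of the notebook:

```python
# Class paths are the two strings used in the notebook cell.
MODEL_SERVERS = {
    "tf-keras": "mlrun.frameworks.tf_keras.TFKerasModelServer",
    "pytorch": "mlrun.frameworks.pytorch.PyTorchModelServer",
}


def select_model_server(framework: str) -> str:
    """Hypothetical helper: return the model server class path for a framework name."""
    try:
        return MODEL_SERVERS[framework]
    except KeyError:
        supported = ", ".join(sorted(MODEL_SERVERS))
        raise ValueError(
            f"Unsupported framework '{framework}', expected one of: {supported}"
        ) from None


model_server_class = select_model_server("pytorch")
```

The returned string could then be passed as `class_name=model_server_class` in the graph step, exactly as the cell does today.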
@@ -559,17 +586,8 @@
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
},
"pycharm": {
"stem_cell": {
"cell_type": "raw",
"metadata": {
"collapsed": false
},
"source": []
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}
}
13 changes: 7 additions & 6 deletions mask-detection/3-automatic-pipeline.ipynb
@@ -6,7 +6,7 @@
"source": [
"# Mask Detection Demo - Automatic Pipeline (3 / 3)\n",
"\n",
"The following example demonstrates how to package a project and how to run an automatic pipeline for training, evaluating, optimizing and serving the mask detection model using our saved MLRun functions from the previous notebooks.\n",
"The following example demonstrates how to package a project and how to run an automatic pipeline to train, evaluate, optimize and serve the mask detection model using our saved MLRun functions from the previous notebooks.\n",
"\n",
"1. [Set up the project](#section_1)\n",
"2. [Write and save the workflow](#section_2)\n",
@@ -79,7 +79,8 @@
"metadata": {},
"outputs": [],
"source": [
"framework = \"tf-keras\""
"framework = \"tf-keras\"\n",
"# framework = \"pytorch\""
]
},
{
@@ -209,7 +210,7 @@
"):\n",
" # Get our project object:\n",
" project = mlrun.get_current_project()\n",
" \n",
"\n",
" # Write down the ONNX requirements:\n",
" onnx_requirements = [\n",
" \"onnx~=1.10.1\",\n",
@@ -276,8 +277,8 @@
" handler=\"to_onnx\",\n",
" name=\"optimizing\",\n",
" params={\n",
" \"model_name\": 'mask_detector',\n",
" \"model_path\": training_run.outputs['mask_detector'],\n",
" \"onnx_model_name\": 'onnx_mask_detector'\n",
" },\n",
" outputs=[\"onnx_mask_detector\"],\n",
" ).after(build_condition)\n",
@@ -309,7 +310,7 @@
" # Build the serving graph:\n",
" graph.to(handler=\"resize\", name=\"resize\")\\\n",
" .to(handler=\"preprocess\", name=\"preprocess\")\\\n",
" .to(\"mlrun.frameworks.onnx.ONNXModelServer\", \"onnx_mask_detector\", model_path=project.get_artifact_uri(\"onnx_mask_detector\"))\\\n",
" .to(class_name=\"mlrun.frameworks.onnx.ONNXModelServer\", name=\"onnx_mask_detector\", model_path=project.get_artifact_uri(\"onnx_mask_detector\"))\\\n",
" .to(handler=\"postprocess\", name=\"postprocess\").respond()\n",
" # Set the desired requirements:\n",
" serving_function.with_requirements(requirements=onnx_requirements)\n",
@@ -321,7 +322,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that after running the cell above, the `workflow.py` file is created. Saving your workflow to file allows you to use it if run the project from a different environment.\n",
"Note that after running the cell above, the `workflow.py` file is created. Saving your workflow to file allows you to run the project from a different environment.\n",
"\n",
"To take this project, with the functions we set and the workflow we saved, over to a different environment, first set the workflow to the project. The workflow can be set using `project.set_workflow`. After setting it, we will save the project by calling `project.save`. When loaded, it can be run from another environment from both code and the CLI. For more information regarding saving and loading an MLRun project, see the [documentation](https://docs.mlrun.org/en/latest/projects/overview.html)."
]
3 changes: 1 addition & 2 deletions mask-detection/README.md
@@ -5,8 +5,7 @@ In the following demo we will demonstrate how to use MLRun to create a mask dete

### Key Technologies:

* Either [**TF.Keras**](https://www.tensorflow.org/api_docs/python/tf/keras) to train and evaluate the model,
* or [**PyTorch**](https://pytorch.org/) (Will be added soon)
* Either [**TF.Keras**](https://www.tensorflow.org/api_docs/python/tf/keras) or [**PyTorch**](https://pytorch.org/) to train and evaluate the model
* [**Horovod**](https://horovod.ai/) to run distributed training
* [**ONNX**](https://onnx.ai/) to optimize and accelerate the model's performance
* [**Nuclio**](https://nuclio.io/) to create a high-performance serverless Serving function
79 changes: 79 additions & 0 deletions mask-detection/pytorch/serving.py
@@ -0,0 +1,79 @@
import urllib.request
from typing import Dict, List, Union

import numpy as np
import torchvision
from PIL import Image


def resize(event: Dict) -> List[Image.Image]:
    """
    Read images urls into numpy arrays and resize them to MobileNetV2 standard size of 224x224.
    :param event: A dictionary with the images urls at the 'data_url' key.
    :returns: A list of all the resized images as numpy arrays.
    """
    # Read the images urls passed:
    images_urls = event["data_url"]

    # Initialize an empty list for the resized images:
    resized_images = []

    # Go through the images urls and read and resize them:
    for image_url in images_urls:
        # Get the image:
        urllib.request.urlretrieve(image_url, "temp.png")
        image = Image.open("temp.png")
        # Resize it:
        image = image.resize((224, 224))
        # Collect it:
        resized_images.append(image)

    return resized_images


def preprocess(images: List[Image.Image]) -> Dict[str, List[np.ndarray]]:
    """
    Run the given images through MobileNetV2 preprocessing so they will be ready to be inferred through the mask
    detection model.
    :param images: A list of images to preprocess.
    :returns: A dictionary for the PyTorchModelServer, with the preprocessed images in the 'inputs' key.
    """
    # Prepare the transforms composition:
    transforms_composition = torchvision.transforms.Compose(
        [
            torchvision.transforms.ToTensor(),
            torchvision.transforms.Normalize(
                mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
            ),
        ]
    )

    # Apply the transforms:
    preprocessed_images = [np.expand_dims(transforms_composition(image).numpy(), 0) for image in images]
    preprocessed_images = [np.vstack(preprocessed_images)]

    return {"inputs": preprocessed_images}


def postprocess(model_response: dict) -> Dict[str, Union[int, float]]:
    """
    Read the predicted classes probabilities response from the PyTorchModelServer and parse them into a dictionary with
    the results.
    :param model_response: The PyTorchModelServer response with the predicted probabilities.
    :returns: A dictionary with the parsed prediction.
    """
    # Read the prediction from the model:
    prediction = np.squeeze(model_response["outputs"])

    # Parse and return:
    return {
        "class": int(np.argmax(prediction)),
        "with_mask": float(prediction[0]),
        "without_mask": float(prediction[1]),
    }
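Review note: `postprocess` appears to assume a single image per request, since it squeezes the model output down to two probabilities before indexing. The parsing logic can be illustrated with a minimal plain-Python sketch (no NumPy). `parse_prediction` is a hypothetical stand-in written for this note, and the probabilities are made-up example values:

```python
from typing import Dict, List, Union


def parse_prediction(probabilities: List[float]) -> Dict[str, Union[int, float]]:
    """Hypothetical stand-in for postprocess, for one image's two class probabilities.

    Index 0 is 'with_mask' and index 1 is 'without_mask', as in serving.py.
    On a tie, class 0 is chosen, matching argmax returning the first maximum.
    """
    with_mask, without_mask = probabilities
    return {
        "class": 0 if with_mask >= without_mask else 1,
        "with_mask": float(with_mask),
        "without_mask": float(without_mask),
    }


# Made-up probabilities, for illustration only:
result = parse_prediction([0.9, 0.1])
print(result)
```

If batched requests are expected, the parsing would need to loop over the rows of the output instead of squeezing it.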