Merge pull request #203 from alexiguazio/patch-6
Update 1-training-and-evaluation.ipynb
aviaIguazio authored Dec 13, 2021
2 parents 54cc1dd + fbc442e commit a3adca5
Showing 1 changed file with 7 additions and 7 deletions.
14 changes: 7 additions & 7 deletions mask-detection/1-training-and-evaluation.ipynb
@@ -96,7 +96,7 @@
"\n",
"### 2.1. Import a Function\n",
"\n",
"We will download the images using `open_archive` - a function from MLRun's functions marketplace. We will import the fucntion using `mlrun.import_function` and describe it to get the function's documentation:"
"We will download the images using `open_archive` - a function from MLRun's functions marketplace. We will import the function using `mlrun.import_function` and describe it to get the function's documentation:"
]
},
{
@@ -428,7 +428,7 @@
"source": [
"### TF.Keras\n",
"\n",
"The code is taken from the python file [training-and-evaluation.py](tf-keras/training-and-evaluation.py). It is classic and straightforward, we: \n",
"The code is taken from the python file [training-and-evaluation.py](tf-keras/training-and-evaluation.py), which is classic and straightforward. We: \n",
"1. Use `_get_datasets` to get the training and validation datasets (on evaluation - the evaluation dataset).\n",
"2. Use `_get_model` to build our classifier - simple transfer learning from MobileNetV2 (`keras.applications`).\n",
"3. Call `train` to train the model.\n",
@@ -452,7 +452,7 @@
"source": [
"### PyTorch\n",
"\n",
"The code is taken from the python file [training-and-evaluation.py](pytorch/training-and-evaluation.py). It is classic and straightforward, we:\n",
"The code is taken from the python file [training-and-evaluation.py](pytorch/training-and-evaluation.py), which is classic and straightforward. We:\n",
"1. Use `_get_datasets` to get the training and validation datasets (on evaluation - the evaluation dataset). The function is initiazliing a `MaskDetectionDataset` to handle our images.\n",
"2. Initialize our `MaskDetector` classifier class - a simple transfer learning from MobileNetV2 (`torchvision.models`).\n",
"3. Call `train` to train the model.\n",
@@ -468,7 +468,7 @@
"* `train` - Training a model.\n",
"* `evaluate` - Evaluating a model.\n",
"\n",
"Both functions enable **automatic logging** (for both MLRun and Tensorboard) and **distributed training** by simply pass the following parameters: `auto_log: bool` and `use_horovod: bool`.\n",
"Both functions enable **automatic logging** (for both MLRun and Tensorboard) and **distributed training** by simply passing the following parameters: `auto_log: bool` and `use_horovod: bool`.\n",
"\n",
"In addition, you can choose to use our classes directly:\n",
"* `PyTorchMLRunInterface` - the interface for training, evaluating and predicting a PyTorch model. Our code is highly generic and should fit for any type of model.\n",
@@ -1130,7 +1130,7 @@
"<a id=\"section_6\"></a>\n",
"## 6. Run Distributed Training Using Horovod\n",
"\n",
"Now we can see the second benefit of MLRun, we can **distribute** our model **training** across **multiple workers** (i.e., perform distributed training), assign **GPUs**, and more. We don't need to bother with Dockerfiles or K8s YAML configuration files — MLRun does all of this for us. All is needed to be done, is create our function with `kind=\"mpijob\"`.\n",
"Now we can see the second benefit of MLRun, we can **distribute** our model **training** across **multiple workers** (i.e., perform distributed training), assign **GPUs**, and more. We don't need to bother with Dockerfiles or K8s YAML configuration files — MLRun does all of this for us. We will simply create our function with `kind=\"mpijob\"`.\n",
"\n",
"> **Notice**: for this demo, in order to use GPUs in training, set the `use_gpu` variable to `True`. This will later assign the required configurations to use the GPUs and pass the correct image to support GPUs (image with CUDA libraries)."
]
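Why does adding workers shorten each epoch? Conceptually, each Horovod worker trains on an equal shard of the dataset, so the number of steps per worker per epoch drops by a factor of the worker count. The sketch below is pure Python and only illustrative — Horovod's actual sharding happens in the data loaders, not in user code like this.

```python
# Illustrative only: how data-parallel training splits an epoch's work.
# Each worker sees dataset[rank::num_workers], so per-worker steps drop by 1/N.
def shard(dataset, num_workers, rank):
    """The slice of the dataset that worker `rank` processes each epoch."""
    return dataset[rank::num_workers]

images = list(range(1000))               # stand-in for the training images
assert len(shard(images, 1, 0)) == 1000  # 1 worker: a full epoch of steps
w0 = shard(images, 2, 0)                 # 2 workers: half an epoch each ...
w1 = shard(images, 2, 1)
assert len(w0) == len(w1) == 500
assert sorted(w0 + w1) == images         # ... and together they cover everything
```

Each worker computes gradients on its shard and the workers average them (Horovod's allreduce), so the model still sees the whole dataset every epoch in roughly half the wall-clock time.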
@@ -1197,7 +1197,7 @@
"source": [
"Call run, and notice each epoch is shorter as we now have 2 workers instead of 1. As the 2 workers will print a lot of outputs we would rather wait for completion and then show the results. For that, we will pass `watch=False` and use the run objects function `wait_for_completion` and `show`. \n",
"\n",
"In order to see the logs, you are welcome to go into the UI by clicking the blue hyperlink \"<span style=\"color:blue\">**click here**</span>\" after running the function and see the logs there:"
"To see the logs, you can go into the UI by clicking the blue hyperlink \"<span style=\"color:blue\">**click here**</span>\" after running the function:"
]
},
{
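The launch-then-wait pattern this cell describes can be mimicked in plain Python. The class below is a mock, not MLRun's real run object; only the method names `wait_for_completion` and `show` are taken from the text above, and everything else is invented for illustration.

```python
import time

# Mock of the watch=False flow described above -- not MLRun's run object.
class MockRun:
    """A run launched asynchronously; it finishes after `duration` seconds."""
    def __init__(self, duration):
        self._done_at = time.monotonic() + duration
        self.state = "running"

    def wait_for_completion(self, poll=0.01):
        while time.monotonic() < self._done_at:   # block until the workers finish
            time.sleep(poll)
        self.state = "completed"

    def show(self):
        return f"run {self.state}"

run = MockRun(duration=0.05)     # like run(..., watch=False): returns immediately
assert run.state == "running"
run.wait_for_completion()        # block here instead of streaming noisy worker logs
assert run.show() == "run completed"
```

The point of `watch=False` is exactly this split: the call returns a handle right away, the noisy per-worker output stays out of the notebook, and results are shown once in a single place after completion.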
@@ -1689,4 +1689,4 @@
},
"nbformat": 4,
"nbformat_minor": 4
}
}

