style edits #379

Open · wants to merge 2 commits into `main`
62 changes: 31 additions & 31 deletions templates/fine-tune-stable-diffusion/README.ipynb
@@ -22,9 +22,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 1: Install python dependencies\n",
"## Step 1: Install Python dependencies\n",
"\n",
"The application requires a few extra Python dependencies. Install them using `pip` and they'll be automatically installed on remote workers when they're launched!"
"The application requires a few extra Python dependencies. Install them using `pip`. When launching remote workers, Anyscale automatically installs the dependencies on them."
]
},
{
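The install cell itself is collapsed in this diff. In a notebook it would look roughly like the sketch below; the package list is an assumption for illustration, not the template's actual requirements.

```python
# Hypothetical install cell: the real package list lives in the collapsed cell.
# In Jupyter, a leading `!` runs the command in a shell on the head node, and
# Anyscale mirrors pip-installed packages onto remote workers at launch.
!pip install diffusers accelerate transformers peft
```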
@@ -44,18 +44,18 @@
"\n",
"First, provide some images of the subject you want to fine-tune on.\n",
"\n",
"We'll use a sample dog dataset to demonstrate, but you can use pictures of your own subject.\n",
"Fine-tuning works best if your images are all cropped to a square with your subject in the center!\n",
"This example uses a sample dog dataset to demonstrate, but you can use pictures of your own subject.\n",
"Fine-tuning works best if your images are all cropped to a square with your subject in the center.\n",
"\n",
"A few notes on these constants that you can modify when training on your own custom subject:\n",
"* `SUBJECT_TOKEN` is the a unique token that you will teach the model to correspond to your subject. This can be is any token that does not appear much in normal text.\n",
" * Think of it as the name of your subject that the diffusion model will learn to recognize. Feel free to leave it as `sks`.\n",
" * When generating images, make sure to include `sks` in your prompt -- otherwise the model will just generate any random dog, not the dog that we fine-tuned it on!\n",
"* `SUBJECT_TOKEN` is the a unique token that you teach the model to correspond to your subject. This token can be any token that doesn't appear much in normal text.\n",
" * Think of it as the name of your subject that the diffusion model learns to recognize. You can leave it as `sks`.\n",
" * When generating images, make sure to include `sks` in your prompt--otherwise the model generates any random dog, not the dog that you fine-tuned it on.\n",
"* `SUBJECT_CLASS` is the category that your subject falls into.\n",
" * For example, if you have a human subject, the class could be `\"man\"` or `\"woman\"`.\n",
" * This class combined with the `SUBJECT_TOKEN` can be used in a prompt to convey the meaning: \"a dog named sks\".\n",
"* Put training images of your subject in `SUBJECT_IMAGES_PATH`. We'll later upload it to cloud storage so that all worker nodes can access the dataset.\n",
" * The easiest way to use your own images is to drag files into a folder in the VSCode file explorer, then moving the folder to `SUBJECT_IMAGES_PATH` in the command line. (Ex: `mv ./images /mnt/local_storage/subject_images`)"
" * Use this class in combination with the `SUBJECT_TOKEN` in a prompt to convey the meaning: \"a dog named sks\".\n",
"* Put training images of your subject in `SUBJECT_IMAGES_PATH` to upload later to cloud storage so that all worker nodes can access the dataset.\n",
" * The easiest way to use your own images is to drag files into a folder in the VS Code file explorer, then moving the folder to `SUBJECT_IMAGES_PATH` in the command line. For example, `mv ./images /mnt/local_storage/subject_images`."
]
},
{
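For reference, the constants this cell discusses plausibly look like the following sketch, assembled from values named in the prose (the defining cell is collapsed in this view):

```python
# Values taken from the surrounding text; adjust them for your own subject.
SUBJECT_TOKEN = "sks"  # rare token the model learns to associate with the subject
SUBJECT_CLASS = "dog"  # broad category the subject belongs to ("man", "woman", ...)
SUBJECT_IMAGES_PATH = "/mnt/local_storage/subject_images"  # local training images
```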
@@ -75,15 +75,15 @@
"metadata": {},
"outputs": [],
"source": [
"# Copy the sample dog dataset to the subject images path -- feel free to comment this out.\n",
"# Copy the sample dog dataset to the subject images path--feel free to comment this out.\n",
"!mkdir -p {SUBJECT_IMAGES_PATH} && cp ./assets/dog/*.jpeg {SUBJECT_IMAGES_PATH}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Take a look at the dataset!"
"Take a look at the dataset."
]
},
{
@@ -102,7 +102,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, upload the dataset to cloud storage so that we can download it on each worker node at the start of training."
"Next, upload the dataset to cloud storage so that Anyscale can download it on each worker node at the start of training."
]
},
{
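The upload cell is collapsed here. One plausible shape for it, assuming an S3-backed bucket exposed through Anyscale's `ANYSCALE_ARTIFACT_STORAGE` environment variable:

```python
import os

# Assumed destination path; the notebook's real cloud path may differ.
DATA_CLOUD_PATH = os.environ["ANYSCALE_ARTIFACT_STORAGE"] + "/subject_images"

# `aws s3 cp` assumes S3; a GCS-backed bucket would use `gsutil cp -r` instead.
!aws s3 cp --recursive {SUBJECT_IMAGES_PATH} {DATA_CLOUD_PATH}
```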
@@ -123,9 +123,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's come up with some prompts to test our model on after fine-tuning. Notice the `{SUBJECT_TOKEN} {SUBJECT_CLASS}` included in each of them.\n",
"Create some prompts to test our model on after fine-tuning. Notice that every prompt includes the `{SUBJECT_TOKEN} {SUBJECT_CLASS}`.\n",
"\n",
"You can change these to be more fitting for your subject."
"You can change these to be more applicable for your subject."
]
},
{
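As an illustration, a prompt list for the sample dog subject might look like this; the exact prompts in the collapsed cell may differ:

```python
# Every prompt mentions the token-class pair, e.g. "sks dog",
# so that generation targets the fine-tuned subject.
PROMPTS = [
    f"photo of a {SUBJECT_TOKEN} {SUBJECT_CLASS} at the beach",
    f"photo of a {SUBJECT_TOKEN} {SUBJECT_CLASS} in a firefighter outfit",
]
```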
@@ -147,15 +147,15 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 3: Run fine-tuning with Ray Train + HuggingFace Accelerate\n",
"## Step 3: Run fine-tuning with Ray Train and Hugging Face Accelerate\n",
"\n",
"Next, let's launch the distributed fine-tuning job.\n",
"Next, launch the distributed fine-tuning job.\n",
"\n",
"We will use the training script provided by the [HuggingFace diffusers Dreambooth fine-tuning example](https://github.com/huggingface/diffusers/blob/d7634cca87641897baf90f5a006f2d6d16eac6ec/examples/dreambooth/README_sdxl.md) with very slight modifications.\n",
"Use the training script provided by the [Hugging Face diffusers Dreambooth fine-tuning example](https://github.com/huggingface/diffusers/blob/d7634cca87641897baf90f5a006f2d6d16eac6ec/examples/dreambooth/README_sdxl.md) with very slight modifications.\n",
"\n",
"See `train_dreambooth_lora_sdxl.py` for the training script. The example does fine-tuning with [Low Rank Adaptation](https://arxiv.org/abs/2106.09685) (LoRA), which is a method that freezes most layers but injects a small set of trainable layers that get added to existing layers. This method greatly reduces the amount of training state in GPU memory and reduces the checkpoint size, while maintaining the fine-tuned model quality.\n",
"\n",
"This script uses HuggingFace Accelerate, and we will show that it is easy to scale out an existing training script on a Ray cluster with Ray Train."
"This script uses Hugging Face Accelerate, and this example shows that it's easy to scale out an existing training script on a Ray cluster with Ray Train."
]
},
{
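To make the LoRA idea concrete, here is a minimal, self-contained sketch (not code from this template): the pretrained weight is frozen and a small trainable low-rank update is added on top of it.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base layer plus a rank-r trainable update: y = Wx + B(Ax)."""

    def __init__(self, base: nn.Linear, rank: int = 4):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weights
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init: no change at step 0

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + x @ self.A.T @ self.B.T
```

Only `A` and `B` accumulate gradients, which is why both the training state in GPU memory and the checkpoint stay small.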
@@ -164,9 +164,9 @@
"source": [
"### Parse training arguments\n",
"\n",
"The `diffusers` script is originally launched via the command line. Here, we'll launch it with Ray Train instead and pass in the parsed command line arguments, in order to make as few modifications to the training script as possible.\n",
"The original example launches the `diffusers` script at the command line. This example launches it with Ray Train instead and passes in the parsed command line arguments, in order to make as few modifications to the training script as possible.\n",
"\n",
"The settings and hyperparameters below are taken from the [HuggingFace example](https://github.com/huggingface/diffusers/blob/d7634cca87641897baf90f5a006f2d6d16eac6ec/examples/dreambooth/README_sdxl.md)."
"The settings and hyperparameters below are taken from the [Hugging Face example](https://github.com/huggingface/diffusers/blob/d7634cca87641897baf90f5a006f2d6d16eac6ec/examples/dreambooth/README_sdxl.md)."
]
},
{
Expand All @@ -191,7 +191,7 @@
" f\"--instance_prompt=a photo of {SUBJECT_TOKEN} {SUBJECT_CLASS}\",\n",
" \"--resolution=1024\",\n",
" # The global batch size is: num_workers * train_batch_size * gradient_accumulation_steps\n",
" # We define the number of workers later in the TorchTrainer.\n",
" # Define the number of workers later in the TorchTrainer.\n",
" \"--train_batch_size=1\", # This is the batch size *per* worker.\n",
" \"--gradient_accumulation_steps=1\",\n",
" \"--learning_rate=1e-4\",\n",
@@ -220,14 +220,14 @@
"source": [
"### Launch distributed training with Ray Train\n",
"\n",
"To run distributed training, we'll use a `ray.train.torch.TorchTrainer` to request GPU workers and connect them together in a distributed worker group. Then, when the workers run the training script, HuggingFace Accelerate detects this distributed process group and sets up the model to do data parallel training.\n",
"To run distributed training, use a `ray.train.torch.TorchTrainer` to request GPU workers and connect them together in a distributed worker group. Then, when the workers run the training script, Hugging Face Accelerate detects this distributed process group and sets up the model to do data parallel training.\n",
"\n",
"A few notes:\n",
"* `ray.init(runtime_env={\"env_vars\": ...})` sets the environment variables on all workers in the cluster -- setting the environment variable in this notebook on the head node is not enough in a distributed setting.\n",
"* `train_fn_per_worker` is the function that will run on all distributed training workers. In this case, it's just a light wrapper on top of the `diffusers` example script that copies the latest checkpoint to shared cluster storage.\n",
"* `ScalingConfig` is the configuration that determines how many workers and what kind of accelerator to use for training. Once the training is launched, **Anyscale will automatically scale up nodes to meet this resource request!**\n",
"\n",
"The result of this fine-tuning will be a fine-tuned LoRA model checkpoint at `MODEL_CHECKPOINT_PATH`."
"The result of this fine-tuning is a fine-tuned LoRA model checkpoint at `MODEL_CHECKPOINT_PATH`."
]
},
{
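A skeleton of what that launch cell plausibly looks like; the worker count, environment variable, and function body here are illustrative, not the template's exact values:

```python
import ray
from ray.train import ScalingConfig
from ray.train.torch import TorchTrainer

# Env vars set here propagate to every worker in the cluster.
ray.init(runtime_env={"env_vars": {"HF_HOME": "/mnt/local_storage/huggingface"}})  # assumed variable

def train_fn_per_worker(config: dict):
    # Thin wrapper around the diffusers script; Hugging Face Accelerate
    # detects the process group that Ray Train has already set up.
    ...

trainer = TorchTrainer(
    train_fn_per_worker,
    scaling_config=ScalingConfig(num_workers=4, use_gpu=True),  # Anyscale provisions nodes to match
)
result = trainer.fit()
```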
@@ -306,13 +306,13 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 3: Generate some images with your fine-tuned model!\n",
"## Step 3: Generate some images with your fine-tuned model.\n",
"\n",
"Finally, let's generate some images!\n",
"Finally, generate some images!\n",
"\n",
"We'll launch 2 remote GPU tasks to generate images from the `PROMPTS` we defined earlier, one using just the base model and one that loads our fine-tuned LoRA weights. Let's compare them to see the results of fine-tuning!\n",
"Launch 2 remote GPU tasks to generate images from the `PROMPTS` you defined earlier, one using just the base model and one that loads the fine-tuned LoRA weights. Compare them to see the results of fine-tuning.\n",
"\n",
"Note: If your cluster has already scaled down from the training job due to the workers being idle, then this step might take a little longer to relaunch new GPU workers."
"Note: If Anyscale already scaled down your cluster from the training job due to the workers being idle, then this step might take a little longer to relaunch new GPU workers."
]
},
{
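The generation cell is collapsed, but based on the description it plausibly has this shape; the model ID, task signature, and checkpoint handling are assumptions:

```python
import ray

@ray.remote(num_gpus=1)
def generate(prompts, lora_checkpoint_path=None):
    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
    ).to("cuda")
    if lora_checkpoint_path is not None:
        pipe.load_lora_weights(lora_checkpoint_path)  # apply the fine-tuned LoRA weights
    return [pipe(prompt).images[0] for prompt in prompts]

# Two parallel GPU tasks: base model vs. fine-tuned LoRA.
base_ref = generate.remote(PROMPTS)
tuned_ref = generate.remote(PROMPTS, MODEL_CHECKPOINT_PATH)
base_images, tuned_images = ray.get([base_ref, tuned_ref])
```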
@@ -342,7 +342,7 @@
"source": [
"### Images generated with the finetuned model\n",
"\n",
"These images should resemble your subject. If the generated image quality is not satisfactory, refer to the tips in [this blog post](https://huggingface.co/blog/dreambooth#tldr-recommended-settings) to tweak your hyperparameters."
"These images should resemble your subject. If the generated image quality isn't satisfactory, see to the tips in [this blog post](https://huggingface.co/blog/dreambooth#tldr-recommended-settings) to tweak your hyperparameters."
]
},
{
@@ -381,14 +381,14 @@
"source": [
"## Summary\n",
"\n",
"Congrats, you've fine-tuned Stable Diffusion XL!\n",
"At this point, you've fine-tuned Stable Diffusion XL.\n",
"\n",
"As a recap, this notebook:\n",
"1. Installed cluster-wide dependencies.\n",
"2. Scaled out fine-tuning to multiple GPU workers.\n",
"3. Compared the generated output results before and after fine-tuning.\n",
"\n",
"As a next step, you can take the fine-tuned model checkpoint and use it to serve the model. See the tutorial on serving stable diffusion on the home page to get started!"
"As a next step, you can take the fine-tuned model checkpoint and use it to serve the model. See the tutorial on serving stable diffusion on the home page to get started."
]
}
],