diff --git a/404.html b/404.html
index bd801c47..64da1a3e 100644
--- a/404.html
+++ b/404.html
@@ -91,7 +91,7 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
 <div class="pkgdown-footer-right">
   <p></p>
-<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer>
diff --git a/LICENSE-text.html b/LICENSE-text.html
index 24b26acf..18a16416 100644
--- a/LICENSE-text.html
+++ b/LICENSE-text.html
@@ -71,7 +71,7 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/LICENSE.html b/LICENSE.html
index 42cd6e76..850c6ed0 100644
--- a/LICENSE.html
+++ b/LICENSE.html
@@ -60,6 +60,7 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
     </div>
 
 <div id="mit-license" class="section level1">
+
 <p>Copyright (c) 2021 luz authors</p>
 <p>Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:</p>
 <p>The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.</p>
@@ -74,7 +75,7 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/articles/accelerator.html b/articles/accelerator.html
index 08b0f4c8..93d70eb9 100644
--- a/articles/accelerator.html
+++ b/articles/accelerator.html
@@ -77,7 +77,8 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
 
 
-<script src="accelerator_files/accessible-code-block-0.0.1/empty-anchor.js"></script><div class="row">
+
+<div class="row">
   <main id="main" class="col-md-9"><div class="page-header">
       <img src="" class="logo" alt=""><h1>Accelerator API</h1>
             
@@ -90,46 +91,58 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
     
 <div class="sourceCode" id="cb1"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="kw"><a href="https://rdrr.io/r/base/library.html" class="external-link">library</a></span><span class="op">(</span><span class="va"><a href="https://mlverse.github.io/luz/" class="external-link">luz</a></span><span class="op">)</span></span></code></pre></div>
-<p>The Accelerator API is a simplified port of the Hugging Face <a href="https://github.com/huggingface/accelerate" class="external-link">Accelerate library</a>. It allows users to avoid the boilerplate code necessary to write training loops that work correctly on both devices. Currently it only handles CPU and single-GPU usage.</p>
-<p>This API is meant to be the most flexible way you can use the luz package. With the Accelerator API, you write the raw torch training loop and, with a few code changes, you automatically handle device placement of the model, optimizers and dataloaders, so you don’t need to add many <code>$to(device="cuda")</code> calls in your code or think about the order in which to create the model and optimizers.</p>
+<p>The Accelerator API is a simplified port of the Hugging Face <a href="https://github.com/huggingface/accelerate" class="external-link">Accelerate library</a>.
+It allows users to avoid the boilerplate code necessary to write
+training loops that work correctly on both CPU and GPU. Currently it only
+handles CPU and single-GPU usage.</p>
+<p>This API is meant to be the most flexible way you can use the luz
+package. With the Accelerator API, you write the raw torch training loop
+and, with a few code changes, you automatically handle device placement
+of the model, optimizers and dataloaders, so you don’t need to add many
+<code>$to(device="cuda")</code> calls in your code or think about the
+order in which to create the model and optimizers.</p>
 <div class="section level2">
 <h2 id="example">Example<a class="anchor" aria-label="anchor" href="#example"></a>
 </h2>
-<p>The Accelerator API is best explained by showing an example diff in a raw torch training loop.</p>
-<div class="sourceCode" id="cb2"><pre class="sourceCode diff"><code class="sourceCode diff"><span id="cb2-1"><a href="#cb2-1"></a>library(torch)</span>
-<span id="cb2-2"><a href="#cb2-2"></a><span class="va">+ library(luz)</span></span>
-<span id="cb2-3"><a href="#cb2-3"></a></span>
-<span id="cb2-4"><a href="#cb2-4"></a><span class="va">+ acc &lt;- accelerator()</span></span>
-<span id="cb2-5"><a href="#cb2-5"></a><span class="st">- device &lt;- "cpu"</span></span>
-<span id="cb2-6"><a href="#cb2-6"></a></span>
-<span id="cb2-7"><a href="#cb2-7"></a>data &lt;- tensor_dataset(</span>
-<span id="cb2-8"><a href="#cb2-8"></a>  x = torch_randn(100, 10),</span>
-<span id="cb2-9"><a href="#cb2-9"></a>  y = torch_rand(100, 1)</span>
-<span id="cb2-10"><a href="#cb2-10"></a>)</span>
-<span id="cb2-11"><a href="#cb2-11"></a></span>
-<span id="cb2-12"><a href="#cb2-12"></a>dl &lt;- dataloader(data, batch_size = 10)</span>
-<span id="cb2-13"><a href="#cb2-13"></a></span>
-<span id="cb2-14"><a href="#cb2-14"></a>model &lt;- nn_linear(10, 1)</span>
-<span id="cb2-15"><a href="#cb2-15"></a><span class="st">- model$to(device = device)</span></span>
-<span id="cb2-16"><a href="#cb2-16"></a>opt &lt;- optim_adam(model$parameters)</span>
-<span id="cb2-17"><a href="#cb2-17"></a></span>
-<span id="cb2-18"><a href="#cb2-18"></a><span class="va">+ c(model, opt, dl) %&lt;-% acc$prepare(model, opt, dl)</span></span>
-<span id="cb2-19"><a href="#cb2-19"></a></span>
-<span id="cb2-20"><a href="#cb2-20"></a>model$train()</span>
-<span id="cb2-21"><a href="#cb2-21"></a>coro::loop(for (batch in dl) {</span>
-<span id="cb2-22"><a href="#cb2-22"></a></span>
-<span id="cb2-23"><a href="#cb2-23"></a>  opt$zero_grad()</span>
-<span id="cb2-24"><a href="#cb2-24"></a></span>
-<span id="cb2-25"><a href="#cb2-25"></a><span class="st">-  preds &lt;- model(batch$x$to(device = device))</span></span>
-<span id="cb2-26"><a href="#cb2-26"></a><span class="va">+  preds &lt;- model(batch$x)</span></span>
-<span id="cb2-27"><a href="#cb2-27"></a><span class="st">-  loss &lt;- nnf_mse_loss(preds, batch$y$to(device = device))</span></span>
-<span id="cb2-28"><a href="#cb2-28"></a><span class="va">+  loss &lt;- nnf_mse_loss(preds, batch$y)</span></span>
-<span id="cb2-29"><a href="#cb2-29"></a></span>
-<span id="cb2-30"><a href="#cb2-30"></a>  loss$backward()</span>
-<span id="cb2-31"><a href="#cb2-31"></a>  opt$step()</span>
-<span id="cb2-32"><a href="#cb2-32"></a>})</span></code></pre></div>
-<p>With the code changes shown, you no longer need to manually move data and parameters between devices, which makes your code easier to read and less error prone.</p>
-<p>You can find additional documentation using <code><a href="../reference/accelerator.html">help(accelerator)</a></code>.</p>
+<p>The Accelerator API is best explained by showing an example diff in a
+raw torch training loop.</p>
+<div class="sourceCode" id="cb2"><pre class="sourceCode diff"><code class="sourceCode diff"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>library(torch)</span>
+<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a><span class="va">+ library(luz)</span></span>
+<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a><span class="va">+ acc &lt;- accelerator()</span></span>
+<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a><span class="st">- device &lt;- "cpu"</span></span>
+<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a>data &lt;- tensor_dataset(</span>
+<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a>  x = torch_randn(100, 10),</span>
+<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a>  y = torch_rand(100, 1)</span>
+<span id="cb2-10"><a href="#cb2-10" aria-hidden="true" tabindex="-1"></a>)</span>
+<span id="cb2-11"><a href="#cb2-11" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb2-12"><a href="#cb2-12" aria-hidden="true" tabindex="-1"></a>dl &lt;- dataloader(data, batch_size = 10)</span>
+<span id="cb2-13"><a href="#cb2-13" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb2-14"><a href="#cb2-14" aria-hidden="true" tabindex="-1"></a>model &lt;- nn_linear(10, 1)</span>
+<span id="cb2-15"><a href="#cb2-15" aria-hidden="true" tabindex="-1"></a><span class="st">- model$to(device = device)</span></span>
+<span id="cb2-16"><a href="#cb2-16" aria-hidden="true" tabindex="-1"></a>opt &lt;- optim_adam(model$parameters)</span>
+<span id="cb2-17"><a href="#cb2-17" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb2-18"><a href="#cb2-18" aria-hidden="true" tabindex="-1"></a><span class="va">+ c(model, opt, dl) %&lt;-% acc$prepare(model, opt, dl)</span></span>
+<span id="cb2-19"><a href="#cb2-19" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb2-20"><a href="#cb2-20" aria-hidden="true" tabindex="-1"></a>model$train()</span>
+<span id="cb2-21"><a href="#cb2-21" aria-hidden="true" tabindex="-1"></a>coro::loop(for (batch in dl) {</span>
+<span id="cb2-22"><a href="#cb2-22" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb2-23"><a href="#cb2-23" aria-hidden="true" tabindex="-1"></a>  opt$zero_grad()</span>
+<span id="cb2-24"><a href="#cb2-24" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb2-25"><a href="#cb2-25" aria-hidden="true" tabindex="-1"></a><span class="st">-  preds &lt;- model(batch$x$to(device = device))</span></span>
+<span id="cb2-26"><a href="#cb2-26" aria-hidden="true" tabindex="-1"></a><span class="va">+  preds &lt;- model(batch$x)</span></span>
+<span id="cb2-27"><a href="#cb2-27" aria-hidden="true" tabindex="-1"></a><span class="st">-  loss &lt;- nnf_mse_loss(preds, batch$y$to(device = device))</span></span>
+<span id="cb2-28"><a href="#cb2-28" aria-hidden="true" tabindex="-1"></a><span class="va">+  loss &lt;- nnf_mse_loss(preds, batch$y)</span></span>
+<span id="cb2-29"><a href="#cb2-29" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb2-30"><a href="#cb2-30" aria-hidden="true" tabindex="-1"></a>  loss$backward()</span>
+<span id="cb2-31"><a href="#cb2-31" aria-hidden="true" tabindex="-1"></a>  opt$step()</span>
+<span id="cb2-32"><a href="#cb2-32" aria-hidden="true" tabindex="-1"></a>})</span></code></pre></div>
+<p>With the code changes shown, you no longer need to manually move data
+and parameters between devices, which makes your code easier to read and
+less error prone.</p>
+<p>You can find additional documentation using
+<code><a href="../reference/accelerator.html">help(accelerator)</a></code>.</p>
 </div>
   </main>
 </div>
@@ -143,7 +156,7 @@ <h2 id="example">Example<a class="anchor" aria-label="anchor" href="#example"></
 
 <div class="pkgdown-footer-right">
   <p></p>
-<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer>
diff --git a/articles/accelerator_files/accessible-code-block-0.0.1/empty-anchor.js b/articles/accelerator_files/accessible-code-block-0.0.1/empty-anchor.js
deleted file mode 100644
index ca349fd6..00000000
--- a/articles/accelerator_files/accessible-code-block-0.0.1/empty-anchor.js
+++ /dev/null
@@ -1,15 +0,0 @@
-// Hide empty <a> tag within highlighted CodeBlock for screen reader accessibility (see https://github.com/jgm/pandoc/issues/6352#issuecomment-626106786) -->
-// v0.0.1
-// Written by JooYoung Seo (jooyoung@psu.edu) and Atsushi Yasumoto on June 1st, 2020.
-
-document.addEventListener('DOMContentLoaded', function() {
-  const codeList = document.getElementsByClassName("sourceCode");
-  for (var i = 0; i < codeList.length; i++) {
-    var linkList = codeList[i].getElementsByTagName('a');
-    for (var j = 0; j < linkList.length; j++) {
-      if (linkList[j].innerHTML === "") {
-        linkList[j].setAttribute('aria-hidden', 'true');
-      }
-    }
-  }
-});
diff --git a/articles/checkpoints.html b/articles/checkpoints.html
index ec51faf5..6d11dcfa 100644
--- a/articles/checkpoints.html
+++ b/articles/checkpoints.html
@@ -77,7 +77,8 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
 
 
-<script src="checkpoints_files/accessible-code-block-0.0.1/empty-anchor.js"></script><div class="row">
+
+<div class="row">
   <main id="main" class="col-md-9"><div class="page-header">
       <img src="" class="logo" alt=""><h1>Checkpointing your models</h1>
             
@@ -93,15 +94,29 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 <span><span class="kw"><a href="https://rdrr.io/r/base/library.html" class="external-link">library</a></span><span class="op">(</span><span class="va"><a href="https://torch.mlverse.org/docs" class="external-link">torch</a></span><span class="op">)</span></span>
 <span><span class="fu"><a href="https://rdrr.io/r/base/Random.html" class="external-link">set.seed</a></span><span class="op">(</span><span class="fl">1</span><span class="op">)</span></span>
 <span><span class="fu">torch</span><span class="fu">::</span><span class="fu"><a href="https://rdrr.io/pkg/torch/man/torch_manual_seed.html" class="external-link">torch_manual_seed</a></span><span class="op">(</span><span class="fl">1703</span><span class="op">)</span></span></code></pre></div>
-<p>When fitting models take too long you might want to save intermediate state to disk, if something goes wrong during training (eg. process is killed, network fails, etc) you can recover from where it stopped.</p>
-<p>You might also want to recover intermediate results to evaluate the model in different moments of the training, like comparing results after 10 epochs and after 30 epochs.</p>
-<p>This article describes luz features that are built to handle those cases. These features are optional and are enabled once you add specific callbacks to your <code>fit</code> call.</p>
+<p>When fitting a model takes too long, you might want to save intermediate
+state to disk so that, if something goes wrong during training (e.g. the process
+is killed, the network fails, etc.), you can recover from where it stopped.</p>
+<p>You might also want to recover intermediate results to evaluate the
+model in different moments of the training, like comparing results after
+10 epochs and after 30 epochs.</p>
+<p>This article describes luz features that are built to handle those
+cases. These features are optional and are enabled once you add specific
+callbacks to your <code>fit</code> call.</p>
 <div class="section level2">
 <h2 id="resuming-training-runs-that-crashed">Resuming training runs that crashed<a class="anchor" aria-label="anchor" href="#resuming-training-runs-that-crashed"></a>
 </h2>
-<p>If you have a long training run that can crash for whatever reason (computer turned off, process kileed in cluster, etc), we recommend you to add <code>luz_callback_autoresume()</code> to your list of callbacks.</p>
-<p><code>luz_callback_autoresume()</code> will automatically checkpoint the whole state of your model at the end of each epoch. If something fails during training you can simply rerun the same script, whithout any code changes and the checkpoint will be reloaded and the training will start from where it stopped.</p>
-<p>For example, lets’s take a randomly generated training dataset and a linear model to show how autoresume works.</p>
+<p>If you have a long training run that can crash for whatever reason
+(computer turned off, process killed in cluster, etc), we recommend you
+to add <code>luz_callback_auto_resume()</code> to your list of
+callbacks.</p>
+<p><code>luz_callback_auto_resume()</code> will automatically checkpoint
+the whole state of your model at the end of each epoch. If something
+fails during training you can simply rerun the same script, without any
+code changes and the checkpoint will be reloaded and the training will
+start from where it stopped.</p>
+<p>For example, let’s take a randomly generated training dataset and a
+linear model to show how autoresume works.</p>
 <p>Here’s the training data:</p>
 <div class="sourceCode" id="cb2"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">x</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/pkg/torch/man/torch_randn.html" class="external-link">torch_randn</a></span><span class="op">(</span><span class="fl">1000</span>, <span class="fl">10</span><span class="op">)</span></span>
@@ -112,7 +127,9 @@ <h2 id="resuming-training-runs-that-crashed">Resuming training runs that crashed
 <span>  <span class="fu"><a href="../reference/setup.html">setup</a></span><span class="op">(</span>optimizer <span class="op">=</span> <span class="va">optim_sgd</span>, loss <span class="op">=</span> <span class="va">nnf_mse_loss</span><span class="op">)</span> <span class="op"><a href="../reference/pipe.html">%&gt;%</a></span></span>
 <span>  <span class="fu"><a href="../reference/set_hparams.html">set_hparams</a></span><span class="op">(</span>in_features <span class="op">=</span> <span class="fl">10</span>, out_features <span class="op">=</span> <span class="fl">1</span><span class="op">)</span> <span class="op"><a href="../reference/pipe.html">%&gt;%</a></span></span>
 <span>  <span class="fu"><a href="../reference/set_opt_hparams.html">set_opt_hparams</a></span><span class="op">(</span>lr <span class="op">=</span> <span class="fl">0.01</span><span class="op">)</span></span></code></pre></div>
-<p>Let’s now create a callback that simulates a random failure that could happen. This callback will just raise an R error on the 5th epoch.</p>
+<p>Let’s now create a callback that simulates a random failure that
+could happen. This callback will just raise an R error on the 5th
+epoch.</p>
 <div class="sourceCode" id="cb4"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">interrupt</span> <span class="op">&lt;-</span> <span class="fu"><a href="../reference/luz_callback.html">luz_callback</a></span><span class="op">(</span></span>
 <span>  <span class="st">"interrupt"</span>,</span>
@@ -124,7 +141,8 @@ <h2 id="resuming-training-runs-that-crashed">Resuming training runs that crashed
 <span>    <span class="op">}</span></span>
 <span>  <span class="op">}</span></span>
 <span><span class="op">)</span></span></code></pre></div>
-<p>Let’s now start training adding the <code><a href="../reference/luz_callback_auto_resume.html">luz_callback_auto_resume()</a></code>:</p>
+<p>Let’s now start training adding the
+<code><a href="../reference/luz_callback_auto_resume.html">luz_callback_auto_resume()</a></code>:</p>
 <div class="sourceCode" id="cb5"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">autoresume</span> <span class="op">&lt;-</span> <span class="fu"><a href="../reference/luz_callback_auto_resume.html">luz_callback_auto_resume</a></span><span class="op">(</span>path <span class="op">=</span> <span class="st">"state.pt"</span><span class="op">)</span></span>
 <span><span class="va">inter</span> <span class="op">&lt;-</span> <span class="fu">interrupt</span><span class="op">(</span><span class="op">)</span></span>
@@ -140,14 +158,17 @@ <h2 id="resuming-training-runs-that-crashed">Resuming training runs that crashed
 <span><span class="co">#&gt;   <span style="color: #00BB00;">on_epoch_end</span>.</span></span>
 <span><span class="co">#&gt; <span style="font-weight: bold;">Caused by error in `self[[callback_nm]]()`:</span></span></span>
 <span><span class="co">#&gt; <span style="color: #BBBB00;">!</span> Error on epoch 5</span></span></code></pre></div>
-<p>To resume model training exactly from where it stopped you just need to restart fitting, using the exact same model, callbacks, etc:</p>
+<p>To resume model training exactly from where it stopped you just need
+to restart fitting, using the exact same model, callbacks, etc:</p>
 <div class="sourceCode" id="cb6"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">results</span> <span class="op">&lt;-</span> <span class="va">model</span> <span class="op"><a href="../reference/pipe.html">%&gt;%</a></span> <span class="fu"><a href="https://generics.r-lib.org/reference/fit.html" class="external-link">fit</a></span><span class="op">(</span></span>
 <span>  <span class="fu"><a href="https://rdrr.io/r/base/list.html" class="external-link">list</a></span><span class="op">(</span><span class="va">x</span>, <span class="va">y</span><span class="op">)</span>,</span>
 <span>  callbacks <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/r/base/list.html" class="external-link">list</a></span><span class="op">(</span><span class="va">inter</span>, <span class="va">autoresume</span><span class="op">)</span>,</span>
 <span>  verbose <span class="op">=</span> <span class="cn">FALSE</span></span>
 <span><span class="op">)</span></span></code></pre></div>
-<p>With this, the model fitting process will be continued exactly from where it stopped. Records, optimizer and model state are recovered from the previous run so you can have the full results:</p>
+<p>With this, the model fitting process will be continued exactly from
+where it stopped. Records, optimizer and model state are recovered from
+the previous run so you can have the full results:</p>
 <div class="sourceCode" id="cb7"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="fu"><a href="https://rdrr.io/r/graphics/plot.default.html" class="external-link">plot</a></span><span class="op">(</span><span class="va">results</span><span class="op">)</span></span></code></pre></div>
 <p><img src="checkpoints_files/figure-html/unnamed-chunk-7-1.png" width="700"></p>
@@ -155,8 +176,12 @@ <h2 id="resuming-training-runs-that-crashed">Resuming training runs that crashed
 <div class="section level2">
 <h2 id="checkpointing">Checkpointing<a class="anchor" aria-label="anchor" href="#checkpointing"></a>
 </h2>
-<p>Sometimes you want to have more control over how checkpoints are handled. In this case you can use <code><a href="../reference/luz_callback_model_checkpoint.html">luz_callback_model_checkpoint()</a></code> to save checkpoints to a specified file or directory.</p>
-<p>Let’s use the same example as in the resuming section: We first generate some data.</p>
+<p>Sometimes you want to have more control over how checkpoints are
+handled. In this case you can use
+<code><a href="../reference/luz_callback_model_checkpoint.html">luz_callback_model_checkpoint()</a></code> to save checkpoints to a
+specified file or directory.</p>
+<p>Let’s use the same example as in the resuming section: We first
+generate some data.</p>
 <div class="sourceCode" id="cb8"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">x</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/pkg/torch/man/torch_randn.html" class="external-link">torch_randn</a></span><span class="op">(</span><span class="fl">1000</span>, <span class="fl">10</span><span class="op">)</span></span>
 <span><span class="va">y</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/pkg/torch/man/torch_randn.html" class="external-link">torch_randn</a></span><span class="op">(</span><span class="fl">1000</span>, <span class="fl">1</span><span class="op">)</span></span></code></pre></div>
@@ -166,7 +191,8 @@ <h2 id="checkpointing">Checkpointing<a class="anchor" aria-label="anchor" href="
 <span>  <span class="fu"><a href="../reference/setup.html">setup</a></span><span class="op">(</span>optimizer <span class="op">=</span> <span class="va">optim_sgd</span>, loss <span class="op">=</span> <span class="va">nnf_mse_loss</span><span class="op">)</span> <span class="op"><a href="../reference/pipe.html">%&gt;%</a></span></span>
 <span>  <span class="fu"><a href="../reference/set_hparams.html">set_hparams</a></span><span class="op">(</span>in_features <span class="op">=</span> <span class="fl">10</span>, out_features <span class="op">=</span> <span class="fl">1</span><span class="op">)</span> <span class="op"><a href="../reference/pipe.html">%&gt;%</a></span></span>
 <span>  <span class="fu"><a href="../reference/set_opt_hparams.html">set_opt_hparams</a></span><span class="op">(</span>lr <span class="op">=</span> <span class="fl">0.01</span><span class="op">)</span></span></code></pre></div>
-<p>Let’s now fit the model using <code><a href="../reference/luz_callback_model_checkpoint.html">luz_callback_model_checkpoint()</a></code>.</p>
+<p>Let’s now fit the model using
+<code><a href="../reference/luz_callback_model_checkpoint.html">luz_callback_model_checkpoint()</a></code>.</p>
 <div class="sourceCode" id="cb10"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">checkpoint</span> <span class="op">&lt;-</span> <span class="fu"><a href="../reference/luz_callback_model_checkpoint.html">luz_callback_model_checkpoint</a></span><span class="op">(</span></span>
 <span>  path <span class="op">=</span> <span class="st">"checkpoints/"</span>, </span>
@@ -178,7 +204,12 @@ <h2 id="checkpointing">Checkpointing<a class="anchor" aria-label="anchor" href="
 <span>  callbacks <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/r/base/list.html" class="external-link">list</a></span><span class="op">(</span><span class="va">checkpoint</span><span class="op">)</span>,</span>
 <span>  verbose <span class="op">=</span> <span class="cn">FALSE</span></span>
 <span><span class="op">)</span></span></code></pre></div>
-<p>You can see now that the <code>checkpoints</code> directory contains files with state dumps for each epoch. By default, <code>luz_callback_model_checkpoint</code> will save the state for each epochs and format the name including the resulting loss. This can be configured withing the path parameter, see <code><a href="../reference/luz_callback_model_checkpoint.html">?luz_callback_model_checkpoint</a></code> for details.</p>
+<p>You can see now that the <code>checkpoints</code> directory contains
+files with state dumps for each epoch. By default,
+<code>luz_callback_model_checkpoint</code> will save the state for each
+epoch and format the file name to include the resulting loss. This can be
+configured with the path parameter, see
+<code><a href="../reference/luz_callback_model_checkpoint.html">?luz_callback_model_checkpoint</a></code> for details.</p>
 <div class="sourceCode" id="cb11"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="fu">fs</span><span class="fu">::</span><span class="fu"><a href="https://fs.r-lib.org/reference/dir_ls.html" class="external-link">dir_ls</a></span><span class="op">(</span><span class="st">"checkpoints"</span><span class="op">)</span></span>
 <span><span class="co">#&gt; checkpoints/epoch-01-train_loss-1.237.pt</span></span>
@@ -191,11 +222,21 @@ <h2 id="checkpointing">Checkpointing<a class="anchor" aria-label="anchor" href="
 <span><span class="co">#&gt; checkpoints/epoch-08-train_loss-0.998.pt</span></span>
 <span><span class="co">#&gt; checkpoints/epoch-09-train_loss-1.001.pt</span></span>
 <span><span class="co">#&gt; checkpoints/epoch-10-train_loss-1.002.pt</span></span></code></pre></div>
-<p>Finally, you can load a specific checkpoint to the <code>fitted</code> result using <code>luz_load_checkpoint</code>. Note that loading the checkpoint into a a <code>luz_fitted_module</code> is going to modify the model weights in-place.</p>
+<p>Finally, you can load a specific checkpoint to the
+<code>fitted</code> result using <code>luz_load_checkpoint</code>. Note
+that loading the checkpoint into a <code>luz_fitted_module</code> is
+going to modify the model weights in-place.</p>
 <div class="sourceCode" id="cb12"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="fu"><a href="../reference/luz_load_checkpoint.html">luz_load_checkpoint</a></span><span class="op">(</span><span class="va">results</span>, <span class="fu">fs</span><span class="fu">::</span><span class="fu"><a href="https://fs.r-lib.org/reference/dir_ls.html" class="external-link">dir_ls</a></span><span class="op">(</span><span class="st">"checkpoints"</span><span class="op">)</span><span class="op">[</span><span class="fl">1</span><span class="op">]</span><span class="op">)</span></span></code></pre></div>
-<p>You can then start making predictions, or evaluate your model using the reloeded weights.</p>
-<p>You might also want to start a new training run from a checkpoint. For this, you can use the <code><a href="../reference/luz_callback_resume_from_checkpoint.html">luz_callback_resume_from_checkpoint()</a></code>. By default, it will only recover the model weights from the checkpoint file, but you can configure it to restore records, callback and optimizer state too. If a checkpoint directory is passed then training will resume from the last checkpoint file as returned by <code><a href="https://fs.r-lib.org/reference/dir_ls.html" class="external-link">fs::dir_ls</a></code>.</p>
+<p>You can then start making predictions, or evaluate your model using
+the reloaded weights.</p>
+<p>You might also want to start a new training run from a checkpoint.
+For this, you can use the
+<code><a href="../reference/luz_callback_resume_from_checkpoint.html">luz_callback_resume_from_checkpoint()</a></code>. By default, it will
+only recover the model weights from the checkpoint file, but you can
+configure it to restore records, callback and optimizer state too. If a
+checkpoint directory is passed then training will resume from the last
+checkpoint file as returned by <code><a href="https://fs.r-lib.org/reference/dir_ls.html" class="external-link">fs::dir_ls</a></code>.</p>
 <p>Here’s how you would use this callback:</p>
 <div class="sourceCode" id="cb13"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">resume</span> <span class="op">&lt;-</span> <span class="fu"><a href="../reference/luz_callback_resume_from_checkpoint.html">luz_callback_resume_from_checkpoint</a></span><span class="op">(</span>path <span class="op">=</span> <span class="st">"checkpoints/"</span><span class="op">)</span></span>
@@ -209,8 +250,15 @@ <h2 id="checkpointing">Checkpointing<a class="anchor" aria-label="anchor" href="
 <div class="section level3">
 <h3 id="custom-callbacks-state">Custom callbacks state<a class="anchor" aria-label="anchor" href="#custom-callbacks-state"></a>
 </h3>
-<p>Sometimes callbacks also need to keep their internal state in order to allow continuing training exactly from where it stopped. In this case, callbacks can implement the <code>state_dict()</code> and the <code><a href="https://rdrr.io/pkg/torch/man/load_state_dict.html" class="external-link">load_state_dict()</a></code> methods that are automatically called when saving and reloading checkpoints.</p>
-<p>For example, suppose that you have a callback that tracks gradients for weights at every epoch. You want to use the tracked weights to further analyse the training procedure. It could be implemented like:</p>
+<p>Sometimes callbacks also need to keep their internal state in order
+to allow continuing training exactly from where it stopped. In this
+case, callbacks can implement the <code>state_dict()</code> and the
+<code><a href="https://rdrr.io/pkg/torch/man/load_state_dict.html" class="external-link">load_state_dict()</a></code> methods that are automatically called
+when saving and reloading checkpoints.</p>
+<p>For example, suppose that you have a callback that tracks gradients
+for weights at every epoch. You want to use the tracked gradients to
+further analyse the training procedure. It could be implemented
+like:</p>
 <div class="sourceCode" id="cb14"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">cb_weight_grad</span> <span class="op">&lt;-</span> <span class="fu"><a href="../reference/luz_callback.html">luz_callback</a></span><span class="op">(</span></span>
 <span>  <span class="st">"weight_grad"</span>,</span>
@@ -225,7 +273,14 @@ <h3 id="custom-callbacks-state">Custom callbacks state<a class="anchor" aria-lab
 <span>    <span class="op">}</span></span>
 <span>  <span class="op">}</span></span>
 <span><span class="op">)</span></span></code></pre></div>
-<p>In the above example, the <code>gradients</code> field is a <strong>state</strong> in the callback. If training fails for some reason, <code>gradients</code> will be lost. If it’s important for you to also checkpoint the callback state, you can implement the <code>state_dict()</code> method must returning a named list of objects that compose the state of the callback and <code><a href="https://rdrr.io/pkg/torch/man/load_state_dict.html" class="external-link">load_state_dict()</a></code> taking the same named list returned by <code>state_dict()</code> and restoring the callback state.</p>
+<p>In the above example, the <code>gradients</code> field is a
+<strong>state</strong> in the callback. If training fails for some
+reason, <code>gradients</code> will be lost. If it’s important for you
+to also checkpoint the callback state, you can implement the
+<code>state_dict()</code> method, which must return a named list of objects
+that compose the state of the callback, and
+<code><a href="https://rdrr.io/pkg/torch/man/load_state_dict.html" class="external-link">load_state_dict()</a></code>, which takes the same named list returned by
+<code>state_dict()</code> and restores the callback state.</p>
 <p>The callback above could be reimplemented with:</p>
 <div class="sourceCode" id="cb15"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">cb_weight_grad</span> <span class="op">&lt;-</span> <span class="fu"><a href="../reference/luz_callback.html">luz_callback</a></span><span class="op">(</span></span>
@@ -262,7 +317,7 @@ <h3 id="custom-callbacks-state">Custom callbacks state<a class="anchor" aria-lab
 
 <div class="pkgdown-footer-right">
   <p></p>
-<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer>
diff --git a/articles/checkpoints_files/accessible-code-block-0.0.1/empty-anchor.js b/articles/checkpoints_files/accessible-code-block-0.0.1/empty-anchor.js
deleted file mode 100644
index ca349fd6..00000000
--- a/articles/checkpoints_files/accessible-code-block-0.0.1/empty-anchor.js
+++ /dev/null
@@ -1,15 +0,0 @@
-// Hide empty <a> tag within highlighted CodeBlock for screen reader accessibility (see https://github.com/jgm/pandoc/issues/6352#issuecomment-626106786) -->
-// v0.0.1
-// Written by JooYoung Seo (jooyoung@psu.edu) and Atsushi Yasumoto on June 1st, 2020.
-
-document.addEventListener('DOMContentLoaded', function() {
-  const codeList = document.getElementsByClassName("sourceCode");
-  for (var i = 0; i < codeList.length; i++) {
-    var linkList = codeList[i].getElementsByTagName('a');
-    for (var j = 0; j < linkList.length; j++) {
-      if (linkList[j].innerHTML === "") {
-        linkList[j].setAttribute('aria-hidden', 'true');
-      }
-    }
-  }
-});
diff --git a/articles/custom-loop.html b/articles/custom-loop.html
index b27b17d8..658563a7 100644
--- a/articles/custom-loop.html
+++ b/articles/custom-loop.html
@@ -77,7 +77,8 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
 
 
-<script src="custom-loop_files/accessible-code-block-0.0.1/empty-anchor.js"></script><div class="row">
+
+<div class="row">
   <main id="main" class="col-md-9"><div class="page-header">
       <img src="" class="logo" alt=""><h1>Custom loops with luz</h1>
             
@@ -91,15 +92,36 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 <div class="sourceCode" id="cb1"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="kw"><a href="https://rdrr.io/r/base/library.html" class="external-link">library</a></span><span class="op">(</span><span class="va"><a href="https://torch.mlverse.org/docs" class="external-link">torch</a></span><span class="op">)</span></span>
 <span><span class="kw"><a href="https://rdrr.io/r/base/library.html" class="external-link">library</a></span><span class="op">(</span><span class="va"><a href="https://mlverse.github.io/luz/" class="external-link">luz</a></span><span class="op">)</span></span></code></pre></div>
-<p>Luz is a higher level API for torch that is designed to be highly flexible by providing a layered API that allows it to be useful no matter the level of control your need for your training loop.</p>
-<p>In the getting started vignette we have seen the basics of luz and how to quickly modify parts of the training loop using callbacks and custom metrics. In this document we will describe how luz allows the user to get fine-grained control of the training loop.</p>
-<p>Apart from the use of callbacks, there are three more ways that you can use luz (depending on how much control you need):</p>
+<p>Luz is a higher level API for torch that is designed to be highly
+flexible by providing a layered API that allows it to be useful no
+matter the level of control you need for your training loop.</p>
+<p>In the getting started vignette we have seen the basics of luz and
+how to quickly modify parts of the training loop using callbacks and
+custom metrics. In this document we will describe how luz allows the
+user to get fine-grained control of the training loop.</p>
+<p>Apart from the use of callbacks, there are three more ways that you
+can use luz (depending on how much control you need):</p>
 <ul>
-<li><p><strong>Multiple optimizers or losses:</strong> You might be optimizing two loss functions each with its own optimizer, but you still don’t want to modify the <code>backward()</code> - <code>zero_grad()</code> and <code><a href="https://rdrr.io/r/stats/step.html" class="external-link">step()</a></code> calls. This is common in models like GANs (Generative Adversarial Networks) when you have competing neural networks trained with different losses and optimizers.</p></li>
-<li><p><strong>Fully flexible steps:</strong> You might want to be in control of how to call <code>backward()</code>, <code>zero_grad()</code>and <code><a href="https://rdrr.io/r/stats/step.html" class="external-link">step()</a></code>. You might also want to have more control of gradient computation. For example, you might want to use ‘virtual batch sizes’, where you accumulate the gradients for a few steps before updating the weights.</p></li>
-<li><p><strong>Completely flexible loops:</strong> Your training loop can be anything you want but you still want to use luz to handle device placement of the dataloaders, optimizers and models. See <code><a href="../articles/accelerator.html">vignette("accelerator")</a></code>.</p></li>
+<li><p><strong>Multiple optimizers or losses:</strong> You might be
+optimizing two loss functions each with its own optimizer, but you still
+don’t want to modify the <code>backward()</code>,
+<code>zero_grad()</code> and <code><a href="https://rdrr.io/r/stats/step.html" class="external-link">step()</a></code> calls. This is common
+in models like GANs (Generative Adversarial Networks) when you have
+competing neural networks trained with different losses and
+optimizers.</p></li>
+<li><p><strong>Fully flexible steps:</strong> You might want to be in
+control of how to call <code>backward()</code>,
+<code>zero_grad()</code> and <code><a href="https://rdrr.io/r/stats/step.html" class="external-link">step()</a></code>. You might also want to
+have more control of gradient computation. For example, you might want
+to use ‘virtual batch sizes’, where you accumulate the gradients for a
+few steps before updating the weights.</p></li>
+<li><p><strong>Completely flexible loops:</strong> Your training loop
+can be anything you want but you still want to use luz to handle device
+placement of the dataloaders, optimizers and models. See
+<code><a href="../articles/accelerator.html">vignette("accelerator")</a></code>.</p></li>
 </ul>
-<p>Let’s consider a simplified version of the <code>net</code> that we implemented in the getting started vignette:</p>
+<p>Let’s consider a simplified version of the <code>net</code> that we
+implemented in the getting started vignette:</p>
 <div class="sourceCode" id="cb2"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">net</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/pkg/torch/man/nn_module.html" class="external-link">nn_module</a></span><span class="op">(</span></span>
 <span>  <span class="st">"Net"</span>,</span>
@@ -128,11 +150,18 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 <div class="section level2">
 <h2 id="multiple-optimizers">Multiple optimizers<a class="anchor" aria-label="anchor" href="#multiple-optimizers"></a>
 </h2>
-<p>Suppose we want to do an experiment where we train the first fully connected layer using a learning rate of 0.1 and the second one using a learning rate of 0.01. We will minimize the same <code><a href="https://rdrr.io/pkg/torch/man/nn_cross_entropy_loss.html" class="external-link">nn_cross_entropy_loss()</a></code> for both, but for the first layer we want to add L1 regularization on the weights.</p>
-<p>In order to use luz for this, we will implement two methods in the <code>net</code> module:</p>
+<p>Suppose we want to do an experiment where we train the first fully
+connected layer using a learning rate of 0.1 and the second one using a
+learning rate of 0.01. We will minimize the same
+<code><a href="https://rdrr.io/pkg/torch/man/nn_cross_entropy_loss.html" class="external-link">nn_cross_entropy_loss()</a></code> for both, but for the first layer
+we want to add L1 regularization on the weights.</p>
+<p>In order to use luz for this, we will implement two methods in the
+<code>net</code> module:</p>
 <ul>
-<li><p><code>set_optimizers</code>: returns a named list of optimizers depending on the <code>ctx</code>.</p></li>
-<li><p><code>loss</code>: computes the loss depending on the selected optimizer.</p></li>
+<li><p><code>set_optimizers</code>: returns a named list of optimizers
+depending on the <code>ctx</code>.</p></li>
+<li><p><code>loss</code>: computes the loss depending on the selected
+optimizer.</p></li>
 </ul>
 <p>Let’s go to the code:</p>
 <div class="sourceCode" id="cb4"><pre class="downlit sourceCode r">
@@ -163,19 +192,35 @@ <h2 id="multiple-optimizers">Multiple optimizers<a class="anchor" aria-label="an
 <span>      <span class="fu"><a href="https://rdrr.io/pkg/torch/man/nnf_cross_entropy.html" class="external-link">nnf_cross_entropy</a></span><span class="op">(</span><span class="va">pred</span>, <span class="va">target</span><span class="op">)</span></span>
 <span>  <span class="op">}</span></span>
 <span><span class="op">)</span></span></code></pre></div>
-<p>Notice that the model optimizers will be initialized according to the <code>set_optimizers()</code> method’s return value (a list). In this case, we are initializing the optimizers using different model parameters and learning rates.</p>
-<p>The <code>loss()</code> method is responsible for computing the loss that will then be back-propagated to compute gradients and update the weights. This <code>loss()</code> method can access the <code>ctx</code> object that will contain an <code>opt_name</code> field, describing which optimizer is currently being used. Note that this function will be called once for each optimizer for each training and validation step. See <code><a href="../reference/ctx.html">help("ctx")</a></code> for complete information about the context object.</p>
-<p>We can finally <code>setup</code> and <code>fit</code> this module, however we no longer need to specify optimizers and loss functions.</p>
+<p>Notice that the model optimizers will be initialized according to the
+<code>set_optimizers()</code> method’s return value (a list). In this
+case, we are initializing the optimizers using different model
+parameters and learning rates.</p>
+<p>The <code>loss()</code> method is responsible for computing the loss
+that will then be back-propagated to compute gradients and update the
+weights. This <code>loss()</code> method can access the <code>ctx</code>
+object that will contain an <code>opt_name</code> field, describing
+which optimizer is currently being used. Note that this function will be
+called once for each optimizer for each training and validation step.
+See <code><a href="../reference/ctx.html">help("ctx")</a></code> for complete information about the context
+object.</p>
+<p>We can finally <code>setup</code> and <code>fit</code> this module;
+however, we no longer need to specify optimizers and loss functions.</p>
 <div class="sourceCode" id="cb5"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">fitted</span> <span class="op">&lt;-</span> <span class="va">net</span> <span class="op"><a href="../reference/pipe.html">%&gt;%</a></span> </span>
 <span>  <span class="fu"><a href="../reference/setup.html">setup</a></span><span class="op">(</span>metrics <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/r/base/list.html" class="external-link">list</a></span><span class="op">(</span><span class="va">luz_metric_accuracy</span><span class="op">)</span><span class="op">)</span> <span class="op"><a href="../reference/pipe.html">%&gt;%</a></span> </span>
 <span>  <span class="fu"><a href="https://generics.r-lib.org/reference/fit.html" class="external-link">fit</a></span><span class="op">(</span><span class="va">train_dl</span>, epochs <span class="op">=</span> <span class="fl">10</span>, valid_data <span class="op">=</span> <span class="va">test_dl</span><span class="op">)</span></span></code></pre></div>
-<p>Now let’s re-implement this same model using the slightly more flexible approach of overriding the training and validation step.</p>
+<p>Now let’s re-implement this same model using the slightly more
+flexible approach of overriding the training and validation step.</p>
 </div>
 <div class="section level2">
 <h2 id="fully-flexible-step">Fully flexible step<a class="anchor" aria-label="anchor" href="#fully-flexible-step"></a>
 </h2>
-<p>Instead of implementing the <code>loss()</code> method, we can implement the <code><a href="https://rdrr.io/r/stats/step.html" class="external-link">step()</a></code> method. This allows us to flexibly modify what happens when training and validating for each batch in the dataset. You are now responsible for updating the weights by stepping the optimizers and back-propagating the loss.</p>
+<p>Instead of implementing the <code>loss()</code> method, we can
+implement the <code><a href="https://rdrr.io/r/stats/step.html" class="external-link">step()</a></code> method. This allows us to flexibly
+modify what happens when training and validating for each batch in the
+dataset. You are now responsible for updating the weights by stepping
+the optimizers and back-propagating the loss.</p>
 <div class="sourceCode" id="cb6"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">net</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/pkg/torch/man/nn_module.html" class="external-link">nn_module</a></span><span class="op">(</span></span>
 <span>  <span class="st">"Net"</span>,</span>
@@ -221,19 +266,43 @@ <h2 id="fully-flexible-step">Fully flexible step<a class="anchor" aria-label="an
 <span><span class="op">)</span></span></code></pre></div>
 <p>The important things to notice here are:</p>
 <ul>
-<li><p>The <code><a href="https://rdrr.io/r/stats/step.html" class="external-link">step()</a></code> method is used for both training and validation. You need to be careful to only modify the weights when training. Again, you can get complete information regarding the context object using <code><a href="../reference/ctx.html">help("ctx")</a></code>.</p></li>
-<li><p><code>ctx$optimizers</code> is a named list holding each optimizer that was created when the <code>set_optimizers()</code> method was called.</p></li>
-<li><p>You need to manually track the losses by saving saving them in a named list in <code>ctx$loss</code>. By convention, we use the same name as the optimizer it refers to. It is good practice to <code><a href="https://rdrr.io/r/base/detach.html" class="external-link">detach()</a></code> them before saving to reduce memory usage.</p></li>
-<li><p>Callbacks that would be called inside the default <code><a href="https://rdrr.io/r/stats/step.html" class="external-link">step()</a></code> method like <code>on_train_batch_after_pred</code>, <code>on_train_batch_after_loss</code>, etc, won’t be automatically called. You can still cal them manually by adding <code>ctx$call_callbacks("&lt;callback name&gt;")</code> inside your training step. See the code for <code>fit_one_batch()</code> and <code>valid_one_batch</code> to find all the callbacks that won’t be called.</p></li>
-<li><p>If you want luz metrics to work with your custom <code><a href="https://rdrr.io/r/stats/step.html" class="external-link">step()</a></code> method, you must assign <code>ctx$pred</code> with the model predictions as metrics will always be called with <code>metric$update(ctx$pred, ctx$target)</code>.</p></li>
+<li><p>The <code><a href="https://rdrr.io/r/stats/step.html" class="external-link">step()</a></code> method is used for both training and
+validation. You need to be careful to only modify the weights when
+training. Again, you can get complete information regarding the context
+object using <code><a href="../reference/ctx.html">help("ctx")</a></code>.</p></li>
+<li><p><code>ctx$optimizers</code> is a named list holding each
+optimizer that was created when the <code>set_optimizers()</code> method
+was called.</p></li>
+<li><p>You need to manually track the losses by saving them in a
+named list in <code>ctx$loss</code>. By convention, we use the same name
+as the optimizer it refers to. It is good practice to
+<code><a href="https://rdrr.io/r/base/detach.html" class="external-link">detach()</a></code> them before saving to reduce memory
+usage.</p></li>
+<li><p>Callbacks that would be called inside the default
+<code><a href="https://rdrr.io/r/stats/step.html" class="external-link">step()</a></code> method like <code>on_train_batch_after_pred</code>,
+<code>on_train_batch_after_loss</code>, etc., won’t be automatically
+called. You can still call them manually by adding
+<code>ctx$call_callbacks("&lt;callback name&gt;")</code> inside your
+training step. See the code for <code>fit_one_batch()</code> and
+<code>valid_one_batch</code> to find all the callbacks that won’t be
+called.</p></li>
+<li><p>If you want luz metrics to work with your custom
+<code><a href="https://rdrr.io/r/stats/step.html" class="external-link">step()</a></code> method, you must assign the model predictions to
+<code>ctx$pred</code>, as metrics will always be called with
+<code>metric$update(ctx$pred, ctx$target)</code>.</p></li>
 </ul>
 </div>
 <div class="section level2">
 <h2 id="next-steps">Next steps<a class="anchor" aria-label="anchor" href="#next-steps"></a>
 </h2>
-<p>In this article you learned how to customize the <code><a href="https://rdrr.io/r/stats/step.html" class="external-link">step()</a></code> of your training loop using luz layered functionality.</p>
-<p>Luz also allows more flexible modifications of the training loop described in the Accelerator vignette (<code><a href="../articles/accelerator.html">vignette("accelerator")</a></code>).</p>
-<p>You should now be able to follow the examples marked with the ‘intermediate’ and ‘advanced’ category in the <a href="https://mlverse.github.io/luz/articles/examples/index.html" class="external-link">examples gallery</a>.</p>
+<p>In this article you learned how to customize the <code><a href="https://rdrr.io/r/stats/step.html" class="external-link">step()</a></code>
+of your training loop using luz’s layered functionality.</p>
+<p>Luz also allows more flexible modifications of the training loop, as
+described in the Accelerator vignette
+(<code><a href="../articles/accelerator.html">vignette("accelerator")</a></code>).</p>
+<p>You should now be able to follow the examples marked with the
+‘intermediate’ and ‘advanced’ categories in the <a href="https://mlverse.github.io/luz/articles/examples/index.html" class="external-link">examples
+gallery</a>.</p>
 </div>
   </main><aside class="col-md-3"><nav id="toc"><h2>On this page</h2>
     </nav></aside>
@@ -248,7 +317,7 @@ <h2 id="next-steps">Next steps<a class="anchor" aria-label="anchor" href="#next-
 
 <div class="pkgdown-footer-right">
   <p></p>
-<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer>
diff --git a/articles/custom-loop_files/accessible-code-block-0.0.1/empty-anchor.js b/articles/custom-loop_files/accessible-code-block-0.0.1/empty-anchor.js
deleted file mode 100644
index ca349fd6..00000000
--- a/articles/custom-loop_files/accessible-code-block-0.0.1/empty-anchor.js
+++ /dev/null
@@ -1,15 +0,0 @@
-// Hide empty <a> tag within highlighted CodeBlock for screen reader accessibility (see https://github.com/jgm/pandoc/issues/6352#issuecomment-626106786) -->
-// v0.0.1
-// Written by JooYoung Seo (jooyoung@psu.edu) and Atsushi Yasumoto on June 1st, 2020.
-
-document.addEventListener('DOMContentLoaded', function() {
-  const codeList = document.getElementsByClassName("sourceCode");
-  for (var i = 0; i < codeList.length; i++) {
-    var linkList = codeList[i].getElementsByTagName('a');
-    for (var j = 0; j < linkList.length; j++) {
-      if (linkList[j].innerHTML === "") {
-        linkList[j].setAttribute('aria-hidden', 'true');
-      }
-    }
-  }
-});
diff --git a/articles/examples/chargpt.html b/articles/examples/chargpt.html
index 98f09978..270e7efa 100644
--- a/articles/examples/chargpt.html
+++ b/articles/examples/chargpt.html
@@ -77,7 +77,8 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
 
 
-<script src="chargpt_files/accessible-code-block-0.0.1/empty-anchor.js"></script><div class="row">
+
+<div class="row">
   <main id="main" class="col-md-9"><div class="page-header">
       <img src="" class="logo" alt=""><h1>CharGPT</h1>
             
@@ -88,15 +89,24 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
     
     
-<p>This example is inspired by the <a href="https://github.com/karpathy/minGPT/tree/master/projects/chargpt" class="external-link">chargpt</a> project by Andrey Karpathy. We are going to train character-level language model on Shakespeare texts.</p>
+<p>This example is inspired by the <a href="https://github.com/karpathy/minGPT/tree/master/projects/chargpt" class="external-link">chargpt</a>
+project by Andrej Karpathy. We are going to train a character-level
+language model on Shakespeare texts.</p>
 <p>We first load the libraries that we plan to use:</p>
 <div class="sourceCode" id="cb1"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="kw"><a href="https://rdrr.io/r/base/library.html" class="external-link">library</a></span><span class="op">(</span><span class="va"><a href="https://torch.mlverse.org/docs" class="external-link">torch</a></span><span class="op">)</span></span>
 <span><span class="kw"><a href="https://rdrr.io/r/base/library.html" class="external-link">library</a></span><span class="op">(</span><span class="va"><a href="https://mlverse.github.io/luz/" class="external-link">luz</a></span><span class="op">)</span></span>
 <span><span class="kw"><a href="https://rdrr.io/r/base/library.html" class="external-link">library</a></span><span class="op">(</span><span class="va"><a href="https://github.com/nteetor/zeallot" class="external-link">zeallot</a></span><span class="op">)</span></span></code></pre></div>
-<p>Next we define the torch dataset that will pre-process data for the model. It splits the text into a character vector, each element containing exactly one character.</p>
-<p>Then lists all unique characters into the <code>vocab</code> attribute. The order of the characters in the vocabulary is used to encode each character to an integer value, that will be used in the embedding layer.</p>
-<p>The <code>.getitem()</code> method, can take chunks of <code>block_size</code> characters and encode them into their integer representation.</p>
+<p>Next we define the torch dataset that will pre-process data for the
+model. It splits the text into a character vector, each element
+containing exactly one character.</p>
+<p>It then lists all unique characters in the <code>vocab</code>
+attribute. The order of the characters in the vocabulary is used to
+encode each character to an integer value that will be used in the
+embedding layer.</p>
+<p>The <code>.getitem()</code> method can take chunks of
+<code>block_size</code> characters and encode them into their integer
+representation.</p>
 <div class="sourceCode" id="cb2"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">url</span> <span class="op">&lt;-</span> <span class="st">"https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt"</span></span>
 <span></span>
@@ -124,8 +134,15 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 <span></span>
 <span><span class="va">dataset</span> <span class="op">&lt;-</span> <span class="fu">char_dataset</span><span class="op">(</span><span class="fu">readr</span><span class="fu">::</span><span class="fu">read_file</span><span class="op">(</span><span class="va">url</span><span class="op">)</span><span class="op">)</span></span>
 <span><span class="va">dataset</span><span class="op">[</span><span class="fl">1</span><span class="op">]</span> <span class="co"># this allows us to see an element of the dataset</span></span></code></pre></div>
-<p>We then define the neural net we are going to train. Defining a GPT-2 model is quite verbose, so we are going to use the minhub implementation directly. You can find the full model definition <a href="https://github.com/mlverse/minhub/blob/main/R/gpt2.R" class="external-link">here</a>, and this code is entirely self-contained, so you don’t need to install minhub, if you don’t want to.</p>
-<p>We also implemented the <code>generate</code> method for the model, that allows one to generate completions using the model. It applies the model in a loop, at each iteration prediction what’s the next character.</p>
+<p>We then define the neural net we are going to train. Defining a GPT-2
+model is quite verbose, so we are going to use the minhub implementation
+directly. You can find the full model definition <a href="https://github.com/mlverse/minhub/blob/main/R/gpt2.R" class="external-link">here</a>,
+and this code is entirely self-contained, so you don’t need to install
+minhub if you don’t want to.</p>
+<p>We also implemented the <code>generate</code> method for the model,
+which allows one to generate completions using the model. It applies the
+model in a loop, predicting the next character at each
+iteration.</p>
 <div class="sourceCode" id="cb3"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">model</span> <span class="op">&lt;-</span> <span class="fu">torch</span><span class="fu">::</span><span class="fu"><a href="https://rdrr.io/pkg/torch/man/nn_module.html" class="external-link">nn_module</a></span><span class="op">(</span></span>
 <span>    initialize <span class="op">=</span> <span class="kw">function</span><span class="op">(</span><span class="va">vocab_size</span><span class="op">)</span> <span class="op">{</span></span>
@@ -155,7 +172,8 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 <span>        <span class="va">x</span></span>
 <span>    <span class="op">}</span></span>
 <span><span class="op">)</span></span></code></pre></div>
-<p>Next, we implemented a callback that is used for nicely displaying generated samples during the model training:</p>
+<p>Next, we implemented a callback that is used for nicely displaying
+generated samples during the model training:</p>
 <div class="sourceCode" id="cb4"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="co"># samples from the model using the context.</span></span>
 <span><span class="va">generate</span> <span class="op">&lt;-</span> <span class="kw">function</span><span class="op">(</span><span class="va">model</span>, <span class="va">vocab</span>, <span class="va">context</span>, <span class="va">...</span><span class="op">)</span> <span class="op">{</span></span>
@@ -203,7 +221,8 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 <span>        <span class="fu"><a href="../../reference/luz_callback_gradient_clip.html">luz_callback_gradient_clip</a></span><span class="op">(</span>max_norm <span class="op">=</span> <span class="fl">1</span><span class="op">)</span></span>
 <span>      <span class="op">)</span></span>
 <span>    <span class="op">)</span></span></code></pre></div>
-<p>One epoch, is reasonable for this dataset and takes ~1h on the M1 MBP. You can generate new samples with:</p>
+<p>One epoch is reasonable for this dataset and takes ~1h on the M1
+MBP. You can generate new samples with:</p>
 <div class="sourceCode" id="cb6"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">context</span> <span class="op">&lt;-</span> <span class="st">"O God, O God!"</span></span>
 <span><span class="va">text</span> <span class="op">&lt;-</span> <span class="fu">generate</span><span class="op">(</span><span class="va">fitted</span><span class="op">$</span><span class="va">model</span>, <span class="va">dataset</span><span class="op">$</span><span class="va">vocab</span>, <span class="va">context</span>, iter <span class="op">=</span> <span class="fl">100</span><span class="op">)</span></span>
@@ -220,7 +239,7 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
 <div class="pkgdown-footer-right">
   <p></p>
-<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer>
diff --git a/articles/examples/chargpt_files/accessible-code-block-0.0.1/empty-anchor.js b/articles/examples/chargpt_files/accessible-code-block-0.0.1/empty-anchor.js
deleted file mode 100644
index ca349fd6..00000000
--- a/articles/examples/chargpt_files/accessible-code-block-0.0.1/empty-anchor.js
+++ /dev/null
@@ -1,15 +0,0 @@
-// Hide empty <a> tag within highlighted CodeBlock for screen reader accessibility (see https://github.com/jgm/pandoc/issues/6352#issuecomment-626106786) -->
-// v0.0.1
-// Written by JooYoung Seo (jooyoung@psu.edu) and Atsushi Yasumoto on June 1st, 2020.
-
-document.addEventListener('DOMContentLoaded', function() {
-  const codeList = document.getElementsByClassName("sourceCode");
-  for (var i = 0; i < codeList.length; i++) {
-    var linkList = codeList[i].getElementsByTagName('a');
-    for (var j = 0; j < linkList.length; j++) {
-      if (linkList[j].innerHTML === "") {
-        linkList[j].setAttribute('aria-hidden', 'true');
-      }
-    }
-  }
-});
diff --git a/articles/examples/dogs-vs-cats-binary-classification.html b/articles/examples/dogs-vs-cats-binary-classification.html
index 3e4361be..d4d6932e 100644
--- a/articles/examples/dogs-vs-cats-binary-classification.html
+++ b/articles/examples/dogs-vs-cats-binary-classification.html
@@ -77,7 +77,8 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
 
 
-<script src="dogs-vs-cats-binary-classification_files/accessible-code-block-0.0.1/empty-anchor.js"></script><div class="row">
+
+<div class="row">
   <main id="main" class="col-md-9"><div class="page-header">
       <img src="" class="logo" alt=""><h1>Binary classification</h1>
             
@@ -175,7 +176,7 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
 <div class="pkgdown-footer-right">
   <p></p>
-<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer>
diff --git a/articles/examples/dogs-vs-cats-binary-classification_files/accessible-code-block-0.0.1/empty-anchor.js b/articles/examples/dogs-vs-cats-binary-classification_files/accessible-code-block-0.0.1/empty-anchor.js
deleted file mode 100644
index ca349fd6..00000000
--- a/articles/examples/dogs-vs-cats-binary-classification_files/accessible-code-block-0.0.1/empty-anchor.js
+++ /dev/null
@@ -1,15 +0,0 @@
-// Hide empty <a> tag within highlighted CodeBlock for screen reader accessibility (see https://github.com/jgm/pandoc/issues/6352#issuecomment-626106786) -->
-// v0.0.1
-// Written by JooYoung Seo (jooyoung@psu.edu) and Atsushi Yasumoto on June 1st, 2020.
-
-document.addEventListener('DOMContentLoaded', function() {
-  const codeList = document.getElementsByClassName("sourceCode");
-  for (var i = 0; i < codeList.length; i++) {
-    var linkList = codeList[i].getElementsByTagName('a');
-    for (var j = 0; j < linkList.length; j++) {
-      if (linkList[j].innerHTML === "") {
-        linkList[j].setAttribute('aria-hidden', 'true');
-      }
-    }
-  }
-});
diff --git a/articles/examples/index.html b/articles/examples/index.html
index 4714a06a..980d72f7 100644
--- a/articles/examples/index.html
+++ b/articles/examples/index.html
@@ -77,7 +77,8 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
 
 
-<script src="index_files/accessible-code-block-0.0.1/empty-anchor.js"></script><div class="row">
+
+<div class="row">
   <main id="main" class="col-md-9"><div class="page-header">
       <img src="" class="logo" alt=""><h1>Examples</h1>
             
@@ -88,7 +89,9 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
     
     
-<p>This gallery of examples uses luz to train and validate a range of common deep learning architectures. The gallery also demonstrates basic and advanced usage of luz.</p>
+<p>This gallery of examples uses luz to train and validate a range of
+common deep learning architectures. The gallery also demonstrates basic
+and advanced usage of luz.</p>
 <div class="container">
 <div class="row mt-3">
 <div class="col-6">
@@ -113,9 +116,11 @@ <h5 class="card-title mb-1">
 </h5>
 <span class="badge rounded-pill bg-success mb-1">basic</span>
 <p class="card-text">
-Demonstrates using pre-trained models to build a binary classification model.
+Demonstrates using pre-trained models to build a binary classification
+model.
 </p>
-<a href="dogs-vs-cats-binary-classification.html" class="btn btn-primary">See code</a>
+<a href="dogs-vs-cats-binary-classification.html" class="btn btn-primary">See
+code</a>
 </div>
 </div>
 </div>
@@ -129,7 +134,8 @@ <h5 class="card-title mb-1">
 </h5>
 <span class="badge rounded-pill bg-success mb-1">basic</span>
 <p class="card-text">
-Builds an autoencoder for the MNIST dataset. Demonstrates overwriting the predict method
+Builds an autoencoder for the MNIST dataset. Demonstrates overriding
+the predict method.
 </p>
 <a href="mnist-autoencoder.html" class="btn btn-primary">See code</a>
 </div>
@@ -145,7 +151,8 @@ <h5 class="card-title mb-1">
 <p class="card-text">
 Showcases how to create a custom fully customized training step
 </p>
-<a href="mnist-cnn-virtual-batch-size.html" class="btn btn-primary">See code</a>
+<a href="mnist-cnn-virtual-batch-size.html" class="btn btn-primary">See
+code</a>
 </div>
 </div>
 </div>
@@ -219,7 +226,8 @@ <h5 class="card-title mb-1">
 </h5>
 <span class="badge rounded-pill bg-warning mb-1">intermediate</span>
 <p class="card-text">
-Implements a UNET model to separate the background of images of cats and dogs.
+Implements a UNET model to separate the background of images of cats and
+dogs.
 </p>
 <a href="pets-unet.html" class="btn btn-primary">See code</a>
 </div>
@@ -240,6 +248,23 @@ <h5 class="card-title mb-1">
 </div>
 </div>
 </div>
+<div class="row mt-3">
+<div class="col-6">
+<div class="card">
+<div class="card-body">
+<h5 class="card-title mb-1">
+Training a causal language model from scratch
+</h5>
+<span class="badge rounded-pill bg-danger mb-1">advanced</span>
+<p class="card-text">
+Implements datasets and trains a causal language model from scratch
+using R source code.
+</p>
+<a href="text-generation.html" class="btn btn-primary">See code</a>
+</div>
+</div>
+</div>
+</div>
 </div>
 <!-- <div class="row"> -->
 <!--   <div class="col-sm-6"> -->
@@ -274,7 +299,7 @@ <h5 class="card-title mb-1">
 
 <div class="pkgdown-footer-right">
   <p></p>
-<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer>
diff --git a/articles/examples/index_files/accessible-code-block-0.0.1/empty-anchor.js b/articles/examples/index_files/accessible-code-block-0.0.1/empty-anchor.js
deleted file mode 100644
index ca349fd6..00000000
--- a/articles/examples/index_files/accessible-code-block-0.0.1/empty-anchor.js
+++ /dev/null
@@ -1,15 +0,0 @@
-// Hide empty <a> tag within highlighted CodeBlock for screen reader accessibility (see https://github.com/jgm/pandoc/issues/6352#issuecomment-626106786) -->
-// v0.0.1
-// Written by JooYoung Seo (jooyoung@psu.edu) and Atsushi Yasumoto on June 1st, 2020.
-
-document.addEventListener('DOMContentLoaded', function() {
-  const codeList = document.getElementsByClassName("sourceCode");
-  for (var i = 0; i < codeList.length; i++) {
-    var linkList = codeList[i].getElementsByTagName('a');
-    for (var j = 0; j < linkList.length; j++) {
-      if (linkList[j].innerHTML === "") {
-        linkList[j].setAttribute('aria-hidden', 'true');
-      }
-    }
-  }
-});
diff --git a/articles/examples/mnist-autoencoder.html b/articles/examples/mnist-autoencoder.html
index 4e2ff1e5..8827238a 100644
--- a/articles/examples/mnist-autoencoder.html
+++ b/articles/examples/mnist-autoencoder.html
@@ -77,7 +77,8 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
 
 
-<script src="mnist-autoencoder_files/accessible-code-block-0.0.1/empty-anchor.js"></script><div class="row">
+
+<div class="row">
   <main id="main" class="col-md-9"><div class="page-header">
       <img src="" class="logo" alt=""><h1>Autoencoder</h1>
             
@@ -180,7 +181,7 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
 <div class="pkgdown-footer-right">
   <p></p>
-<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer>
diff --git a/articles/examples/mnist-autoencoder_files/accessible-code-block-0.0.1/empty-anchor.js b/articles/examples/mnist-autoencoder_files/accessible-code-block-0.0.1/empty-anchor.js
deleted file mode 100644
index ca349fd6..00000000
--- a/articles/examples/mnist-autoencoder_files/accessible-code-block-0.0.1/empty-anchor.js
+++ /dev/null
@@ -1,15 +0,0 @@
-// Hide empty <a> tag within highlighted CodeBlock for screen reader accessibility (see https://github.com/jgm/pandoc/issues/6352#issuecomment-626106786) -->
-// v0.0.1
-// Written by JooYoung Seo (jooyoung@psu.edu) and Atsushi Yasumoto on June 1st, 2020.
-
-document.addEventListener('DOMContentLoaded', function() {
-  const codeList = document.getElementsByClassName("sourceCode");
-  for (var i = 0; i < codeList.length; i++) {
-    var linkList = codeList[i].getElementsByTagName('a');
-    for (var j = 0; j < linkList.length; j++) {
-      if (linkList[j].innerHTML === "") {
-        linkList[j].setAttribute('aria-hidden', 'true');
-      }
-    }
-  }
-});
diff --git a/articles/examples/mnist-cnn-virtual-batch-size.html b/articles/examples/mnist-cnn-virtual-batch-size.html
index cca80ae9..874ecdfd 100644
--- a/articles/examples/mnist-cnn-virtual-batch-size.html
+++ b/articles/examples/mnist-cnn-virtual-batch-size.html
@@ -77,7 +77,8 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
 
 
-<script src="mnist-cnn-virtual-batch-size_files/accessible-code-block-0.0.1/empty-anchor.js"></script><div class="row">
+
+<div class="row">
   <main id="main" class="col-md-9"><div class="page-header">
       <img src="" class="logo" alt=""><h1>Virtual batch size</h1>
             
@@ -199,7 +200,7 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
 <div class="pkgdown-footer-right">
   <p></p>
-<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer>
diff --git a/articles/examples/mnist-cnn-virtual-batch-size_files/accessible-code-block-0.0.1/empty-anchor.js b/articles/examples/mnist-cnn-virtual-batch-size_files/accessible-code-block-0.0.1/empty-anchor.js
deleted file mode 100644
index ca349fd6..00000000
--- a/articles/examples/mnist-cnn-virtual-batch-size_files/accessible-code-block-0.0.1/empty-anchor.js
+++ /dev/null
@@ -1,15 +0,0 @@
-// Hide empty <a> tag within highlighted CodeBlock for screen reader accessibility (see https://github.com/jgm/pandoc/issues/6352#issuecomment-626106786) -->
-// v0.0.1
-// Written by JooYoung Seo (jooyoung@psu.edu) and Atsushi Yasumoto on June 1st, 2020.
-
-document.addEventListener('DOMContentLoaded', function() {
-  const codeList = document.getElementsByClassName("sourceCode");
-  for (var i = 0; i < codeList.length; i++) {
-    var linkList = codeList[i].getElementsByTagName('a');
-    for (var j = 0; j < linkList.length; j++) {
-      if (linkList[j].innerHTML === "") {
-        linkList[j].setAttribute('aria-hidden', 'true');
-      }
-    }
-  }
-});
diff --git a/articles/examples/mnist-cnn.html b/articles/examples/mnist-cnn.html
index cbb2b246..addd84ac 100644
--- a/articles/examples/mnist-cnn.html
+++ b/articles/examples/mnist-cnn.html
@@ -77,7 +77,8 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
 
 
-<script src="mnist-cnn_files/accessible-code-block-0.0.1/empty-anchor.js"></script><div class="row">
+
+<div class="row">
   <main id="main" class="col-md-9"><div class="page-header">
       <img src="" class="logo" alt=""><h1>Simple CNN</h1>
             
@@ -177,7 +178,7 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
 <div class="pkgdown-footer-right">
   <p></p>
-<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer>
diff --git a/articles/examples/mnist-cnn_files/accessible-code-block-0.0.1/empty-anchor.js b/articles/examples/mnist-cnn_files/accessible-code-block-0.0.1/empty-anchor.js
deleted file mode 100644
index ca349fd6..00000000
--- a/articles/examples/mnist-cnn_files/accessible-code-block-0.0.1/empty-anchor.js
+++ /dev/null
@@ -1,15 +0,0 @@
-// Hide empty <a> tag within highlighted CodeBlock for screen reader accessibility (see https://github.com/jgm/pandoc/issues/6352#issuecomment-626106786) -->
-// v0.0.1
-// Written by JooYoung Seo (jooyoung@psu.edu) and Atsushi Yasumoto on June 1st, 2020.
-
-document.addEventListener('DOMContentLoaded', function() {
-  const codeList = document.getElementsByClassName("sourceCode");
-  for (var i = 0; i < codeList.length; i++) {
-    var linkList = codeList[i].getElementsByTagName('a');
-    for (var j = 0; j < linkList.length; j++) {
-      if (linkList[j].innerHTML === "") {
-        linkList[j].setAttribute('aria-hidden', 'true');
-      }
-    }
-  }
-});
diff --git a/articles/examples/mnist-dcgan.html b/articles/examples/mnist-dcgan.html
index 5343ee76..02b9214e 100644
--- a/articles/examples/mnist-dcgan.html
+++ b/articles/examples/mnist-dcgan.html
@@ -77,7 +77,8 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
 
 
-<script src="mnist-dcgan_files/accessible-code-block-0.0.1/empty-anchor.js"></script><div class="row">
+
+<div class="row">
   <main id="main" class="col-md-9"><div class="page-header">
       <img src="" class="logo" alt=""><h1>DCGAN</h1>
             
@@ -266,7 +267,7 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
 <div class="pkgdown-footer-right">
   <p></p>
-<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer>
diff --git a/articles/examples/mnist-dcgan_files/accessible-code-block-0.0.1/empty-anchor.js b/articles/examples/mnist-dcgan_files/accessible-code-block-0.0.1/empty-anchor.js
deleted file mode 100644
index ca349fd6..00000000
--- a/articles/examples/mnist-dcgan_files/accessible-code-block-0.0.1/empty-anchor.js
+++ /dev/null
@@ -1,15 +0,0 @@
-// Hide empty <a> tag within highlighted CodeBlock for screen reader accessibility (see https://github.com/jgm/pandoc/issues/6352#issuecomment-626106786) -->
-// v0.0.1
-// Written by JooYoung Seo (jooyoung@psu.edu) and Atsushi Yasumoto on June 1st, 2020.
-
-document.addEventListener('DOMContentLoaded', function() {
-  const codeList = document.getElementsByClassName("sourceCode");
-  for (var i = 0; i < codeList.length; i++) {
-    var linkList = codeList[i].getElementsByTagName('a');
-    for (var j = 0; j < linkList.length; j++) {
-      if (linkList[j].innerHTML === "") {
-        linkList[j].setAttribute('aria-hidden', 'true');
-      }
-    }
-  }
-});
diff --git a/articles/examples/mnist-mixup.html b/articles/examples/mnist-mixup.html
index 31937768..d9510598 100644
--- a/articles/examples/mnist-mixup.html
+++ b/articles/examples/mnist-mixup.html
@@ -77,7 +77,8 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
 
 
-<script src="mnist-mixup_files/accessible-code-block-0.0.1/empty-anchor.js"></script><div class="row">
+
+<div class="row">
   <main id="main" class="col-md-9"><div class="page-header">
       <img src="" class="logo" alt=""><h1>MixUp augmentation</h1>
             
@@ -187,7 +188,7 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
 <div class="pkgdown-footer-right">
   <p></p>
-<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer>
diff --git a/articles/examples/mnist-mixup_files/accessible-code-block-0.0.1/empty-anchor.js b/articles/examples/mnist-mixup_files/accessible-code-block-0.0.1/empty-anchor.js
deleted file mode 100644
index ca349fd6..00000000
--- a/articles/examples/mnist-mixup_files/accessible-code-block-0.0.1/empty-anchor.js
+++ /dev/null
@@ -1,15 +0,0 @@
-// Hide empty <a> tag within highlighted CodeBlock for screen reader accessibility (see https://github.com/jgm/pandoc/issues/6352#issuecomment-626106786) -->
-// v0.0.1
-// Written by JooYoung Seo (jooyoung@psu.edu) and Atsushi Yasumoto on June 1st, 2020.
-
-document.addEventListener('DOMContentLoaded', function() {
-  const codeList = document.getElementsByClassName("sourceCode");
-  for (var i = 0; i < codeList.length; i++) {
-    var linkList = codeList[i].getElementsByTagName('a');
-    for (var j = 0; j < linkList.length; j++) {
-      if (linkList[j].innerHTML === "") {
-        linkList[j].setAttribute('aria-hidden', 'true');
-      }
-    }
-  }
-});
diff --git a/articles/examples/mnist-triplet.html b/articles/examples/mnist-triplet.html
index 5ae47134..fdb036ab 100644
--- a/articles/examples/mnist-triplet.html
+++ b/articles/examples/mnist-triplet.html
@@ -77,7 +77,8 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
 
 
-<script src="mnist-triplet_files/accessible-code-block-0.0.1/empty-anchor.js"></script><div class="row">
+
+<div class="row">
   <main id="main" class="col-md-9"><div class="page-header">
       <img src="" class="logo" alt=""><h1>Triplet loss</h1>
             
@@ -196,7 +197,7 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
 <div class="pkgdown-footer-right">
   <p></p>
-<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer>
diff --git a/articles/examples/mnist-triplet_files/accessible-code-block-0.0.1/empty-anchor.js b/articles/examples/mnist-triplet_files/accessible-code-block-0.0.1/empty-anchor.js
deleted file mode 100644
index ca349fd6..00000000
--- a/articles/examples/mnist-triplet_files/accessible-code-block-0.0.1/empty-anchor.js
+++ /dev/null
@@ -1,15 +0,0 @@
-// Hide empty <a> tag within highlighted CodeBlock for screen reader accessibility (see https://github.com/jgm/pandoc/issues/6352#issuecomment-626106786) -->
-// v0.0.1
-// Written by JooYoung Seo (jooyoung@psu.edu) and Atsushi Yasumoto on June 1st, 2020.
-
-document.addEventListener('DOMContentLoaded', function() {
-  const codeList = document.getElementsByClassName("sourceCode");
-  for (var i = 0; i < codeList.length; i++) {
-    var linkList = codeList[i].getElementsByTagName('a');
-    for (var j = 0; j < linkList.length; j++) {
-      if (linkList[j].innerHTML === "") {
-        linkList[j].setAttribute('aria-hidden', 'true');
-      }
-    }
-  }
-});
diff --git a/articles/examples/pets-unet.html b/articles/examples/pets-unet.html
index 6543772e..6ab3a0be 100644
--- a/articles/examples/pets-unet.html
+++ b/articles/examples/pets-unet.html
@@ -77,7 +77,8 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
 
 
-<script src="pets-unet_files/accessible-code-block-0.0.1/empty-anchor.js"></script><div class="row">
+
+<div class="row">
   <main id="main" class="col-md-9"><div class="page-header">
       <img src="" class="logo" alt=""><h1>UNET implementation</h1>
             
@@ -309,7 +310,7 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
 <div class="pkgdown-footer-right">
   <p></p>
-<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer>
diff --git a/articles/examples/pets-unet_files/accessible-code-block-0.0.1/empty-anchor.js b/articles/examples/pets-unet_files/accessible-code-block-0.0.1/empty-anchor.js
deleted file mode 100644
index ca349fd6..00000000
--- a/articles/examples/pets-unet_files/accessible-code-block-0.0.1/empty-anchor.js
+++ /dev/null
@@ -1,15 +0,0 @@
-// Hide empty <a> tag within highlighted CodeBlock for screen reader accessibility (see https://github.com/jgm/pandoc/issues/6352#issuecomment-626106786) -->
-// v0.0.1
-// Written by JooYoung Seo (jooyoung@psu.edu) and Atsushi Yasumoto on June 1st, 2020.
-
-document.addEventListener('DOMContentLoaded', function() {
-  const codeList = document.getElementsByClassName("sourceCode");
-  for (var i = 0; i < codeList.length; i++) {
-    var linkList = codeList[i].getElementsByTagName('a');
-    for (var j = 0; j < linkList.length; j++) {
-      if (linkList[j].innerHTML === "") {
-        linkList[j].setAttribute('aria-hidden', 'true');
-      }
-    }
-  }
-});
diff --git a/articles/examples/text-classification.html b/articles/examples/text-classification.html
index f48170e1..f6437dee 100644
--- a/articles/examples/text-classification.html
+++ b/articles/examples/text-classification.html
@@ -77,7 +77,8 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
 
 
-<script src="text-classification_files/accessible-code-block-0.0.1/empty-anchor.js"></script><div class="row">
+
+<div class="row">
   <main id="main" class="col-md-9"><div class="page-header">
       <img src="" class="logo" alt=""><h1>Text classification from scratch</h1>
             
@@ -88,14 +89,21 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
     
     
-<p>This example is a port of <a href="https://keras.io/examples/nlp/text_classification_from_scratch/" class="external-link">‘Text classification from scratch’</a> from Keras documentation by Mark Omerick and François Chollet.</p>
-<p>First we implement a torch dataset that downloads and pre-process the data. The initialize method is called when we instantiate a dataset. Our implementation:</p>
+<p>This example is a port of <a href="https://keras.io/examples/nlp/text_classification_from_scratch/" class="external-link">‘Text
+classification from scratch’</a> from Keras documentation by Mark
+Omerick and François Chollet.</p>
+<p>First we implement a torch dataset that downloads and pre-processes the
+data. The initialize method is called when we instantiate a dataset. Our
+implementation:</p>
 <ul>
-<li>Downloads the IMDB dataset if it doesn’t exist in the <code>root</code> directory.</li>
+<li>Downloads the IMDB dataset if it doesn’t exist in the
+<code>root</code> directory.</li>
 <li>Extracts the files into <code>root</code>.</li>
 <li>Creates a tokenizer using the files in the training set.</li>
 </ul>
-<p>We also implement the <code>.getitem</code> method that is used to extract a single element from the dataset and pre-process the file contents.</p>
+<p>We also implement the <code>.getitem</code> method that is used to
+extract a single element from the dataset and pre-process the file
+contents.</p>
 <div class="sourceCode" id="cb1"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="kw"><a href="https://rdrr.io/r/base/library.html" class="external-link">library</a></span><span class="op">(</span><span class="va"><a href="https://torch.mlverse.org/docs" class="external-link">torch</a></span><span class="op">)</span></span>
 <span><span class="kw"><a href="https://rdrr.io/r/base/library.html" class="external-link">library</a></span><span class="op">(</span><span class="va"><a href="https://github.com/mlverse/tok" class="external-link">tok</a></span><span class="op">)</span></span>
@@ -174,7 +182,9 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 <span></span>
 <span><span class="va">train_ds</span> <span class="op">&lt;-</span> <span class="fu">imdb_dataset</span><span class="op">(</span><span class="va">output_length</span>, <span class="va">vocab_size</span>,  <span class="st">"./imdb"</span>, split <span class="op">=</span> <span class="st">"train"</span><span class="op">)</span></span>
 <span><span class="va">test_ds</span> <span class="op">&lt;-</span> <span class="fu">imdb_dataset</span><span class="op">(</span><span class="va">output_length</span>, <span class="va">vocab_size</span>,  <span class="st">"./imdb"</span>, split <span class="op">=</span> <span class="st">"test"</span><span class="op">)</span></span></code></pre></div>
-<p>We now define the model we want to train. The model is a 1D convnet starting with an embedding layer and we plug a classifier at the output.</p>
+<p>We now define the model we want to train. The model is a 1D convnet
+starting with an embedding layer and we plug a classifier at the
+output.</p>
 <div class="sourceCode" id="cb2"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">model</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/pkg/torch/man/nn_module.html" class="external-link">nn_module</a></span><span class="op">(</span></span>
 <span>  initialize <span class="op">=</span> <span class="kw">function</span><span class="op">(</span><span class="va">vocab_size</span>, <span class="va">embedding_dim</span><span class="op">)</span> <span class="op">{</span></span>
@@ -226,7 +236,8 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 <p>We can finally obtain the metrics on the test dataset:</p>
 <div class="sourceCode" id="cb4"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">fitted_model</span> <span class="op"><a href="../../reference/pipe.html">%&gt;%</a></span> <span class="fu"><a href="../../reference/evaluate.html">evaluate</a></span><span class="op">(</span><span class="va">test_ds</span><span class="op">)</span></span></code></pre></div>
-<p>Remember that in order to predict for texts, we need make the same pre-processing as used in the dataset definition.</p>
+<p>Remember that in order to predict for texts, we need to apply the same
+pre-processing as used in the dataset definition.</p>
   </main>
 </div>
 
@@ -239,7 +250,7 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
 <div class="pkgdown-footer-right">
   <p></p>
-<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer>
diff --git a/articles/examples/text-classification_files/accessible-code-block-0.0.1/empty-anchor.js b/articles/examples/text-classification_files/accessible-code-block-0.0.1/empty-anchor.js
deleted file mode 100644
index ca349fd6..00000000
--- a/articles/examples/text-classification_files/accessible-code-block-0.0.1/empty-anchor.js
+++ /dev/null
@@ -1,15 +0,0 @@
-// Hide empty <a> tag within highlighted CodeBlock for screen reader accessibility (see https://github.com/jgm/pandoc/issues/6352#issuecomment-626106786) -->
-// v0.0.1
-// Written by JooYoung Seo (jooyoung@psu.edu) and Atsushi Yasumoto on June 1st, 2020.
-
-document.addEventListener('DOMContentLoaded', function() {
-  const codeList = document.getElementsByClassName("sourceCode");
-  for (var i = 0; i < codeList.length; i++) {
-    var linkList = codeList[i].getElementsByTagName('a');
-    for (var j = 0; j < linkList.length; j++) {
-      if (linkList[j].innerHTML === "") {
-        linkList[j].setAttribute('aria-hidden', 'true');
-      }
-    }
-  }
-});
diff --git a/articles/examples/text-generation.html b/articles/examples/text-generation.html
new file mode 100644
index 00000000..4558aaf8
--- /dev/null
+++ b/articles/examples/text-generation.html
@@ -0,0 +1,393 @@
+<!DOCTYPE html>
+<!-- Generated by pkgdown: do not edit by hand --><html lang="en">
+<head>
+<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
+<meta charset="utf-8">
+<meta http-equiv="X-UA-Compatible" content="IE=edge">
+<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
+<meta name="description" content="luz">
+<title>Training a causal language model from scratch • luz</title>
+<script src="../../deps/jquery-3.6.0/jquery-3.6.0.min.js"></script><meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
+<link href="../../deps/bootstrap-5.2.2/bootstrap.min.css" rel="stylesheet">
+<script src="../../deps/bootstrap-5.2.2/bootstrap.bundle.min.js"></script><!-- Font Awesome icons --><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/all.min.css" integrity="sha256-mmgLkCYLUQbXn0B1SRqzHar6dCnv9oZFPEC1g1cwlkk=" crossorigin="anonymous">
+<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/v4-shims.min.css" integrity="sha256-wZjR52fzng1pJHwx4aV2AO3yyTOXrcDW7jBpJtTwVxw=" crossorigin="anonymous">
+<!-- bootstrap-toc --><script src="https://cdn.jsdelivr.net/gh/afeld/bootstrap-toc@v1.0.1/dist/bootstrap-toc.min.js" integrity="sha256-4veVQbu7//Lk5TSmc7YV48MxtMy98e26cf5MrgZYnwo=" crossorigin="anonymous"></script><!-- headroom.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/headroom.min.js" integrity="sha256-AsUX4SJE1+yuDu5+mAVzJbuYNPHj/WroHuZ8Ir/CkE0=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/jQuery.headroom.min.js" integrity="sha256-ZX/yNShbjqsohH1k95liqY9Gd8uOiE1S4vZc+9KQ1K4=" crossorigin="anonymous"></script><!-- clipboard.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/2.0.6/clipboard.min.js" integrity="sha256-inc5kl9MA1hkeYUt+EC3BhlIgyp/2jDIyBLS6k3UxPI=" crossorigin="anonymous"></script><!-- search --><script src="https://cdnjs.cloudflare.com/ajax/libs/fuse.js/6.4.6/fuse.js" integrity="sha512-zv6Ywkjyktsohkbp9bb45V6tEMoWhzFzXis+LrMehmJZZSys19Yxf1dopHx7WzIKxr5tK2dVcYmaCk2uqdjF4A==" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/autocomplete.js/0.38.0/autocomplete.jquery.min.js" integrity="sha512-GU9ayf+66Xx2TmpxqJpliWbT5PiGYxpaG8rfnBEk1LL8l1KGkRShhngwdXK1UgqhAzWpZHSiYPc09/NwDQIGyg==" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/mark.js/8.11.1/mark.min.js" integrity="sha512-5CYOlHXGh6QpOFA/TeTylKLWfB3ftPsde7AnmhuitiTX4K5SqCLBeKro6sPS8ilsz1Q4NRx3v8Ko2IBiszzdww==" crossorigin="anonymous"></script><!-- pkgdown --><script src="../../pkgdown.js"></script><meta property="og:title" content="Training a causal language model from scratch">
+<meta property="og:description" content="luz">
+<!-- mathjax --><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js" integrity="sha256-nvJJv9wWKEm88qvoQl9ekL2J+k/RWIsaSScxxlsrv8k=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/config/TeX-AMS-MML_HTMLorMML.js" integrity="sha256-84DKXVJXs0/F8OTMzX4UR909+jtl4G7SPypPavF+GfA=" crossorigin="anonymous"></script><!--[if lt IE 9]>
+<script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
+<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
+<![endif]-->
+</head>
+<body>
+    <a href="#main" class="visually-hidden-focusable">Skip to contents</a>
+    
+
+    <nav class="navbar fixed-top navbar-light navbar-expand-lg bg-light"><div class="container">
+    
+    <a class="navbar-brand me-2" href="../../index.html">luz</a>
+
+    <small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="">0.4.0.9000</small>
+
+    
+    <button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbar" aria-controls="navbar" aria-expanded="false" aria-label="Toggle navigation">
+      <span class="navbar-toggler-icon"></span>
+    </button>
+
+    <div id="navbar" class="collapse navbar-collapse ms-3">
+      <ul class="navbar-nav me-auto">
+<li class="nav-item dropdown">
+  <a href="#" class="nav-link dropdown-toggle" data-bs-toggle="dropdown" role="button" aria-expanded="false" aria-haspopup="true" id="dropdown-articles">Articles</a>
+  <div class="dropdown-menu" aria-labelledby="dropdown-articles">
+    <h6 class="dropdown-header" data-toc-skip>Using luz</h6>
+    <a class="dropdown-item" href="../../articles/get-started.html">Get started</a>
+    <a class="dropdown-item" href="../../articles/custom-loop.html">Custom loops</a>
+    <a class="dropdown-item" href="../../articles/accelerator.html">Accelerator API</a>
+    <h6 class="dropdown-header" data-toc-skip>Guides</h6>
+    <a class="dropdown-item" href="../../articles/lr-finder.html">Using the lr_finder</a>
+    <a class="dropdown-item" href="../../articles/checkpoints.html">Checkpoints models</a>
+  </div>
+</li>
+<li class="active nav-item">
+  <a class="nav-link" href="../../articles/examples/index.html">Examples</a>
+</li>
+<li class="nav-item">
+  <a class="nav-link" href="../../reference/index.html">Reference</a>
+</li>
+<li class="nav-item">
+  <a class="nav-link" href="../../news/index.html">Changelog</a>
+</li>
+      </ul>
+<form class="form-inline my-2 my-lg-0" role="search">
+        <input type="search" class="form-control me-sm-2" aria-label="Toggle navigation" name="search-input" data-search-index="../../search.json" id="search-input" placeholder="Search for" autocomplete="off">
+</form>
+
+      <ul class="navbar-nav">
+<li class="nav-item">
+  <a class="external-link nav-link" href="https://github.com/mlverse/luz/" aria-label="github">
+    <span class="fab fa fab fa-github fa-lg"></span>
+     
+  </a>
+</li>
+      </ul>
+</div>
+
+    
+  </div>
+</nav><div class="container template-article">
+
+
+
+
+<div class="row">
+  <main id="main" class="col-md-9"><div class="page-header">
+      <img src="" class="logo" alt=""><h1>Training a causal language model from scratch</h1>
+            
+      
+      <small class="dont-index">Source: <a href="https://github.com/mlverse/luz/blob/HEAD/vignettes/examples/text-generation.Rmd" class="external-link"><code>vignettes/examples/text-generation.Rmd</code></a></small>
+      <div class="d-none name"><code>text-generation.Rmd</code></div>
+    </div>
+
+    
+    
+<p>This example is an adaptation of the ‘Training a causal language
+model from scratch’ class from the <a href="https://huggingface.co/learn/nlp-course/chapter7/6?fw=pt" class="external-link">Hugging
+Face NLP course</a>.</p>
+<div class="sourceCode" id="cb1"><pre class="downlit sourceCode r">
+<code class="sourceCode R"><span><span class="kw"><a href="https://rdrr.io/r/base/library.html" class="external-link">library</a></span><span class="op">(</span><span class="va"><a href="https://torch.mlverse.org/docs" class="external-link">torch</a></span><span class="op">)</span></span>
+<span><span class="kw"><a href="https://rdrr.io/r/base/library.html" class="external-link">library</a></span><span class="op">(</span><span class="va"><a href="https://github.com/mlverse/tok" class="external-link">tok</a></span><span class="op">)</span></span>
+<span><span class="kw"><a href="https://rdrr.io/r/base/library.html" class="external-link">library</a></span><span class="op">(</span><span class="va"><a href="https://mlverse.github.io/luz/" class="external-link">luz</a></span><span class="op">)</span></span>
+<span><span class="kw"><a href="https://rdrr.io/r/base/library.html" class="external-link">library</a></span><span class="op">(</span><span class="va">minhub</span><span class="op">)</span> <span class="co"># remotes::install_github("mlverse/minhub")</span></span>
+<span><span class="co">#library(tidyverse)</span></span>
+<span><span class="fu"><a href="https://rdrr.io/r/base/options.html" class="external-link">options</a></span><span class="op">(</span>arrow.skip_nul <span class="op">=</span> <span class="cn">TRUE</span><span class="op">)</span></span>
+<span><span class="kw"><a href="https://rdrr.io/r/base/library.html" class="external-link">library</a></span><span class="op">(</span><span class="va"><a href="https://github.com/apache/arrow/" class="external-link">arrow</a></span><span class="op">)</span></span></code></pre></div>
+<div class="section level2">
+<h2 id="data">Data<a class="anchor" aria-label="anchor" href="#data"></a>
+</h2>
+<p>The first step is to implement a torch dataset that gathers the data and
+pre-processes it into a format that is suitable for training the
+model.</p>
+<p>That means that we need to:</p>
+<ol style="list-style-type: decimal">
+<li>Download data</li>
+<li>Train a tokenizer for this dataset</li>
+<li>Be able to produce sequences of tokens in the format expected by the
+model</li>
+</ol>
+<p>We are going to use two datasets available on the Hugging Face Hub. The
+first contains the source code of all R packages available on CRAN. The second
+contains all R code that is available in GitHub data dumps. Both
+datasets are in the Parquet format. Below we implement a function
+that downloads and caches the data and then returns a single data frame
+containing all data.</p>
+<div class="sourceCode" id="cb2"><pre class="downlit sourceCode r">
+<code class="sourceCode R"><span><span class="va">read_dataset</span> <span class="op">&lt;-</span> <span class="kw">function</span><span class="op">(</span><span class="va">source</span><span class="op">)</span> <span class="op">{</span></span>
+<span>  <span class="va">d</span> <span class="op">&lt;-</span> <span class="va">source</span> <span class="op">|&gt;</span></span>
+<span>    <span class="fu">hfhub</span><span class="fu">::</span><span class="fu">hub_snapshot</span><span class="op">(</span>repo_type <span class="op">=</span> <span class="st">"dataset"</span>, allow_patterns <span class="op">=</span> <span class="st">"parquet$"</span><span class="op">)</span> <span class="op">|&gt;</span></span>
+<span>    <span class="fu">fs</span><span class="fu">::</span><span class="fu"><a href="https://fs.r-lib.org/reference/path.html" class="external-link">path</a></span><span class="op">(</span><span class="st">"data/r"</span><span class="op">)</span> <span class="op">|&gt;</span></span>
+<span>    <span class="fu">arrow</span><span class="fu">::</span><span class="fu">open_dataset</span><span class="op">(</span><span class="op">)</span> <span class="op">|&gt;</span></span>
+<span>    <span class="fu">dplyr</span><span class="fu">::</span><span class="fu"><a href="https://dplyr.tidyverse.org/reference/filter.html" class="external-link">filter</a></span><span class="op">(</span><span class="fu">stringr</span><span class="fu">::</span><span class="fu"><a href="https://stringr.tidyverse.org/reference/str_detect.html" class="external-link">str_detect</a></span><span class="op">(</span><span class="va">path</span>, <span class="st">".*\\.[rR]$"</span><span class="op">)</span><span class="op">)</span> <span class="op">|&gt;</span></span>
+<span>    <span class="fu">dplyr</span><span class="fu">::</span><span class="fu"><a href="https://dplyr.tidyverse.org/reference/select.html" class="external-link">select</a></span><span class="op">(</span><span class="va">content</span><span class="op">)</span> <span class="op">|&gt;</span></span>
+<span>    <span class="fu">dplyr</span><span class="fu">::</span><span class="fu"><a href="https://dplyr.tidyverse.org/reference/mutate.html" class="external-link">mutate</a></span><span class="op">(</span>content <span class="op">=</span> <span class="fu">arrow</span><span class="fu">::</span><span class="fu">cast</span><span class="op">(</span><span class="va">content</span>, <span class="fu">arrow</span><span class="fu">::</span><span class="fu">string</span><span class="op">(</span><span class="op">)</span><span class="op">)</span><span class="op">)</span> <span class="op">|&gt;</span></span>
+<span>    <span class="fu">dplyr</span><span class="fu">::</span><span class="fu"><a href="https://dplyr.tidyverse.org/reference/filter.html" class="external-link">filter</a></span><span class="op">(</span><span class="op">!</span><span class="fu"><a href="https://rdrr.io/r/base/NA.html" class="external-link">is.na</a></span><span class="op">(</span><span class="va">content</span><span class="op">)</span><span class="op">)</span> <span class="op">|&gt;</span></span>
+<span>    <span class="fu">dplyr</span><span class="fu">::</span><span class="fu"><a href="https://dplyr.tidyverse.org/reference/compute.html" class="external-link">collect</a></span><span class="op">(</span><span class="op">)</span> <span class="op"><a href="../../reference/pipe.html">%&gt;%</a></span></span>
+<span>    <span class="co"># the dataset contains invalid utf8 characters...</span></span>
+<span>    <span class="co"># we need to remove them, otherwise we get an error from tokenizers</span></span>
+<span>    <span class="fu">dplyr</span><span class="fu">::</span><span class="fu"><a href="https://dplyr.tidyverse.org/reference/filter.html" class="external-link">filter</a></span><span class="op">(</span><span class="fu">utf8</span><span class="fu">::</span><span class="fu"><a href="https://ptrckprry.com/r-utf8/reference/as_utf8.html" class="external-link">utf8_valid</a></span><span class="op">(</span><span class="va">content</span><span class="op">)</span><span class="op">)</span></span>
+<span><span class="op">}</span></span>
+<span></span>
+<span><span class="va">read_datasets</span> <span class="op">&lt;-</span> <span class="kw">function</span><span class="op">(</span><span class="op">)</span> <span class="op">{</span></span>
+<span>  <span class="fu">dplyr</span><span class="fu">::</span><span class="fu"><a href="https://dplyr.tidyverse.org/reference/bind_rows.html" class="external-link">bind_rows</a></span><span class="op">(</span></span>
+<span>    <span class="fu">read_dataset</span><span class="op">(</span><span class="st">"dfalbel/cran-packages"</span><span class="op">)</span>,</span>
+<span>    <span class="fu">read_dataset</span><span class="op">(</span><span class="st">"dfalbel/github-r-repos"</span><span class="op">)</span></span>
+<span>  <span class="op">)</span></span>
+<span><span class="op">}</span></span></code></pre></div>
+<p>Next we implement a function that trains a tokenizer for our
+dataset.</p>
+<div class="sourceCode" id="cb3"><pre class="downlit sourceCode r">
+<code class="sourceCode R"><span><span class="va">create_tokenizer</span> <span class="op">&lt;-</span> <span class="kw">function</span><span class="op">(</span><span class="va">text</span>, <span class="va">vocab_size</span>, <span class="va">special_tokens</span><span class="op">)</span> <span class="op">{</span></span>
+<span>  <span class="va">tok</span> <span class="op">&lt;-</span> <span class="fu">tok</span><span class="fu">::</span><span class="va">tokenizer</span><span class="op">$</span><span class="fu">new</span><span class="op">(</span><span class="fu">tok</span><span class="fu">::</span><span class="va">model_bpe</span><span class="op">$</span><span class="fu">new</span><span class="op">(</span><span class="op">)</span><span class="op">)</span></span>
+<span></span>
+<span>  <span class="va">tok</span><span class="op">$</span><span class="va">pre_tokenizer</span> <span class="op">&lt;-</span> <span class="fu">tok</span><span class="fu">::</span><span class="va">pre_tokenizer_byte_level</span><span class="op">$</span><span class="fu">new</span><span class="op">(</span>add_prefix_space <span class="op">=</span> <span class="cn">FALSE</span><span class="op">)</span></span>
+<span>  <span class="va">tok</span><span class="op">$</span><span class="va">decoder</span> <span class="op">&lt;-</span> <span class="fu">tok</span><span class="fu">::</span><span class="va">decoder_byte_level</span><span class="op">$</span><span class="fu">new</span><span class="op">(</span><span class="op">)</span></span>
+<span>  <span class="va">tok</span><span class="op">$</span><span class="va">post_processor</span> <span class="op">&lt;-</span> <span class="fu">tok</span><span class="fu">::</span><span class="va">processor_byte_level</span><span class="op">$</span><span class="fu">new</span><span class="op">(</span>trim_offsets <span class="op">=</span> <span class="cn">FALSE</span><span class="op">)</span></span>
+<span></span>
+<span>  <span class="va">tok</span><span class="op">$</span><span class="fu">train_from_memory</span><span class="op">(</span></span>
+<span>    <span class="va">text</span>,</span>
+<span>    <span class="fu">tok</span><span class="fu">::</span><span class="va">trainer_bpe</span><span class="op">$</span><span class="fu">new</span><span class="op">(</span>vocab_size <span class="op">=</span> <span class="va">vocab_size</span>, special_tokens <span class="op">=</span> <span class="va">special_tokens</span><span class="op">)</span></span>
+<span>  <span class="op">)</span></span>
+<span>  <span class="va">tok</span></span>
+<span><span class="op">}</span></span>
+<span></span>
+<span><span class="co"># test code to debug the tokenizer</span></span>
+<span><span class="co"># data &lt;- read_datasets()</span></span>
+<span><span class="co"># tok &lt;- create_tokenizer(data$content, 20000, c("&lt;fbegin&gt;", "&lt;fend&gt;"))</span></span></code></pre></div>
+<p>We can finally implement the torch dataset that we are going to use
+for training the model. We are going to use
+<code><a href="https://rdrr.io/pkg/torch/man/iterable_dataset.html" class="external-link">torch::iterable_dataset</a></code> instead of
+<code><a href="https://rdrr.io/pkg/torch/man/dataset.html" class="external-link">torch::dataset</a></code>. The main motivation is that we can’t
+know the total number of samples in the dataset in advance, so we can’t implement a
+<code>.getitem()</code> method that returns an arbitrary sample. Instead, we
+implement the <code>.iter</code> method, which returns a function that yields a new sample every
+time it’s called.</p>
+<div class="sourceCode" id="cb4"><pre class="downlit sourceCode r">
+<code class="sourceCode R"><span><span class="va">r_sources_dataset</span> <span class="op">&lt;-</span> <span class="fu">torch</span><span class="fu">::</span><span class="fu"><a href="https://rdrr.io/pkg/torch/man/iterable_dataset.html" class="external-link">iterable_dataset</a></span><span class="op">(</span></span>
+<span>  <span class="st">"r_sources_dataset"</span>,</span>
+<span>  initialize <span class="op">=</span> <span class="kw">function</span><span class="op">(</span><span class="va">root</span> <span class="op">=</span> <span class="st">"."</span>, <span class="va">vocab_size</span> <span class="op">=</span> <span class="fl">20000</span>, <span class="va">context_length</span> <span class="op">=</span> <span class="fl">128</span><span class="op">)</span> <span class="op">{</span></span>
+<span>    <span class="va">self</span><span class="op">$</span><span class="va">data</span> <span class="op">&lt;-</span> <span class="fu">read_datasets</span><span class="op">(</span><span class="op">)</span></span>
+<span>    <span class="va">self</span><span class="op">$</span><span class="va">context_length</span> <span class="op">&lt;-</span> <span class="va">context_length</span></span>
+<span>    <span class="va">self</span><span class="op">$</span><span class="va">index</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/sample.html" class="external-link">sample.int</a></span><span class="op">(</span><span class="fu"><a href="https://rdrr.io/r/base/nrow.html" class="external-link">nrow</a></span><span class="op">(</span><span class="va">self</span><span class="op">$</span><span class="va">data</span><span class="op">)</span><span class="op">)</span></span>
+<span></span>
+<span>    <span class="co"># we only create a tokenizer if it doesn't exist, otherwise we just load it</span></span>
+<span>    <span class="va">tok_path</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/file.path.html" class="external-link">file.path</a></span><span class="op">(</span><span class="va">root</span>, <span class="fu">glue</span><span class="fu">::</span><span class="fu"><a href="https://glue.tidyverse.org/reference/glue.html" class="external-link">glue</a></span><span class="op">(</span><span class="st">"tokenizer-{vocab_size}.json"</span><span class="op">)</span><span class="op">)</span></span>
+<span>    <span class="kw">if</span> <span class="op">(</span><span class="op">!</span><span class="fu"><a href="https://rdrr.io/r/base/files.html" class="external-link">file.exists</a></span><span class="op">(</span><span class="va">tok_path</span><span class="op">)</span><span class="op">)</span> <span class="op">{</span></span>
+<span>      <span class="va">self</span><span class="op">$</span><span class="va">tok</span> <span class="op">&lt;-</span> <span class="fu">create_tokenizer</span><span class="op">(</span></span>
+<span>        <span class="fu"><a href="https://rdrr.io/r/base/character.html" class="external-link">as.character</a></span><span class="op">(</span><span class="va">self</span><span class="op">$</span><span class="va">data</span><span class="op">$</span><span class="va">content</span><span class="op">)</span>,</span>
+<span>        <span class="va">vocab_size</span>,</span>
+<span>        <span class="fu"><a href="https://rdrr.io/r/base/c.html" class="external-link">c</a></span><span class="op">(</span><span class="st">"&lt;fbegin&gt;"</span>, <span class="st">"&lt;fend&gt;"</span><span class="op">)</span></span>
+<span>      <span class="op">)</span></span>
+<span>      <span class="fu">fs</span><span class="fu">::</span><span class="fu"><a href="https://fs.r-lib.org/reference/create.html" class="external-link">dir_create</a></span><span class="op">(</span><span class="va">root</span><span class="op">)</span></span>
+<span>      <span class="va">self</span><span class="op">$</span><span class="va">tok</span><span class="op">$</span><span class="fu">save</span><span class="op">(</span><span class="va">tok_path</span><span class="op">)</span></span>
+<span>    <span class="op">}</span> <span class="kw">else</span> <span class="op">{</span></span>
+<span>      <span class="va">self</span><span class="op">$</span><span class="va">tok</span> <span class="op">&lt;-</span> <span class="fu">tok</span><span class="fu">::</span><span class="va">tokenizer</span><span class="op">$</span><span class="fu">from_file</span><span class="op">(</span><span class="va">tok_path</span><span class="op">)</span></span>
+<span>    <span class="op">}</span></span>
+<span>  <span class="op">}</span>,</span>
+<span>  .iter <span class="op">=</span> <span class="kw">function</span><span class="op">(</span><span class="op">)</span> <span class="op">{</span></span>
+<span>    <span class="va">i</span> <span class="op">&lt;-</span> <span class="fl">1L</span></span>
+<span>    <span class="va">sequence</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/c.html" class="external-link">c</a></span><span class="op">(</span><span class="op">)</span></span>
+<span>    <span class="kw">function</span><span class="op">(</span><span class="op">)</span> <span class="op">{</span></span>
+<span>      <span class="kw">while</span> <span class="op">(</span><span class="fu"><a href="https://rdrr.io/r/base/length.html" class="external-link">length</a></span><span class="op">(</span><span class="va">sequence</span><span class="op">)</span> <span class="op">&lt;</span> <span class="op">(</span><span class="va">self</span><span class="op">$</span><span class="va">context_length</span> <span class="op">+</span> <span class="fl">1</span><span class="op">)</span> <span class="op">&amp;&amp;</span> <span class="va">i</span> <span class="op">&lt;=</span> <span class="fu"><a href="https://rdrr.io/r/base/nrow.html" class="external-link">nrow</a></span><span class="op">(</span><span class="va">self</span><span class="op">$</span><span class="va">data</span><span class="op">)</span><span class="op">)</span> <span class="op">{</span></span>
+<span>        <span class="va">sequence</span> <span class="op">&lt;&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/c.html" class="external-link">c</a></span><span class="op">(</span></span>
+<span>          <span class="va">sequence</span>,</span>
+<span>          <span class="va">self</span><span class="op">$</span><span class="va">tok</span><span class="op">$</span><span class="fu">encode</span><span class="op">(</span><span class="fu"><a href="https://rdrr.io/r/base/paste.html" class="external-link">paste</a></span><span class="op">(</span><span class="st">"&lt;fbegin&gt;"</span>, <span class="fu"><a href="https://rdrr.io/r/base/character.html" class="external-link">as.character</a></span><span class="op">(</span><span class="va">self</span><span class="op">$</span><span class="va">data</span><span class="op">$</span><span class="va">content</span><span class="op">[</span><span class="va">self</span><span class="op">$</span><span class="va">index</span><span class="op">[</span><span class="va">i</span><span class="op">]</span><span class="op">]</span><span class="op">)</span>, <span class="st">"&lt;fend&gt;"</span><span class="op">)</span><span class="op">)</span><span class="op">$</span><span class="va">ids</span></span>
+<span>        <span class="op">)</span></span>
+<span>        <span class="va">i</span> <span class="op">&lt;&lt;-</span> <span class="va">i</span> <span class="op">+</span> <span class="fl">1L</span></span>
+<span>      <span class="op">}</span></span>
+<span></span>
+<span>      <span class="kw">if</span> <span class="op">(</span><span class="fu"><a href="https://rdrr.io/r/base/length.html" class="external-link">length</a></span><span class="op">(</span><span class="va">sequence</span><span class="op">)</span> <span class="op">&lt;</span> <span class="op">(</span><span class="va">self</span><span class="op">$</span><span class="va">context_length</span> <span class="op">+</span> <span class="fl">1</span><span class="op">)</span><span class="op">)</span> <span class="op">{</span></span>
+<span>        <span class="kw"><a href="https://rdrr.io/r/base/function.html" class="external-link">return</a></span><span class="op">(</span><span class="fu">coro</span><span class="fu">::</span><span class="fu"><a href="https://rdrr.io/pkg/coro/man/iterator.html" class="external-link">exhausted</a></span><span class="op">(</span><span class="op">)</span><span class="op">)</span></span>
+<span>      <span class="op">}</span></span>
+<span></span>
+<span>      <span class="fu"><a href="https://rdrr.io/r/base/on.exit.html" class="external-link">on.exit</a></span><span class="op">(</span><span class="op">{</span></span>
+<span>        <span class="va">sequence</span> <span class="op">&lt;&lt;-</span> <span class="va">sequence</span><span class="op">[</span><span class="op">-</span><span class="fu"><a href="https://rdrr.io/r/base/seq.html" class="external-link">seq_len</a></span><span class="op">(</span><span class="va">self</span><span class="op">$</span><span class="va">context_length</span><span class="op">)</span><span class="op">]</span></span>
+<span>      <span class="op">}</span><span class="op">)</span></span>
+<span>      <span class="fu"><a href="https://rdrr.io/r/base/list.html" class="external-link">list</a></span><span class="op">(</span></span>
+<span>        input_ids <span class="op">=</span> <span class="va">sequence</span><span class="op">[</span><span class="fu"><a href="https://rdrr.io/r/base/seq.html" class="external-link">seq_len</a></span><span class="op">(</span><span class="va">self</span><span class="op">$</span><span class="va">context_length</span><span class="op">)</span><span class="op">]</span> <span class="op">+</span> <span class="fl">1L</span>,</span>
+<span>        labels <span class="op">=</span> <span class="va">sequence</span><span class="op">[</span><span class="fl">2</span><span class="op">:</span><span class="op">(</span><span class="va">self</span><span class="op">$</span><span class="va">context_length</span> <span class="op">+</span> <span class="fl">1</span><span class="op">)</span><span class="op">]</span> <span class="op">+</span> <span class="fl">1L</span></span>
+<span>      <span class="op">)</span></span>
+<span>    <span class="op">}</span></span>
+<span>  <span class="op">}</span></span>
+<span><span class="op">)</span></span>
+<span></span>
+<span><span class="co"># debug code for the dataset</span></span>
+<span><span class="co"># ds &lt;- r_sources_dataset("~/Downloads/")</span></span>
+<span><span class="co"># it &lt;- ds$.iter()</span></span>
+<span><span class="co"># it()</span></span>
+<span><span class="co"># ds$tok$get_vocab_size()</span></span></code></pre></div>
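+<p>The packing logic inside <code>.iter</code> can be hard to follow next to
+the tokenizer calls. The following is a minimal base R sketch of the same idea,
+not the vignette's implementation: <code>make_packer</code> and the toy
+<code>docs</code> list are hypothetical names, and <code>NULL</code> stands in
+for <code>coro::exhausted()</code>. It concatenates token ids into a buffer and
+emits windows of <code>context_length</code> inputs with labels shifted by one
+position.</p>

```r
# Hypothetical helper: pack a list of token-id vectors into fixed-length windows.
make_packer <- function(docs, context_length) {
  i <- 1L
  buffer <- integer(0)
  function() {
    # Refill the buffer until it holds one window plus one extra token,
    # which is needed to build the shifted labels.
    while (length(buffer) < context_length + 1L && i <= length(docs)) {
      buffer <<- c(buffer, docs[[i]])
      i <<- i + 1L  # `<<-` so the position survives between calls
    }
    if (length(buffer) < context_length + 1L) {
      return(NULL)  # stand-in for coro::exhausted()
    }
    window <- buffer[seq_len(context_length + 1L)]
    # Drop the consumed inputs; the last label becomes the next first input.
    buffer <<- buffer[-seq_len(context_length)]
    list(
      input_ids = window[-length(window)],
      labels = window[-1L]
    )
  }
}

docs <- list(c(10L, 11L, 12L), c(20L, 21L), c(30L, 31L, 32L, 33L))
next_window <- make_packer(docs, context_length = 4L)
first <- next_window()
second <- next_window()
third <- next_window()
```

+<p>With these toy documents the first window spans two documents, the second
+starts at the last label of the first, and the third call signals exhaustion.
+The dataset above additionally adds <code>1L</code> to every id, presumably
+because the tokenizer ids start at 0 while torch for R indexes from 1.</p>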
+<p>This dataset is likely too large for us to train the model on all
+documents in this example. It’s also hard to predict how long a full
+pass over it would take. To make this easier, we
+define a wrapper dataset that runs the above dataset for a
+fixed number of steps. This is not required, but it makes using luz more
+pleasant, as we can easily define for how many tokens we want to train
+our model.</p>
+<div class="sourceCode" id="cb5"><pre class="downlit sourceCode r">
+<code class="sourceCode R"><span><span class="va">fixed_steps_iterable_dataset</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/pkg/torch/man/iterable_dataset.html" class="external-link">iterable_dataset</a></span><span class="op">(</span></span>
+<span>  <span class="st">"fixed_steps_dataset"</span>,</span>
+<span>  initialize <span class="op">=</span> <span class="kw">function</span><span class="op">(</span><span class="va">dataset</span>, <span class="va">steps</span><span class="op">)</span> <span class="op">{</span></span>
+<span>    <span class="va">self</span><span class="op">$</span><span class="va">dataset</span> <span class="op">&lt;-</span> <span class="va">dataset</span></span>
+<span>    <span class="va">self</span><span class="op">$</span><span class="va">steps</span> <span class="op">&lt;-</span> <span class="va">steps</span></span>
+<span>  <span class="op">}</span>,</span>
+<span>  .iter <span class="op">=</span> <span class="kw">function</span><span class="op">(</span><span class="op">)</span> <span class="op">{</span></span>
+<span>    <span class="va">i</span> <span class="op">&lt;-</span> <span class="fl">1L</span></span>
+<span>    <span class="va">iter</span> <span class="op">&lt;-</span> <span class="cn">NULL</span></span>
+<span>    <span class="kw">function</span><span class="op">(</span><span class="op">)</span> <span class="op">{</span></span>
+<span>      <span class="kw">if</span> <span class="op">(</span><span class="va">i</span> <span class="op">&gt;</span> <span class="va">self</span><span class="op">$</span><span class="va">steps</span><span class="op">)</span> <span class="op">{</span></span>
+<span>        <span class="kw"><a href="https://rdrr.io/r/base/function.html" class="external-link">return</a></span><span class="op">(</span><span class="fu">coro</span><span class="fu">::</span><span class="fu"><a href="https://rdrr.io/pkg/coro/man/iterator.html" class="external-link">exhausted</a></span><span class="op">(</span><span class="op">)</span><span class="op">)</span></span>
+<span>      <span class="op">}</span></span>
+<span></span>
+<span>      <span class="va">i</span> <span class="op">&lt;&lt;-</span> <span class="va">i</span> <span class="op">+</span> <span class="fl">1L</span></span>
+<span></span>
+<span>      <span class="kw">if</span> <span class="op">(</span><span class="fu"><a href="https://rdrr.io/r/base/NULL.html" class="external-link">is.null</a></span><span class="op">(</span><span class="va">iter</span><span class="op">)</span> <span class="op">||</span> <span class="fu">coro</span><span class="fu">::</span><span class="fu"><a href="https://rdrr.io/pkg/coro/man/iterator.html" class="external-link">is_exhausted</a></span><span class="op">(</span><span class="va">data</span> <span class="op">&lt;-</span> <span class="fu">iter</span><span class="op">(</span><span class="op">)</span><span class="op">)</span><span class="op">)</span> <span class="op">{</span></span>
+<span>        <span class="va">iter</span> <span class="op">&lt;&lt;-</span> <span class="va">self</span><span class="op">$</span><span class="va">dataset</span><span class="op">$</span><span class="fu">.iter</span><span class="op">(</span><span class="op">)</span></span>
+<span>        <span class="va">data</span> <span class="op">&lt;-</span> <span class="fu">iter</span><span class="op">(</span><span class="op">)</span></span>
+<span>      <span class="op">}</span></span>
+<span></span>
+<span>      <span class="va">data</span></span>
+<span>    <span class="op">}</span></span>
+<span>  <span class="op">}</span>,</span>
+<span>  .length <span class="op">=</span> <span class="kw">function</span><span class="op">(</span><span class="op">)</span> <span class="op">{</span></span>
+<span>    <span class="va">self</span><span class="op">$</span><span class="va">steps</span></span>
+<span>  <span class="op">}</span></span>
+<span><span class="op">)</span></span></code></pre></div>
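+<p>The restart behaviour of the wrapper is easiest to see on a toy iterator.
+The sketch below uses only base R and is an illustration under assumptions:
+<code>make_iter</code> and <code>fixed_steps</code> are hypothetical names, and
+<code>NULL</code> again stands in for <code>coro::exhausted()</code>. The
+wrapper yields exactly <code>steps</code> items and silently starts a fresh
+pass whenever the underlying iterator runs out.</p>

```r
# Hypothetical stand-in for a dataset's `.iter()`: yields 1, 2, 3 and then NULL.
make_iter <- function() {
  k <- 0L
  function() {
    if (k >= 3L) return(NULL)
    k <<- k + 1L
    k
  }
}

# Yield exactly `steps` items, restarting the underlying iterator when needed.
fixed_steps <- function(make_iter, steps) {
  i <- 0L
  iter <- NULL
  function() {
    if (i >= steps) return(NULL)
    i <<- i + 1L
    value <- if (is.null(iter)) NULL else iter()
    if (is.null(value)) {
      # First call, or the previous pass is finished: start a new pass.
      iter <<- make_iter()
      value <- iter()
    }
    value
  }
}

next_item <- fixed_steps(make_iter, steps = 5L)
out <- integer(0)
while (!is.null(item <- next_item())) out <- c(out, item)
out
```

+<p>Here <code>out</code> holds one full pass followed by the first two items
+of a second pass, so the number of steps is decoupled from the size of the
+underlying data.</p>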
+<p>We finally define the model we are going to train. We’ll use a small
+version of GPT2. We also define a <code>generate</code> method allowing
+us to sample from the model given an initial context.</p>
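+<p>Before the full definition, the sampling step that <code>generate</code>
+performs for each new token can be sketched in base R on a plain numeric
+vector. This is a simplified illustration, not the torch code:
+<code>sample_top_k</code> is a hypothetical helper, and it handles a single
+vector of logits rather than a batch. It scales the logits by the temperature,
+keeps the <code>top_k</code> largest, sets the rest to <code>-Inf</code>,
+applies a softmax, and samples one index.</p>

```r
# Hypothetical helper: sample one index from `logits` with temperature and top-k.
sample_top_k <- function(logits, temperature = 1, top_k = 10) {
  logits <- logits / temperature
  k <- min(top_k, length(logits))
  keep <- order(logits, decreasing = TRUE)[seq_len(k)]
  masked <- rep(-Inf, length(logits))
  masked[keep] <- logits[keep]
  # Numerically stable softmax; exp(-Inf) is 0, so masked entries get
  # zero probability and can never be sampled.
  probs <- exp(masked - max(masked))
  probs <- probs / sum(probs)
  list(probs = probs, id = sample.int(length(probs), size = 1L, prob = probs))
}

set.seed(1)
res <- sample_top_k(c(2.0, 0.5, 1.0, -1.0, 3.0), temperature = 0.8, top_k = 2)
```

+<p>With <code>top_k = 2</code> only positions 1 and 5 keep non-zero
+probability. Lower temperatures concentrate more mass on the largest logit,
+while higher ones flatten the distribution over the kept candidates.</p>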
+<div class="sourceCode" id="cb6"><pre class="downlit sourceCode r">
+<code class="sourceCode R"><span><span class="va">net</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/pkg/torch/man/nn_module.html" class="external-link">nn_module</a></span><span class="op">(</span></span>
+<span>  initialize <span class="op">=</span> <span class="kw">function</span><span class="op">(</span><span class="op">)</span> <span class="op">{</span></span>
+<span>    <span class="va">self</span><span class="op">$</span><span class="va">gpt</span> <span class="op">&lt;-</span> <span class="fu">minhub</span><span class="fu">::</span><span class="fu">gpt2</span><span class="op">(</span></span>
+<span>      vocab_size <span class="op">=</span> <span class="fl">20000</span>,</span>
+<span>      pdrop <span class="op">=</span> <span class="fl">0.1</span></span>
+<span>    <span class="op">)</span></span>
+<span>  <span class="op">}</span>,</span>
+<span>  forward <span class="op">=</span> <span class="kw">function</span><span class="op">(</span><span class="va">x</span><span class="op">)</span> <span class="op">{</span></span>
+<span>    <span class="va">self</span><span class="op">$</span><span class="fu">gpt</span><span class="op">(</span><span class="va">x</span><span class="op">)</span><span class="op">$</span><span class="fu">transpose</span><span class="op">(</span><span class="fl">2</span>,<span class="fl">3</span><span class="op">)</span></span>
+<span>  <span class="op">}</span>,</span>
+<span>  generate <span class="op">=</span> <span class="kw">function</span><span class="op">(</span><span class="va">x</span>, <span class="va">temperature</span> <span class="op">=</span> <span class="fl">1</span>, <span class="va">iter</span> <span class="op">=</span> <span class="fl">50</span>, <span class="va">top_k</span> <span class="op">=</span> <span class="fl">10</span><span class="op">)</span> <span class="op">{</span></span>
+<span>    <span class="co"># samples from the model given a context vector.</span></span>
+<span>    <span class="kw">for</span> <span class="op">(</span><span class="va">i</span> <span class="kw">in</span> <span class="fu"><a href="https://rdrr.io/r/base/seq.html" class="external-link">seq_len</a></span><span class="op">(</span><span class="va">iter</span><span class="op">)</span><span class="op">)</span> <span class="op">{</span></span>
+<span>      <span class="va">logits</span> <span class="op">&lt;-</span> <span class="va">self</span><span class="op">$</span><span class="fu">forward</span><span class="op">(</span><span class="va">x</span><span class="op">)</span><span class="op">[</span>,,<span class="op">-</span><span class="fl">1</span><span class="op">]</span></span>
+<span>      <span class="va">logits</span> <span class="op">&lt;-</span> <span class="va">logits</span><span class="op">/</span><span class="va">temperature</span></span>
+<span>      <span class="fu"><a href="https://rdrr.io/r/base/c.html" class="external-link">c</a></span><span class="op">(</span><span class="va">prob</span>, <span class="va">ind</span><span class="op">)</span> <span class="op"><a href="https://rdrr.io/pkg/zeallot/man/operator.html" class="external-link">%&lt;-%</a></span> <span class="va">logits</span><span class="op">$</span><span class="fu">topk</span><span class="op">(</span><span class="va">top_k</span><span class="op">)</span></span>
+<span>      <span class="va">logits</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/pkg/torch/man/torch_full_like.html" class="external-link">torch_full_like</a></span><span class="op">(</span><span class="va">logits</span>, <span class="op">-</span><span class="cn">Inf</span><span class="op">)</span><span class="op">$</span><span class="fu">scatter_</span><span class="op">(</span><span class="op">-</span><span class="fl">1</span>, <span class="va">ind</span>, <span class="va">prob</span><span class="op">)</span></span>
+<span>      <span class="va">logits</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/pkg/torch/man/nnf_softmax.html" class="external-link">nnf_softmax</a></span><span class="op">(</span><span class="va">logits</span>, dim <span class="op">=</span> <span class="op">-</span><span class="fl">1</span><span class="op">)</span></span>
+<span>      <span class="va">id_next</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/pkg/torch/man/torch_multinomial.html" class="external-link">torch_multinomial</a></span><span class="op">(</span><span class="va">logits</span>, num_samples <span class="op">=</span> <span class="fl">1</span><span class="op">)</span></span>
+<span>      <span class="va">x</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/pkg/torch/man/torch_cat.html" class="external-link">torch_cat</a></span><span class="op">(</span><span class="fu"><a href="https://rdrr.io/r/base/list.html" class="external-link">list</a></span><span class="op">(</span><span class="va">x</span>, <span class="va">id_next</span><span class="op">)</span>, dim <span class="op">=</span> <span class="fl">2</span><span class="op">)</span></span>
+<span>    <span class="op">}</span></span>
+<span>    <span class="va">x</span></span>
+<span>  <span class="op">}</span></span>
+<span><span class="op">)</span></span>
+<span></span>
+<span><span class="co"># debug code for the model</span></span>
+<span><span class="co"># ds &lt;- torch::dataloader(r_sources_dataset("~/Downloads/"), batch_size = 32)</span></span>
+<span><span class="co"># batch &lt;- coro::collect(ds, 1)[[1]]</span></span>
+<span><span class="co"># str(batch)</span></span>
+<span><span class="co"># m &lt;- net()</span></span>
+<span><span class="co"># str(m(batch$input_ids))</span></span></code></pre></div>
+<p>To make it easier to inspect training, we will also define a callback
+that prints a sample from the model every epoch.</p>
+<div class="sourceCode" id="cb7"><pre class="downlit sourceCode r">
+<code class="sourceCode R"><span><span class="co"># samples from the model using the context.</span></span>
+<span><span class="va">generate</span> <span class="op">&lt;-</span> <span class="kw">function</span><span class="op">(</span><span class="va">model</span>, <span class="va">tok</span>, <span class="va">context</span>, <span class="va">...</span><span class="op">)</span> <span class="op">{</span></span>
+<span>  <span class="fu"><a href="https://rdrr.io/pkg/torch/man/with_no_grad.html" class="external-link">local_no_grad</a></span><span class="op">(</span><span class="op">)</span> <span class="co"># disables gradient for sampling</span></span>
+<span>  <span class="va">x</span> <span class="op">&lt;-</span> <span class="va">tok</span><span class="op">$</span><span class="fu">encode</span><span class="op">(</span><span class="va">context</span><span class="op">)</span><span class="op">$</span><span class="va">ids</span> <span class="op">+</span> <span class="fl">1L</span></span>
+<span>  <span class="va">x</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/pkg/torch/man/torch_tensor.html" class="external-link">torch_tensor</a></span><span class="op">(</span><span class="va">x</span><span class="op">)</span><span class="op">[</span><span class="cn">NULL</span>,<span class="op">]</span><span class="op">$</span><span class="fu">to</span><span class="op">(</span>device <span class="op">=</span> <span class="va">model</span><span class="op">$</span><span class="va">device</span><span class="op">)</span></span>
+<span>  <span class="va">content</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/integer.html" class="external-link">as.integer</a></span><span class="op">(</span><span class="va">model</span><span class="op">$</span><span class="fu">generate</span><span class="op">(</span><span class="va">x</span>, <span class="va">...</span><span class="op">)</span><span class="op">$</span><span class="fu">cpu</span><span class="op">(</span><span class="op">)</span><span class="op">)</span></span>
+<span>  <span class="va">tok</span><span class="op">$</span><span class="fu">decode</span><span class="op">(</span><span class="va">content</span> <span class="op">-</span> <span class="fl">1L</span><span class="op">)</span></span>
+<span><span class="op">}</span></span>
+<span></span>
+<span><span class="va">display_cb</span> <span class="op">&lt;-</span> <span class="fu"><a href="../../reference/luz_callback.html">luz_callback</a></span><span class="op">(</span></span>
+<span>  initialize <span class="op">=</span> <span class="kw">function</span><span class="op">(</span><span class="op">)</span> <span class="op">{</span><span class="op">}</span>,</span>
+<span>  on_epoch_end <span class="op">=</span> <span class="kw">function</span><span class="op">(</span><span class="op">)</span> <span class="op">{</span></span>
+<span>    <span class="fu"><a href="https://rdrr.io/pkg/torch/man/with_no_grad.html" class="external-link">local_no_grad</a></span><span class="op">(</span><span class="op">)</span></span>
+<span>    <span class="co"># sample from the model...</span></span>
+<span>    <span class="va">context</span> <span class="op">&lt;-</span> <span class="st">"# creates a linear model"</span></span>
+<span>    <span class="va">text</span> <span class="op">&lt;-</span> <span class="fu">generate</span><span class="op">(</span><span class="va">ctx</span><span class="op">$</span><span class="va">model</span>, <span class="va">dataset</span><span class="op">$</span><span class="va">dataset</span><span class="op">$</span><span class="va">tok</span>, <span class="va">context</span>, iter <span class="op">=</span> <span class="fl">100</span><span class="op">)</span></span>
+<span>    <span class="fu">cli</span><span class="fu">::</span><span class="fu"><a href="https://cli.r-lib.org/reference/cli_rule.html" class="external-link">cli_rule</a></span><span class="op">(</span><span class="op">)</span></span>
+<span>    <span class="fu"><a href="https://rdrr.io/r/base/cat.html" class="external-link">cat</a></span><span class="op">(</span><span class="va">text</span>, <span class="st">"\n"</span><span class="op">)</span></span>
+<span>    <span class="fu">cli</span><span class="fu">::</span><span class="fu"><a href="https://cli.r-lib.org/reference/cli_rule.html" class="external-link">cli_rule</a></span><span class="op">(</span><span class="op">)</span></span>
+<span>  <span class="op">}</span></span>
+<span><span class="op">)</span></span></code></pre></div>
+<p>We can finally train the model. We set a training budget of half a
+billion tokens, spread over a total of 100 epochs.</p>
+<div class="sourceCode" id="cb8"><pre class="downlit sourceCode r">
+<code class="sourceCode R"><span><span class="va">n_tokens</span> <span class="op">&lt;-</span> <span class="fl">500e6</span></span>
+<span><span class="va">batch_size</span> <span class="op">&lt;-</span> <span class="fl">16</span></span>
+<span><span class="va">epochs</span> <span class="op">&lt;-</span> <span class="fl">100</span></span>
+<span><span class="va">context_length</span> <span class="op">&lt;-</span> <span class="fl">256L</span></span>
+<span></span>
+<span><span class="va">steps</span> <span class="op">&lt;-</span> <span class="va">n_tokens</span> <span class="op">/</span> <span class="va">context_length</span> <span class="op">/</span> <span class="va">epochs</span></span>
+<span><span class="va">dataset</span> <span class="op">&lt;-</span> <span class="fu">fixed_steps_iterable_dataset</span><span class="op">(</span></span>
+<span>  <span class="fu">r_sources_dataset</span><span class="op">(</span>context_length <span class="op">=</span> <span class="va">context_length</span><span class="op">)</span>,</span>
+<span>  steps <span class="op">=</span> <span class="va">steps</span></span>
+<span><span class="op">)</span></span>
+<span></span>
+<span><span class="va">fitted</span> <span class="op">&lt;-</span> <span class="va">net</span> <span class="op"><a href="../../reference/pipe.html">%&gt;%</a></span></span>
+<span>  <span class="fu"><a href="../../reference/setup.html">setup</a></span><span class="op">(</span></span>
+<span>    optimizer <span class="op">=</span> <span class="va">optim_adam</span>,</span>
+<span>    loss <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/pkg/torch/man/nn_cross_entropy_loss.html" class="external-link">nn_cross_entropy_loss</a></span><span class="op">(</span><span class="op">)</span></span>
+<span>  <span class="op">)</span> <span class="op"><a href="../../reference/pipe.html">%&gt;%</a></span></span>
+<span>  <span class="fu"><a href="../../reference/set_opt_hparams.html">set_opt_hparams</a></span><span class="op">(</span>lr <span class="op">=</span> <span class="fl">3e-4</span><span class="op">)</span> <span class="op">|&gt;</span></span>
+<span>  <span class="fu"><a href="https://generics.r-lib.org/reference/fit.html" class="external-link">fit</a></span><span class="op">(</span></span>
+<span>    <span class="va">dataset</span>,</span>
+<span>    epochs <span class="op">=</span> <span class="va">epochs</span>,</span>
+<span>    dataloader_options <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/r/base/list.html" class="external-link">list</a></span><span class="op">(</span>batch_size <span class="op">=</span> <span class="va">batch_size</span><span class="op">)</span>,</span>
+<span>    callbacks <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/r/base/list.html" class="external-link">list</a></span><span class="op">(</span></span>
+<span>      <span class="fu"><a href="../../reference/luz_callback_lr_scheduler.html">luz_callback_lr_scheduler</a></span><span class="op">(</span></span>
+<span>        <span class="fu">torch</span><span class="fu">::</span><span class="va"><a href="https://rdrr.io/pkg/torch/man/lr_one_cycle.html" class="external-link">lr_one_cycle</a></span>,</span>
+<span>        max_lr <span class="op">=</span> <span class="fl">0.1</span>,</span>
+<span>        epochs <span class="op">=</span> <span class="va">epochs</span>,</span>
+<span>        steps_per_epoch <span class="op">=</span> <span class="va">steps</span><span class="op">/</span><span class="va">batch_size</span>,</span>
+<span>        call_on <span class="op">=</span> <span class="st">"on_batch_end"</span></span>
+<span>      <span class="op">)</span>,</span>
+<span>      <span class="fu"><a href="../../reference/luz_callback_gradient_clip.html">luz_callback_gradient_clip</a></span><span class="op">(</span>max_norm <span class="op">=</span> <span class="fl">1</span><span class="op">)</span>,</span>
+<span>      <span class="fu">display_cb</span><span class="op">(</span><span class="op">)</span></span>
+<span>    <span class="op">)</span>,</span>
+<span>    verbose <span class="op">=</span> <span class="cn">TRUE</span></span>
+<span>  <span class="op">)</span></span>
+<span></span>
+<span><span class="fu">luz</span><span class="fu">::</span><span class="fu"><a href="../../reference/luz_save.html">luz_save</a></span><span class="op">(</span><span class="va">fitted</span>, <span class="st">"model.pt"</span><span class="op">)</span></span></code></pre></div>
+<p>We can then use the model to generate text given a prompt with:</p>
+<div class="sourceCode" id="cb9"><pre class="downlit sourceCode r">
+<code class="sourceCode R"><span><span class="va">fitted</span> <span class="op">&lt;-</span> <span class="fu">luz</span><span class="fu">::</span><span class="fu"><a href="../../reference/luz_load.html">luz_load</a></span><span class="op">(</span><span class="st">"model.pt"</span><span class="op">)</span></span>
+<span><span class="va">tok</span> <span class="op">&lt;-</span> <span class="fu">tok</span><span class="fu">::</span><span class="va">tokenizer</span><span class="op">$</span><span class="fu">from_file</span><span class="op">(</span><span class="st">"tokenizer-20000.json"</span><span class="op">)</span></span>
+<span><span class="va">context</span> <span class="op">&lt;-</span> <span class="st">"#' Creates a linear model</span></span>
+<span><span class="st">linear_model &lt;- function(x, y) {</span></span>
+<span><span class="st">"</span></span>
+<span><span class="va">text</span> <span class="op">&lt;-</span> <span class="fu">generate</span><span class="op">(</span><span class="va">fitted</span><span class="op">$</span><span class="va">model</span>, <span class="va">tok</span>, <span class="va">context</span>, iter <span class="op">=</span> <span class="fl">100</span><span class="op">)</span></span>
+<span><span class="fu"><a href="https://rdrr.io/r/base/cat.html" class="external-link">cat</a></span><span class="op">(</span><span class="va">text</span><span class="op">)</span></span></code></pre></div>
+</div>
+  </main>
+</div>
+
+
+
+    <footer><div class="pkgdown-footer-left">
+  <p></p>
+<p>Developed by Daniel Falbel.</p>
+</div>
+
+<div class="pkgdown-footer-right">
+  <p></p>
+<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
+</div>
+
+    </footer>
+</div>
+
+  
+
+  
+
+  </body>
+</html>
diff --git a/articles/get-started.html b/articles/get-started.html
index b5ff5a94..d82dc2d4 100644
--- a/articles/get-started.html
+++ b/articles/get-started.html
@@ -77,7 +77,8 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
 
 
-<script src="get-started_files/accessible-code-block-0.0.1/empty-anchor.js"></script><div class="row">
+
+<div class="row">
   <main id="main" class="col-md-9"><div class="page-header">
       <img src="" class="logo" alt=""><h1>Get started with luz</h1>
             
@@ -91,18 +92,42 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 <div class="sourceCode" id="cb1"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="kw"><a href="https://rdrr.io/r/base/library.html" class="external-link">library</a></span><span class="op">(</span><span class="va"><a href="https://mlverse.github.io/luz/" class="external-link">luz</a></span><span class="op">)</span></span>
 <span><span class="kw"><a href="https://rdrr.io/r/base/library.html" class="external-link">library</a></span><span class="op">(</span><span class="va"><a href="https://torch.mlverse.org/docs" class="external-link">torch</a></span><span class="op">)</span></span></code></pre></div>
-<p>Luz is a high-level API for torch that aims to encapsulate the <strong>training loop</strong> into a set of reusable pieces of code. Luz reduces the boilerplate code required to train a model with torch and avoids the error prone <code>zero_grad()</code> - <code>backward()</code> - <code><a href="https://rdrr.io/r/stats/step.html" class="external-link">step()</a></code> sequence of calls, and also simplifies the process of moving data and models between CPUs and GPUs. Luz is designed to be highly flexible by providing a layered API that allows it to be useful no matter the level of control you need for your training loop.</p>
-<p>Luz is heavily inspired by other higher level frameworks for deep learning, to cite a few:</p>
+<p>Luz is a high-level API for torch that aims to encapsulate the
+<strong>training loop</strong> into a set of reusable pieces of code.
+Luz reduces the boilerplate code required to train a model with torch
+and avoids the error-prone <code>zero_grad()</code> -
+<code>backward()</code> - <code><a href="https://rdrr.io/r/stats/step.html" class="external-link">step()</a></code> sequence of calls, and
+also simplifies the process of moving data and models between CPUs and
+GPUs. Luz is designed to be highly flexible by providing a layered API
+that allows it to be useful no matter the level of control you need for
+your training loop.</p>
+<p>Luz is heavily inspired by other higher-level frameworks for deep
+learning, to cite a few:</p>
 <ul>
-<li><p><a href="https://docs.fast.ai/" class="external-link">FastAI</a>: we are heavily inspired by the FastAI library, especially the <code>Learner</code> object and the callbacks API.</p></li>
-<li><p><a href="https://keras.io/" class="external-link">Keras</a>: We are also heavily inspired by Keras, especially callback names. The lightning module interface is similar to <code>compile</code>, too.</p></li>
-<li><p><a href="https://lightning.ai/pages/open-source/" class="external-link">PyTorch Lightning</a>: The idea of the <code>luz_module</code> being a subclass of <code>nn_module</code> is inspired by the <strong><code>LightningModule</code></strong> object in lightning.</p></li>
-<li><p><a href="https://huggingface.co/docs/accelerate/" class="external-link">HuggingFace Accelerate</a>: The internal device placement API is heavily inspired by Accelerate, but is much more modest in features. Currently only CPU and Single GPU are supported.</p></li>
+<li><p><a href="https://docs.fast.ai/" class="external-link">FastAI</a>: we are heavily
+inspired by the FastAI library, especially the <code>Learner</code>
+object and the callbacks API.</p></li>
+<li><p><a href="https://keras.io/" class="external-link">Keras</a>: We are also heavily
+inspired by Keras, especially callback names. The lightning module
+interface is similar to <code>compile</code>, too.</p></li>
+<li><p><a href="https://lightning.ai/pages/open-source/" class="external-link">PyTorch
+Lightning</a>: The idea of the <code>luz_module</code> being a subclass
+of <code>nn_module</code> is inspired by the
+<strong><code>LightningModule</code></strong> object in
+lightning.</p></li>
+<li><p><a href="https://huggingface.co/docs/accelerate/" class="external-link">HuggingFace
+Accelerate</a>: The internal device placement API is heavily inspired by
+Accelerate, but is much more modest in features. Currently, only CPU and
+single-GPU training are supported.</p></li>
 </ul>
 <div class="section level2">
 <h2 id="training-a-nn_module">Training a <code>nn_module</code><a class="anchor" aria-label="anchor" href="#training-a-nn_module"></a>
 </h2>
-<p>As much as possible, luz tries to reuse the existing structures from torch. A model in luz is defined identically as you would define it if using raw torch. For a specific example, this is the definition of a feed-forward CNN that can be used to classify digits from the MNIST dataset:</p>
+<p>As much as possible, luz tries to reuse the existing structures from
+torch. A model in luz is defined identically as you would define it if
+using raw torch. For a specific example, this is the definition of a
+feed-forward CNN that can be used to classify digits from the MNIST
+dataset:</p>
 <div class="sourceCode" id="cb2"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">net</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/pkg/torch/man/nn_module.html" class="external-link">nn_module</a></span><span class="op">(</span></span>
 <span>  <span class="st">"Net"</span>,</span>
@@ -129,7 +154,9 @@ <h2 id="training-a-nn_module">Training a <code>nn_module</code><a class="anchor"
 <span>    <span class="va">x</span></span>
 <span>  <span class="op">}</span></span>
 <span><span class="op">)</span></span></code></pre></div>
-<p>We can now train this model in the <code>train_dl</code> and validate it in the <code>test_dl</code> <code>torch::dataloaders()</code> with:</p>
+<p>We can now train this model on <code>train_dl</code> and validate
+it on <code>test_dl</code>, both <code>torch::dataloader()</code> objects,
+with:</p>
 <div class="sourceCode" id="cb3"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">fitted</span> <span class="op">&lt;-</span> <span class="va">net</span> <span class="op"><a href="../reference/pipe.html">%&gt;%</a></span></span>
 <span>  <span class="fu"><a href="../reference/setup.html">setup</a></span><span class="op">(</span></span>
@@ -144,21 +171,51 @@ <h2 id="training-a-nn_module">Training a <code>nn_module</code><a class="anchor"
 <span>  <span class="fu"><a href="https://generics.r-lib.org/reference/fit.html" class="external-link">fit</a></span><span class="op">(</span><span class="va">train_dl</span>, epochs <span class="op">=</span> <span class="fl">10</span>, valid_data <span class="op">=</span> <span class="va">test_dl</span><span class="op">)</span></span></code></pre></div>
 <p>Let’s understand what happens in this chunk of code:</p>
 <ol style="list-style-type: decimal">
-<li>The <code>setup</code> function allows you to configure the loss (objective) function and the optimizer that you will use to train your model. Optionally you can pass a list of metrics that are tracked during the training procedure. <strong>Note:</strong> the loss function can be any function taking <code>input</code> and <code>target</code> tensors and returning a scalar tensor value, and the optimizer can be any core torch optimizer or custom ones created with the <code><a href="https://rdrr.io/pkg/torch/man/optimizer.html" class="external-link">torch::optimizer()</a></code> function.</li>
-<li>The <code><a href="../reference/set_hparams.html">set_hparams()</a></code> function allows you to set hyper-parameters that should be passed to the module <code>initialize()</code> method. For example in this case we pass <code>num_classes = 10</code>.</li>
-<li>The <code><a href="../reference/set_opt_hparams.html">set_opt_hparams()</a></code> function allows you to pass hyper-parameters that are used by the optimizer function. For example, <code><a href="https://rdrr.io/pkg/torch/man/optim_adam.html" class="external-link">optim_adam()</a></code> can take the <code>lr</code> parameter specifying the learning rate and we specify it with <code>lr = 0.003</code>.</li>
-<li>The <code>fit</code> method will take the model specification provided by <code><a href="../reference/setup.html">setup()</a></code> and run the training procedure using the specified training and validation <code>torch::dataloaders()</code> as well as the number of epochs. <strong>Note:</strong> we again reuse core torch data structures, instead of providing our own data loading functionality.</li>
-<li>The returned object <code>fitted</code> contains the trained model as well as the record of metrics and losses produced during training. It can also be used for producing predictions and for evaluating the trained model on other datasets.</li>
+<li>The <code>setup</code> function allows you to configure the loss
+(objective) function and the optimizer that you will use to train your
+model. Optionally you can pass a list of metrics that are tracked during
+the training procedure. <strong>Note:</strong> the loss function can be
+any function taking <code>input</code> and <code>target</code> tensors
+and returning a scalar tensor value, and the optimizer can be any core
+torch optimizer or custom ones created with the
+<code><a href="https://rdrr.io/pkg/torch/man/optimizer.html" class="external-link">torch::optimizer()</a></code> function.</li>
+<li>The <code><a href="../reference/set_hparams.html">set_hparams()</a></code> function allows you to set
+hyper-parameters that should be passed to the module
+<code>initialize()</code> method. For example in this case we pass
+<code>num_classes = 10</code>.</li>
+<li>The <code><a href="../reference/set_opt_hparams.html">set_opt_hparams()</a></code> function allows you to pass
+hyper-parameters that are used by the optimizer function. For example,
+<code><a href="https://rdrr.io/pkg/torch/man/optim_adam.html" class="external-link">optim_adam()</a></code> can take the <code>lr</code> parameter
+specifying the learning rate and we specify it with
+<code>lr = 0.003</code>.</li>
+<li>The <code>fit</code> method will take the model specification
+provided by <code><a href="../reference/setup.html">setup()</a></code> and run the training procedure using
+the specified training and validation <code>torch::dataloaders()</code>
+as well as the number of epochs. <strong>Note:</strong> we again reuse
+core torch data structures, instead of providing our own data loading
+functionality.</li>
+<li>The returned object <code>fitted</code> contains the trained model
+as well as the record of metrics and losses produced during training. It
+can also be used for producing predictions and for evaluating the
+trained model on other datasets.</li>
 </ol>
-<p>When fitting, luz will use the fastest possible accelerator; if a CUDA-capable GPU is available it will be used, otherwise we fall back to the CPU. It also automatically moves data, optimizers, and models to the selected device so you don’t need to handle it manually (which is in general very error prone).</p>
-<p>To create predictions from the trained model you can use the <code>predict</code> method:</p>
+<p>When fitting, luz will use the fastest possible accelerator; if a
+CUDA-capable GPU is available it will be used, otherwise we fall back to
+the CPU. It also automatically moves data, optimizers, and models to the
+selected device so you don’t need to handle it manually (which is in
+general very error prone).</p>
+<p>To create predictions from the trained model you can use the
+<code>predict</code> method:</p>
 <div class="sourceCode" id="cb4"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">predictions</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/stats/predict.html" class="external-link">predict</a></span><span class="op">(</span><span class="va">fitted</span>, <span class="va">test_dl</span><span class="op">)</span></span></code></pre></div>
 </div>
 <div class="section level2">
 <h2 id="the-training-loop">The training loop<a class="anchor" aria-label="anchor" href="#the-training-loop"></a>
 </h2>
-<p>You now have a general idea of how to use the <code>fit</code> function and now it’s important to have an overview of what’s happening inside it. In pseudocode, here’s what <code>fit</code> does. This is not fully detailed but should help you to build your intuition:</p>
+<p>You now have a general idea of how to use the <code>fit</code>
+function and now it’s important to have an overview of what’s happening
+inside it. In pseudocode, here’s what <code>fit</code> does. This is not
+fully detailed but should help you to build your intuition:</p>
 <div class="sourceCode" id="cb5"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="co"># -&gt; Initialize objects: model, optimizers.</span></span>
 <span><span class="co"># -&gt; Select fitting device.</span></span>
@@ -184,25 +241,45 @@ <h2 id="the-training-loop">The training loop<a class="anchor" aria-label="anchor
 <div class="section level2">
 <h2 id="metrics">Metrics<a class="anchor" aria-label="anchor" href="#metrics"></a>
 </h2>
-<p>One of the most important parts in machine learning projects is choosing the evaluation metric. Luz allows tracking many different metrics during training with minimal code changes.</p>
-<p>In order to track metrics, you only need to modify the <code>metrics</code> parameter in the <code>setup</code> function:</p>
-<div class="sourceCode" id="cb6"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb6-1"><a href="#cb6-1"></a>fitted &lt;-<span class="st"> </span>net <span class="op">%&gt;%</span></span>
-<span id="cb6-2"><a href="#cb6-2"></a><span class="st">  </span><span class="kw">setup</span>(</span>
-<span id="cb6-3"><a href="#cb6-3"></a>    ...</span>
-<span id="cb6-4"><a href="#cb6-4"></a>    <span class="dt">metrics =</span> <span class="kw">list</span>(</span>
-<span id="cb6-5"><a href="#cb6-5"></a>      luz_metric_accuracy</span>
-<span id="cb6-6"><a href="#cb6-6"></a>    )</span>
-<span id="cb6-7"><a href="#cb6-7"></a>  ) <span class="op">%&gt;%</span></span>
-<span id="cb6-8"><a href="#cb6-8"></a><span class="st">  </span><span class="kw">fit</span>(...)</span></code></pre></div>
-<p>Luz provides implementations of a few of the most used metrics. If a metric is not available you can always implement a new one using the <code>luz_metric</code> function.</p>
-<p>In order to implement a new <code>luz_metric</code> we need to implement 3 methods:</p>
+<p>One of the most important parts in machine learning projects is
+choosing the evaluation metric. Luz allows tracking many different
+metrics during training with minimal code changes.</p>
+<p>In order to track metrics, you only need to modify the
+<code>metrics</code> parameter in the <code>setup</code> function:</p>
+<div class="sourceCode" id="cb6"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a>fitted <span class="ot">&lt;-</span> net <span class="sc">%&gt;%</span></span>
+<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a>  <span class="fu">setup</span>(</span>
+<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a>    ...</span>
+<span id="cb6-4"><a href="#cb6-4" aria-hidden="true" tabindex="-1"></a>    <span class="at">metrics =</span> <span class="fu">list</span>(</span>
+<span id="cb6-5"><a href="#cb6-5" aria-hidden="true" tabindex="-1"></a>      luz_metric_accuracy</span>
+<span id="cb6-6"><a href="#cb6-6" aria-hidden="true" tabindex="-1"></a>    )</span>
+<span id="cb6-7"><a href="#cb6-7" aria-hidden="true" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb6-8"><a href="#cb6-8" aria-hidden="true" tabindex="-1"></a>  <span class="fu">fit</span>(...)</span></code></pre></div>
+<p>Luz provides implementations of a few of the most used metrics. If a
+metric is not available you can always implement a new one using the
+<code>luz_metric</code> function.</p>
+<p>In order to implement a new <code>luz_metric</code> we need to
+implement 3 methods:</p>
 <ul>
-<li><p><code>initialize</code>: defines the metric initial state. This function is called for each epoch for both training and validation loops.</p></li>
-<li><p><code>update</code>: updates the metric internal state. This function is called at every training and validation step with the predictions obtained by the model and the target values obtained from the dataloader.</p></li>
-<li><p><code>compute</code>: uses the internal state to compute metric values. This function is called whenever we need to obtain the current metric value. Eg, it’s called every training step for metrics displayed in the progress bar, but only called once per epoch to record it’s value when the progress bar is not displayed.</p></li>
+<li><p><code>initialize</code>: defines the metric initial state. This
+function is called for each epoch for both training and validation
+loops.</p></li>
+<li><p><code>update</code>: updates the metric internal state. This
+function is called at every training and validation step with the
+predictions obtained by the model and the target values obtained from
+the dataloader.</p></li>
+<li><p><code>compute</code>: uses the internal state to compute metric
+values. This function is called whenever we need to obtain the current
+metric value. E.g., it’s called at every training step for metrics displayed
+in the progress bar, but only once per epoch to record its value
+when the progress bar is not displayed.</p></li>
 </ul>
-<p>Optionally, you can implement an <code>abbrev</code> field that gives the metric an abbreviation that will be used when displaying metric information in the console or tracking record. If no <code>abbrev</code> is passed, the class name will be used.</p>
-<p>Let’s take a look at the implementation of <code>luz_metric_accuracy</code> so you can see how to implement a new one:</p>
+<p>Optionally, you can implement an <code>abbrev</code> field that gives
+the metric an abbreviation that will be used when displaying metric
+information in the console or tracking record. If no <code>abbrev</code>
+is passed, the class name will be used.</p>
+<p>Let’s take a look at the implementation of
+<code>luz_metric_accuracy</code> so you can see how to implement a new
+one:</p>
 <div class="sourceCode" id="cb7"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">luz_metric_accuracy</span> <span class="op">&lt;-</span> <span class="fu"><a href="../reference/luz_metric.html">luz_metric</a></span><span class="op">(</span></span>
 <span>  <span class="co"># An abbreviation to be shown in progress bars, or </span></span>
@@ -230,13 +307,20 @@ <h2 id="metrics">Metrics<a class="anchor" aria-label="anchor" href="#metrics"></
 <span>    <span class="va">self</span><span class="op">$</span><span class="va">correct</span><span class="op">/</span><span class="va">self</span><span class="op">$</span><span class="va">total</span></span>
 <span>  <span class="op">}</span></span>
 <span><span class="op">)</span></span></code></pre></div>
-<p><strong>Note</strong>: It’s good practice that the <code>compute</code> metric returns regular R values instead of torch tensors and other parts of luz will expect that.</p>
+<p><strong>Note</strong>: It’s good practice for the
+<code>compute</code> method to return regular R values instead of torch
+tensors, since other parts of luz expect that.</p>
 </div>
 <div class="section level2">
 <h2 id="evaluate">Evaluate<a class="anchor" aria-label="anchor" href="#evaluate"></a>
 </h2>
-<p>Once a model has been trained you might want to evaluate its performance on a different dataset. For that reason, luz provides the <code><a href="../reference/evaluate.html">?evaluate</a></code> function that takes a fitted model and a dataset and computes the metrics attached to the model.</p>
-<p>Evaluate returns a <code>luz_module_evaluation</code> object that you can query for metrics using the <code>get_metrics</code> function or simply <code>print</code> to see the results.</p>
+<p>Once a model has been trained you might want to evaluate its
+performance on a different dataset. For that reason, luz provides the
+<code><a href="../reference/evaluate.html">?evaluate</a></code> function that takes a fitted model and a dataset
+and computes the metrics attached to the model.</p>
+<p>Evaluate returns a <code>luz_module_evaluation</code> object that you
+can query for metrics using the <code>get_metrics</code> function or
+simply <code>print</code> to see the results.</p>
 <p>For example:</p>
 <div class="sourceCode" id="cb8"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">evaluation</span> <span class="op">&lt;-</span> <span class="va">fitted</span> <span class="op"><a href="../reference/pipe.html">%&gt;%</a></span> <span class="fu"><a href="../reference/evaluate.html">evaluate</a></span><span class="op">(</span>data <span class="op">=</span> <span class="va">valid_dl</span><span class="op">)</span></span>
@@ -252,16 +336,32 @@ <h2 id="evaluate">Evaluate<a class="anchor" aria-label="anchor" href="#evaluate"
 <div class="section level2">
 <h2 id="customizing-with-callbacks">Customizing with callbacks<a class="anchor" aria-label="anchor" href="#customizing-with-callbacks"></a>
 </h2>
-<p>Luz provides different ways to customize the training progress depending on the level of control you need in the training loop. The fastest way and the more ‘reusable’, in the sense that you can create training modifications that can be used in many different situations, is via <strong>callbacks</strong>.</p>
-<p>The training loop in luz has many <em>breakpoints</em> that can call arbitrary R functions. This functionality allows you to customize the training process without having to modify the general training logic.</p>
-<p>Luz implements 3 default callbacks that occur in every training procedure:</p>
+<p>Luz provides different ways to customize the training process
+depending on the level of control you need in the training loop. The
+fastest and most ‘reusable’ way, in the sense that you can create
+training modifications that can be used in many different situations, is
+via <strong>callbacks</strong>.</p>
+<p>The training loop in luz has many <em>breakpoints</em> that can call
+arbitrary R functions. This functionality allows you to customize the
+training process without having to modify the general training
+logic.</p>
+<p>Luz implements 3 default callbacks that occur in every training
+procedure:</p>
 <ul>
-<li><p><strong>train-eval callback</strong>: Sets the model to <code>train()</code> or <code><a href="https://rdrr.io/r/base/eval.html" class="external-link">eval()</a></code> depending on if the procedure is doing training or validation.</p></li>
-<li><p><strong>metrics callback</strong>: evaluate metrics during training and validation process.</p></li>
-<li><p><strong>progress callback</strong>: implements a progress bar and prints progress information during training.</p></li>
+<li><p><strong>train-eval callback</strong>: sets the model to
+<code>train()</code> or <code><a href="https://rdrr.io/r/base/eval.html" class="external-link">eval()</a></code> depending on whether the
+procedure is doing training or validation.</p></li>
+<li><p><strong>metrics callback</strong>: evaluates metrics during the
+training and validation processes.</p></li>
+<li><p><strong>progress callback</strong>: implements a progress bar and
+prints progress information during training.</p></li>
 </ul>
-<p>You can also implement custom callbacks that modify or act specifically for your training procedure. For example:</p>
-<p>Let’s implement a callback that prints ‘Iteration <code>n</code>’ (where <code>n</code> is the iteration number) for every batch in the training set and ‘Done’ when an epoch is finished. For that task we use the <code>luz_callback</code> function:</p>
+<p>You can also implement custom callbacks that modify or act
+specifically for your training procedure. For example:</p>
+<p>Let’s implement a callback that prints ‘Iteration <code>n</code>’
+(where <code>n</code> is the iteration number) for every batch in the
+training set and ‘Done’ when an epoch is finished. For that task we use
+the <code>luz_callback</code> function:</p>
 <div class="sourceCode" id="cb10"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">print_callback</span> <span class="op">&lt;-</span> <span class="fu"><a href="../reference/luz_callback.html">luz_callback</a></span><span class="op">(</span></span>
 <span>  name <span class="op">=</span> <span class="st">"print_callback"</span>,</span>
@@ -275,16 +375,30 @@ <h2 id="customizing-with-callbacks">Customizing with callbacks<a class="anchor"
 <span>    <span class="fu"><a href="https://rdrr.io/r/base/cat.html" class="external-link">cat</a></span><span class="op">(</span><span class="va">self</span><span class="op">$</span><span class="va">message</span>, <span class="st">"\n"</span><span class="op">)</span></span>
 <span>  <span class="op">}</span></span>
 <span><span class="op">)</span></span></code></pre></div>
-<p><code><a href="../reference/luz_callback.html">luz_callback()</a></code> takes named functions as <code>...</code> arguments, where the name indicates the moment at which the callback should be called. For instance <code>on_train_batch_end()</code> is called for every batch at the end of the training procedure, and <code>on_epoch_end()</code> is called at the end of every epoch.</p>
-<p>The returned value of <code><a href="../reference/luz_callback.html">luz_callback()</a></code> is a function that initializes an instance of the callback. Callbacks can have initialization parameters, like the name of a file where you want to log the results. In that case, you can pass an <code>initialize</code> method when creating the callback definition, and save these parameters to the <code>self</code> object. In the above example, the callback has a <code>message</code> parameter that is printed at the end of each epoch.</p>
-<p>Once a callback is defined it can be passed to the <code>fit</code> function via the <code>callbacks</code> parameter:</p>
+<p><code><a href="../reference/luz_callback.html">luz_callback()</a></code> takes named functions as <code>...</code>
+arguments, where the name indicates the moment at which the callback
+should be called. For instance, <code>on_train_batch_end()</code> is
+called at the end of every training batch, and
+<code>on_epoch_end()</code> is called at the end of every epoch.</p>
+<p>The returned value of <code><a href="../reference/luz_callback.html">luz_callback()</a></code> is a function that
+initializes an instance of the callback. Callbacks can have
+initialization parameters, like the name of a file where you want to log
+the results. In that case, you can pass an <code>initialize</code>
+method when creating the callback definition, and save these parameters
+to the <code>self</code> object. In the above example, the callback has
+a <code>message</code> parameter that is printed at the end of each
+epoch.</p>
+<p>Once a callback is defined it can be passed to the <code>fit</code>
+function via the <code>callbacks</code> parameter:</p>
 <div class="sourceCode" id="cb11"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">fitted</span> <span class="op">&lt;-</span> <span class="va">net</span> <span class="op"><a href="../reference/pipe.html">%&gt;%</a></span></span>
 <span>  <span class="fu"><a href="../reference/setup.html">setup</a></span><span class="op">(</span><span class="va">...</span><span class="op">)</span> <span class="op"><a href="../reference/pipe.html">%&gt;%</a></span></span>
 <span>  <span class="fu"><a href="https://generics.r-lib.org/reference/fit.html" class="external-link">fit</a></span><span class="op">(</span><span class="va">...</span>, callbacks <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/r/base/list.html" class="external-link">list</a></span><span class="op">(</span></span>
 <span>    <span class="fu">print_callback</span><span class="op">(</span>message <span class="op">=</span> <span class="st">"Done!"</span><span class="op">)</span></span>
 <span>  <span class="op">)</span><span class="op">)</span></span></code></pre></div>
-<p>Callbacks can be called in many different positions of the training loop, including combinations of them. Here’s an overview of possible callback <em>breakpoints</em>:</p>
+<p>Callbacks can be called in many different positions of the training
+loop, including combinations of them. Here’s an overview of possible
+callback <em>breakpoints</em>:</p>
 <pre><code>Start Fit
    - on_fit_begin
   Start Epoch Loop
@@ -320,10 +434,27 @@ <h2 id="customizing-with-callbacks">Customizing with callbacks<a class="anchor"
   End Epoch Loop
    - on_fit_end
 End Fit</code></pre>
-<p>Every step market with <code>on_*</code> is a point in the training procedure that is available for callbacks to be called.</p>
-<p>The other important part of callbacks is the <code>ctx</code> (context) object. See <code><a href="../reference/ctx.html">help("ctx")</a></code> for details.</p>
-<p>By default, callbacks are called in the same order as they were passed to <code>fit</code> (or <code>predict</code> or <code>evaluate</code>), but you can provide a <code>weight</code> attribute that will control the order in which it will be called. For example, if one callback has <code>weight = 10</code> and another has <code>weight = 1</code>, then the first one is called after the second one. Callbacks that don’t specify a <code>weight</code> attribute are considered <code>weight = 0</code>. A few built-in callbacks in luz already provide a weight value. For example, the <code><a href="../reference/luz_callback_early_stopping.html">?luz_callback_early_stopping</a></code> has a weight of <code>Inf</code>, since in general we want to run it as the last thing in the loop.</p>
-<p>The <code>ctx</code> object is used in luz to share information between the training loop and callbacks, model methods, and metrics. The table below describes information available in the <code>ctx</code> by default. Other callbacks could potentially modify these attributes or add new ones.</p>
+<p>Every step marked with <code>on_*</code> is a point in the training
+procedure that is available for callbacks to be called.</p>
+<p>The other important part of callbacks is the <code>ctx</code>
+(context) object. See <code><a href="../reference/ctx.html">help("ctx")</a></code> for details.</p>
+<p>By default, callbacks are called in the same order as they were
+passed to <code>fit</code> (or <code>predict</code> or
+<code>evaluate</code>), but you can provide a <code>weight</code>
+attribute that will control the order in which it will be called. For
+example, if one callback has <code>weight = 10</code> and another has
+<code>weight = 1</code>, then the first one is called after the second
+one. Callbacks that don’t specify a <code>weight</code> attribute are
+considered <code>weight = 0</code>. A few built-in callbacks in luz
+already provide a weight value. For example, the
+<code><a href="../reference/luz_callback_early_stopping.html">?luz_callback_early_stopping</a></code> has a weight of
+<code>Inf</code>, since in general we want to run it as the last thing
+in the loop.</p>
+<p>The <code>ctx</code> object is used in luz to share information
+between the training loop and callbacks, model methods, and metrics. The
+table below describes information available in the <code>ctx</code> by
+default. Other callbacks could potentially modify these attributes or
+add new ones.</p>
 <!-- It's recommended to use the RStudio Visual editor to edit this table. -->
 <table class="table">
 <caption>Context attributes</caption>
@@ -338,15 +469,19 @@ <h2 id="customizing-with-callbacks">Customizing with callbacks<a class="anchor"
 <tbody>
 <tr class="odd">
 <td><code>verbose</code></td>
-<td>The value (<code>TRUE</code> or <code>FALSE</code>) attributed to the <code>verbose</code> argument in <code>fit</code> .</td>
+<td>The value (<code>TRUE</code> or <code>FALSE</code>) attributed to
+the <code>verbose</code> argument in <code>fit</code>.</td>
 </tr>
 <tr class="even">
 <td><code>accelerator</code></td>
-<td>Accelerator object used to query the correct device to place models, data, etc. It assumes the value passed to the <code>accelerator</code> parameter in <code>fit</code>.</td>
+<td>Accelerator object used to query the correct device to place models,
+data, etc. It assumes the value passed to the <code>accelerator</code>
+parameter in <code>fit</code>.</td>
 </tr>
 <tr class="odd">
 <td><code>model</code></td>
-<td>Initialized <code>nn_module</code> object that will be trained during the <code>fit</code> procedure.</td>
+<td>Initialized <code>nn_module</code> object that will be trained
+during the <code>fit</code> procedure.</td>
 </tr>
 <tr class="even">
 <td><code>optimizers</code></td>
@@ -354,15 +489,20 @@ <h2 id="customizing-with-callbacks">Customizing with callbacks<a class="anchor"
 </tr>
 <tr class="odd">
 <td><code>data</code></td>
-<td>The currently in-use dataloader. When training it’s <code>ctx$train_data</code>, when doing validation its <code>ctx$valid_data</code>. It can also be the prediction dataset when in <code>predict</code>.</td>
+<td>The currently in-use dataloader. When training it’s
+<code>ctx$train_data</code>, when doing validation it’s
+<code>ctx$valid_data</code>. It can also be the prediction dataset when
+in <code>predict</code>.</td>
 </tr>
 <tr class="even">
 <td><code>train_data</code></td>
-<td>Dataloader passed to the <code>data</code> argument in <code>fit</code>. Modified to yield data in the selected device.</td>
+<td>Dataloader passed to the <code>data</code> argument in
+<code>fit</code>. Modified to yield data in the selected device.</td>
 </tr>
 <tr class="odd">
 <td><code>valid_data</code></td>
-<td>Dataloader passed to the <code>valid_data</code> argument in <code>fit</code>. Modified to yield data in the selected device.</td>
+<td>Dataloader passed to the <code>valid_data</code> argument in
+<code>fit</code>. Modified to yield data in the selected device.</td>
 </tr>
 <tr class="even">
 <td><code>min_epochs</code></td>
@@ -378,86 +518,128 @@ <h2 id="customizing-with-callbacks">Customizing with callbacks<a class="anchor"
 </tr>
 <tr class="odd">
 <td><code>iter</code></td>
-<td>Current training iteration. It’s reset every epoch and when going from training to validation.</td>
+<td>Current training iteration. It’s reset every epoch and when going
+from training to validation.</td>
 </tr>
 <tr class="even">
 <td><code>training</code></td>
-<td>Whether the model is in training or validation mode. See also <code><a href="../reference/luz_callback_train_valid.html">help("luz_callback_train_valid")</a></code>
+<td>Whether the model is in training or validation mode. See also
+<code><a href="../reference/luz_callback_train_valid.html">help("luz_callback_train_valid")</a></code>
 </td>
 </tr>
 <tr class="odd">
 <td><code>callbacks</code></td>
-<td>List of callbacks that will be called during the training procedure. It’s the union of the list passed to the <code>callbacks</code> parameter and the default <code>callbacks</code>.</td>
+<td>List of callbacks that will be called during the training procedure.
+It’s the union of the list passed to the <code>callbacks</code>
+parameter and the default <code>callbacks</code>.</td>
 </tr>
 <tr class="even">
 <td><code>step</code></td>
-<td>Closure that will be used to do one <code>step</code> of the model. It’s used for both training and validation. Takes no argument, but can access the <code>ctx</code> object.</td>
+<td>Closure that will be used to do one <code>step</code> of the model.
+It’s used for both training and validation. Takes no argument, but can
+access the <code>ctx</code> object.</td>
 </tr>
 <tr class="odd">
 <td><code>call_callbacks</code></td>
-<td>Call callbacks by name. For example <code>call_callbacks("on_train_begin")</code> will call all callbacks that provide methods for this point.</td>
+<td>Call callbacks by name. For example
+<code>call_callbacks("on_train_begin")</code> will call all callbacks
+that provide methods for this point.</td>
 </tr>
 <tr class="even">
 <td><code>batch</code></td>
-<td>Last batch obtained by the dataloader. A batch is a <code><a href="https://rdrr.io/r/base/list.html" class="external-link">list()</a></code> with 2 elements, one that is used as <code>input</code> and the other as <code>target</code>.</td>
+<td>Last batch obtained by the dataloader. A batch is a
+<code><a href="https://rdrr.io/r/base/list.html" class="external-link">list()</a></code> with 2 elements, one that is used as
+<code>input</code> and the other as <code>target</code>.</td>
 </tr>
 <tr class="odd">
 <td><code>input</code></td>
-<td>First element of the last batch obtained by the current dataloader.</td>
+<td>First element of the last batch obtained by the current
+dataloader.</td>
 </tr>
 <tr class="even">
 <td><code>target</code></td>
-<td>Second element of the last batch obtained by the current dataloader.</td>
+<td>Second element of the last batch obtained by the current
+dataloader.</td>
 </tr>
 <tr class="odd">
 <td><code>pred</code></td>
-<td>Last predictions obtained by <code>ctx$model$forward</code> . <strong>Note:</strong> can be potentially modified by previously ran callbacks. Also note that this might not be available if you used a custom training step.</td>
+<td>Last predictions obtained by <code>ctx$model$forward</code>.
+<strong>Note:</strong> can be potentially modified by previously run
+callbacks. Also note that this might not be available if you used a
+custom training step.</td>
 </tr>
 <tr class="even">
 <td><code>loss_fn</code></td>
-<td>The active loss function that will be minimized during training.</td>
+<td>The active loss function that will be minimized during
+training.</td>
 </tr>
 <tr class="odd">
 <td><code>loss</code></td>
-<td>Last computed loss from the model. <strong>Note:</strong> this might not be available if you modified the training or validation step.</td>
+<td>Last computed loss from the model. <strong>Note:</strong> this might
+not be available if you modified the training or validation step.</td>
 </tr>
 <tr class="even">
 <td><code>opt</code></td>
-<td>Current optimizer, ie. the optimizer that will be used to do the next <code>step</code> to update parameters.</td>
+<td>Current optimizer, i.e. the optimizer that will be used to do the
+next <code>step</code> to update parameters.</td>
 </tr>
 <tr class="odd">
 <td><code>opt_nm</code></td>
-<td>Current optimizer name. By default it’s <code>opt</code> , but can change if your model uses more than one optimizer depending on the set of parameters being optimized.</td>
+<td>Current optimizer name. By default it’s <code>opt</code>, but can
+change if your model uses more than one optimizer depending on the set
+of parameters being optimized.</td>
 </tr>
 <tr class="even">
 <td><code>metrics</code></td>
 <td>
-<code><a href="https://rdrr.io/r/base/list.html" class="external-link">list()</a></code> with current metric objects that are <code>update</code>d at every <code>on_train_batch_end()</code> or <code>on_valid_batch_end()</code>. See also <code><a href="../reference/luz_callback_metrics.html">help("luz_callback_metrics")</a></code>
+<code><a href="https://rdrr.io/r/base/list.html" class="external-link">list()</a></code> with current metric objects that are
+<code>update</code>d at every <code>on_train_batch_end()</code> or
+<code>on_valid_batch_end()</code>. See also
+<code><a href="../reference/luz_callback_metrics.html">help("luz_callback_metrics")</a></code>
 </td>
 </tr>
 <tr class="odd">
 <td><code>records</code></td>
 <td>
-<code><a href="https://rdrr.io/r/base/list.html" class="external-link">list()</a></code> recording metric values for training and validation for each epoch. See also <code><a href="../reference/luz_callback_metrics.html">help("luz_callback_metrics")</a></code> . Also records profiling metrics. See <code><a href="../reference/luz_callback_profile.html">help("luz_callback_profile")</a></code> for more information.</td>
+<code><a href="https://rdrr.io/r/base/list.html" class="external-link">list()</a></code> recording metric values for training and
+validation for each epoch. See also
+<code><a href="../reference/luz_callback_metrics.html">help("luz_callback_metrics")</a></code>. Also records profiling
+metrics. See <code><a href="../reference/luz_callback_profile.html">help("luz_callback_profile")</a></code> for more
+information.</td>
 </tr>
 <tr class="even">
 <td><code>handlers</code></td>
-<td>A named <code><a href="https://rdrr.io/r/base/list.html" class="external-link">list()</a></code> of handlers that is passed to <code><a href="https://rlang.r-lib.org/reference/with_handlers.html" class="external-link">rlang::with_handlers()</a></code> during the training loop and can be used to handle errors or conditions that might be raised by other callbacks.</td>
+<td>A named <code><a href="https://rdrr.io/r/base/list.html" class="external-link">list()</a></code> of handlers that is passed to
+<code><a href="https://rlang.r-lib.org/reference/with_handlers.html" class="external-link">rlang::with_handlers()</a></code> during the training loop and can be
+used to handle errors or conditions that might be raised by other
+callbacks.</td>
 </tr>
 <tr class="odd">
 <td><code>epoch_handlers</code></td>
-<td>A named list of handlers that is used with <code><a href="https://rlang.r-lib.org/reference/with_handlers.html" class="external-link">rlang::with_handlers()</a></code>. Those handlers are used inside the epochs loop, thus you can handle epoch specific conditions, that won’t necessarily end training.</td>
+<td>A named list of handlers that is used with
+<code><a href="https://rlang.r-lib.org/reference/with_handlers.html" class="external-link">rlang::with_handlers()</a></code>. Those handlers are used inside the
+epochs loop, thus you can handle epoch-specific conditions that won’t
+necessarily end training.</td>
 </tr>
 </tbody>
 </table>
-<p>Attributes in <code>ctx</code> can be used to produce the desired behavior of callbacks. You can find information about the context object using <code><a href="../reference/ctx.html">help("ctx")</a></code>. In our example, we use the <code>ctx$iter</code> attribute to print the iteration number for each training batch.</p>
+<p>Attributes in <code>ctx</code> can be used to produce the desired
+behavior of callbacks. You can find information about the context object
+using <code><a href="../reference/ctx.html">help("ctx")</a></code>. In our example, we use the
+<code>ctx$iter</code> attribute to print the iteration number for each
+training batch.</p>
 </div>
 <div class="section level2">
 <h2 id="next-steps">Next steps<a class="anchor" aria-label="anchor" href="#next-steps"></a>
 </h2>
-<p>In this article you learned how to train your first model using luz and the basics of customization using both custom metrics and callbacks.</p>
-<p>Luz also allows more flexible modifications of the training loop described in <code><a href="../articles/custom-loop.html">vignette("custom-loop")</a></code>.</p>
-<p>You should now be able to follow the examples marked with the ‘basic’ category in the <a href="https://mlverse.github.io/luz/articles/examples/index.html" class="external-link">examples gallery</a>.</p>
+<p>In this article you learned how to train your first model using luz
+and the basics of customization using both custom metrics and
+callbacks.</p>
+<p>Luz also allows more flexible modifications of the training loop
+described in <code><a href="../articles/custom-loop.html">vignette("custom-loop")</a></code>.</p>
+<p>You should now be able to follow the examples marked with the ‘basic’
+category in the <a href="https://mlverse.github.io/luz/articles/examples/index.html" class="external-link">examples
+gallery</a>.</p>
 </div>
   </main><aside class="col-md-3"><nav id="toc"><h2>On this page</h2>
     </nav></aside>
@@ -472,7 +654,7 @@ <h2 id="next-steps">Next steps<a class="anchor" aria-label="anchor" href="#next-
 
 <div class="pkgdown-footer-right">
   <p></p>
-<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer>
diff --git a/articles/get-started_files/accessible-code-block-0.0.1/empty-anchor.js b/articles/get-started_files/accessible-code-block-0.0.1/empty-anchor.js
deleted file mode 100644
index ca349fd6..00000000
--- a/articles/get-started_files/accessible-code-block-0.0.1/empty-anchor.js
+++ /dev/null
@@ -1,15 +0,0 @@
-// Hide empty <a> tag within highlighted CodeBlock for screen reader accessibility (see https://github.com/jgm/pandoc/issues/6352#issuecomment-626106786) -->
-// v0.0.1
-// Written by JooYoung Seo (jooyoung@psu.edu) and Atsushi Yasumoto on June 1st, 2020.
-
-document.addEventListener('DOMContentLoaded', function() {
-  const codeList = document.getElementsByClassName("sourceCode");
-  for (var i = 0; i < codeList.length; i++) {
-    var linkList = codeList[i].getElementsByTagName('a');
-    for (var j = 0; j < linkList.length; j++) {
-      if (linkList[j].innerHTML === "") {
-        linkList[j].setAttribute('aria-hidden', 'true');
-      }
-    }
-  }
-});
diff --git a/articles/index.html b/articles/index.html
index 2bd5852c..81608f8f 100644
--- a/articles/index.html
+++ b/articles/index.html
@@ -63,21 +63,19 @@ <h3>All vignettes</h3>
       <p class="section-desc"></p>
 
       <dl><dt><a href="accelerator.html">Accelerator API</a></dt>
-        <dd>
-        </dd><dt><a href="examples/chargpt.html">CharGPT</a></dt>
         <dd>
         </dd><dt><a href="checkpoints.html">Checkpointing your models</a></dt>
         <dd>
+        </dd><dt><a href="lr-finder.html">Using the learning rate finder</a></dt>
+        <dd>
         </dd><dt><a href="custom-loop.html">Custom loops with luz</a></dt>
         <dd>
-        </dd><dt><a href="examples/dogs-vs-cats-binary-classification.html">Binary classification</a></dt>
+        </dd><dt><a href="examples/chargpt.html">CharGPT</a></dt>
         <dd>
-        </dd><dt><a href="get-started.html">Get started with luz</a></dt>
+        </dd><dt><a href="examples/dogs-vs-cats-binary-classification.html">Binary classification</a></dt>
         <dd>
         </dd><dt><a href="examples/index.html">Examples</a></dt>
         <dd>
-        </dd><dt><a href="lr-finder.html">Using the learning rate finder</a></dt>
-        <dd>
         </dd><dt><a href="examples/mnist-autoencoder.html">Autoencoder</a></dt>
         <dd>
         </dd><dt><a href="examples/mnist-cnn-virtual-batch-size.html">Virtual batch size</a></dt>
@@ -94,6 +92,10 @@ <h3>All vignettes</h3>
         <dd>
         </dd><dt><a href="examples/text-classification.html">Text classification from scratch</a></dt>
         <dd>
+        </dd><dt><a href="examples/text-generation.html">Training a causal language model from scratch</a></dt>
+        <dd>
+        </dd><dt><a href="get-started.html">Get started with luz</a></dt>
+        <dd>
       </dd></dl></div>
   </main></div>
 
@@ -103,7 +105,7 @@ <h3>All vignettes</h3>
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/articles/lr-finder.html b/articles/lr-finder.html
index 876a03ce..7a8d20d0 100644
--- a/articles/lr-finder.html
+++ b/articles/lr-finder.html
@@ -77,7 +77,8 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
 
 
-<script src="lr-finder_files/accessible-code-block-0.0.1/empty-anchor.js"></script><div class="row">
+
+<div class="row">
   <main id="main" class="col-md-9"><div class="page-header">
       <img src="" class="logo" alt=""><h1>Using the learning rate finder</h1>
             
@@ -94,9 +95,25 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 <span><span class="kw"><a href="https://rdrr.io/r/base/library.html" class="external-link">library</a></span><span class="op">(</span><span class="va"><a href="https://torchvision.mlverse.org" class="external-link">torchvision</a></span><span class="op">)</span></span>
 <span><span class="fu"><a href="https://rdrr.io/r/base/Random.html" class="external-link">set.seed</a></span><span class="op">(</span><span class="fl">1</span><span class="op">)</span></span>
 <span><span class="fu">torch</span><span class="fu">::</span><span class="fu"><a href="https://rdrr.io/pkg/torch/man/torch_manual_seed.html" class="external-link">torch_manual_seed</a></span><span class="op">(</span><span class="fl">1703</span><span class="op">)</span></span></code></pre></div>
-<p>In this article we discuss how to find a good learning rate for your model. Finding a good learning rate is essential to be able to fit your model. If it’s too low, you will need too many iterations for your loss to converge, and that might be impractical if your model takes too long to run. If it’s too high, the loss can explode and you might never be able to minimize the loss.</p>
-<p>The learning rate can be considered another hyperparameter of your model that needs to be tuned but, there are techniques that allow you to select a good learning rate for your model without having to use the costly strategy of fitting many models with different learning rates and then choosing the one with better results.</p>
-<p>This <a href="https://arxiv.org/abs/1506.01186" class="external-link">article</a> by Leslie Smith that became popular once their approach had been implemented in the popular FastAI framework, proposes that we should start with a very small learning rate and slowly increase it until we reach a high learning rate. At each iteration we record the loss value and in the end we plot it against the learning rate. We can then use these results to decide on a good learning rate. That’s what <code>lr_finder</code> does, and we will show how to use it.</p>
+<p>In this article we discuss how to find a good learning rate for your
+model. Finding a good learning rate is essential to be able to fit your
+model. If it’s too low, you will need too many iterations for your loss
+to converge, and that might be impractical if your model takes too long
+to run. If it’s too high, the loss can explode and you might never be
+able to minimize the loss.</p>
+<p>The learning rate can be considered another hyperparameter of your
+model that needs to be tuned, but there are techniques that allow you to
+select a good learning rate for your model without having to use the
+costly strategy of fitting many models with different learning rates and
+then choosing the one with better results.</p>
+<p>This <a href="https://arxiv.org/abs/1506.01186" class="external-link">article</a> by Leslie
+Smith, which became popular once the approach had been implemented in
+the popular FastAI framework, proposes that we should start with a very
+small learning rate and slowly increase it until we reach a high
+learning rate. At each iteration we record the loss value and in the end
+we plot it against the learning rate. We can then use these results to
+decide on a good learning rate. That’s what <code>lr_finder</code> does,
+and we will show how to use it.</p>
 <p>First let’s download and prepare the MNIST dataset:</p>
 <div class="sourceCode" id="cb2"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">dir</span> <span class="op">&lt;-</span> <span class="st">"~/Downloads/mnist"</span> <span class="co"># caching directory</span></span>
@@ -108,7 +125,8 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 <span><span class="op">)</span></span>
 <span><span class="co">#&gt; Processing...</span></span>
 <span><span class="co">#&gt; Done!</span></span></code></pre></div>
-<p>We can now define our model. We are going to use a small, straightforward CNN in the LeNet style.</p>
+<p>We can now define our model. We are going to use a small,
+straightforward CNN in the LeNet style.</p>
 <div class="sourceCode" id="cb3"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">net</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/pkg/torch/man/nn_module.html" class="external-link">nn_module</a></span><span class="op">(</span></span>
 <span>  <span class="st">"net"</span>,</span>
@@ -135,7 +153,11 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 <span>      <span class="va">self</span><span class="op">$</span><span class="fu">classifier</span><span class="op">(</span><span class="op">)</span></span>
 <span>  <span class="op">}</span></span>
 <span><span class="op">)</span></span></code></pre></div>
-<p>We can now use the <code>lr_finder</code> function to record the loss with different learning rates. It’s important to use the learning rate finder with all other hyperparameters of the model fixed because they can influence the choice of the learning rate. For example, depending on the batch size, you might want to choose different learning rates.</p>
+<p>We can now use the <code>lr_finder</code> function to record the loss
+with different learning rates. It’s important to use the learning rate
+finder with all other hyperparameters of the model fixed because they
+can influence the choice of the learning rate. For example, depending on
+the batch size, you might want to choose different learning rates.</p>
 <div class="sourceCode" id="cb4"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">model</span> <span class="op">&lt;-</span> <span class="va">net</span> <span class="op"><a href="../reference/pipe.html">%&gt;%</a></span> <span class="fu"><a href="../reference/setup.html">setup</a></span><span class="op">(</span></span>
 <span>  loss <span class="op">=</span> <span class="fu">torch</span><span class="fu">::</span><span class="fu"><a href="https://rdrr.io/pkg/torch/man/nn_cross_entropy_loss.html" class="external-link">nn_cross_entropy_loss</a></span><span class="op">(</span><span class="op">)</span>,</span>
@@ -155,15 +177,26 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 <span><span class="co">#&gt; Classes 'lr_records' and 'data.frame':   100 obs. of  2 variables:</span></span>
 <span><span class="co">#&gt;  $ lr  : num  1.15e-06 1.32e-06 1.51e-06 1.74e-06 2.00e-06 ...</span></span>
 <span><span class="co">#&gt;  $ loss: num  2.31 2.3 2.29 2.3 2.31 ...</span></span></code></pre></div>
-<p>The result is a data frame with the losses and the learning rate in each step. You can use the built-in plot method to display the exact results, along with a exponentially smoothed value of the loss.</p>
+<p>The result is a data frame with the losses and the learning rate in
+each step. You can use the built-in plot method to display the exact
+results, along with an exponentially smoothed value of the loss.</p>
 <div class="sourceCode" id="cb5"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="fu"><a href="https://rdrr.io/r/graphics/plot.default.html" class="external-link">plot</a></span><span class="op">(</span><span class="va">records</span><span class="op">)</span> <span class="op">+</span></span>
 <span>  <span class="fu">ggplot2</span><span class="fu">::</span><span class="fu"><a href="https://ggplot2.tidyverse.org/reference/coord_cartesian.html" class="external-link">coord_cartesian</a></span><span class="op">(</span>ylim <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/r/base/c.html" class="external-link">c</a></span><span class="op">(</span><span class="cn">NA</span>, <span class="fl">5</span><span class="op">)</span><span class="op">)</span></span></code></pre></div>
 <p><img src="lr-finder_files/figure-html/unnamed-chunk-5-1.png" width="700"></p>
-<p>We can see that with small learning rates the loss doesn’t decrease. At some point the loss starts decreasing until it reaches a point where it starts increasing and explodes.</p>
-<p>And how do we choose the learning rate using this plot? Sylvain Gugger asked the same question in this <a href="https://sgugger.github.io/how-do-you-find-a-good-learning-rate.html" class="external-link">blog post</a> and we are quoting his answer:</p>
+<p>We can see that with small learning rates the loss doesn’t decrease.
+At some point the loss starts decreasing until it reaches a point where
+it starts increasing and explodes.</p>
+<p>And how do we choose the learning rate using this plot? Sylvain
+Gugger asked the same question in this <a href="https://sgugger.github.io/how-do-you-find-a-good-learning-rate.html" class="external-link">blog
+post</a> and we are quoting his answer:</p>
 <blockquote>
-<p>Not the one corresponding to the minimum. Why? Well the learning rate that corresponds to the minimum value is already a bit too high, since we are at the edge between improving and getting all over the place. We want to go one order of magnitude before, a value that’s still aggressive (so that we train quickly) but still on the safe side from an explosion.</p>
+<p>Not the one corresponding to the minimum. Why? Well the learning rate
+that corresponds to the minimum value is already a bit too high, since
+we are at the edge between improving and getting all over the place. We
+want to go one order of magnitude before, a value that’s still
+aggressive (so that we train quickly) but still on the safe side from an
+explosion.</p>
 </blockquote>
 <p>In the above example we would choose 1e-3 instead of 1e-2.</p>
   </main>
@@ -178,7 +211,7 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 
 <div class="pkgdown-footer-right">
   <p></p>
-<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer>
diff --git a/articles/lr-finder_files/accessible-code-block-0.0.1/empty-anchor.js b/articles/lr-finder_files/accessible-code-block-0.0.1/empty-anchor.js
deleted file mode 100644
index ca349fd6..00000000
--- a/articles/lr-finder_files/accessible-code-block-0.0.1/empty-anchor.js
+++ /dev/null
@@ -1,15 +0,0 @@
-// Hide empty <a> tag within highlighted CodeBlock for screen reader accessibility (see https://github.com/jgm/pandoc/issues/6352#issuecomment-626106786) -->
-// v0.0.1
-// Written by JooYoung Seo (jooyoung@psu.edu) and Atsushi Yasumoto on June 1st, 2020.
-
-document.addEventListener('DOMContentLoaded', function() {
-  const codeList = document.getElementsByClassName("sourceCode");
-  for (var i = 0; i < codeList.length; i++) {
-    var linkList = codeList[i].getElementsByTagName('a');
-    for (var j = 0; j < linkList.length; j++) {
-      if (linkList[j].innerHTML === "") {
-        linkList[j].setAttribute('aria-hidden', 'true');
-      }
-    }
-  }
-});
diff --git a/authors.html b/authors.html
index 91358a77..60012493 100644
--- a/authors.html
+++ b/authors.html
@@ -95,7 +95,7 @@ <h2 id="citation">Citation</h2>
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/index.html b/index.html
index 6d2484de..02b9dcd1 100644
--- a/index.html
+++ b/index.html
@@ -5,14 +5,24 @@
 <meta charset="utf-8">
 <meta http-equiv="X-UA-Compatible" content="IE=edge">
 <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
-<meta name="description" content="A high level interface for torch providing utilities to reduce the the amount of code needed for common tasks, abstract away torch details and make the same code work on both the CPU and GPU. Its flexible enough to support expressing a large range of models. Its heavily inspired by fastai by Howard et al. (2020) &lt;arXiv:2002.04688&gt;, Keras by Chollet et al. (2015) and PyTorch Lightning by Falcon et al. (2019) &lt;doi:10.5281/zenodo.3828935&gt;.">
+<meta name="description" content="A high level interface for torch providing utilities to reduce the
+    amount of code needed for common tasks, abstract away torch details and 
+    make the same code work on both the CPU and GPU. It's flexible enough to
+    support expressing a large range of models. It's heavily inspired by fastai by 
+    Howard et al. (2020) &lt;arXiv:2002.04688&gt;, Keras by Chollet et al. (2015) and 
+    PyTorch Lightning by Falcon et al. (2019) &lt;doi:10.5281/zenodo.3828935&gt;.">
 <title>Higher Level API for torch • luz</title>
 <script src="deps/jquery-3.6.0/jquery-3.6.0.min.js"></script><meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
 <link href="deps/bootstrap-5.2.2/bootstrap.min.css" rel="stylesheet">
 <script src="deps/bootstrap-5.2.2/bootstrap.bundle.min.js"></script><!-- Font Awesome icons --><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/all.min.css" integrity="sha256-mmgLkCYLUQbXn0B1SRqzHar6dCnv9oZFPEC1g1cwlkk=" crossorigin="anonymous">
 <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/v4-shims.min.css" integrity="sha256-wZjR52fzng1pJHwx4aV2AO3yyTOXrcDW7jBpJtTwVxw=" crossorigin="anonymous">
 <!-- bootstrap-toc --><script src="https://cdn.jsdelivr.net/gh/afeld/bootstrap-toc@v1.0.1/dist/bootstrap-toc.min.js" integrity="sha256-4veVQbu7//Lk5TSmc7YV48MxtMy98e26cf5MrgZYnwo=" crossorigin="anonymous"></script><!-- headroom.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/headroom.min.js" integrity="sha256-AsUX4SJE1+yuDu5+mAVzJbuYNPHj/WroHuZ8Ir/CkE0=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/jQuery.headroom.min.js" integrity="sha256-ZX/yNShbjqsohH1k95liqY9Gd8uOiE1S4vZc+9KQ1K4=" crossorigin="anonymous"></script><!-- clipboard.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/2.0.6/clipboard.min.js" integrity="sha256-inc5kl9MA1hkeYUt+EC3BhlIgyp/2jDIyBLS6k3UxPI=" crossorigin="anonymous"></script><!-- search --><script src="https://cdnjs.cloudflare.com/ajax/libs/fuse.js/6.4.6/fuse.js" integrity="sha512-zv6Ywkjyktsohkbp9bb45V6tEMoWhzFzXis+LrMehmJZZSys19Yxf1dopHx7WzIKxr5tK2dVcYmaCk2uqdjF4A==" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/autocomplete.js/0.38.0/autocomplete.jquery.min.js" integrity="sha512-GU9ayf+66Xx2TmpxqJpliWbT5PiGYxpaG8rfnBEk1LL8l1KGkRShhngwdXK1UgqhAzWpZHSiYPc09/NwDQIGyg==" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/mark.js/8.11.1/mark.min.js" integrity="sha512-5CYOlHXGh6QpOFA/TeTylKLWfB3ftPsde7AnmhuitiTX4K5SqCLBeKro6sPS8ilsz1Q4NRx3v8Ko2IBiszzdww==" crossorigin="anonymous"></script><!-- pkgdown --><script src="pkgdown.js"></script><meta property="og:title" content="Higher Level API for torch">
-<meta property="og:description" content="A high level interface for torch providing utilities to reduce the the amount of code needed for common tasks, abstract away torch details and make the same code work on both the CPU and GPU. Its flexible enough to support expressing a large range of models. Its heavily inspired by fastai by Howard et al. (2020) &lt;arXiv:2002.04688&gt;, Keras by Chollet et al. (2015) and PyTorch Lightning by Falcon et al. (2019) &lt;doi:10.5281/zenodo.3828935&gt;.">
+<meta property="og:description" content="A high level interface for torch providing utilities to reduce the
+    amount of code needed for common tasks, abstract away torch details and 
+    make the same code work on both the CPU and GPU. It's flexible enough to
+    support expressing a large range of models. It's heavily inspired by fastai by 
+    Howard et al. (2020) &lt;arXiv:2002.04688&gt;, Keras by Chollet et al. (2015) and 
+    PyTorch Lightning by Falcon et al. (2019) &lt;doi:10.5281/zenodo.3828935&gt;.">
 <!-- mathjax --><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js" integrity="sha256-nvJJv9wWKEm88qvoQl9ekL2J+k/RWIsaSScxxlsrv8k=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/config/TeX-AMS-MML_HTMLorMML.js" integrity="sha256-84DKXVJXs0/F8OTMzX4UR909+jtl4G7SPypPavF+GfA=" crossorigin="anonymous"></script><!--[if lt IE 9]>
 <script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
 <script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
@@ -79,6 +89,7 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 <div class="page-header"><h1 id="luz">luz<a class="anchor" aria-label="anchor" href="#luz"></a>
 </h1></div>
 <!-- badges: start -->
+
 <p>Luz is a higher level API for torch providing abstractions to allow for much less verbose training loops.</p>
 <p>This package is still under development.</p>
 <p>It is heavily inspired by other higher level frameworks for deep learning, to cite a few:</p>
@@ -190,7 +201,7 @@ <h2 data-toc-skip>Dev status</h2>
 
 <div class="pkgdown-footer-right">
   <p></p>
-<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer>
diff --git a/news/index.html b/news/index.html
index 1caa70f2..488f9cb4 100644
--- a/news/index.html
+++ b/news/index.html
@@ -101,7 +101,8 @@ <h3 id="bug-fixes-0-4-0">Bug fixes<a class="anchor" aria-label="anchor" href="#b
 </div>
     <div class="section level2">
 <h2 class="pkg-version" data-toc-text="0.3.1" id="luz-031">luz 0.3.1<a class="anchor" aria-label="anchor" href="#luz-031"></a></h2><p class="text-muted">CRAN release: 2022-09-06</p>
-<ul><li>Re-submission to fix vignette rendering.</li></ul></div>
+<ul><li>Re-submission to fix vignette rendering.</li>
+</ul></div>
     <div class="section level2">
 <h2 class="pkg-version" data-toc-text="0.3.0" id="luz-030">luz 0.3.0<a class="anchor" aria-label="anchor" href="#luz-030"></a></h2><p class="text-muted">CRAN release: 2022-08-19</p>
 <div class="section level3">
@@ -113,7 +114,8 @@ <h3 id="breaking-changes-0-3-0">Breaking changes<a class="anchor" aria-label="an
 </ul></div>
 <div class="section level3">
 <h3 id="documentation-0-3-0">Documentation<a class="anchor" aria-label="anchor" href="#documentation-0-3-0"></a></h3>
-<ul><li>Many wording improvements in the getting started guides (<a href="https://github.com/mlverse/luz/issues/81" class="external-link">#81</a> <a href="https://github.com/mlverse/luz/issues/94" class="external-link">#94</a>, <a href="https://github.com/jonthegeek" class="external-link">@jonthegeek</a>).</li></ul></div>
+<ul><li>Many wording improvements in the getting started guides (<a href="https://github.com/mlverse/luz/issues/81" class="external-link">#81</a> <a href="https://github.com/mlverse/luz/issues/94" class="external-link">#94</a>, <a href="https://github.com/jonthegeek" class="external-link">@jonthegeek</a>).</li>
+</ul></div>
 <div class="section level3">
 <h3 id="new-features-0-3-0">New features<a class="anchor" aria-label="anchor" href="#new-features-0-3-0"></a></h3>
 <ul><li>Added MixUp callback and helper loss function and functional logic. (<a href="https://github.com/mlverse/luz/issues/82" class="external-link">#82</a>, <a href="https://github.com/skeydan" class="external-link">@skeydan</a>).</li>
@@ -151,7 +153,8 @@ <h3 id="internal-changes-0-2-0">Internal changes<a class="anchor" aria-label="an
 </div>
     <div class="section level2">
 <h2 class="pkg-version" data-toc-text="0.1.0" id="luz-010">luz 0.1.0<a class="anchor" aria-label="anchor" href="#luz-010"></a></h2><p class="text-muted">CRAN release: 2021-06-17</p>
-<ul><li>Added a <code>NEWS.md</code> file to track changes to the package.</li></ul></div>
+<ul><li>Added a <code>NEWS.md</code> file to track changes to the package.</li>
+</ul></div>
   </main><aside class="col-md-3"><nav id="toc"><h2>On this page</h2>
     </nav></aside></div>
 
@@ -161,7 +164,7 @@ <h2 class="pkg-version" data-toc-text="0.1.0" id="luz-010">luz 0.1.0<a class="an
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/pkgdown.yml b/pkgdown.yml
index 3b8ad870..5832483b 100644
--- a/pkgdown.yml
+++ b/pkgdown.yml
@@ -1,15 +1,14 @@
-pandoc: 2.7.3
-pkgdown: 2.0.7.9000
-pkgdown_sha: c9206802f2888992de92aa41f517ba7812f05331
+pandoc: 2.19.2
+pkgdown: 2.0.7
+pkgdown_sha: ~
 articles:
   accelerator: accelerator.html
-  chargpt: examples/chargpt.html
   checkpoints: checkpoints.html
+  lr-finder: lr-finder.html
   custom-loop: custom-loop.html
+  chargpt: examples/chargpt.html
   dogs-vs-cats-binary-classification: examples/dogs-vs-cats-binary-classification.html
-  get-started: get-started.html
   index: examples/index.html
-  lr-finder: lr-finder.html
   mnist-autoencoder: examples/mnist-autoencoder.html
   mnist-cnn-virtual-batch-size: examples/mnist-cnn-virtual-batch-size.html
   mnist-cnn: examples/mnist-cnn.html
@@ -18,5 +17,7 @@ articles:
   mnist-triplet: examples/mnist-triplet.html
   pets-unet: examples/pets-unet.html
   text-classification: examples/text-classification.html
-last_built: 2023-09-15T17:29Z
+  text-generation: examples/text-generation.html
+  get-started: get-started.html
+last_built: 2023-10-17T16:26Z
 
diff --git a/reference/accelerator.html b/reference/accelerator.html
index 6e94c1a0..ad0990cf 100644
--- a/reference/accelerator.html
+++ b/reference/accelerator.html
@@ -99,7 +99,7 @@ <h2 id="arguments">Arguments<a class="anchor" aria-label="anchor" href="#argumen
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/as_dataloader.html b/reference/as_dataloader.html
index e9bfbfe2..404c070e 100644
--- a/reference/as_dataloader.html
+++ b/reference/as_dataloader.html
@@ -159,7 +159,7 @@ <h2 id="overriding">Overriding<a class="anchor" aria-label="anchor" href="#overr
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/context.html b/reference/context.html
index 9c598532..a377b0d8 100644
--- a/reference/context.html
+++ b/reference/context.html
@@ -517,7 +517,7 @@ <h4 id="arguments-10">Arguments<a class="anchor" aria-label="anchor" href="#argu
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/ctx.html b/reference/ctx.html
index fe747f6a..0e921c9b 100644
--- a/reference/ctx.html
+++ b/reference/ctx.html
@@ -90,7 +90,7 @@ <h2 id="see-also">See also<a class="anchor" aria-label="anchor" href="#see-also"
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/evaluate.html b/reference/evaluate.html
index 9bdcad2d..9ef5c86f 100644
--- a/reference/evaluate.html
+++ b/reference/evaluate.html
@@ -141,12 +141,12 @@ <h2 id="details">Details<a class="anchor" aria-label="anchor" href="#details"></
 <p></p><div class="sourceCode r"><pre><code><span><span class="va">evaluation</span> <span class="op">&lt;-</span> <span class="va">fitted</span> <span class="op"><a href="../reference/pipe.html">%&gt;%</a></span> <span class="fu"><a href="../reference/evaluate.html">evaluate</a></span><span class="op">(</span>data <span class="op">=</span> <span class="va">valid_dl</span><span class="op">)</span></span>
 <span><span class="va">metrics</span> <span class="op">&lt;-</span> <span class="fu"><a href="../reference/get_metrics.html">get_metrics</a></span><span class="op">(</span><span class="va">evaluation</span><span class="op">)</span></span>
 <span><span class="fu"><a href="https://rdrr.io/r/base/print.html" class="external-link">print</a></span><span class="op">(</span><span class="va">evaluation</span><span class="op">)</span></span></code></pre><p></p></div>
-<p></p><div class="sourceCode"><pre><code><span id="cb1-1"><a href="#cb1-1"></a><span class="co">## A `luz_module_evaluation`</span></span>
-<span id="cb1-2"><a href="#cb1-2"></a><span class="co">## -- Results ---------------------------------------------------------------------</span></span>
-<span id="cb1-3"><a href="#cb1-3"></a><span class="co">## loss: 1.5146</span></span>
-<span id="cb1-4"><a href="#cb1-4"></a><span class="co">## mae: 1.0251</span></span>
-<span id="cb1-5"><a href="#cb1-5"></a><span class="co">## mse: 1.5159</span></span>
-<span id="cb1-6"><a href="#cb1-6"></a><span class="co">## rmse: 1.2312</span></span></code></pre><p></p></div>
+<p></p><div class="sourceCode"><pre><code><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="do">## A `luz_module_evaluation`</span></span>
+<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="do">## -- Results ---------------------------------------------------------------------</span></span>
+<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a><span class="do">## loss: 1.5146</span></span>
+<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a><span class="do">## mae: 1.0251</span></span>
+<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a><span class="do">## mse: 1.5159</span></span>
+<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a><span class="do">## rmse: 1.2312</span></span></code></pre><p></p></div>
     </div>
     <div class="section level2">
     <h2 id="see-also">See also<a class="anchor" aria-label="anchor" href="#see-also"></a></h2>
@@ -165,7 +165,7 @@ <h2 id="see-also">See also<a class="anchor" aria-label="anchor" href="#see-also"
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/fit.luz_module_generator.html b/reference/fit.luz_module_generator.html
index 3850f33c..885905d8 100644
--- a/reference/fit.luz_module_generator.html
+++ b/reference/fit.luz_module_generator.html
@@ -170,7 +170,7 @@ <h2 id="see-also">See also<a class="anchor" aria-label="anchor" href="#see-also"
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/get_metrics.html b/reference/get_metrics.html
index 2add3e85..692a7bf7 100644
--- a/reference/get_metrics.html
+++ b/reference/get_metrics.html
@@ -103,7 +103,7 @@ <h2 id="methods-by-class-">Methods (by class)<a class="anchor" aria-label="ancho
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/index.html b/reference/index.html
index 29fd15f7..5c6cf36f 100644
--- a/reference/index.html
+++ b/reference/index.html
@@ -359,7 +359,7 @@ <h2 id="serialization">Serialization<a class="anchor" aria-label="anchor" href="
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/lr_finder-1.png b/reference/lr_finder-1.png
index 6801c27f..2b724bd4 100644
Binary files a/reference/lr_finder-1.png and b/reference/lr_finder-1.png differ
diff --git a/reference/lr_finder.html b/reference/lr_finder.html
index 382b47f7..6cdb2739 100644
--- a/reference/lr_finder.html
+++ b/reference/lr_finder.html
@@ -146,7 +146,7 @@ <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-e
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_callback.html b/reference/luz_callback.html
index fdcd20ed..a2d9fdae 100644
--- a/reference/luz_callback.html
+++ b/reference/luz_callback.html
@@ -152,41 +152,41 @@ <h2 id="details">Details<a class="anchor" aria-label="anchor" href="#details"></
 <p>Callbacks can be called in many different positions of the training
 loop, including combinations of them. Here’s an overview of possible
 callback <em>breakpoints</em>:</p>
-<p></p><div class="sourceCode"><pre><code><span id="cb1-1"><a href="#cb1-1"></a>Start Fit</span>
-<span id="cb1-2"><a href="#cb1-2"></a>   <span class="op">-</span><span class="st"> </span>on_fit_begin</span>
-<span id="cb1-3"><a href="#cb1-3"></a>  Start Epoch Loop</span>
-<span id="cb1-4"><a href="#cb1-4"></a>     <span class="op">-</span><span class="st"> </span>on_epoch_begin</span>
-<span id="cb1-5"><a href="#cb1-5"></a>    Start Train</span>
-<span id="cb1-6"><a href="#cb1-6"></a>       <span class="op">-</span><span class="st"> </span>on_train_begin</span>
-<span id="cb1-7"><a href="#cb1-7"></a>      Start Batch Loop</span>
-<span id="cb1-8"><a href="#cb1-8"></a>         <span class="op">-</span><span class="st"> </span>on_train_batch_begin</span>
-<span id="cb1-9"><a href="#cb1-9"></a>          Start Default Training Step</span>
-<span id="cb1-10"><a href="#cb1-10"></a>            <span class="op">-</span><span class="st"> </span>on_train_batch_after_pred</span>
-<span id="cb1-11"><a href="#cb1-11"></a>            <span class="op">-</span><span class="st"> </span>on_train_batch_after_loss</span>
-<span id="cb1-12"><a href="#cb1-12"></a>            <span class="op">-</span><span class="st"> </span>on_train_batch_before_backward</span>
-<span id="cb1-13"><a href="#cb1-13"></a>            <span class="op">-</span><span class="st"> </span>on_train_batch_before_step</span>
-<span id="cb1-14"><a href="#cb1-14"></a>            <span class="op">-</span><span class="st"> </span>on_train_batch_after_step</span>
-<span id="cb1-15"><a href="#cb1-15"></a>          End Default Training Step<span class="op">:</span></span>
-<span id="cb1-16"><a href="#cb1-16"></a><span class="st">         </span><span class="op">-</span><span class="st"> </span>on_train_batch_end</span>
-<span id="cb1-17"><a href="#cb1-17"></a>      End Batch Loop</span>
-<span id="cb1-18"><a href="#cb1-18"></a>       <span class="op">-</span><span class="st"> </span>on_train_end</span>
-<span id="cb1-19"><a href="#cb1-19"></a>    End Train</span>
-<span id="cb1-20"><a href="#cb1-20"></a>    Start Valid</span>
-<span id="cb1-21"><a href="#cb1-21"></a>       <span class="op">-</span><span class="st"> </span>on_valid_begin</span>
-<span id="cb1-22"><a href="#cb1-22"></a>      Start Batch Loop</span>
-<span id="cb1-23"><a href="#cb1-23"></a>         <span class="op">-</span><span class="st"> </span>on_valid_batch_begin</span>
-<span id="cb1-24"><a href="#cb1-24"></a>          Start Default Validation Step</span>
-<span id="cb1-25"><a href="#cb1-25"></a>            <span class="op">-</span><span class="st"> </span>on_valid_batch_after_pred</span>
-<span id="cb1-26"><a href="#cb1-26"></a>            <span class="op">-</span><span class="st"> </span>on_valid_batch_after_loss</span>
-<span id="cb1-27"><a href="#cb1-27"></a>          End Default Validation Step</span>
-<span id="cb1-28"><a href="#cb1-28"></a>         <span class="op">-</span><span class="st"> </span>on_valid_batch_end</span>
-<span id="cb1-29"><a href="#cb1-29"></a>      End Batch Loop</span>
-<span id="cb1-30"><a href="#cb1-30"></a>       <span class="op">-</span><span class="st"> </span>on_valid_end</span>
-<span id="cb1-31"><a href="#cb1-31"></a>    End Valid</span>
-<span id="cb1-32"><a href="#cb1-32"></a>      <span class="op">-</span><span class="st"> </span>on_epoch_end</span>
-<span id="cb1-33"><a href="#cb1-33"></a>  End Epoch Loop</span>
-<span id="cb1-34"><a href="#cb1-34"></a>   <span class="op">-</span><span class="st"> </span>on_fit_end</span>
-<span id="cb1-35"><a href="#cb1-35"></a>End Fit</span></code></pre><p></p></div>
+<p></p><div class="sourceCode"><pre><code><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>Start Fit</span>
+<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a>   <span class="sc">-</span> on_fit_begin</span>
+<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a>  Start Epoch Loop</span>
+<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a>     <span class="sc">-</span> on_epoch_begin</span>
+<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a>    Start Train</span>
+<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a>       <span class="sc">-</span> on_train_begin</span>
+<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>      Start Batch Loop</span>
+<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a>         <span class="sc">-</span> on_train_batch_begin</span>
+<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a>          Start Default Training Step</span>
+<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a>            <span class="sc">-</span> on_train_batch_after_pred</span>
+<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a>            <span class="sc">-</span> on_train_batch_after_loss</span>
+<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a>            <span class="sc">-</span> on_train_batch_before_backward</span>
+<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a>            <span class="sc">-</span> on_train_batch_before_step</span>
+<span id="cb1-14"><a href="#cb1-14" aria-hidden="true" tabindex="-1"></a>            <span class="sc">-</span> on_train_batch_after_step</span>
+<span id="cb1-15"><a href="#cb1-15" aria-hidden="true" tabindex="-1"></a>          End Default Training Step<span class="sc">:</span></span>
+<span id="cb1-16"><a href="#cb1-16" aria-hidden="true" tabindex="-1"></a>         <span class="sc">-</span> on_train_batch_end</span>
+<span id="cb1-17"><a href="#cb1-17" aria-hidden="true" tabindex="-1"></a>      End Batch Loop</span>
+<span id="cb1-18"><a href="#cb1-18" aria-hidden="true" tabindex="-1"></a>       <span class="sc">-</span> on_train_end</span>
+<span id="cb1-19"><a href="#cb1-19" aria-hidden="true" tabindex="-1"></a>    End Train</span>
+<span id="cb1-20"><a href="#cb1-20" aria-hidden="true" tabindex="-1"></a>    Start Valid</span>
+<span id="cb1-21"><a href="#cb1-21" aria-hidden="true" tabindex="-1"></a>       <span class="sc">-</span> on_valid_begin</span>
+<span id="cb1-22"><a href="#cb1-22" aria-hidden="true" tabindex="-1"></a>      Start Batch Loop</span>
+<span id="cb1-23"><a href="#cb1-23" aria-hidden="true" tabindex="-1"></a>         <span class="sc">-</span> on_valid_batch_begin</span>
+<span id="cb1-24"><a href="#cb1-24" aria-hidden="true" tabindex="-1"></a>          Start Default Validation Step</span>
+<span id="cb1-25"><a href="#cb1-25" aria-hidden="true" tabindex="-1"></a>            <span class="sc">-</span> on_valid_batch_after_pred</span>
+<span id="cb1-26"><a href="#cb1-26" aria-hidden="true" tabindex="-1"></a>            <span class="sc">-</span> on_valid_batch_after_loss</span>
+<span id="cb1-27"><a href="#cb1-27" aria-hidden="true" tabindex="-1"></a>          End Default Validation Step</span>
+<span id="cb1-28"><a href="#cb1-28" aria-hidden="true" tabindex="-1"></a>         <span class="sc">-</span> on_valid_batch_end</span>
+<span id="cb1-29"><a href="#cb1-29" aria-hidden="true" tabindex="-1"></a>      End Batch Loop</span>
+<span id="cb1-30"><a href="#cb1-30" aria-hidden="true" tabindex="-1"></a>       <span class="sc">-</span> on_valid_end</span>
+<span id="cb1-31"><a href="#cb1-31" aria-hidden="true" tabindex="-1"></a>    End Valid</span>
+<span id="cb1-32"><a href="#cb1-32" aria-hidden="true" tabindex="-1"></a>      <span class="sc">-</span> on_epoch_end</span>
+<span id="cb1-33"><a href="#cb1-33" aria-hidden="true" tabindex="-1"></a>  End Epoch Loop</span>
+<span id="cb1-34"><a href="#cb1-34" aria-hidden="true" tabindex="-1"></a>   <span class="sc">-</span> on_fit_end</span>
+<span id="cb1-35"><a href="#cb1-35" aria-hidden="true" tabindex="-1"></a>End Fit</span></code></pre><p></p></div>
 <p>Every step marked with <code>on_*</code> is a point in the training procedure that
 is available for callbacks to be called.</p>
 <p>The other important part of callbacks is the <code>ctx</code> (context) object. See
@@ -208,14 +208,14 @@ <h2 id="prediction-callbacks">Prediction callbacks<a class="anchor" aria-label="
 
 <p>You can also use callbacks when using <code><a href="https://rdrr.io/r/stats/predict.html" class="external-link">predict()</a></code>. In this case the supported
 callback methods are detailed below.</p>
-<p></p><div class="sourceCode"><pre><code><span id="cb1-1"><a href="#cb1-1"></a>Start predict</span>
-<span id="cb1-2"><a href="#cb1-2"></a> <span class="op">-</span><span class="st"> </span>on_predict_begin</span>
-<span id="cb1-3"><a href="#cb1-3"></a> Start prediction loop</span>
-<span id="cb1-4"><a href="#cb1-4"></a>  <span class="op">-</span><span class="st"> </span>on_predict_batch_begin</span>
-<span id="cb1-5"><a href="#cb1-5"></a>  <span class="op">-</span><span class="st"> </span>on_predict_batch_end</span>
-<span id="cb1-6"><a href="#cb1-6"></a> End prediction loop</span>
-<span id="cb1-7"><a href="#cb1-7"></a> <span class="op">-</span><span class="st"> </span>on_predict_end</span>
-<span id="cb1-8"><a href="#cb1-8"></a>End predict</span></code></pre><p></p></div>
+<p></p><div class="sourceCode"><pre><code><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>Start predict</span>
+<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="sc">-</span> on_predict_begin</span>
+<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> Start prediction loop</span>
+<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a>  <span class="sc">-</span> on_predict_batch_begin</span>
+<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a>  <span class="sc">-</span> on_predict_batch_end</span>
+<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> End prediction loop</span>
+<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> <span class="sc">-</span> on_predict_end</span>
+<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a>End predict</span></code></pre><p></p></div>
     </div>
     <div class="section level2">
     <h2 id="evaluate-callbacks">Evaluate callbacks<a class="anchor" aria-label="anchor" href="#evaluate-callbacks"></a></h2>
@@ -224,18 +224,18 @@ <h2 id="evaluate-callbacks">Evaluate callbacks<a class="anchor" aria-label="anch
 
 <p>Callbacks can also be used with <code><a href="evaluate.html">evaluate()</a></code>. In this case, the callbacks that
 are used are equivalent to those of the validation loop when using <code><a href="https://generics.r-lib.org/reference/fit.html" class="external-link">fit()</a></code>:</p>
-<p></p><div class="sourceCode"><pre><code><span id="cb1-1"><a href="#cb1-1"></a>Start Valid</span>
-<span id="cb1-2"><a href="#cb1-2"></a> <span class="op">-</span><span class="st"> </span>on_valid_begin</span>
-<span id="cb1-3"><a href="#cb1-3"></a> Start Batch Loop</span>
-<span id="cb1-4"><a href="#cb1-4"></a>  <span class="op">-</span><span class="st"> </span>on_valid_batch_begin</span>
-<span id="cb1-5"><a href="#cb1-5"></a>  Start Default Validation Step</span>
-<span id="cb1-6"><a href="#cb1-6"></a>   <span class="op">-</span><span class="st"> </span>on_valid_batch_after_pred</span>
-<span id="cb1-7"><a href="#cb1-7"></a>   <span class="op">-</span><span class="st"> </span>on_valid_batch_after_loss</span>
-<span id="cb1-8"><a href="#cb1-8"></a>  End Default Validation Step</span>
-<span id="cb1-9"><a href="#cb1-9"></a>  <span class="op">-</span><span class="st"> </span>on_valid_batch_end</span>
-<span id="cb1-10"><a href="#cb1-10"></a> End Batch Loop</span>
-<span id="cb1-11"><a href="#cb1-11"></a> <span class="op">-</span><span class="st"> </span>on_valid_end</span>
-<span id="cb1-12"><a href="#cb1-12"></a>End Valid</span></code></pre><p></p></div>
+<p></p><div class="sourceCode"><pre><code><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>Start Valid</span>
+<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="sc">-</span> on_valid_begin</span>
+<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> Start Batch Loop</span>
+<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a>  <span class="sc">-</span> on_valid_batch_begin</span>
+<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a>  Start Default Validation Step</span>
+<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a>   <span class="sc">-</span> on_valid_batch_after_pred</span>
+<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>   <span class="sc">-</span> on_valid_batch_after_loss</span>
+<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a>  End Default Validation Step</span>
+<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a>  <span class="sc">-</span> on_valid_batch_end</span>
+<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> End Batch Loop</span>
+<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a> <span class="sc">-</span> on_valid_end</span>
+<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a>End Valid</span></code></pre><p></p></div>
     </div>
     <div class="section level2">
     <h2 id="see-also">See also<a class="anchor" aria-label="anchor" href="#see-also"></a></h2>
@@ -278,7 +278,7 @@ <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-e
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_callback_auto_resume.html b/reference/luz_callback_auto_resume.html
index 2013b39d..b853084c 100644
--- a/reference/luz_callback_auto_resume.html
+++ b/reference/luz_callback_auto_resume.html
@@ -177,16 +177,16 @@ <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-e
 <span class="r-out co"><span class="r-pr">#&gt;</span> <span style="font-weight: bold;">Caused by error in `self[[callback_nm]]()`:</span></span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> <span style="color: #BBBB00;">!</span> Error on epoch 5</span>
 <span class="r-out co"><span class="r-pr">#&gt;</span>      set metric epoch    value</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> 1  train   loss     1 1.302326</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> 2  train   loss     2 1.141849</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> 3  train   loss     3 1.094023</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> 4  train   loss     4 1.082328</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> 5  train   loss     5 1.083923</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> 6  train   loss     6 1.072870</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> 7  train   loss     7 1.083111</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> 8  train   loss     8 1.079866</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> 9  train   loss     9 1.074621</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> 10 train   loss    10 1.075743</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> 1  train   loss     1 1.217334</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> 2  train   loss     2 1.079304</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> 3  train   loss     3 1.040630</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> 4  train   loss     4 1.027106</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> 5  train   loss     5 1.023069</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> 6  train   loss     6 1.017577</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> 7  train   loss     7 1.016829</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> 8  train   loss     8 1.020484</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> 9  train   loss     9 1.022464</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> 10 train   loss    10 1.025988</span>
 </code></pre></div>
     </div>
   </main><aside class="col-md-3"><nav id="toc"><h2>On this page</h2>
@@ -198,7 +198,7 @@ <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-e
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_callback_csv_logger.html b/reference/luz_callback_csv_logger.html
index ef9e3e98..8caa5ca8 100644
--- a/reference/luz_callback_csv_logger.html
+++ b/reference/luz_callback_csv_logger.html
@@ -106,7 +106,7 @@ <h2 id="see-also">See also<a class="anchor" aria-label="anchor" href="#see-also"
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_callback_early_stopping.html b/reference/luz_callback_early_stopping.html
index 594b229e..6e55924f 100644
--- a/reference/luz_callback_early_stopping.html
+++ b/reference/luz_callback_early_stopping.html
@@ -150,7 +150,7 @@ <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-e
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_callback_gradient_clip.html b/reference/luz_callback_gradient_clip.html
index 9cd58b87..51e17643 100644
--- a/reference/luz_callback_gradient_clip.html
+++ b/reference/luz_callback_gradient_clip.html
@@ -101,7 +101,7 @@ <h2 id="references">References<a class="anchor" aria-label="anchor" href="#refer
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_callback_interrupt.html b/reference/luz_callback_interrupt.html
index 1f59d51c..0c117239 100644
--- a/reference/luz_callback_interrupt.html
+++ b/reference/luz_callback_interrupt.html
@@ -122,7 +122,7 @@ <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-e
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_callback_keep_best_model.html b/reference/luz_callback_keep_best_model.html
index a73fda89..fc321a38 100644
--- a/reference/luz_callback_keep_best_model.html
+++ b/reference/luz_callback_keep_best_model.html
@@ -131,7 +131,7 @@ <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-e
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_callback_lr_scheduler.html b/reference/luz_callback_lr_scheduler.html
index 7415ecd5..2103eef2 100644
--- a/reference/luz_callback_lr_scheduler.html
+++ b/reference/luz_callback_lr_scheduler.html
@@ -138,7 +138,7 @@ <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-e
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_callback_metrics.html b/reference/luz_callback_metrics.html
index 4a82981e..9a5b1649 100644
--- a/reference/luz_callback_metrics.html
+++ b/reference/luz_callback_metrics.html
@@ -118,7 +118,7 @@ <h2 id="see-also">See also<a class="anchor" aria-label="anchor" href="#see-also"
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_callback_mixed_precision.html b/reference/luz_callback_mixed_precision.html
index cd0d4dfc..e0e6fc05 100644
--- a/reference/luz_callback_mixed_precision.html
+++ b/reference/luz_callback_mixed_precision.html
@@ -120,7 +120,7 @@ <h2 id="see-also">See also<a class="anchor" aria-label="anchor" href="#see-also"
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_callback_mixup.html b/reference/luz_callback_mixup.html
index afd1a876..c299eb81 100644
--- a/reference/luz_callback_mixup.html
+++ b/reference/luz_callback_mixup.html
@@ -154,7 +154,7 @@ <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-e
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_callback_model_checkpoint.html b/reference/luz_callback_model_checkpoint.html
index 99003470..7150edf9 100644
--- a/reference/luz_callback_model_checkpoint.html
+++ b/reference/luz_callback_model_checkpoint.html
@@ -203,7 +203,7 @@ <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-e
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_callback_profile.html b/reference/luz_callback_profile.html
index 92338216..2a0f2a25 100644
--- a/reference/luz_callback_profile.html
+++ b/reference/luz_callback_profile.html
@@ -122,7 +122,7 @@ <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-e
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_callback_progress.html b/reference/luz_callback_progress.html
index 62a38373..66dbd552 100644
--- a/reference/luz_callback_progress.html
+++ b/reference/luz_callback_progress.html
@@ -111,7 +111,7 @@ <h2 id="see-also">See also<a class="anchor" aria-label="anchor" href="#see-also"
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_callback_resume_from_checkpoint.html b/reference/luz_callback_resume_from_checkpoint.html
index e1febc9f..aaded3cf 100644
--- a/reference/luz_callback_resume_from_checkpoint.html
+++ b/reference/luz_callback_resume_from_checkpoint.html
@@ -138,7 +138,7 @@ <h2 id="see-also">See also<a class="anchor" aria-label="anchor" href="#see-also"
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_callback_tfevents.html b/reference/luz_callback_tfevents.html
index b18b5d2b..9f4b6c3e 100644
--- a/reference/luz_callback_tfevents.html
+++ b/reference/luz_callback_tfevents.html
@@ -90,7 +90,7 @@ <h2 id="arguments">Arguments<a class="anchor" aria-label="anchor" href="#argumen
 </dl></div>
     <div class="section level2">
     <h2 id="details">Details<a class="anchor" aria-label="anchor" href="#details"></a></h2>
-    <p></p><div class="sourceCode"><pre><code><span id="cb1-1"><a href="#cb1-1"></a>tensorboard <span class="op">--</span>logdir=logs</span></code></pre><p></p></div>
+    <p></p><div class="sourceCode"><pre><code><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>tensorboard <span class="sc">--</span>logdir<span class="ot">=</span>logs</span></code></pre><p></p></div>
     </div>
 
     <div class="section level2">
@@ -113,14 +113,14 @@ <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-e
 <span class="r-in"><span><span class="op">}</span></span></span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> A `luz_module_fitted`</span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> ── Time ────────────────────────────────────────────────────────────────────────</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> • Total time: 2.7s</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> • Avg time per training epoch: 197ms</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> • Total time: 2.4s</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> • Avg time per training epoch: 177ms</span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> </span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> ── Results ─────────────────────────────────────────────────────────────────────</span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> Metrics observed in the last epoch.</span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> </span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> <span style="color: #0000BB;">ℹ</span> Training:</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> loss: 1.4131</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> loss: 1.4048</span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> </span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> ── Model ───────────────────────────────────────────────────────────────────────</span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> An `nn_module` containing 11 parameters.</span>
@@ -139,7 +139,7 @@ <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-e
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_callback_train_valid.html b/reference/luz_callback_train_valid.html
index 2c77add5..c463dde3 100644
--- a/reference/luz_callback_train_valid.html
+++ b/reference/luz_callback_train_valid.html
@@ -119,7 +119,7 @@ <h2 id="see-also">See also<a class="anchor" aria-label="anchor" href="#see-also"
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_load.html b/reference/luz_load.html
index 69e11446..7be0a17d 100644
--- a/reference/luz_load.html
+++ b/reference/luz_load.html
@@ -90,7 +90,7 @@ <h2 id="see-also">See also<a class="anchor" aria-label="anchor" href="#see-also"
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_load_checkpoint.html b/reference/luz_load_checkpoint.html
index 5ca1dc8f..59e0e6b2 100644
--- a/reference/luz_load_checkpoint.html
+++ b/reference/luz_load_checkpoint.html
@@ -93,7 +93,7 @@ <h2 id="arguments">Arguments<a class="anchor" aria-label="anchor" href="#argumen
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_load_model_weights.html b/reference/luz_load_model_weights.html
index 5d8692bf..7a9219a4 100644
--- a/reference/luz_load_model_weights.html
+++ b/reference/luz_load_model_weights.html
@@ -111,7 +111,7 @@ <h2 id="warning">Warning<a class="anchor" aria-label="anchor" href="#warning"></
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_metric.html b/reference/luz_metric.html
index 493c5cae..6717f1d2 100644
--- a/reference/luz_metric.html
+++ b/reference/luz_metric.html
@@ -214,7 +214,7 @@ <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-e
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_metric_accuracy.html b/reference/luz_metric_accuracy.html
index b6d49987..ae3dbac0 100644
--- a/reference/luz_metric_accuracy.html
+++ b/reference/luz_metric_accuracy.html
@@ -103,7 +103,7 @@ <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-e
 <span class="r-in"><span><span class="va">metric</span><span class="op">$</span><span class="fu">update</span><span class="op">(</span><span class="fu"><a href="https://rdrr.io/pkg/torch/man/torch_randn.html" class="external-link">torch_randn</a></span><span class="op">(</span><span class="fl">100</span>, <span class="fl">10</span><span class="op">)</span>, <span class="fu">torch</span><span class="fu">::</span><span class="fu"><a href="https://rdrr.io/pkg/torch/man/torch_randint.html" class="external-link">torch_randint</a></span><span class="op">(</span><span class="fl">1</span>, <span class="fl">10</span>, size <span class="op">=</span> <span class="fl">100</span><span class="op">)</span><span class="op">)</span></span></span>
 <span class="r-in"><span><span class="va">metric</span><span class="op">$</span><span class="fu">compute</span><span class="op">(</span><span class="op">)</span></span></span>
 <span class="r-in"><span><span class="op">}</span></span></span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> [1] 0.08</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> [1] 0.07</span>
 </code></pre></div>
     </div>
   </main><aside class="col-md-3"><nav id="toc"><h2>On this page</h2>
@@ -115,7 +115,7 @@ <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-e
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_metric_binary_accuracy.html b/reference/luz_metric_binary_accuracy.html
index c01849c8..2114e899 100644
--- a/reference/luz_metric_binary_accuracy.html
+++ b/reference/luz_metric_binary_accuracy.html
@@ -106,7 +106,7 @@ <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-e
 <span class="r-in"><span><span class="va">metric</span><span class="op">$</span><span class="fu">update</span><span class="op">(</span><span class="fu"><a href="https://rdrr.io/pkg/torch/man/torch_rand.html" class="external-link">torch_rand</a></span><span class="op">(</span><span class="fl">100</span><span class="op">)</span>, <span class="fu">torch</span><span class="fu">::</span><span class="fu"><a href="https://rdrr.io/pkg/torch/man/torch_randint.html" class="external-link">torch_randint</a></span><span class="op">(</span><span class="fl">0</span>, <span class="fl">1</span>, size <span class="op">=</span> <span class="fl">100</span><span class="op">)</span><span class="op">)</span></span></span>
 <span class="r-in"><span><span class="va">metric</span><span class="op">$</span><span class="fu">compute</span><span class="op">(</span><span class="op">)</span></span></span>
 <span class="r-in"><span><span class="op">}</span></span></span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> [1] 0.56</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> [1] 0.51</span>
 <span class="r-in"><span></span></span>
 </code></pre></div>
     </div>
@@ -119,7 +119,7 @@ <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-e
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_metric_binary_accuracy_with_logits.html b/reference/luz_metric_binary_accuracy_with_logits.html
index 491ddd4e..341859e7 100644
--- a/reference/luz_metric_binary_accuracy_with_logits.html
+++ b/reference/luz_metric_binary_accuracy_with_logits.html
@@ -111,7 +111,7 @@ <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-e
 <span class="r-in"><span><span class="va">metric</span><span class="op">$</span><span class="fu">update</span><span class="op">(</span><span class="fu"><a href="https://rdrr.io/pkg/torch/man/torch_randn.html" class="external-link">torch_randn</a></span><span class="op">(</span><span class="fl">100</span><span class="op">)</span>, <span class="fu">torch</span><span class="fu">::</span><span class="fu"><a href="https://rdrr.io/pkg/torch/man/torch_randint.html" class="external-link">torch_randint</a></span><span class="op">(</span><span class="fl">0</span>, <span class="fl">1</span>, size <span class="op">=</span> <span class="fl">100</span><span class="op">)</span><span class="op">)</span></span></span>
 <span class="r-in"><span><span class="va">metric</span><span class="op">$</span><span class="fu">compute</span><span class="op">(</span><span class="op">)</span></span></span>
 <span class="r-in"><span><span class="op">}</span></span></span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> [1] 0.41</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> [1] 0.5</span>
 </code></pre></div>
     </div>
   </main><aside class="col-md-3"><nav id="toc"><h2>On this page</h2>
@@ -123,7 +123,7 @@ <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-e
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_metric_binary_auroc.html b/reference/luz_metric_binary_auroc.html
index c856d0c0..60fcfb19 100644
--- a/reference/luz_metric_binary_auroc.html
+++ b/reference/luz_metric_binary_auroc.html
@@ -138,7 +138,7 @@ <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-e
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_metric_mae.html b/reference/luz_metric_mae.html
index a20c4428..8b90b522 100644
--- a/reference/luz_metric_mae.html
+++ b/reference/luz_metric_mae.html
@@ -97,7 +97,7 @@ <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-e
 <span class="r-in"><span><span class="va">metric</span><span class="op">$</span><span class="fu">update</span><span class="op">(</span><span class="fu"><a href="https://rdrr.io/pkg/torch/man/torch_randn.html" class="external-link">torch_randn</a></span><span class="op">(</span><span class="fl">100</span><span class="op">)</span>, <span class="fu"><a href="https://rdrr.io/pkg/torch/man/torch_randn.html" class="external-link">torch_randn</a></span><span class="op">(</span><span class="fl">100</span><span class="op">)</span><span class="op">)</span></span></span>
 <span class="r-in"><span><span class="va">metric</span><span class="op">$</span><span class="fu">compute</span><span class="op">(</span><span class="op">)</span></span></span>
 <span class="r-in"><span><span class="op">}</span></span></span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> [1] 1.008288</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> [1] 1.080743</span>
 </code></pre></div>
     </div>
   </main><aside class="col-md-3"><nav id="toc"><h2>On this page</h2>
@@ -109,7 +109,7 @@ <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-e
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_metric_mse.html b/reference/luz_metric_mse.html
index b38e4486..f8cd2d63 100644
--- a/reference/luz_metric_mse.html
+++ b/reference/luz_metric_mse.html
@@ -97,7 +97,7 @@ <h2 id="see-also">See also<a class="anchor" aria-label="anchor" href="#see-also"
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_metric_multiclass_auroc.html b/reference/luz_metric_multiclass_auroc.html
index 621a9078..0a57254f 100644
--- a/reference/luz_metric_multiclass_auroc.html
+++ b/reference/luz_metric_multiclass_auroc.html
@@ -160,7 +160,7 @@ <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-e
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_metric_rmse.html b/reference/luz_metric_rmse.html
index dbbb70ad..c9f489ea 100644
--- a/reference/luz_metric_rmse.html
+++ b/reference/luz_metric_rmse.html
@@ -97,7 +97,7 @@ <h2 id="see-also">See also<a class="anchor" aria-label="anchor" href="#see-also"
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_metric_set.html b/reference/luz_metric_set.html
index 0c44535d..e6673915 100644
--- a/reference/luz_metric_set.html
+++ b/reference/luz_metric_set.html
@@ -97,7 +97,7 @@ <h2 id="arguments">Arguments<a class="anchor" aria-label="anchor" href="#argumen
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/luz_save.html b/reference/luz_save.html
index 39571e7f..eec6f9a6 100644
--- a/reference/luz_save.html
+++ b/reference/luz_save.html
@@ -115,7 +115,7 @@ <h2 id="see-also">See also<a class="anchor" aria-label="anchor" href="#see-also"
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/nn_mixup_loss.html b/reference/nn_mixup_loss.html
index 78e2a0f9..4d198ab7 100644
--- a/reference/nn_mixup_loss.html
+++ b/reference/nn_mixup_loss.html
@@ -102,7 +102,7 @@ <h2 id="see-also">See also<a class="anchor" aria-label="anchor" href="#see-also"
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/nnf_mixup.html b/reference/nnf_mixup.html
index af9a2e7b..7ca497e9 100644
--- a/reference/nnf_mixup.html
+++ b/reference/nnf_mixup.html
@@ -117,36 +117,36 @@ <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-e
 <span class="r-in"><span><span class="op">}</span></span></span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> $x</span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> torch_tensor</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> Columns 1 to 10-1.6129 -0.6514 -0.0005 -0.1299 -0.0186  0.7856 -0.2318 -0.5054 -1.1800 -0.4573</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> -0.3187  0.2511 -0.5413  0.5586 -0.3886  1.2698 -0.5388 -0.4903 -0.3484  0.1465</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span>  0.4670 -0.4047  0.5951  1.7964 -0.1125 -0.1080  0.0275 -0.1578  0.4809 -1.3088</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> -1.2264 -0.5856  0.8783  0.1327 -0.6009 -0.2881  1.3573 -1.4678  0.5514  0.7259</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> -1.3347 -0.1024  1.3148 -0.4473  0.5465 -0.5149  0.9301 -0.0461  0.3143  0.5608</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> -0.8352 -0.4503 -0.6097  0.0943  0.8621 -0.3653 -0.0237  0.9062 -1.2661  1.7011</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span>  0.0790 -2.2431  1.7278  1.5395 -0.5357  0.3805  1.7119  0.1466 -0.1981  1.3738</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span>  0.5916 -0.0441 -0.1281  0.3864 -0.4095 -0.8772 -0.4108  1.1746 -1.2667 -0.9288</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> -0.5432 -0.8673  0.3779  1.4750  0.9157 -0.8307  1.0736 -0.2050 -1.2962  0.7474</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> -0.1829  0.7280 -1.1095  0.3540  0.9854 -0.2013 -0.1124  1.8542 -0.3396 -1.2286</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> Columns 1 to 6 2.1105e-01 -2.5707e-01  5.0293e-01 -2.6365e-01  2.9616e-01  4.3874e-01</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> -5.3747e-01 -5.0843e-01  2.0182e-01  8.5945e-01 -7.9758e-01 -1.6854e-01</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> -1.8622e+00 -7.2628e-01 -1.2179e-01 -5.3673e-01  8.8290e-01 -7.7352e-02</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span>  6.0087e-01  7.7442e-01 -1.8468e+00  4.8284e-01  1.4391e+00  2.0366e-01</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> -1.0608e+00 -4.1275e-01 -9.6645e-01 -5.1798e-01  2.5813e-01  1.7352e-01</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> -8.6037e-01  1.4365e-01  6.6950e-01 -4.4121e-01  4.2209e-01 -3.8243e-01</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span>  4.5700e-01  7.9541e-01  4.9467e-01  1.3577e+00 -5.6978e-01 -1.1119e+00</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span>  7.8966e-01  4.9365e-01  1.0959e+00  6.6656e-01  2.4713e-01  2.4156e-01</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span>  6.8629e-01  4.3494e-01  1.5368e+00  4.5424e-01 -3.3821e-01 -6.9955e-01</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> -2.9487e-01 -2.7045e-01 -9.3513e-01 -3.1766e-01  7.1092e-01 -8.8386e-01</span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> </span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> Columns 11 to 20-0.4164 -1.8517  0.8905 -0.7747 -0.6324  0.4760 -0.7748 -0.0742 -1.0611  0.7593</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span>  1.3576  0.3285  0.2676 -0.0533  0.2062 -0.0335  0.3198  0.3276 -1.2097  0.0647</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> -0.2135 -0.0988  0.3074 -0.4857 -1.5481  0.5156 -0.0364 -0.6499 -1.0595 -1.4608</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span>  0.0545 -0.1533  1.0694 -1.9981 -0.8471 -0.7479 -0.4441  0.3173  0.9581 -0.1928</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> -0.1570  0.1221 -0.0325  0.5952 -0.7952  0.9146 -0.6144  0.1231 -0.3203 -1.2061</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> -0.0764  1.8272  0.3365 -0.1222 -0.2703  0.5525  1.1011 -1.0144  0.5019  0.7357</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> -0.5516  0.5019  0.3448  0.1197 -0.4418  1.5774  0.5755  0.0448  0.7243 -1.5652</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span>  0.1351 -0.4297 -0.2023 -1.3988 -1.3668 -0.4454 -0.5770  0.3981  0.2843  0.3258</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span>  0.2701  0.4466  0.5089 -1.1313  0.4318  0.9925  0.6326 -1.3562 -0.3284  0.3340</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span>  0.2723  0.8889 -1.0425 -0.6844 -0.2525 -0.4499  0.3906 -0.5498 -0.2378 -0.8349</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> Columns 7 to 12-5.8458e-01 -7.6261e-01 -9.8281e-01  1.0952e-01 -3.8169e-01  4.4187e-01</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span>  1.5732e+00 -4.8694e-01  1.8215e-01  2.4406e-01  4.9622e-01  4.1927e-01</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> -2.1962e-01 -1.5063e-01 -4.3045e-01  6.1290e-01  1.3646e+00 -7.8468e-02</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> -2.7856e-01 -1.4861e+00  5.4135e-01  2.5380e-01 -2.1084e+00 -6.7824e-01</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> -7.2378e-01  6.6451e-01 -5.6135e-02 -2.5516e-02 -1.6625e+00 -1.2545e+00</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span>  4.8056e-01  1.1037e+00  1.6371e+00  1.1139e-01 -6.4466e-01 -1.5184e+00</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> -1.2516e+00 -1.3917e-01  1.8302e-01 -7.0514e-01 -2.0332e+00  5.3306e-01</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> -5.2930e-01 -3.2553e-01  7.5119e-01 -8.0412e-01  1.3013e+00  1.3164e+00</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span>  7.5627e-01  5.4333e-03 -1.1749e+00  1.0025e+00 -1.3122e-01  7.3929e-01</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> -3.6906e-01  7.2472e-01  6.5900e-01  1.6670e-01 -1.2228e-02  6.5552e-01</span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> </span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> Columns 21 to 30 0.4978  0.0260  0.1258 -0.8327 -0.7895  0.1126 -0.2001  0.5339  1.1938  0.4961</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> -0.1211 -0.9407  1.1080  1.0563  0.6874 -0.8214  0.4860 -0.4867 -0.2094 -1.9037</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span>  0.0040  0.6689 -1.3853  0.1822  0.7930 -0.8359  0.6396 -0.0760  1.2617  0.2426</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span>  0.2137  0.6507  0.2203 -2.1912 -0.4301 -0.9777  0.4276 -1.0610  0.7440 -1.2844</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span>  0.4721  1.3801  0.1906  0.0381 -1.8842 -0.2035  1.1486 -0.4319  1.2018 -1.0576</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> -0.1008  1.8099  0.1133 -0.3625 -0.8228  0.3376 -1.2784  0.4270 -1.8851 -0.7659</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> -0.1893  0.3863  0.0251 -0.0711  0.2325 -0.3685 -1.2638  0.1694  1.3024 -0.4186</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span>  0.6907  1.8215 -0.5999  0.0687 -0.2703 -1.2546 -0.7732  1.1821  0.6891  0.7568</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> Columns 13 to 18 3.4535e-01 -1.2002e+00 -8.3307e-01 -1.6820e+00 -5.6943e-01 -1.2224e+00</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span>  1.0747e-01  4.3129e-01  1.0875e+00  4.7297e-01 -5.5352e-01 -6.9736e-01</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> -5.6236e-01  5.3038e-01 -4.8145e-01  9.4094e-01  2.5152e+00 -8.0532e-01</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span>  1.2296e+00 -5.9918e-01  9.1384e-01  5.5982e-02  1.0325e+00  9.0756e-01</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> -5.5983e-01  9.8870e-01  2.4292e-01 -2.4190e-01  4.6381e-01  6.5734e-01</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> -5.8572e-01 -5.4169e-01  4.0119e-01  5.8703e-01 -4.3276e-01  9.2243e-01</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> -3.0966e-01  9.1974e-02  1.8338e-01  1.0977e+00  9.2757e-01  1.5192e+00</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> -4.0285e-01 -1.2765e+00  3.3926e-01 -1.7810e-02 -7.1996e-01 -1.3532e+00</span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> ... [the output was truncated (use n=-1 to disable)]</span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> [ CPUFloatType{10,768} ]</span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> </span>
@@ -154,30 +154,30 @@ <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-e
 <span class="r-out co"><span class="r-pr">#&gt;</span> $y$ys</span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> $y$ys$y1</span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> torch_tensor</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span>  0.8530</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> -0.3463</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> -2.1121</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> -0.0545</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span>  0.7580</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> -0.9011</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> -0.2182</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span>  0.6930</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> -1.0155</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span>  1.3686</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> -0.9905</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> -0.3795</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span>  1.2743</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> -0.5082</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> -1.0673</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> -0.4616</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span>  0.5942</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span>  1.0820</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> -0.3193</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span>  0.2713</span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> [ CPUFloatType{10} ]</span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> </span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> $y$ys$y2</span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> torch_tensor</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span>  0.7580</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> -0.9011</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> -0.0545</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> -0.2182</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span>  0.6930</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span>  1.3686</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> -1.0155</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span>  0.8530</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> -0.3463</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> -2.1121</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> -0.3193</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> -0.3795</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span>  1.0820</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span>  0.5942</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> -0.4616</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span>  1.2743</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> -0.5082</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span>  0.2713</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> -0.9905</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> -1.0673</span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> [ CPUFloatType{10} ]</span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> </span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> </span>
@@ -208,7 +208,7 @@ <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-e
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/pipe.html b/reference/pipe.html
index 315e76d3..7fc55ec8 100644
--- a/reference/pipe.html
+++ b/reference/pipe.html
@@ -78,7 +78,7 @@ <h2 id="ref-usage">Usage<a class="anchor" aria-label="anchor" href="#ref-usage">
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/predict.luz_module_fitted.html b/reference/predict.luz_module_fitted.html
index 038b9953..6623433d 100644
--- a/reference/predict.luz_module_fitted.html
+++ b/reference/predict.luz_module_fitted.html
@@ -137,7 +137,7 @@ <h2 id="see-also">See also<a class="anchor" aria-label="anchor" href="#see-also"
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/reexports.html b/reference/reexports.html
index 59590e2b..1ab7f769 100644
--- a/reference/reexports.html
+++ b/reference/reexports.html
@@ -93,7 +93,7 @@ <h6 class="dropdown-header" data-toc-skip>Guides</h6>
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/set_hparams.html b/reference/set_hparams.html
index ad6137a2..05667110 100644
--- a/reference/set_hparams.html
+++ b/reference/set_hparams.html
@@ -104,7 +104,7 @@ <h2 id="see-also">See also<a class="anchor" aria-label="anchor" href="#see-also"
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/set_opt_hparams.html b/reference/set_opt_hparams.html
index 05839f4b..526c09d7 100644
--- a/reference/set_opt_hparams.html
+++ b/reference/set_opt_hparams.html
@@ -106,7 +106,7 @@ <h2 id="see-also">See also<a class="anchor" aria-label="anchor" href="#see-also"
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/reference/setup.html b/reference/setup.html
index af600455..f4106d2c 100644
--- a/reference/setup.html
+++ b/reference/setup.html
@@ -141,7 +141,7 @@ <h2 id="see-also">See also<a class="anchor" aria-label="anchor" href="#see-also"
 </div>
 
 <div class="pkgdown-footer-right">
-  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.9000.</p>
+  <p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
 </div>
 
     </footer></div>
diff --git a/search.json b/search.json
index 7dc4b180..85c7f8a9 100644
--- a/search.json
+++ b/search.json
@@ -1 +1 @@
-[{"path":"/LICENSE.html","id":null,"dir":"","previous_headings":"","what":"MIT License","title":"MIT License","text":"Copyright (c) 2021 luz authors Permission hereby granted, free charge, person obtaining copy software associated documentation files (“Software”), deal Software without restriction, including without limitation rights use, copy, modify, merge, publish, distribute, sublicense, /sell copies Software, permit persons Software furnished , subject following conditions: copyright notice permission notice shall included copies substantial portions Software. SOFTWARE PROVIDED “”, WITHOUT WARRANTY KIND, EXPRESS IMPLIED, INCLUDING LIMITED WARRANTIES MERCHANTABILITY, FITNESS PARTICULAR PURPOSE NONINFRINGEMENT. EVENT SHALL AUTHORS COPYRIGHT HOLDERS LIABLE CLAIM, DAMAGES LIABILITY, WHETHER ACTION CONTRACT, TORT OTHERWISE, ARISING , CONNECTION SOFTWARE USE DEALINGS SOFTWARE.","code":""},{"path":"/articles/accelerator.html","id":"example","dir":"Articles","previous_headings":"","what":"Example","title":"Accelerator API","text":"Accelerator API best explained showing example diff raw torch training loop. code changes shown, longer need manually move data parameters devices, makes code easier read less error prone. 
can find additional documentation using help(accelerator).","code":"library(torch) + library(luz)  + acc <- accelerator() - device <- \"cpu\"  data <- tensor_dataset(   x = torch_randn(100, 10),   y = torch_rand(100, 1) )  dl <- dataloader(data, batch_size = 10)  model <- nn_linear(10, 1) - model$to(device = device) opt <- optim_adam(model$parameters)  + c(model, opt, dl) %<-% acc$prepare(model, opt, dl)  model$train() coro::loop(for (batch in dl) {    opt$zero_grad()  -  preds <- model(batch$x$to(device = device)) +  preds <- model(batch$x) -  loss <- nnf_mse_loss(preds, batch$y$to(device = device)) +  loss <- nnf_mse_loss(preds, batch$y)    loss$backward()   opt$step() })"},{"path":"/articles/checkpoints.html","id":"resuming-training-runs-that-crashed","dir":"Articles","previous_headings":"","what":"Resuming training runs that crashed","title":"Checkpointing your models","text":"long training run can crash whatever reason (computer turned , process kileed cluster, etc), recommend add luz_callback_autoresume() list callbacks. luz_callback_autoresume() automatically checkpoint whole state model end epoch. something fails training can simply rerun script, whithout code changes checkpoint reloaded training start stopped. example, lets’s take randomly generated training dataset linear model show autoresume works. ’s training data: model definition: Let’s now create callback simulates random failure happen. callback just raise R error 5th epoch. Let’s now start training adding luz_callback_auto_resume(): resume model training exactly stopped just need restart fitting, using exact model, callbacks, etc: , model fitting process continued exactly stopped. 
Records, optimizer model state recovered previous run can full results:","code":"x <- torch_randn(1000, 10) y <- torch_randn(1000, 1) model <- nn_linear %>%   setup(optimizer = optim_sgd, loss = nnf_mse_loss) %>%   set_hparams(in_features = 10, out_features = 1) %>%   set_opt_hparams(lr = 0.01) interrupt <- luz_callback(   \"interrupt\",   failed = FALSE,   on_epoch_end = function() {     if (ctx$epoch == 5 && !self$failed) {       self$failed <- TRUE       stop(\"Error on epoch 5\")     }   } ) autoresume <- luz_callback_auto_resume(path = \"state.pt\") inter <- interrupt()  # An error will happen in the 5th epoch and the model will be stopped. results <- model %>% fit(   list(x, y),   callbacks = list(inter, autoresume),   verbose = FALSE ) #> Error in `FUN()`: #> ! Error while calling callback with class <interrupt/LuzCallback/R6> at #>   on_epoch_end. #> Caused by error in `self[[callback_nm]]()`: #> ! Error on epoch 5 results <- model %>% fit(   list(x, y),   callbacks = list(inter, autoresume),   verbose = FALSE ) plot(results)"},{"path":"/articles/checkpoints.html","id":"checkpointing","dir":"Articles","previous_headings":"","what":"Checkpointing","title":"Checkpointing your models","text":"Sometimes want control checkpoints handled. case can use luz_callback_model_checkpoint() save checkpoints specified file directory. Let’s use example resuming section: first generate data. define model: Let’s now fit model using luz_callback_model_checkpoint(). can see now checkpoints directory contains files state dumps epoch. default, luz_callback_model_checkpoint save state epochs format name including resulting loss. can configured withing path parameter, see ?luz_callback_model_checkpoint details. Finally, can load specific checkpoint fitted result using luz_load_checkpoint. Note loading checkpoint luz_fitted_module going modify model weights -place. can start making predictions, evaluate model using reloeded weights. 
might also want start new training run checkpoint. , can use luz_callback_resume_from_checkpoint(). default, recover model weights checkpoint file, can configure restore records, callback optimizer state . checkpoint directory passed training resume last checkpoint file returned fs::dir_ls. ’s use callback:","code":"x <- torch_randn(1000, 10) y <- torch_randn(1000, 1) model <- nn_linear %>%   setup(optimizer = optim_sgd, loss = nnf_mse_loss) %>%   set_hparams(in_features = 10, out_features = 1) %>%   set_opt_hparams(lr = 0.01) checkpoint <- luz_callback_model_checkpoint(   path = \"checkpoints/\",    monitor = \"train_loss\" )  results <- model %>% fit(   list(x, y),   callbacks = list(checkpoint),   verbose = FALSE ) fs::dir_ls(\"checkpoints\") #> checkpoints/epoch-01-train_loss-1.237.pt #> checkpoints/epoch-02-train_loss-1.065.pt #> checkpoints/epoch-03-train_loss-1.026.pt #> checkpoints/epoch-04-train_loss-1.004.pt #> checkpoints/epoch-05-train_loss-1.004.pt #> checkpoints/epoch-06-train_loss-1.005.pt #> checkpoints/epoch-07-train_loss-0.999.pt #> checkpoints/epoch-08-train_loss-0.998.pt #> checkpoints/epoch-09-train_loss-1.001.pt #> checkpoints/epoch-10-train_loss-1.002.pt luz_load_checkpoint(results, fs::dir_ls(\"checkpoints\")[1]) resume <- luz_callback_resume_from_checkpoint(path = \"checkpoints/\") results <- model %>% fit(   list(x, y),   callbacks = list(resume),   verbose = FALSE ) plot(results)"},{"path":"/articles/checkpoints.html","id":"custom-callbacks-state","dir":"Articles","previous_headings":"Checkpointing","what":"Custom callbacks state","title":"Checkpointing your models","text":"Sometimes callbacks also need keep internal state order allow continuing training exactly stopped. case, callbacks can implement state_dict() load_state_dict() methods automatically called saving reloading checkpoints. example, suppose callback tracks gradients weights every epoch. want use tracked weights analyse training procedure. 
implemented like: example, gradients field state callback. training fails reason, gradients lost. ’s important also checkpoint callback state, can implement state_dict() method must returning named list objects compose state callback load_state_dict() taking named list returned state_dict() restoring callback state. callback reimplemented :","code":"cb_weight_grad <- luz_callback(   \"weight_grad\",   gradients = list(),   initialize = function(track_weights) {     self$track_weights   },   on_train_batch_before_step = function() {     gradients[[ctx$epoch]] <- list()     for (w in self$track_weights) {       gradients[[ctx$epoch]][[w]] <- self$model$parameters[[w]]     }   } ) cb_weight_grad <- luz_callback(   \"weight_grad\",   gradients = list(),   initialize = function(track_weights) {     self$track_weights   },   on_train_batch_before_step = function() {     gradients[[ctx$epoch]] <- list()     for (w in self$track_weights) {       gradients[[ctx$epoch]][[w]] <- self$model$parameters[[w]]     }   },   state_dict = function() {     list(gradients = self$gradients)   },   load_state_dict = function(d) {     self$gradients <- d$gradients   } )"},{"path":"/articles/custom-loop.html","id":"multiple-optimizers","dir":"Articles","previous_headings":"","what":"Multiple optimizers","title":"Custom loops with luz","text":"Suppose want experiment train first fully connected layer using learning rate 0.1 second one using learning rate 0.01. minimize nn_cross_entropy_loss() , first layer want add L1 regularization weights. order use luz , implement two methods net module: set_optimizers: returns named list optimizers depending ctx. loss: computes loss depending selected optimizer. Let’s go code: Notice model optimizers initialized according set_optimizers() method’s return value (list). case, initializing optimizers using different model parameters learning rates. loss() method responsible computing loss back-propagated compute gradients update weights. 
loss() method can access ctx object contain opt_name field, describing optimizer currently used. Note function called optimizer training validation step. See help(\"ctx\") complete information context object. can finally setup fit module, however longer need specify optimizers loss functions. Now let’s re-implement model using slightly flexible approach overriding training validation step.","code":"net <- nn_module(   \"Net\",   initialize = function() {     self$fc1 <- nn_linear(100, 50)     self$fc1 <- nn_linear(50, 10)   },   forward = function(x) {     x %>%        self$fc1() %>%        nnf_relu() %>%        self$fc2()   },   set_optimizers = function(lr_fc1 = 0.1, lr_fc2 = 0.01) {     list(       opt_fc1 = optim_adam(self$fc1$parameters, lr = lr_fc1),       opt_fc2 = optim_adam(self$fc2$parameters, lr = lr_fc2)     )   },   loss = function(input, target) {     pred <- ctx$model(input)        if (ctx$opt_name == \"opt_fc1\")        nnf_cross_entropy(pred, target) + torch_norm(self$fc1$weight, p = 1)     else if (ctx$opt_name == \"opt_fc2\")       nnf_cross_entropy(pred, target)   } ) fitted <- net %>%    setup(metrics = list(luz_metric_accuracy)) %>%    fit(train_dl, epochs = 10, valid_data = test_dl)"},{"path":"/articles/custom-loop.html","id":"fully-flexible-step","dir":"Articles","previous_headings":"","what":"Fully flexible step","title":"Custom loops with luz","text":"Instead implementing loss() method, can implement step() method. allows us flexibly modify happens training validating batch dataset. now responsible updating weights stepping optimizers back-propagating loss. important things notice : step() method used training validation. need careful modify weights training. , can get complete information regarding context object using help(\"ctx\"). ctx$optimizers named list holding optimizer created set_optimizers() method called. need manually track losses saving saving named list ctx$loss. convention, use name optimizer refers . 
good practice detach() saving reduce memory usage. Callbacks called inside default step() method like on_train_batch_after_pred, on_train_batch_after_loss, etc, won’t automatically called. can still call manually adding ctx$call_callbacks(\"<callback name>\") inside training step. See code fit_one_batch() valid_one_batch find callbacks won’t called. want luz metrics work custom step() method, must assign ctx$pred model predictions metrics always called metric$update(ctx$pred, ctx$target).","code":"net <- nn_module(   \"Net\",   initialize = function() {     self$fc1 <- nn_linear(100, 50)     self$fc2 <- nn_linear(50, 10)   },   forward = function(x) {     x %>%        self$fc1() %>%        nnf_relu() %>%        self$fc2()   },   set_optimizers = function(lr_fc1 = 0.1, lr_fc2 = 0.01) {     list(       opt_fc1 = optim_adam(self$fc1$parameters, lr = lr_fc1),       opt_fc2 = optim_adam(self$fc2$parameters, lr = lr_fc2)     )   },   step = function() {     ctx$loss <- list()     for (opt_name in names(ctx$optimizers)) {            ctx$pred <- ctx$model(ctx$input)       opt <- ctx$optimizers[[opt_name]]       loss <- nnf_cross_entropy(ctx$pred, ctx$target)              if (opt_name == \"opt_fc1\") {         # we have L1 regularization in layer 1         loss <- nnf_cross_entropy(ctx$pred, ctx$target) +            torch_norm(self$fc1$weight, p = 1)       }                if (ctx$training) {         opt$zero_grad()         loss$backward()         opt$step()         }              ctx$loss[[opt_name]] <- loss$detach()     }   } )"},{"path":"/articles/custom-loop.html","id":"next-steps","dir":"Articles","previous_headings":"","what":"Next steps","title":"Custom loops with luz","text":"article learned customize step() training loop using luz layered functionality. Luz also allows flexible modifications training loop described Accelerator vignette (vignette(\"accelerator\")). 
now able follow examples marked ‘intermediate’ ‘advanced’ category examples gallery.","code":""},{"path":"/articles/get-started.html","id":"training-a-nn_module","dir":"Articles","previous_headings":"","what":"Training a nn_module","title":"Get started with luz","text":"much possible, luz tries reuse existing structures torch. model luz defined identically define using raw torch. specific example, definition feed-forward CNN can used classify digits MNIST dataset: can now train model train_dl validate test_dl torch::dataloaders() : Let’s understand happens chunk code: setup function allows configure loss (objective) function optimizer use train model. Optionally can pass list metrics tracked training procedure. Note: loss function can function taking input target tensors returning scalar tensor value, optimizer can core torch optimizer custom ones created torch::optimizer() function. set_hparams() function allows set hyper-parameters passed module initialize() method. example case pass num_classes = 10. set_opt_hparams() function allows pass hyper-parameters used optimizer function. example, optim_adam() can take lr parameter specifying learning rate specify lr = 0.003. fit method take model specification provided setup() run training procedure using specified training validation torch::dataloaders() well number epochs. Note: reuse core torch data structures, instead providing data loading functionality. returned object fitted contains trained model well record metrics losses produced training. can also used producing predictions evaluating trained model datasets. fitting, luz use fastest possible accelerator; CUDA-capable GPU available used, otherwise fall back CPU. also automatically moves data, optimizers, models selected device don’t need handle manually (general error prone). 
create predictions trained model can use predict method:","code":"net <- nn_module(   \"Net\",   initialize = function(num_class) {     self$conv1 <- nn_conv2d(1, 32, 3, 1)     self$conv2 <- nn_conv2d(32, 64, 3, 1)     self$dropout1 <- nn_dropout2d(0.25)     self$dropout2 <- nn_dropout2d(0.5)     self$fc1 <- nn_linear(9216, 128)     self$fc2 <- nn_linear(128, num_class)   },   forward = function(x) {     x <- self$conv1(x)     x <- nnf_relu(x)     x <- self$conv2(x)     x <- nnf_relu(x)     x <- nnf_max_pool2d(x, 2)     x <- self$dropout1(x)     x <- torch_flatten(x, start_dim = 2)     x <- self$fc1(x)     x <- nnf_relu(x)     x <- self$dropout2(x)     x <- self$fc2(x)     x   } ) fitted <- net %>%   setup(     loss = nn_cross_entropy_loss(),     optimizer = optim_adam,     metrics = list(       luz_metric_accuracy     )   ) %>%   set_hparams(num_class = 10) %>%    set_opt_hparams(lr = 0.003) %>%    fit(train_dl, epochs = 10, valid_data = test_dl) predictions <- predict(fitted, test_dl)"},{"path":"/articles/get-started.html","id":"the-training-loop","dir":"Articles","previous_headings":"","what":"The training loop","title":"Get started with luz","text":"now general idea use fit function now ’s important overview ’s happening inside . pseudocode, ’s fit . fully detailed help build intuition:","code":"# -> Initialize objects: model, optimizers. # -> Select fitting device. # -> Move data, model, optimizers to the selected device. # -> Start training for (epoch in 1:epochs) {   # -> Training procedure   for (batch in train_dl) {     # -> Calculate model `forward` method.     # -> Calculate the loss     # -> Update weights     # -> Update metrics and tracking loss   }   # -> Validation procedure   for (batch in valid_dl) {     # -> Calculate model `forward` method.     
# -> Calculate the loss     # -> Update metrics and tracking loss   } } # -> End training"},{"path":"/articles/get-started.html","id":"metrics","dir":"Articles","previous_headings":"","what":"Metrics","title":"Get started with luz","text":"One important parts machine learning projects choosing evaluation metric. Luz allows tracking many different metrics training minimal code changes. order track metrics, need modify metrics parameter setup function: Luz provides implementations used metrics. metric available can always implement new one using luz_metric function. order implement new luz_metric need implement 3 methods: initialize: defines metric initial state. function called epoch training validation loops. update: updates metric internal state. function called every training validation step predictions obtained model target values obtained dataloader. compute: uses internal state compute metric values. function called whenever need obtain current metric value. Eg, ’s called every training step metrics displayed progress bar, called per epoch record ’s value progress bar displayed. Optionally, can implement abbrev field gives metric abbreviation used displaying metric information console tracking record. abbrev passed, class name used. Let’s take look implementation luz_metric_accuracy can see implement new one: Note: ’s good practice compute metric returns regular R values instead torch tensors parts luz expect .","code":"fitted <- net %>%   setup(     ...     metrics = list(       luz_metric_accuracy     )   ) %>%   fit(...) luz_metric_accuracy <- luz_metric(   # An abbreviation to be shown in progress bars, or    # when printing progress   abbrev = \"Acc\",    # Initial setup for the metric. Metrics are initialized   # every epoch, for both training and validation   initialize = function() {     self$correct <- 0     self$total <- 0   },   # Run at every training or validation step and updates   # the internal state. 
The update function takes `preds`   # and `target` as parameters.   update = function(preds, target) {     pred <- torch::torch_argmax(preds, dim = 2)     self$correct <- self$correct + (pred == target)$       to(dtype = torch::torch_float())$       sum()$       item()     self$total <- self$total + pred$numel()   },   # Use the internal state to query the metric value   compute = function() {     self$correct/self$total   } )"},{"path":"/articles/get-started.html","id":"evaluate","dir":"Articles","previous_headings":"","what":"Evaluate","title":"Get started with luz","text":"model trained might want evaluate performance different dataset. reason, luz provides ?evaluate function takes fitted model dataset computes metrics attached model. Evaluate returns luz_module_evaluation object can query metrics using get_metrics function simply print see results. example:","code":"evaluation <- fitted %>% evaluate(data = valid_dl) metrics <- get_metrics(evaluation) print(evaluation) #> A `luz_module_evaluation` #> -- Results --------------------------------------------------------------------- #> loss: 1.8892 #> mae: 1.0522 #> mse: 1.645 #> rmse: 1.2826"},{"path":"/articles/get-started.html","id":"customizing-with-callbacks","dir":"Articles","previous_headings":"","what":"Customizing with callbacks","title":"Get started with luz","text":"Luz provides different ways customize training progress depending level control need training loop. fastest way ‘reusable’, sense can create training modifications can used many different situations, via callbacks. training loop luz many breakpoints can call arbitrary R functions. functionality allows customize training process without modify general training logic. Luz implements 3 default callbacks occur every training procedure: train-eval callback: Sets model train() eval() depending procedure training validation. metrics callback: evaluate metrics training validation process. 
progress callback: implements progress bar prints progress information training. can also implement custom callbacks modify act specifically training procedure. example: Let’s implement callback prints ‘Iteration n’ (n iteration number) every batch training set ‘Done’ epoch finished. task use luz_callback function: luz_callback() takes named functions ... arguments, name indicates moment callback called. instance on_train_batch_end() called every batch end training procedure, on_epoch_end() called end every epoch. returned value luz_callback() function initializes instance callback. Callbacks can initialization parameters, like name file want log results. case, can pass initialize method creating callback definition, save parameters self object. example, callback message parameter printed end epoch. callback defined can passed fit function via callbacks parameter: Callbacks can called many different positions training loop, including combinations . ’s overview possible callback breakpoints: Every step marked on_* point training procedure available callbacks called. important part callbacks ctx (context) object. See help(\"ctx\") details. default, callbacks called order passed fit (predict evaluate), can provide weight attribute control order called. example, one callback weight = 10 another weight = 1, first one called second one. Callbacks don’t specify weight attribute considered weight = 0. built-callbacks luz already provide weight value. example, ?luz_callback_early_stopping weight Inf, since general want run last thing loop. ctx object used luz share information training loop callbacks, model methods, metrics. table describes information available ctx default. callbacks potentially modify attributes add new ones. Context attributes Attributes ctx can used produce desired behavior callbacks. can find information context object using help(\"ctx\"). 
example, use ctx$iter attribute print iteration number training batch.","code":"print_callback <- luz_callback(   name = \"print_callback\",   initialize = function(message) {     self$message <- message   },   on_train_batch_end = function() {     cat(\"Iteration \", ctx$iter, \"\\n\")   },   on_epoch_end = function() {     cat(self$message, \"\\n\")   } ) fitted <- net %>%   setup(...) %>%   fit(..., callbacks = list(     print_callback(message = \"Done!\")   )) Start Fit    - on_fit_begin   Start Epoch Loop      - on_epoch_begin     Start Train        - on_train_begin       Start Batch Loop          - on_train_batch_begin           Start Default Training Step             - on_train_batch_after_pred             - on_train_batch_after_loss             - on_train_batch_before_backward             - on_train_batch_before_step             - on_train_batch_after_step           End Default Training Step:          - on_train_batch_end       End Batch Loop        - on_train_end     End Train     Start Valid        - on_valid_begin       Start Batch Loop          - on_valid_batch_begin           Start Default Validation Step             - on_valid_batch_after_pred             - on_valid_batch_after_loss           End Default Validation Step          - on_valid_batch_end       End Batch Loop        - on_valid_end     End Valid       - on_epoch_end   End Epoch Loop    - on_fit_end End Fit"},{"path":"/articles/get-started.html","id":"next-steps","dir":"Articles","previous_headings":"","what":"Next steps","title":"Get started with luz","text":"article learned train first model using luz basics customization using custom metrics callbacks. Luz also allows flexible modifications training loop described vignette(\"custom-loop\"). now able follow examples marked ‘basic’ category examples gallery.","code":""},{"path":"/authors.html","id":null,"dir":"","previous_headings":"","what":"Authors","title":"Authors and Citation","text":"Daniel Falbel. Author, maintainer, copyright holder. 
RStudio. Copyright holder.","code":""},{"path":"/authors.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Authors and Citation","text":"Falbel D (2023). luz: Higher Level 'API' 'torch'. https://mlverse.github.io/luz/, https://github.com/mlverse/luz.","code":"@Manual{,   title = {luz: Higher Level 'API' for 'torch'},   author = {Daniel Falbel},   year = {2023},   note = {https://mlverse.github.io/luz/, https://github.com/mlverse/luz}, }"},{"path":"/index.html","id":"luz","dir":"","previous_headings":"","what":"Higher Level API for torch","title":"Higher Level API for torch","text":"Luz higher level API torch providing abstractions allow much less verbose training loops. package still development. heavily inspired higher level frameworks deep learning, cite : FastAI: heavily inspired FastAI library, especially Learner object callbacks API. Keras: also heavily inspired Keras, especially callback names. lightning module interface similar compile, . PyTorch Lightning: idea luz_module subclass nn_module inspired LightningModule object lightning. HuggingFace Accelerate: internal device placement API heavily inspired Accelerate, much modest features. Currently CPU Single GPU supported.","code":""},{"path":"/index.html","id":"installation","dir":"","previous_headings":"","what":"Installation","title":"Higher Level API for torch","text":"can install released version CRAN : development version :","code":"install.packages(\"luz\") remotes::install_github(\"mlverse/luz\")"},{"path":"/index.html","id":"example","dir":"","previous_headings":"","what":"Example","title":"Higher Level API for torch","text":"Luz lets take torch nn_module definition fit dataloader, handling boring parts like moving data devices, updating weights, showing progress bars tracking metrics. ’s example defining training Autoencoder MNIST dataset. selected parts code highlight luz functionality. can find full example code . 
Now defined Autoencoder architecture using torch::nn_module(), can fit using luz:","code":"net <- nn_module(   \"Net\",   initialize = function() {     self$encoder <- nn_sequential(       nn_conv2d(1, 6, kernel_size=5),       nn_relu(),       nn_conv2d(6, 16, kernel_size=5),       nn_relu()     )     self$decoder <- nn_sequential(       nn_conv_transpose2d(16, 6, kernel_size = 5),       nn_relu(),       nn_conv_transpose2d(6, 1, kernel_size = 5),       nn_sigmoid()     )   },   forward = function(x) {     x %>%       self$encoder() %>%       self$decoder()   } ) fitted <- net %>%   setup(     loss = nn_mse_loss(),     optimizer = optim_adam   ) %>%   fit(train_dl, epochs = 1, valid_data = test_dl)"},{"path":"/reference/accelerator.html","id":null,"dir":"Reference","previous_headings":"","what":"Create an accelerator — accelerator","title":"Create an accelerator — accelerator","text":"Create accelerator","code":""},{"path":"/reference/accelerator.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Create an accelerator — accelerator","text":"","code":"accelerator(   device_placement = TRUE,   cpu = FALSE,   cuda_index = torch::cuda_current_device() )"},{"path":"/reference/accelerator.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Create an accelerator — accelerator","text":"device_placement (logical) whether accelerator object handle device placement. Default: TRUE cpu (logical) whether training procedure run CPU. cuda_index (integer) index CUDA device use multiple GPUs available. 
Default: result torch::cuda_current_device().","code":""},{"path":"/reference/as_dataloader.html","id":null,"dir":"Reference","previous_headings":"","what":"Creates a dataloader from its input — as_dataloader","title":"Creates a dataloader from its input — as_dataloader","text":"as_dataloader used internally luz convert input data valid_data passed fit.luz_module_generator() torch::dataloader","code":""},{"path":"/reference/as_dataloader.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Creates a dataloader from its input — as_dataloader","text":"","code":"as_dataloader(x, ...)  # S3 method for dataset as_dataloader(x, ..., batch_size = 32)  # S3 method for iterable_dataset as_dataloader(x, ..., batch_size = 32)  # S3 method for list as_dataloader(x, ...)  # S3 method for dataloader as_dataloader(x, ...)  # S3 method for matrix as_dataloader(x, ...)  # S3 method for numeric as_dataloader(x, ...)  # S3 method for array as_dataloader(x, ...)  # S3 method for torch_tensor as_dataloader(x, ...)"},{"path":"/reference/as_dataloader.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Creates a dataloader from its input — as_dataloader","text":"x input object. ... Passed torch::dataloader(). batch_size (int, optional): many samples per batch load (default: 32).","code":""},{"path":"/reference/as_dataloader.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Creates a dataloader from its input — as_dataloader","text":"as_dataloader methods sensible defaults batch_size, parallel workers, etc. 
allows users quickly experiment fit.luz_module_generator() requiring create torch::dataset torch::dataloader simple experiments.","code":""},{"path":"/reference/as_dataloader.html","id":"methods-by-class-","dir":"Reference","previous_headings":"","what":"Methods (by class)","title":"Creates a dataloader from its input — as_dataloader","text":"as_dataloader(dataset): Converts torch::dataset() torch::dataloader(). as_dataloader(iterable_dataset): Converts torch::iterable_dataset() torch::dataloader() as_dataloader(list): Converts list tensors arrays size first dimension  torch::dataloader() as_dataloader(dataloader): Returns dataloader as_dataloader(matrix): Converts matrix dataloader as_dataloader(numeric): Converts numeric vector dataloader as_dataloader(array): Converts array dataloader as_dataloader(torch_tensor): Converts tensor dataloader","code":""},{"path":"/reference/as_dataloader.html","id":"overriding","dir":"Reference","previous_headings":"","what":"Overriding","title":"Creates a dataloader from its input — as_dataloader","text":"can implement as_dataloader S3 method want data structure automatically supported luz's fit.luz_module_generator(). method must satisfy following conditions: method return torch::dataloader(). required argument x. good default arguments. better avoid implementing as_dataloader methods common S3 classes like data.frames. case, better assign different class inputs implement as_dataloader .","code":""},{"path":"/reference/context.html","id":null,"dir":"Reference","previous_headings":"","what":"Context object — context","title":"Context object — context","text":"Context object storing information model training context. 
See also ctx.","code":""},{"path":"/reference/context.html","id":"public-fields","dir":"Reference","previous_headings":"","what":"Public fields","title":"Context object — context","text":"buffers list buffers callbacks can use write temporary information ctx.","code":""},{"path":"/reference/context.html","id":"active-bindings","dir":"Reference","previous_headings":"","what":"Active bindings","title":"Context object — context","text":"records stores information values logged self$log. device allows querying current accelerator device callbacks list callbacks called. iter current iteration batch current batch data. list input data targets. input shortcut ctx$batch[[1]] target shortcut ctx$batch[[2]] min_epochs minimum number epochs model run . max_epochs maximum number epochs model run. hparams list hyperparameters used initialize ctx$model. opt_hparams list hyperparameters used initialize ctx$optimizers. train_data dataloader used training model valid_data dataloader using model validation accelerator accelerator() used move data, model etc correct device. optimizers named list optimizers used model training. verbose bool whether process verbose mode . handlers List error handlers can used. See rlang::try_fetch() info. epoch_handlers List error handlers can used. See rlang::try_fetch() info. training bool indicating model training validation mode. model model trained. pred Last predicted values. opt Current optimizer. opt_name Current optimizer name. data Current dataloader use. loss_fn Loss function used train model loss Last computed loss values. Detached graph. loss_grad Last computed loss value, detached, can additional transformation. epoch Current epoch. metrics List metrics tracked process. step_opt Defines step called optimizer. 
must function taking optimizer argument.","code":""},{"path":[]},{"path":"/reference/context.html","id":"public-methods","dir":"Reference","previous_headings":"","what":"Public methods","title":"Context object — context","text":"context$new() context$log() context$log_metric() context$get_log() context$get_metrics() context$get_metric() context$get_formatted_metrics() context$get_metrics_df() context$set_verbose() context$clean() context$call_callbacks() context$state_dict() context$unsafe_set_records() context$clone()","code":""},{"path":"/reference/context.html","id":"method-new-","dir":"Reference","previous_headings":"","what":"Method new()","title":"Context object — context","text":"Initializes context object minimal necessary information.","code":""},{"path":"/reference/context.html","id":"usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Context object — context","text":"","code":"context$new(verbose, accelerator, callbacks, training)"},{"path":"/reference/context.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Context object — context","text":"verbose Whether context verbose mode . accelerator luz accelerator() configures device placement others. callbacks list callbacks used model. See luz_callback(). training boolean indicates context training mode .","code":""},{"path":"/reference/context.html","id":"method-log-","dir":"Reference","previous_headings":"","what":"Method log()","title":"Context object — context","text":"Allows logging arbitrary information ctx.","code":""},{"path":"/reference/context.html","id":"usage-1","dir":"Reference","previous_headings":"","what":"Usage","title":"Context object — context","text":"","code":"context$log(what, set, value, index = NULL, append = TRUE)"},{"path":"/reference/context.html","id":"arguments-1","dir":"Reference","previous_headings":"","what":"Arguments","title":"Context object — context","text":"(string) logging. 
set (string) Usually 'train' 'valid' indicating set want log . can arbitrary info. value value log value Arbitrary value log. index Index value logged. NULL value added end list, otherwise index used. append TRUE value corresponding index already exists, value appended current value. FALSE value overwritten favor new value.","code":""},{"path":"/reference/context.html","id":"method-log-metric-","dir":"Reference","previous_headings":"","what":"Method log_metric()","title":"Context object — context","text":"Log metric given name value. Metric values indexed epoch.","code":""},{"path":"/reference/context.html","id":"usage-2","dir":"Reference","previous_headings":"","what":"Usage","title":"Context object — context","text":"","code":"context$log_metric(name, value)"},{"path":"/reference/context.html","id":"arguments-2","dir":"Reference","previous_headings":"","what":"Arguments","title":"Context object — context","text":"name name metric value value log value Arbitrary value log.","code":""},{"path":"/reference/context.html","id":"method-get-log-","dir":"Reference","previous_headings":"","what":"Method get_log()","title":"Context object — context","text":"Get specific value log.","code":""},{"path":"/reference/context.html","id":"usage-3","dir":"Reference","previous_headings":"","what":"Usage","title":"Context object — context","text":"","code":"context$get_log(what, set, index = NULL)"},{"path":"/reference/context.html","id":"arguments-3","dir":"Reference","previous_headings":"","what":"Arguments","title":"Context object — context","text":"(string) logging. set (string) Usually 'train' 'valid' indicating set want log . can arbitrary info. index Index value logged. 
NULL value added end list, otherwise index used.","code":""},{"path":"/reference/context.html","id":"method-get-metrics-","dir":"Reference","previous_headings":"","what":"Method get_metrics()","title":"Context object — context","text":"Get metric given epoch set.","code":""},{"path":"/reference/context.html","id":"usage-4","dir":"Reference","previous_headings":"","what":"Usage","title":"Context object — context","text":"","code":"context$get_metrics(set, epoch = NULL)"},{"path":"/reference/context.html","id":"arguments-4","dir":"Reference","previous_headings":"","what":"Arguments","title":"Context object — context","text":"set (string) Usually 'train' 'valid' indicating set want log . can arbitrary info. epoch epoch want extract metrics .","code":""},{"path":"/reference/context.html","id":"method-get-metric-","dir":"Reference","previous_headings":"","what":"Method get_metric()","title":"Context object — context","text":"Get value metric given name, epoch set.","code":""},{"path":"/reference/context.html","id":"usage-5","dir":"Reference","previous_headings":"","what":"Usage","title":"Context object — context","text":"","code":"context$get_metric(name, set, epoch = NULL)"},{"path":"/reference/context.html","id":"arguments-5","dir":"Reference","previous_headings":"","what":"Arguments","title":"Context object — context","text":"name name metric set (string) Usually 'train' 'valid' indicating set want log . can arbitrary info. 
epoch epoch want extract metrics .","code":""},{"path":"/reference/context.html","id":"method-get-formatted-metrics-","dir":"Reference","previous_headings":"","what":"Method get_formatted_metrics()","title":"Context object — context","text":"Get formatted metrics values","code":""},{"path":"/reference/context.html","id":"usage-6","dir":"Reference","previous_headings":"","what":"Usage","title":"Context object — context","text":"","code":"context$get_formatted_metrics(set, epoch = NULL)"},{"path":"/reference/context.html","id":"arguments-6","dir":"Reference","previous_headings":"","what":"Arguments","title":"Context object — context","text":"set (string) Usually 'train' 'valid' indicating set want log . can arbitrary info. epoch epoch want extract metrics .","code":""},{"path":"/reference/context.html","id":"method-get-metrics-df-","dir":"Reference","previous_headings":"","what":"Method get_metrics_df()","title":"Context object — context","text":"Get data.frame containing metrics.","code":""},{"path":"/reference/context.html","id":"usage-7","dir":"Reference","previous_headings":"","what":"Usage","title":"Context object — context","text":"","code":"context$get_metrics_df()"},{"path":"/reference/context.html","id":"method-set-verbose-","dir":"Reference","previous_headings":"","what":"Method set_verbose()","title":"Context object — context","text":"Allows setting verbose attribute.","code":""},{"path":"/reference/context.html","id":"usage-8","dir":"Reference","previous_headings":"","what":"Usage","title":"Context object — context","text":"","code":"context$set_verbose(verbose = NULL)"},{"path":"/reference/context.html","id":"arguments-7","dir":"Reference","previous_headings":"","what":"Arguments","title":"Context object — context","text":"verbose boolean. TRUE verbose mode used. FALSE non verbose. 
NULL use result interactive().","code":""},{"path":"/reference/context.html","id":"method-clean-","dir":"Reference","previous_headings":"","what":"Method clean()","title":"Context object — context","text":"Removes unnecessary information context object.","code":""},{"path":"/reference/context.html","id":"usage-9","dir":"Reference","previous_headings":"","what":"Usage","title":"Context object — context","text":"","code":"context$clean()"},{"path":"/reference/context.html","id":"method-call-callbacks-","dir":"Reference","previous_headings":"","what":"Method call_callbacks()","title":"Context object — context","text":"Call selected callbacks. name callback types call, eg 'on_epoch_begin'.","code":""},{"path":"/reference/context.html","id":"usage-10","dir":"Reference","previous_headings":"","what":"Usage","title":"Context object — context","text":"","code":"context$call_callbacks(name)"},{"path":"/reference/context.html","id":"arguments-8","dir":"Reference","previous_headings":"","what":"Arguments","title":"Context object — context","text":"name name metric","code":""},{"path":"/reference/context.html","id":"method-state-dict-","dir":"Reference","previous_headings":"","what":"Method state_dict()","title":"Context object — context","text":"Returns list containing minimal information context. 
Used create returned values.","code":""},{"path":"/reference/context.html","id":"usage-11","dir":"Reference","previous_headings":"","what":"Usage","title":"Context object — context","text":"","code":"context$state_dict()"},{"path":"/reference/context.html","id":"method-unsafe-set-records-","dir":"Reference","previous_headings":"","what":"Method unsafe_set_records()","title":"Context object — context","text":"sure know ?","code":""},{"path":"/reference/context.html","id":"usage-12","dir":"Reference","previous_headings":"","what":"Usage","title":"Context object — context","text":"","code":"context$unsafe_set_records(records)"},{"path":"/reference/context.html","id":"arguments-9","dir":"Reference","previous_headings":"","what":"Arguments","title":"Context object — context","text":"records New set records set.","code":""},{"path":"/reference/context.html","id":"method-clone-","dir":"Reference","previous_headings":"","what":"Method clone()","title":"Context object — context","text":"objects class cloneable method.","code":""},{"path":"/reference/context.html","id":"usage-13","dir":"Reference","previous_headings":"","what":"Usage","title":"Context object — context","text":"","code":"context$clone(deep = FALSE)"},{"path":"/reference/context.html","id":"arguments-10","dir":"Reference","previous_headings":"","what":"Arguments","title":"Context object — context","text":"deep Whether make deep clone.","code":""},{"path":"/reference/ctx.html","id":null,"dir":"Reference","previous_headings":"","what":"Context object — ctx","title":"Context object — ctx","text":"Context objects used luz share information model methods, metrics callbacks.","code":""},{"path":"/reference/ctx.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Context object — ctx","text":"ctx object used luz share information training loop callbacks, model methods, metrics. table describes information available ctx default. callbacks potentially modify attributes add new ones. 
Context attributes","code":""},{"path":[]},{"path":"/reference/evaluate.html","id":null,"dir":"Reference","previous_headings":"","what":"Evaluates a fitted model on a dataset — evaluate","title":"Evaluates a fitted model on a dataset — evaluate","text":"Evaluates fitted model dataset","code":""},{"path":"/reference/evaluate.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Evaluates a fitted model on a dataset — evaluate","text":"","code":"evaluate(   object,   data,   ...,   metrics = NULL,   callbacks = list(),   accelerator = NULL,   verbose = NULL,   dataloader_options = NULL )"},{"path":"/reference/evaluate.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Evaluates a fitted model on a dataset — evaluate","text":"object fitted model evaluate. data (dataloader, dataset list) dataloader created torch::dataloader() used training model, dataset created torch::dataset() list. Dataloaders datasets must return list 2 items. first item used input module second used target loss function. ... Currently unused. metrics list luz metrics tracked evaluation. NULL (default) metrics used training tracked. callbacks (list, optional) list callbacks defined luz_callback() called training procedure. callbacks luz_callback_metrics(), luz_callback_progress() luz_callback_train_valid() always added default. accelerator (accelerator, optional) optional accelerator() object used configure device placement components like nn_modules, optimizers batches data. verbose (logical, optional) optional boolean value indicating fitting procedure emit output console training. default, produce output interactive() TRUE, otherwise print console. dataloader_options Options used creating dataloader. See torch::dataloader(). shuffle=TRUE default training data batch_size=32 default. 
error NULL data already dataloader.","code":""},{"path":"/reference/evaluate.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Evaluates a fitted model on a dataset — evaluate","text":"model trained might want evaluate performance different dataset. reason, luz provides ?evaluate function takes fitted model dataset computes metrics attached model. Evaluate returns luz_module_evaluation object can query metrics using get_metrics function simply print see results. example:","code":"evaluation <- fitted %>% evaluate(data = valid_dl) metrics <- get_metrics(evaluation) print(evaluation) ## A `luz_module_evaluation` ## -- Results --------------------------------------------------------------------- ## loss: 1.5146 ## mae: 1.0251 ## mse: 1.5159 ## rmse: 1.2312"},{"path":[]},{"path":"/reference/fit.luz_module_generator.html","id":null,"dir":"Reference","previous_headings":"","what":"Fit a nn_module — fit.luz_module_generator","title":"Fit a nn_module — fit.luz_module_generator","text":"Fit nn_module","code":""},{"path":"/reference/fit.luz_module_generator.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Fit a nn_module — fit.luz_module_generator","text":"","code":"# S3 method for luz_module_generator fit(   object,   data,   epochs = 10,   callbacks = NULL,   valid_data = NULL,   accelerator = NULL,   verbose = NULL,   ...,   dataloader_options = NULL )"},{"path":"/reference/fit.luz_module_generator.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Fit a nn_module — fit.luz_module_generator","text":"object nn_module setup(). data (dataloader, dataset list) dataloader created torch::dataloader() used training model, dataset created torch::dataset() list. Dataloaders datasets must return list 2 items. first item used input module second used target loss function. epochs (int) maximum number epochs training model. 
single value provided, taken max_epochs min_epochs set 0. vector two numbers provided, first value min_epochs second value max_epochs. minimum maximum number epochs included context object ctx$min_epochs ctx$max_epochs, respectively. callbacks (list, optional) list callbacks defined luz_callback() called training procedure. callbacks luz_callback_metrics(), luz_callback_progress() luz_callback_train_valid() always added default. valid_data (dataloader, dataset, list scalar value; optional) dataloader created torch::dataloader() dataset created torch::dataset() used validation procedure. must return list (input, target). data torch dataset list, can also supply numeric value 0 1 - case random sample size corresponding proportion data used validation. accelerator (accelerator, optional) optional accelerator() object used configure device placement components like nn_modules, optimizers batches data. verbose (logical, optional) optional boolean value indicating fitting procedure emit output console training. default, produce output interactive() TRUE, otherwise print console. ... Currently unused. dataloader_options Options used creating dataloader. See torch::dataloader(). shuffle=TRUE default training data batch_size=32 default. 
error NULL data already dataloader.","code":""},{"path":"/reference/fit.luz_module_generator.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Fit a nn_module — fit.luz_module_generator","text":"fitted object can saved luz_save() can printed print() plotted plot().","code":""},{"path":[]},{"path":"/reference/get_metrics.html","id":null,"dir":"Reference","previous_headings":"","what":"Get metrics from the object — get_metrics","title":"Get metrics from the object — get_metrics","text":"Get metrics object","code":""},{"path":"/reference/get_metrics.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Get metrics from the object — get_metrics","text":"","code":"get_metrics(object, ...)  # S3 method for luz_module_fitted get_metrics(object, ...)"},{"path":"/reference/get_metrics.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Get metrics from the object — get_metrics","text":"object object query metrics. ... 
Currently unused.","code":""},{"path":"/reference/get_metrics.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Get metrics from the object — get_metrics","text":"data.frame containing metric values.","code":""},{"path":"/reference/get_metrics.html","id":"methods-by-class-","dir":"Reference","previous_headings":"","what":"Methods (by class)","title":"Get metrics from the object — get_metrics","text":"get_metrics(luz_module_fitted): Extract metrics luz fitted model.","code":""},{"path":"/reference/lr_finder.html","id":null,"dir":"Reference","previous_headings":"","what":"Learning Rate Finder — lr_finder","title":"Learning Rate Finder — lr_finder","text":"Learning Rate Finder","code":""},{"path":"/reference/lr_finder.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Learning Rate Finder — lr_finder","text":"","code":"lr_finder(   object,   data,   steps = 100,   start_lr = 1e-07,   end_lr = 0.1,   log_spaced_intervals = TRUE,   ...,   verbose = NULL )"},{"path":"/reference/lr_finder.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Learning Rate Finder — lr_finder","text":"object nn_module setup(). data (dataloader) dataloader created torch::dataloader()  used learning rate finding. steps (integer) number steps iterate learning rate finder. Default: 100. start_lr (float) smallest learning rate. Default: 1e-7. end_lr (float) highest learning rate. Default: 1e-1. log_spaced_intervals (logical) Whether divide range start_lr end_lr log-spaced intervals (alternative: uniform intervals). Default: TRUE ... arguments passed fit. 
verbose Whether show progress bar process.","code":""},{"path":"/reference/lr_finder.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Learning Rate Finder — lr_finder","text":"dataframe two columns: learning rate loss","code":""},{"path":"/reference/lr_finder.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Learning Rate Finder — lr_finder","text":"","code":"if (torch::torch_is_installed()) { library(torch) ds <- torch::tensor_dataset(x = torch_randn(100, 10), y = torch_randn(100, 1)) dl <- torch::dataloader(ds, batch_size = 32) model <- torch::nn_linear model <- model %>% setup(   loss = torch::nn_mse_loss(),   optimizer = torch::optim_adam ) %>%   set_hparams(in_features = 10, out_features = 1) records <- lr_finder(model, dl, verbose = FALSE) plot(records) }"},{"path":"/reference/luz_callback.html","id":null,"dir":"Reference","previous_headings":"","what":"Create a new callback — luz_callback","title":"Create a new callback — luz_callback","text":"Create new callback","code":""},{"path":"/reference/luz_callback.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Create a new callback — luz_callback","text":"","code":"luz_callback(   name = NULL,   ...,   private = NULL,   active = NULL,   parent_env = parent.frame(),   inherit = NULL )"},{"path":"/reference/luz_callback.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Create a new callback — luz_callback","text":"name name callback ... Public methods callback. name methods used know called. See details section. private optional list private members, can functions non-functions. active optional list active binding functions. parent_env environment use parent newly-created objects. inherit R6ClassGenerator object inherit ; words, superclass. 
captured unevaluated expression evaluated parent_env time object instantiated.","code":""},{"path":"/reference/luz_callback.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Create a new callback — luz_callback","text":"luz_callback can passed fit.luz_module_generator().","code":""},{"path":"/reference/luz_callback.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Create a new callback — luz_callback","text":"Let’s implement callback prints ‘Iteration n’ (n iteration number) every batch training set ‘Done’ epoch finished. task use luz_callback function:   luz_callback() takes named functions ... arguments, name indicates moment callback called. instance on_train_batch_end() called every batch end training procedure, on_epoch_end() called end every epoch. returned value luz_callback() function initializes instance callback. Callbacks can initialization parameters, like name file want log results. case, can pass initialize method creating callback definition, save parameters self object. example, callback message parameter printed end epoch. callback defined can passed fit function via callbacks parameter:   Callbacks can called many different positions training loop, including combinations . ’s overview possible callback breakpoints:   Every step marked on_* point training procedure available callbacks called. important part callbacks ctx (context) object. See help(\"ctx\") details. default, callbacks called order passed fit (predict evaluate), can provide weight attribute control order called. example, one callback weight = 10 another weight = 1, first one called second one. Callbacks don’t specify weight attribute considered weight = 0. built-callbacks luz already provide weight value. 
example, ?luz_callback_early_stopping weight Inf, since general want run last thing loop.","code":"print_callback <- luz_callback(   name = \"print_callback\",   initialize = function(message) {     self$message <- message   },   on_train_batch_end = function() {     cat(\"Iteration \", ctx$iter, \"\\n\")   },   on_epoch_end = function() {     cat(self$message, \"\\n\")   } ) fitted <- net %>%   setup(...) %>%   fit(..., callbacks = list(     print_callback(message = \"Done!\")   )) Start Fit    - on_fit_begin   Start Epoch Loop      - on_epoch_begin     Start Train        - on_train_begin       Start Batch Loop          - on_train_batch_begin           Start Default Training Step             - on_train_batch_after_pred             - on_train_batch_after_loss             - on_train_batch_before_backward             - on_train_batch_before_step             - on_train_batch_after_step           End Default Training Step:          - on_train_batch_end       End Batch Loop        - on_train_end     End Train     Start Valid        - on_valid_begin       Start Batch Loop          - on_valid_batch_begin           Start Default Validation Step             - on_valid_batch_after_pred             - on_valid_batch_after_loss           End Default Validation Step          - on_valid_batch_end       End Batch Loop        - on_valid_end     End Valid       - on_epoch_end   End Epoch Loop    - on_fit_end End Fit"},{"path":"/reference/luz_callback.html","id":"prediction-callbacks","dir":"Reference","previous_headings":"","what":"Prediction callbacks","title":"Create a new callback — luz_callback","text":"can also use callbacks using predict(). 
case supported callback methods detailed .","code":"Start predict  - on_predict_begin  Start prediction loop   - on_predict_batch_begin   - on_predict_batch_end  End prediction loop  - on_predict_end End predict"},{"path":"/reference/luz_callback.html","id":"evaluate-callbacks","dir":"Reference","previous_headings":"","what":"Evaluate callbacks","title":"Create a new callback — luz_callback","text":"Callbacks can also used evaluate(), case, callbacks used equivalent validation loop using fit():","code":"Start Valid  - on_valid_begin  Start Batch Loop   - on_valid_batch_begin   Start Default Validation Step    - on_valid_batch_after_pred    - on_valid_batch_after_loss   End Default Validation Step   - on_valid_batch_end  End Batch Loop  - on_valid_end End Valid"},{"path":[]},{"path":"/reference/luz_callback.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Create a new callback — luz_callback","text":"","code":"print_callback <- luz_callback(  name = \"print_callback\",  on_train_batch_end = function() {    cat(\"Iteration \", ctx$iter, \"\\n\")  },  on_epoch_end = function() {    cat(\"Done!\\n\")  } )"},{"path":"/reference/luz_callback_auto_resume.html","id":null,"dir":"Reference","previous_headings":"","what":"Resume training callback — luz_callback_auto_resume","title":"Resume training callback — luz_callback_auto_resume","text":"callback allows resume training model.","code":""},{"path":"/reference/luz_callback_auto_resume.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Resume training callback — luz_callback_auto_resume","text":"","code":"luz_callback_auto_resume(path = \"./state.pt\")"},{"path":"/reference/luz_callback_auto_resume.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Resume training callback — luz_callback_auto_resume","text":"path Path save state files 
model.","code":""},{"path":"/reference/luz_callback_auto_resume.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Resume training callback — luz_callback_auto_resume","text":"using , model weights, optimizer state serialized end epoch. something fails training simply re-running script restart model training epoch right last epoch serialized.","code":""},{"path":"/reference/luz_callback_auto_resume.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"Resume training callback — luz_callback_auto_resume","text":"general want add callback last callbacks list, way, serialized state likely contain possible changes callbacks made 'on_epoch_end'. default weight attribute callback Inf. Read checkpointing article pkgdown website information.","code":""},{"path":"/reference/luz_callback_auto_resume.html","id":"customizing-serialization","dir":"Reference","previous_headings":"","what":"Customizing serialization","title":"Resume training callback — luz_callback_auto_resume","text":"default model, optimizer state records serialized. Callbacks can used customize serialization implementing state_dict() load_state_dict() methods. methods implemented, state_dict() called end epoch load_state_dict() called model resumed.","code":""},{"path":[]},{"path":"/reference/luz_callback_auto_resume.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Resume training callback — luz_callback_auto_resume","text":"","code":"if (torch::torch_is_installed()) { library(torch) library(luz)  x <- torch_randn(1000, 10) y <- torch_randn(1000, 1)  model <- nn_linear %>%   setup(optimizer = optim_sgd, loss = nnf_mse_loss) %>%   set_hparams(in_features = 10, out_features = 1) %>%   set_opt_hparams(lr = 0.01)   # simulate a failure in the middle of epoch 5 happening only once. 
callback_stop <- luz_callback(   \"interrupt\",   failed = FALSE,   on_epoch_end = function() {     if (ctx$epoch == 5 && !self$failed) {       self$failed <- TRUE       stop(\"Error on epoch 5\")     }   } )  path <- tempfile() autoresume <- luz_callback_auto_resume(path = path) interrupt <- callback_stop()  # try once and the model fails try({   results <- model %>% fit(     list(x, y),     callbacks = list(autoresume, interrupt),     verbose = FALSE   ) })  # model resumes and completes results <- model %>% fit(   list(x, y),   callbacks = list(autoresume, interrupt),   verbose = FALSE )  get_metrics(results)  } #> Error in FUN(X[[i]], ...) :  #>   Error while calling callback with class <interrupt/LuzCallback/R6> at #> on_epoch_end. #> Caused by error in `self[[callback_nm]]()`: #> ! Error on epoch 5 #>      set metric epoch    value #> 1  train   loss     1 1.302326 #> 2  train   loss     2 1.141849 #> 3  train   loss     3 1.094023 #> 4  train   loss     4 1.082328 #> 5  train   loss     5 1.083923 #> 6  train   loss     6 1.072870 #> 7  train   loss     7 1.083111 #> 8  train   loss     8 1.079866 #> 9  train   loss     9 1.074621 #> 10 train   loss    10 1.075743"},{"path":"/reference/luz_callback_csv_logger.html","id":null,"dir":"Reference","previous_headings":"","what":"CSV logger callback — luz_callback_csv_logger","title":"CSV logger callback — luz_callback_csv_logger","text":"Logs metrics obtained training file disk. 
file 1 line epoch/validation.","code":""},{"path":"/reference/luz_callback_csv_logger.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"CSV logger callback — luz_callback_csv_logger","text":"","code":"luz_callback_csv_logger(path)"},{"path":"/reference/luz_callback_csv_logger.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"CSV logger callback — luz_callback_csv_logger","text":"path path file disk.","code":""},{"path":[]},{"path":"/reference/luz_callback_early_stopping.html","id":null,"dir":"Reference","previous_headings":"","what":"Early stopping callback — luz_callback_early_stopping","title":"Early stopping callback — luz_callback_early_stopping","text":"Stops training monitored metric stops improving","code":""},{"path":"/reference/luz_callback_early_stopping.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Early stopping callback — luz_callback_early_stopping","text":"","code":"luz_callback_early_stopping(   monitor = \"valid_loss\",   min_delta = 0,   patience = 0,   mode = \"min\",   baseline = NULL )"},{"path":"/reference/luz_callback_early_stopping.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Early stopping callback — luz_callback_early_stopping","text":"monitor string format <set>_<metric> <set> can 'train' 'valid' <metric> can abbreviation metric tracking training. metric name case insensitive. min_delta Minimum improvement reset patience counter. patience Number epochs without improving stopping training. mode Specifies direction considered improvement. default 'min' used. Can also 'max' (higher better) 'zero' (closer zero better). baseline initial value used best seen value beginning. 
Model stop training better baseline value found first patience epochs.","code":""},{"path":"/reference/luz_callback_early_stopping.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Early stopping callback — luz_callback_early_stopping","text":"luz_callback early stopping.","code":""},{"path":"/reference/luz_callback_early_stopping.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"Early stopping callback — luz_callback_early_stopping","text":"callback adds on_early_stopping callback can used call callbacks soon model stops training. verbose=TRUE fit.luz_module_generator() message printed early stopping.","code":""},{"path":[]},{"path":"/reference/luz_callback_early_stopping.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Early stopping callback — luz_callback_early_stopping","text":"","code":"cb <- luz_callback_early_stopping()"},{"path":"/reference/luz_callback_gradient_clip.html","id":null,"dir":"Reference","previous_headings":"","what":"Gradient clipping callback — luz_callback_gradient_clip","title":"Gradient clipping callback — luz_callback_gradient_clip","text":"adding GradientClip callback, gradient norm_type (default:2) norm clipped max_norm (default:1) using torch::nn_utils_clip_grad_norm_(), can avoid loss divergence.","code":""},{"path":"/reference/luz_callback_gradient_clip.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Gradient clipping callback — luz_callback_gradient_clip","text":"","code":"luz_callback_gradient_clip(max_norm = 1, norm_type = 2)"},{"path":"/reference/luz_callback_gradient_clip.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Gradient clipping callback — luz_callback_gradient_clip","text":"max_norm (float int): max norm gradients norm_type (float int): type used p-norm. 
Can Inf infinity norm.","code":""},{"path":"/reference/luz_callback_gradient_clip.html","id":"references","dir":"Reference","previous_headings":"","what":"References","title":"Gradient clipping callback — luz_callback_gradient_clip","text":"See FastAI documentation GradientClip callback.","code":""},{"path":"/reference/luz_callback_interrupt.html","id":null,"dir":"Reference","previous_headings":"","what":"Interrupt callback — luz_callback_interrupt","title":"Interrupt callback — luz_callback_interrupt","text":"Adds handler allows interrupting training loop using ctrl + C. Also registers on_interrupt breakpoint users can register callbacks run training loop interruption.","code":""},{"path":"/reference/luz_callback_interrupt.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Interrupt callback — luz_callback_interrupt","text":"","code":"luz_callback_interrupt()"},{"path":"/reference/luz_callback_interrupt.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Interrupt callback — luz_callback_interrupt","text":"luz_callback","code":""},{"path":"/reference/luz_callback_interrupt.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"Interrupt callback — luz_callback_interrupt","text":"general need use callback always included default fit.luz_module_generator().","code":""},{"path":[]},{"path":"/reference/luz_callback_interrupt.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Interrupt callback — luz_callback_interrupt","text":"","code":"interrupt_callback <- luz_callback_interrupt()"},{"path":"/reference/luz_callback_keep_best_model.html","id":null,"dir":"Reference","previous_headings":"","what":"Keep the best model — luz_callback_keep_best_model","title":"Keep the best model — luz_callback_keep_best_model","text":"epoch, improvement monitored metric serialize model weights temp file. 
training done, reload weights best model.","code":""},{"path":"/reference/luz_callback_keep_best_model.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Keep the best model — luz_callback_keep_best_model","text":"","code":"luz_callback_keep_best_model(   monitor = \"valid_loss\",   mode = \"min\",   min_delta = 0 )"},{"path":"/reference/luz_callback_keep_best_model.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Keep the best model — luz_callback_keep_best_model","text":"monitor string format <set>_<metric> <set> can 'train' 'valid' <metric> can abbreviation metric tracking training. metric name case insensitive. mode Specifies direction considered improvement. default 'min' used. Can also 'max' (higher better) 'zero' (closer zero better). min_delta Minimum improvement reset patience counter.","code":""},{"path":[]},{"path":"/reference/luz_callback_keep_best_model.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Keep the best model — luz_callback_keep_best_model","text":"","code":"cb <- luz_callback_keep_best_model()"},{"path":"/reference/luz_callback_lr_scheduler.html","id":null,"dir":"Reference","previous_headings":"","what":"Learning rate scheduler callback — luz_callback_lr_scheduler","title":"Learning rate scheduler callback — luz_callback_lr_scheduler","text":"Initializes runs torch::lr_scheduler()s.","code":""},{"path":"/reference/luz_callback_lr_scheduler.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Learning rate scheduler callback — luz_callback_lr_scheduler","text":"","code":"luz_callback_lr_scheduler(   lr_scheduler,   ...,   call_on = \"on_epoch_end\",   opt_name = NULL )"},{"path":"/reference/luz_callback_lr_scheduler.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Learning rate scheduler callback — 
luz_callback_lr_scheduler","text":"lr_scheduler torch::lr_scheduler() initialized optimizer ... parameters. ... Additional arguments passed lr_scheduler together optimizers. call_on callback breakpoint scheduler$step() called. Default 'on_epoch_end'. See luz_callback() information. opt_name name optimizer affected callback. match name given set_optimizers. module single optimizer, opt_name used.","code":""},{"path":"/reference/luz_callback_lr_scheduler.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Learning rate scheduler callback — luz_callback_lr_scheduler","text":"luz_callback() generator.","code":""},{"path":[]},{"path":"/reference/luz_callback_lr_scheduler.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Learning rate scheduler callback — luz_callback_lr_scheduler","text":"","code":"if (torch::torch_is_installed()) { cb <- luz_callback_lr_scheduler(torch::lr_step, step_size = 30) }"},{"path":"/reference/luz_callback_metrics.html","id":null,"dir":"Reference","previous_headings":"","what":"Metrics callback — luz_callback_metrics","title":"Metrics callback — luz_callback_metrics","text":"Tracks metrics passed setup() training validation.","code":""},{"path":"/reference/luz_callback_metrics.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Metrics callback — luz_callback_metrics","text":"","code":"luz_callback_metrics()"},{"path":"/reference/luz_callback_metrics.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Metrics callback — luz_callback_metrics","text":"luz_callback","code":""},{"path":"/reference/luz_callback_metrics.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Metrics callback — luz_callback_metrics","text":"callback takes care 2 ctx attributes: ctx$metrics: stores current metrics objects initialized epoch, update()d compute()d every batch. 
rarely need work metrics. ctx$records$metrics: Stores metrics per training/validation epoch. structure similar ctx$losses.","code":""},{"path":"/reference/luz_callback_metrics.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"Metrics callback — luz_callback_metrics","text":"general need explicitly use metrics callback used default fit.luz_module_generator().","code":""},{"path":[]},{"path":"/reference/luz_callback_mixed_precision.html","id":null,"dir":"Reference","previous_headings":"","what":"Automatic Mixed Precision callback — luz_callback_mixed_precision","title":"Automatic Mixed Precision callback — luz_callback_mixed_precision","text":"callback enable torch::local_autocast() training model forward loss computation. disable autocast scale loss backward() opt$step(). See information.","code":""},{"path":"/reference/luz_callback_mixed_precision.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Automatic Mixed Precision callback — luz_callback_mixed_precision","text":"","code":"luz_callback_mixed_precision(...)"},{"path":"/reference/luz_callback_mixed_precision.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Automatic Mixed Precision callback — luz_callback_mixed_precision","text":"... Passed torch::cuda_amp_grad_scaler().","code":""},{"path":"/reference/luz_callback_mixed_precision.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Automatic Mixed Precision callback — luz_callback_mixed_precision","text":"luz_callback","code":""},{"path":[]},{"path":"/reference/luz_callback_mixup.html","id":null,"dir":"Reference","previous_headings":"","what":"Mixup callback — luz_callback_mixup","title":"Mixup callback — luz_callback_mixup","text":"Implementation 'mixup: Beyond Empirical Risk Minimization'. today, tested categorical data, targets expected integers, one-hot encoded vectors. 
callback supposed used together nn_mixup_loss().","code":""},{"path":"/reference/luz_callback_mixup.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Mixup callback — luz_callback_mixup","text":"","code":"luz_callback_mixup(alpha = 0.4, ..., run_valid = FALSE, auto_loss = FALSE)"},{"path":"/reference/luz_callback_mixup.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Mixup callback — luz_callback_mixup","text":"alpha parameter beta distribution used sample mixing coefficients ... currently unused. Just force named arguments. run_valid run validation auto_loss automatically modify loss function? wrap loss function create mixup loss. TRUE make sure loss function apply reductions. run_valid=FALSE, loss mean reduced validation.","code":""},{"path":"/reference/luz_callback_mixup.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Mixup callback — luz_callback_mixup","text":"luz_callback","code":""},{"path":"/reference/luz_callback_mixup.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Mixup callback — luz_callback_mixup","text":"Overall, follow fastai implementation described . Namely, work single dataloader , randomly mixing two observations batch. linearly combine losses computed targets: loss(output, new_target) = weight * loss(output, target1) + (1-weight) * loss(output, target2) draw different mixing coefficients every pair. 
replace weight weight = max(weight, 1-weight) avoid duplicates.","code":""},{"path":[]},{"path":"/reference/luz_callback_mixup.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Mixup callback — luz_callback_mixup","text":"","code":"if (torch::torch_is_installed()) { mixup_callback <- luz_callback_mixup() }"},{"path":"/reference/luz_callback_model_checkpoint.html","id":null,"dir":"Reference","previous_headings":"","what":"Checkpoints model weights — luz_callback_model_checkpoint","title":"Checkpoints model weights — luz_callback_model_checkpoint","text":"saves checkpoints model according specified metric behavior.","code":""},{"path":"/reference/luz_callback_model_checkpoint.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Checkpoints model weights — luz_callback_model_checkpoint","text":"","code":"luz_callback_model_checkpoint(   path,   monitor = \"valid_loss\",   save_best_only = FALSE,   mode = \"min\",   min_delta = 0 )"},{"path":"/reference/luz_callback_model_checkpoint.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Checkpoints model weights — luz_callback_model_checkpoint","text":"path Path save model disk. path interpolated glue, can use attribute within ctx using '{ctx$epoch}'. Specially epoch monitor quantities already environment. specified path path directory (ends / \\), models saved name given epoch-{epoch:02d}-{self$monitor}-{monitor:.3f}.pt. See examples. can use sprintf() quickly format quantities, example:'{epoch:02d}'. monitor string format <set>_<metric> <set> can 'train' 'valid' <metric> can abbreviation metric tracking training. metric name case insensitive. save_best_only TRUE models saved improvement previously saved model. mode Specifies direction considered improvement. default 'min' used. Can also 'max' (higher better) 'zero' (closer zero better). min_delta Minimum difference consider improvement. 
used save_best_only=TRUE.","code":""},{"path":"/reference/luz_callback_model_checkpoint.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"Checkpoints model weights — luz_callback_model_checkpoint","text":"mode min_delta used save_best_only=TRUE. save_best_only overwrite saved models path parameter differentiate epochs. Read checkpointing article pkgdown website information.","code":""},{"path":[]},{"path":"/reference/luz_callback_model_checkpoint.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Checkpoints model weights — luz_callback_model_checkpoint","text":"","code":"luz_callback_model_checkpoint(path= \"path/to/dir\") #> <model_checkpoint_callback> #>   Inherits from: <monitor_metrics> #>   Public: #>     call: function (callback_nm)  #>     clone: function (deep = FALSE)  #>     compare: function (new, old)  #>     find_quantity: function ()  #>     fmt_path: function (path)  #>     initialize: function (path, monitor = \"valid_loss\", save_best_only = FALSE,  #>     min_delta: 0 #>     mode: min #>     monitor: valid_loss #>     on_epoch_end: function ()  #>     path: path/to/dir #>     save_best_only: FALSE #>     set_ctx: function (ctx)  luz_callback_model_checkpoint(path= \"path/to/dir/epoch-{epoch:02d}/model.pt\") #> <model_checkpoint_callback> #>   Inherits from: <monitor_metrics> #>   Public: #>     call: function (callback_nm)  #>     clone: function (deep = FALSE)  #>     compare: function (new, old)  #>     find_quantity: function ()  #>     fmt_path: function (path)  #>     initialize: function (path, monitor = \"valid_loss\", save_best_only = FALSE,  #>     min_delta: 0 #>     mode: min #>     monitor: valid_loss #>     on_epoch_end: function ()  #>     path: path/to/dir/epoch-{epoch:02d}/model.pt #>     save_best_only: FALSE #>     set_ctx: function (ctx)  luz_callback_model_checkpoint(path= \"path/to/dir/epoch-{epoch:02d}/model-{monitor:.2f}.pt\") #> 
<model_checkpoint_callback> #>   Inherits from: <monitor_metrics> #>   Public: #>     call: function (callback_nm)  #>     clone: function (deep = FALSE)  #>     compare: function (new, old)  #>     find_quantity: function ()  #>     fmt_path: function (path)  #>     initialize: function (path, monitor = \"valid_loss\", save_best_only = FALSE,  #>     min_delta: 0 #>     mode: min #>     monitor: valid_loss #>     on_epoch_end: function ()  #>     path: path/to/dir/epoch-{epoch:02d}/model-{monitor:.2f}.pt #>     save_best_only: FALSE #>     set_ctx: function (ctx)"},{"path":"/reference/luz_callback_profile.html","id":null,"dir":"Reference","previous_headings":"","what":"Profile callback — luz_callback_profile","title":"Profile callback — luz_callback_profile","text":"Computes times high-level operations training loops.","code":""},{"path":"/reference/luz_callback_profile.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Profile callback — luz_callback_profile","text":"","code":"luz_callback_profile()"},{"path":"/reference/luz_callback_profile.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Profile callback — luz_callback_profile","text":"luz_callback","code":""},{"path":"/reference/luz_callback_profile.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Profile callback — luz_callback_profile","text":"Records saved ctx$records$profile. Times stored seconds. Data stored following structure: fit time entire fit procedure. 
epoch times per epoch","code":""},{"path":"/reference/luz_callback_profile.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"Profile callback — luz_callback_profile","text":"general need use callback always included default fit.luz_module_generator().","code":""},{"path":[]},{"path":"/reference/luz_callback_profile.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Profile callback — luz_callback_profile","text":"","code":"profile_callback <- luz_callback_profile()"},{"path":"/reference/luz_callback_progress.html","id":null,"dir":"Reference","previous_headings":"","what":"Progress callback — luz_callback_progress","title":"Progress callback — luz_callback_progress","text":"Responsible printing progress training.","code":""},{"path":"/reference/luz_callback_progress.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Progress callback — luz_callback_progress","text":"","code":"luz_callback_progress()"},{"path":"/reference/luz_callback_progress.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Progress callback — luz_callback_progress","text":"luz_callback","code":""},{"path":"/reference/luz_callback_progress.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"Progress callback — luz_callback_progress","text":"general need use callback always included default fit.luz_module_generator(). 
Printing can disabled passing verbose=FALSE fit.luz_module_generator().","code":""},{"path":[]},{"path":"/reference/luz_callback_resume_from_checkpoint.html","id":null,"dir":"Reference","previous_headings":"","what":"Allow resume model training from a specific checkpoint — luz_callback_resume_from_checkpoint","title":"Allow resume model training from a specific checkpoint — luz_callback_resume_from_checkpoint","text":"Allow resume model training specific checkpoint","code":""},{"path":"/reference/luz_callback_resume_from_checkpoint.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Allow resume model training from a specific checkpoint — luz_callback_resume_from_checkpoint","text":"","code":"luz_callback_resume_from_checkpoint(   path,   ...,   restore_model_state = TRUE,   restore_records = FALSE,   restore_optimizer_state = FALSE,   restore_callbacks_state = FALSE )"},{"path":"/reference/luz_callback_resume_from_checkpoint.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Allow resume model training from a specific checkpoint — luz_callback_resume_from_checkpoint","text":"path Path checkpoint want resume. ... currently unused. restore_model_state Whether restore model state callback. restore_records Whether restore records checkpoint. restore_optimizer_state Whether restore optimizer state checkpoint. 
restore_callbacks_state Whether restore callbacks state checkpoint.","code":""},{"path":"/reference/luz_callback_resume_from_checkpoint.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"Allow resume model training from a specific checkpoint — luz_callback_resume_from_checkpoint","text":"Read checkpointing article pkgdown website information.","code":""},{"path":[]},{"path":"/reference/luz_callback_tfevents.html","id":null,"dir":"Reference","previous_headings":"","what":"tfevents callback — luz_callback_tfevents","title":"tfevents callback — luz_callback_tfevents","text":"Logs metrics model information tfevents file format. Assuming tensorboard installed, result can visualized ","code":""},{"path":"/reference/luz_callback_tfevents.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"tfevents callback — luz_callback_tfevents","text":"","code":"luz_callback_tfevents(logdir = \"logs\", histograms = FALSE, ...)"},{"path":"/reference/luz_callback_tfevents.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"tfevents callback — luz_callback_tfevents","text":"logdir directory log written . histograms boolean specifying histograms model weights logged. can also character vector specifying name parameters logged (names names(model$parameters)). ... Currently used. 
future expansion.","code":""},{"path":"/reference/luz_callback_tfevents.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"tfevents callback — luz_callback_tfevents","text":"","code":"tensorboard --logdir=logs"},{"path":"/reference/luz_callback_tfevents.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"tfevents callback — luz_callback_tfevents","text":"","code":"if (torch::torch_is_installed()) { library(torch) x <- torch_randn(1000, 10) y <- torch_randn(1000, 1)  model <- nn_linear %>%   setup(loss = nnf_mse_loss, optimizer = optim_adam) %>%   set_hparams(in_features = 10, out_features = 1) %>%   set_opt_hparams(lr = 1e-4)  tmp <- tempfile()  model %>% fit(list(x, y), valid_data = 0.2, callbacks = list(   luz_callback_tfevents(tmp, histograms = TRUE) )) } #> A `luz_module_fitted` #> ── Time ──────────────────────────────────────────────────────────────────────── #> • Total time: 2.7s #> • Avg time per training epoch: 197ms #>  #> ── Results ───────────────────────────────────────────────────────────────────── #> Metrics observed in the last epoch. #>  #> ℹ Training: #> loss: 1.4131 #>  #> ── Model ─────────────────────────────────────────────────────────────────────── #> An `nn_module` containing 11 parameters. 
#>  #> ── Parameters ────────────────────────────────────────────────────────────────── #> • weight: Float [1:1, 1:10] #> • bias: Float [1:1]"},{"path":"/reference/luz_callback_train_valid.html","id":null,"dir":"Reference","previous_headings":"","what":"Train-eval callback — luz_callback_train_valid","title":"Train-eval callback — luz_callback_train_valid","text":"Switches important flags training evaluation modes.","code":""},{"path":"/reference/luz_callback_train_valid.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Train-eval callback — luz_callback_train_valid","text":"","code":"luz_callback_train_valid()"},{"path":"/reference/luz_callback_train_valid.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Train-eval callback — luz_callback_train_valid","text":"luz_callback","code":""},{"path":"/reference/luz_callback_train_valid.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Train-eval callback — luz_callback_train_valid","text":"takes care three ctx attributes: ctx$model: Responsible calling ctx$model$train() ctx$model$eval(), appropriate. ctx$training: Sets flag TRUE training FALSE validation mode. ctx$loss: Resets loss attribute list() finished training/ validating.","code":""},{"path":"/reference/luz_callback_train_valid.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"Train-eval callback — luz_callback_train_valid","text":"general need explicitly use metrics callback used default fit.luz_module_generator().","code":""},{"path":[]},{"path":"/reference/luz_load.html","id":null,"dir":"Reference","previous_headings":"","what":"Load trained model — luz_load","title":"Load trained model — luz_load","text":"Loads fitted model. 
See documentation luz_save().","code":""},{"path":"/reference/luz_load.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Load trained model — luz_load","text":"","code":"luz_load(path)"},{"path":"/reference/luz_load.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Load trained model — luz_load","text":"path path file system save object.","code":""},{"path":[]},{"path":"/reference/luz_load_checkpoint.html","id":null,"dir":"Reference","previous_headings":"","what":"Loads a checkpoint — luz_load_checkpoint","title":"Loads a checkpoint — luz_load_checkpoint","text":"Works checkpoints created typically luz_callback_model_checkpoint().","code":""},{"path":"/reference/luz_load_checkpoint.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Loads a checkpoint — luz_load_checkpoint","text":"","code":"luz_load_checkpoint(obj, path, ...)"},{"path":"/reference/luz_load_checkpoint.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Loads a checkpoint — luz_load_checkpoint","text":"obj Object want load checkpoint. path Path checkpoint disk. ... unused. allow future extensions.","code":""},{"path":"/reference/luz_load_model_weights.html","id":null,"dir":"Reference","previous_headings":"","what":"Loads model weights into a fitted object. — luz_load_model_weights","title":"Loads model weights into a fitted object. — luz_load_model_weights","text":"can useful saved model checkpoints training want reload best checkpoint end.","code":""},{"path":"/reference/luz_load_model_weights.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Loads model weights into a fitted object. — luz_load_model_weights","text":"","code":"luz_load_model_weights(obj, path, ...)  
luz_save_model_weights(obj, path)"},{"path":"/reference/luz_load_model_weights.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Loads model weights into a fitted object. — luz_load_model_weights","text":"obj luz object want copy new weights. path path saved model disk. ... arguments passed torch_load().","code":""},{"path":"/reference/luz_load_model_weights.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Loads model weights into a fitted object. — luz_load_model_weights","text":"Returns NULL invisibly.","code":""},{"path":"/reference/luz_load_model_weights.html","id":"warning","dir":"Reference","previous_headings":"","what":"Warning","title":"Loads model weights into a fitted object. — luz_load_model_weights","text":"luz_save_model_weights operates inplace, ie modifies model object contain new weights.","code":""},{"path":"/reference/luz_metric.html","id":null,"dir":"Reference","previous_headings":"","what":"Creates a new luz metric — luz_metric","title":"Creates a new luz metric — luz_metric","text":"Creates new luz metric","code":""},{"path":"/reference/luz_metric.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Creates a new luz metric — luz_metric","text":"","code":"luz_metric(   name = NULL,   ...,   private = NULL,   active = NULL,   parent_env = parent.frame(),   inherit = NULL )"},{"path":"/reference/luz_metric.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Creates a new luz metric — luz_metric","text":"name string naming new metric. ... named list public methods. implement least initialize, update compute. See details section information. private optional list private members, can functions non-functions. active optional list active binding functions. parent_env environment use parent newly-created objects. inherit R6ClassGenerator object inherit ; words, superclass. 
captured unevaluated expression evaluated parent_env time object instantiated.","code":""},{"path":"/reference/luz_metric.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Creates a new luz metric — luz_metric","text":"Returns new luz metric.","code":""},{"path":"/reference/luz_metric.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Creates a new luz metric — luz_metric","text":"order implement new luz_metric need implement 3 methods: initialize: defines metric initial state. function called epoch training validation loops. update: updates metric internal state. function called every training validation step predictions obtained model target values obtained dataloader. compute: uses internal state compute metric values. function called whenever need obtain current metric value. Eg, ’s called every training step metrics displayed progress bar, called per epoch record ’s value progress bar displayed. Optionally, can implement abbrev field gives metric abbreviation used displaying metric information console tracking record. abbrev passed, class name used. Let’s take look implementation luz_metric_accuracy can see implement new one:   Note: ’s good practice compute metric returns regular R values instead torch tensors parts luz expect .","code":"luz_metric_accuracy <- luz_metric(   # An abbreviation to be shown in progress bars, or    # when printing progress   abbrev = \"Acc\",    # Initial setup for the metric. Metrics are initialized   # every epoch, for both training and validation   initialize = function() {     self$correct <- 0     self$total <- 0   },   # Run at every training or validation step and updates   # the internal state. The update function takes `preds`   # and `target` as parameters.   
update = function(preds, target) {     pred <- torch::torch_argmax(preds, dim = 2)     self$correct <- self$correct + (pred == target)$       to(dtype = torch::torch_float())$       sum()$       item()     self$total <- self$total + pred$numel()   },   # Use the internal state to query the metric value   compute = function() {     self$correct/self$total   } )"},{"path":[]},{"path":"/reference/luz_metric.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Creates a new luz metric — luz_metric","text":"","code":"luz_metric_accuracy <- luz_metric(   # An abbreviation to be shown in progress bars, or   # when printing progress   abbrev = \"Acc\",   # Initial setup for the metric. Metrics are initialized   # every epoch, for both training and validation   initialize = function() {     self$correct <- 0     self$total <- 0   },   # Run at every training or validation step and updates   # the internal state. The update function takes `preds`   # and `target` as parameters.   
update = function(preds, target) {     pred <- torch::torch_argmax(preds, dim = 2)     self$correct <- self$correct + (pred == target)$       to(dtype = torch::torch_float())$       sum()$       item()     self$total <- self$total + pred$numel()   },   # Use the internal state to query the metric value   compute = function() {     self$correct/self$total   } )"},{"path":"/reference/luz_metric_accuracy.html","id":null,"dir":"Reference","previous_headings":"","what":"Accuracy — luz_metric_accuracy","title":"Accuracy — luz_metric_accuracy","text":"Computes accuracy multi-class classification problems.","code":""},{"path":"/reference/luz_metric_accuracy.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Accuracy — luz_metric_accuracy","text":"","code":"luz_metric_accuracy()"},{"path":"/reference/luz_metric_accuracy.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Accuracy — luz_metric_accuracy","text":"Returns new luz metric.","code":""},{"path":"/reference/luz_metric_accuracy.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Accuracy — luz_metric_accuracy","text":"metric expects take logits probabilities every update. 
take columnwise argmax compare target.","code":""},{"path":[]},{"path":"/reference/luz_metric_accuracy.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Accuracy — luz_metric_accuracy","text":"","code":"if (torch::torch_is_installed()) { library(torch) metric <- luz_metric_accuracy() metric <- metric$new() metric$update(torch_randn(100, 10), torch::torch_randint(1, 10, size = 100)) metric$compute() } #> [1] 0.08"},{"path":"/reference/luz_metric_binary_accuracy.html","id":null,"dir":"Reference","previous_headings":"","what":"Binary accuracy — luz_metric_binary_accuracy","title":"Binary accuracy — luz_metric_binary_accuracy","text":"Computes accuracy binary classification problems model returns probabilities. Commonly used loss torch::nn_bce_loss().","code":""},{"path":"/reference/luz_metric_binary_accuracy.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Binary accuracy — luz_metric_binary_accuracy","text":"","code":"luz_metric_binary_accuracy(threshold = 0.5)"},{"path":"/reference/luz_metric_binary_accuracy.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Binary accuracy — luz_metric_binary_accuracy","text":"threshold value used classify observations 0 1.","code":""},{"path":"/reference/luz_metric_binary_accuracy.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Binary accuracy — luz_metric_binary_accuracy","text":"Returns new luz metric.","code":""},{"path":[]},{"path":"/reference/luz_metric_binary_accuracy.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Binary accuracy — luz_metric_binary_accuracy","text":"","code":"if (torch::torch_is_installed()) { library(torch) metric <- luz_metric_binary_accuracy(threshold = 0.5) metric <- metric$new() metric$update(torch_rand(100), torch::torch_randint(0, 1, size = 100)) metric$compute() } #> [1] 
0.56"},{"path":"/reference/luz_metric_binary_accuracy_with_logits.html","id":null,"dir":"Reference","previous_headings":"","what":"Binary accuracy with logits — luz_metric_binary_accuracy_with_logits","title":"Binary accuracy with logits — luz_metric_binary_accuracy_with_logits","text":"Computes accuracy binary classification problems model return logits. Commonly used together torch::nn_bce_with_logits_loss().","code":""},{"path":"/reference/luz_metric_binary_accuracy_with_logits.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Binary accuracy with logits — luz_metric_binary_accuracy_with_logits","text":"","code":"luz_metric_binary_accuracy_with_logits(threshold = 0.5)"},{"path":"/reference/luz_metric_binary_accuracy_with_logits.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Binary accuracy with logits — luz_metric_binary_accuracy_with_logits","text":"threshold value used classify observations 0 1.","code":""},{"path":"/reference/luz_metric_binary_accuracy_with_logits.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Binary accuracy with logits — luz_metric_binary_accuracy_with_logits","text":"Returns new luz metric.","code":""},{"path":"/reference/luz_metric_binary_accuracy_with_logits.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Binary accuracy with logits — luz_metric_binary_accuracy_with_logits","text":"Probabilities generated using torch::nnf_sigmoid() threshold used classify 0 1.","code":""},{"path":[]},{"path":"/reference/luz_metric_binary_accuracy_with_logits.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Binary accuracy with logits — luz_metric_binary_accuracy_with_logits","text":"","code":"if (torch::torch_is_installed()) { library(torch) metric <- luz_metric_binary_accuracy_with_logits(threshold = 0.5) metric <- metric$new() 
metric$update(torch_randn(100), torch::torch_randint(0, 1, size = 100)) metric$compute() } #> [1] 0.41"},{"path":"/reference/luz_metric_binary_auroc.html","id":null,"dir":"Reference","previous_headings":"","what":"Computes the area under the ROC — luz_metric_binary_auroc","title":"Computes the area under the ROC — luz_metric_binary_auroc","text":"avoid storing predictions targets epoch compute confusion matrices across range pre-established thresholds.","code":""},{"path":"/reference/luz_metric_binary_auroc.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Computes the area under the ROC — luz_metric_binary_auroc","text":"","code":"luz_metric_binary_auroc(   num_thresholds = 200,   thresholds = NULL,   from_logits = FALSE )"},{"path":"/reference/luz_metric_binary_auroc.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Computes the area under the ROC — luz_metric_binary_auroc","text":"num_thresholds Number thresholds used compute confusion matrices. case, thresholds created getting num_thresholds values linearly spaced unit interval. thresholds (optional) threshold passed, used compute confusion matrices num_thresholds ignored. 
from_logits Boolean indicating predictions logits, case use sigmoid put unit interval.","code":""},{"path":[]},{"path":"/reference/luz_metric_binary_auroc.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Computes the area under the ROC — luz_metric_binary_auroc","text":"","code":"if (torch::torch_is_installed()){ library(torch) actual <- c(1, 1, 1, 0, 0, 0) predicted <- c(0.9, 0.8, 0.4, 0.5, 0.3, 0.2)  y_true <- torch_tensor(actual) y_pred <- torch_tensor(predicted)  m <- luz_metric_binary_auroc(thresholds = predicted) m <- m$new()  m$update(y_pred[1:2], y_true[1:2]) m$update(y_pred[3:4], y_true[3:4]) m$update(y_pred[5:6], y_true[5:6])  m$compute() } #> [1] 0.8888889"},{"path":"/reference/luz_metric_mae.html","id":null,"dir":"Reference","previous_headings":"","what":"Mean absolute error — luz_metric_mae","title":"Mean absolute error — luz_metric_mae","text":"Computes mean absolute error.","code":""},{"path":"/reference/luz_metric_mae.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Mean absolute error — luz_metric_mae","text":"","code":"luz_metric_mae()"},{"path":"/reference/luz_metric_mae.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Mean absolute error — luz_metric_mae","text":"Returns new luz metric.","code":""},{"path":[]},{"path":"/reference/luz_metric_mae.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Mean absolute error — luz_metric_mae","text":"","code":"if (torch::torch_is_installed()) { library(torch) metric <- luz_metric_mae() metric <- metric$new() metric$update(torch_randn(100), torch_randn(100)) metric$compute() } #> [1] 1.008288"},{"path":"/reference/luz_metric_mse.html","id":null,"dir":"Reference","previous_headings":"","what":"Mean squared error — luz_metric_mse","title":"Mean squared error — luz_metric_mse","text":"Computes mean squared 
error","code":""},{"path":"/reference/luz_metric_mse.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Mean squared error — luz_metric_mse","text":"","code":"luz_metric_mse()"},{"path":"/reference/luz_metric_mse.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Mean squared error — luz_metric_mse","text":"luz_metric object.","code":""},{"path":[]},{"path":"/reference/luz_metric_multiclass_auroc.html","id":null,"dir":"Reference","previous_headings":"","what":"Computes the multi-class AUROC — luz_metric_multiclass_auroc","title":"Computes the multi-class AUROC — luz_metric_multiclass_auroc","text":"definition Keras used default. equivalent 'micro' method SciKit Learn . See docs.","code":""},{"path":"/reference/luz_metric_multiclass_auroc.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Computes the multi-class AUROC — luz_metric_multiclass_auroc","text":"","code":"luz_metric_multiclass_auroc(   num_thresholds = 200,   thresholds = NULL,   from_logits = FALSE,   average = c(\"micro\", \"macro\", \"weighted\", \"none\") )"},{"path":"/reference/luz_metric_multiclass_auroc.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Computes the multi-class AUROC — luz_metric_multiclass_auroc","text":"num_thresholds Number thresholds used compute confusion matrices. case, thresholds created getting num_thresholds values linearly spaced unit interval. thresholds (optional) threshold passed, used compute confusion matrices num_thresholds ignored. from_logits TRUE call torch::nnf_softmax() predictions computing metric. average averaging method: 'micro': Stack classes computes AUROC binary classification problem. 'macro': Finds AUROC class computes mean. 'weighted': Finds AUROC class computes weighted mean pondering number instances class. 
'none': Returns AUROC class list.","code":""},{"path":"/reference/luz_metric_multiclass_auroc.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Computes the multi-class AUROC — luz_metric_multiclass_auroc","text":"Note class imbalance can affect metric unlike AUC binary classification. Currently AUC approximated using 'interpolation' method described Keras.","code":""},{"path":[]},{"path":"/reference/luz_metric_multiclass_auroc.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Computes the multi-class AUROC — luz_metric_multiclass_auroc","text":"","code":"if (torch::torch_is_installed()) { library(torch) actual <- c(1, 1, 1, 0, 0, 0) + 1L predicted <- c(0.9, 0.8, 0.4, 0.5, 0.3, 0.2) predicted <- cbind(1-predicted, predicted)  y_true <- torch_tensor(as.integer(actual)) y_pred <- torch_tensor(predicted)  m <- luz_metric_multiclass_auroc(thresholds = as.numeric(predicted),                                  average = \"micro\") m <- m$new()  m$update(y_pred[1:2,], y_true[1:2]) m$update(y_pred[3:4,], y_true[3:4]) m$update(y_pred[5:6,], y_true[5:6]) m$compute() } #> [1] 0.9027778"},{"path":"/reference/luz_metric_rmse.html","id":null,"dir":"Reference","previous_headings":"","what":"Root mean squared error — luz_metric_rmse","title":"Root mean squared error — luz_metric_rmse","text":"Computes root mean squared error.","code":""},{"path":"/reference/luz_metric_rmse.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Root mean squared error — luz_metric_rmse","text":"","code":"luz_metric_rmse()"},{"path":"/reference/luz_metric_rmse.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Root mean squared error — luz_metric_rmse","text":"Returns new luz metric.","code":""},{"path":[]},{"path":"/reference/luz_metric_set.html","id":null,"dir":"Reference","previous_headings":"","what":"Creates a metric set — 
luz_metric_set","title":"Creates a metric set — luz_metric_set","text":"metric set can used specify metrics evaluated training, validation .","code":""},{"path":"/reference/luz_metric_set.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Creates a metric set — luz_metric_set","text":"","code":"luz_metric_set(metrics = NULL, train_metrics = NULL, valid_metrics = NULL)"},{"path":"/reference/luz_metric_set.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Creates a metric set — luz_metric_set","text":"metrics list luz_metrics meant used training validation. train_metrics list luz_metrics used training. valid_metrics list luz_metrics used validation.","code":""},{"path":"/reference/luz_save.html","id":null,"dir":"Reference","previous_headings":"","what":"Saves luz objects to disk — luz_save","title":"Saves luz objects to disk — luz_save","text":"Allows saving luz fitted models disk. Objects can loaded back luz_load().","code":""},{"path":"/reference/luz_save.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Saves luz objects to disk — luz_save","text":"","code":"luz_save(obj, path, ...)"},{"path":"/reference/luz_save.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Saves luz objects to disk — luz_save","text":"obj object class 'luz_module_fitted' returned fit.luz_module_generator(). path path file system save object. ... currently unused.","code":""},{"path":"/reference/luz_save.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"Saves luz objects to disk — luz_save","text":"Objects saved plain .rds files obj$model serialized torch_save saving .","code":""},{"path":"/reference/luz_save.html","id":"warning","dir":"Reference","previous_headings":"","what":"Warning","title":"Saves luz objects to disk — luz_save","text":"ctx naively serialized. Ie, use saveRDS() serialize . 
expect luz_save work correctly unserializable objects ctx like torch_tensors external pointers general.","code":""},{"path":[]},{"path":"/reference/nn_mixup_loss.html","id":null,"dir":"Reference","previous_headings":"","what":"Loss to be used with callbacks_mixup(). — nn_mixup_loss","title":"Loss to be used with callbacks_mixup(). — nn_mixup_loss","text":"training phase, computes individual losses regard two targets, weights item-wise, averages linear combinations yield mean batch loss. validation testing, defers passed-loss.","code":""},{"path":"/reference/nn_mixup_loss.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Loss to be used with callbacks_mixup(). — nn_mixup_loss","text":"","code":"nn_mixup_loss(loss)"},{"path":"/reference/nn_mixup_loss.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Loss to be used with callbacks_mixup(). — nn_mixup_loss","text":"loss underlying loss nn_module call. must support reduction field. training attribute changed 'none' get loss individual observations. See example documentation reduction argument torch::nn_cross_entropy_loss().","code":""},{"path":"/reference/nn_mixup_loss.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Loss to be used with callbacks_mixup(). 
— nn_mixup_loss","text":"used together luz_callback_mixup().","code":""},{"path":[]},{"path":"/reference/nnf_mixup.html","id":null,"dir":"Reference","previous_headings":"","what":"Mixup logic — nnf_mixup","title":"Mixup logic — nnf_mixup","text":"Logic underlying luz_callback_mixup().","code":""},{"path":"/reference/nnf_mixup.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Mixup logic — nnf_mixup","text":"","code":"nnf_mixup(x, y, weight)"},{"path":"/reference/nnf_mixup.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Mixup logic — nnf_mixup","text":"x input batch y target batch weight weighting coefficient used torch_lerp()","code":""},{"path":"/reference/nnf_mixup.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Mixup logic — nnf_mixup","text":"list : x, new, mixed-input batch y, list : ys, list : y1, original target y1 y2, mixed-target y2 weight, mixing weights","code":""},{"path":"/reference/nnf_mixup.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Mixup logic — nnf_mixup","text":"Based passed-input target batches, well applicable mixing weights, return new tensors intended replace current batch. 
new input batch weighted linear combination input batch items, new target batch bundles original targets, well mixing weights, nested list.","code":""},{"path":[]},{"path":"/reference/nnf_mixup.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Mixup logic — nnf_mixup","text":"","code":"if (torch::torch_is_installed()) { batch_x <- torch::torch_randn(c(10, 768)) batch_y <- torch::torch_randn(10) weight <- torch::torch_tensor(rep(0.9, 10))$view(c(10, 1)) nnf_mixup(batch_x, batch_y, weight) } #> $x #> torch_tensor #> Columns 1 to 10-1.6129 -0.6514 -0.0005 -0.1299 -0.0186  0.7856 -0.2318 -0.5054 -1.1800 -0.4573 #> -0.3187  0.2511 -0.5413  0.5586 -0.3886  1.2698 -0.5388 -0.4903 -0.3484  0.1465 #>  0.4670 -0.4047  0.5951  1.7964 -0.1125 -0.1080  0.0275 -0.1578  0.4809 -1.3088 #> -1.2264 -0.5856  0.8783  0.1327 -0.6009 -0.2881  1.3573 -1.4678  0.5514  0.7259 #> -1.3347 -0.1024  1.3148 -0.4473  0.5465 -0.5149  0.9301 -0.0461  0.3143  0.5608 #> -0.8352 -0.4503 -0.6097  0.0943  0.8621 -0.3653 -0.0237  0.9062 -1.2661  1.7011 #>  0.0790 -2.2431  1.7278  1.5395 -0.5357  0.3805  1.7119  0.1466 -0.1981  1.3738 #>  0.5916 -0.0441 -0.1281  0.3864 -0.4095 -0.8772 -0.4108  1.1746 -1.2667 -0.9288 #> -0.5432 -0.8673  0.3779  1.4750  0.9157 -0.8307  1.0736 -0.2050 -1.2962  0.7474 #> -0.1829  0.7280 -1.1095  0.3540  0.9854 -0.2013 -0.1124  1.8542 -0.3396 -1.2286 #>  #> Columns 11 to 20-0.4164 -1.8517  0.8905 -0.7747 -0.6324  0.4760 -0.7748 -0.0742 -1.0611  0.7593 #>  1.3576  0.3285  0.2676 -0.0533  0.2062 -0.0335  0.3198  0.3276 -1.2097  0.0647 #> -0.2135 -0.0988  0.3074 -0.4857 -1.5481  0.5156 -0.0364 -0.6499 -1.0595 -1.4608 #>  0.0545 -0.1533  1.0694 -1.9981 -0.8471 -0.7479 -0.4441  0.3173  0.9581 -0.1928 #> -0.1570  0.1221 -0.0325  0.5952 -0.7952  0.9146 -0.6144  0.1231 -0.3203 -1.2061 #> -0.0764  1.8272  0.3365 -0.1222 -0.2703  0.5525  1.1011 -1.0144  0.5019  0.7357 #> -0.5516  0.5019  0.3448  0.1197 -0.4418  1.5774  0.5755  0.0448  0.7243 
-1.5652 #>  0.1351 -0.4297 -0.2023 -1.3988 -1.3668 -0.4454 -0.5770  0.3981  0.2843  0.3258 #>  0.2701  0.4466  0.5089 -1.1313  0.4318  0.9925  0.6326 -1.3562 -0.3284  0.3340 #>  0.2723  0.8889 -1.0425 -0.6844 -0.2525 -0.4499  0.3906 -0.5498 -0.2378 -0.8349 #>  #> Columns 21 to 30 0.4978  0.0260  0.1258 -0.8327 -0.7895  0.1126 -0.2001  0.5339  1.1938  0.4961 #> -0.1211 -0.9407  1.1080  1.0563  0.6874 -0.8214  0.4860 -0.4867 -0.2094 -1.9037 #>  0.0040  0.6689 -1.3853  0.1822  0.7930 -0.8359  0.6396 -0.0760  1.2617  0.2426 #>  0.2137  0.6507  0.2203 -2.1912 -0.4301 -0.9777  0.4276 -1.0610  0.7440 -1.2844 #>  0.4721  1.3801  0.1906  0.0381 -1.8842 -0.2035  1.1486 -0.4319  1.2018 -1.0576 #> -0.1008  1.8099  0.1133 -0.3625 -0.8228  0.3376 -1.2784  0.4270 -1.8851 -0.7659 #> -0.1893  0.3863  0.0251 -0.0711  0.2325 -0.3685 -1.2638  0.1694  1.3024 -0.4186 #>  0.6907  1.8215 -0.5999  0.0687 -0.2703 -1.2546 -0.7732  1.1821  0.6891  0.7568 #> ... [the output was truncated (use n=-1 to disable)] #> [ CPUFloatType{10,768} ] #>  #> $y #> $y$ys #> $y$ys$y1 #> torch_tensor #>  0.8530 #> -0.3463 #> -2.1121 #> -0.0545 #>  0.7580 #> -0.9011 #> -0.2182 #>  0.6930 #> -1.0155 #>  1.3686 #> [ CPUFloatType{10} ] #>  #> $y$ys$y2 #> torch_tensor #>  0.7580 #> -0.9011 #> -0.0545 #> -0.2182 #>  0.6930 #>  1.3686 #> -1.0155 #>  0.8530 #> -0.3463 #> -2.1121 #> [ CPUFloatType{10} ] #>  #>  #> $y$weight #> torch_tensor #>  0.9000 #>  0.9000 #>  0.9000 #>  0.9000 #>  0.9000 #>  0.9000 #>  0.9000 #>  0.9000 #>  0.9000 #>  0.9000 #> [ CPUFloatType{10,1} ] #>  #>"},{"path":"/reference/pipe.html","id":null,"dir":"Reference","previous_headings":"","what":"Pipe operator — %>%","title":"Pipe operator — %>%","text":"See magrittr::%>% details.","code":""},{"path":"/reference/pipe.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Pipe operator — %>%","text":"","code":"lhs %>% 
rhs"},{"path":"/reference/predict.luz_module_fitted.html","id":null,"dir":"Reference","previous_headings":"","what":"Create predictions for a fitted model — predict.luz_module_fitted","title":"Create predictions for a fitted model — predict.luz_module_fitted","text":"Create predictions fitted model","code":""},{"path":"/reference/predict.luz_module_fitted.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Create predictions for a fitted model — predict.luz_module_fitted","text":"","code":"# S3 method for luz_module_fitted predict(   object,   newdata,   ...,   callbacks = list(),   accelerator = NULL,   verbose = NULL,   dataloader_options = NULL )"},{"path":"/reference/predict.luz_module_fitted.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Create predictions for a fitted model — predict.luz_module_fitted","text":"object (fitted model) fitted model object returned fit.luz_module_generator() newdata (dataloader, dataset, list array) returning list least 1 element. elements used. ... Currently unused. callbacks (list, optional) list callbacks defined luz_callback() called training procedure. callbacks luz_callback_metrics(), luz_callback_progress() luz_callback_train_valid() always added default. accelerator (accelerator, optional) optional accelerator() object used configure device placement components like nn_modules, optimizers batches data. verbose (logical, optional) optional boolean value indicating fitting procedure emit output console training. default, produce output interactive() TRUE, otherwise print console. dataloader_options Options used creating dataloader. See torch::dataloader(). shuffle=TRUE default training data batch_size=32 default. 
error NULL data already dataloader.","code":""},{"path":[]},{"path":"/reference/reexports.html","id":null,"dir":"Reference","previous_headings":"","what":"Objects exported from other packages — reexports","title":"Objects exported from other packages — reexports","text":"objects imported packages. Follow links see documentation. generics fit","code":""},{"path":"/reference/set_hparams.html","id":null,"dir":"Reference","previous_headings":"","what":"Set hyper-parameter of a module — set_hparams","title":"Set hyper-parameter of a module — set_hparams","text":"function used define hyper-parameters calling fit luz_modules.","code":""},{"path":"/reference/set_hparams.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Set hyper-parameter of a module — set_hparams","text":"","code":"set_hparams(module, ...)"},{"path":"/reference/set_hparams.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Set hyper-parameter of a module — set_hparams","text":"module nn_module setup(). ... 
parameters set used initialize nn_module, ie passed unchanged initialize method base nn_module.","code":""},{"path":"/reference/set_hparams.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Set hyper-parameter of a module — set_hparams","text":"luz module","code":""},{"path":[]},{"path":"/reference/set_opt_hparams.html","id":null,"dir":"Reference","previous_headings":"","what":"Set optimizer hyper-parameters — set_opt_hparams","title":"Set optimizer hyper-parameters — set_opt_hparams","text":"function used define hyper-parameters optimizer initialization method.","code":""},{"path":"/reference/set_opt_hparams.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Set optimizer hyper-parameters — set_opt_hparams","text":"","code":"set_opt_hparams(module, ...)"},{"path":"/reference/set_opt_hparams.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Set optimizer hyper-parameters — set_opt_hparams","text":"module nn_module setup(). ... parameters passed used initialize optimizers. 
example, optimizer optim_adam pass lr=0.1, optim_adam function called optim_adam(parameters, lr=0.1) fitting model.","code":""},{"path":"/reference/set_opt_hparams.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Set optimizer hyper-parameters — set_opt_hparams","text":"luz module","code":""},{"path":[]},{"path":"/reference/setup.html","id":null,"dir":"Reference","previous_headings":"","what":"Set's up a nn_module to use with luz — setup","title":"Set's up a nn_module to use with luz — setup","text":"setup function used set important attributes method nn_modules used luz.","code":""},{"path":"/reference/setup.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Set's up a nn_module to use with luz — setup","text":"","code":"setup(module, loss = NULL, optimizer = NULL, metrics = NULL, backward = NULL)"},{"path":"/reference/setup.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Set's up a nn_module to use with luz — setup","text":"module (nn_module) nn_module want set . loss (function, optional) optional function signature function(input, target). requires nn_module implement method called loss. optimizer (torch_optimizer, optional) function signature function(parameters, ...) used initialize optimizer given model parameters. metrics (list, optional) list metrics tracked training procedure. Sometimes, want metrics evaluated training validation, case can pass luz_metric_set() object specify metrics used stage. backward (function) functions takes loss scalar values parameter. must call $backward() torch::autograd_backward(). general need set parameter unless need customize luz calls backward(), example, need add additional arguments backward call. 
Note becomes method nn_module thus can used custom step() override .","code":""},{"path":"/reference/setup.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Set's up a nn_module to use with luz — setup","text":"luz module can trained fit().","code":""},{"path":"/reference/setup.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Set's up a nn_module to use with luz — setup","text":"makes sure module necessary ingredients order fitted.","code":""},{"path":"/reference/setup.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"Set's up a nn_module to use with luz — setup","text":"also adds device active field can used query current module device within methods, eg self$device. useful ctx() available, eg, calling methods outside luz wrappers. Users can override default implementing device active method input module.","code":""},{"path":[]},{"path":"/news/index.html","id":"luz-development-version","dir":"Changelog","previous_headings":"","what":"luz (development version)","title":"luz (development version)","text":"Added mixed precision callback. (#127) Added support torch iterable datasets. (#135) Fixed bug trying resume models trained learning rate schedulers. (#137)","code":""},{"path":"/news/index.html","id":"luz-040","dir":"Changelog","previous_headings":"","what":"luz 0.4.0","title":"luz 0.4.0","text":"CRAN release: 2023-04-17","code":""},{"path":"/news/index.html","id":"breaking-changes-0-4-0","dir":"Changelog","previous_headings":"","what":"Breaking changes","title":"luz 0.4.0","text":"drop_last=TRUE now default training dataloaders created luz (eg. pass list torch dataset data input) (#117) default profile callback longer tracks intra step timings adds non ignorable overhead. 
(#125)","code":""},{"path":"/news/index.html","id":"new-features-0-4-0","dir":"Changelog","previous_headings":"","what":"New features","title":"luz 0.4.0","text":"Added support arm Mac’s MPS device. (#104) Refactor checkpointing luz - now also serialize optimizer state callbacks state. (#107) Added luz_callback_autoresume() allowing easily resume trainining runs might crashed. (#107) Added th luz_callback_resume_from_checkpoint() allowing one resume training run checkpoint file. (#107) Users can now chose metrics called training validation, training validation. See luz_metric_set() information. (#112) Improved errors raised user code, eg calling metrics callbacks raised. helps lot debuging errors callbacks metrics. (#112) loss_fn now field context, thus callbacks can override needed. (#112) luz_callback_mixup now supports run_valid auto_loss arguments. (#112) ctx now aliases default opt opt_name single optimizer specified (ie. cases) (#114) Added tfevents callback logging loss getting weights histograms. (#118) can now specify metrics evaluated evaluate. (#123)","code":""},{"path":"/news/index.html","id":"bug-fixes-0-4-0","dir":"Changelog","previous_headings":"","what":"Bug fixes","title":"luz 0.4.0","text":"Bug fix: accelerators cpu argument always respected. (#119) Handled rlang ggplot2 deprecations. (#120) Better handling metrics environments. Faster garbage collection dataloaders iterators, use less memory. (#122) Much faster loss averaging every step. Can hight influence training times large number iterations per epoch. 
(#124)","code":""},{"path":"/news/index.html","id":"luz-031","dir":"Changelog","previous_headings":"","what":"luz 0.3.1","title":"luz 0.3.1","text":"CRAN release: 2022-09-06 Re-submission fix vignette rendering.","code":""},{"path":"/news/index.html","id":"luz-030","dir":"Changelog","previous_headings":"","what":"luz 0.3.0","title":"luz 0.3.0","text":"CRAN release: 2022-08-19","code":""},{"path":"/news/index.html","id":"breaking-changes-0-3-0","dir":"Changelog","previous_headings":"","what":"Breaking changes","title":"luz 0.3.0","text":"lr_finder() now default divides range start_lr end_lr log-spaced intervals, following fast.ai implementation. Cf. Sylvain Gugger’s post: https://sgugger.github.io/---find--good-learning-rate.html. previous behavior can achieved passing log_spaced_intervals=FALSE function. (#82, @skeydan) plot.lr_records() now addition plots exponentially weighted moving average loss (, see Sylvain Gugger’s post), weighting coefficient 0.9 (seems reasonable value default setting 100 learning-rate-incrementing intervals). (#82, @skeydan)","code":""},{"path":"/news/index.html","id":"documentation-0-3-0","dir":"Changelog","previous_headings":"","what":"Documentation","title":"luz 0.3.0","text":"Many wording improvements getting started guides (#81 #94, @jonthegeek).","code":""},{"path":"/news/index.html","id":"new-features-0-3-0","dir":"Changelog","previous_headings":"","what":"New features","title":"luz 0.3.0","text":"Added MixUp callback helper loss function functional logic. (#82, @skeydan). Added luz_callback_gradient_clip inspired FastAI’s implementation. (#90) Added backward argument setup allowing one customize backward called loss scalar value. (#93) Added luz_callback_keep_best_model() reload weights best model training finished. 
(#95)","code":""},{"path":"/news/index.html","id":"luz-020","dir":"Changelog","previous_headings":"","what":"luz 0.2.0","title":"luz 0.2.0","text":"CRAN release: 2021-10-07","code":""},{"path":"/news/index.html","id":"new-features-0-2-0","dir":"Changelog","previous_headings":"","what":"New features","title":"luz 0.2.0","text":"Allow users provide minimum maximum number epochs calling fit.luz_module_generator(). Removed ctx$epochs context object replaced ctx$min_epochs ctx$max_epochs (#53, @mattwarkentin). Early stopping now occur minimum number training epochs met (#53, @mattwarkentin). Added cuda_index argument accelerator allow selecting specific GPU multiple present (#58, @cmcmaster1). Implemented lr_finder (#59, @cmcmaster1). now handle different kinds data arguments passed fit using as_dataloader() method (#66). valid_data can now scalar value indicating proportion data used fitting. works data torch dataset list. (#69) can now supply dataloader_options fit pass additional information as_dataloader(). (#71) Implemented evaluate function allowing users get metrics model new dataset. (#73)","code":""},{"path":"/news/index.html","id":"bug-fixes-0-2-0","dir":"Changelog","previous_headings":"","what":"Bug fixes","title":"luz 0.2.0","text":"Fixed bug CSV logger callback saving logs space delimited file (#52, @mattwarkentin). Fixed bug length progress bar validation dataset (#52, @mattwarkentin). Fixed bugs early stopping callback related working properly patience = 1 specified logging callbacks. (#76)","code":""},{"path":"/news/index.html","id":"internal-changes-0-2-0","dir":"Changelog","previous_headings":"","what":"Internal changes","title":"luz 0.2.0","text":"ctx$data now refers current use data instead always refering ctx$train_data. (#54) Refactored ctx object make safer avoid returing output. 
(#73)","code":""},{"path":"/news/index.html","id":"luz-010","dir":"Changelog","previous_headings":"","what":"luz 0.1.0","title":"luz 0.1.0","text":"CRAN release: 2021-06-17 Added NEWS.md file track changes package.","code":""}]
+[{"path":"/LICENSE.html","id":null,"dir":"","previous_headings":"","what":"MIT License","title":"MIT License","text":"Copyright (c) 2021 luz authors Permission hereby granted, free charge, person obtaining copy software associated documentation files (“Software”), deal Software without restriction, including without limitation rights use, copy, modify, merge, publish, distribute, sublicense, /sell copies Software, permit persons Software furnished , subject following conditions: copyright notice permission notice shall included copies substantial portions Software. SOFTWARE PROVIDED “”, WITHOUT WARRANTY KIND, EXPRESS IMPLIED, INCLUDING LIMITED WARRANTIES MERCHANTABILITY, FITNESS PARTICULAR PURPOSE NONINFRINGEMENT. EVENT SHALL AUTHORS COPYRIGHT HOLDERS LIABLE CLAIM, DAMAGES LIABILITY, WHETHER ACTION CONTRACT, TORT OTHERWISE, ARISING , CONNECTION SOFTWARE USE DEALINGS SOFTWARE.","code":""},{"path":"/articles/accelerator.html","id":"example","dir":"Articles","previous_headings":"","what":"Example","title":"Accelerator API","text":"Accelerator API best explained showing example diff raw torch training loop. code changes shown, longer need manually move data parameters devices, makes code easier read less error prone. 
can find additional documentation using help(accelerator).","code":"library(torch) + library(luz)  + acc <- accelerator() - device <- \"cpu\"  data <- tensor_dataset(   x = torch_randn(100, 10),   y = torch_rand(100, 1) )  dl <- dataloader(data, batch_size = 10)  model <- nn_linear(10, 1) - model$to(device = device) opt <- optim_adam(model$parameters)  + c(model, opt, dl) %<-% acc$prepare(model, opt, dl)  model$train() coro::loop(for (batch in dl) {    opt$zero_grad()  -  preds <- model(batch$x$to(device = device)) +  preds <- model(batch$x) -  loss <- nnf_mse_loss(preds, batch$y$to(device = device)) +  loss <- nnf_mse_loss(preds, batch$y)    loss$backward()   opt$step() })"},{"path":"/articles/checkpoints.html","id":"resuming-training-runs-that-crashed","dir":"Articles","previous_headings":"","what":"Resuming training runs that crashed","title":"Checkpointing your models","text":"long training run can crash whatever reason (computer turned , process kileed cluster, etc), recommend add luz_callback_autoresume() list callbacks. luz_callback_autoresume() automatically checkpoint whole state model end epoch. something fails training can simply rerun script, whithout code changes checkpoint reloaded training start stopped. example, lets’s take randomly generated training dataset linear model show autoresume works. ’s training data: model definition: Let’s now create callback simulates random failure happen. callback just raise R error 5th epoch. Let’s now start training adding luz_callback_auto_resume(): resume model training exactly stopped just need restart fitting, using exact model, callbacks, etc: , model fitting process continued exactly stopped. 
Records, optimizer model state recovered previous run can full results:","code":"x <- torch_randn(1000, 10) y <- torch_randn(1000, 1) model <- nn_linear %>%   setup(optimizer = optim_sgd, loss = nnf_mse_loss) %>%   set_hparams(in_features = 10, out_features = 1) %>%   set_opt_hparams(lr = 0.01) interrupt <- luz_callback(   \"interrupt\",   failed = FALSE,   on_epoch_end = function() {     if (ctx$epoch == 5 && !self$failed) {       self$failed <- TRUE       stop(\"Error on epoch 5\")     }   } ) autoresume <- luz_callback_auto_resume(path = \"state.pt\") inter <- interrupt()  # An error will happen in the 5th epoch and the model will be stopped. results <- model %>% fit(   list(x, y),   callbacks = list(inter, autoresume),   verbose = FALSE ) #> Error in `FUN()`: #> ! Error while calling callback with class <interrupt/LuzCallback/R6> at #>   on_epoch_end. #> Caused by error in `self[[callback_nm]]()`: #> ! Error on epoch 5 results <- model %>% fit(   list(x, y),   callbacks = list(inter, autoresume),   verbose = FALSE ) plot(results)"},{"path":"/articles/checkpoints.html","id":"checkpointing","dir":"Articles","previous_headings":"","what":"Checkpointing","title":"Checkpointing your models","text":"Sometimes want control checkpoints handled. case can use luz_callback_model_checkpoint() save checkpoints specified file directory. Let’s use example resuming section: first generate data. define model: Let’s now fit model using luz_callback_model_checkpoint(). can see now checkpoints directory contains files state dumps epoch. default, luz_callback_model_checkpoint save state epochs format name including resulting loss. can configured withing path parameter, see ?luz_callback_model_checkpoint details. Finally, can load specific checkpoint fitted result using luz_load_checkpoint. Note loading checkpoint luz_fitted_module going modify model weights -place. can start making predictions, evaluate model using reloeded weights. 
might also want start new training run checkpoint. , can use luz_callback_resume_from_checkpoint(). default, recover model weights checkpoint file, can configure restore records, callback optimizer state . checkpoint directory passed training resume last checkpoint file returned fs::dir_ls. ’s use callback:","code":"x <- torch_randn(1000, 10) y <- torch_randn(1000, 1) model <- nn_linear %>%   setup(optimizer = optim_sgd, loss = nnf_mse_loss) %>%   set_hparams(in_features = 10, out_features = 1) %>%   set_opt_hparams(lr = 0.01) checkpoint <- luz_callback_model_checkpoint(   path = \"checkpoints/\",    monitor = \"train_loss\" )  results <- model %>% fit(   list(x, y),   callbacks = list(checkpoint),   verbose = FALSE ) fs::dir_ls(\"checkpoints\") #> checkpoints/epoch-01-train_loss-1.237.pt #> checkpoints/epoch-02-train_loss-1.065.pt #> checkpoints/epoch-03-train_loss-1.026.pt #> checkpoints/epoch-04-train_loss-1.004.pt #> checkpoints/epoch-05-train_loss-1.004.pt #> checkpoints/epoch-06-train_loss-1.005.pt #> checkpoints/epoch-07-train_loss-0.999.pt #> checkpoints/epoch-08-train_loss-0.998.pt #> checkpoints/epoch-09-train_loss-1.001.pt #> checkpoints/epoch-10-train_loss-1.002.pt luz_load_checkpoint(results, fs::dir_ls(\"checkpoints\")[1]) resume <- luz_callback_resume_from_checkpoint(path = \"checkpoints/\") results <- model %>% fit(   list(x, y),   callbacks = list(resume),   verbose = FALSE ) plot(results)"},{"path":"/articles/checkpoints.html","id":"custom-callbacks-state","dir":"Articles","previous_headings":"Checkpointing","what":"Custom callbacks state","title":"Checkpointing your models","text":"Sometimes callbacks also need keep internal state order allow continuing training exactly stopped. case, callbacks can implement state_dict() load_state_dict() methods automatically called saving reloading checkpoints. example, suppose callback tracks gradients weights every epoch. want use tracked weights analyse training procedure. 
implemented like: example, gradients field state callback. training fails reason, gradients lost. ’s important also checkpoint callback state, can implement state_dict() method must returning named list objects compose state callback load_state_dict() taking named list returned state_dict() restoring callback state. callback reimplemented :","code":"cb_weight_grad <- luz_callback(   \"weight_grad\",   gradients = list(),   initialize = function(track_weights) {     self$track_weights   },   on_train_batch_before_step = function() {     gradients[[ctx$epoch]] <- list()     for (w in self$track_weights) {       gradients[[ctx$epoch]][[w]] <- self$model$parameters[[w]]     }   } ) cb_weight_grad <- luz_callback(   \"weight_grad\",   gradients = list(),   initialize = function(track_weights) {     self$track_weights   },   on_train_batch_before_step = function() {     gradients[[ctx$epoch]] <- list()     for (w in self$track_weights) {       gradients[[ctx$epoch]][[w]] <- self$model$parameters[[w]]     }   },   state_dict = function() {     list(gradients = self$gradients)   },   load_state_dict = function(d) {     self$gradients <- d$gradients   } )"},{"path":"/articles/custom-loop.html","id":"multiple-optimizers","dir":"Articles","previous_headings":"","what":"Multiple optimizers","title":"Custom loops with luz","text":"Suppose want experiment train first fully connected layer using learning rate 0.1 second one using learning rate 0.01. minimize nn_cross_entropy_loss() , first layer want add L1 regularization weights. order use luz , implement two methods net module: set_optimizers: returns named list optimizers depending ctx. loss: computes loss depending selected optimizer. Let’s go code: Notice model optimizers initialized according set_optimizers() method’s return value (list). case, initializing optimizers using different model parameters learning rates. loss() method responsible computing loss back-propagated compute gradients update weights. 
loss() method can access ctx object contain opt_name field, describing optimizer currently used. Note function called optimizer training validation step. See help(\"ctx\") complete information context object. can finally setup fit module, however longer need specify optimizers loss functions. Now let’s re-implement model using slightly flexible approach overriding training validation step.","code":"net <- nn_module(   \"Net\",   initialize = function() {     self$fc1 <- nn_linear(100, 50)     self$fc1 <- nn_linear(50, 10)   },   forward = function(x) {     x %>%        self$fc1() %>%        nnf_relu() %>%        self$fc2()   },   set_optimizers = function(lr_fc1 = 0.1, lr_fc2 = 0.01) {     list(       opt_fc1 = optim_adam(self$fc1$parameters, lr = lr_fc1),       opt_fc2 = optim_adam(self$fc2$parameters, lr = lr_fc2)     )   },   loss = function(input, target) {     pred <- ctx$model(input)        if (ctx$opt_name == \"opt_fc1\")        nnf_cross_entropy(pred, target) + torch_norm(self$fc1$weight, p = 1)     else if (ctx$opt_name == \"opt_fc2\")       nnf_cross_entropy(pred, target)   } ) fitted <- net %>%    setup(metrics = list(luz_metric_accuracy)) %>%    fit(train_dl, epochs = 10, valid_data = test_dl)"},{"path":"/articles/custom-loop.html","id":"fully-flexible-step","dir":"Articles","previous_headings":"","what":"Fully flexible step","title":"Custom loops with luz","text":"Instead implementing loss() method, can implement step() method. allows us flexibly modify happens training validating batch dataset. now responsible updating weights stepping optimizers back-propagating loss. important things notice : step() method used training validation. need careful modify weights training. , can get complete information regarding context object using help(\"ctx\"). ctx$optimizers named list holding optimizer created set_optimizers() method called. need manually track losses saving saving named list ctx$loss. convention, use name optimizer refers . 
good practice detach() saving reduce memory usage. Callbacks called inside default step() method like on_train_batch_after_pred, on_train_batch_after_loss, etc, won’t automatically called. can still cal manually adding ctx$call_callbacks(\"<callback name>\") inside training step. See code fit_one_batch() valid_one_batch find callbacks won’t called. want luz metrics work custom step() method, must assign ctx$pred model predictions metrics always called metric$update(ctx$pred, ctx$target).","code":"net <- nn_module(   \"Net\",   initialize = function() {     self$fc1 <- nn_linear(100, 50)     self$fc1 <- nn_linear(50, 10)   },   forward = function(x) {     x %>%        self$fc1() %>%        nnf_relu() %>%        self$fc2()   },   set_optimizers = function(lr_fc1 = 0.1, lr_fc2 = 0.01) {     list(       opt_fc1 = optim_adam(self$fc1$parameters, lr = lr_fc1),       opt_fc2 = optim_adam(self$fc2$parameters, lr = lr_fc2)     )   },   step = function() {     ctx$loss <- list()     for (opt_name in names(ctx$optimizers)) {            ctx$pred <- ctx$model(ctx$input)       opt <- ctx$optimizers[[opt_name]]       loss <- nnf_cross_entropy(pred, target)              if (opt_name == \"opt_fc1\") {         # we have L1 regularization in layer 1         loss <- nnf_cross_entropy(pred, target) +            torch_norm(self$fc1$weight, p = 1)       }                if (ctx$training) {         opt$zero_grad()         loss$backward()         opt$step()         }              ctx$loss[[opt_name]] <- loss$detach()     }   } )"},{"path":"/articles/custom-loop.html","id":"next-steps","dir":"Articles","previous_headings":"","what":"Next steps","title":"Custom loops with luz","text":"article learned customize step() training loop using luz layered functionality. Luz also allows flexible modifications training loop described Accelerator vignette (vignette(\"accelerator\")). 
now able follow examples marked ‘intermediate’ ‘advanced’ category examples gallery.","code":""},{"path":"/articles/examples/text-generation.html","id":"data","dir":"Articles > Examples","previous_headings":"","what":"Data","title":"Training a causal language model from scratch","text":"First step implement torch dataset gathers data pre-process format suitable training model. means need : Download data Train tokenizer dataset able produce sequences tokens format expected model going use 2 datasets available Hugging Face Hub. first contain R packages source code available CRAN. second contains R code available GitHub data dumps. datasets Parquet format. Following implement function downloads caches data returns single arrow table containing data. Next implement function trains tokenizer dataset. can finally implement torch dataset going use training model. going use torch::iterable_dataset instead torch::dataset. main motivation can’t really know total number samples dataset, can implement .getitem() method get arbiratrary sample. Thus implement .iter method returns new sample every time ’s called. dataset likely large us train model documents example. ’s also hard predict long take train end. order make easier, define wraper dataset used run dataset fixed number steps. required, makes using luz pleasant, can easily define many tokens want train model. finally define model going train. ’ll use small version GPT2. also define generate method allowing us sample model given initial context. make easier inspect training, also define callback prints sample model every epoch. can finally train model. define want train model half billion tokens total 100 epochs. 
can use model generate text given prompt :","code":"read_dataset <- function(source) {   d <- source |>     hfhub::hub_snapshot(repo_type = \"dataset\", allow_patterns = \"parquet$\") |>     fs::path(\"data/r\") |>     arrow::open_dataset() |>     dplyr::filter(stringr::str_detect(path, \".*\\\\.[rR]$\")) |>     dplyr::select(content) |>     dplyr::mutate(content = arrow::cast(content, arrow::string())) |>     dplyr::filter(!is.na(content)) |>     dplyr::collect() %>%     # the dataset contains invalid utf8 characters...     # we need to remove them, otherwise we get an error from tokenizers     dplyr::filter(utf8::utf8_valid(content)) }  read_datasets <- function() {   dplyr::bind_rows(     read_dataset(\"dfalbel/cran-packages\"),     read_dataset(\"dfalbel/github-r-repos\")   ) } create_tokenizer <- function(text, vocab_size, special_tokens) {   tok <- tok::tokenizer$new(tok::model_bpe$new())    tok$pre_tokenizer <- tok::pre_tokenizer_byte_level$new(add_prefix_space = FALSE)   tok$decoder <- tok::decoder_byte_level$new()   tok$post_processor <- tok::processor_byte_level$new(trim_offsets = FALSE)    tok$train_from_memory(     text,     tok::trainer_bpe$new(vocab_size = vocab_size, special_tokens = special_tokens)   )   tok }  # test code to debug the tokenizer # data <- read_datasets() # tok <- create_tokenizer(data$content) r_sources_dataset <- torch::iterable_dataset(   \"r_sources_dataset\",   initialize = function(root = \".\", vocab_size = 20000, context_length = 128) {     self$data <- read_datasets()     self$context_length <- context_length     self$index <- sample.int(nrow(self$data))      # we only create a tokenizer if it doesn't exist, otherwise we just load it     tok_path <- file.path(root, glue::glue(\"tokenizer-{vocab_size}.json\"))     if (!file.exists(tok_path)) {       self$tok <- create_tokenizer(         as.character(self$data$content),         vocab_size,         c(\"<fbegin>\", \"<fend>\")       )       fs::dir_create(root)       
self$tok$save(tok_path)     } else {       self$tok <- tok::tokenizer$from_file(tok_path)     }   },   .iter = function() {     i <- 1L     sequence <- c()     function() {       while (length(sequence) < (self$context_length + 1) && i <= nrow(self$data)) {         sequence <<- c(           sequence,           self$tok$encode(paste(\"<fbegin>\", as.character(self$data$content[self$index[i]]), \"<fend>\"))$ids         )         i <<- i + 1L       }        if (length(sequence) < (self$context_length + 1)) {         return(coro::exhausted())       }        on.exit({         sequence <<- sequence[-seq_len(self$context_length)]       })       list(         input_ids = sequence[seq_len(self$context_length)] + 1L,         labels = sequence[2:(self$context_length + 1)] + 1L       )     }   } )  # debug code for the dataset # ds <- r_sources_dataset(\"~/Downloads/\") # it <- ds$.iter() # it() # ds$tok$get_vocab_size() fixed_steps_iterable_dataset <- iterable_dataset(   \"fixed_steps_dataset\",   initialize = function(dataset, steps) {     self$dataset <- dataset     self$steps <- steps   },   .iter = function() {     i <- 1L     iter <- NULL     function() {       if (i > self$steps) {         return(coro::exhausted())       }        i <<- i + 1L        if (is.null(iter) || coro::is_exhausted(data <- iter())) {         iter <<- self$dataset$.iter()         data <- iter()       }        data     }   },   .length = function() {     self$steps   } ) net <- nn_module(   initialize = function() {     self$gpt <- minhub::gpt2(       vocab_size = 20000,       pdrop = 0.1     )   },   forward = function(x) {     self$gpt(x)$transpose(2,3)   },   generate = function(x, temperature = 1, iter = 50, top_k = 10) {     # samples from the model given a context vector.     
for (i in seq_len(iter)) {       logits <- self$forward(x)[,,-1]       logits <- logits/temperature       c(prob, ind) %<-% logits$topk(top_k)       logits <- torch_full_like(logits, -Inf)$scatter_(-1, ind, prob)       logits <- nnf_softmax(logits, dim = -1)       id_next <- torch_multinomial(logits, num_samples = 1)       x <- torch_cat(list(x, id_next), dim = 2)     }     x   } )  # debug code for the model # ds <- torch::dataloader(r_sources_dataset(\"~/Downloads/\"), batch_size = 32) # batch <- coro::collect(ds, 1)[[1]] # str(batch) # m <- net() # str(m(batch$input_ids)) # samples from the model using the context. generate <- function(model, tok, context, ...) {   local_no_grad() # disables gradient for sampling   x <- tok$encode(context)$ids + 1L   x <- torch_tensor(x)[NULL,]$to(device = model$device)   content <- as.integer(model$generate(x, ...)$cpu())   tok$decode(content - 1L) }  display_cb <- luz_callback(   initialize = function() {},   on_epoch_end = function() {     local_no_grad()     # sample from the model...     
context <- \"# creates a linear model\"     text <- generate(ctx$model, dataset$dataset$tok, context, iter = 100)     cli::cli_rule()     cat(text, \"\\n\")     cli::cli_rule()   } ) n_tokens <- 500e6 batch_size <- 16 epochs <- 100 context_length <- 256L  steps <- n_tokens / context_length / epochs dataset <- fixed_steps_iterable_dataset(   r_sources_dataset(context_length = context_length),   steps = steps )  fitted <- net %>%   setup(     optimizer = optim_adam,     loss = nn_cross_entropy_loss()   ) %>%   set_opt_hparams(lr = 3e-4) |>   fit(     dataset,     epochs = epochs,     dataloader_options = list(batch_size = batch_size),     callbacks = list(       luz_callback_lr_scheduler(         torch::lr_one_cycle,         max_lr = 0.1,         epochs = epochs,         steps_per_epoch = steps/batch_size,         call_on = \"on_batch_end\"       ),       luz_callback_gradient_clip(max_norm = 1),       display_cb()     ),     verbose = TRUE   )  luz::luz_save(fitted, \"model.pt\") fitted <- luz::luz_load(\"model.pt\") tok <- tok::tokenizer$from_file(\"tokenizer-20000.json\") context <- \"#' Creates a linear model linear_model <- function(x, y) { \" text <- generate(fitted$model, tok, context, iter = 100) cat(text)"},{"path":"/articles/get-started.html","id":"training-a-nn_module","dir":"Articles","previous_headings":"","what":"Training a nn_module","title":"Get started with luz","text":"much possible, luz tries reuse existing structures torch. model luz defined identically define using raw torch. specific example, definition feed-forward CNN can used classify digits MNIST dataset: can now train model train_dl validate test_dl torch::dataloaders() : Let’s understand happens chunk code: setup function allows configure loss (objective) function optimizer use train model. Optionally can pass list metrics tracked training procedure. 
Note: loss function can function taking input target tensors returning scalar tensor value, optimizer can core torch optimizer custom ones created torch::optimizer() function. set_hparams() function allows set hyper-parameters passed module initialize() method. example case pass num_classes = 10. set_opt_hparams() function allows pass hyper-parameters used optimizer function. example, optim_adam() can take lr parameter specifying learning rate specify lr = 0.003. fit method take model specification provided setup() run training procedure using specified training validation torch::dataloaders() well number epochs. Note: reuse core torch data structures, instead providing data loading functionality. returned object fitted contains trained model well record metrics losses produced training. can also used producing predictions evaluating trained model datasets. fitting, luz use fastest possible accelerator; CUDA-capable GPU available used, otherwise fall back CPU. also automatically moves data, optimizers, models selected device don’t need handle manually (general error prone). 
create predictions trained model can use predict method:","code":"net <- nn_module(   \"Net\",   initialize = function(num_class) {     self$conv1 <- nn_conv2d(1, 32, 3, 1)     self$conv2 <- nn_conv2d(32, 64, 3, 1)     self$dropout1 <- nn_dropout2d(0.25)     self$dropout2 <- nn_dropout2d(0.5)     self$fc1 <- nn_linear(9216, 128)     self$fc2 <- nn_linear(128, num_class)   },   forward = function(x) {     x <- self$conv1(x)     x <- nnf_relu(x)     x <- self$conv2(x)     x <- nnf_relu(x)     x <- nnf_max_pool2d(x, 2)     x <- self$dropout1(x)     x <- torch_flatten(x, start_dim = 2)     x <- self$fc1(x)     x <- nnf_relu(x)     x <- self$dropout2(x)     x <- self$fc2(x)     x   } ) fitted <- net %>%   setup(     loss = nn_cross_entropy_loss(),     optimizer = optim_adam,     metrics = list(       luz_metric_accuracy     )   ) %>%   set_hparams(num_class = 10) %>%    set_opt_hparams(lr = 0.003) %>%    fit(train_dl, epochs = 10, valid_data = test_dl) predictions <- predict(fitted, test_dl)"},{"path":"/articles/get-started.html","id":"the-training-loop","dir":"Articles","previous_headings":"","what":"The training loop","title":"Get started with luz","text":"now general idea use fit function now ’s important overview ’s happening inside . pseudocode, ’s fit . fully detailed help build intuition:","code":"# -> Initialize objects: model, optimizers. # -> Select fitting device. # -> Move data, model, optimizers to the selected device. # -> Start training for (epoch in 1:epochs) {   # -> Training procedure   for (batch in train_dl) {     # -> Calculate model `forward` method.     # -> Calculate the loss     # -> Update weights     # -> Update metrics and tracking loss   }   # -> Validation procedure   for (batch in valid_dl) {     # -> Calculate model `forward` method.     
# -> Calculate the loss     # -> Update metrics and tracking loss   } } # -> End training"},{"path":"/articles/get-started.html","id":"metrics","dir":"Articles","previous_headings":"","what":"Metrics","title":"Get started with luz","text":"One important parts machine learning projects choosing evaluation metric. Luz allows tracking many different metrics training minimal code changes. order track metrics, need modify metrics parameter setup function: Luz provides implementations used metrics. metric available can always implement new one using luz_metric function. order implement new luz_metric need implement 3 methods: initialize: defines metric initial state. function called epoch training validation loops. update: updates metric internal state. function called every training validation step predictions obtained model target values obtained dataloader. compute: uses internal state compute metric values. function called whenever need obtain current metric value. Eg, ’s called every training step metrics displayed progress bar, called per epoch record ’s value progress bar displayed. Optionally, can implement abbrev field gives metric abbreviation used displaying metric information console tracking record. abbrev passed, class name used. Let’s take look implementation luz_metric_accuracy can see implement new one: Note: ’s good practice compute metric returns regular R values instead torch tensors parts luz expect .","code":"fitted <- net %>%   setup(     ...     metrics = list(       luz_metric_accuracy     )   ) %>%   fit(...) luz_metric_accuracy <- luz_metric(   # An abbreviation to be shown in progress bars, or    # when printing progress   abbrev = \"Acc\",    # Initial setup for the metric. Metrics are initialized   # every epoch, for both training and validation   initialize = function() {     self$correct <- 0     self$total <- 0   },   # Run at every training or validation step and updates   # the internal state. 
The update function takes `preds`   # and `target` as parameters.   update = function(preds, target) {     pred <- torch::torch_argmax(preds, dim = 2)     self$correct <- self$correct + (pred == target)$       to(dtype = torch::torch_float())$       sum()$       item()     self$total <- self$total + pred$numel()   },   # Use the internal state to query the metric value   compute = function() {     self$correct/self$total   } )"},{"path":"/articles/get-started.html","id":"evaluate","dir":"Articles","previous_headings":"","what":"Evaluate","title":"Get started with luz","text":"model trained might want evaluate performance different dataset. reason, luz provides ?evaluate function takes fitted model dataset computes metrics attached model. Evaluate returns luz_module_evaluation object can query metrics using get_metrics function simply print see results. example:","code":"evaluation <- fitted %>% evaluate(data = valid_dl) metrics <- get_metrics(evaluation) print(evaluation) #> A `luz_module_evaluation` #> -- Results --------------------------------------------------------------------- #> loss: 1.8892 #> mae: 1.0522 #> mse: 1.645 #> rmse: 1.2826"},{"path":"/articles/get-started.html","id":"customizing-with-callbacks","dir":"Articles","previous_headings":"","what":"Customizing with callbacks","title":"Get started with luz","text":"Luz provides different ways customize training progress depending level control need training loop. fastest way ‘reusable’, sense can create training modifications can used many different situations, via callbacks. training loop luz many breakpoints can call arbitrary R functions. functionality allows customize training process without modify general training logic. Luz implements 3 default callbacks occur every training procedure: train-eval callback: Sets model train() eval() depending procedure training validation. metrics callback: evaluate metrics training validation process. 
progress callback: implements progress bar prints progress information training. can also implement custom callbacks modify act specifically training procedure. example: Let’s implement callback prints ‘Iteration n’ (n iteration number) every batch training set ‘Done’ epoch finished. task use luz_callback function: luz_callback() takes named functions ... arguments, name indicates moment callback called. instance on_train_batch_end() called every batch end training procedure, on_epoch_end() called end every epoch. returned value luz_callback() function initializes instance callback. Callbacks can initialization parameters, like name file want log results. case, can pass initialize method creating callback definition, save parameters self object. example, callback message parameter printed end epoch. callback defined can passed fit function via callbacks parameter: Callbacks can called many different positions training loop, including combinations . ’s overview possible callback breakpoints: Every step marked on_* point training procedure available callbacks called. important part callbacks ctx (context) object. See help(\"ctx\") details. default, callbacks called order passed fit (predict evaluate), can provide weight attribute control order called. example, one callback weight = 10 another weight = 1, first one called second one. Callbacks don’t specify weight attribute considered weight = 0. built-callbacks luz already provide weight value. example, ?luz_callback_early_stopping weight Inf, since general want run last thing loop. ctx object used luz share information training loop callbacks, model methods, metrics. table describes information available ctx default. callbacks potentially modify attributes add new ones. Context attributes Attributes ctx can used produce desired behavior callbacks. can find information context object using help(\"ctx\"). 
example, use ctx$iter attribute print iteration number training batch.","code":"print_callback <- luz_callback(   name = \"print_callback\",   initialize = function(message) {     self$message <- message   },   on_train_batch_end = function() {     cat(\"Iteration \", ctx$iter, \"\\n\")   },   on_epoch_end = function() {     cat(self$message, \"\\n\")   } ) fitted <- net %>%   setup(...) %>%   fit(..., callbacks = list(     print_callback(message = \"Done!\")   )) Start Fit    - on_fit_begin   Start Epoch Loop      - on_epoch_begin     Start Train        - on_train_begin       Start Batch Loop          - on_train_batch_begin           Start Default Training Step             - on_train_batch_after_pred             - on_train_batch_after_loss             - on_train_batch_before_backward             - on_train_batch_before_step             - on_train_batch_after_step           End Default Training Step:          - on_train_batch_end       End Batch Loop        - on_train_end     End Train     Start Valid        - on_valid_begin       Start Batch Loop          - on_valid_batch_begin           Start Default Validation Step             - on_valid_batch_after_pred             - on_valid_batch_after_loss           End Default Validation Step          - on_valid_batch_end       End Batch Loop        - on_valid_end     End Valid       - on_epoch_end   End Epoch Loop    - on_fit_end End Fit"},{"path":"/articles/get-started.html","id":"next-steps","dir":"Articles","previous_headings":"","what":"Next steps","title":"Get started with luz","text":"article learned train first model using luz basics customization using custom metrics callbacks. Luz also allows flexible modifications training loop described vignette(\"custom-loop\"). now able follow examples marked ‘basic’ category examples gallery.","code":""},{"path":"/authors.html","id":null,"dir":"","previous_headings":"","what":"Authors","title":"Authors and Citation","text":"Daniel Falbel. Author, maintainer, copyright holder. 
RStudio. Copyright holder.","code":""},{"path":"/authors.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Authors and Citation","text":"Falbel D (2023). luz: Higher Level 'API' 'torch'. https://mlverse.github.io/luz/, https://github.com/mlverse/luz.","code":"@Manual{,   title = {luz: Higher Level 'API' for 'torch'},   author = {Daniel Falbel},   year = {2023},   note = {https://mlverse.github.io/luz/, https://github.com/mlverse/luz}, }"},{"path":"/index.html","id":"luz","dir":"","previous_headings":"","what":"Higher Level API for torch","title":"Higher Level API for torch","text":"Luz higher level API torch providing abstractions allow much less verbose training loops. package still development. heavily inspired higher level frameworks deep learning, cite : FastAI: heavily inspired FastAI library, especially Learner object callbacks API. Keras: also heavily inspired Keras, especially callback names. lightning module interface similar compile, . PyTorch Lightning: idea luz_module subclass nn_module inspired LightningModule object lightning. HuggingFace Accelerate: internal device placement API heavily inspired Accelerate, much modest features. Currently CPU Single GPU supported.","code":""},{"path":"/index.html","id":"installation","dir":"","previous_headings":"","what":"Installation","title":"Higher Level API for torch","text":"can install released version CRAN : development version :","code":"install.packages(\"luz\") remotes::install_github(\"mlverse/luz\")"},{"path":"/index.html","id":"example","dir":"","previous_headings":"","what":"Example","title":"Higher Level API for torch","text":"Luz lets take torch nn_module definition fit dataloader, handling boring parts like moving data devices, updating weights, showing progress bars tracking metrics. ’s example defining training Autoencoder MNIST dataset. selected parts code highlight luz functionality. can find full example code . 
Now defined Autoencoder architecture using torch::nn_module(), can fit using luz:","code":"net <- nn_module(   \"Net\",   initialize = function() {     self$encoder <- nn_sequential(       nn_conv2d(1, 6, kernel_size=5),       nn_relu(),       nn_conv2d(6, 16, kernel_size=5),       nn_relu()     )     self$decoder <- nn_sequential(       nn_conv_transpose2d(16, 6, kernel_size = 5),       nn_relu(),       nn_conv_transpose2d(6, 1, kernel_size = 5),       nn_sigmoid()     )   },   forward = function(x) {     x %>%       self$encoder() %>%       self$decoder()   } ) fitted <- net %>%   setup(     loss = nn_mse_loss(),     optimizer = optim_adam   ) %>%   fit(train_dl, epochs = 1, valid_data = test_dl)"},{"path":"/reference/accelerator.html","id":null,"dir":"Reference","previous_headings":"","what":"Create an accelerator — accelerator","title":"Create an accelerator — accelerator","text":"Create accelerator","code":""},{"path":"/reference/accelerator.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Create an accelerator — accelerator","text":"","code":"accelerator(   device_placement = TRUE,   cpu = FALSE,   cuda_index = torch::cuda_current_device() )"},{"path":"/reference/accelerator.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Create an accelerator — accelerator","text":"device_placement (logical) whether accelerator object handle device placement. Default: TRUE cpu (logical) whether training procedure run CPU. cuda_index (integer) index CUDA device use multiple GPUs available. 
Default: result torch::cuda_current_device().","code":""},{"path":"/reference/as_dataloader.html","id":null,"dir":"Reference","previous_headings":"","what":"Creates a dataloader from its input — as_dataloader","title":"Creates a dataloader from its input — as_dataloader","text":"as_dataloader used internally luz convert input data valid_data passed fit.luz_module_generator() torch::dataloader","code":""},{"path":"/reference/as_dataloader.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Creates a dataloader from its input — as_dataloader","text":"","code":"as_dataloader(x, ...)  # S3 method for dataset as_dataloader(x, ..., batch_size = 32)  # S3 method for iterable_dataset as_dataloader(x, ..., batch_size = 32)  # S3 method for list as_dataloader(x, ...)  # S3 method for dataloader as_dataloader(x, ...)  # S3 method for matrix as_dataloader(x, ...)  # S3 method for numeric as_dataloader(x, ...)  # S3 method for array as_dataloader(x, ...)  # S3 method for torch_tensor as_dataloader(x, ...)"},{"path":"/reference/as_dataloader.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Creates a dataloader from its input — as_dataloader","text":"x input object. ... Passed torch::dataloader(). batch_size (int, optional): many samples per batch load (default: 32).","code":""},{"path":"/reference/as_dataloader.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Creates a dataloader from its input — as_dataloader","text":"as_dataloader methods sensible defaults batch_size, parallel workers, etc. 
allows users quickly experiment fit.luz_module_generator() requiring create torch::dataset torch::dataloader simple experiments.","code":""},{"path":"/reference/as_dataloader.html","id":"methods-by-class-","dir":"Reference","previous_headings":"","what":"Methods (by class)","title":"Creates a dataloader from its input — as_dataloader","text":"as_dataloader(dataset): Converts torch::dataset() torch::dataloader(). as_dataloader(iterable_dataset): Converts torch::iterable_dataset() torch::dataloader() as_dataloader(list): Converts list tensors arrays size first dimension  torch::dataloader() as_dataloader(dataloader): Returns dataloader as_dataloader(matrix): Converts matrix dataloader as_dataloader(numeric): Converts numeric vector dataloader as_dataloader(array): Converts array dataloader as_dataloader(torch_tensor): Converts tensor dataloader","code":""},{"path":"/reference/as_dataloader.html","id":"overriding","dir":"Reference","previous_headings":"","what":"Overriding","title":"Creates a dataloader from its input — as_dataloader","text":"can implement as_dataloader S3 method want data structure automatically supported luz's fit.luz_module_generator(). method must satisfy following conditions: method return torch::dataloader(). required argument x. good default arguments. better avoid implementing as_dataloader methods common S3 classes like data.frames. case, better assign different class inputs implement as_dataloader .","code":""},{"path":"/reference/context.html","id":null,"dir":"Reference","previous_headings":"","what":"Context object — context","title":"Context object — context","text":"Context object storing information model training context. 
See also ctx.","code":""},{"path":"/reference/context.html","id":"public-fields","dir":"Reference","previous_headings":"","what":"Public fields","title":"Context object — context","text":"buffers list buffers callbacks can use write temporary information ctx.","code":""},{"path":"/reference/context.html","id":"active-bindings","dir":"Reference","previous_headings":"","what":"Active bindings","title":"Context object — context","text":"records stores information values logged self$log. device allows querying current accelerator device callbacks list callbacks called. iter current iteration batch current batch data. list input data targets. input shortcut ctx$batch[[1]] target shortcut ctx$batch[[2]] min_epochs minimum number epochs model run . max_epochs maximum number epochs model run. hparams list hyperparameters used initialize ctx$model. opt_hparams list hyperparameters used initialize ctx$optimizers. train_data dataloader used training model valid_data dataloader using model validation accelerator accelerator() used move data, model etc correct device. optimizers named list optimizers used model training. verbose bool whether process verbose mode . handlers List error handlers can used. See rlang::try_fetch() info. epoch_handlers List error handlers can used. See rlang::try_fetch() info. training bool indicating model training validation mode. model model trained. pred Last predicted values. opt Current optimizer. opt_name Current optimizer name. data Current dataloader use. loss_fn Loss function used train model loss Last computed loss values. Detached graph. loss_grad Last computed loss value, detached, can additional transformation. epoch Current epoch. metrics List metrics tracked process. step_opt Defines step called optimizer. 
must function taking optimizer argument.","code":""},{"path":[]},{"path":"/reference/context.html","id":"public-methods","dir":"Reference","previous_headings":"","what":"Public methods","title":"Context object — context","text":"context$new() context$log() context$log_metric() context$get_log() context$get_metrics() context$get_metric() context$get_formatted_metrics() context$get_metrics_df() context$set_verbose() context$clean() context$call_callbacks() context$state_dict() context$unsafe_set_records() context$clone()","code":""},{"path":"/reference/context.html","id":"method-new-","dir":"Reference","previous_headings":"","what":"Method new()","title":"Context object — context","text":"Initializes context object minimal necessary information.","code":""},{"path":"/reference/context.html","id":"usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Context object — context","text":"","code":"context$new(verbose, accelerator, callbacks, training)"},{"path":"/reference/context.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Context object — context","text":"verbose Whether context verbose mode . accelerator luz accelerator() configures device placement others. callbacks list callbacks used model. See luz_callback(). training boolean indicates context training mode .","code":""},{"path":"/reference/context.html","id":"method-log-","dir":"Reference","previous_headings":"","what":"Method log()","title":"Context object — context","text":"Allows logging arbitrary information ctx.","code":""},{"path":"/reference/context.html","id":"usage-1","dir":"Reference","previous_headings":"","what":"Usage","title":"Context object — context","text":"","code":"context$log(what, set, value, index = NULL, append = TRUE)"},{"path":"/reference/context.html","id":"arguments-1","dir":"Reference","previous_headings":"","what":"Arguments","title":"Context object — context","text":"(string) logging. 
set (string) Usually 'train' 'valid' indicating set want log . can arbitrary info. value value log value Arbitrary value log. index Index value logged. NULL value added end list, otherwise index used. append TRUE value corresponding index already exists, value appended current value. FALSE value overwritten favor new value.","code":""},{"path":"/reference/context.html","id":"method-log-metric-","dir":"Reference","previous_headings":"","what":"Method log_metric()","title":"Context object — context","text":"Log metric given name value. Metric values indexed epoch.","code":""},{"path":"/reference/context.html","id":"usage-2","dir":"Reference","previous_headings":"","what":"Usage","title":"Context object — context","text":"","code":"context$log_metric(name, value)"},{"path":"/reference/context.html","id":"arguments-2","dir":"Reference","previous_headings":"","what":"Arguments","title":"Context object — context","text":"name name metric value value log value Arbitrary value log.","code":""},{"path":"/reference/context.html","id":"method-get-log-","dir":"Reference","previous_headings":"","what":"Method get_log()","title":"Context object — context","text":"Get specific value log.","code":""},{"path":"/reference/context.html","id":"usage-3","dir":"Reference","previous_headings":"","what":"Usage","title":"Context object — context","text":"","code":"context$get_log(what, set, index = NULL)"},{"path":"/reference/context.html","id":"arguments-3","dir":"Reference","previous_headings":"","what":"Arguments","title":"Context object — context","text":"(string) logging. set (string) Usually 'train' 'valid' indicating set want log . can arbitrary info. index Index value logged. 
NULL value added end list, otherwise index used.","code":""},{"path":"/reference/context.html","id":"method-get-metrics-","dir":"Reference","previous_headings":"","what":"Method get_metrics()","title":"Context object — context","text":"Get metric given epoch set.","code":""},{"path":"/reference/context.html","id":"usage-4","dir":"Reference","previous_headings":"","what":"Usage","title":"Context object — context","text":"","code":"context$get_metrics(set, epoch = NULL)"},{"path":"/reference/context.html","id":"arguments-4","dir":"Reference","previous_headings":"","what":"Arguments","title":"Context object — context","text":"set (string) Usually 'train' 'valid' indicating set want log . can arbitrary info. epoch epoch want extract metrics .","code":""},{"path":"/reference/context.html","id":"method-get-metric-","dir":"Reference","previous_headings":"","what":"Method get_metric()","title":"Context object — context","text":"Get value metric given name, epoch set.","code":""},{"path":"/reference/context.html","id":"usage-5","dir":"Reference","previous_headings":"","what":"Usage","title":"Context object — context","text":"","code":"context$get_metric(name, set, epoch = NULL)"},{"path":"/reference/context.html","id":"arguments-5","dir":"Reference","previous_headings":"","what":"Arguments","title":"Context object — context","text":"name name metric set (string) Usually 'train' 'valid' indicating set want log . can arbitrary info. 
epoch epoch want extract metrics .","code":""},{"path":"/reference/context.html","id":"method-get-formatted-metrics-","dir":"Reference","previous_headings":"","what":"Method get_formatted_metrics()","title":"Context object — context","text":"Get formatted metrics values","code":""},{"path":"/reference/context.html","id":"usage-6","dir":"Reference","previous_headings":"","what":"Usage","title":"Context object — context","text":"","code":"context$get_formatted_metrics(set, epoch = NULL)"},{"path":"/reference/context.html","id":"arguments-6","dir":"Reference","previous_headings":"","what":"Arguments","title":"Context object — context","text":"set (string) Usually 'train' 'valid' indicating set want log . can arbitrary info. epoch epoch want extract metrics .","code":""},{"path":"/reference/context.html","id":"method-get-metrics-df-","dir":"Reference","previous_headings":"","what":"Method get_metrics_df()","title":"Context object — context","text":"Get data.frame containing metrics.","code":""},{"path":"/reference/context.html","id":"usage-7","dir":"Reference","previous_headings":"","what":"Usage","title":"Context object — context","text":"","code":"context$get_metrics_df()"},{"path":"/reference/context.html","id":"method-set-verbose-","dir":"Reference","previous_headings":"","what":"Method set_verbose()","title":"Context object — context","text":"Allows setting verbose attribute.","code":""},{"path":"/reference/context.html","id":"usage-8","dir":"Reference","previous_headings":"","what":"Usage","title":"Context object — context","text":"","code":"context$set_verbose(verbose = NULL)"},{"path":"/reference/context.html","id":"arguments-7","dir":"Reference","previous_headings":"","what":"Arguments","title":"Context object — context","text":"verbose boolean. TRUE verbose mode used. FALSE non verbose. 
NULL use result interactive().","code":""},{"path":"/reference/context.html","id":"method-clean-","dir":"Reference","previous_headings":"","what":"Method clean()","title":"Context object — context","text":"Removes unnecessary information context object.","code":""},{"path":"/reference/context.html","id":"usage-9","dir":"Reference","previous_headings":"","what":"Usage","title":"Context object — context","text":"","code":"context$clean()"},{"path":"/reference/context.html","id":"method-call-callbacks-","dir":"Reference","previous_headings":"","what":"Method call_callbacks()","title":"Context object — context","text":"Call selected callbacks. name callback types call, eg 'on_epoch_begin'.","code":""},{"path":"/reference/context.html","id":"usage-10","dir":"Reference","previous_headings":"","what":"Usage","title":"Context object — context","text":"","code":"context$call_callbacks(name)"},{"path":"/reference/context.html","id":"arguments-8","dir":"Reference","previous_headings":"","what":"Arguments","title":"Context object — context","text":"name name metric","code":""},{"path":"/reference/context.html","id":"method-state-dict-","dir":"Reference","previous_headings":"","what":"Method state_dict()","title":"Context object — context","text":"Returns list containing minimal information context. 
Used create returned values.","code":""},{"path":"/reference/context.html","id":"usage-11","dir":"Reference","previous_headings":"","what":"Usage","title":"Context object — context","text":"","code":"context$state_dict()"},{"path":"/reference/context.html","id":"method-unsafe-set-records-","dir":"Reference","previous_headings":"","what":"Method unsafe_set_records()","title":"Context object — context","text":"sure know ?","code":""},{"path":"/reference/context.html","id":"usage-12","dir":"Reference","previous_headings":"","what":"Usage","title":"Context object — context","text":"","code":"context$unsafe_set_records(records)"},{"path":"/reference/context.html","id":"arguments-9","dir":"Reference","previous_headings":"","what":"Arguments","title":"Context object — context","text":"records New set records set.","code":""},{"path":"/reference/context.html","id":"method-clone-","dir":"Reference","previous_headings":"","what":"Method clone()","title":"Context object — context","text":"objects class cloneable method.","code":""},{"path":"/reference/context.html","id":"usage-13","dir":"Reference","previous_headings":"","what":"Usage","title":"Context object — context","text":"","code":"context$clone(deep = FALSE)"},{"path":"/reference/context.html","id":"arguments-10","dir":"Reference","previous_headings":"","what":"Arguments","title":"Context object — context","text":"deep Whether make deep clone.","code":""},{"path":"/reference/ctx.html","id":null,"dir":"Reference","previous_headings":"","what":"Context object — ctx","title":"Context object — ctx","text":"Context objects used luz share information model methods, metrics callbacks.","code":""},{"path":"/reference/ctx.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Context object — ctx","text":"ctx object used luz share information training loop callbacks, model methods, metrics. table describes information available ctx default. callbacks potentially modify attributes add new ones. 
Context attributes","code":""},{"path":[]},{"path":"/reference/evaluate.html","id":null,"dir":"Reference","previous_headings":"","what":"Evaluates a fitted model on a dataset — evaluate","title":"Evaluates a fitted model on a dataset — evaluate","text":"Evaluates fitted model dataset","code":""},{"path":"/reference/evaluate.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Evaluates a fitted model on a dataset — evaluate","text":"","code":"evaluate(   object,   data,   ...,   metrics = NULL,   callbacks = list(),   accelerator = NULL,   verbose = NULL,   dataloader_options = NULL )"},{"path":"/reference/evaluate.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Evaluates a fitted model on a dataset — evaluate","text":"object fitted model evaluate. data (dataloader, dataset list) dataloader created torch::dataloader() used training model, dataset created torch::dataset() list. Dataloaders datasets must return list 2 items. first item used input module second used target loss function. ... Currently unused. metrics list luz metrics tracked evaluation. NULL (default) metrics used training tracked. callbacks (list, optional) list callbacks defined luz_callback() called training procedure. callbacks luz_callback_metrics(), luz_callback_progress() luz_callback_train_valid() always added default. accelerator (accelerator, optional) optional accelerator() object used configure device placement components like nn_modules, optimizers batches data. verbose (logical, optional) optional boolean value indicating fitting procedure emit output console training. default, produce output interactive() TRUE, otherwise print console. dataloader_options Options used creating dataloader. See torch::dataloader(). shuffle=TRUE default training data batch_size=32 default. 
error NULL data already dataloader.","code":""},{"path":"/reference/evaluate.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Evaluates a fitted model on a dataset — evaluate","text":"model trained might want evaluate performance different dataset. reason, luz provides ?evaluate function takes fitted model dataset computes metrics attached model. Evaluate returns luz_module_evaluation object can query metrics using get_metrics function simply print see results. example:","code":"evaluation <- fitted %>% evaluate(data = valid_dl) metrics <- get_metrics(evaluation) print(evaluation) ## A `luz_module_evaluation` ## -- Results --------------------------------------------------------------------- ## loss: 1.5146 ## mae: 1.0251 ## mse: 1.5159 ## rmse: 1.2312"},{"path":[]},{"path":"/reference/fit.luz_module_generator.html","id":null,"dir":"Reference","previous_headings":"","what":"Fit a nn_module — fit.luz_module_generator","title":"Fit a nn_module — fit.luz_module_generator","text":"Fit nn_module","code":""},{"path":"/reference/fit.luz_module_generator.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Fit a nn_module — fit.luz_module_generator","text":"","code":"# S3 method for luz_module_generator fit(   object,   data,   epochs = 10,   callbacks = NULL,   valid_data = NULL,   accelerator = NULL,   verbose = NULL,   ...,   dataloader_options = NULL )"},{"path":"/reference/fit.luz_module_generator.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Fit a nn_module — fit.luz_module_generator","text":"object nn_module setup(). data (dataloader, dataset list) dataloader created torch::dataloader() used training model, dataset created torch::dataset() list. Dataloaders datasets must return list 2 items. first item used input module second used target loss function. epochs (int) maximum number epochs training model. 
single value provided, taken max_epochs min_epochs set 0. vector two numbers provided, first value min_epochs second value max_epochs. minimum maximum number epochs included context object ctx$min_epochs ctx$max_epochs, respectively. callbacks (list, optional) list callbacks defined luz_callback() called training procedure. callbacks luz_callback_metrics(), luz_callback_progress() luz_callback_train_valid() always added default. valid_data (dataloader, dataset, list scalar value; optional) dataloader created torch::dataloader() dataset created torch::dataset() used validation procedure. must return list (input, target). data torch dataset list, can also supply numeric value 0 1 - case random sample size corresponding proportion data used validation. accelerator (accelerator, optional) optional accelerator() object used configure device placement components like nn_modules, optimizers batches data. verbose (logical, optional) optional boolean value indicating fitting procedure emit output console training. default, produce output interactive() TRUE, otherwise print console. ... Currently unused. dataloader_options Options used creating dataloader. See torch::dataloader(). shuffle=TRUE default training data batch_size=32 default. 
error NULL data already dataloader.","code":""},{"path":"/reference/fit.luz_module_generator.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Fit a nn_module — fit.luz_module_generator","text":"fitted object can saved luz_save() can printed print() plotted plot().","code":""},{"path":[]},{"path":"/reference/get_metrics.html","id":null,"dir":"Reference","previous_headings":"","what":"Get metrics from the object — get_metrics","title":"Get metrics from the object — get_metrics","text":"Get metrics object","code":""},{"path":"/reference/get_metrics.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Get metrics from the object — get_metrics","text":"","code":"get_metrics(object, ...)  # S3 method for luz_module_fitted get_metrics(object, ...)"},{"path":"/reference/get_metrics.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Get metrics from the object — get_metrics","text":"object object query metrics. ... 
Currently unused.","code":""},{"path":"/reference/get_metrics.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Get metrics from the object — get_metrics","text":"data.frame containing metric values.","code":""},{"path":"/reference/get_metrics.html","id":"methods-by-class-","dir":"Reference","previous_headings":"","what":"Methods (by class)","title":"Get metrics from the object — get_metrics","text":"get_metrics(luz_module_fitted): Extract metrics luz fitted model.","code":""},{"path":"/reference/lr_finder.html","id":null,"dir":"Reference","previous_headings":"","what":"Learning Rate Finder — lr_finder","title":"Learning Rate Finder — lr_finder","text":"Learning Rate Finder","code":""},{"path":"/reference/lr_finder.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Learning Rate Finder — lr_finder","text":"","code":"lr_finder(   object,   data,   steps = 100,   start_lr = 1e-07,   end_lr = 0.1,   log_spaced_intervals = TRUE,   ...,   verbose = NULL )"},{"path":"/reference/lr_finder.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Learning Rate Finder — lr_finder","text":"object nn_module setup(). data (dataloader) dataloader created torch::dataloader()  used learning rate finding. steps (integer) number steps iterate learning rate finder. Default: 100. start_lr (float) smallest learning rate. Default: 1e-7. end_lr (float) highest learning rate. Default: 1e-1. log_spaced_intervals (logical) Whether divide range start_lr end_lr log-spaced intervals (alternative: uniform intervals). Default: TRUE ... arguments passed fit. 
verbose Whether show progress bar process.","code":""},{"path":"/reference/lr_finder.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Learning Rate Finder — lr_finder","text":"dataframe two columns: learning rate loss","code":""},{"path":"/reference/lr_finder.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Learning Rate Finder — lr_finder","text":"","code":"if (torch::torch_is_installed()) { library(torch) ds <- torch::tensor_dataset(x = torch_randn(100, 10), y = torch_randn(100, 1)) dl <- torch::dataloader(ds, batch_size = 32) model <- torch::nn_linear model <- model %>% setup(   loss = torch::nn_mse_loss(),   optimizer = torch::optim_adam ) %>%   set_hparams(in_features = 10, out_features = 1) records <- lr_finder(model, dl, verbose = FALSE) plot(records) }"},{"path":"/reference/luz_callback.html","id":null,"dir":"Reference","previous_headings":"","what":"Create a new callback — luz_callback","title":"Create a new callback — luz_callback","text":"Create new callback","code":""},{"path":"/reference/luz_callback.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Create a new callback — luz_callback","text":"","code":"luz_callback(   name = NULL,   ...,   private = NULL,   active = NULL,   parent_env = parent.frame(),   inherit = NULL )"},{"path":"/reference/luz_callback.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Create a new callback — luz_callback","text":"name name callback ... Public methods callback. name methods used know called. See details section. private optional list private members, can functions non-functions. active optional list active binding functions. parent_env environment use parent newly-created objects. inherit R6ClassGenerator object inherit ; words, superclass. 
captured unevaluated expression evaluated parent_env time object instantiated.","code":""},{"path":"/reference/luz_callback.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Create a new callback — luz_callback","text":"luz_callback can passed fit.luz_module_generator().","code":""},{"path":"/reference/luz_callback.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Create a new callback — luz_callback","text":"Let’s implement callback prints ‘Iteration n’ (n iteration number) every batch training set ‘Done’ epoch finished. task use luz_callback function:   luz_callback() takes named functions ... arguments, name indicates moment callback called. instance on_train_batch_end() called every batch end training procedure, on_epoch_end() called end every epoch. returned value luz_callback() function initializes instance callback. Callbacks can initialization parameters, like name file want log results. case, can pass initialize method creating callback definition, save parameters self object. example, callback message parameter printed end epoch. callback defined can passed fit function via callbacks parameter:   Callbacks can called many different positions training loop, including combinations . ’s overview possible callback breakpoints:   Every step marked on_* point training procedure available callbacks called. important part callbacks ctx (context) object. See help(\"ctx\") details. default, callbacks called order passed fit (predict evaluate), can provide weight attribute control order called. example, one callback weight = 10 another weight = 1, first one called second one. Callbacks don’t specify weight attribute considered weight = 0. built-callbacks luz already provide weight value. 
example, ?luz_callback_early_stopping weight Inf, since general want run last thing loop.","code":"print_callback <- luz_callback(   name = \"print_callback\",   initialize = function(message) {     self$message <- message   },   on_train_batch_end = function() {     cat(\"Iteration \", ctx$iter, \"\\n\")   },   on_epoch_end = function() {     cat(self$message, \"\\n\")   } ) fitted <- net %>%   setup(...) %>%   fit(..., callbacks = list(     print_callback(message = \"Done!\")   )) Start Fit    - on_fit_begin   Start Epoch Loop      - on_epoch_begin     Start Train        - on_train_begin       Start Batch Loop          - on_train_batch_begin           Start Default Training Step             - on_train_batch_after_pred             - on_train_batch_after_loss             - on_train_batch_before_backward             - on_train_batch_before_step             - on_train_batch_after_step           End Default Training Step:          - on_train_batch_end       End Batch Loop        - on_train_end     End Train     Start Valid        - on_valid_begin       Start Batch Loop          - on_valid_batch_begin           Start Default Validation Step             - on_valid_batch_after_pred             - on_valid_batch_after_loss           End Default Validation Step          - on_valid_batch_end       End Batch Loop        - on_valid_end     End Valid       - on_epoch_end   End Epoch Loop    - on_fit_end End Fit"},{"path":"/reference/luz_callback.html","id":"prediction-callbacks","dir":"Reference","previous_headings":"","what":"Prediction callbacks","title":"Create a new callback — luz_callback","text":"can also use callbacks using predict(). 
case supported callback methods detailed .","code":"Start predict  - on_predict_begin  Start prediction loop   - on_predict_batch_begin   - on_predict_batch_end  End prediction loop  - on_predict_end End predict"},{"path":"/reference/luz_callback.html","id":"evaluate-callbacks","dir":"Reference","previous_headings":"","what":"Evaluate callbacks","title":"Create a new callback — luz_callback","text":"Callbacks can also used evaluate(), case, callbacks used equivalent validation loop using fit():","code":"Start Valid  - on_valid_begin  Start Batch Loop   - on_valid_batch_begin   Start Default Validation Step    - on_valid_batch_after_pred    - on_valid_batch_after_loss   End Default Validation Step   - on_valid_batch_end  End Batch Loop  - on_valid_end End Valid"},{"path":[]},{"path":"/reference/luz_callback.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Create a new callback — luz_callback","text":"","code":"print_callback <- luz_callback(  name = \"print_callback\",  on_train_batch_end = function() {    cat(\"Iteration \", ctx$iter, \"\\n\")  },  on_epoch_end = function() {    cat(\"Done!\\n\")  } )"},{"path":"/reference/luz_callback_auto_resume.html","id":null,"dir":"Reference","previous_headings":"","what":"Resume training callback — luz_callback_auto_resume","title":"Resume training callback — luz_callback_auto_resume","text":"callback allows resume training model.","code":""},{"path":"/reference/luz_callback_auto_resume.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Resume training callback — luz_callback_auto_resume","text":"","code":"luz_callback_auto_resume(path = \"./state.pt\")"},{"path":"/reference/luz_callback_auto_resume.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Resume training callback — luz_callback_auto_resume","text":"path Path save state files 
model.","code":""},{"path":"/reference/luz_callback_auto_resume.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Resume training callback — luz_callback_auto_resume","text":"using , model weights, optimizer state serialized end epoch. something fails training simply re-running script restart model training epoch right last epoch serialized.","code":""},{"path":"/reference/luz_callback_auto_resume.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"Resume training callback — luz_callback_auto_resume","text":"general want add callback last callbacks list, way, serialized state likely contain possible changes callbacks made 'on_epoch_end'. default weight attribute callback Inf. Read checkpointing article pkgdown website information.","code":""},{"path":"/reference/luz_callback_auto_resume.html","id":"customizing-serialization","dir":"Reference","previous_headings":"","what":"Customizing serialization","title":"Resume training callback — luz_callback_auto_resume","text":"default model, optimizer state records serialized. Callbacks can used customize serialization implementing state_dict() load_state_dict() methods. methods implemented, state_dict() called end epoch load_state_dict() called model resumed.","code":""},{"path":[]},{"path":"/reference/luz_callback_auto_resume.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Resume training callback — luz_callback_auto_resume","text":"","code":"if (torch::torch_is_installed()) { library(torch) library(luz)  x <- torch_randn(1000, 10) y <- torch_randn(1000, 1)  model <- nn_linear %>%   setup(optimizer = optim_sgd, loss = nnf_mse_loss) %>%   set_hparams(in_features = 10, out_features = 1) %>%   set_opt_hparams(lr = 0.01)   # simulate a failure in the middle of epoch 5 happening only once. 
callback_stop <- luz_callback(   \"interrupt\",   failed = FALSE,   on_epoch_end = function() {     if (ctx$epoch == 5 && !self$failed) {       self$failed <- TRUE       stop(\"Error on epoch 5\")     }   } )  path <- tempfile() autoresume <- luz_callback_auto_resume(path = path) interrupt <- callback_stop()  # try once and the model fails try({   results <- model %>% fit(     list(x, y),     callbacks = list(autoresume, interrupt),     verbose = FALSE   ) })  # model resumes and completes results <- model %>% fit(   list(x, y),   callbacks = list(autoresume, interrupt),   verbose = FALSE )  get_metrics(results)  } #> Error in FUN(X[[i]], ...) :  #>   Error while calling callback with class <interrupt/LuzCallback/R6> at #> on_epoch_end. #> Caused by error in `self[[callback_nm]]()`: #> ! Error on epoch 5 #>      set metric epoch    value #> 1  train   loss     1 1.217334 #> 2  train   loss     2 1.079304 #> 3  train   loss     3 1.040630 #> 4  train   loss     4 1.027106 #> 5  train   loss     5 1.023069 #> 6  train   loss     6 1.017577 #> 7  train   loss     7 1.016829 #> 8  train   loss     8 1.020484 #> 9  train   loss     9 1.022464 #> 10 train   loss    10 1.025988"},{"path":"/reference/luz_callback_csv_logger.html","id":null,"dir":"Reference","previous_headings":"","what":"CSV logger callback — luz_callback_csv_logger","title":"CSV logger callback — luz_callback_csv_logger","text":"Logs metrics obtained training file disk. 
file 1 line epoch/validation.","code":""},{"path":"/reference/luz_callback_csv_logger.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"CSV logger callback — luz_callback_csv_logger","text":"","code":"luz_callback_csv_logger(path)"},{"path":"/reference/luz_callback_csv_logger.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"CSV logger callback — luz_callback_csv_logger","text":"path path file disk.","code":""},{"path":[]},{"path":"/reference/luz_callback_early_stopping.html","id":null,"dir":"Reference","previous_headings":"","what":"Early stopping callback — luz_callback_early_stopping","title":"Early stopping callback — luz_callback_early_stopping","text":"Stops training monitored metric stops improving","code":""},{"path":"/reference/luz_callback_early_stopping.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Early stopping callback — luz_callback_early_stopping","text":"","code":"luz_callback_early_stopping(   monitor = \"valid_loss\",   min_delta = 0,   patience = 0,   mode = \"min\",   baseline = NULL )"},{"path":"/reference/luz_callback_early_stopping.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Early stopping callback — luz_callback_early_stopping","text":"monitor string format <set>_<metric> <set> can 'train' 'valid' <metric> can abbreviation metric tracking training. metric name case insensitive. min_delta Minimum improvement reset patience counter. patience Number epochs without improving stopping training. mode Specifies direction considered improvement. default 'min' used. Can also 'max' (higher better) 'zero' (closer zero better). baseline initial value used best seen value beginning. 
Model stop training better baseline value found first patience epochs.","code":""},{"path":"/reference/luz_callback_early_stopping.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Early stopping callback — luz_callback_early_stopping","text":"luz_callback early stopping.","code":""},{"path":"/reference/luz_callback_early_stopping.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"Early stopping callback — luz_callback_early_stopping","text":"callback adds on_early_stopping callback can used call callbacks soon model stops training. verbose=TRUE fit.luz_module_generator() message printed early stopping.","code":""},{"path":[]},{"path":"/reference/luz_callback_early_stopping.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Early stopping callback — luz_callback_early_stopping","text":"","code":"cb <- luz_callback_early_stopping()"},{"path":"/reference/luz_callback_gradient_clip.html","id":null,"dir":"Reference","previous_headings":"","what":"Gradient clipping callback — luz_callback_gradient_clip","title":"Gradient clipping callback — luz_callback_gradient_clip","text":"adding GradientClip callback, gradient norm_type (default:2) norm clipped max_norm (default:1) using torch::nn_utils_clip_grad_norm_(), can avoid loss divergence.","code":""},{"path":"/reference/luz_callback_gradient_clip.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Gradient clipping callback — luz_callback_gradient_clip","text":"","code":"luz_callback_gradient_clip(max_norm = 1, norm_type = 2)"},{"path":"/reference/luz_callback_gradient_clip.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Gradient clipping callback — luz_callback_gradient_clip","text":"max_norm (float int): max norm gradients norm_type (float int): type used p-norm. 
Can Inf infinity norm.","code":""},{"path":"/reference/luz_callback_gradient_clip.html","id":"references","dir":"Reference","previous_headings":"","what":"References","title":"Gradient clipping callback — luz_callback_gradient_clip","text":"See FastAI documentation GradientClip callback.","code":""},{"path":"/reference/luz_callback_interrupt.html","id":null,"dir":"Reference","previous_headings":"","what":"Interrupt callback — luz_callback_interrupt","title":"Interrupt callback — luz_callback_interrupt","text":"Adds handler allows interrupting training loop using ctrl + C. Also registers on_interrupt breakpoint users can register callbacks run training loop interruption.","code":""},{"path":"/reference/luz_callback_interrupt.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Interrupt callback — luz_callback_interrupt","text":"","code":"luz_callback_interrupt()"},{"path":"/reference/luz_callback_interrupt.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Interrupt callback — luz_callback_interrupt","text":"luz_callback","code":""},{"path":"/reference/luz_callback_interrupt.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"Interrupt callback — luz_callback_interrupt","text":"general need use callback always included default fit.luz_module_generator().","code":""},{"path":[]},{"path":"/reference/luz_callback_interrupt.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Interrupt callback — luz_callback_interrupt","text":"","code":"interrupt_callback <- luz_callback_interrupt()"},{"path":"/reference/luz_callback_keep_best_model.html","id":null,"dir":"Reference","previous_headings":"","what":"Keep the best model — luz_callback_keep_best_model","title":"Keep the best model — luz_callback_keep_best_model","text":"epoch, improvement monitored metric serialize model weights temp file. 
training done, reload weights best model.","code":""},{"path":"/reference/luz_callback_keep_best_model.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Keep the best model — luz_callback_keep_best_model","text":"","code":"luz_callback_keep_best_model(   monitor = \"valid_loss\",   mode = \"min\",   min_delta = 0 )"},{"path":"/reference/luz_callback_keep_best_model.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Keep the best model — luz_callback_keep_best_model","text":"monitor string format <set>_<metric> <set> can 'train' 'valid' <metric> can abbreviation metric tracking training. metric name case insensitive. mode Specifies direction considered improvement. default 'min' used. Can also 'max' (higher better) 'zero' (closer zero better). min_delta Minimum improvement reset patience counter.","code":""},{"path":[]},{"path":"/reference/luz_callback_keep_best_model.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Keep the best model — luz_callback_keep_best_model","text":"","code":"cb <- luz_callback_keep_best_model()"},{"path":"/reference/luz_callback_lr_scheduler.html","id":null,"dir":"Reference","previous_headings":"","what":"Learning rate scheduler callback — luz_callback_lr_scheduler","title":"Learning rate scheduler callback — luz_callback_lr_scheduler","text":"Initializes runs torch::lr_scheduler()s.","code":""},{"path":"/reference/luz_callback_lr_scheduler.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Learning rate scheduler callback — luz_callback_lr_scheduler","text":"","code":"luz_callback_lr_scheduler(   lr_scheduler,   ...,   call_on = \"on_epoch_end\",   opt_name = NULL )"},{"path":"/reference/luz_callback_lr_scheduler.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Learning rate scheduler callback — 
luz_callback_lr_scheduler","text":"lr_scheduler torch::lr_scheduler() initialized optimizer ... parameters. ... Additional arguments passed lr_scheduler together optimizers. call_on callback breakpoint scheduler$step() called. Default 'on_epoch_end'. See luz_callback() information. opt_name name optimizer affected callback. match name given set_optimizers. module single optimizer, opt_name used.","code":""},{"path":"/reference/luz_callback_lr_scheduler.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Learning rate scheduler callback — luz_callback_lr_scheduler","text":"luz_callback() generator.","code":""},{"path":[]},{"path":"/reference/luz_callback_lr_scheduler.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Learning rate scheduler callback — luz_callback_lr_scheduler","text":"","code":"if (torch::torch_is_installed()) { cb <- luz_callback_lr_scheduler(torch::lr_step, step_size = 30) }"},{"path":"/reference/luz_callback_metrics.html","id":null,"dir":"Reference","previous_headings":"","what":"Metrics callback — luz_callback_metrics","title":"Metrics callback — luz_callback_metrics","text":"Tracks metrics passed setup() training validation.","code":""},{"path":"/reference/luz_callback_metrics.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Metrics callback — luz_callback_metrics","text":"","code":"luz_callback_metrics()"},{"path":"/reference/luz_callback_metrics.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Metrics callback — luz_callback_metrics","text":"luz_callback","code":""},{"path":"/reference/luz_callback_metrics.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Metrics callback — luz_callback_metrics","text":"callback takes care 2 ctx attributes: ctx$metrics: stores current metrics objects initialized epoch, update()d compute()d every batch. 
rarely need work metrics. ctx$records$metrics: Stores metrics per training/validation epoch. structure similar ctx$losses.","code":""},{"path":"/reference/luz_callback_metrics.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"Metrics callback — luz_callback_metrics","text":"general need explicitly use metrics callback used default fit.luz_module_generator().","code":""},{"path":[]},{"path":"/reference/luz_callback_mixed_precision.html","id":null,"dir":"Reference","previous_headings":"","what":"Automatic Mixed Precision callback — luz_callback_mixed_precision","title":"Automatic Mixed Precision callback — luz_callback_mixed_precision","text":"callback enable torch::local_autocast() training model forward loss computation. disable autocast scale loss backward() opt$step(). See information.","code":""},{"path":"/reference/luz_callback_mixed_precision.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Automatic Mixed Precision callback — luz_callback_mixed_precision","text":"","code":"luz_callback_mixed_precision(...)"},{"path":"/reference/luz_callback_mixed_precision.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Automatic Mixed Precision callback — luz_callback_mixed_precision","text":"... Passed torch::cuda_amp_grad_scaler().","code":""},{"path":"/reference/luz_callback_mixed_precision.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Automatic Mixed Precision callback — luz_callback_mixed_precision","text":"luz_callback","code":""},{"path":[]},{"path":"/reference/luz_callback_mixup.html","id":null,"dir":"Reference","previous_headings":"","what":"Mixup callback — luz_callback_mixup","title":"Mixup callback — luz_callback_mixup","text":"Implementation 'mixup: Beyond Empirical Risk Minimization'. today, tested categorical data, targets expected integers, one-hot encoded vectors. 
callback supposed used together nn_mixup_loss().","code":""},{"path":"/reference/luz_callback_mixup.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Mixup callback — luz_callback_mixup","text":"","code":"luz_callback_mixup(alpha = 0.4, ..., run_valid = FALSE, auto_loss = FALSE)"},{"path":"/reference/luz_callback_mixup.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Mixup callback — luz_callback_mixup","text":"alpha parameter beta distribution used sample mixing coefficients ... currently unused. Just force named arguments. run_valid run validation auto_loss automatically modify loss function? wrap loss function create mixup loss. TRUE make sure loss function apply reductions. run_valid=FALSE, loss mean reduced validation.","code":""},{"path":"/reference/luz_callback_mixup.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Mixup callback — luz_callback_mixup","text":"luz_callback","code":""},{"path":"/reference/luz_callback_mixup.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Mixup callback — luz_callback_mixup","text":"Overall, follow fastai implementation described . Namely, work single dataloader , randomly mixing two observations batch. linearly combine losses computed targets: loss(output, new_target) = weight * loss(output, target1) + (1-weight) * loss(output, target2) draw different mixing coefficients every pair. 
replace weight weight = max(weight, 1-weight) avoid duplicates.","code":""},{"path":[]},{"path":"/reference/luz_callback_mixup.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Mixup callback — luz_callback_mixup","text":"","code":"if (torch::torch_is_installed()) { mixup_callback <- luz_callback_mixup() }"},{"path":"/reference/luz_callback_model_checkpoint.html","id":null,"dir":"Reference","previous_headings":"","what":"Checkpoints model weights — luz_callback_model_checkpoint","title":"Checkpoints model weights — luz_callback_model_checkpoint","text":"saves checkpoints model according specified metric behavior.","code":""},{"path":"/reference/luz_callback_model_checkpoint.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Checkpoints model weights — luz_callback_model_checkpoint","text":"","code":"luz_callback_model_checkpoint(   path,   monitor = \"valid_loss\",   save_best_only = FALSE,   mode = \"min\",   min_delta = 0 )"},{"path":"/reference/luz_callback_model_checkpoint.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Checkpoints model weights — luz_callback_model_checkpoint","text":"path Path save model disk. path interpolated glue, can use attribute within ctx using '{ctx$epoch}'. Specially epoch monitor quantities already environment. specified path path directory (ends / \\), models saved name given epoch-{epoch:02d}-{self$monitor}-{monitor:.3f}.pt. See examples. can use sprintf() quickly format quantities, example:'{epoch:02d}'. monitor string format <set>_<metric> <set> can 'train' 'valid' <metric> can abbreviation metric tracking training. metric name case insensitive. save_best_only TRUE models saved improvement previously saved model. mode Specifies direction considered improvement. default 'min' used. Can also 'max' (higher better) 'zero' (closer zero better). min_delta Minimum difference consider improvement. 
used save_best_only=TRUE.","code":""},{"path":"/reference/luz_callback_model_checkpoint.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"Checkpoints model weights — luz_callback_model_checkpoint","text":"mode min_delta used save_best_only=TRUE. save_best_only overwrite saved models path parameter differentiate epochs. Read checkpointing article pkgdown website information.","code":""},{"path":[]},{"path":"/reference/luz_callback_model_checkpoint.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Checkpoints model weights — luz_callback_model_checkpoint","text":"","code":"luz_callback_model_checkpoint(path= \"path/to/dir\") #> <model_checkpoint_callback> #>   Inherits from: <monitor_metrics> #>   Public: #>     call: function (callback_nm)  #>     clone: function (deep = FALSE)  #>     compare: function (new, old)  #>     find_quantity: function ()  #>     fmt_path: function (path)  #>     initialize: function (path, monitor = \"valid_loss\", save_best_only = FALSE,  #>     min_delta: 0 #>     mode: min #>     monitor: valid_loss #>     on_epoch_end: function ()  #>     path: path/to/dir #>     save_best_only: FALSE #>     set_ctx: function (ctx)  luz_callback_model_checkpoint(path= \"path/to/dir/epoch-{epoch:02d}/model.pt\") #> <model_checkpoint_callback> #>   Inherits from: <monitor_metrics> #>   Public: #>     call: function (callback_nm)  #>     clone: function (deep = FALSE)  #>     compare: function (new, old)  #>     find_quantity: function ()  #>     fmt_path: function (path)  #>     initialize: function (path, monitor = \"valid_loss\", save_best_only = FALSE,  #>     min_delta: 0 #>     mode: min #>     monitor: valid_loss #>     on_epoch_end: function ()  #>     path: path/to/dir/epoch-{epoch:02d}/model.pt #>     save_best_only: FALSE #>     set_ctx: function (ctx)  luz_callback_model_checkpoint(path= \"path/to/dir/epoch-{epoch:02d}/model-{monitor:.2f}.pt\") #> 
<model_checkpoint_callback> #>   Inherits from: <monitor_metrics> #>   Public: #>     call: function (callback_nm)  #>     clone: function (deep = FALSE)  #>     compare: function (new, old)  #>     find_quantity: function ()  #>     fmt_path: function (path)  #>     initialize: function (path, monitor = \"valid_loss\", save_best_only = FALSE,  #>     min_delta: 0 #>     mode: min #>     monitor: valid_loss #>     on_epoch_end: function ()  #>     path: path/to/dir/epoch-{epoch:02d}/model-{monitor:.2f}.pt #>     save_best_only: FALSE #>     set_ctx: function (ctx)"},{"path":"/reference/luz_callback_profile.html","id":null,"dir":"Reference","previous_headings":"","what":"Profile callback — luz_callback_profile","title":"Profile callback — luz_callback_profile","text":"Computes times high-level operations training loops.","code":""},{"path":"/reference/luz_callback_profile.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Profile callback — luz_callback_profile","text":"","code":"luz_callback_profile()"},{"path":"/reference/luz_callback_profile.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Profile callback — luz_callback_profile","text":"luz_callback","code":""},{"path":"/reference/luz_callback_profile.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Profile callback — luz_callback_profile","text":"Records saved ctx$records$profile. Times stored seconds. Data stored following structure: fit time entire fit procedure. 
epoch times per epoch","code":""},{"path":"/reference/luz_callback_profile.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"Profile callback — luz_callback_profile","text":"general need use callback always included default fit.luz_module_generator().","code":""},{"path":[]},{"path":"/reference/luz_callback_profile.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Profile callback — luz_callback_profile","text":"","code":"profile_callback <- luz_callback_profile()"},{"path":"/reference/luz_callback_progress.html","id":null,"dir":"Reference","previous_headings":"","what":"Progress callback — luz_callback_progress","title":"Progress callback — luz_callback_progress","text":"Responsible printing progress training.","code":""},{"path":"/reference/luz_callback_progress.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Progress callback — luz_callback_progress","text":"","code":"luz_callback_progress()"},{"path":"/reference/luz_callback_progress.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Progress callback — luz_callback_progress","text":"luz_callback","code":""},{"path":"/reference/luz_callback_progress.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"Progress callback — luz_callback_progress","text":"general need use callback always included default fit.luz_module_generator(). 
Printing can disabled passing verbose=FALSE fit.luz_module_generator().","code":""},{"path":[]},{"path":"/reference/luz_callback_resume_from_checkpoint.html","id":null,"dir":"Reference","previous_headings":"","what":"Allow resume model training from a specific checkpoint — luz_callback_resume_from_checkpoint","title":"Allow resume model training from a specific checkpoint — luz_callback_resume_from_checkpoint","text":"Allow resume model training specific checkpoint","code":""},{"path":"/reference/luz_callback_resume_from_checkpoint.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Allow resume model training from a specific checkpoint — luz_callback_resume_from_checkpoint","text":"","code":"luz_callback_resume_from_checkpoint(   path,   ...,   restore_model_state = TRUE,   restore_records = FALSE,   restore_optimizer_state = FALSE,   restore_callbacks_state = FALSE )"},{"path":"/reference/luz_callback_resume_from_checkpoint.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Allow resume model training from a specific checkpoint — luz_callback_resume_from_checkpoint","text":"path Path checkpoint want resume. ... currently unused. restore_model_state Whether restore model state callback. restore_records Whether restore records checkpoint. restore_optimizer_state Whether restore optimizer state checkpoint. 
restore_callbacks_state Whether restore callbacks state checkpoint.","code":""},{"path":"/reference/luz_callback_resume_from_checkpoint.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"Allow resume model training from a specific checkpoint — luz_callback_resume_from_checkpoint","text":"Read checkpointing article pkgdown website information.","code":""},{"path":[]},{"path":"/reference/luz_callback_tfevents.html","id":null,"dir":"Reference","previous_headings":"","what":"tfevents callback — luz_callback_tfevents","title":"tfevents callback — luz_callback_tfevents","text":"Logs metrics model information tfevents file format. Assuming tensorboard installed, result can visualized ","code":""},{"path":"/reference/luz_callback_tfevents.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"tfevents callback — luz_callback_tfevents","text":"","code":"luz_callback_tfevents(logdir = \"logs\", histograms = FALSE, ...)"},{"path":"/reference/luz_callback_tfevents.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"tfevents callback — luz_callback_tfevents","text":"logdir directory log written . histograms boolean specifying histograms model weights logged. can also character vector specifying name parameters logged (names names(model$parameters)). ... Currently used. 
future expansion.","code":""},{"path":"/reference/luz_callback_tfevents.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"tfevents callback — luz_callback_tfevents","text":"","code":"tensorboard --logdir=logs"},{"path":"/reference/luz_callback_tfevents.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"tfevents callback — luz_callback_tfevents","text":"","code":"if (torch::torch_is_installed()) { library(torch) x <- torch_randn(1000, 10) y <- torch_randn(1000, 1)  model <- nn_linear %>%   setup(loss = nnf_mse_loss, optimizer = optim_adam) %>%   set_hparams(in_features = 10, out_features = 1) %>%   set_opt_hparams(lr = 1e-4)  tmp <- tempfile()  model %>% fit(list(x, y), valid_data = 0.2, callbacks = list(   luz_callback_tfevents(tmp, histograms = TRUE) )) } #> A `luz_module_fitted` #> ── Time ──────────────────────────────────────────────────────────────────────── #> • Total time: 2.4s #> • Avg time per training epoch: 177ms #>  #> ── Results ───────────────────────────────────────────────────────────────────── #> Metrics observed in the last epoch. #>  #> ℹ Training: #> loss: 1.4048 #>  #> ── Model ─────────────────────────────────────────────────────────────────────── #> An `nn_module` containing 11 parameters. 
#>  #> ── Parameters ────────────────────────────────────────────────────────────────── #> • weight: Float [1:1, 1:10] #> • bias: Float [1:1]"},{"path":"/reference/luz_callback_train_valid.html","id":null,"dir":"Reference","previous_headings":"","what":"Train-eval callback — luz_callback_train_valid","title":"Train-eval callback — luz_callback_train_valid","text":"Switches important flags training evaluation modes.","code":""},{"path":"/reference/luz_callback_train_valid.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Train-eval callback — luz_callback_train_valid","text":"","code":"luz_callback_train_valid()"},{"path":"/reference/luz_callback_train_valid.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Train-eval callback — luz_callback_train_valid","text":"luz_callback","code":""},{"path":"/reference/luz_callback_train_valid.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Train-eval callback — luz_callback_train_valid","text":"takes care three ctx attributes: ctx$model: Responsible calling ctx$model$train() ctx$model$eval(), appropriate. ctx$training: Sets flag TRUE training FALSE validation mode. ctx$loss: Resets loss attribute list() finished training/ validating.","code":""},{"path":"/reference/luz_callback_train_valid.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"Train-eval callback — luz_callback_train_valid","text":"general need explicitly use metrics callback used default fit.luz_module_generator().","code":""},{"path":[]},{"path":"/reference/luz_load.html","id":null,"dir":"Reference","previous_headings":"","what":"Load trained model — luz_load","title":"Load trained model — luz_load","text":"Loads fitted model. 
See documentation luz_save().","code":""},{"path":"/reference/luz_load.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Load trained model — luz_load","text":"","code":"luz_load(path)"},{"path":"/reference/luz_load.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Load trained model — luz_load","text":"path path file system save object.","code":""},{"path":[]},{"path":"/reference/luz_load_checkpoint.html","id":null,"dir":"Reference","previous_headings":"","what":"Loads a checkpoint — luz_load_checkpoint","title":"Loads a checkpoint — luz_load_checkpoint","text":"Works checkpoints created typically luz_callback_model_checkpoint().","code":""},{"path":"/reference/luz_load_checkpoint.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Loads a checkpoint — luz_load_checkpoint","text":"","code":"luz_load_checkpoint(obj, path, ...)"},{"path":"/reference/luz_load_checkpoint.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Loads a checkpoint — luz_load_checkpoint","text":"obj Object want load checkpoint. path Path checkpoint disk. ... unused. allow future extensions.","code":""},{"path":"/reference/luz_load_model_weights.html","id":null,"dir":"Reference","previous_headings":"","what":"Loads model weights into a fitted object. — luz_load_model_weights","title":"Loads model weights into a fitted object. — luz_load_model_weights","text":"can useful saved model checkpoints training want reload best checkpoint end.","code":""},{"path":"/reference/luz_load_model_weights.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Loads model weights into a fitted object. — luz_load_model_weights","text":"","code":"luz_load_model_weights(obj, path, ...)  
luz_save_model_weights(obj, path)"},{"path":"/reference/luz_load_model_weights.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Loads model weights into a fitted object. — luz_load_model_weights","text":"obj luz object want copy new weights. path path saved model disk. ... arguments passed torch_load().","code":""},{"path":"/reference/luz_load_model_weights.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Loads model weights into a fitted object. — luz_load_model_weights","text":"Returns NULL invisibly.","code":""},{"path":"/reference/luz_load_model_weights.html","id":"warning","dir":"Reference","previous_headings":"","what":"Warning","title":"Loads model weights into a fitted object. — luz_load_model_weights","text":"luz_save_model_weights operates inplace, ie modifies model object contain new weights.","code":""},{"path":"/reference/luz_metric.html","id":null,"dir":"Reference","previous_headings":"","what":"Creates a new luz metric — luz_metric","title":"Creates a new luz metric — luz_metric","text":"Creates new luz metric","code":""},{"path":"/reference/luz_metric.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Creates a new luz metric — luz_metric","text":"","code":"luz_metric(   name = NULL,   ...,   private = NULL,   active = NULL,   parent_env = parent.frame(),   inherit = NULL )"},{"path":"/reference/luz_metric.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Creates a new luz metric — luz_metric","text":"name string naming new metric. ... named list public methods. implement least initialize, update compute. See details section information. private optional list private members, can functions non-functions. active optional list active binding functions. parent_env environment use parent newly-created objects. inherit R6ClassGenerator object inherit ; words, superclass. 
captured unevaluated expression evaluated parent_env time object instantiated.","code":""},{"path":"/reference/luz_metric.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Creates a new luz metric — luz_metric","text":"Returns new luz metric.","code":""},{"path":"/reference/luz_metric.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Creates a new luz metric — luz_metric","text":"order implement new luz_metric need implement 3 methods: initialize: defines metric initial state. function called epoch training validation loops. update: updates metric internal state. function called every training validation step predictions obtained model target values obtained dataloader. compute: uses internal state compute metric values. function called whenever need obtain current metric value. Eg, ’s called every training step metrics displayed progress bar, called per epoch record ’s value progress bar displayed. Optionally, can implement abbrev field gives metric abbreviation used displaying metric information console tracking record. abbrev passed, class name used. Let’s take look implementation luz_metric_accuracy can see implement new one:   Note: ’s good practice compute metric returns regular R values instead torch tensors parts luz expect .","code":"luz_metric_accuracy <- luz_metric(   # An abbreviation to be shown in progress bars, or    # when printing progress   abbrev = \"Acc\",    # Initial setup for the metric. Metrics are initialized   # every epoch, for both training and validation   initialize = function() {     self$correct <- 0     self$total <- 0   },   # Run at every training or validation step and updates   # the internal state. The update function takes `preds`   # and `target` as parameters.   
update = function(preds, target) {     pred <- torch::torch_argmax(preds, dim = 2)     self$correct <- self$correct + (pred == target)$       to(dtype = torch::torch_float())$       sum()$       item()     self$total <- self$total + pred$numel()   },   # Use the internal state to query the metric value   compute = function() {     self$correct/self$total   } )"},{"path":[]},{"path":"/reference/luz_metric.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Creates a new luz metric — luz_metric","text":"","code":"luz_metric_accuracy <- luz_metric(   # An abbreviation to be shown in progress bars, or   # when printing progress   abbrev = \"Acc\",   # Initial setup for the metric. Metrics are initialized   # every epoch, for both training and validation   initialize = function() {     self$correct <- 0     self$total <- 0   },   # Run at every training or validation step and updates   # the internal state. The update function takes `preds`   # and `target` as parameters.   
update = function(preds, target) {     pred <- torch::torch_argmax(preds, dim = 2)     self$correct <- self$correct + (pred == target)$       to(dtype = torch::torch_float())$       sum()$       item()     self$total <- self$total + pred$numel()   },   # Use the internal state to query the metric value   compute = function() {     self$correct/self$total   } )"},{"path":"/reference/luz_metric_accuracy.html","id":null,"dir":"Reference","previous_headings":"","what":"Accuracy — luz_metric_accuracy","title":"Accuracy — luz_metric_accuracy","text":"Computes accuracy multi-class classification problems.","code":""},{"path":"/reference/luz_metric_accuracy.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Accuracy — luz_metric_accuracy","text":"","code":"luz_metric_accuracy()"},{"path":"/reference/luz_metric_accuracy.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Accuracy — luz_metric_accuracy","text":"Returns new luz metric.","code":""},{"path":"/reference/luz_metric_accuracy.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Accuracy — luz_metric_accuracy","text":"metric expects take logits probabilities every update. 
take columnwise argmax compare target.","code":""},{"path":[]},{"path":"/reference/luz_metric_accuracy.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Accuracy — luz_metric_accuracy","text":"","code":"if (torch::torch_is_installed()) { library(torch) metric <- luz_metric_accuracy() metric <- metric$new() metric$update(torch_randn(100, 10), torch::torch_randint(1, 10, size = 100)) metric$compute() } #> [1] 0.07"},{"path":"/reference/luz_metric_binary_accuracy.html","id":null,"dir":"Reference","previous_headings":"","what":"Binary accuracy — luz_metric_binary_accuracy","title":"Binary accuracy — luz_metric_binary_accuracy","text":"Computes accuracy binary classification problems model returns probabilities. Commonly used loss torch::nn_bce_loss().","code":""},{"path":"/reference/luz_metric_binary_accuracy.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Binary accuracy — luz_metric_binary_accuracy","text":"","code":"luz_metric_binary_accuracy(threshold = 0.5)"},{"path":"/reference/luz_metric_binary_accuracy.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Binary accuracy — luz_metric_binary_accuracy","text":"threshold value used classify observations 0 1.","code":""},{"path":"/reference/luz_metric_binary_accuracy.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Binary accuracy — luz_metric_binary_accuracy","text":"Returns new luz metric.","code":""},{"path":[]},{"path":"/reference/luz_metric_binary_accuracy.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Binary accuracy — luz_metric_binary_accuracy","text":"","code":"if (torch::torch_is_installed()) { library(torch) metric <- luz_metric_binary_accuracy(threshold = 0.5) metric <- metric$new() metric$update(torch_rand(100), torch::torch_randint(0, 1, size = 100)) metric$compute() } #> [1] 
0.51"},{"path":"/reference/luz_metric_binary_accuracy_with_logits.html","id":null,"dir":"Reference","previous_headings":"","what":"Binary accuracy with logits — luz_metric_binary_accuracy_with_logits","title":"Binary accuracy with logits — luz_metric_binary_accuracy_with_logits","text":"Computes accuracy binary classification problems model return logits. Commonly used together torch::nn_bce_with_logits_loss().","code":""},{"path":"/reference/luz_metric_binary_accuracy_with_logits.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Binary accuracy with logits — luz_metric_binary_accuracy_with_logits","text":"","code":"luz_metric_binary_accuracy_with_logits(threshold = 0.5)"},{"path":"/reference/luz_metric_binary_accuracy_with_logits.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Binary accuracy with logits — luz_metric_binary_accuracy_with_logits","text":"threshold value used classify observations 0 1.","code":""},{"path":"/reference/luz_metric_binary_accuracy_with_logits.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Binary accuracy with logits — luz_metric_binary_accuracy_with_logits","text":"Returns new luz metric.","code":""},{"path":"/reference/luz_metric_binary_accuracy_with_logits.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Binary accuracy with logits — luz_metric_binary_accuracy_with_logits","text":"Probabilities generated using torch::nnf_sigmoid() threshold used classify 0 1.","code":""},{"path":[]},{"path":"/reference/luz_metric_binary_accuracy_with_logits.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Binary accuracy with logits — luz_metric_binary_accuracy_with_logits","text":"","code":"if (torch::torch_is_installed()) { library(torch) metric <- luz_metric_binary_accuracy_with_logits(threshold = 0.5) metric <- metric$new() 
metric$update(torch_randn(100), torch::torch_randint(0, 1, size = 100)) metric$compute() } #> [1] 0.5"},{"path":"/reference/luz_metric_binary_auroc.html","id":null,"dir":"Reference","previous_headings":"","what":"Computes the area under the ROC — luz_metric_binary_auroc","title":"Computes the area under the ROC — luz_metric_binary_auroc","text":"avoid storing predictions targets epoch compute confusion matrices across range pre-established thresholds.","code":""},{"path":"/reference/luz_metric_binary_auroc.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Computes the area under the ROC — luz_metric_binary_auroc","text":"","code":"luz_metric_binary_auroc(   num_thresholds = 200,   thresholds = NULL,   from_logits = FALSE )"},{"path":"/reference/luz_metric_binary_auroc.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Computes the area under the ROC — luz_metric_binary_auroc","text":"num_thresholds Number thresholds used compute confusion matrices. case, thresholds created getting num_thresholds values linearly spaced unit interval. thresholds (optional) threshold passed, used compute confusion matrices num_thresholds ignored. 
from_logits Boolean indicating predictions logits, case use sigmoid put unit interval.","code":""},{"path":[]},{"path":"/reference/luz_metric_binary_auroc.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Computes the area under the ROC — luz_metric_binary_auroc","text":"","code":"if (torch::torch_is_installed()){ library(torch) actual <- c(1, 1, 1, 0, 0, 0) predicted <- c(0.9, 0.8, 0.4, 0.5, 0.3, 0.2)  y_true <- torch_tensor(actual) y_pred <- torch_tensor(predicted)  m <- luz_metric_binary_auroc(thresholds = predicted) m <- m$new()  m$update(y_pred[1:2], y_true[1:2]) m$update(y_pred[3:4], y_true[3:4]) m$update(y_pred[5:6], y_true[5:6])  m$compute() } #> [1] 0.8888889"},{"path":"/reference/luz_metric_mae.html","id":null,"dir":"Reference","previous_headings":"","what":"Mean absolute error — luz_metric_mae","title":"Mean absolute error — luz_metric_mae","text":"Computes mean absolute error.","code":""},{"path":"/reference/luz_metric_mae.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Mean absolute error — luz_metric_mae","text":"","code":"luz_metric_mae()"},{"path":"/reference/luz_metric_mae.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Mean absolute error — luz_metric_mae","text":"Returns new luz metric.","code":""},{"path":[]},{"path":"/reference/luz_metric_mae.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Mean absolute error — luz_metric_mae","text":"","code":"if (torch::torch_is_installed()) { library(torch) metric <- luz_metric_mae() metric <- metric$new() metric$update(torch_randn(100), torch_randn(100)) metric$compute() } #> [1] 1.080743"},{"path":"/reference/luz_metric_mse.html","id":null,"dir":"Reference","previous_headings":"","what":"Mean squared error — luz_metric_mse","title":"Mean squared error — luz_metric_mse","text":"Computes mean squared 
error","code":""},{"path":"/reference/luz_metric_mse.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Mean squared error — luz_metric_mse","text":"","code":"luz_metric_mse()"},{"path":"/reference/luz_metric_mse.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Mean squared error — luz_metric_mse","text":"luz_metric object.","code":""},{"path":[]},{"path":"/reference/luz_metric_multiclass_auroc.html","id":null,"dir":"Reference","previous_headings":"","what":"Computes the multi-class AUROC — luz_metric_multiclass_auroc","title":"Computes the multi-class AUROC — luz_metric_multiclass_auroc","text":"definition Keras used default. equivalent 'micro' method SciKit Learn . See docs.","code":""},{"path":"/reference/luz_metric_multiclass_auroc.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Computes the multi-class AUROC — luz_metric_multiclass_auroc","text":"","code":"luz_metric_multiclass_auroc(   num_thresholds = 200,   thresholds = NULL,   from_logits = FALSE,   average = c(\"micro\", \"macro\", \"weighted\", \"none\") )"},{"path":"/reference/luz_metric_multiclass_auroc.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Computes the multi-class AUROC — luz_metric_multiclass_auroc","text":"num_thresholds Number thresholds used compute confusion matrices. case, thresholds created getting num_thresholds values linearly spaced unit interval. thresholds (optional) threshold passed, used compute confusion matrices num_thresholds ignored. from_logits TRUE call torch::nnf_softmax() predictions computing metric. average averaging method: 'micro': Stack classes computes AUROC binary classification problem. 'macro': Finds AUROC class computes mean. 'weighted': Finds AUROC class computes weighted mean pondering number instances class. 
'none': Returns AUROC class list.","code":""},{"path":"/reference/luz_metric_multiclass_auroc.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Computes the multi-class AUROC — luz_metric_multiclass_auroc","text":"Note class imbalance can affect metric unlike AUC binary classification. Currently AUC approximated using 'interpolation' method described Keras.","code":""},{"path":[]},{"path":"/reference/luz_metric_multiclass_auroc.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Computes the multi-class AUROC — luz_metric_multiclass_auroc","text":"","code":"if (torch::torch_is_installed()) { library(torch) actual <- c(1, 1, 1, 0, 0, 0) + 1L predicted <- c(0.9, 0.8, 0.4, 0.5, 0.3, 0.2) predicted <- cbind(1-predicted, predicted)  y_true <- torch_tensor(as.integer(actual)) y_pred <- torch_tensor(predicted)  m <- luz_metric_multiclass_auroc(thresholds = as.numeric(predicted),                                  average = \"micro\") m <- m$new()  m$update(y_pred[1:2,], y_true[1:2]) m$update(y_pred[3:4,], y_true[3:4]) m$update(y_pred[5:6,], y_true[5:6]) m$compute() } #> [1] 0.9027778"},{"path":"/reference/luz_metric_rmse.html","id":null,"dir":"Reference","previous_headings":"","what":"Root mean squared error — luz_metric_rmse","title":"Root mean squared error — luz_metric_rmse","text":"Computes root mean squared error.","code":""},{"path":"/reference/luz_metric_rmse.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Root mean squared error — luz_metric_rmse","text":"","code":"luz_metric_rmse()"},{"path":"/reference/luz_metric_rmse.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Root mean squared error — luz_metric_rmse","text":"Returns new luz metric.","code":""},{"path":[]},{"path":"/reference/luz_metric_set.html","id":null,"dir":"Reference","previous_headings":"","what":"Creates a metric set — 
luz_metric_set","title":"Creates a metric set — luz_metric_set","text":"metric set can used specify metrics evaluated training, validation .","code":""},{"path":"/reference/luz_metric_set.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Creates a metric set — luz_metric_set","text":"","code":"luz_metric_set(metrics = NULL, train_metrics = NULL, valid_metrics = NULL)"},{"path":"/reference/luz_metric_set.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Creates a metric set — luz_metric_set","text":"metrics list luz_metrics meant used training validation. train_metrics list luz_metrics used training. valid_metrics list luz_metrics sued validation.","code":""},{"path":"/reference/luz_save.html","id":null,"dir":"Reference","previous_headings":"","what":"Saves luz objects to disk — luz_save","title":"Saves luz objects to disk — luz_save","text":"Allows saving luz fitted models disk. Objects can loaded back luz_load().","code":""},{"path":"/reference/luz_save.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Saves luz objects to disk — luz_save","text":"","code":"luz_save(obj, path, ...)"},{"path":"/reference/luz_save.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Saves luz objects to disk — luz_save","text":"obj object class 'luz_module_fitted' returned fit.luz_module_generator(). path path file system save object. ... currently unused.","code":""},{"path":"/reference/luz_save.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"Saves luz objects to disk — luz_save","text":"Objects saved plain .rds files obj$model serialized torch_save saving .","code":""},{"path":"/reference/luz_save.html","id":"warning","dir":"Reference","previous_headings":"","what":"Warning","title":"Saves luz objects to disk — luz_save","text":"ctx naively serialized. Ie, use saveRDS() serialize . 
expect luz_save work correctly unserializable objects ctx like torch_tensors external pointers general.","code":""},{"path":[]},{"path":"/reference/nn_mixup_loss.html","id":null,"dir":"Reference","previous_headings":"","what":"Loss to be used with callbacks_mixup(). — nn_mixup_loss","title":"Loss to be used with callbacks_mixup(). — nn_mixup_loss","text":"training phase, computes individual losses regard two targets, weights item-wise, averages linear combinations yield mean batch loss. validation testing, defers passed-loss.","code":""},{"path":"/reference/nn_mixup_loss.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Loss to be used with callbacks_mixup(). — nn_mixup_loss","text":"","code":"nn_mixup_loss(loss)"},{"path":"/reference/nn_mixup_loss.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Loss to be used with callbacks_mixup(). — nn_mixup_loss","text":"loss underlying loss nn_module call. must support reduction field. training attribute changed 'none' get loss individual observations. See example documentation reduction argument torch::nn_cross_entropy_loss().","code":""},{"path":"/reference/nn_mixup_loss.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Loss to be used with callbacks_mixup(). 
— nn_mixup_loss","text":"used together luz_callback_mixup().","code":""},{"path":[]},{"path":"/reference/nnf_mixup.html","id":null,"dir":"Reference","previous_headings":"","what":"Mixup logic — nnf_mixup","title":"Mixup logic — nnf_mixup","text":"Logic underlying luz_callback_mixup().","code":""},{"path":"/reference/nnf_mixup.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Mixup logic — nnf_mixup","text":"","code":"nnf_mixup(x, y, weight)"},{"path":"/reference/nnf_mixup.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Mixup logic — nnf_mixup","text":"x input batch y target batch weight weighting coefficient used torch_lerp()","code":""},{"path":"/reference/nnf_mixup.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Mixup logic — nnf_mixup","text":"list : x, new, mixed-input batch y, list : ys, list : y1, original target y1 y2, mixed-target y2 weight, mixing weights","code":""},{"path":"/reference/nnf_mixup.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Mixup logic — nnf_mixup","text":"Based passed-input target batches, well applicable mixing weights, return new tensors intended replace current batch. 
new input batch weighted linear combination input batch items, new target batch bundles original targets, well mixing weights, nested list.","code":""},{"path":[]},{"path":"/reference/nnf_mixup.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Mixup logic — nnf_mixup","text":"","code":"if (torch::torch_is_installed()) { batch_x <- torch::torch_randn(c(10, 768)) batch_y <- torch::torch_randn(10) weight <- torch::torch_tensor(rep(0.9, 10))$view(c(10, 1)) nnf_mixup(batch_x, batch_y, weight) } #> $x #> torch_tensor #> Columns 1 to 6 2.1105e-01 -2.5707e-01  5.0293e-01 -2.6365e-01  2.9616e-01  4.3874e-01 #> -5.3747e-01 -5.0843e-01  2.0182e-01  8.5945e-01 -7.9758e-01 -1.6854e-01 #> -1.8622e+00 -7.2628e-01 -1.2179e-01 -5.3673e-01  8.8290e-01 -7.7352e-02 #>  6.0087e-01  7.7442e-01 -1.8468e+00  4.8284e-01  1.4391e+00  2.0366e-01 #> -1.0608e+00 -4.1275e-01 -9.6645e-01 -5.1798e-01  2.5813e-01  1.7352e-01 #> -8.6037e-01  1.4365e-01  6.6950e-01 -4.4121e-01  4.2209e-01 -3.8243e-01 #>  4.5700e-01  7.9541e-01  4.9467e-01  1.3577e+00 -5.6978e-01 -1.1119e+00 #>  7.8966e-01  4.9365e-01  1.0959e+00  6.6656e-01  2.4713e-01  2.4156e-01 #>  6.8629e-01  4.3494e-01  1.5368e+00  4.5424e-01 -3.3821e-01 -6.9955e-01 #> -2.9487e-01 -2.7045e-01 -9.3513e-01 -3.1766e-01  7.1092e-01 -8.8386e-01 #>  #> Columns 7 to 12-5.8458e-01 -7.6261e-01 -9.8281e-01  1.0952e-01 -3.8169e-01  4.4187e-01 #>  1.5732e+00 -4.8694e-01  1.8215e-01  2.4406e-01  4.9622e-01  4.1927e-01 #> -2.1962e-01 -1.5063e-01 -4.3045e-01  6.1290e-01  1.3646e+00 -7.8468e-02 #> -2.7856e-01 -1.4861e+00  5.4135e-01  2.5380e-01 -2.1084e+00 -6.7824e-01 #> -7.2378e-01  6.6451e-01 -5.6135e-02 -2.5516e-02 -1.6625e+00 -1.2545e+00 #>  4.8056e-01  1.1037e+00  1.6371e+00  1.1139e-01 -6.4466e-01 -1.5184e+00 #> -1.2516e+00 -1.3917e-01  1.8302e-01 -7.0514e-01 -2.0332e+00  5.3306e-01 #> -5.2930e-01 -3.2553e-01  7.5119e-01 -8.0412e-01  1.3013e+00  1.3164e+00 #>  7.5627e-01  5.4333e-03 -1.1749e+00  1.0025e+00 
-1.3122e-01  7.3929e-01 #> -3.6906e-01  7.2472e-01  6.5900e-01  1.6670e-01 -1.2228e-02  6.5552e-01 #>  #> Columns 13 to 18 3.4535e-01 -1.2002e+00 -8.3307e-01 -1.6820e+00 -5.6943e-01 -1.2224e+00 #>  1.0747e-01  4.3129e-01  1.0875e+00  4.7297e-01 -5.5352e-01 -6.9736e-01 #> -5.6236e-01  5.3038e-01 -4.8145e-01  9.4094e-01  2.5152e+00 -8.0532e-01 #>  1.2296e+00 -5.9918e-01  9.1384e-01  5.5982e-02  1.0325e+00  9.0756e-01 #> -5.5983e-01  9.8870e-01  2.4292e-01 -2.4190e-01  4.6381e-01  6.5734e-01 #> -5.8572e-01 -5.4169e-01  4.0119e-01  5.8703e-01 -4.3276e-01  9.2243e-01 #> -3.0966e-01  9.1974e-02  1.8338e-01  1.0977e+00  9.2757e-01  1.5192e+00 #> -4.0285e-01 -1.2765e+00  3.3926e-01 -1.7810e-02 -7.1996e-01 -1.3532e+00 #> ... [the output was truncated (use n=-1 to disable)] #> [ CPUFloatType{10,768} ] #>  #> $y #> $y$ys #> $y$ys$y1 #> torch_tensor #> -0.9905 #> -0.3795 #>  1.2743 #> -0.5082 #> -1.0673 #> -0.4616 #>  0.5942 #>  1.0820 #> -0.3193 #>  0.2713 #> [ CPUFloatType{10} ] #>  #> $y$ys$y2 #> torch_tensor #> -0.3193 #> -0.3795 #>  1.0820 #>  0.5942 #> -0.4616 #>  1.2743 #> -0.5082 #>  0.2713 #> -0.9905 #> -1.0673 #> [ CPUFloatType{10} ] #>  #>  #> $y$weight #> torch_tensor #>  0.9000 #>  0.9000 #>  0.9000 #>  0.9000 #>  0.9000 #>  0.9000 #>  0.9000 #>  0.9000 #>  0.9000 #>  0.9000 #> [ CPUFloatType{10,1} ] #>  #>"},{"path":"/reference/pipe.html","id":null,"dir":"Reference","previous_headings":"","what":"Pipe operator — %>%","title":"Pipe operator — %>%","text":"See magrittr::%>% details.","code":""},{"path":"/reference/pipe.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Pipe operator — %>%","text":"","code":"lhs %>% rhs"},{"path":"/reference/predict.luz_module_fitted.html","id":null,"dir":"Reference","previous_headings":"","what":"Create predictions for a fitted model — predict.luz_module_fitted","title":"Create predictions for a fitted model — predict.luz_module_fitted","text":"Create predictions fitted 
model","code":""},{"path":"/reference/predict.luz_module_fitted.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Create predictions for a fitted model — predict.luz_module_fitted","text":"","code":"# S3 method for luz_module_fitted predict(   object,   newdata,   ...,   callbacks = list(),   accelerator = NULL,   verbose = NULL,   dataloader_options = NULL )"},{"path":"/reference/predict.luz_module_fitted.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Create predictions for a fitted model — predict.luz_module_fitted","text":"object (fitted model) fitted model object returned fit.luz_module_generator() newdata (dataloader, dataset, list array) returning list least 1 element. elements used. ... Currently unused. callbacks (list, optional) list callbacks defined luz_callback() called training procedure. callbacks luz_callback_metrics(), luz_callback_progress() luz_callback_train_valid() always added default. accelerator (accelerator, optional) optional accelerator() object used configure device placement components like nn_modules, optimizers batches data. verbose (logical, optional) optional boolean value indicating fitting procedure emit output console training. default, produce output interactive() TRUE, otherwise print console. dataloader_options Options used creating dataloader. See torch::dataloader(). shuffle=TRUE default training data batch_size=32 default. error NULL data already dataloader.","code":""},{"path":[]},{"path":"/reference/reexports.html","id":null,"dir":"Reference","previous_headings":"","what":"Objects exported from other packages — reexports","title":"Objects exported from other packages — reexports","text":"objects imported packages. Follow links see documentation. 
generics fit","code":""},{"path":"/reference/set_hparams.html","id":null,"dir":"Reference","previous_headings":"","what":"Set hyper-parameter of a module — set_hparams","title":"Set hyper-parameter of a module — set_hparams","text":"function used define hyper-parameters calling fit luz_modules.","code":""},{"path":"/reference/set_hparams.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Set hyper-parameter of a module — set_hparams","text":"","code":"set_hparams(module, ...)"},{"path":"/reference/set_hparams.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Set hyper-parameter of a module — set_hparams","text":"module nn_module setup(). ... parameters set used initialize nn_module, ie passed unchanged initialize method base nn_module.","code":""},{"path":"/reference/set_hparams.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Set hyper-parameter of a module — set_hparams","text":"luz module","code":""},{"path":[]},{"path":"/reference/set_opt_hparams.html","id":null,"dir":"Reference","previous_headings":"","what":"Set optimizer hyper-parameters — set_opt_hparams","title":"Set optimizer hyper-parameters — set_opt_hparams","text":"function used define hyper-parameters optimizer initialization method.","code":""},{"path":"/reference/set_opt_hparams.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Set optimizer hyper-parameters — set_opt_hparams","text":"","code":"set_opt_hparams(module, ...)"},{"path":"/reference/set_opt_hparams.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Set optimizer hyper-parameters — set_opt_hparams","text":"module nn_module setup(). ... parameters passed used initialize optimizers. 
example, optimizer optim_adam pass lr=0.1, optim_adam function called optim_adam(parameters, lr=0.1) fitting model.","code":""},{"path":"/reference/set_opt_hparams.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Set optimizer hyper-parameters — set_opt_hparams","text":"luz module","code":""},{"path":[]},{"path":"/reference/setup.html","id":null,"dir":"Reference","previous_headings":"","what":"Set's up a nn_module to use with luz — setup","title":"Set's up a nn_module to use with luz — setup","text":"setup function used set important attributes method nn_modules used luz.","code":""},{"path":"/reference/setup.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Set's up a nn_module to use with luz — setup","text":"","code":"setup(module, loss = NULL, optimizer = NULL, metrics = NULL, backward = NULL)"},{"path":"/reference/setup.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Set's up a nn_module to use with luz — setup","text":"module (nn_module) nn_module want set . loss (function, optional) optional function signature function(input, target). requires nn_module implement method called loss. optimizer (torch_optimizer, optional) function signature function(parameters, ...) used initialize optimizer given model parameters. metrics (list, optional) list metrics tracked training procedure. Sometimes, want metrics evaluated training validation, case can pass luz_metric_set() object specify metrics used stage. backward (function) functions takes loss scalar values parameter. must call $backward() torch::autograd_backward(). general need set parameter unless need customize luz calls backward(), example, need add additional arguments backward call. 
Note becomes method nn_module thus can used custom step() override .","code":""},{"path":"/reference/setup.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Set's up a nn_module to use with luz — setup","text":"luz module can trained fit().","code":""},{"path":"/reference/setup.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Set's up a nn_module to use with luz — setup","text":"makes sure module necessary ingredients order fitted.","code":""},{"path":"/reference/setup.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"Set's up a nn_module to use with luz — setup","text":"also adds device active field can used query current module device within methods, eg self$device. useful ctx() available, eg, calling methods outside luz wrappers. Users can override default implementing device active method input module.","code":""},{"path":[]},{"path":"/news/index.html","id":"luz-development-version","dir":"Changelog","previous_headings":"","what":"luz (development version)","title":"luz (development version)","text":"Added mixed precision callback. (#127) Added support torch iterable datasets. (#135) Fixed bug trying resume models trained learning rate schedulers. (#137)","code":""},{"path":"/news/index.html","id":"luz-040","dir":"Changelog","previous_headings":"","what":"luz 0.4.0","title":"luz 0.4.0","text":"CRAN release: 2023-04-17","code":""},{"path":"/news/index.html","id":"breaking-changes-0-4-0","dir":"Changelog","previous_headings":"","what":"Breaking changes","title":"luz 0.4.0","text":"drop_last=TRUE now default training dataloaders created luz (eg. pass list torch dataset data input) (#117) default profile callback longer tracks intra step timings adds non ignorable overhead. 
(#125)","code":""},{"path":"/news/index.html","id":"new-features-0-4-0","dir":"Changelog","previous_headings":"","what":"New features","title":"luz 0.4.0","text":"Added support arm Mac’s MPS device. (#104) Refactor checkpointing luz - now also serialize optimizer state callbacks state. (#107) Added luz_callback_autoresume() allowing easily resume trainining runs might crashed. (#107) Added th luz_callback_resume_from_checkpoint() allowing one resume training run checkpoint file. (#107) Users can now chose metrics called training validation, training validation. See luz_metric_set() information. (#112) Improved errors raised user code, eg calling metrics callbacks raised. helps lot debuging errors callbacks metrics. (#112) loss_fn now field context, thus callbacks can override needed. (#112) luz_callback_mixup now supports run_valid auto_loss arguments. (#112) ctx now aliases default opt opt_name single optimizer specified (ie. cases) (#114) Added tfevents callback logging loss getting weights histograms. (#118) can now specify metrics evaluated evaluate. (#123)","code":""},{"path":"/news/index.html","id":"bug-fixes-0-4-0","dir":"Changelog","previous_headings":"","what":"Bug fixes","title":"luz 0.4.0","text":"Bug fix: accelerators cpu argument always respected. (#119) Handled rlang ggplot2 deprecations. (#120) Better handling metrics environments. Faster garbage collection dataloaders iterators, use less memory. (#122) Much faster loss averaging every step. Can hight influence training times large number iterations per epoch. 
(#124)","code":""},{"path":"/news/index.html","id":"luz-031","dir":"Changelog","previous_headings":"","what":"luz 0.3.1","title":"luz 0.3.1","text":"CRAN release: 2022-09-06 Re-submission fix vignette rendering.","code":""},{"path":"/news/index.html","id":"luz-030","dir":"Changelog","previous_headings":"","what":"luz 0.3.0","title":"luz 0.3.0","text":"CRAN release: 2022-08-19","code":""},{"path":"/news/index.html","id":"breaking-changes-0-3-0","dir":"Changelog","previous_headings":"","what":"Breaking changes","title":"luz 0.3.0","text":"lr_finder() now default divides range start_lr end_lr log-spaced intervals, following fast.ai implementation. Cf. Sylvain Gugger’s post: https://sgugger.github.io/---find--good-learning-rate.html. previous behavior can achieved passing log_spaced_intervals=FALSE function. (#82, @skeydan) plot.lr_records() now addition plots exponentially weighted moving average loss (, see Sylvain Gugger’s post), weighting coefficient 0.9 (seems reasonable value default setting 100 learning-rate-incrementing intervals). (#82, @skeydan)","code":""},{"path":"/news/index.html","id":"documentation-0-3-0","dir":"Changelog","previous_headings":"","what":"Documentation","title":"luz 0.3.0","text":"Many wording improvements getting started guides (#81 #94, @jonthegeek).","code":""},{"path":"/news/index.html","id":"new-features-0-3-0","dir":"Changelog","previous_headings":"","what":"New features","title":"luz 0.3.0","text":"Added MixUp callback helper loss function functional logic. (#82, @skeydan). Added luz_callback_gradient_clip inspired FastAI’s implementation. (#90) Added backward argument setup allowing one customize backward called loss scalar value. (#93) Added luz_callback_keep_best_model() reload weights best model training finished. 
(#95)","code":""},{"path":"/news/index.html","id":"luz-020","dir":"Changelog","previous_headings":"","what":"luz 0.2.0","title":"luz 0.2.0","text":"CRAN release: 2021-10-07","code":""},{"path":"/news/index.html","id":"new-features-0-2-0","dir":"Changelog","previous_headings":"","what":"New features","title":"luz 0.2.0","text":"Allow users provide minimum maximum number epochs calling fit.luz_module_generator(). Removed ctx$epochs context object replaced ctx$min_epochs ctx$max_epochs (#53, @mattwarkentin). Early stopping now occur minimum number training epochs met (#53, @mattwarkentin). Added cuda_index argument accelerator allow selecting specific GPU multiple present (#58, @cmcmaster1). Implemented lr_finder (#59, @cmcmaster1). now handle different kinds data arguments passed fit using as_dataloader() method (#66). valid_data can now scalar value indicating proportion data used fitting. works data torch dataset list. (#69) can now supply dataloader_options fit pass additional information as_dataloader(). (#71) Implemented evaluate function allowing users get metrics model new dataset. (#73)","code":""},{"path":"/news/index.html","id":"bug-fixes-0-2-0","dir":"Changelog","previous_headings":"","what":"Bug fixes","title":"luz 0.2.0","text":"Fixed bug CSV logger callback saving logs space delimited file (#52, @mattwarkentin). Fixed bug length progress bar validation dataset (#52, @mattwarkentin). Fixed bugs early stopping callback related working properly patience = 1 specified logging callbacks. (#76)","code":""},{"path":"/news/index.html","id":"internal-changes-0-2-0","dir":"Changelog","previous_headings":"","what":"Internal changes","title":"luz 0.2.0","text":"ctx$data now refers current use data instead always refering ctx$train_data. (#54) Refactored ctx object make safer avoid returing output. 
(#73)","code":""},{"path":"/news/index.html","id":"luz-010","dir":"Changelog","previous_headings":"","what":"luz 0.1.0","title":"luz 0.1.0","text":"CRAN release: 2021-06-17 Added NEWS.md file track changes package.","code":""}]
diff --git a/sitemap.xml b/sitemap.xml
index 97a41ba0..488ceb33 100644
--- a/sitemap.xml
+++ b/sitemap.xml
@@ -51,6 +51,9 @@
   <url>
     <loc>/articles/examples/text-classification.html</loc>
   </url>
+  <url>
+    <loc>/articles/examples/text-generation.html</loc>
+  </url>
   <url>
     <loc>/articles/get-started.html</loc>
   </url>