From ac71914e09314efd151a9735aff046f21d0b494f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Saugat=20Pachhai=20=28=E0=A4=B8=E0=A5=8C=E0=A4=97=E0=A4=BE?= =?UTF-8?q?=E0=A4=A4=29?= Date: Mon, 30 Nov 2020 11:09:30 +0545 Subject: [PATCH 01/18] docs: add flag --glob for repro --- content/docs/command-reference/repro.md | 22 +++++++++++++++++++--- 1 file changed, 19 insertions(+), 3 deletions(-) diff --git a/content/docs/command-reference/repro.md b/content/docs/command-reference/repro.md index b889dae7a8..7be211c26a 100644 --- a/content/docs/command-reference/repro.md +++ b/content/docs/command-reference/repro.md @@ -10,12 +10,23 @@ analyzing dependencies and outputs of the target stages. ```usage usage: dvc repro [-h] [-q | -v] [-f] [-s] [-m] [--dry] [-i] [-p] [-P] [-R] [--no-run-cache] [--force-downstream] - [--no-commit] [--downstream] [--pull] + [--no-commit] [--downstream] [--pull] [--glob] [targets [targets ...]] positional arguments: - targets Stage or path to dvc.yaml or .dvc file to reproduce. Using -R, - directories to search for stages can also be given. + targets Stage or path to dvc.yaml or .dvc file to reproduce. + Using -R, directories to search for stages can also + be given. If no targets are provided, it is assumed + to be the dvc.yaml present in the current working + directory. + + A stage from a dvc.yaml in a different directory can + be specified using a path to the dvc.yaml, followed + by a colon `:`, followed by the name of the stage + (example: `../dvc.yaml:prepare`). + + Using --glob, the targets are used as a pattern to + match stages in the specified file. ``` ## Description @@ -157,6 +168,11 @@ up-to-date and only execute the final stage. > Has no effect if combined with `--no-run-cache`. +- `--glob` - allows running the stages from a file that match the wildcard + [pattern](https://docs.python.org/3/library/glob.html) specified in `targets`. + Note that it does not match pattern with the path, only to the stages present + in the specified file. 
+ - `-h`, `--help` - prints the usage/help message, and exit. - `-q`, `--quiet` - do not write anything to standard output. Exit with 0 if all From c78da03b08e73cb921b966450947fb624f8effdf Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Saugat=20Pachhai=20=28=E0=A4=B8=E0=A5=8C=E0=A4=97=E0=A4=BE?= =?UTF-8?q?=E0=A4=A4=29?= Date: Tue, 1 Dec 2020 10:28:08 +0545 Subject: [PATCH 02/18] Move target details to Options --- content/docs/command-reference/repro.md | 37 ++++++++++++++++--------- 1 file changed, 24 insertions(+), 13 deletions(-) diff --git a/content/docs/command-reference/repro.md b/content/docs/command-reference/repro.md index 7be211c26a..e17a2594df 100644 --- a/content/docs/command-reference/repro.md +++ b/content/docs/command-reference/repro.md @@ -11,24 +11,16 @@ analyzing dependencies and outputs of the target stages. usage: dvc repro [-h] [-q | -v] [-f] [-s] [-m] [--dry] [-i] [-p] [-P] [-R] [--no-run-cache] [--force-downstream] [--no-commit] [--downstream] [--pull] [--glob] - [targets [targets ...]] + [targets [ ...]] positional arguments: targets Stage or path to dvc.yaml or .dvc file to reproduce. - Using -R, directories to search for stages can also - be given. If no targets are provided, it is assumed - to be the dvc.yaml present in the current working - directory. - - A stage from a dvc.yaml in a different directory can - be specified using a path to the dvc.yaml, followed - by a colon `:`, followed by the name of the stage - (example: `../dvc.yaml:prepare`). - - Using --glob, the targets are used as a pattern to - match stages in the specified file. + If no targets are provided, it is assumed to be the + dvc.yaml present in the current working directory. ``` +See [\](#options) for more details. + ## Description `dvc repro` provides a way to regenerate data pipeline results, by restoring the @@ -104,6 +96,25 @@ up-to-date and only execute the final stage. ## Options +- `` - Specify the stages to reproduce. 
+ + Target can be a name of the stage in the dvc.yaml file or a path to a .dvc or + a dvc.yaml file. In case of a file, it will reproduce all of the stages + present in that file. + + With `-R`, the target can be a directory to search for stages. + + With `--glob`, the targets are used as a pattern to match stages in the + dvc.yaml file (example: `train-*`). + + A stage from a dvc.yaml in a different directory can be specified using a path + to the dvc.yaml, followed by a colon `:`, followed by the name of the stage + (`models/dvc.yaml:prepare` for example). + + Similarly, to pattern match the stages on a dvc.yaml in a different directory, + the pattern could follow after the path and the colon `:` + (`models/dvc.yaml:train-*` for example). + - `-f`, `--force` - reproduce a pipeline, regenerating its results, even if no changes were found. This executes all of the stages by default, but it can be limited with the `targets` argument, or the `-s`, `-p` options. From a4f840452fe3217fe846071dc12263ba6d54ea36 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Saugat=20Pachhai=20=28=E0=A4=B8=E0=A5=8C=E0=A4=97=E0=A4=BE?= =?UTF-8?q?=E0=A4=A4=29?= Date: Tue, 1 Dec 2020 11:21:53 +0545 Subject: [PATCH 03/18] codify dvc files --- content/docs/command-reference/repro.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/docs/command-reference/repro.md b/content/docs/command-reference/repro.md index e17a2594df..5161b0c63d 100644 --- a/content/docs/command-reference/repro.md +++ b/content/docs/command-reference/repro.md @@ -98,8 +98,8 @@ up-to-date and only execute the final stage. - `` - Specify the stages to reproduce. - Target can be a name of the stage in the dvc.yaml file or a path to a .dvc or - a dvc.yaml file. In case of a file, it will reproduce all of the stages + Target can be a name of the stage in the `dvc.yaml` file or a path to a `.dvc` + or a `dvc.yaml` file. In case of a file, it will reproduce all of the stages present in that file. 
With `-R`, the target can be a directory to search for stages. From f650ea6ebbcf6bcda0794addfeeefa5441ec9fe4 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Sun, 13 Dec 2020 16:27:19 -0600 Subject: [PATCH 04/18] cmd: std. repro targets arg --- content/docs/command-reference/repro.md | 19 +++++++++---------- 1 file changed, 9 insertions(+), 10 deletions(-) diff --git a/content/docs/command-reference/repro.md b/content/docs/command-reference/repro.md index 5161b0c63d..ae21dda1d8 100644 --- a/content/docs/command-reference/repro.md +++ b/content/docs/command-reference/repro.md @@ -14,13 +14,10 @@ usage: dvc repro [-h] [-q | -v] [-f] [-s] [-m] [--dry] [-i] [targets [ ...]] positional arguments: - targets Stage or path to dvc.yaml or .dvc file to reproduce. - If no targets are provided, it is assumed to be the - dvc.yaml present in the current working directory. + targets Limit command scope to these stage names or dvc.yaml + file paths. ``` -See [\](#options) for more details. - ## Description `dvc repro` provides a way to regenerate data pipeline results, by restoring the @@ -51,8 +48,8 @@ commands. [Stage](/doc/command-reference/run) outputs are deleted from the workspace before executing the stage commands that produce them. There are a few ways to restrict what will be regenerated by this command: by -specifying stages as `targets`, or by using the `--single-item`, among other -options. +specifying stages as `targets` (see [Arguments](#arguments)), or by using the +`--single-item`, among other options. > Note that stages without dependencies are considered _always changed_, so > `dvc repro` always executes them. @@ -94,9 +91,9 @@ same time (e.g. in separate terminals). After both finish successfully, you can then run `dvc repro train`: DVC will know that both branches are already up-to-date and only execute the final stage. -## Options +## Arguments -- `` - Specify the stages to reproduce. 
+- `targets` - Stage name(s) to reproduce (`./dvc.yaml` by default) Target can be a name of the stage in the `dvc.yaml` file or a path to a `.dvc` or a `dvc.yaml` file. In case of a file, it will reproduce all of the stages @@ -115,6 +112,8 @@ up-to-date and only execute the final stage. the pattern could follow after the path and the colon `:` (`models/dvc.yaml:train-*` for example). +## Options + - `-f`, `--force` - reproduce a pipeline, regenerating its results, even if no changes were found. This executes all of the stages by default, but it can be limited with the `targets` argument, or the `-s`, `-p` options. @@ -144,7 +143,7 @@ up-to-date and only execute the final stage. The stage is only executed if the user types "y". - `-p`, `--pipeline` - reproduce the entire pipelines that the `targets` belong - to. Use `dvc dag ` to show the parent pipeline of a target. + to. Use `dvc dag` to show the parent pipeline of a target. - `-P`, `--all-pipelines` - reproduce all pipelines for all `dvc.yaml` files present in the DVC project. From 955cc3d0363901dbb52226e48a25411b97e65109 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Sun, 13 Dec 2020 19:19:02 -0600 Subject: [PATCH 05/18] cmd: Explain base of repro targets arg. --- content/docs/command-reference/repro.md | 55 ++++++++++++++++--------- 1 file changed, 35 insertions(+), 20 deletions(-) diff --git a/content/docs/command-reference/repro.md b/content/docs/command-reference/repro.md index ae21dda1d8..bc6c5b8354 100644 --- a/content/docs/command-reference/repro.md +++ b/content/docs/command-reference/repro.md @@ -14,8 +14,8 @@ usage: dvc repro [-h] [-q | -v] [-f] [-s] [-m] [--dry] [-i] [targets [ ...]] positional arguments: - targets Limit command scope to these stage names or dvc.yaml - file paths. + targets Limit command scope to these .dvc or dvc.yaml files, + or stage names. ``` ## Description @@ -48,8 +48,9 @@ commands. 
[Stage](/doc/command-reference/run) outputs are deleted from the workspace before executing the stage commands that produce them. There are a few ways to restrict what will be regenerated by this command: by -specifying stages as `targets` (see [Arguments](#arguments)), or by using the -`--single-item`, among other options. +specifying specific reproduction `targets` (see [Arguments](#arguments) for +details), or by using certain command [options](#options), such as +`--single-item`. > Note that stages without dependencies are considered _always changed_, so > `dvc repro` always executes them. @@ -93,24 +94,38 @@ up-to-date and only execute the final stage. ## Arguments -- `targets` - Stage name(s) to reproduce (`./dvc.yaml` by default) +### `targets` (optional) - Target can be a name of the stage in the `dvc.yaml` file or a path to a `.dvc` - or a `dvc.yaml` file. In case of a file, it will reproduce all of the stages - present in that file. +> The default target is `./dvc.yaml` (if this argument is not provided). - With `-R`, the target can be a directory to search for stages. +Accepts one or more file or directory paths (to `.dvc` or `dvc.yaml` files), or +stage name(s) (`./dvc.yaml` by default). DVC will reproduce the corresponding +parts of the project, as detailed below. - With `--glob`, the targets are used as a pattern to match stages in the - dvc.yaml file (example: `train-*`). +- For **`dvc.yaml` files**, their [pipeline(s)](/doc/command-reference/dag) are + checked for changes, and reproduced as needed (explained in the command + [description](#description) above). E.g. `dvc repro pipes/linear/dvc.yaml` - A stage from a dvc.yaml in a different directory can be specified using a path - to the dvc.yaml, followed by a colon `:`, followed by the name of the stage - (`models/dvc.yaml:prepare` for example). +- Directory paths can be provided if the `-R` option is included. `dvc.yaml` + files are searched for (recursively) in the given dirs. E.g. 
+ `dvc repro -R subdir/`. - Similarly, to pattern match the stages on a dvc.yaml in a different directory, - the pattern could follow after the path and the colon `:` - (`models/dvc.yaml:train-*` for example). +- For most **`.dvc` files**, `dvc repro` re-adds changed files or directories + (same as `dvc add`). [Import](/doc/command-reference/repro) `.dvc` files are + ignored (use `dvc update` instead). + +- **Stage names** must be defined in `./dvc.yaml` + +With `--glob`, the targets are used as a pattern to match stages in the dvc.yaml +file (example: `train-*`). + +A stage from a dvc.yaml in a different directory can be specified using a path +to the dvc.yaml, followed by a colon `:`, followed by the name of the stage +(`models/dvc.yaml:prepare` for example). + +Similarly, to pattern match the stages on a dvc.yaml in a different directory, +the pattern could follow after the path and the colon `:` +(`models/dvc.yaml:train-*` for example). ## Options @@ -122,9 +137,9 @@ up-to-date and only execute the final stage. recursive search for changed dependencies. Multiple stages are executed (non-recursively) if multiple stage names are given as `targets`. -- `-R`, `--recursive` - determines the stages to reproduce by searching each - target directory and its subdirectories for stages (in `dvc.yaml`) to inspect. - If there are no directories among the targets, this option is ignored. +- `-R`, `--recursive` - looks for `dvc.yaml` files to reproduce in all the + target directories and their subdirectories. If there are no directories among + the targets, this option has no effect. - `--no-commit` - do not save outputs to cache. A DVC-file is created, while nothing is added to the cache. 
(`dvc status` will report that the file is From 25959a0555699d41214391373573ca4ac19a9b68 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Mon, 14 Dec 2020 14:48:14 -0600 Subject: [PATCH 06/18] cmd: finish repro targets arg and reorder options --- content/docs/command-reference/repro.md | 47 ++++++++++++------------- 1 file changed, 23 insertions(+), 24 deletions(-) diff --git a/content/docs/command-reference/repro.md b/content/docs/command-reference/repro.md index bc6c5b8354..a1eaf19c92 100644 --- a/content/docs/command-reference/repro.md +++ b/content/docs/command-reference/repro.md @@ -114,18 +114,17 @@ parts of the project, as detailed below. (same as `dvc add`). [Import](/doc/command-reference/repro) `.dvc` files are ignored (use `dvc update` instead). -- **Stage names** must be defined in `./dvc.yaml` +- **Stage names** must be defined in `./dvc.yaml`. E.g. `dvc repro train-vision` -With `--glob`, the targets are used as a pattern to match stages in the dvc.yaml -file (example: `train-*`). + Stages in other `dvc.yaml` files can be given using by using a colon `:` + following the path to that file. E.g. `models/dvc.yaml:prepare` -A stage from a dvc.yaml in a different directory can be specified using a path -to the dvc.yaml, followed by a colon `:`, followed by the name of the stage -(`models/dvc.yaml:prepare` for example). +For even more flexibility, use `--glob` so that the `targets` are interpreted as +patterns to match stages in `./dvc.yaml`. E.g. `subdir/**/?.yaml` (certain file +paths), `train-*` (stage names) or `models/dvc.yaml:train-*` (stages in specific +`dvc.yaml` file) -Similarly, to pattern match the stages on a dvc.yaml in a different directory, -the pattern could follow after the path and the colon `:` -(`models/dvc.yaml:train-*` for example). +> You may find all the option details in the next section. ## Options @@ -137,16 +136,27 @@ the pattern could follow after the path and the colon `:` recursive search for changed dependencies. 
Multiple stages are executed (non-recursively) if multiple stage names are given as `targets`. -- `-R`, `--recursive` - looks for `dvc.yaml` files to reproduce in all the - target directories and their subdirectories. If there are no directories among - the targets, this option has no effect. - - `--no-commit` - do not save outputs to cache. A DVC-file is created, while nothing is added to the cache. (`dvc status` will report that the file is `not in cache`.) Use `dvc commit` when ready to commit outputs with DVC. Useful to avoid caching unnecessary data repeatedly when running multiple experiments. +- `-p`, `--pipeline` - reproduce the entire pipelines that the `targets` belong + to. Use `dvc dag` to show the parent pipeline of a target. + +- `-P`, `--all-pipelines` - reproduce all pipelines for all `dvc.yaml` files + present in the DVC project. + +- `-R`, `--recursive` - looks for `dvc.yaml` files to reproduce in all the + target directories and their subdirectories. If there are no directories among + the targets, this option has no effect. + +- `--glob` - allows running the stages from a file that match the wildcard + [pattern](https://docs.python.org/3/library/glob.html) specified in `targets`. + Note that it does not match pattern with the path, only to the stages present + in the specified file. + - `-m`, `--metrics` - show metrics after reproduction. The target pipelines must have at least one metrics file defined either with the `dvc metrics` command, or by the `-M` or `-m` options of the `dvc run` command. @@ -157,12 +167,6 @@ the pattern could follow after the path and the colon `:` - `-i`, `--interactive` - ask for confirmation before reproducing each stage. The stage is only executed if the user types "y". -- `-p`, `--pipeline` - reproduce the entire pipelines that the `targets` belong - to. Use `dvc dag` to show the parent pipeline of a target. 
- -- `-P`, `--all-pipelines` - reproduce all pipelines for all `dvc.yaml` files - present in the DVC project. - - `--no-run-cache` - execute stage commands even if they have already been run with the same dependencies/outputs/etc. before. @@ -193,11 +197,6 @@ the pattern could follow after the path and the colon `:` > Has no effect if combined with `--no-run-cache`. -- `--glob` - allows running the stages from a file that match the wildcard - [pattern](https://docs.python.org/3/library/glob.html) specified in `targets`. - Note that it does not match pattern with the path, only to the stages present - in the specified file. - - `-h`, `--help` - prints the usage/help message, and exit. - `-q`, `--quiet` - do not write anything to standard output. Exit with 0 if all From 3c099b92f0e9966bf64a198ec8bbfd1d9be2b9c4 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Mon, 14 Dec 2020 19:59:14 -0600 Subject: [PATCH 07/18] cmd: polish up repro targets arg changes --- content/docs/command-reference/repro.md | 63 ++++++++++++------------- 1 file changed, 29 insertions(+), 34 deletions(-) diff --git a/content/docs/command-reference/repro.md b/content/docs/command-reference/repro.md index a1eaf19c92..9f10f88681 100644 --- a/content/docs/command-reference/repro.md +++ b/content/docs/command-reference/repro.md @@ -1,9 +1,9 @@ # repro -Reproduce complete or partial [pipelines](/doc/command-reference/dag) by -executing commands defined in their [stages](/doc/command-reference/run) in the -correct order. The commands to be executed are determined by recursively -analyzing dependencies and outputs of the target stages. +Reproduce complete or partial pipelines by executing commands +defined in their [stages](/doc/command-reference/run) in the correct order. The +commands to be executed are determined by recursively analyzing dependencies and +outputs of the target stages. ## Synopsis @@ -18,18 +18,17 @@ positional arguments: or stage names. 
``` +> See [`targets`](#targets-argument-optional) for more details. + ## Description -`dvc repro` provides a way to regenerate data pipeline results, by restoring the -dependency graph (a [DAG](https://en.wikipedia.org/wiki/Directed_acyclic_graph)) -implicitly defined by the stages listed in `dvc.yaml`. The commands defined in -these stages can then be executed in the correct order, reproducing pipeline -results. +Provides a way to regenerate data pipeline results, by restoring the dependency +graph (a [DAG](https://en.wikipedia.org/wiki/Directed_acyclic_graph)) implicitly +defined by the stages listed in `dvc.yaml`. The commands defined in these stages +are then be executed in the correct order. -> Pipeline stages are defined in a -> [`dvc.yaml` file](/doc/user-guide/dvc-files-and-directories#dvcyaml-file) -> (either manually or by using `dvc run`) while initial data dependencies can be -> registered with `dvc add`. +> Pipeline stages are defined in `dvc.yaml` files (either manually or by using +> `dvc run`) while initial data dependencies can be registered with `dvc add`. This command is similar to [Make](https://www.gnu.org/software/make/) in software build automation, but DVC captures build requirements @@ -39,25 +38,22 @@ and caches the pipeline's outputs along the way. 💡 For convenience, a Git hook is available to remind you to `dvc repro` when needed after a `git commit`. See `dvc install` for more details. +[Stage](/doc/command-reference/run) outputs are deleted from the +workspace before executing the stage commands that produce them. `dvc repro` does not run `dvc fetch`, `dvc pull` or `dvc checkout` to get data files, intermediate or final results (except if the `--pull` option is used). -By default, this command checks all [pipeline](/doc/command-reference/dag) -stages to determine which ones have changed. Then it executes the corresponding -commands. 
[Stage](/doc/command-reference/run) outputs are deleted from the -workspace before executing the stage commands that produce them. - There are a few ways to restrict what will be regenerated by this command: by -specifying specific reproduction `targets` (see [Arguments](#arguments) for -details), or by using certain command [options](#options), such as -`--single-item`. +specifying specific reproduction [`targets`](#targets-argument-optional), or by +using certain command [options](#options), such as `--single-item`. > Note that stages without dependencies are considered _always changed_, so > `dvc repro` always executes them. -It saves all the data files, intermediate or final results into the DVC -cache (unless the `--no-commit` option is used), and updates the hash -values of changed dependencies and outputs in the `dvc.lock` and `.dvc` files. +`dvc repro` saves all the data files, intermediate or final results into the +DVC cache (unless the `--no-commit` option is used), and updates +the hash values of changed dependencies and outputs in the `dvc.lock` and `.dvc` +files. ### Parallel stage execution @@ -92,33 +88,32 @@ same time (e.g. in separate terminals). After both finish successfully, you can then run `dvc repro train`: DVC will know that both branches are already up-to-date and only execute the final stage. -## Arguments - -### `targets` (optional) +### `targets` argument (optional) > The default target is `./dvc.yaml` (if this argument is not provided). Accepts one or more file or directory paths (to `.dvc` or `dvc.yaml` files), or -stage name(s) (`./dvc.yaml` by default). DVC will reproduce the corresponding -parts of the project, as detailed below. +stage name(s) (`./dvc.yaml` by default). DVC will reproduce them as detailed +below. - For **`dvc.yaml` files**, their [pipeline(s)](/doc/command-reference/dag) are checked for changes, and reproduced as needed (explained in the command [description](#description) above). E.g. 
`dvc repro pipes/linear/dvc.yaml` -- Directory paths can be provided if the `-R` option is included. `dvc.yaml` +- **Directory paths** can be provided if the `-R` option is included. `dvc.yaml` files are searched for (recursively) in the given dirs. E.g. `dvc repro -R subdir/`. -- For most **`.dvc` files**, `dvc repro` re-adds changed files or directories - (same as `dvc add`). [Import](/doc/command-reference/repro) `.dvc` files are - ignored (use `dvc update` instead). - - **Stage names** must be defined in `./dvc.yaml`. E.g. `dvc repro train-vision` Stages in other `dvc.yaml` files can be given using by using a colon `:` following the path to that file. E.g. `models/dvc.yaml:prepare` +- Files and directories tracked by **`.dvc` files** given as `targets` are + updated (same as `dvc add`). E.g. `dvc repro data.dvc` + + > Note that [frozen](/doc/command-reference/freeze) `.dvc` files are ignored. + For even more flexibility, use `--glob` so that the `targets` are interpreted as patterns to match stages in `./dvc.yaml`. E.g. `subdir/**/?.yaml` (certain file paths), `train-*` (stage names) or `models/dvc.yaml:train-*` (stages in specific From 925586660395c92849bb339796ace4061e7a1d70 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Mon, 14 Dec 2020 20:09:32 -0600 Subject: [PATCH 08/18] cmd: roll back changes possibly unrelated to #1983 per https://github.com/iterative/dvc.org/pull/1983#pullrequestreview-552050276 --- content/docs/command-reference/repro.md | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/content/docs/command-reference/repro.md b/content/docs/command-reference/repro.md index 08742e47f2..f891a4dfa3 100644 --- a/content/docs/command-reference/repro.md +++ b/content/docs/command-reference/repro.md @@ -1,9 +1,9 @@ # repro -Reproduce complete or partial pipelines by executing commands -defined in their [stages](/doc/command-reference/run) in the correct order. 
The -commands to be executed are determined by recursively analyzing dependencies and -outputs of the target stages. +Reproduce complete or partial [pipelines](/doc/command-reference/dag) by +executing commands defined in their [stages](/doc/command-reference/run) in the +correct order. The commands to be executed are determined by recursively +analyzing dependencies and outputs of the target stages. ## Synopsis @@ -22,16 +22,16 @@ positional arguments: ## Description -Provides a way to regenerate data pipeline results, by restoring the dependency -graph (a [DAG](https://en.wikipedia.org/wiki/Directed_acyclic_graph)) implicitly -defined by the stages listed in `dvc.yaml`. The commands defined in these stages -are then be executed in the correct order. +`dvc repro` provides a way to regenerate data pipeline results, by restoring the +dependency graph (a [DAG](https://en.wikipedia.org/wiki/Directed_acyclic_graph)) +implicitly defined by the stages listed in `dvc.yaml`. The commands defined in +these stages are then be executed in the correct order. For stages with multiple commands (having a list in the `cmd` field), commands are run one after the other in the order they are defined. The failure of any command will halt the remaining stage execution, and raises an error. -> Pipeline stages are defined in `dvc.yaml` files (either manually or by using +> Pipeline stages are defined in a `dvc.yaml` file (either manually or by using > `dvc run`) while initial data dependencies can be registered with `dvc add`. 
This command is similar to [Make](https://www.gnu.org/software/make/) in From 8cd83c3f33a2c956292dd7e5dd8c207188778492 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Mon, 14 Dec 2020 20:19:10 -0600 Subject: [PATCH 09/18] cmd: put targets arg in repro options section per https://github.com/iterative/dvc.org/pull/1983#pullrequestreview-552051653 --- content/docs/command-reference/repro.md | 54 +++++++++++++------------ 1 file changed, 28 insertions(+), 26 deletions(-) diff --git a/content/docs/command-reference/repro.md b/content/docs/command-reference/repro.md index f891a4dfa3..3befc40983 100644 --- a/content/docs/command-reference/repro.md +++ b/content/docs/command-reference/repro.md @@ -18,7 +18,7 @@ positional arguments: or stage names. ``` -> See [`targets`](#targets-argument-optional) for more details. +> See [`targets`](#options) for more details. ## Description @@ -48,8 +48,8 @@ needed after a `git commit`. See `dvc install` for more details. files, intermediate or final results (except if the `--pull` option is used). There are a few ways to restrict what will be regenerated by this command: by -specifying specific reproduction [`targets`](#targets-argument-optional), or by -using certain command [options](#options), such as `--single-item`. +specifying specific reproduction [`targets`](#options), or by using certain +command [options](#options), such as `--single-item`. > Note that stages without dependencies are considered _always changed_, so > `dvc repro` always executes them. @@ -91,40 +91,42 @@ same time (e.g. in separate terminals). After both finish successfully, you can then run `dvc repro train`: DVC will know that both branches are already up-to-date and only execute the final stage. -### `targets` argument (optional) +## Options -> The default target is `./dvc.yaml` (if this argument is not provided). 
+- `targets` (optional argument) -Accepts one or more file or directory paths (to `.dvc` or `dvc.yaml` files), or -stage name(s) (`./dvc.yaml` by default). DVC will reproduce them as detailed -below. + > The default target is `./dvc.yaml` (if this argument is not provided). -- For **`dvc.yaml` files**, their [pipeline(s)](/doc/command-reference/dag) are - checked for changes, and reproduced as needed (explained in the command - [description](#description) above). E.g. `dvc repro pipes/linear/dvc.yaml` + Accepts one or more file or directory paths (to `.dvc` or `dvc.yaml` files), + or stage name(s) (`./dvc.yaml` by default). DVC will reproduce them as + detailed below. -- **Directory paths** can be provided if the `-R` option is included. `dvc.yaml` - files are searched for (recursively) in the given dirs. E.g. - `dvc repro -R subdir/`. + - For **`dvc.yaml` files**, their [pipeline(s)](/doc/command-reference/dag) + are checked for changes, and reproduced as needed (explained in the command + [description](#description) above). E.g. `dvc repro pipes/linear/dvc.yaml` -- **Stage names** must be defined in `./dvc.yaml`. E.g. `dvc repro train-vision` + - **Directory paths** can be provided if the `-R` option is included. + `dvc.yaml` files are searched for (recursively) in the given dirs. E.g. + `dvc repro -R subdir/`. - Stages in other `dvc.yaml` files can be given using by using a colon `:` - following the path to that file. E.g. `models/dvc.yaml:prepare` + - **Stage names** must be defined in `./dvc.yaml`. E.g. + `dvc repro train-vision` -- Files and directories tracked by **`.dvc` files** given as `targets` are - updated (same as `dvc add`). E.g. `dvc repro data.dvc` + Stages in other `dvc.yaml` files can be given using by using a colon `:` + following the path to that file. E.g. `models/dvc.yaml:prepare` - > Note that [frozen](/doc/command-reference/freeze) `.dvc` files are ignored. 
+ - Files and directories tracked by **`.dvc` files** given as `targets` are + updated (same as `dvc add`). E.g. `dvc repro data.dvc` -For even more flexibility, use `--glob` so that the `targets` are interpreted as -patterns to match stages in `./dvc.yaml`. E.g. `subdir/**/?.yaml` (certain file -paths), `train-*` (stage names) or `models/dvc.yaml:train-*` (stages in specific -`dvc.yaml` file) + > Note that [frozen](/doc/command-reference/freeze) `.dvc` files are + > ignored. -> You may find all the option details in the next section. + For even more flexibility, use `--glob` so that the `targets` are interpreted + as patterns to match stages in `./dvc.yaml`. E.g. `subdir/**/?.yaml` (certain + file paths), `train-*` (stage names) or `models/dvc.yaml:train-*` (stages in + specific `dvc.yaml` file) -## Options + > You may find all the option details in the next section. - `-f`, `--force` - reproduce a pipeline, regenerating its results, even if no changes were found. This executes all of the stages by default, but it can be From fbe65abbe4e3e0711f2ae92c99e08ec3f398eb97 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Mon, 14 Dec 2020 20:31:03 -0600 Subject: [PATCH 10/18] cmd: simplify repro targets arg and --glob option per https://github.com/iterative/dvc.org/pull/1983#pullrequestreview-552051653 --- content/docs/command-reference/repro.md | 43 ++++++++----------------- 1 file changed, 14 insertions(+), 29 deletions(-) diff --git a/content/docs/command-reference/repro.md b/content/docs/command-reference/repro.md index 3befc40983..2357caef30 100644 --- a/content/docs/command-reference/repro.md +++ b/content/docs/command-reference/repro.md @@ -93,27 +93,18 @@ up-to-date and only execute the final stage. ## Options -- `targets` (optional argument) - - > The default target is `./dvc.yaml` (if this argument is not provided). - - Accepts one or more file or directory paths (to `.dvc` or `dvc.yaml` files), - or stage name(s) (`./dvc.yaml` by default). 
DVC will reproduce them as - detailed below. +- `targets` (optional argument) - one or more file or directory paths (to `.dvc` + or `dvc.yaml` files), or stage name(s) (`./dvc.yaml` by default). DVC will + reproduce them as detailed below. - For **`dvc.yaml` files**, their [pipeline(s)](/doc/command-reference/dag) are checked for changes, and reproduced as needed (explained in the command [description](#description) above). E.g. `dvc repro pipes/linear/dvc.yaml` - - **Directory paths** can be provided if the `-R` option is included. - `dvc.yaml` files are searched for (recursively) in the given dirs. E.g. - `dvc repro -R subdir/`. - - **Stage names** must be defined in `./dvc.yaml`. E.g. - `dvc repro train-vision` - - Stages in other `dvc.yaml` files can be given using by using a colon `:` - following the path to that file. E.g. `models/dvc.yaml:prepare` + `dvc repro train-vision`. Stages in other `dvc.yaml` files can be given + using by using a colon `:` following the path to that file. E.g. + `models/dvc.yaml:prepare` - Files and directories tracked by **`.dvc` files** given as `targets` are updated (same as `dvc add`). E.g. `dvc repro data.dvc` @@ -121,12 +112,11 @@ up-to-date and only execute the final stage. > Note that [frozen](/doc/command-reference/freeze) `.dvc` files are > ignored. - For even more flexibility, use `--glob` so that the `targets` are interpreted - as patterns to match stages in `./dvc.yaml`. E.g. `subdir/**/?.yaml` (certain - file paths), `train-*` (stage names) or `models/dvc.yaml:train-*` (stages in - specific `dvc.yaml` file) - - > You may find all the option details in the next section. +- `--glob` - causes the `targets` to be interpreted as wildcard + [patterns](https://docs.python.org/3/library/glob.html) to match for stages. + For example: `train-*` (certain stage names) or `models/dvc.yaml:train-*` + (stages in specific `dvc.yaml` file). 
Note that it does not match patterns + with the path, only to the stages present in the specified file. - `-f`, `--force` - reproduce a pipeline, regenerating its results, even if no changes were found. This executes all of the stages by default, but it can be @@ -136,9 +126,9 @@ up-to-date and only execute the final stage. recursive search for changed dependencies. Multiple stages are executed (non-recursively) if multiple stage names are given as `targets`. -- `-R`, `--recursive` - looks for `dvc.yaml` files to reproduce in all the - target directories and their subdirectories. If there are no directories among - the targets, this option has no effect. +- `-R`, `--recursive` - looks for `dvc.yaml` files to reproduce in any + directories given as `targets`, and in their subdirectories. If there are no + directories among the targets, this option has no effect. - `--no-commit` - do not store the outputs of this execution in the cache (`dvc.yaml` and `dvc.lock` are still created or updated); useful to avoid @@ -151,11 +141,6 @@ up-to-date and only execute the final stage. - `-P`, `--all-pipelines` - reproduce all pipelines for all `dvc.yaml` files present in the DVC project. -- `--glob` - allows running the stages from a file that match the wildcard - [pattern](https://docs.python.org/3/library/glob.html) specified in `targets`. - Note that it does not match pattern with the path, only to the stages present - in the specified file. - - `-m`, `--metrics` - show metrics after reproduction. The target pipelines must have at least one metrics file defined either with the `dvc metrics` command, or by the `-M` or `-m` options of the `dvc run` command. 
From 822576a95d89fce1e0d0c465153538da4a89f993 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Wed, 16 Dec 2020 20:49:55 -0600 Subject: [PATCH 11/18] cmd: roll back correction in repro per https://github.com/iterative/dvc.org/pull/1983#pullrequestreview-552060602 --- content/docs/command-reference/repro.md | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/content/docs/command-reference/repro.md b/content/docs/command-reference/repro.md index 2357caef30..9ffa441978 100644 --- a/content/docs/command-reference/repro.md +++ b/content/docs/command-reference/repro.md @@ -42,11 +42,15 @@ and caches the pipeline's outputs along the way. 💡 For convenience, a Git hook is available to remind you to `dvc repro` when needed after a `git commit`. See `dvc install` for more details. -[Stage](/doc/command-reference/run) outputs are deleted from the -workspace before executing the stage commands that produce them. `dvc repro` does not run `dvc fetch`, `dvc pull` or `dvc checkout` to get data files, intermediate or final results (except if the `--pull` option is used). +By default, this command checks all [pipeline](/doc/command-reference/dag) +stages to determine which ones have changed. Then it executes the corresponding +commands (`cmd` field of `dvc.yaml`). [Stage](/doc/command-reference/run) +outputs are deleted from the workspace before executing the stage +commands that produce them. + There are a few ways to restrict what will be regenerated by this command: by specifying specific reproduction [`targets`](#options), or by using certain command [options](#options), such as `--single-item`. 
From 9be43471808d0491a1a29387d180d2afcce59d60 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Wed, 16 Dec 2020 21:21:00 -0600 Subject: [PATCH 12/18] cmd: revert moving repro -p per https://github.com/iterative/dvc.org/pull/1983#pullrequestreview-552061181 --- content/docs/command-reference/repro.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/content/docs/command-reference/repro.md b/content/docs/command-reference/repro.md index 9ffa441978..64444f7e6f 100644 --- a/content/docs/command-reference/repro.md +++ b/content/docs/command-reference/repro.md @@ -139,9 +139,6 @@ up-to-date and only execute the final stage. caching unnecessary data when exploring different data or stages. Use `dvc commit` to finish the operation. -- `-p`, `--pipeline` - reproduce the entire pipelines that the `targets` belong - to. Use `dvc dag` to show the parent pipeline of a target. - - `-P`, `--all-pipelines` - reproduce all pipelines for all `dvc.yaml` files present in the DVC project. @@ -155,6 +152,9 @@ up-to-date and only execute the final stage. - `-i`, `--interactive` - ask for confirmation before reproducing each stage. The stage is only executed if the user types "y". +- `-p`, `--pipeline` - reproduce the entire pipelines that the `targets` belong + to. Use `dvc dag ` to show the parent pipeline of a target. + - `--no-run-cache` - execute stage commands even if they have already been run with the same dependencies/outputs/etc. before. 
From 1c85228b71289c9f840ea2c97f17848f2c855209 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Wed, 16 Dec 2020 21:25:44 -0600 Subject: [PATCH 13/18] cmd: revert moving text about multiple commands in repro per https://github.com/iterative/dvc.org/pull/1983#pullrequestreview-552060416 --- content/docs/command-reference/repro.md | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/content/docs/command-reference/repro.md b/content/docs/command-reference/repro.md index 64444f7e6f..f92e2af146 100644 --- a/content/docs/command-reference/repro.md +++ b/content/docs/command-reference/repro.md @@ -25,11 +25,8 @@ positional arguments: `dvc repro` provides a way to regenerate data pipeline results, by restoring the dependency graph (a [DAG](https://en.wikipedia.org/wiki/Directed_acyclic_graph)) implicitly defined by the stages listed in `dvc.yaml`. The commands defined in -these stages are then be executed in the correct order. - -For stages with multiple commands (having a list in the `cmd` field), commands -are run one after the other in the order they are defined. The failure of any -command will halt the remaining stage execution, and raises an error. +these stages are then be executed in the correct order, reproducing pipeline +results. > Pipeline stages are defined in a `dvc.yaml` file (either manually or by using > `dvc run`) while initial data dependencies can be registered with `dvc add`. @@ -51,6 +48,10 @@ commands (`cmd` field of `dvc.yaml`). [Stage](/doc/command-reference/run) outputs are deleted from the workspace before executing the stage commands that produce them. +For stages with multiple commands (having a list in the `cmd` field), commands +are run one after the other in the order they are defined. The failure of any +command will halt the remaining stage execution, and raises an error. 
+ There are a few ways to restrict what will be regenerated by this command: by specifying specific reproduction [`targets`](#options), or by using certain command [options](#options), such as `--single-item`. From 474a2a6d8f37e6bc1e8aef6bfc38cf3bf48ef042 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Wed, 16 Dec 2020 21:33:31 -0600 Subject: [PATCH 14/18] cmd: remove info about .dvc files from repro per https://github.com/iterative/dvc.org/pull/1983#pullrequestreview-552061889 --- content/docs/command-reference/repro.md | 12 +++--------- 1 file changed, 3 insertions(+), 9 deletions(-) diff --git a/content/docs/command-reference/repro.md b/content/docs/command-reference/repro.md index f92e2af146..ce12e6c29b 100644 --- a/content/docs/command-reference/repro.md +++ b/content/docs/command-reference/repro.md @@ -61,7 +61,7 @@ command [options](#options), such as `--single-item`. `dvc repro` saves all the data files, intermediate or final results into the cache (unless the `--no-commit` option is used), and updates the -hash values of changed outputs in the `dvc.lock` and `.dvc` files. +hash values of changed outputs in the `dvc.lock` files. ### Parallel stage execution @@ -98,8 +98,8 @@ up-to-date and only execute the final stage. ## Options -- `targets` (optional argument) - one or more file or directory paths (to `.dvc` - or `dvc.yaml` files), or stage name(s) (`./dvc.yaml` by default). DVC will +- `targets` (optional argument) - one or more file or directory paths (to find + `dvc.yaml` files), or stage name(s) (`./dvc.yaml` by default). DVC will reproduce them as detailed below. - For **`dvc.yaml` files**, their [pipeline(s)](/doc/command-reference/dag) @@ -111,12 +111,6 @@ up-to-date and only execute the final stage. using by using a colon `:` following the path to that file. E.g. `models/dvc.yaml:prepare` - - Files and directories tracked by **`.dvc` files** given as `targets` are - updated (same as `dvc add`). E.g. 
`dvc repro data.dvc` - - > Note that [frozen](/doc/command-reference/freeze) `.dvc` files are - > ignored. - - `--glob` - causes the `targets` to be interpreted as wildcard [patterns](https://docs.python.org/3/library/glob.html) to match for stages. For example: `train-*` (certain stage names) or `models/dvc.yaml:train-*` From 9734ec629f68909eff5206c9a05a070511d7f2a4 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Thu, 17 Dec 2020 13:21:15 -0600 Subject: [PATCH 15/18] cmd: simplify repro targets section and reorder related options per https://github.com/iterative/dvc.org/pull/1983#pullrequestreview-552062179 --- content/docs/command-reference/repro.md | 49 ++++++++++++++----------- 1 file changed, 27 insertions(+), 22 deletions(-) diff --git a/content/docs/command-reference/repro.md b/content/docs/command-reference/repro.md index ce12e6c29b..b22c40f818 100644 --- a/content/docs/command-reference/repro.md +++ b/content/docs/command-reference/repro.md @@ -98,36 +98,40 @@ up-to-date and only execute the final stage. ## Options -- `targets` (optional argument) - one or more file or directory paths (to find - `dvc.yaml` files), or stage name(s) (`./dvc.yaml` by default). DVC will - reproduce them as detailed below. +- `targets` (optional argument) - specifies one or more `dvc.yaml` files or + specific stage name(s). `./dvc.yaml` by default. E.g. + `dvc repro pipes/linear/dvc.yaml` - - For **`dvc.yaml` files**, their [pipeline(s)](/doc/command-reference/dag) - are checked for changes, and reproduced as needed (explained in the command - [description](#description) above). E.g. `dvc repro pipes/linear/dvc.yaml` + Stage names must be defined in `./dvc.yaml`. E.g. `dvc repro train-vision`. + Stages in other `dvc.yaml` files can be given using by using a colon `:` + following the path to that file. E.g. `models/dvc.yaml:prepare` - - **Stage names** must be defined in `./dvc.yaml`. E.g. - `dvc repro train-vision`. 
Stages in other `dvc.yaml` files can be given - using by using a colon `:` following the path to that file. E.g. - `models/dvc.yaml:prepare` + Different things can be provided as targets depending on the flags used (more + details in each option), namely: -- `--glob` - causes the `targets` to be interpreted as wildcard - [patterns](https://docs.python.org/3/library/glob.html) to match for stages. - For example: `train-*` (certain stage names) or `models/dvc.yaml:train-*` - (stages in specific `dvc.yaml` file). Note that it does not match patterns - with the path, only to the stages present in the specified file. + - With `-R` you can provide directory paths to search for `dvc.yaml` files in, + recursively. + - With `--glob`, you can use special patterns (using wildcards) to match + groups of stage names. -- `-f`, `--force` - reproduce a pipeline, regenerating its results, even if no - changes were found. This executes all of the stages by default, but it can be - limited with the `targets` argument, or the `-s`, `-p` options. +- `-R`, `--recursive` - looks for `dvc.yaml` files to reproduce in any + directories given as `targets`, and in their subdirectories. If there are no + directories among the targets, this option has no effect. + +- `--glob` - causes the `targets` to be interpreted as wildcard + [patterns](https://docs.python.org/3/library/glob.html) to match for stage + names. For example: `train-*` (certain stage names) or + `models/dvc.yaml:train-*` (stages in specific `dvc.yaml` file). Note that it + does not match patterns with the path, only to the stages present in the + specified file. - `-s`, `--single-item` - reproduce only a single stage by turning off the recursive search for changed dependencies. Multiple stages are executed (non-recursively) if multiple stage names are given as `targets`. -- `-R`, `--recursive` - looks for `dvc.yaml` files to reproduce in any - directories given as `targets`, and in their subdirectories. 
If there are no - directories among the targets, this option has no effect. +- `-f`, `--force` - reproduce a pipeline, regenerating its results, even if no + changes were found. This executes all of the stages by default, but it can be + limited with the `targets` argument, or the `-s`, `-p` options. - `--no-commit` - do not store the outputs of this execution in the cache (`dvc.yaml` and `dvc.lock` are still created or updated); useful to avoid @@ -135,7 +139,8 @@ up-to-date and only execute the final stage. `dvc commit` to finish the operation. - `-P`, `--all-pipelines` - reproduce all pipelines for all `dvc.yaml` files - present in the DVC project. + present in the DVC project. There's no need to specify `targets` when using + this option. - `-m`, `--metrics` - show metrics after reproduction. The target pipelines must have at least one metrics file defined either with the `dvc metrics` command, From 6e872802632bc2c3a9f1dd3317d9c47c3284b2e2 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Thu, 17 Dec 2020 13:25:47 -0600 Subject: [PATCH 16/18] cmd: typo and roll backs in repro --- content/docs/command-reference/repro.md | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/content/docs/command-reference/repro.md b/content/docs/command-reference/repro.md index b22c40f818..78675bccb6 100644 --- a/content/docs/command-reference/repro.md +++ b/content/docs/command-reference/repro.md @@ -25,7 +25,7 @@ positional arguments: `dvc repro` provides a way to regenerate data pipeline results, by restoring the dependency graph (a [DAG](https://en.wikipedia.org/wiki/Directed_acyclic_graph)) implicitly defined by the stages listed in `dvc.yaml`. The commands defined in -these stages are then be executed in the correct order, reproducing pipeline +these stages are then executed in the correct order, reproducing pipeline results. 
> Pipeline stages are defined in a `dvc.yaml` file (either manually or by using @@ -54,14 +54,15 @@ command will halt the remaining stage execution, and raises an error. There are a few ways to restrict what will be regenerated by this command: by specifying specific reproduction [`targets`](#options), or by using certain -command [options](#options), such as `--single-item`. +command [options](#options), such as `--single-item` or `--all-pipelines`. > Note that stages without dependencies are considered _always changed_, so > `dvc repro` always executes them. -`dvc repro` saves all the data files, intermediate or final results into the +It stores all the data files, intermediate or final results in the cache (unless the `--no-commit` option is used), and updates the -hash values of changed outputs in the `dvc.lock` files. +hash values of changed dependencies and outputs in the `dvc.lock` and `.dvc` +files. ### Parallel stage execution @@ -138,10 +139,6 @@ up-to-date and only execute the final stage. caching unnecessary data when exploring different data or stages. Use `dvc commit` to finish the operation. -- `-P`, `--all-pipelines` - reproduce all pipelines for all `dvc.yaml` files - present in the DVC project. There's no need to specify `targets` when using - this option. - - `-m`, `--metrics` - show metrics after reproduction. The target pipelines must have at least one metrics file defined either with the `dvc metrics` command, or by the `-M` or `-m` options of the `dvc run` command. @@ -155,6 +152,10 @@ up-to-date and only execute the final stage. - `-p`, `--pipeline` - reproduce the entire pipelines that the `targets` belong to. Use `dvc dag ` to show the parent pipeline of a target. +- `-P`, `--all-pipelines` - reproduce all pipelines for all `dvc.yaml` files + present in the DVC project. There's no need to specify `targets` when using + this option. 
+ - `--no-run-cache` - execute stage commands even if they have already been run with the same dependencies/outputs/etc. before. From 10aedde71cf6314b612c03978243233e335bb2f8 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Thu, 17 Dec 2020 13:34:13 -0600 Subject: [PATCH 17/18] cmd: correct targets list in repro -P per https://github.com/iterative/dvc.org/pull/1983#pullrequestreview-554918742 --- content/docs/command-reference/repro.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/docs/command-reference/repro.md b/content/docs/command-reference/repro.md index 78675bccb6..f368e6b40e 100644 --- a/content/docs/command-reference/repro.md +++ b/content/docs/command-reference/repro.md @@ -153,8 +153,8 @@ up-to-date and only execute the final stage. to. Use `dvc dag ` to show the parent pipeline of a target. - `-P`, `--all-pipelines` - reproduce all pipelines for all `dvc.yaml` files - present in the DVC project. There's no need to specify `targets` when using - this option. + present in the DVC project. Specifying `targets` has no effect with this + option, as all possible targets are already included. - `--no-run-cache` - execute stage commands even if they have already been run with the same dependencies/outputs/etc. before. From ebb2febc7dfe0491b19ab5c5182c6b2544856c86 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Fri, 18 Dec 2020 19:08:51 -0600 Subject: [PATCH 18/18] cmd: rewrite targets arg as a list of examples per https://github.com/iterative/dvc.org/pull/1983#pullrequestreview-554926395 --- content/docs/command-reference/repro.md | 25 ++++++++++--------------- 1 file changed, 10 insertions(+), 15 deletions(-) diff --git a/content/docs/command-reference/repro.md b/content/docs/command-reference/repro.md index f368e6b40e..38c068f53e 100644 --- a/content/docs/command-reference/repro.md +++ b/content/docs/command-reference/repro.md @@ -99,21 +99,16 @@ up-to-date and only execute the final stage.
## Options -- `targets` (optional argument) - specifies one or more `dvc.yaml` files or - specific stage name(s). `./dvc.yaml` by default. E.g. - `dvc repro pipes/linear/dvc.yaml` - - Stage names must be defined in `./dvc.yaml`. E.g. `dvc repro train-vision`. - Stages in other `dvc.yaml` files can be given using by using a colon `:` - following the path to that file. E.g. `models/dvc.yaml:prepare` - - Different things can be provided as targets depending on the flags used (more - details in each option), namely: - - - With `-R` you can provide directory paths to search for `dvc.yaml` files in, - recursively. - - With `--glob`, you can use special patterns (using wildcards) to match - groups of stage names. +- `targets` (optional command argument) - what to reproduce (the pipeline(s) in + `./dvc.yaml` by default). Different things can be provided as targets + depending on the flags used (more details in each option). Examples: + + - `dvc repro linear/dvc.yaml`: Specific `dvc.yaml` file to reproduce + - `dvc repro -R pipelines/`: Directory path to explore recursively for + `dvc.yaml` files + - `dvc repro train-model`: Specific stage in `./dvc.yaml` + - `dvc repro modeling/dvc.yaml:prepare`: Stage in a specific `dvc.yaml` file + - `dvc repro --glob train-*`: Pattern to match groups of stages - `-R`, `--recursive` - looks for `dvc.yaml` files to reproduce in any directories given as `targets`, and in their subdirectories. If there are no