Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: doc repro --glob and update targets arg info. #1983

Merged
merged 20 commits into from
Dec 19, 2020
Merged

Conversation

skshetry
Copy link
Member

@skshetry skshetry commented Nov 30, 2020

Fixes #1975. The feature was introduced in iterative/dvc#4976 which was released in 1.11.0.

@skshetry skshetry added the A: docs Area: user documentation (gatsby-theme-iterative) label Nov 30, 2020
@skshetry skshetry self-assigned this Nov 30, 2020
@shcheklein shcheklein temporarily deployed to dvc-landing-repro-glob-llcj9qf November 30, 2020 05:29 Inactive
Copy link
Member

@shcheklein shcheklein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great @skshetry ... let's just find a place to put it.

For now section will do, but we can consider doing something standard here ... any good ideas that will make it easier to navigate for users?

@shcheklein shcheklein temporarily deployed to dvc-landing-repro-glob-llcj9qf December 1, 2020 04:43 Inactive
@skshetry skshetry requested a review from shcheklein December 1, 2020 04:48
With `--glob`, the targets are used as a pattern to match stages in the
dvc.yaml file (example: `train-*`).

A stage from a dvc.yaml in a different directory can be specified using a path
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if in this case it's easier to just show 5 or more examples with their meaning (instead of explaining : explicitly)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it might be too much to read if you just need to get it done. But, I think it's good to also layout the spec of it, so I added an example for each scenario.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, spec is good, and I saw your examples (that's why I'm actually suggesting to do emphasis on them) the question is - should the spec be part of the examples explanation? this way most people with get the idea from what they see, almost immediately, and they can still read if needed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds good. Will make changes later in the day. Thanks.

Copy link
Contributor

@jorgeorpinel jorgeorpinel Dec 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

easier to just show 5 or more examples with their meaning (instead of explaining :

it might be too much to read if you just need to get it done

It's good to have the spec thoough. I can probably help summarize this text when it's ready.

  • Consider instead adding an actual Example about wildcards in addition to this (and link to it from here).

@shcheklein

This comment has been minimized.

@shcheklein shcheklein temporarily deployed to dvc-landing-repro-glob-llcj9qf December 1, 2020 05:37 Inactive
@jorgeorpinel
Copy link
Contributor

CI fail is unrelated, should be fixed in master

This has been fixed. Please merge master, @skshetry.

Copy link
Contributor

@jorgeorpinel jorgeorpinel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd reorder these paragraphs a little so that everything --glob is together, like this

content/docs/command-reference/repro.md Outdated Show resolved Hide resolved
content/docs/command-reference/repro.md Outdated Show resolved Hide resolved
Copy link
Contributor

@jorgeorpinel jorgeorpinel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @skshetry this is really good.

Side-note guys: I'm a bit concerned about the --glob name for UX reasons. See iterative/dvc#4976 (comment).
Moved to iterative/dvc#4816 (comment)

@jorgeorpinel
Copy link
Contributor

OK so it looks like we have all the info we need for this doc. The Q is whether you can help move stuff around and address some of the feedback above @skshetry please lmk so we can plan to finish this up when ready. Thanks

@skshetry
Copy link
Member Author

skshetry commented Dec 9, 2020

@jorgeorpinel, this should be ready, I don't have anything to add.
I was trying to show examples before the spec explanation like how Ivan mentioned above, but I am not sure how to do that, so I'm skipping it.

@shcheklein

This comment has been minimized.

`dvc repro` does not run `dvc fetch`, `dvc pull` or `dvc checkout` to get data
files, intermediate or final results (except if the `--pull` option is used).

By default, this command checks all [pipeline](/doc/command-reference/dag)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all not relevant?

Copy link
Contributor

@jorgeorpinel jorgeorpinel Dec 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This paragraph is redundant (with the new targets text a well as the first p in the Description) and slightly wrong (but that will be corrected for #2024). Removing it aims to make the Description shorter so that the whole doc doesn't grow too much as we introduced the targets text.

Copy link
Contributor

@jorgeorpinel jorgeorpinel Dec 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I can easily stash the change if you prefer and leave it for later. Lmk.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, let's make PRs focused if possible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect some changes here, but it probably should be only the paragraph about targets?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I reverted this. I'm afraid that if we just keep adding paragraphs without reviewing the context we'll end up with lots of redundancy and a text that may be unnecessary long.

- `-R`, `--recursive` - determines the stages to reproduce by searching each
target directory and its subdirectories for stages (in `dvc.yaml`) to inspect.
If there are no directories among the targets, this option is ignored.
- `-R`, `--recursive` - looks for `dvc.yaml` files to reproduce in any
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not relevant?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is directly related to repro targets. It used to read "determines the stages ... by searching ... for stages". Isn't this a good opportunity to fix that? TBH I already have a mountain of stashed changes so anything I can't get into existing PRs will probably get lost until we happen to notice it again 🙁

using by using a colon `:` following the path to that file. E.g.
`models/dvc.yaml:prepare`

- Files and directories tracked by **`.dvc` files** given as `targets` are
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's skip this I think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! I would like that too but are you sure? The help output (and Synopsis above) mention .dvc files so if we don't explain it here, it'd be up to the user to assume what will happen.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Synopsis is a copy/paste from DVC which should probably change, so, let's not complicate. The point is not cover everything to the very small detail - the point is to focus on important stuff and convey it in the simplest possible format (e.g. people don't read long texts, unless they want to dive in)- thus short self-explainable examples are better

Copy link
Contributor

@jorgeorpinel jorgeorpinel Dec 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. Removed info about .dvc file targets from this ref.

Comment on lines 110 to 122
Different things can be provided as targets depending on the flags used (more
details in each option), namely:

- With `-R` you can provide directory paths to search for `dvc.yaml` files in,
recursively.
- With `--glob`, you can use special patterns (using wildcards) to match
groups of stage names.

- `-R`, `--recursive` - looks for `dvc.yaml` files to reproduce in any
directories given as `targets`, and in their subdirectories. If there are no
directories among the targets, this option has no effect.

- `--glob` - causes the `targets` to be interpreted as wildcard
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added -R and --glob to targets but also moved those options up right next to targets, so I still think it seems a bit redundant.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion:

Keep only

Different things can be provided as targets depending on the flags used (more
  details in each option), namely:

then a list of examples:

  • dvc reporo -R pipeline - will look recursively into ....
  • dvc reporo --glob ... - will execute stages that match pattern
  • dvc repro - will find dvc.yaml(s) and execute all stages

it's way simpler to grasp the idea, no need to read long long text (if some details are need let's try very hard to squeeze them into each item above)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reporo? 😆 Sounds funny

OK good idea, applied! PTAL.

Note that issue #1614 is pretty evident this way, should we prioritize it?

image

@shcheklein shcheklein merged commit f641643 into master Dec 19, 2020
@skshetry skshetry deleted the repro-glob branch December 19, 2020 05:47
depending on the flags used (more details in each option). Examples:

- `dvc repro linear/dvc.yaml`: Specific `dvc.yaml` file to reproduce
- `dvc repro -R pipelines/`: Directory path to explore recursively for for
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- `dvc repro -R pipelines/`: Directory path to explore recursively for for
- `dvc repro -R pipelines/`: Directory path to explore recursively for

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops thanks! Want to send a quick typo fix PR since this is merged?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A: docs Area: user documentation (gatsby-theme-iterative)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

repro: document --glob flag
3 participants