list: document usage for data archive #1879
## Example: Archive project data

An archive is a single file that contains multiple files from a project. It can
be used to backup project data. We can use `dvc list` to create an archive of
files and directories (<abbr>outputs</abbr>) tracked by DVC:
We have to think through the motivation and use cases here. Are we doing what the issue originally suggested ("create a lightweight copy of the project (for backup). It's "lightweight" because it wouldn't include any of the data tracked by DVC")? Or one of the options in #1521 (comment)? Or something else? And why?
p.s. is this ready for review? I see the PR is in draft state.
```
$ dvc list . -R --dvc-only | zip -@ data.zip # if `zip` available
```

Alternative for windows (if `xargs` available):
- Capital W
- Missing "is"
But I'm not sure this makes sense. How is it an alternative for Windows when it uses `xargs`, which is a GNU tool?
```
$ dvc list . -R --dvc-only | xargs python -m zipfile -c data.zip
```
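
The portability concern about `xargs` could be sidestepped with a small helper that uses only the Python standard library. The following is a minimal sketch, not part of the PR: `archive_paths` is a hypothetical name, and it assumes each input line is a path relative to the current directory, pointing at a file already present in the workspace, in the form printed by `dvc list . -R --dvc-only`.

```python
import zipfile
from pathlib import Path


def archive_paths(paths, archive_name="data.zip"):
    """Write each existing file named in `paths` to a zip archive.

    `paths` is any iterable of strings, one path per item (for example
    the lines of sys.stdin). Returns the list of archive member names.
    """
    added = []
    with zipfile.ZipFile(archive_name, "w", zipfile.ZIP_DEFLATED) as archive:
        for raw in paths:
            name = raw.strip()
            if not name:
                continue  # ignore blank lines
            path = Path(name)
            if not path.is_file():
                continue  # skip entries that are missing or not regular files
            member = path.as_posix()  # forward slashes inside the archive
            archive.write(path, arcname=member)
            added.append(member)
    return added
```

Saved in a script that calls `archive_paths(sys.stdin)`, this could take the place of both `zip -@` and `xargs python -m zipfile` at the end of the pipeline. Files that have not been checked out into the workspace are skipped silently, so the returned list is worth comparing against the input.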

Use `git archive` to create an archive of data tracked by Git:
If it's tracked by Git it's not data.
Hey @imhardikj I'm not sure that addressing #1521 is very important TBH. It would be nice but it's not a 0->1 update so maybe put that in a separate PR which we may or may not get to (there are higher priorities I think, according to your project proposal, but up to you if you want to work on that on extra time).
The job here should be pretty straightforward: read and review the correctness of the description, actually try `dvc list` throughout the doc including the examples, and update anything that is still outdated (the only change so far in this doc from 0 to 1 was to quickly grep/replace DVC-file for .dvc file and/or dvc.yaml; was that enough? I'm not sure).
Thanks
DVC, by effectively replacing data files, models, directories with `.dvc` files
(`.dvc`), hides actual locations and names. This means that you don't see data
files when you browse a <abbr>DVC repository</abbr> on Git hosting (e.g.
GitHub), you just see the `dvc.yaml` and `.dvc` files. This makes it hard to
navigate the project to find <abbr>data artifacts</abbr> for use with `dvc get`,

DVC replaces data files, models, directories, etc. with small
[metafiles](/doc/user-guide/dvc-files-and-directories#dvc-files-and-directories),
and hides actual locations and names. Hence data files aren't visible when you
browse a <abbr>DVC repository</abbr> on Git hosting (e.g. GitHub), you just see
the `dvc.yaml` and `.dvc` files. This makes it difficult to find <abbr>data
artifacts</abbr> while navigating in a project, for using with `dvc get`,
I don't see how this addresses #1521 and it has some strange language like "for using with" among other phrases (that replace perfectly OK previous text). The changes don't seem to bring any new info. anyway.
This looks stale.
Sorry @imhardikj I'm closing this as stale. You have enough PRs open already anyway. If you're trying this one again (when/if you get the capacity) please first post your plan of action in the original issue and mention me, so we confirm we're on the same page. Thanks
Fixes #1521
Partially addresses #1824