-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add unassign
and deprecate
#219
Conversation
all versions. | ||
|
||
```console | ||
$ gto show churn --av -1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aguschin --av
- minot style comment. It's not common to have multi-letter shortcuts like this. I would not do this.
Hi, I added cc @dmpetrov if you want to check out the README (I don't think those are breaking changes - just renamings here and in Studio). |
README.md
Outdated
6. Deregistering an artifact - completing the lifecycle of an artifact and | ||
marking it as an outdated one. | ||
|
||
### Registering an artifact |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is confusing, coz in Studio model registry, you see models that you annotated even if you have not registered them. Should the Add a model
flow in Studio also call register
and create the Git tag to register the model? Is it possible to combine these commands somehow, so that the user does not have to use so many commands (register, publish, annotate)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. I think it should not. We can explain that artifact is considered to be registered
if there is any version (that wasn't deprecated/unpublished).
Also I recently understood that what Studio shows is the same you should get if you run GTO on a remote repo, e.g. $ gto show https://github.com/iterative/model-registry
. GTO doesn't process remote repos yet, but we'll add it soon #220
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The difference from Studio still is that when you add a model from Studio, we don't call the GTO register
api and create the model tag without versions. I think we should make this consistent with GTO and also call register
. Also, IMO having to run 3 commands (register, publish, annotate) for getting your first version up is too much. Can register
internally call publish
to register the first version? And can it also accept model description etc and call annotate
internally?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with the reasoning, but I'm afraid combining register
and annotate
in a single command is also confusing. Consider, that annotate
edits artifacts.yaml
and don't create a commit with this change, while register
creates a git tag for an already existing commit. If a single command both changes files in the current index and creates a git tag, that can be quite confusing IMO.
Even if we make annotate
to create a new commit, register
can be applied to any commit, not to the last one only. So combining register
and annotate
is useful only if you want to annotate+register something on the top of the HEAD. Which is useful when you register your artifact for the first time, but probably not that useful once you've annotated it already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tapadipti for pointing out this! I would expect that during the Add Model
scenario Studio calls 2 commands
gto register
-> create a taggto annotate
Not sure about that a separate command is necessary, but I would upvote for consistent behavior!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aguschin Questions:
- In terms of displaying the model in the Studio model registry, how should we distinguish between models that are only annotated and models that are annotated and registered? Should we treat them the same and display both these types of models in the MR?
- In terms of the
Add a model
flow in Studio, should we ask the users if they want to only annotate or annotate®ister? I think this will introduce unnecessary confusion to the users. Can we agree on one flow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how should we distinguish between models that are only annotated and models that are annotated and registered
Please check out this issue. Maybe this is a solution.
should we ask the users if they want to only annotate or annotate®ister
I would ask this from @alex000kim and @mnrozhkov cause they understand user needs better than I do.
on a side note, a similar thing in GTO was proposed by @jorgeorpinel a while ago #145
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tapadipti, do we have an issue for this already or should we create one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No we don't. I'm reading #226 and it looks like we should conclude that discussion and then create required tickets in Studio. Has there been any discussion on that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, nothing as far as I know.
Codecov Report
@@ Coverage Diff @@
## main #219 +/- ##
==========================================
- Coverage 82.72% 80.91% -1.82%
==========================================
Files 16 16
Lines 1534 1886 +352
==========================================
+ Hits 1269 1526 +257
- Misses 265 360 +95
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
Co-authored-by: tapadipti <[email protected]>
README.md
Outdated
1. Registering artifact - marking the very beginning of the artifact history in | ||
the Git repository. | ||
2. Publishing a version - marking an important change in the artifact. | ||
3. Assigning a stage to a version - marking the readiness to be consumed by a | ||
downstream system. | ||
4. Unassigning a stage - removing an "artifact is ready for this stage" mark. | ||
5. Deprecating a version - marking a version as the one that should not be used | ||
any longer. | ||
6. Deregistering an artifact - completing the lifecycle of an artifact and | ||
marking it as an outdated one. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the minimalistic standpoint, I was thinking to only introduce (and implement) what seams really necessary: 2, 3, 4, 6. But on the other hand, having all commands makes the picture complete (even if some parts are optional). WDYT? Should we for now skip adding 1 and 5 cause they look as not essential, or should we introduce them and structure the README around the most supposedly common commands?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest having a single command for 1 and annotate. But 5 (unassign version) is good to keep.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just had a chat with @mnrozhkov, the suggestion is to remove the 1. Registering an artifact
from GTO and use instead 2. Publishing a version
. 1.
doesn't solve any particular problem, and 6. Deregistering an artifact
should be undone by either deleting a deregistering git tag (mymodel@deregistered
) manually or by 2.
:
$ gto deregister mymodel
Created git tag 'mymodel@deregistered'
# now the model is deregistered with all its versions
$ gto publish mymodel
Created git tag '[email protected]`
# now the model is registered again
WDYT @tapadipti?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aguschin there are few questions for clarification
- after you publish
mymodel
again, the only[email protected]
is active and available in MR formymodel
? (old version if any are still deprecated?) - can I re-publish a model version that was deprecated? (I don't see any use case for this, just curious)
- IMO
deprecated
sounds better thanderegister
, also you use verbregister
to explainpublish
command ))) may be it's also not optimal command name? WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question about the use case! Please ask any other questions like this, if you have some :)
I guess there are two situations in which you may want to deprecate and then un-deprecate a model:
- You deprecated an artifact so others don't use it, but then changed your mind (in this case deleting a model deprecation git tag will do, although it don't preserve history)
- You want to deprecate all previous model versions so others don't use them (e.g. you found a critical bug that seems to exist in all versions prior to some version), but still want to develop and use the artifact further.
in 1 you can just delete a model deprecation git tag, but that won't preserve history of deprecation/undeprecation which could be valuable information (there should have been a reason to deprecate, the user should have been describing that reason in model deprecation git tag message). -- That's why I initially created those commands in pairs (register artifact & deregister artifact).
in 2 you may just go over all previous versions and create a version deprecation git tag for each of them. But that may be a lot of git tags and may not be desirable. Although, this is a premature optimization probably and we should ask user to do this if he wants to deprecate all versions at once, but not an artifact.
On the other topic, I agree that deprecate
sounds better than deregister
, but having paired command names may be easier to grasp for the user. Also, we may name this not "register & deregister an artifact", but "add & deprecate an artifact".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can just delete a model deprecation git tag, but that won't preserve history
I think we shouldn't implement solutions that don't preserve history
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not though? Git allows you to overwrite history if you really want to.
If a user messes up (makes a typo, pushes a wrong version, etc) they should be able to delete
/remove
that tag.
I am sure users will still find a way to do that with git, so we might as well make it easier for them by supporting deletion behavior with gto.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I've implemented
--delete
flags foruntag
anddeprecate
commands (this is not the default behavior). Those delete all relevant Git tags. You can find examples in the README. - For
deprecate
I'm implementing the following approach: any new tag created after the artifact was deprecated "undeprecates" it. If you're going to do that, GTO will error out by default, so you'll need to provide--force
flag.
my 2c:
|
It's a great idea @alex000kim! I'm also checking if the following is possible gto assign awesome-model v1.0.0 # instead of `gto publish awesome-model --version v1.0.0`
gto unassign awesome-model v1.0.0 # instead of `gto deprecate awesome-model --version v0.0.1`
gto assign awesome-model prod # instead of `gto assign awesome-model prod`
gto assign awesome-model v1.0.0 prod # instead of `gto assign awesome-model prod --version v1.0.0`
gto unassign awesome-model prod
gto archive awesome-model # instead of `gto deregister awesome-model` if we could remove |
README.md
Outdated
$ gto assign awesome-model prod | ||
Created git tag 'awesome-model#prod#1' that adds stage 'prod' to 'v0.0.1' | ||
$ gto tag awesome-model --version v0.0.1 --stage prod | ||
Created git tag 'awesome-model#prod#1' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this message specify the version as well?
E.g. Created git tag 'awesome-model#prod#1' based on v0.0.1 of awesome-model
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, since 1) we can't assign a stage without a version and 2) gto tag awesome-model --version v0.0.1
is very different behavior than gto tag awesome-model --version v0.0.1 --stage prod
What do you think about a --to-version
option when using --stage
?
gto tag awesome-model --stage prod --to-version v0.0.1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks better - easier to read the command. Do you think this should do both version registration and promotion then?
$ gto tag x3 HEAD --version v0.0.1 --stage prod
Created git tag '[email protected]'
Created git tag 'x3#prod#1'
couple more examples to clarify how this will work:
$ gto tag x3 HEAD --stage prod
Created git tag 'x3#prod#2'
$ gto tag x3 --stage prod --to-version v0.0.2
Created git tag 'x3#prod#3'
$ gto tag x3 HEAD --to-version v0.0.3 --stage prod
Error: provide either 'ref' or 'to_version', but not both
$ gto tag x3 --version v0.0.4 --to-version v0.0.4 --stage prod
Error: provide either 'version' or 'to_version', but not both
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess whether this makes sense or not depends on whether gto
should allow promotion to a stage from a git ref OR if only a particular version should be allowed to be promoted to a stage.
I kind of assumed the latter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By default, if you try to assign a stage, but the version doesn't exist for the REF
, GTO will register it for you. But if you supply --skip-registration
git tag, then it won't.
I.e. let's suppose there are no versions for x4
:
$ gto tag x4 HEAD --stage prod
Created git tag '[email protected]'
Created git tag 'x3#prod#1'
$ gto tag x4 HEAD^1 --stage prod --skip-registration
Created git tag 'x3#prod#1'
Looks like the difference between --version
and --to-version
is this: if you provide REF
and --to-version
, you will get an error, cause there should be only one way to specify the Git revision to reference with Git tag you'll create. While if you provide REF
and --version
, this will be OK: you register a version
and reference REF
with Git tag you create. That's why if you provide REF
, --version
and --stage
, you can both register a version and make an assignment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I think it's better to keep the commands simple and not allow doing 2 things (register version and assign stage) with one command.
- And having 2 similar flags (
--version
and--to-version
) looks confusing to me. - Also, why do we want to generalize this to
tag
/untag
, instead of making the commands specifically about versions and stages? Do we see more kinds of tags being required in the future, and want to be able to cover those potential scenarios? Since versions and stages have special meanings for us and since we treat them separately (eg, we only display the latest stage per version but we display all versions per model), I think it's easier to understand and remember the commands if they are separate for the distinct functionalities. Simplifying the name/syntax is a good idea, but not sure if combining them into a single command is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's better to keep the commands simple and not allow doing 2 things (register version and assign stage) with one command.
# I guess you're suggesting we shouldn't allow this:
$ gto tag x6 HEAD --stage prod --version 1.0.0
Created git tag '[email protected]'
Created git tag 'x5#prod#1'
# Should we allow this?
$ gto tag x5 HEAD --stage prod
Created git tag '[email protected]'
Created git tag 'x5#prod#1'
# If we don't allow the latter, the alternative is to error out
$ gto tag x5 HEAD --stage prod
Error: no version registered at HEAD. Register a version first, then do an assignment.
But the ability to assign a stage with automatic version registration (like the second command) is a useful shortcut, I believe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, why do we want to generalize this to tag/untag, instead of making the commands specifically about versions and stages?
From the above comments from @mnrozhkov and @alex000kim I assume it's too hard to memorize 5 different commands and syntaxes for a new user. I agree on this direction: GTO does very simple things and should be a lightweight tool, and having a big tough-to-read README with many commands is what I would like to avoid.
On the other hand, if unifying commands brings more confusion than value, let's not do those. Currently (without --to-version
and --from-version
) I think it's pretty straightforward though (maybe it's needed to be clarified in docs): if you supply --stage
, you do an assignment, otherwise, you do a registration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Iterated on this with @mnrozhkov and eventually decided that
tag
looks cool, but if you try to investigate how it works, it becomes confusing (different behavior with--stage
and without--stage
register
andassign
are easier cause they separate use cases welldeprecate
can unassign stages and deregister artifacts
I've updated the README and it looks good now: we have 3 commands only: register
, assign
, deprecate
. Almost without breaking changes. Feel free to read and provide any more feedback re this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've implemented this. So now:
- in CLI you have
register
,assign
,deprecate
. - in API you have
register
,unregister
,assign
,unassign
,deprecate
This is because Studio is the main user of API, and having separate commands for them is clearer.
For human in CLI we'll have deprecate
unifying deregister
and unassign
, making things simpler, hopefully.
A side note: it looks like API and CLI in GTO diverge more and more over time.
I'm merging this with how it works to release existing changes. Let's enhance on top of what it's there already. Follow-up issues
Feel free to join any discussion if you wish. |
This PR adds
close #93
close #51
Since we're introducing multiple new actions here, we need to align names and terms in the new context. Maybe
register
andderegister
an entire artifactpublish
(release
?) anddeprecate
a version of the artifact (can switch 1st and 2nd to avoid extra changes)assign
andunassign
a stage to a version (orpromote
ordemote
if like the previous names).WDYT?
A minor note for myself: we need to warn a user that he needs to deprecate a version if that version have active assignments (but we shouldn't do the same when deprecating an artifact: warn about active versions - which is a bit non-symmetrical).