Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

argparse : support "single occurence" arguments #128301

Closed
WasfiJ opened this issue Dec 27, 2024 · 12 comments
Closed

argparse : support "single occurence" arguments #128301

WasfiJ opened this issue Dec 27, 2024 · 12 comments
Labels
pending The issue will be closed if no feedback is provided stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@WasfiJ
Copy link

WasfiJ commented Dec 27, 2024

Feature or enhancement

Proposal:

Hello,

Just finished writing a little gadget to move/rename files inside a given bucket/folder/ on S3 (AWS).
It wouldn't make sense to pass multiple "-b bucket" to my program.
Sure, I can count such args gathered in an Action.dest and emit an error, but it would be nice to express such requirement and have argparse handle it.

Here is an example modification implementing such control in _ExtendAction that worked for me :

  • check len(items) for error condition
  • add user error message, which is necessarily more helpful that a generic "used more than once" (optional)
class UseOnceExtendAction(argparse._ExtendAction):

  def __init__( self, option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None,
                required=False, help=None, metavar=None, deprecated=False, errmsg=None ) :
    super(argparse._ExtendAction, self).__init__(
      option_strings=option_strings, dest=dest, nargs=nargs, const=const, default=default, type=type,
      choices=choices, required=required, help=help, metavar=metavar, deprecated=deprecated )
    self.errmsg = errmsg
  
  def __call__(self, parser, namespace, values, option_string=None):
    items = getattr(namespace, self.dest, None)
    if values and hasattr(items, "__iter__") :
      i = -1
      if   self.default == None and len(items) == 1 : i = 0
      elif self.default != None and len(items) == 2 : i = 1
      if i >= 0 :
        if self.errmsg : self.errmsg = '\n\n  ' + self.errmsg.strip() + '\n'
        else : self.errmsg = '\n' 
        self.errmsg = ' :\n\n     ' + option_string+' ' + items[i] +'\n     '+option_string+' '+ values[0] + self.errmsg
        raise argparse.ArgumentError(self, argparse._('used more than once%s') %self.errmsg )
    items = argparse._copy_items(items)
    items.extend(values)
    setattr(namespace, self.dest, items)

# Use : 
parser.add_argument('-b', '--bucket', required=True, nargs=1, action=UseOnceExtendAction, metavar="<Bucket>", 
  dest='bucket', help='Set S3 bucket.', errmsg='\n  Please provide one single S3 bucket !\n' )

Effect :

$ python3.13 -u 'd:\dev\aws_mass_mv.py' -b bucket1 -b bucket2 

  Usage : aws_mass_mv.py -b <Bucket> -r <root_folder> -i <input_file> [<input_file> ...] [-s <char>]
                         [-w | --overwrite | --clobber ] [-k <max_age>] [-o <log_file>] [-h] [-v]

Error : -b/--bucket used more than once :

     -b bucket1
     -b bucket2

  Please provide one single S3 bucket !

Also, an other suggestion : added a few line breaks in ArgumentParser.error() to make for a clearer message with prominent error indication :

  def error(self, message):  
    try : sys.stderr.write('\n'+self.format_usage()) # formatter.add_usage(self.usage, .. ,'  Usage : ')
    except (AttributeError, OSError) : pass
    self.exit(2, argparse._('\nError : %s\n') % message)

Peace.

Has this already been discussed elsewhere?

No response given

Links to previous discussion of this feature:

No response

@WasfiJ WasfiJ added the type-feature A feature request or enhancement label Dec 27, 2024
@picnixz picnixz added the stdlib Python modules in the Lib dir label Dec 27, 2024
@picnixz
Copy link
Contributor

picnixz commented Dec 27, 2024

Why use _ExtendAction? why not simply subclass the Action class and check that the namespace does not already contain the dest variable? (I haven't tested it but it should work I think?)

Essentially, you want a StoreUniqueAction, or something like this, right?


Also, I think usually a single store action would simply keep the last argument that was passed (namely --foo x --foo y would end up storing 'y' and not 'x'):

>>> import argparse
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('--foo')
_StoreAction(option_strings=['--foo'], dest='foo', nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)
>>> parser.parse_args('--foo x --foo y'.split())
Namespace(foo='y')

@picnixz picnixz added the pending The issue will be closed if no feedback is provided label Dec 27, 2024
@WasfiJ
Copy link
Author

WasfiJ commented Dec 28, 2024

Essentially, you want a StoreUniqueAction, or something like this, right?

Yes exactly ! I just copy-pasted the hack that worked for me to show it's not that big of a deal (my program uses only Extend & BooleanOptional, so I subclassed only those).
I would expect this to be an added attribute to Action though : single_use for example, but implementation details belong with the developers.

Also, I think usually a single store action would simply keep the last argument that was passed (namely --foo x --foo y would end up storing 'y' and not 'x'):

Yes and that loses the cardinality info. You can retort that an extend action doesn't and you can examine the dest member. Indeed. But this argument applies to nargs as well : the user can count elements in dest himself, right ?
I'm suggesting to augment the functionality to make argparse nicer to use (and more powerful).

And this is not only nicer, it can be really important. Case in point : my program is called by someone's automated job that builds an input file with a list of files to move. That job providing two target S3 buckets is a sign there is a bug somewhere, which could extend to the list of files, ultimately leading to the clobbering of production files !

So this feature can provide real protection, and you know people and laziness (the opposite of safety - I want password '12345' everywhere, right). If this protection can come "for free" (just setting single_use=True) that means we have better Python code out there.

@picnixz
Copy link
Contributor

picnixz commented Dec 28, 2024

But this argument applies to nargs as well : the user can count elements in dest himself, right ?

Yes and that's essentially what I would recommend. Create a dedicated action which raises if the attribute is being added more than once. Note that instead of an action, you can also subclass argparse.Namespace to check it for you (though it's a bit uglier since you need to handle the fact that the default is None).

I would however subclass Action (public name) instead of _ExtendAction. First, the _ExtendAction is a private class and is not meant to be subclassed outside of argparse. Second, this is, honestly, quite a niche feature. Most of the CLIs would just ignore the first occurrence and only keep the latest one. Agreed that it could help you debugging bugs but for instance, ls -w 1 -w 0 is equivalent to ls -w 0 (this is one example amongst many others).

It would be better to suggest your feature on https://discuss.python.org/c/ideas/6 and if this gathers support, we could implement it but I'm afraid that the standard library is not where it should live. This is more the role of a third-party package.

If this protection can come "for free" (just setting single_use=True) that means we have better Python code out there.

The standard library makes design choices and may even restrict features on purpose since we weigh the cost of maintenance, use cases and flexibility. So it's not entirely free on our side either. As a user this can be frustrating (and I am one of them sometimes) but we need to estimate how likely this feature is needed (others may say that "the caller is responsible to check that they did not supply the argument twice" since this is not really a arduous task I think).

cc @serhiy-storchaka @savannahostrowski

@WasfiJ
Copy link
Author

WasfiJ commented Dec 28, 2024

I agree that in most cases ls -w 1 -w 0 reducing to ls -w 0 is harmless (and even a convenience if you're building such a command in a script and counting on the last -w to be in effect : allows for less complexity in your script - been there, done that).

But I think I make a clear case for the opposite being really useful as well.

My argument was that it is not an arduous task as you say, funny how you sent that back :)
My second argument is that there is a difference between every developer out there implementing this on their own (or, most likely, not : lazy, unaware, short on time (read docs/code), focused on business logic, ..) and this being a readily available easy-access feature in the standard library.

So in short this boils down to cost on your side. And you're certainly the better judge of that.
(I have been on the receiving end of users dictating "easy" features from an armchair !).

Thank you for suggesting the ideas page, I'll post there as well asap !

Peace.

@picnixz
Copy link
Contributor

picnixz commented Dec 28, 2024

My argument was that it is not an arduous task as you say, funny how you sent that back :)

It doesn't mean that it should be part of the standard library. We have a policy that not all one-liners (ok maybe a bit more in this case) should be part of the standard library.

But I think I make a clear case for the opposite being really useful as well.

There is a clear case but how many users would use this? how many actually care? I'm not saying that it's useless, I'm just saying that we need to make choices. A good example of this is when users want to add features to itertools, and most of them get rejected because, even if they are easy to implement, they either don't have much use case or can be substituted for something else. Or the feature is present in a 3rd-party library.


Now, one part that will never be free on our side is the fact that we still need to maintain this feature. Even if it's not an arduous task, it would add burden to anyone maintaining the module because it's a feature that may or may not suit each use case in the future . We need to handle corner cases, documentation, and teaching as well. Now, if the Discourse topic gets support and/or a core developper is willing to support this addition, then we wouldn't reject it. (I'm just explaining how the standard library policy for feature requests works.)

OTOH, we recently improved examples of argparse (cc @savannahostrowski). We could make a recipe for this kind of action, without adding it to the standard library itself (this is an approach usually taken by itertools). But a DPO topic is a good starting point to see if something is meant to be included.

@picnixz
Copy link
Contributor

picnixz commented Dec 28, 2024

FTR, I personally think we should make argparse more extensible in general (namely, expose actions classes). I also think that sometimes more semantics for nargs more than just N / all are needed. (So I understand your request and I'm personally +0 (it's not really useful most of time IMO but it's not entirely useless)).

@WasfiJ
Copy link
Author

WasfiJ commented Dec 28, 2024

Oh ! So Savannah's the better judge, not you !
Just kidding :)

What you're saying makes total sense.
That's why I leave it with the developers, and why I called my modification a "hack". Proper implementation deals with corner cases, smart choices (future-proofing, KISS), documentation, .. and is certainly not free.

May I plead to Savannah that this is a "harmless" feature, because it is (obviously) opt-in ? A current user of argparse could live the rest of their days oblivious to the fact that this was ever added..

Meanwhile, your suggestion of adding an example use to the documentation is really great !
At least people will be aware that such protection can be offered by argparse, and maybe they should harden their programs to the benefit of the unruly user.

@picnixz
Copy link
Contributor

picnixz commented Dec 28, 2024

(I tagged Savannah and Serhiy because they recently worked on argparse and I reviewed some of their PRs). Strictly speaking, any core developer willing to support the feature would be sufficient (we actually don't have an active and explicit maintainer for argparse, just some of us work on the module when we can).

@WasfiJ
Copy link
Author

WasfiJ commented Dec 28, 2024

Oh man ! I was trying to court Savannah, and you just explained how the "Blame" button up left works !
:-)

@serhiy-storchaka
Copy link
Member

I agree with @picnixz. The stdlib provides ready solutions for most common use cases. You can create custom actions and type converters for more specific needs, although in many cases you can get by with standard actions and post-processing.

I would use action='append' with checking the length of the list after parsing arguments. This requires much less code.

@WasfiJ
Copy link
Author

WasfiJ commented Dec 28, 2024

So be it.
With nargs=1 towards a lonesome arg (and dest=buckets), extend yields Namespace(buckets=['bucket1', 'bucket2']) which is nicer than append's list of lists : Namespace(buckets=[['bucket1'], ['bucket2']]) (a single arg passed is then easier to extract, otherwise error condition is the exact same : len(Namespace.buckets)>1).

@WasfiJ WasfiJ closed this as completed Dec 28, 2024
@serhiy-storchaka
Copy link
Member

Don't use nargs=1 with action='append'. Use nargs=None (which is the default, so you does not need to specify the nargs argument).

@picnixz picnixz closed this as not planned Won't fix, can't repro, duplicate, stale Dec 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pending The issue will be closed if no feedback is provided stdlib Python modules in the Lib dir type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

4 participants
@serhiy-storchaka @picnixz @WasfiJ and others