More lenient float multipleOf validation #878
Hi. Thanks for this but I'm going to politely decline. As I mentioned in the other issues, if someone wants non-float behavior, there is already a way to get it via Decimals. Otherwise, this change just moves which floats are incorrect multiples from some to others. If you indeed are interested in this, I'd point you to this PR which claims there is a better algorithm entirely for us to use.
I don't really think that looking at the number of tickets filed is a good metric for surprisability (and I'm not sure I believe surprise is the right metric at all in this case, this library implements a spec, so for better or worse it's compliance with the spec that trumps). But just looking at issues filed is going to be biased against the non-current behavior -- people who expect this behavior aren't filing tickets saying things worked as expected :)
But do appreciate the PR nonetheless.
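For reference, a minimal sketch of the Decimal route mentioned above, using only the standard library. It relies on the documented `parse_float` hook of `json.loads`; the variable names are illustrative and this is not code from the library or from the PR.

```python
import json
from decimal import Decimal

# parse_float is a documented json.loads hook: every JSON real number
# is parsed as a Decimal instead of a binary float.
instance = json.loads("10.1", parse_float=Decimal)
divisor = json.loads("0.1", parse_float=Decimal)

# Decimal arithmetic is exact for these literals, so the remainder is zero.
decimal_is_multiple = instance % divisor == 0

# The same remainder check on binary floats is non-zero, because neither
# 10.1 nor 0.1 is exactly representable as a float.
float_is_multiple = 10.1 % 0.1 == 0
```

With Decimals the question "is 10.1 a multiple of 0.1" is asked about the decimal literals themselves rather than about their nearest binary approximations.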
@Julian, I understand and sympathize with your points, but am going to try to convince you otherwise nonetheless. I see now that you have been involved in discussions on the jsonschema spec itself, without reaching a meaningful course of action. I just see many valid points raised, linked issues, calls for clarification of the spec, other languages struggling with this etc.
My take is that it is up to the language to interpret/implement this as they want (correct me if I'm wrong here). The spec offers no real guidance here (and whilst I think that should be improved, that is another discussion), so for the Python implementation I think Python's best practices should be guiding. And that is where it becomes a bit unclear I guess. At first sight, Python works like this:
>>> 0.1*round(10.1/0.1)
10.100000000000001
>>> 0.1*(10.1/0.1)
10.1
And it seems fair to reason that 10.1 is indeed not an integer multiple of 0.1. But digging a bit deeper, both Python itself and the native json library already do some implicit rounding:
>>> import json
>>> v = json.loads("0.100000000000000001")
>>> print(v)
0.1
>>> print(f"{v:.20g}")
0.10000000000000000555
>>> print(json.dumps(v))
0.1
There are three occurrences of implicit rounding here:
So the underlying philosophy seems (to a certain degree) to be to treat float(0.1) as exactly 1/10. So even though the behaviour below is strict/correct/explainable:
This is also the case:
If I really cared, I should have used a
>>> import simplejson
>>> import jsonschema
>>> jsonschema.validate(instance=simplejson.loads("10.00000000000000005",use_decimal=True), schema={"multipleOf": 1})
Failed validating 'multipleOf' in schema:
{'multipleOf': 1}
On instance:
Decimal('10.00000000000000005')
So whilst you have argued that if you want more correct behaviour of multipleOf one should use Decimal and not float, I would like to argue that it is more pythonic to silently add some leniency to multipleOf validation (equal to the float representation error) just like python displays
Not really, it just checks (or should check) if the number is equal to the nearest representable float. Now there are many different ways to implement this, and perhaps one based on Decimal would feel less arbitrary:
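A minimal sketch of what a Decimal-based check along these lines could look like. The helper name `is_multiple_via_repr` is hypothetical and this is not necessarily the code proposed in the PR; it assumes that converting each float through `str()` (the shortest string that round-trips to the same float) is an acceptable reading of the user's intent.

```python
from decimal import Decimal


def is_multiple_via_repr(instance, divisor):
    """Hypothetical helper: test divisibility on the shortest decimal reprs."""
    # str(float) yields the shortest string that parses back to the same
    # float, so Decimal(str(0.1)) is exactly Decimal('0.1').
    return Decimal(str(instance)) % Decimal(str(divisor)) == 0


# Illustrative calls (values chosen for this sketch).
ten_point_one = is_multiple_via_repr(10.1, 0.1)
ten_point_one_five = is_multiple_via_repr(10.15, 0.1)
```

Here `10.1` is accepted as a multiple of `0.1` while `10.15` is not. One caveat of this sketch: for very large quotients the Decimal remainder can exceed the default context precision, so a real implementation would need to handle that case.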
If the user's intention actually is to use arbitrary precision decimal representation, they can do so by using the Decimal package explicitly, something they should be doing anyways as this holds:
>>> Decimal(1) / Decimal(0.1)
Decimal('9.999999999999999444888487687')
>>> 1 / 0.1
10.0
Which in the current implementation actually leads to the arbitrary behaviour where some numbers are multiples of some unrepresentable numbers and others are not.
Apologies for not having more time to respond in detail, but there's no implicit rounding in your examples. What Python does in recent versions is a very targeted, very specific display change (https://bugs.python.org/issue1580): it reprs floats by using the shortest input that produces the given float. But no rounding is happening; 0.1 isn't a representable float, so you have always gotten the same float, before or after the repr change. And absolutely no tolerance is added; it's strictly a display change. There's as far as I know no precedent anywhere in Python for introducing some implicit tolerance when doing float operations unless the user explicitly asks for it, and what
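The display-only nature of the float repr can be checked directly. A minimal sketch using only the standard library; the variable names are illustrative.

```python
from decimal import Decimal

a = float("0.1")
b = float("0.100000000000000001")

# Both literals parse to the same binary float: nothing is rounded at
# display time, the stored value was identical all along.
same_float = a == b

# repr() picks the shortest string that round-trips to that float.
shown = repr(b)

# Decimal(float) converts exactly, exposing the value actually stored.
exact = Decimal(b)
```

`shown` is the string `0.1`, while `exact` is a long decimal expansion slightly above one tenth, so the short repr says nothing about the float being exactly 1/10.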
Many users have run into floating-point precision errors when validating multipleOf, e.g. #818, #810, #687, #185, #320, #247. Technically it can be argued that it is silly to check whether any number is an integer multiple of e.g. 0.1: ask silly questions, get silly answers. This approach is a bit unhelpful though.
Looking at other implementations in JavaScript and Python, other libraries struggle with this too.
Checking if 10.1 is a multiple of 0.1 yields mixed results:
https://jsonschema.dev/ fails validation
https://www.jsonschemavalidator.net/ accepts it
And react-jsonschema-form accepts it.
So there is no real consensus amongst the libraries. Still, I find the number of issues a strong indication that the current behaviour of this library is unexpected. This PR modifies the multipleOf behaviour to allow for float tolerance (epsilon) to be taken into account.
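To illustrate the idea of an epsilon tolerance, here is a minimal sketch. The helper `is_multiple_with_tolerance` and its `rel_tol` parameter are hypothetical and are not the PR's actual diff; the sketch assumes a relative tolerance of one machine epsilon on the quotient.

```python
import sys


def is_multiple_with_tolerance(instance, divisor, rel_tol=sys.float_info.epsilon):
    """Hypothetical check: is instance / divisor within rel_tol of an integer?"""
    quotient = instance / divisor
    nearest = round(quotient)
    # Scale the tolerance with the magnitude of the quotient, because the
    # spacing between adjacent floats grows with their size.
    return abs(quotient - nearest) <= rel_tol * max(1.0, abs(nearest))


# Illustrative calls (values chosen for this sketch).
lenient_result = is_multiple_with_tolerance(10.1, 0.1)
clearly_not = is_multiple_with_tolerance(10.15, 0.1)
```

Under this sketch `10.1` passes as a multiple of `0.1` and `10.15` does not. A tolerance of this kind may also accept floats that differ from a true multiple by a rounding error or so, which is the trade-off raised in the review.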