Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spec: "unhashable" #64

Open
alandonovan opened this issue Jul 4, 2019 · 5 comments
Open

spec: "unhashable" #64

alandonovan opened this issue Jul 4, 2019 · 5 comments
Assignees

Comments

@alandonovan
Copy link
Contributor

No description provided.

@ndmitchell
Copy link
Contributor

What do you think needs clarifying here? I was assuming unhashable = not hashable, and hashable seems fairly well defined throughout the spec.

@stepancheg
Copy link
Contributor

The spec contains contradictory statements:

Lists are not hashable, so may not be used in the keys of a dictionary.

Values of mutable types such as list and dict are not hashable, unless they have become immutable due to freezing.

Would be helpful it the spec was more precise in definitions.

@alandonovan
Copy link
Contributor Author

Yes, the intention changed at some point in the past so that lists were hashable-only-if-frozen but now are always unhashable; the text of the spec was not brought to consistency.

In the Go implementation, lists are unhashable. In the Java implementation, lists may not be used as keys of a dict, but structures (like structs) that indirectly contain lists may be used as keys of a dict if frozen, and sadly this is widely relied upon. It's hard even just to understand exactly what the Java implementation permits.

@brandjon
Copy link
Member

This has been a long-standing issue that I've unfortunately wavered back and forth on.

On the one hand, I've always found it odd that freezing a value could allow you to do more things with it (hash it), rather than strictly fewer (no mutations). It breaks my mental model that freezing is sort of like up-casting to a base type with fewer operations, and where Liskov substitution says you can always pass the derived type to a method accepting the base type. That's why we find ourselves in odd situations like code breaking when it gets refactored into the same file as where its value is consumed. For instance, the example given in #272.

On the other hand, if we don't allow hashing of formerly-mutable objects, that creates a need for separate frozen-from-birth types. Lists already have this in the form of tuples, but for the recently added set() type it means also borrowing the corresponding frozenset() type from Python. It's not a great user experience to explain to people that no, they can't use a set that was frozen, they instead need a frozen set. And as Alan said, there are many existing uses in Bazel rule logic where a depset is dubiously created with about-to-be-frozen values.


(Digression warning)

We've also thought about possible interactions with a Starlark type system. A key invariant of any type system is that the static type of a variable/expression is a superset of the dynamic types of the possible values it may refer to at runtime. So if you write

x = set()
freeze(x)  # assuming Go-interperter-style freeze() primitive
x.add("abc")

Is the freeze() call understood to be mutating the dynamic type of x from set to frozenset? A static type inference can't possible know that the x in x.add(...) is now a frozenset, at least in the general case where the freeze call is buried in a helper function, without there being some kind of parameterization of type expressions on whether or not they're frozen. And that sounds disturbingly similar to C/C++'s dreaded "const-correctness" problem.

Another, more reasonable semantics is that freeze() returns the const-ified version of the value's type. In that case, the above error would not be caught by the type system, but this modified version would:

x = set()
x = freeze(x)  # uses of x hereon have type `frozenset`
x.add("abc")

(If frozenset is to be considered base type of set then this would require that hashing mutable sets is a dynamic error rather than a static type error.) But this still doesn't escape C/C++'s constness pain, wherein everyone's obliged to remember to annotate their parameters as taking frozenset where possible instead of set.


So perhaps it's better to keep frozen-ness entirely out of scope for the type system. But then that raises questions about what did we really gain by banning hashing of frozen values. If we undo that, then we could add frozenset() to Starlark as a constructor of an already-frozen set() rather than as a distinct type. Tuples and frozen lists would be functionally redundant, though still distinct in their APIs. And in Bazel, depsets could continue to look the other way when someone passes in a mutable hashable value that's about-to-be-frozen-so-pretty-please-don't-worry-about-it.

Ideally I'd like us to reach a decision on this sometime soon, e.g. in Q1 before the proposed Starlark type system is too mature.

@brandjon
Copy link
Member

@comius FYI, some thoughts above about mutability w.r.t. types. Tl;dr: It seems hard to work mutability information (both mutation safety and hashability safety) into the type system without creating a lot more work for users (and us) than it's probably worth. If we don't try, and instead keep them as purely dynamic errors, then we may avoid the need for a distinct frozenset type.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants