-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GADT pattern exhaustiveness checking and abstract types #7028
Comments
Comment author: @gasche When the abstract types are used as phantom type-level tags, the feature is unnecessary as type abstr = Blabla can be used itself (it is not optimal as it is not self-discoverable, but neither is your proposal). The proposal is interesting for types that are not phantom, as they are inhabited by values (produced by a FFI interface). Would it make sense, then, to add the "external" keyword as a prefix to the proposed syntax? external type abstr "blabla" this is consistent with external value declarations and highlights the intended use-case (FFI types, so to speak). |
Comment author: @yallop Perhaps something similar could be achieved for external-abstract (FFI) types by adding an injective type constructor to the initial environment that is known-unequal to other builtin types type _ abstract at which point we could write things like this type abstr = [`blabla] abstract and the type checker would know that [ |
Comment author: @alainfrisch This is not only useful for "external" types, also for concrete types (implemented by a sum or a record) which you don't want to expose to preserve value abstraction. In the example above, replace X.abstr by Date.t for a more realistic case. |
Comment author: @lpw25 I think the idea of adding information to the representation of the type, and being able to reveal that aspect of the representation in the interface, is a reasonable approach to this problem. I would prefer the information to take the form of an attribute (e.g. [@@Label foo]) rather than a string. This also fits with the [@Immediate] attribute proposed here: except in Alain's proposal the extra information is a "phantom" piece of information that is not actual used for anything. |
Comment author: @yallop I'm in favour of the idea behind this proposal. It would be nice to be able to pick exactly which properties of a type are exposed through an interface. At the moment we have facilities for hiding the constructors but leaving the constructors exposed (but not vice versa), facilities for exposing co- or contra-variance (but not injectivity or bivariance), facilities for exposing equalities (but not inequalities). I agree with Leo that using a string for the type identity isn't ideal, but I don't think it's a good idea to expose fundamental properties of the language, such as type equality or inequality, using attributes either.
I'm not sure. Over there you made the following observation:
However, type inequality is very much a property of the language itself. |
Comment author: @lpw25
But it is not type inequality it is type incompatibility, which I consider as "extra-lingustic" as the representation is "extra-linguistic". I tend to think that things related to warnings and runtime efficiency as opposed to soundness are aspects of the implementation rather than the language, and so should be handled using attributes. For example, I would not consider attributes for the other properties that you listed (bi-variance, the dual of private constructors etc.) as they affect the soundness of programs. |
Comment author: @yallop
True.
I think that's a reasonable view. However, introducing incompatibility information can cause fatal errors, not just warnings, as in the following program, which is rejected if s and t are incompatible:
|
Comment author: @alainfrisch The fact that a non exhaustive pattern matching is reported only as a warning is an accident of history, I would say. Exhaustivity checking is arguably one aspect of static type safety. |
Comment author: @gasche I feel that the consideration of (in)compatibility rather than (in)equality is a sort of coincidence here: Alain is pragmatic in his proposal and knows that trying to classify (in)equality by non-unique strings would be problematic. But this looks like a pragmatic approximation of a type-theoretic mechanism that, deep down, wants to talk about type identity (thus equality). We are not going to explain it to our users in term of compatibility. |
Comment author: @alainfrisch I'm not sure this would be useful, but one could generalize the idea a little bit. The extra annotation could be an element of a partially ordered set (X, <=) -- the ordering constrains subtyping -- with the following properties:
Possible generalizations:
|
Comment author: @lpw25
I know. I think I've said it before but that error should really be a warning. At the moment it breaks abstraction.
I think if we want our users to understand abstraction we should explain things properly. A lot of the misconceptions about how OCaml works and why certain features are not possible comes from not appreciating this point. |
Comment author: @alainfrisch
I agree with Leo here. It is worth specifying how compatibility is defined and explaining its purpose. |
Comment author: @garrigue Indeed, we may need more details about the compatibility relation, as it may be surprising at times. And I see no problem in adding some extra information to types to allow to distinguish them once abstracted. For soundness I suppose that adding this information should only be allowed when defining a type (i.e. in a .ml, without a manifest type), and checked by module subtyping. Since this name is supposed to be global, some kind of path would make sense, but of course it must be clear that it is not interpreted in any way (so a string makes sense too, but those not feel very much part of the language) However, I wanted to remind all that while not knowing of an incompatibility may cause spurious warnings, and force to add inefficient extra cases as a result, the other problem with abstraction, i.e. the loss of injectivity, is much worse, as it means you are also loosing positive equations, that you may need for typing. If we attempt to solve compatibility, we should solve injectivity too. |
Comment author: @alainfrisch
This depends of course on the typical use case. I consider warnings on non-exhaustive matches as actual type errors, and having to add useless extra cases to get rid of them somehow reduces the interest of using GADTs in some cases. Anyway, do you have ideas that would solve both problems at the same time? If not, would you be willing to accept a concrete proposal along the line of what is discussed here? |
Comment author: @garrigue The simplest would be to use attributes, but I see here some arguments against. |
Comment author: @damiendoligez I don't know about the other properties, but AFAICT the original problem can be solved from within the language with the addition of one small magic ingredient. Just write: module X : sig type rep type abstr = Blabla of rep end = struct type rep = ... type abstr = Blabla of rep end type _ t = | Int: int t | Float: float t | Abstr: X.abstr t let f Int = () But, you will say, this entails more allocations and indirections. That's where the magic ingredient comes in: an optimization to remove the constructor for single-constructor types, which lets users create new incompatible types with the same representation as existing types. The Blablaconstructor then exists only at compile-time, and matching it is a no-op. |
Comment author: @alainfrisch
This optimization would (in theory at least) break the FFI if it is applied automatically. It could be controlled by an attribute, but in this case we pay the "price" of the proposal (an extra attribute on the declaration), without the getting the full benefits (the "client" code is made more complex because of the constructor). Or we make the optimization automatic and we document it in the FFI (the impact on existing bindings is probably minor ). |
Comment author: @damiendoligez The optimization in question was added to 4.04 because it is also useful for other things. It gives a workaround for the present problem, but of course not a perfect solution. |
This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc. |
See ocaml/RFCs#4 and #9042 for a potential solution. |
This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc. |
Original bug ID: 7028
Reporter: @alainfrisch
Status: acknowledged (set by @xavierleroy on 2015-11-15T17:17:53Z)
Resolution: open
Priority: normal
Severity: feature
Target version: later
Category: typing
Has duplicate: #7360
Monitored by: @gasche @yallop @hcarty
Bug description
It is well known that GADT and abstract types don't interact well w.r.t. exhaustiveness checking. A typical example:
Since the type checker has no way to know that X.t (which could come from an external unit) is different from int in all contexts, it cannot prove that the pattern matching is exhaustive.
This seriously limits the usefulness of GADT. Possible workarounds: (i) break abstraction; (ii) add impossible pattern matching cases. Note that (i) does not work for instance when the type is really implemented as an abstract type (to represent external data structures). I'd like to propose a pragmatic solution based on a softer version of abstraction breaking. Instead of exposing the concrete definition of the type, one could allow to attach an arbitrary string (or identifier) to any type definition without a manifest type (i.e. "fresh" concrete or abstract types, not re-exported types or pure abbreviations). This string could be forgotten in the interface but not changed or added. This provides the guarantee that if two types have such an attached string, and the two strings are different, it is safe to assume that the two types are non-compatible (i.e. different in any context). Built-in types would come with such internal strings.
I don't have a concrete proposal for the syntax, but conceptually, this would give:
The programmer would be responsible to use unique enough strings if they need to. Using the same string for two types does not break safety, it only prevents the type-checker from proving exhaustivity of some GADT patterns.
A functor could even specify that some abstract types in its argument must have a specific string, allowing to use a GADT as above in the body of the functor (exposing the concrete structure is not an option here).
The text was updated successfully, but these errors were encountered: