Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(Camlinternal)Uniform_array: minimal support for uniform arrays #37

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

gasche
Copy link
Member

@gasche gasche commented Mar 3, 2024

This RFC proposes to add minimal support in the OCaml implementation for uniform arrays, which always use the standard Array_tag representation and never Double_array_tag.

The compiler would offer specialized built-in and C primitives to work with uniform arrays, and a minimal CamlinternalUniform_array module in the standard library exposing these primitives.

This would make it easier to write some unsafe idioms that enforce uniform arrays in more complex ways today, simplifying the code of Domain.DLS, the unboxed implementation of Dynarray, and the Uniform_array module of Jane Street's Base library and making it easier to reason about their correctness.

Full version, rendered.

@lthls
Copy link

lthls commented Mar 3, 2024

I'd suggest to look into CamlinternalOO if you want to see some existing uses of uniform arrays in the standard library.
Given that this code is crucial to the correctness of objects and classes, it's rather safe to hope that optimisers will not do anything funny about it.
(There are a lot of interesting pieces of code in this file. For instance:

type item = DummyA | DummyB | DummyC of int
let _ = [DummyA; DummyB; DummyC 0] (* to avoid warnings *)

let dummy_item = (Obj.magic () : item)

Why not just dummy_item = DummyA ? I guess we'll never know.)

@gasche
Copy link
Member Author

gasche commented Mar 3, 2024

Thanks for the pointer. From looking at it briefly, I cannot tell whether I would recommend trying to use explicit uniform arrays for this purpose. The "methods" array seems to be a bytecode array with a mix of methods and arguments (so indeed uniform arrays would be a good choice here). The obj type itself is advertised as an array internally, but in fact uses a non-0 tag, so it is not obviously a uniform array -- or at least it falls outside the scope I would naturally consider.

Why not just dummy_item = DummyA ? I guess we'll never know.

Git suggests that this is a historic artifact, before those constructors were introduced item was an abstract type, so it was natural to inhabit it with Obj.magic (). When Jacques moved to a concrete definition in 6a940ef65d7b70f94e221f4b6731b4ed7a9c410e, he could have used DummyA at this point but he did not change the definition.

(What I cannot tell is what DummyB is for. DummyC looks like it could have been vaguely used for dummy_met, but in the end the author decided to use an atom to save an allocation or something.)

@lthls
Copy link

lthls commented Mar 4, 2024

Coming back to the proposal:
I believe that you can get a correct implementation by using Obj.new_block for Uniform_array.make, and regular array functions for all the rest. If you want an extra bit of performance (or, more likely, are annoyed about the compiler not noticing that field accesses cannot allocate), you can expose the Pfield_computed and Psetfield_computed Lambda primitives using new %-prefixed external declarations and use that for get and set.
One issue is that initialising an array now involves a loop calling caml_modify on each iteration; if this is an issue for you you will have to go back to C, although you can probably refactor caml_make_vect to expose a version that does not perform the flat float array optimisation.
So, very little new code in the runtime and the compiler.

I don't see the point on insisting on a particular tag: Array_tag doesn't exist, the tag of an array has never been specified (except for flat float arrays), so I think we should allow any tag up to Obj.last_non_constant_constructor_tag (that doesn't include the object tag used in CamlinternalOO but that's fine with me).

@gasche
Copy link
Member Author

gasche commented Mar 4, 2024

I believe that you can get a correct implementation by using Obj.new_block for Uniform_array.make, and regular array functions for all the rest. If you want an extra bit of performance (or, more likely, are annoyed about the compiler not noticing that field accesses cannot allocate), you can expose the Pfield_computed and Psetfield_computed Lambda primitives using new %-prefixed external declarations and use that for get and set.

Currently translprim.ml has:

    "%array_length", Primitive ((Parraylength gen_array_kind), 1);
    "%array_safe_get", Primitive ((Parrayrefs gen_array_kind), 2);
    "%array_safe_set", Primitive ((Parraysets gen_array_kind), 3);
    "%array_unsafe_get", Primitive ((Parrayrefu gen_array_kind), 2);
    "%array_unsafe_set", Primitive ((Parraysetu gen_array_kind), 3);

    "%floatarray_length", Primitive ((Parraylength Pfloatarray), 1);
    "%floatarray_safe_get", Primitive ((Parrayrefs Pfloatarray), 2);
    "%floatarray_safe_set", Primitive ((Parraysets Pfloatarray), 3);
    "%floatarray_unsafe_get", Primitive ((Parrayrefu Pfloatarray), 2);
    "%floatarray_unsafe_set", Primitive ((Parraysetu Pfloatarray), 3);

I would propose to add just 5 extra lines of code:

    "%uniform_array_length", Primitive ((Parraylength Paddrarray), 1);
    "%uniform_array_safe_get", Primitive ((Parrayrefs Paddrarray), 2);
    "%uniform_array_safe_set", Primitive ((Parraysets Paddrarray), 3);
    "%uniform_array_unsafe_get", Primitive ((Parrayrefu Paddrarray), 2);
    "%uniform_array_unsafe_set", Primitive ((Parraysetu Paddrarray), 3);

One issue is that initialising an array now involves a loop calling caml_modify on each iteration; if this is an issue for you you will have to go back to C, although you can probably refactor caml_make_vect to expose a version that does not perform the flat float array optimisation. [...] So, very little new code in the runtime and the compiler.

Yes, this is the idea. My idea would be to specialize primitives to carve out less-smart versions, so the amount of new code should be very small in any case.

I don't see the point on insisting on a particular tag: Array_tag doesn't exist, the tag of an array has never been specified (except for flat float arrays), so I think we should allow any tag up to Obj.last_non_constant_constructor_tag (that doesn't include the object tag used in CamlinternalOO but that's fine with me).

For now I am not really interested in extending the proposal to explicitly cover non-standard tag choices, which sound a bit dubious to me. If the implementation naturally allows them I will look the other way.

@lthls
Copy link

lthls commented Mar 5, 2024

I would propose to add just 5 extra lines of code:

I don't think you need them. On all platforms for which we support the native compiler, Parraylength is compiled the exact same way no matter the kind. The unsafe accessors are exactly Pfield_computed and Psetfield_computed (which we should expose so that CamlinternalOO uses them instead of array accesses). And it's only very slightly less efficient to implement the safe versions in OCaml compared to using the primitives.

But I guess that shows the difference in how we see this: you seem to want a full-fledged data structure with compiler support, I just want a decent way to manipulate dynamically sized blocks full of values.

@OlivierNicole
Copy link

This looks very useful to have to increase legibility and safety in the important use cases you mention. I can also imagine cases when I’d want to enforce absence of allocation in some hot paths.

@gasche
Copy link
Member Author

gasche commented May 13, 2024

Coq/Rocq also includes a small module of uniform arrays in its codebase, as part of the definition of its "persistent arrays": https://github.com/coq/coq/blob/d7d392191a367839b0f5a7772e48b8f24e9f1b3e/kernel/parray.ml . My understanding is that the need for uniformity arises from the compilation scheme of {vm,native}_compute, which introduces arrays of Coq values, which include native floats at a type that can contain non-float values, the "accumulator" values.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants