Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert static initializers of PyTypeObjects to use C99 Designated (named) initializers #127679

Open
markshannon opened this issue Dec 6, 2024 · 13 comments
Labels
extension-modules C modules in the Modules dir interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@markshannon
Copy link
Member

markshannon commented Dec 6, 2024

Named initializer are both easier to read and less error prone. There is AFAICT no downside to using them.

However, we have been historically reluctant to change code for purely maintenance. Personally, I think this leads to poorer code quality, but others value cleaner code history more.

In this case I think the readability and correctness advantages are such that we should make this change and accept the inconvenience in version control history.

As a motivating example, take the bool type, PyBool_Type:

PyTypeObject PyBool_Type = {
    PyVarObject_HEAD_INIT(&PyType_Type, 0)
    "bool",
    offsetof(struct _longobject, long_value.ob_digit),  /* tp_basicsize */
    sizeof(digit),                              /* tp_itemsize */
    bool_dealloc,                               /* tp_dealloc */
    0,                                          /* tp_vectorcall_offset */
    0,                                          /* tp_getattr */
    0,                                          /* tp_setattr */
    0,                                          /* tp_as_async */
    bool_repr,                                  /* tp_repr */
    &bool_as_number,                            /* tp_as_number */
    0,                                          /* tp_as_sequence */
    0,                                          /* tp_as_mapping */
    0,                                          /* tp_hash */
    0,                                          /* tp_call */
    0,                                          /* tp_str */
    0,                                          /* tp_getattro */
    0,                                          /* tp_setattro */
    0,                                          /* tp_as_buffer */
    Py_TPFLAGS_DEFAULT,                         /* tp_flags */
    bool_doc,                                   /* tp_doc */
    0,                                          /* tp_traverse */
    0,                                          /* tp_clear */
    0,                                          /* tp_richcompare */
    0,                                          /* tp_weaklistoffset */
    0,                                          /* tp_iter */
    0,                                          /* tp_iternext */
    0,                                          /* tp_methods */
    0,                                          /* tp_members */
    0,                                          /* tp_getset */
    &PyLong_Type,                               /* tp_base */
    0,                                          /* tp_dict */
    0,                                          /* tp_descr_get */
    0,                                          /* tp_descr_set */
    0,                                          /* tp_dictoffset */
    0,                                          /* tp_init */
    0,                                          /* tp_alloc */
    bool_new,                                   /* tp_new */
    .tp_vectorcall = bool_vectorcall,
};

There are a lot of zeros, and a lot of comments to make sure we have the right number of zeros. If one is missing, things go wrong.
Contrast that with the C99 version:

PyTypeObject PyBool_Type = {
    PyVarObject_HEAD_INIT(&PyType_Type, 0)
    .tp_name = "bool",
    .tp_basicsize = offsetof(struct _longobject, long_value.ob_digit),
    .tp_itemsize = sizeof(digit),
    .tp_flags = Py_TPFLAGS_DEFAULT,

    .tp_as_number  = &bool_as_number, 
    .tp_base = &PyLong_Type, 
    .tp_dealloc  = bool_dealloc,  
    .tp_doc = bool_doc,
    .tp_new = bool_new,
    .tp_repr = bool_repr,
    .tp_vectorcall = bool_vectorcall,
};

The code is self describing (no need for comments naming the fields) and is much less error prone. The fields can listed in any order, in this case with the core field first, then the optional fields alphabetically.

@erlend-aasland
Copy link
Contributor

I've been using the C99 named initializers consistently in all my module isolation changes (PyType_Spec structs, PyModuleDef struct, etc.). +1 for more readable and maintainable code.

@picnixz
Copy link
Contributor

picnixz commented Dec 6, 2024

  • Question 1: which fields should be top-level? (for contributors, this could be good to know in advance)
  • Question 2: do you want one huge PR for all types?
  • Question 3: do you want a comment separating required fields from optional ones?

@picnixz picnixz added type-feature A feature request or enhancement interpreter-core (Objects, Python, Grammar, and Parser dirs) extension-modules C modules in the Modules dir labels Dec 6, 2024
@erlend-aasland
Copy link
Contributor

META: this would have been a perfect fit for a type-refactoring label (we've talked about such a label before)

@markshannon
Copy link
Member Author

TBH, I was just using that layout as an example of how C99 can lead to more readable code.
I don't have strong opinions on what the order should be. The important thing is that changing the order doesn't break code.

@picnixz
Copy link
Contributor

picnixz commented Dec 6, 2024

Personally, I like when the code looks uniform and nice (call me a purist if you want). I think separating required from optional fields is good but sometimes, some fields are required (e.g., tp_traverse when supporting the GC protocol).

So it's good to know that by just looking at the code instead of rediving into the docs and the specs.

@erlend-aasland
Copy link
Contributor

I prefer separating stuff like the number protocol and the sequence protocol. For tp_traverse, I'd stick to just placing it near tp_free and tp_clear.

@erlend-aasland
Copy link
Contributor

Question 2: do you want one huge PR for all types?

My 2 øre: I'd split this up in one PR for each dir (most of these static types are found in Python/ or Objects/).

@serhiy-storchaka
Copy link
Member

I was reluctant to do such change, but I think that earlier or later we should do this. The end result is not just more readable, it is more greppable, and I often search in the code. But please open a topic on d.p.o and wait for a week. Otherwise, it could lead to a post-merge uproar, and possibly a rollback.

As for the order, just use the current order (possibly with few exceptions if there are reasons). One huge PR is okay. If you can make it (half-)automatically generated to reduce human factor -- even better.

@ericvsmith
Copy link
Member

I agree using named initializers would be a readability, grepability, and maintenance improvement.

@encukou
Copy link
Member

encukou commented Dec 10, 2024

I think it's better to only change code when you're touching it for other reasons.

But, especially if this is important enough that it should be done now, could you add it to PEP 7 first?

@picnixz
Copy link
Contributor

picnixz commented Dec 25, 2024

I found some motivation for fixing #111178, and this leads me to changing a lot of code where I can encounter static exported types (e.g.: #128241 (comment)). Should I still wait for a DPO thread and/or PEP-7 update or can I make the changes in those files (only if I touch the structures)?

@wrongnull
Copy link
Contributor

I would like to send a pr for this if you OK with it @picnixz

@picnixz
Copy link
Contributor

picnixz commented Dec 25, 2024

I think I'll wait for a consensus to be reached or at least for the following points to be addressed:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
extension-modules C modules in the Modules dir interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

7 participants