Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Argument Clinic aware of possible UBSan failures #128180

Open
picnixz opened this issue Dec 22, 2024 · 3 comments
Open

Make Argument Clinic aware of possible UBSan failures #128180

picnixz opened this issue Dec 22, 2024 · 3 comments
Labels
topic-argument-clinic type-feature A feature request or enhancement

Comments

@picnixz
Copy link
Contributor

picnixz commented Dec 22, 2024

Feature or enhancement

A huge part of the CPython code base uses Argument Clinic for generating methods for classes or standalone functions. Generally, methods are then used as follows:

static PyMethodDef foo_methods[] = {
    _FOOMODULE_FOO_BAR_METHODDEF
    {NULL, NULL}
};

where each <MODULE>_<CLASS>_<METHOD>_METHODDEF is something like

PyDoc_STRVAR(_foomodule_Foo_bar__doc__,
"bar($self, /)\n"
"--\n"
"\n");

#define _FOOMODULE_FOO_BAR_METHODDEF 		\
    {										\
        "bar", 								\
        (PyCFunction)_foomodule_Foo_bar, 	\
        METH_NOARGS, 						\
        _foomodule_Foo_bar__doc__           \
    },

We also have

// @file Modules/foomodule.c

/*[clinic input]
_foomodule.Foo.bar

[clinic start generated code]*/

static PyObject *
_foomodule_Foo_bar_impl(FooObject *self)
/*[clinic end generated code: output=... input=...]*/
{
	Py_RETURN_NONE;
}

// @file Modules/clinic/foomodule.c.h

static PyObject *
_foomodule_Foo_bar_impl(FooObject *self);

static PyObject *
_foomodule_Foo_bar(FooObject *self, PyObject *Py_UNUSED(ignored))
{
    return _foomodule_Foo_bar_impl(self);
}

Now, in order to fix #111178, we generally convert non-AC clinic methods by changing FooObject * to PyObject * and doing a pointer cast in the function body. See #111178 for a wider discussion. The issue is that we cannot easily do the same for AC-generated code since the prototype is part of the auto-generated block.

We can do it if we specify self: self(type="PyObject *") instead of not specifying self in the AC input, but this is an overkill. Instead, AC should always generate two functions: an implementation function which takes a typed object (i.e., FooObject) and a public dispatcher which takes a PyObject *self and possibly unused arguments. Stated otherwise:

static PyObject *
-_foomodule_Foo_bar(FooObject *self, PyObject *Py_UNUSED(ignored))
+_foomodule_Foo_bar(PyObject *self, PyObject *Py_UNUSED(ignored))
{
-    return _foomodule_Foo_bar_impl(self);
+    return _foomodule_Foo_bar_impl((FooObject *)self);
}

I decided to open a separate issue to track this change orthogonally to the UBSan failures fixes that can be achieved without any AC changes. I hope that this will not deterioriate the performances of CPython too much.


Note: this does not only affect methods as above; this affects any method that is casted to PyCFunction or anything else with incompatible function pointer signatures.

@picnixz picnixz added type-feature A feature request or enhancement topic-argument-clinic labels Dec 22, 2024
@picnixz
Copy link
Contributor Author

picnixz commented Dec 26, 2024

cc @ZeroIntensity

@ZeroIntensity
Copy link
Member

Took a stab at this. For simple cases, all that needs to change is this property to always return PyObject *, and then the functionality to cast it is already implemented:

def parser_type(self) -> str:
assert self.type is not None
if self.function.kind in {METHOD_INIT, METHOD_NEW, STATIC_METHOD, CLASS_METHOD}:
tp, _ = correct_name_for_self(self.function)
return tp
return self.type

We'll still need to change some other things though, because some generated clinic code relies on specific structures (e.g. fields in PyTypeObject *). More importantly, we'll have to manually go through and audit where we need to change casts for areas that call clinic funtions manually, typically for locking. For example, I did this with the SSL thread safety fix:

if (_ssl__SSLSocket_owner_set(self, owner, NULL) == -1) {

Here, self is a PySSLSocket * rather than a PyObject *, so changing that would break this case. We'll probably need to deal with that on an agonizing case-by-case basis.

@picnixz
Copy link
Contributor Author

picnixz commented Dec 29, 2024

I'm not afraid of doing the case-by-case, considering this is what I've been doing for the rest. But thanks for digging into this, I haven't looked at clinic a lot!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic-argument-clinic type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

2 participants