-
-
Notifications
You must be signed in to change notification settings - Fork 30.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UBSan: Calling a function through pointer to incorrect function type is undefined behavior #111178
Comments
See also http://maskray.me/blog/2022-12-18-control-flow-integrity#fsanitizefunction by @MaskRay for more background. |
This is one example where the recent trend of thinking we should use specific types on our C API is not compatible with parts of our design. Things like your proposed change are likely good. I'd like to avoid gaining new In this case it's odd that this counts as undefined behavior. Reality: A pointer is always going to be a pointer regardless of what type it points to - one machine word. The standard is deficient. |
note that the standard requires that all pointers are of the same finite size, including FPs |
That trend, as I understand it, is unrelated, or only tangentially related. It doesn't involve casting function types; and the discussions around it didn't involve type definition. Please don't conflate it with this issue. |
My comment is more about seeing a function using the specific type. I didn't go looking to see if it has always been that way or was a recent change. Per the source history, this one has always been defined using PyListObject. That it causes UBSAN issues is just something to be aware of if we do want to use specific types in more places. |
Regardless of pointers all being the same size, the compiler behavior concern might be more to do with the structure aliasing concept rather than the pointer itself. I'd have to ask llvm folks. The answer likely wouldn't change what we need to deal with. |
Poking around, we've got quite a few of these PyTypeObject function pointers that are cast in order to construct the structures with functions taking more specific types. The cleanup for clang-17's new UBSAN check will touch many files. (but is otherwise a pretty mechanical change - in most cases we can just change the parameter type to be generic and add a casting assignment to a local variable of the original name within the function to "launder" the type). |
Struct casting & aliasing rules say that you can safely cast When compilers start punishing this, it'll be bad news. Pretty much all types defined in C use this pattern – not just in CPython core, but in third-party extensions as well. It's even in the tutorial: https://docs.python.org/3/extending/newtypes_tutorial.html#adding-data-and-methods-to-the-basic-example
The overwhelming majority of the fixes should be simple. It's the sheer number of them that will require disproportionate review effort. |
I updated our Clang UBSAN buildbot last night, clang-17. So it fails the build on this issue now. =) https://buildbot.python.org/all/#/workers/2 |
Oh no, this issue is related to Control Flow Integrity (CFI). I think that this problem should be treated seriously, more and more companies are enforcing CFI. It's better if Python can be adapted to respect CFI as soon as possible. CFI even started to be implemented at the hardware (CPU) level. The Linux kernel does its best to implement CFI to harden the kernel. Recently, I added the // Cast an function to the PyCFunction type to use it with PyMethodDef.
//
// This macro can be used to prevent compiler warnings if the first parameter
// uses a different pointer type than PyObject* (ex: METH_VARARGS and METH_O
// calling conventions).
//
// The macro can also be used for METH_FASTCALL and METH_VARARGS|METH_KEYWORDS
// calling conventions to avoid compiler warnings because the function has more
// than 2 parameters. The macro first casts the function to the
// "void func(void)" type to prevent compiler warnings.
//
// If a function is declared with the METH_NOARGS calling convention, it must
// have 2 parameters. Since the second parameter is unused, Py_UNUSED() can be
// used to prevent a compiler warning. If the function has a single parameter,
// it triggers an undefined behavior when Python calls it with 2 parameters
// (bpo-33012).
#define _PyCFunction_CAST(func) \
_Py_CAST(PyCFunction, _Py_CAST(void(*)(void), (func))) The problem is that typedef PyObject *(*PyCFunction)(PyObject *, PyObject *); whereas Python uses calling convention with more or less parameters than I think that we should decide a global strategy for this problem, rather than trying the whack-a-mole game with various strategies. It's a problem as bad a Python 2 PyObject structure inheritance which was not compatible with strict aliasing. This problem was solved with a new API, especially this macro: #define PyObject_HEAD PyObject ob_base; |
I have a different experience with casting from/to |
I have opened a couple of PRs for a few examples of this issue. I intended to separate them by codeowner(s), if that is helpful. Some instances of this issue lead to far more errors than others; I am not aware how instances involving generated code should be handled: Lines 1013 to 1017 in d384813
|
Related to this point: how likely is it that instances of this issue will keep being introduced or reintroduced, at least without greater awareness and acceptance of the relevant C rules? How much does having CI run with |
It helps. Particularly once we've gotten to a stable point so that the ubsan build is not Red again. As people notice when things start failing after a change. I believe a lot of what we do within our C code for these design patterns is cargo cult from other existing examples. So if we've cleaned things up in a consistent manner, including the code generators, to do something in a "right" way, that'll naturally become what gets done in new code. |
…meters in slot typedefs table (GH-112742) In the slot typedefs table, the parameter of `destructor` and the first parameter of `traverseproc` should both be `PyObject *` rather than `void *`. Same for `inquiry`.
…` parameters in slot typedefs table (pythonGH-112742) In the slot typedefs table, the parameter of `destructor` and the first parameter of `traverseproc` should both be `PyObject *` rather than `void *`. Same for `inquiry`. (cherry picked from commit 00cce0f) Co-authored-by: Christopher Chavez <[email protected]>
…` parameters in slot typedefs table (pythonGH-112742) In the slot typedefs table, the parameter of `destructor` and the first parameter of `traverseproc` should both be `PyObject *` rather than `void *`. Same for `inquiry`. (cherry picked from commit 00cce0f) Co-authored-by: Christopher Chavez <[email protected]>
…r` parameters in slot typedefs table (GH-112742) (GH-112792) gh-111178: Docs: fix `traverseproc`, `inquiry`, and `destructor` parameters in slot typedefs table (GH-112742) In the slot typedefs table, the parameter of `destructor` and the first parameter of `traverseproc` should both be `PyObject *` rather than `void *`. Same for `inquiry`. (cherry picked from commit 00cce0f) Co-authored-by: Christopher Chavez <[email protected]>
…r` parameters in slot typedefs table (GH-112742) (GH-112793) gh-111178: Docs: fix `traverseproc`, `inquiry`, and `destructor` parameters in slot typedefs table (GH-112742) In the slot typedefs table, the parameter of `destructor` and the first parameter of `traverseproc` should both be `PyObject *` rather than `void *`. Same for `inquiry`. (cherry picked from commit 00cce0f) Co-authored-by: Christopher Chavez <[email protected]>
Bug report
Bug description:
UBSan (UndefinedBehaviorSanitizer) in LLVM.org Clang 17 makes
-fsanitize=function
available for C; previously, it was only for C++. (So it may also be made available in future Apple Xcode clang and GCC.) By default, it is implied by -fsanitize=undefined (which is what./configure --with-undefined-behavior-sanitizer
uses), but it can be disabled using -fno-sanitize=function.For a project such as CPython, which has long relied on function pointers for callbacks, yet seems to have only required that callbacks behave as expected under typical ABI calling conventions, rather than more strictly be declared/defined as a type compatible with the function pointer they will be called as, this leads to numerous errors from UBSan.
Examples when starting Python REPL:
Example workaround for the first error and likely many others, where instead of casting functions to incompatible pointers, the functions use compatible signatures and cast their parameter(s):
In other cases, it may be less disruptive to introduce a wrapper function with the correct signature:
Likely instances of this can be found at compile time using e.g.
-Wcast-function-type
(although this emits false positives for when the function pointer is cast back to the correct type before called, and this warning is suppressed by intermediate casts through(void *)
):I would be interested in combing through and replacing similar instances. But I would not be surprised if sooner or later I encounter an instance that is won’t-fix because it involves a stable API, or if I am told that this problem should be ignored because fixing it is too disruptive or requires disproportionate review effort. I am not aware how immediate any danger is from optimizing compilers exploiting this type of undefined behavior.
CPython versions tested on:
CPython main branch
Operating systems tested on:
macOS
Written by @picnixz:
For detecting the UBSan failures, contributors may configure Python with:
The complete list of failures can be retrieved as follows:
Note that different builds should be configured in order to hunt all UBSan failures (e.g.,
--with-trace-refs
or--disable-gil
to expose conditional compiled code guarded by macros).Linked PRs
visitproc
callback functions properly and remove unnecessary casts in gcmodule.c #112687traverseproc
,inquiry
, anddestructor
parameters in slot typedefs table #112742traverseproc
,inquiry
, anddestructor
parameters in slot typedefs table (GH-112742) #112792traverseproc
,inquiry
, anddestructor
parameters in slot typedefs table (GH-112742) #112793longobject.c
#122972partialobject
#124733_elementtree.c
#127982gdbmobject
#128178Python/bltinmodule.c
#128235Objects/bytearrayobject.c
#128236Objects/bytesobject.c
#128237Modules/_bz2module.c
#128238Objects/capsule.c
#128239Objects/codeobject.c
#128240Objects/complexobject.c
#128241Python/context.c
#128242Modules/_csv.c
#128243Modules/curses*.c
#128244Objects/descrobject.c
#128245Objects/enumobject.c
#128246Python/hamt.c
#128247Modules/{blake2,md5,sha1,sha2,sha3}module.c
#128248Modules/socketmodule.c
#128249Modules/_sre/sre.c
#128250Objects/tupleobject.c
#128251Modules/zlibmodule.c
#128252Modules/_abc.c
#128253Python/traceback.c
#128259The text was updated successfully, but these errors were encountered: