need to route ufunc signatures we do not have hook for into threader #12

tdimitri · 2020-09-29T13:49:23Z

We can thread ufuncs we do not understand.
For a binary_reduce on a large array, we can divide the work into chunks and assign each chunk to a thread.
Each work item would write its result to a slot in a separate output array (allocated on the fly).
That output array can then be sent back through the binary_reduce loop for the final calculation (for example, each thread calculates a partial sum, then the final calculation takes the sum of those sums).
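
A rough Python-level sketch of the chunked reduce described above, for illustration only. The helper name `threaded_reduce` and the `n_chunks` parameter are hypothetical and not part of this project; the real threader would do this in C. It assumes a 1-D input and an associative ufunc (such as `np.add` or `np.maximum`), since splitting a reduction this way is only valid when the grouping of operands does not change the result.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def threaded_reduce(ufunc, arr, n_chunks=4):
    """Reduce a 1-D array with an associative binary ufunc, one chunk per task.

    Hypothetical helper for illustration; not the project's implementation.
    """
    # Split the input into roughly equal work chunks and drop empty ones.
    chunks = [c for c in np.array_split(arr, n_chunks) if c.size]

    # Output array allocated on the fly: one slot per work chunk.
    partials = np.empty(len(chunks), dtype=arr.dtype)

    def work(i):
        # Each work item reduces its own chunk and writes to its own slot.
        partials[i] = ufunc.reduce(chunks[i])

    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(work, range(len(chunks))))

    # Final pass: send the partial results back through the same reduce.
    return ufunc.reduce(partials)


if __name__ == "__main__":
    data = np.arange(1_000_000, dtype=np.int64)
    print(threaded_reduce(np.add, data) == data.sum())
```

Because every task writes to a distinct slot of `partials`, no locking is needed between threads. For floating-point sums the result can differ slightly from a single-pass reduce, since the order of additions changes.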

For operations other than binary_reduce on large arrays, we can divide up the work as normal (for both binary and unary ufuncs).
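
The element-wise case could look something like the following Python sketch. Again, `threaded_elementwise` and `n_chunks` are hypothetical names used only for illustration, and the sketch assumes 1-D inputs of equal length and a ufunc with a single output. Each task calls the ufunc on matching slices and writes directly into its slice of a shared output array.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def threaded_elementwise(ufunc, *inputs, n_chunks=4):
    """Apply a unary or binary ufunc element-wise, one slice per task.

    Hypothetical helper for illustration; not the project's implementation.
    """
    inputs = [np.asarray(a) for a in inputs]
    n = inputs[0].shape[0]

    # Probe with a length-1 slice so the ufunc itself picks the output dtype.
    out_dtype = ufunc(*[a[:1] for a in inputs]).dtype
    out = np.empty(n, dtype=out_dtype)

    # Chunk boundaries computed with integer arithmetic: bounds[0] == 0, bounds[-1] == n.
    bounds = [n * i // n_chunks for i in range(n_chunks + 1)]

    def work(i):
        lo, hi = bounds[i], bounds[i + 1]
        # Slices are views, so the ufunc writes straight into the shared output.
        ufunc(*[a[lo:hi] for a in inputs], out=out[lo:hi])

    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(work, range(n_chunks)))

    return out


if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 100_001)
    print(np.allclose(threaded_elementwise(np.sqrt, x), np.sqrt(x)))
```

The slices do not overlap, so the tasks never write to the same memory. Whether this actually runs in parallel at the Python level depends on NumPy releasing the GIL inside the ufunc loop, which is an assumption here.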

mattip · 2020-09-29T13:58:08Z

> we can thread ufuncs we do not understand.

We can reuse the pointer we pull out of PyUFunc_ReplaceLoopBySignature and plug it back into our loop override. This has the advantage of using the SSE/AVX-optimized loop without needing our own CPU detection, since NumPy already uses AVX for many ufuncs.

tdimitri added the enhancement New feature or request label Sep 29, 2020

tdimitri self-assigned this Sep 29, 2020
