need to route ufunc signatures we do not have hook for into threader #12

Open
tdimitri opened this issue Sep 29, 2020 · 1 comment
Labels
enhancement New feature or request

Comments

@tdimitri
Collaborator

We can thread ufuncs we do not understand.
For a binary_reduce on a large array, we can divide the work into chunks and assign each chunk to a thread.
Each work item would write its partial result to a slot in a separate output array (allocated on the fly).
That output array can then be sent back to the binary_reduce loop for the final calculation. For example, each thread calculates the sum of its chunk, then the final calculation does the sum of sums.
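
A minimal Python sketch of the chunked reduce described above, using only the standard library. `threaded_reduce` and `chunk_bounds` are hypothetical names for illustration, not part of this project or NumPy. The sketch assumes the operation is associative (sum, max and so on), since otherwise the chunked result can differ from a single pass. Pure-Python threads will not run faster here because of the GIL; the point is only the structure.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator


def chunk_bounds(n, n_chunks):
    """Split range(n) into at most n_chunks contiguous, non-empty (start, stop) pairs."""
    if n == 0:
        return []
    n_chunks = max(1, min(n_chunks, n))
    step = -(-n // n_chunks)  # ceiling division
    return [(start, min(start + step, n)) for start in range(0, n, step)]


def threaded_reduce(op, data, n_threads=4):
    """Reduce `data` with the opaque binary `op`, one chunk per work item.

    Assumption: `op` is associative, so reducing the partial results
    gives the same answer as one left-to-right pass.
    """
    if not data:
        raise ValueError("cannot reduce an empty sequence")
    bounds = chunk_bounds(len(data), n_threads)
    # Output array allocated on the fly: one slot per work item.
    partials = [None] * len(bounds)

    def work(slot):
        start, stop = bounds[slot]
        partials[slot] = reduce(op, data[start:stop])

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # list() waits for every work item and re-raises worker exceptions.
        list(pool.map(work, range(len(bounds))))

    # Final calculation: send the partials back through the same reduce.
    return reduce(op, partials)


total = threaded_reduce(operator.add, list(range(1, 101)))
```

Each work item only touches its own slot in `partials`, so no locking is needed between threads.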

For operations other than binary_reduce on large arrays, we can divide up the work as normal (for both binary and unary ufuncs).
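
For the elementwise case, a rough standard-library sketch of "divide up the work as normal" might look like the following. `threaded_elementwise` is a hypothetical helper, not an API of this project; it treats the operation as an opaque callable and works for one input (unary) or two (binary). As with any pure-Python threading, this shows the partitioning only, not a speedup.

```python
from concurrent.futures import ThreadPoolExecutor
import operator


def threaded_elementwise(op, *inputs, n_threads=4):
    """Apply the opaque `op` elementwise, one contiguous chunk per work item."""
    n = len(inputs[0])
    if any(len(x) != n for x in inputs):
        raise ValueError("all inputs must have the same length")
    out = [None] * n
    if n == 0:
        return out
    step = -(-n // n_threads)  # ceiling division: chunk length

    def work(start):
        stop = min(start + step, n)
        # Each work item writes only to its own slice of `out`.
        for i in range(start, stop):
            out[i] = op(*(x[i] for x in inputs))

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        list(pool.map(work, range(0, n, step)))
    return out


negated = threaded_elementwise(operator.neg, [1, 2, 3, 4, 5])
summed = threaded_elementwise(operator.add, [1, 2, 3], [10, 20, 30], n_threads=2)
```

Unlike the reduce case, no second pass is needed, because the chunks write to disjoint ranges of the output.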

@tdimitri tdimitri added the enhancement New feature or request label Sep 29, 2020
@tdimitri tdimitri self-assigned this Sep 29, 2020
@mattip
Collaborator

mattip commented Sep 29, 2020

> we can thread ufuncs we do not understand.

We can reuse the original loop pointer that PyUFunc_ReplaceLoopBySignature hands back and call it from our loop override. This gives us NumPy's SSE/AVX-optimized loop without doing our own CPU detection, since NumPy already uses AVX for many ufuncs.
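
To make the idea concrete, here is a toy Python model of it, not the real C API. `ToyUFunc`, `replace_loop`, `add_loop` and `install_threaded_loop` are all hypothetical stand-ins: `replace_loop` only imitates how the C call returns the previous loop, and `add_loop` plays the role of NumPy's own optimized inner loop. The override saves the original loop and calls it once per chunk.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyUFunc:
    """Hypothetical stand-in for a ufunc: maps a type signature to an inner loop."""

    def __init__(self, loops):
        self.loops = dict(loops)

    def replace_loop(self, signature, new_loop):
        """Install new_loop and return the previous loop for that signature."""
        old_loop = self.loops[signature]
        self.loops[signature] = new_loop
        return old_loop

    def __call__(self, signature, a, b):
        out = [None] * len(a)
        self.loops[signature](a, b, out, 0, len(a))
        return out


def add_loop(a, b, out, start, stop):
    """Stand-in for the library's own (possibly SIMD-optimized) inner loop."""
    for i in range(start, stop):
        out[i] = a[i] + b[i]


def install_threaded_loop(ufunc, signature, n_threads=4):
    """Replace a loop with a threaded override that delegates to the original."""
    saved = {}  # filled in after the swap, read by the override at call time

    def override(a, b, out, start, stop):
        n = stop - start
        if n <= 0:
            return
        step = -(-n // n_threads)  # ceiling division: chunk length

        def work(lo):
            # Reuse the saved original loop on this chunk only.
            saved["loop"](a, b, out, lo, min(lo + step, stop))

        with ThreadPoolExecutor(max_workers=n_threads) as pool:
            list(pool.map(work, range(start, stop, step)))

    saved["loop"] = ufunc.replace_loop(signature, override)
    return saved["loop"]


toy_add = ToyUFunc({"ll->l": add_loop})
original = install_threaded_loop(toy_add, "ll->l", n_threads=3)
result = toy_add("ll->l", [1, 2, 3, 4, 5, 6, 7], [10] * 7)
```

The design point is that the override never needs to know what the original loop computes; it only decides how to split the range.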
