Adds load balanced kernel for point-point, point-linestring and linestring-linestring #1144
base: branch-23.08
…ti-geometry based
…y multigeometry test case.
…to fix/segment_iterator
…into improvement/load_balanced_distance_kernel
…m:isVoid/cuspatial into improvement/load_balanced_distance_kernel
auto aggregate = BlockReduce(temp_storage).Reduce(partial, cub::Min());

// atomic with leading thread
if (cooperative_groups::this_thread_block().thread_rank() == 0)
Is there any advantage to this over `if (threadIdx.x == 0)`? What code does this produce? It's impossible for `threadIdx.x == 0` to have exited before any other threads in its block in this code.
I cannot think of any advantage compared to `threadIdx.x == 0`. I had originally hoped to write a modularized kernel with cg (cooperative groups), but later reverted. This is probably a remnant from prototyping.
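For reference, here is a small host-side sketch of why the two conditions pick the same thread in this kernel. It assumes the block is launched one-dimensionally (as the scalar `threads_per_block` suggests) and that `thread_rank()` is the usual x-fastest linearization of `threadIdx`. `Dim3` and `linear_thread_rank` are hypothetical stand-ins so the arithmetic compiles without CUDA; they are not part of cuSpatial or cooperative groups.

```cpp
#include <cassert>

// Hypothetical stand-in for CUDA's dim3 / threadIdx, so this compiles as plain C++.
struct Dim3 {
  unsigned x, y, z;
};

// Linearized rank of a thread within its block, x-fastest:
//   rank = z * (dim.x * dim.y) + y * dim.x + x
// This models (as an assumption) what thread_block::thread_rank() returns.
constexpr unsigned linear_thread_rank(Dim3 idx, Dim3 dim)
{
  return (idx.z * dim.y + idx.y) * dim.x + idx.x;
}

// Counts how many threads in a block satisfy `idx.x == 0`.
// For a 1D block this is exactly one thread; for a 2D/3D block it is dim.y * dim.z.
inline unsigned threads_with_x_zero(Dim3 dim)
{
  unsigned count = 0;
  for (unsigned z = 0; z < dim.z; ++z)
    for (unsigned y = 0; y < dim.y; ++y)
      for (unsigned x = 0; x < dim.x; ++x)
        if (x == 0) ++count;
  return count;
}
```

With a 1D block of 256 threads, rank and `x` coincide, so `thread_rank() == 0` and `threadIdx.x == 0` select the same single thread. The two only diverge if the kernel were ever launched with a multi-dimensional block, where several threads have `x == 0` but only one has rank 0.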
std::numeric_limits<T>::max());

std::size_t constexpr threads_per_block = 256;
std::size_t const num_blocks =
  (multilinestrings1.num_points() + threads_per_block - 1) / threads_per_block;
std::size_t num_threads = 1e8;
Rather than hard coding, this may be a job for the occupancy API...
https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__OCCUPANCY.html
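One way this could look is sketched below, host-side arithmetic only. It assumes the kernel uses a grid-stride loop, so the grid only needs enough blocks to fill the device rather than one thread per item. The inputs `blocks_per_sm` and `num_sms` are placeholders for values that would come from `cudaOccupancyMaxActiveBlocksPerMultiprocessor` and the device's multiprocessor count. The function names are hypothetical and not part of this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Ceiling division: blocks needed to give each of `n` items its own thread.
constexpr std::size_t blocks_to_cover(std::size_t n, std::size_t threads_per_block)
{
  return (n + threads_per_block - 1) / threads_per_block;
}

// Grid size for a grid-stride kernel (an assumption about the kernel's structure):
// launch enough blocks to fill the device, but never more than the work requires.
//   blocks_per_sm: resident blocks per SM for this kernel and block size; in real code
//                  this would be queried via cudaOccupancyMaxActiveBlocksPerMultiprocessor.
//   num_sms:       number of multiprocessors on the device.
inline std::size_t occupancy_grid_size(std::size_t n,
                                       std::size_t threads_per_block,
                                       std::size_t blocks_per_sm,
                                       std::size_t num_sms)
{
  std::size_t const resident = blocks_per_sm * num_sms;
  std::size_t const needed   = blocks_to_cover(n, threads_per_block);
  // Clamp to at least one block so an empty input does not yield an invalid launch.
  return std::max<std::size_t>(1, std::min(resident, needed));
}
```

For example, with illustrative (made-up) device numbers of 8 resident blocks per SM and 80 SMs, `occupancy_grid_size(100000000, 256, 8, 80)` gives 640 blocks, while a small input of 1000 items gives 4. The thread count then follows from the device and the kernel instead of a hard-coded `1e8`.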
Description
Placeholder for PRs to close during the burndown period.
Closes #1061
Checklist