-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cuda
: add moments
#3516
cuda
: add moments
#3516
Conversation
55a2fb5
to
ba4551c
Compare
@cudawarped Thanks a lot for your efforts. I have performed preliminary tests, and it seems satisfying my use case. I might get back to you later on with more details or comments once I get better chance to dig on this further. @alalek @asmorkalov Can you please consider this PR? |
@cudawarped Perhaps an idea for a follow PR, but I wonder if we can have something powerful around binary images. For example, with a multi-labels image, maybe we can compute moments of all shapes?
Actually, we already have If you think this deserves pushing for, then maybe I can open an issue, and tag some people. |
@cv3d I agree but it usually envolves several steps making it quite slow, i.e.
NPP has serveral routines to do this although I have found them to be buggy and very slow plus you need to use their slow labeling routine first before you can use CompressedMarkerLabelsInfo. It would be interesting but unfortunately I don't have time in the near future to look at this. |
@alalek @asmorkalov This PR is ready for a while now. |
@asmorkalov Thank you for running the CI, sorry it looks like I've left the wrong sanity check in for the spatial moments test I'll address this when I can. |
@asmorkalov Fixed the sanity check, ready for your review 🤞 |
@asmorkalov Should this be included in 4.9.0? |
I tested the PR locally with CUDA 11.8 and GF 1080. Python test fails with error:
|
fe95e09
to
eca7f1c
Compare
@asmorkalov Should be fixed now, I changed from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
As requested by @cv3d this PR continues the work of adding
cuda::moments
started in #3500.This PR introduces two functions for calculating image moments on the device, one asynchronous (
cuda::spatialMoments
) which only calculates the spatial moments and leaves the result on the device and one synchronous (cuda::moments
) which calculates the spatial moments on the device, downloads the result to the host and then usescv::Moments()
to calculate the remaining centralized and normalized image moments.The existing
cv::moments()
function returns the result in double precision due to the magnitude of the calculated image moments. Unfortunately double precision performance on Nvidia GPU's is at least 32x (depending on the generation) slower than single precision making the perfomance of this asynchrounous device side version only marginally (~3x depending on CPU/GPU) faster than its CPU counterpart. As a result the user can increase the performance by choosing a single precision result, and/or reducing the number of spatial image moments to calculate.To give an indication of the performace resulting from the various parameter combinations, the table below has some preliminary timings of the CPU (i72700H) version vs the kernel execution time on a RTX 3070 Ti Laptop GPU. The source image was uchar and contained a circle in the centre of radius 0.9*width.
binaryImage == false
binaryImage == true
Note:
binaryImage == false
as the performace is not affected by this flag, unlike on the CPU where the performance withbinaryImage == true
is significantly slower.binaryImage == false
) with the kernel execution time when calculating all the spatial image moments.@cv3d I hope this is what you wanted? Can you please take a look and let me know if this works correctly on the data you are using? Additionally if you have any comments suggestions please let me know.
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.