RISC-V: added RVV 1.0 weekly build #165

mshabunin · 2024-03-29T16:16:09Z

Builds with clang (same as in precommit) and runs the tests on a CanMV-K230 board.

imgproc: fix unaligned memory access in filters and Gaussian blur #25364

* filter/SIMD: removed code that cast 8u pointers to int, which caused unaligned memory access on the RISC-V platform.
* GaussianBlur/fixed_point: replaced casts from s16 to u32 with union operations.

Performance comparison:
- [x] check performance on x86_64
  - (4 threads, `-DCPU_BASELINE=AVX2`, GCC 11.4, Ubuntu 22) - [report_imgproc_x86_64.ods](https://github.com/opencv/opencv/files/14904702/report_x86_64.ods)
- [x] check performance on AArch64
  - (4 cores of RK3588, GCC 11.4 aarch64, Raspbian) - [report_imgproc_aarch64.ods](https://github.com/opencv/opencv/files/14908437/report_aarch64.ods)

Note: for some reason my performance results are quite unstable; unaffected functions show both speedups and slowdowns in many cases. Filter2D and GaussianBlur seem to be OK.

Slightly related PR: opencv/ci-gha-workflow#165
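For readers unfamiliar with the class of bug described in #25364, here is a minimal sketch of the general technique. It is not the code from that PR. The helper names `loadU32` and `packS16Pair` are hypothetical, and the sketch uses `std::memcpy` as a portable stand-in for the union operations the PR mentions. Dereferencing a `uint8_t*` that was cast to `int*` can fault or be slow on targets without efficient unaligned loads, whereas a byte copy has no alignment requirement.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not an OpenCV API): read 4 bytes from an address
// that may not be 4-byte aligned. The risky alternative would be
// *reinterpret_cast<const uint32_t*>(p).
static inline uint32_t loadU32(const uint8_t* p)
{
    assert(p != nullptr);
    uint32_t v;
    std::memcpy(&v, p, sizeof(v));  // memcpy places no alignment demand on p
    return v;
}

// Hypothetical helper: combine two signed 16-bit values into one 32-bit
// word by copying bytes, instead of casting an int16_t* to uint32_t*.
static inline uint32_t packS16Pair(int16_t a, int16_t b)
{
    const int16_t pair[2] = { a, b };
    uint32_t v;
    static_assert(sizeof(pair) == sizeof(v), "size mismatch");
    std::memcpy(&v, pair, sizeof(v));
    return v;
}
```

Calling `loadU32(buf + 1)` on a byte buffer returns the four bytes starting at offset 1 in native byte order, regardless of how `buf` is aligned. Whether the compiler emits a single load or several byte loads depends on the target and its flags.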

mshabunin · 2024-04-09T20:57:03Z

Created two more PRs to fix tests on 4.x and 5.x branches.

Fix unaligned filters + increase test thresholds (5.x) #25379

Port of #25364 to 5.x + minor changes in 3d tests to pass on RISC-V platform

Failed tests:
```
[ RUN ] AP3P.ctheta1p_nan_23607
/home/ci/opencv/modules/3d/test/test_solvepnp_ransac.cpp:2320: Failure
Expected: (cvtest::norm(res.colRange(0, 2), expected, NORM_INF)) <= (3e-16), actual: 3.33067e-16 vs 3e-16
[ FAILED ] AP3P.ctheta1p_nan_23607 (1 ms)

[ RUN ] Rendering/RenderingTest.accuracy/4, where GetParam() = ((320, 240), Flat, CW, Color, CV_32F, CV_32S)
/home/ci/opencv/modules/3d/test/test_rendering.cpp:430: Failure
Expected: (normL2Depth) <= (normL2Threshold), actual: 0.00102317 vs 0.000989
[ FAILED ] Rendering/RenderingTest.accuracy/4, where GetParam() = ((320, 240), Flat, CW, Color, CV_32F, CV_32S) (22 ms)

[ RUN ] Rendering/RenderingTest.accuracy/5, where GetParam() = ((320, 240), Shaded, None, Color, CV_32F, CV_32S)
/home/ci/opencv/modules/3d/test/test_rendering.cpp:430: Failure
Expected: (normL2Depth) <= (normL2Threshold), actual: 0.00102317 vs 0.000989
[ FAILED ] Rendering/RenderingTest.accuracy/5, where GetParam() = ((320, 240), Shaded, None, Color, CV_32F, CV_32S) (22 ms)

[ RUN ] Rendering/RenderingTest.accuracy/8, where GetParam() = ((320, 240), Flat, CW, Clipping, CV_32F, CV_32S)
/home/ci/opencv/modules/3d/test/test_rendering.cpp:430: Failure
Expected: (normL2Depth) <= (normL2Threshold), actual: 0.00162132 vs 0.0016
[ FAILED ] Rendering/RenderingTest.accuracy/8, where GetParam() = ((320, 240), Flat, CW, Clipping, CV_32F, CV_32S) (22 ms)

[ RUN ] Rendering/RenderingTest.accuracy/9, where GetParam() = ((320, 240), Shaded, None, Clipping, CV_32F, CV_32S)
/home/ci/opencv/modules/3d/test/test_rendering.cpp:430: Failure
Expected: (normL2Depth) <= (normL2Threshold), actual: 0.000554117 vs 0.000544
[ FAILED ] Rendering/RenderingTest.accuracy/9, where GetParam() = ((320, 240), Shaded, None, Clipping, CV_32F, CV_32S) (27 ms)
```

Related CI PR: opencv/ci-gha-workflow#165
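To put the numbers in the log above in perspective, the following sketch measures how far each reported value overshoots its threshold. The helper names `inEpsilons` and `relativeExcess` are hypothetical and do not appear in the OpenCV test suite. The input values are copied from the failure messages.

```cpp
#include <limits>

// Hypothetical helper: express an absolute error as a multiple of the
// machine epsilon for double (about 2.22e-16).
static inline double inEpsilons(double err)
{
    return err / std::numeric_limits<double>::epsilon();
}

// Hypothetical helper: fraction by which `actual` exceeds `threshold`,
// e.g. 0.02 means 2% over.
static inline double relativeExcess(double actual, double threshold)
{
    return (actual - threshold) / threshold;
}
```

With these helpers, `inEpsilons(3.33067e-16)` is about 1.5, so the AP3P result sits roughly one and a half epsilons from the expected value while the old limit of 3e-16 allowed about 1.35. The rendering failures exceed their thresholds by about 1% to 3.5%, for example `relativeExcess(0.00102317, 0.000989)`. Differences of this size are consistent with platform-dependent floating-point rounding, though the log alone does not establish the cause.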

opencv-alalek

Well done 👍

mshabunin marked this pull request as ready for review March 29, 2024 16:17

mshabunin force-pushed the add-rvv10 branch 3 times, most recently from 9d8508b to fb6ea8c Compare March 29, 2024 19:05

mshabunin mentioned this pull request Mar 29, 2024

RISC-V: added rsync to image opencv-infrastructure/opencv-gha-dockerfile#36

Merged

mshabunin force-pushed the add-rvv10 branch from fb6ea8c to 38f49bf Compare March 30, 2024 08:43

mshabunin mentioned this pull request Apr 7, 2024

imgproc: fix unaligned memory access in filters and Gaussian blur opencv/opencv#25364

Merged

2 tasks

mshabunin force-pushed the add-rvv10 branch from 10cfcd6 to 2c01fa1 Compare April 9, 2024 16:54

This was referenced Apr 9, 2024

Fix unaligned filters + increase test thresholds (5.x) opencv/opencv#25379

Merged

calib3d: increased AP3P test threshold for RISC-V platform opencv/opencv#25380

Merged

mshabunin force-pushed the add-rvv10 branch from 2c01fa1 to 85ebdeb Compare April 10, 2024 08:44

mshabunin mentioned this pull request Apr 11, 2024

imgproc: add 512mb tag for FindContours.accuracy test opencv/opencv#25396

Merged

mshabunin force-pushed the add-rvv10 branch from 85ebdeb to b6ba7e2 Compare April 11, 2024 15:59

RISC-V: added RVV 1.0 weekly build

dd4ad4e

mshabunin force-pushed the add-rvv10 branch from b6ba7e2 to dd4ad4e Compare April 11, 2024 16:02

opencv-alalek approved these changes Apr 12, 2024

opencv-alalek self-assigned this Apr 12, 2024

opencv-alalek added the enhancement New feature or request label Apr 12, 2024

opencv-alalek merged commit eac7a8a into opencv:main Apr 12, 2024
4 checks passed

mshabunin deleted the add-rvv10 branch April 12, 2024 08:10
