Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RISC-V: added RVV 1.0 weekly build #165

Merged
merged 1 commit into from
Apr 12, 2024
Merged

Conversation

mshabunin
Copy link
Contributor

Build using clang (same as in precommit), test on CanMV-K230 board.

@mshabunin mshabunin marked this pull request as ready for review March 29, 2024 16:17
@mshabunin mshabunin force-pushed the add-rvv10 branch 3 times, most recently from 9d8508b to fb6ea8c Compare March 29, 2024 19:05
asmorkalov pushed a commit to opencv/opencv that referenced this pull request Apr 9, 2024
imgproc: fix unaligned memory access in filters and Gaussian blur #25364

* filter/SIMD: removed parts which casted 8u pointers to int causing unaligned memory access on RISC-V platform.
* GaussianBlur/fixed_point: replaced casts from s16 to u32 with union operations

Performance comparison:
- [x] check performance on x86_64 - (4 threads, `-DCPU_BASELINE=AVX2`, GCC 11.4, Ubuntu 22) - [report_imgproc_x86_64.ods](https://github.com/opencv/opencv/files/14904702/report_x86_64.ods)
- [x] check performance on AArch64 - (4 cores of RK3588, GCC 11.4 aarch64, Raspbian) - [report_imgproc_aarch64.ods](https://github.com/opencv/opencv/files/14908437/report_aarch64.ods)

Note: for some reason my performance results are quite unstable, unaffected functions show speedups and slowdowns in many cases. Filter2D and GaussianBlur seem to be OK.

Slightly related PR: opencv/ci-gha-workflow#165
mshabunin added a commit to mshabunin/opencv that referenced this pull request Apr 9, 2024
imgproc: fix unaligned memory access in filters and Gaussian blur opencv#25364

* filter/SIMD: removed parts which casted 8u pointers to int causing unaligned memory access on RISC-V platform.
* GaussianBlur/fixed_point: replaced casts from s16 to u32 with union operations

Performance comparison:
- [x] check performance on x86_64 - (4 threads, `-DCPU_BASELINE=AVX2`, GCC 11.4, Ubuntu 22) - [report_imgproc_x86_64.ods](https://github.com/opencv/opencv/files/14904702/report_x86_64.ods)
- [x] check performance on AArch64 - (4 cores of RK3588, GCC 11.4 aarch64, Raspbian) - [report_imgproc_aarch64.ods](https://github.com/opencv/opencv/files/14908437/report_aarch64.ods)

Note: for some reason my performance results are quite unstable, unaffected functions show speedups and slowdowns in many cases. Filter2D and GaussianBlur seem to be OK.

Slightly related PR: opencv/ci-gha-workflow#165
@mshabunin
Copy link
Contributor Author

Created two more PRs to fix tests on 4.x and 5.x branches.

asmorkalov pushed a commit to opencv/opencv that referenced this pull request Apr 10, 2024
Fix unaligned filters + increase test thresholds (5.x) #25379

Port of #25364 to 5.x + minor changes in 3d tests to pass on RISC-V platform

Failed tests:
```
[ RUN      ] AP3P.ctheta1p_nan_23607
/home/ci/opencv/modules/3d/test/test_solvepnp_ransac.cpp:2320: Failure
Expected: (cvtest::norm(res.colRange(0, 2), expected, NORM_INF)) <= (3e-16), actual: 3.33067e-16 vs 3e-16
[  FAILED  ] AP3P.ctheta1p_nan_23607 (1 ms)

[ RUN      ] Rendering/RenderingTest.accuracy/4, where GetParam() = ((320, 240), Flat, CW, Color, CV_32F, CV_32S)
/home/ci/opencv/modules/3d/test/test_rendering.cpp:430: Failure
Expected: (normL2Depth) <= (normL2Threshold), actual: 0.00102317 vs 0.000989
[  FAILED  ] Rendering/RenderingTest.accuracy/4, where GetParam() = ((320, 240), Flat, CW, Color, CV_32F, CV_32S) (22 ms)

[ RUN      ] Rendering/RenderingTest.accuracy/5, where GetParam() = ((320, 240), Shaded, None, Color, CV_32F, CV_32S)
/home/ci/opencv/modules/3d/test/test_rendering.cpp:430: Failure
Expected: (normL2Depth) <= (normL2Threshold), actual: 0.00102317 vs 0.000989
[  FAILED  ] Rendering/RenderingTest.accuracy/5, where GetParam() = ((320, 240), Shaded, None, Color, CV_32F, CV_32S) (22 ms)

[ RUN      ] Rendering/RenderingTest.accuracy/8, where GetParam() = ((320, 240), Flat, CW, Clipping, CV_32F, CV_32S)
/home/ci/opencv/modules/3d/test/test_rendering.cpp:430: Failure
Expected: (normL2Depth) <= (normL2Threshold), actual: 0.00162132 vs 0.0016
[  FAILED  ] Rendering/RenderingTest.accuracy/8, where GetParam() = ((320, 240), Flat, CW, Clipping, CV_32F, CV_32S) (22 ms)

[ RUN      ] Rendering/RenderingTest.accuracy/9, where GetParam() = ((320, 240), Shaded, None, Clipping, CV_32F, CV_32S)
/home/ci/opencv/modules/3d/test/test_rendering.cpp:430: Failure
Expected: (normL2Depth) <= (normL2Threshold), actual: 0.000554117 vs 0.000544
[  FAILED  ] Rendering/RenderingTest.accuracy/9, where GetParam() = ((320, 240), Shaded, None, Clipping, CV_32F, CV_32S) (27 ms)
```

Related CI PR: opencv/ci-gha-workflow#165
susumu-iino pushed a commit to susumu-iino/opencv that referenced this pull request Apr 11, 2024
imgproc: fix unaligned memory access in filters and Gaussian blur opencv#25364

* filter/SIMD: removed parts which casted 8u pointers to int causing unaligned memory access on RISC-V platform.
* GaussianBlur/fixed_point: replaced casts from s16 to u32 with union operations

Performance comparison:
- [x] check performance on x86_64 - (4 threads, `-DCPU_BASELINE=AVX2`, GCC 11.4, Ubuntu 22) - [report_imgproc_x86_64.ods](https://github.com/opencv/opencv/files/14904702/report_x86_64.ods)
- [x] check performance on AArch64 - (4 cores of RK3588, GCC 11.4 aarch64, Raspbian) - [report_imgproc_aarch64.ods](https://github.com/opencv/opencv/files/14908437/report_aarch64.ods)

Note: for some reason my performance results are quite unstable, unaffected functions show speedups and slowdowns in many cases. Filter2D and GaussianBlur seem to be OK.

Slightly related PR: opencv/ci-gha-workflow#165
Copy link

@opencv-alalek opencv-alalek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well done 👍

@opencv-alalek opencv-alalek self-assigned this Apr 12, 2024
@opencv-alalek opencv-alalek added the enhancement New feature or request label Apr 12, 2024
@opencv-alalek opencv-alalek merged commit eac7a8a into opencv:main Apr 12, 2024
4 checks passed
@mshabunin mshabunin deleted the add-rvv10 branch April 12, 2024 08:10
klatism pushed a commit to klatism/opencv that referenced this pull request May 17, 2024
imgproc: fix unaligned memory access in filters and Gaussian blur opencv#25364

* filter/SIMD: removed parts which casted 8u pointers to int causing unaligned memory access on RISC-V platform.
* GaussianBlur/fixed_point: replaced casts from s16 to u32 with union operations

Performance comparison:
- [x] check performance on x86_64 - (4 threads, `-DCPU_BASELINE=AVX2`, GCC 11.4, Ubuntu 22) - [report_imgproc_x86_64.ods](https://github.com/opencv/opencv/files/14904702/report_x86_64.ods)
- [x] check performance on AArch64 - (4 cores of RK3588, GCC 11.4 aarch64, Raspbian) - [report_imgproc_aarch64.ods](https://github.com/opencv/opencv/files/14908437/report_aarch64.ods)

Note: for some reason my performance results are quite unstable, unaffected functions show speedups and slowdowns in many cases. Filter2D and GaussianBlur seem to be OK.

Slightly related PR: opencv/ci-gha-workflow#165
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants