A simple program demonstrating parallel GPU computing on an NVIDIA graphics card with CUDA.
The program simulates heat transfer along a rod of a given length using an explicit finite-difference scheme.
The longer the rod (i.e. the larger the array), the more computation is needed to reach the result.
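The explicit scheme can be sketched as a CUDA kernel. This is a minimal illustration of the technique, not the code from main.cu; the names (heat_step, u, u_next, r, n) are assumptions:

```cuda
#include <cuda_runtime.h>

// One explicit finite-difference time step for the 1D heat equation:
//   u_next[i] = u[i] + r * (u[i-1] - 2*u[i] + u[i+1]),  r = alpha*dt/dx^2
// Boundary cells (i = 0 and i = n-1) are left untouched (fixed temperature).
__global__ void heat_step(const float *u, float *u_next, float r, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i > 0 && i < n - 1)
        u_next[i] = u[i] + r * (u[i - 1] - 2.0f * u[i] + u[i + 1]);
}
```

On the host, the kernel is launched once per time step and the u/u_next buffers are swapped between steps. For stability of the explicit scheme, r must not exceed 0.5.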
Requirements:
- Linux machine
- CMake 3.1 or later
- CUDA Toolkit 9
Compilation:
nvcc main.cu
Running:
[optirun] ./a.out [ARRAY SIZE] [CUDA THREADS NUMBER]
CPU: Core i7-6500U @ 2.50GHz ×4
GPU: GeForce 940M
Rate is the CPU-to-GPU time ratio (speedup); values above 1 mean the GPU is faster.
gleb@home: optirun ./a.out 10
>>> CPU time: 0.003 ms
>>> GPU time: 0.301 ms
>>> Rate: 0.010
gleb@home: optirun ./a.out 1000
>>> CPU time: 0.282 ms
>>> GPU time: 0.284 ms
>>> Rate: 0.992
gleb@home: optirun ./a.out 10000
>>> CPU time: 3.091 ms
>>> GPU time: 0.427 ms
>>> Rate: 7.233
gleb@home: optirun ./a.out 100000
>>> CPU time: 29.232 ms
>>> GPU time: 1.904 ms
>>> Rate: 15.353
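The GPU timings above can be measured with CUDA events. A minimal sketch, assuming the hypothetical kernel heat_step and device buffers d_u, d_next from the earlier example:

```cuda
// Sketch of GPU-side timing with CUDA events (all names are illustrative).
cudaEvent_t start, stop;
cudaEventCreate(&start);
cudaEventCreate(&stop);

cudaEventRecord(start);
for (int t = 0; t < steps; ++t) {                  // one launch per time step
    heat_step<<<blocks, threads>>>(d_u, d_next, r, n);
    float *tmp = d_u; d_u = d_next; d_next = tmp;  // swap buffers
}
cudaEventRecord(stop);
cudaEventSynchronize(stop);                        // wait for the GPU to finish

float ms = 0.0f;
cudaEventElapsedTime(&ms, start, stop);            // elapsed time in milliseconds
```

cudaEventElapsedTime reports milliseconds, matching the units printed in the output above.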
The final graph of the performance gain (speedup) as a function of array size:
This repo is published under the MIT license, see LICENSE.