Buddy Benchmark is an extensible benchmark framework. We intend to provide a platform for performance comparison of various frameworks and optimizers. This project is based on Google Benchmark.
Clone the project:
$ git clone git@github.com:buddy-compiler/buddy-benchmark.git
$ git -C buddy-benchmark submodule update --init
Please check the deep learning benchmark document at this link.
Currently, the image processing benchmark includes the following frameworks or optimizers:
- OpenCV (link)
NOTE: Please build OpenCV from source to achieve the best performance.
Build OpenCV:
$ cd buddy-benchmark/thirdparty/opencv
$ mkdir build && cd build
$ cmake -G Ninja .. -DCMAKE_BUILD_TYPE=Release
$ ninja
NOTE: Please make sure the buddy-opt tool of the buddy-mlir project works properly.
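One way to confirm this before configuring is a small check like the one below. It is only a sketch: it assumes buddy-opt sits under bin/ in the buddy-mlir build directory, and both the check_buddy_opt function and the BUDDY_MLIR_BUILD_DIR shell variable are illustrative names, not part of the project.

```shell
# Hypothetical helper: report whether buddy-opt exists and is executable.
# Assumes the tool lives at <build dir>/bin/buddy-opt.
check_buddy_opt() {
  tool="$1/bin/buddy-opt"
  if [ -x "$tool" ]; then
    echo "found: $tool"
  else
    echo "missing: $tool"
    return 1
  fi
}

# Replace the placeholder with your local buddy-mlir build directory.
check_buddy_opt "${BUDDY_MLIR_BUILD_DIR:-/PATH/TO/BUDDY-MLIR/BUILD}" \
  || echo "Build buddy-mlir before configuring the benchmarks."
```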
Run the image processing benchmark:
CMake Options | Default Value |
---|---|
-DBUDDY_OPT_STRIP_MINING | 256 |
-DMLIR_LINALG_TILE | 2 |
-DBUDDY_OPT_ATTR | avx512f |
-DBUDDY_OPT_TRIPLE | x86_64-unknown-linux-gnu |
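The options in the table above are passed at configure time, so a default can be overridden by adding the flag to the cmake command. The sketch below only prints the extra flags so they can be reviewed first. The values 128 and avx2 are assumptions chosen for illustration (for example, a CPU without AVX-512), and override_flags is a hypothetical helper name.

```shell
# Example override values. These are assumptions for illustration;
# choose values that match your CPU and toolchain.
STRIP_MINING=128
OPT_ATTR=avx2

# Hypothetical helper: print the extra configure flags so they can be
# reviewed before being appended to the cmake command line.
override_flags() {
  echo "-DBUDDY_OPT_STRIP_MINING=$STRIP_MINING -DBUDDY_OPT_ATTR=$OPT_ATTR"
}

override_flags
# Append the printed flags to the image processing cmake invocation, e.g.
#   cmake -G Ninja .. -DIMAGE_PROCESSING_BENCHMARKS=ON $(override_flags) ...
```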
Note:
1. Please replace the /PATH/TO/* with your local path.
2. For running the executable:
   i. Please replace <image path> with the path of the image to be used for benchmarking.
   ii. Please replace <kernel name> with the name of the kernel to be used for benchmarking, as specified in include/ImageProcessing/Kernels.h.
   iii. Please replace <kernelmorph name> with the name of the unsigned int kernel to be used for benchmarking, as specified in include/ImageProcessing/Kernels.h.
   iv. Please replace <Boundary Option> with CONSTANT_PADDING or REPLICATE_PADDING.
Ex. ./image-processing-benchmark ../../benchmarks/ImageProcessing/Images/YuTu.png random3x3KernelAlign random3x3KernelAlignInt CONSTANT_PADDING
$ cd buddy-benchmark
$ mkdir build && cd build
$ cmake -G Ninja .. \
-DCMAKE_BUILD_TYPE=RELEASE \
-DIMAGE_PROCESSING_BENCHMARKS=ON \
-DOpenCV_DIR=$PWD/../thirdparty/opencv/build/ \
-DEIGEN_DIR=$PWD/../thirdparty/eigen/ \
-DBUDDY_MLIR_BUILD_DIR=/PATH/TO/BUDDY-MLIR/BUILD/
$ ninja image-processing-benchmark
$ cd bin && ./image-processing-benchmark <image path> <kernel name> <kernelmorph name> <Boundary Option>
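Because the image processing benchmark accepts only CONSTANT_PADDING or REPLICATE_PADDING as its boundary option, a wrapper script can reject anything else before launching a run. The following is a minimal sketch; valid_boundary_option and the BOUNDARY variable are hypothetical names introduced here.

```shell
# Hypothetical helper: accept only the two documented boundary options.
valid_boundary_option() {
  case "$1" in
    CONSTANT_PADDING|REPLICATE_PADDING) return 0 ;;
    *) echo "unsupported boundary option: $1" >&2; return 1 ;;
  esac
}

BOUNDARY="CONSTANT_PADDING"
if valid_boundary_option "$BOUNDARY"; then
  echo "boundary option ok: $BOUNDARY"
fi
```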
Currently, the audio processing benchmark includes the following frameworks or optimizers:
- KFR (link)
Note: Please replace the /PATH/TO/* with your local path.
$ cd buddy-benchmark
$ mkdir build && cd build
$ cmake -G Ninja .. \
-DCMAKE_BUILD_TYPE=RELEASE \
-DAUDIO_PROCESSING_BENCHMARKS=ON \
-DCMAKE_CXX_COMPILER=clang++ \
-DKFR_DIR=/PATH/TO/KFR/SOURCE/CODE \
-DBUDDY_MLIR_BUILD_DIR=/PATH/TO/BUDDY-MLIR/BUILD/
$ ninja audio-processing-benchmark
$ cd bin
$ ./audio-processing-benchmark
To better demonstrate the results after processing, we provide a tool for figure plotting. To use this tool, make sure that you are using python3 and that the numpy, matplotlib and scipy packages have been installed properly. Use the following command to install the required packages:
$ pip install numpy matplotlib scipy
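To verify the installation, each package can be imported from the python3 interpreter the plotting tool will use. This is a sketch rather than part of the project; check_py_modules is a hypothetical helper, and it assumes python3 on your PATH is the interpreter you intend to use.

```shell
# Hypothetical helper: try to import each named module with python3 and
# report the result without aborting on the first failure.
check_py_modules() {
  for mod in "$@"; do
    if python3 -c "import $mod" 2>/dev/null; then
      echo "ok: $mod"
    else
      echo "missing: $mod"
    fi
  done
}

check_py_modules numpy matplotlib scipy
```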
You can customize the python3 path by adding the option -DPYTHON_BINARY_DIR=/PATH/TO/PYTHON/BIN while building:
Note: Please replace the /PATH/TO/* with your local path.
$ cd build
$ cmake -G Ninja .. \
-DAUDIO_PROCESSING_BENCHMARKS=ON \
-DCMAKE_CXX_COMPILER=clang++ \
-DKFR_DIR=/PATH/TO/KFR/SOURCE/CODE \
-DBUDDY_MLIR_BUILD_DIR=/PATH/TO/BUDDY-MLIR/BUILD \
-DPYTHON_BINARY_DIR=/PATH/TO/PYTHON/BIN/
$ ninja audio-plot
Once the processing is done, you can use this tool to plot a comparison figure:
$ cd bin
$ ./audio-plot ../../benchmarks/AudioProcessing/Audios/NASA_Mars.wav ResultKFRIir.wav
The result is saved in bin/res.png. For more usage details, run audio-plot -h.
Some of the benchmarks are ported from gcc-loops (link) in the LLVM test suite and linpackc (link).
Note: Please replace the /PATH/TO/* with your local path and the XXX with a specific target name (e.g. gccloops, linpackc, matrix).
$ cd buddy-benchmark
$ mkdir build && cd build
$ cmake -G Ninja .. \
-DCMAKE_BUILD_TYPE=RELEASE \
-DVECTORIZATION_BENCHMARKS=ON \
-DBUDDY_MLIR_BUILD_DIR=/PATH/TO/BUDDY-MLIR/BUILD/
$ ninja vectorization-XXX-benchmark
$ cd bin
$ ./vectorization-XXX-benchmark
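To build and run several vectorization targets in one go, the XXX placeholder can be expanded in a loop. The sketch below only prints the commands for the three target names given as examples in the note; vectorization_targets is a hypothetical helper, and other target names may exist.

```shell
# Target names follow the vectorization-XXX-benchmark pattern, using the
# three example values of XXX mentioned in the note.
vectorization_targets() {
  for name in gccloops linpackc matrix; do
    echo "vectorization-${name}-benchmark"
  done
}

# Print the build-and-run command for each target; run them from the
# build directory once you have checked the list.
for target in $(vectorization_targets); do
  echo "ninja $target && ./bin/$target"
done
```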
Currently, we use the Spike simulator to run the Gemmini cases. The cycle-accurate benchmark cases are a work in progress. Before building the benchmark target, please see the following table and ensure you use the correct configuration.
Cases | Hardware Configuration |
---|---|
Gemmini-ResNet-101 | defaultFpConfig (link) |
We assume you have already built all the components in the Gemmini README file. Now, let's build and run the cases.
$ source /path/to/chipyard/env.sh
$ cd buddy-benchmark
$ mkdir build && cd build
$ cmake -G Ninja .. \
-DCMAKE_BUILD_TYPE=RELEASE \
-DBUDDY_MLIR_BUILD_DIR=/PATH/TO/BUDDY-MLIR/BUILD/ \
-DGEMMINI_BENCHMARKS=ON
$ ninja
$ cd bin
$ spike --extension=gemmini pk Gemmini-ResNet-101
Build and run MLIR operation optimization benchmark cases.
$ mkdir build && cd build
$ cmake -G Ninja .. \
-DCMAKE_BUILD_TYPE=RELEASE \
-DOP_OPTIMIZATION_BENCHMARKS=ON \
-DBUDDY_MLIR_BUILD_DIR=/PATH/TO/BUDDY-MLIR/BUILD/
$ ninja <your target operation benchmark>
// Supported operation benchmarks include:
// - conv2d-nchw-fchw-benchmark
// - matmul-benchmark
Run TVM operation optimization benchmark cases.
- Install TVM (steps).
- Enter your TVM (virtual) environment.
- Configure TVM path and Python path.
- Navigate to your target operation directory (e.g. buddy-benchmark/benchmarks/OpOptimization/MatMul/TVM).
- (Optional) Configure the main file to specify the target or size of the benchmark.
- Run the main python file.
(tvm)$ export TVM_HOME=/path/to/tvm
(tvm)$ export PYTHONPATH=$TVM_HOME/python:${PYTHONPATH}
(tvm)$ cd benchmarks/OpOptimization/<target operation>/TVM
(tvm)$ python main.py