Overall Model Testing/Benchmarking Plan #19115
Labels: infrastructure/benchmark (relating to benchmarking infrastructure), infrastructure (relating to build systems, CI, or testing)
Tentative Plan to Extend Benchmarking/Validation Testing of PyTorch/ONNX Models
Halo Models
Halo models are currently thoroughly tested and benchmarked in model-validation and benchmarking, respectively. They currently live in the experimental section of IREE, and a few improvements need to be made on both the compiler side and the testing side before we can move them out of experimental.
For our halo models (SDXL, Llama, Flux, etc.), we can expect the modeling to be implemented in sharktank, and we can assume that MLIR generation is well tested and taken care of there. IREE testing of halo models will always start at the MLIR stage, as in the sketch below. Like the current implementation, we can continue to host these artifacts in Azure or move them somewhere more accessible such as Hugging Face.
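As a rough illustration of the "start at the MLIR stage" flow, a minimal sketch is below. It assumes the halo-model MLIR has already been fetched from Azure/Hugging Face; the file names, target backend, and input shapes are illustrative, not the actual artifacts.

```python
# Compile a downloaded MLIR artifact and smoke-test it with the IREE CLI tools.
# File names, the target backend, and the input splat are placeholders.
import subprocess

MLIR = "sdxl_unet.mlir"        # hypothetical artifact downloaded from Azure/HF
VMFB = "sdxl_unet_cpu.vmfb"

# Compile the MLIR to an IREE module for a CPU target.
subprocess.run(
    ["iree-compile", MLIR, "--iree-hal-target-backends=llvm-cpu", "-o", VMFB],
    check=True,
)

# Run one inference to confirm the compiled module executes.
subprocess.run(
    [
        "iree-run-module",
        f"--module={VMFB}",
        "--device=local-task",
        "--function=main",
        "--input=1x4x64x64xf32=0",
    ],
    check=True,
)
```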
The following tasks would help us move out of experimental and build something reliable, easy to navigate, and easy to recreate locally:
Model Validation
Compiler Tasks:
Testing Tasks:
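As a loose illustration of what a validation testing task could exercise, here is a minimal sketch that compares IREE outputs against stored reference outputs. The module/function names, file paths, tolerances, and the assumption that reference outputs come from the eager sharktank model are all hypothetical.

```python
# Numerics-validation sketch: compile from MLIR, run with the IREE Python
# runtime, and compare against a saved reference output. All names are
# illustrative.
import iree.compiler as ireec
import iree.runtime as ireert
import numpy as np

vmfb = ireec.compile_file("sdxl_unet.mlir", target_backends=["llvm-cpu"])

config = ireert.Config("local-task")
ctx = ireert.SystemContext(config=config)
ctx.add_vm_module(ireert.VmModule.copy_buffer(ctx.instance, vmfb))

inputs = np.load("inputs.npy")                 # hypothetical saved inputs
expected = np.load("reference_outputs.npy")    # hypothetical golden outputs

# Assumes the compiled MLIR module is named `module` and exports `main`.
result = ctx.modules.module["main"](inputs).to_host()
np.testing.assert_allclose(result, expected, rtol=1e-3, atol=1e-3)
```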
Model Benchmarking
Testing Tasks:
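As a loose illustration of what a benchmarking task could wrap, a minimal sketch around iree-benchmark-module is below; the device, function name, inputs, and repetition count are placeholders for whatever a given halo model actually needs.

```python
# Latency-benchmark sketch over a precompiled module. Flags shown are the
# standard iree-benchmark-module options; values are placeholders.
import subprocess

subprocess.run(
    [
        "iree-benchmark-module",
        "--module=sdxl_unet_cpu.vmfb",
        "--device=local-task",
        "--function=main",
        "--input=1x4x64x64xf32=0",
        "--benchmark_repetitions=5",
    ],
    check=True,
)
```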
General Models
For the general model suite, we can start mainly with PyTorch and ONNX, as these are the two frameworks we rely on. All of these tests should live in the iree-test-suite. For ONNX, we have a supported path in IREE to import MLIR from ONNX source files (example). For PyTorch models, we rely on the iree-turbine repo to export to MLIR (example), so we can decide either to include this intake path or to start at the MLIR stage. Both intake paths are sketched below.
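A minimal sketch of the two intake paths follows. The model and file names are illustrative, and the iree-turbine export API is shown as commonly documented; treat the exact signature as an assumption and check the repo for the current form.

```python
# ONNX path: import an .onnx file to MLIR with the tool shipped in the IREE
# compiler's onnx extra.
import subprocess

subprocess.run(
    ["iree-import-onnx", "resnet50.onnx", "-o", "resnet50.mlir"],
    check=True,
)

# PyTorch path: export a module to MLIR through iree-turbine. The model here
# is a toy stand-in; the export API may differ between turbine releases.
import torch
from iree.turbine import aot

model = torch.nn.Linear(16, 4)
exported = aot.export(model, torch.randn(1, 16))
exported.save_mlir("linear.mlir")
```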
We have already had some work on this here, so it is a good starting point to expand on.
Testing Tasks:
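As a rough sketch of how general-model tests could be parameterized in the iree-test-suite, each case could name a source MLIR artifact, a compile target, and example inputs; the paths, targets, and input shapes below are illustrative, not the suite's actual layout.

```python
# Parameterized compile-and-run sketch for the general model suite.
import subprocess

import pytest

CASES = [
    ("onnx/resnet50.mlir", "llvm-cpu", "1x3x224x224xf32=0"),
    ("pytorch/bert_base.mlir", "llvm-cpu", "1x128xi64=0"),
]

@pytest.mark.parametrize("mlir_path,target,example_input", CASES)
def test_compile_and_run(tmp_path, mlir_path, target, example_input):
    vmfb = tmp_path / "module.vmfb"
    subprocess.run(
        ["iree-compile", mlir_path,
         f"--iree-hal-target-backends={target}", "-o", str(vmfb)],
        check=True,
    )
    subprocess.run(
        ["iree-run-module", f"--module={vmfb}", "--device=local-task",
         "--function=main", f"--input={example_input}"],
        check=True,
    )
```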