When running inference on an NPU, a complex model often cannot be converted in full to an NPU-supported format because some of its operators are not supported. In that case the model has to be split: the computationally intensive parts are converted to a format the NPU supports, while the remaining parts run on the CPU. This repository is designed to facilitate that process, letting users split their ONNX models as needed.
This project provides a simple way to split an ONNX computation graph into two parts based on user-defined nodes.
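To make the idea of splitting at user-defined nodes concrete, here is a minimal sketch in plain Python. It is not this repository's implementation and does not use the ONNX API: the graph is modelled as a list of dictionaries, and `split_graph`, `toy_nodes`, and the tensor names are hypothetical, chosen only for illustration. The head part holds every node needed to compute the chosen cut tensors (the candidate for NPU conversion), and the tail part holds everything else (left on the CPU).

```python
from typing import Dict, List, Tuple

# Hypothetical node layout for this sketch:
# {"name": str, "inputs": [tensor names], "outputs": [tensor names]}
Node = Dict[str, object]


def split_graph(nodes: List[Node], cut_tensors: List[str]) -> Tuple[List[Node], List[Node]]:
    """Split a node list into (head, tail) at the given cut tensors.

    head: nodes required to compute the cut tensors.
    tail: all remaining nodes, which consume the cut tensors as inputs.
    The original node order is preserved in both parts.
    """
    # Map each tensor name to the node that produces it.
    producer = {out: node for node in nodes for out in node["outputs"]}

    for tensor in cut_tensors:
        if tensor not in producer:
            raise ValueError(f"cut tensor {tensor!r} is not produced by any node")

    # Walk backwards from the cut tensors and collect every upstream node.
    head_names = set()
    stack = list(cut_tensors)
    while stack:
        node = producer.get(stack.pop())
        if node is None or node["name"] in head_names:
            continue  # graph input / weight, or node already visited
        head_names.add(node["name"])
        stack.extend(node["inputs"])

    head = [n for n in nodes if n["name"] in head_names]
    tail = [n for n in nodes if n["name"] not in head_names]
    return head, tail


# Toy graph: x -> conv -> a -> relu -> b -> softmax -> y
toy_nodes = [
    {"name": "conv", "inputs": ["x", "w"], "outputs": ["a"]},
    {"name": "relu", "inputs": ["a"], "outputs": ["b"]},
    {"name": "softmax", "inputs": ["b"], "outputs": ["y"]},
]

head, tail = split_graph(toy_nodes, ["b"])
print([n["name"] for n in head])  # nodes before the cut
print([n["name"] for n in tail])  # nodes after the cut
```

In this sketch, cutting at tensor `b` places `conv` and `relu` in the head and `softmax` in the tail. For real ONNX files, the `onnx` package offers `onnx.utils.extract_model` for extracting a sub-model between named input and output tensors; whether this repository uses it is not stated here.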
LJ-Hao/onnx-spliter