
Cannot convert onnx to trt yolov9 #3

Open

SonNNguyen opened this issue Nov 12, 2024 · 2 comments

SonNNguyen commented Nov 12, 2024

Can you help? I hit this issue when I run ./start-triton-server.sh

I'm using
nvcr.io/nvidia/tritonserver:21.07-py3

root@bf5cff23afa2:/apps# bash ./start-triton-server.sh --models yolov9-e-qat --model_mode infer --plugin efficientNMS --opt_batch_size 4 --max_batch_size 4 --instance_group 1
&&&& RUNNING TensorRT.trtexec [TensorRT v8001] # /usr/src/tensorrt/bin/trtexec --onnx=./models_onnx/infer-yolov9-e-qat-end2end.onnx --minShapes=images:1x3x640x640 --optShapes=images:4x3x640x640 --maxShapes=images:4x3x640x640 --fp16 --int8 --useCudaGraph --workspace=3129 --saveEngine=./models/yolov9-e-qat/1/infer-efficientNMS-yolov9-e-qat-max-batch-4.engine
[11/12/2024-04:06:32] [I] === Model Options ===
[11/12/2024-04:06:32] [I] Format: ONNX
[11/12/2024-04:06:32] [I] Model: ./models_onnx/infer-yolov9-e-qat-end2end.onnx
[11/12/2024-04:06:32] [I] Output:
[11/12/2024-04:06:32] [I] === Build Options ===
[11/12/2024-04:06:32] [I] Max batch: explicit
[11/12/2024-04:06:32] [I] Workspace: 3129 MiB
[11/12/2024-04:06:32] [I] minTiming: 1
[11/12/2024-04:06:32] [I] avgTiming: 8
[11/12/2024-04:06:32] [I] Precision: FP32+FP16+INT8
[11/12/2024-04:06:32] [I] Calibration: Dynamic
[11/12/2024-04:06:32] [I] Refit: Disabled
[11/12/2024-04:06:32] [I] Sparsity: Disabled
[11/12/2024-04:06:32] [I] Safe mode: Disabled
[11/12/2024-04:06:32] [I] Restricted mode: Disabled
[11/12/2024-04:06:32] [I] Save engine: ./models/yolov9-e-qat/1/infer-efficientNMS-yolov9-e-qat-max-batch-4.engine
[11/12/2024-04:06:32] [I] Load engine:
[11/12/2024-04:06:32] [I] NVTX verbosity: 0
[11/12/2024-04:06:32] [I] Tactic sources: Using default tactic sources
[11/12/2024-04:06:32] [I] timingCacheMode: local
[11/12/2024-04:06:32] [I] timingCacheFile:
[11/12/2024-04:06:32] [I] Input(s)s format: fp32:CHW
[11/12/2024-04:06:32] [I] Output(s)s format: fp32:CHW
[11/12/2024-04:06:32] [I] Input build shape: images=1x3x640x640+4x3x640x640+4x3x640x640
[11/12/2024-04:06:32] [I] Input calibration shapes: model
[11/12/2024-04:06:32] [I] === System Options ===
[11/12/2024-04:06:32] [I] Device: 0
[11/12/2024-04:06:32] [I] DLACore:
[11/12/2024-04:06:32] [I] Plugins:
[11/12/2024-04:06:32] [I] === Inference Options ===
[11/12/2024-04:06:32] [I] Batch: Explicit
[11/12/2024-04:06:32] [I] Input inference shape: images=4x3x640x640
[11/12/2024-04:06:32] [I] Iterations: 10
[11/12/2024-04:06:32] [I] Duration: 3s (+ 200ms warm up)
[11/12/2024-04:06:32] [I] Sleep time: 0ms
[11/12/2024-04:06:32] [I] Streams: 1
[11/12/2024-04:06:32] [I] ExposeDMA: Disabled
[11/12/2024-04:06:32] [I] Data transfers: Enabled
[11/12/2024-04:06:32] [I] Spin-wait: Disabled
[11/12/2024-04:06:32] [I] Multithreading: Disabled
[11/12/2024-04:06:32] [I] CUDA Graph: Enabled
[11/12/2024-04:06:32] [I] Separate profiling: Disabled
[11/12/2024-04:06:32] [I] Time Deserialize: Disabled
[11/12/2024-04:06:32] [I] Time Refit: Disabled
[11/12/2024-04:06:32] [I] Skip inference: Disabled
[11/12/2024-04:06:32] [I] Inputs:
[11/12/2024-04:06:32] [I] === Reporting Options ===
[11/12/2024-04:06:32] [I] Verbose: Disabled
[11/12/2024-04:06:32] [I] Averages: 10 inferences
[11/12/2024-04:06:32] [I] Percentile: 99
[11/12/2024-04:06:32] [I] Dump refittable layers:Disabled
[11/12/2024-04:06:32] [I] Dump output: Disabled
[11/12/2024-04:06:32] [I] Profile: Disabled
[11/12/2024-04:06:32] [I] Export timing to JSON file:
[11/12/2024-04:06:32] [I] Export output to JSON file:
[11/12/2024-04:06:32] [I] Export profile to JSON file:
[11/12/2024-04:06:32] [I]
[11/12/2024-04:06:32] [I] === Device Information ===
[11/12/2024-04:06:32] [I] Selected Device: NVIDIA GeForce RTX 3050 Ti Laptop GPU
[11/12/2024-04:06:32] [I] Compute Capability: 8.6
[11/12/2024-04:06:32] [I] SMs: 20
[11/12/2024-04:06:32] [I] Compute Clock Rate: 1.485 GHz
[11/12/2024-04:06:32] [I] Device Global Memory: 3910 MiB
[11/12/2024-04:06:32] [I] Shared Memory per SM: 100 KiB
[11/12/2024-04:06:32] [I] Memory Bus Width: 128 bits (ECC disabled)
[11/12/2024-04:06:32] [I] Memory Clock Rate: 6.001 GHz
[11/12/2024-04:06:32] [I]
[11/12/2024-04:06:32] [I] TensorRT version: 8001
[11/12/2024-04:06:32] [I] [TRT] [MemUsageChange] Init CUDA: CPU +532, GPU +0, now: CPU 539, GPU 1107 (MiB)
[11/12/2024-04:06:32] [I] Start parsing network model
[11/12/2024-04:06:32] [I] [TRT] ----------------------------------------------------------------
[11/12/2024-04:06:32] [I] [TRT] Input filename: ./models_onnx/infer-yolov9-e-qat-end2end.onnx
[11/12/2024-04:06:32] [I] [TRT] ONNX IR version: 0.0.10
[11/12/2024-04:06:32] [I] [TRT] Opset version: 13
[11/12/2024-04:06:32] [I] [TRT] Producer name: pytorch
[11/12/2024-04:06:32] [I] [TRT] Producer version: 1.14.0
[11/12/2024-04:06:32] [I] [TRT] Domain:
[11/12/2024-04:06:32] [I] [TRT] Model version: 0
[11/12/2024-04:06:32] [I] [TRT] Doc string:
[11/12/2024-04:06:32] [I] [TRT] ----------------------------------------------------------------
[11/12/2024-04:06:33] [W] [TRT] onnx2trt_utils.cpp:362: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[11/12/2024-04:06:33] [W] [TRT] onnx2trt_utils.cpp:390: One or more weights outside the range of INT32 was clamped
[11/12/2024-04:06:40] [E] [TRT] ModelImporter.cpp:720: While parsing node number 2133 [Range -> "/model/model.42/Range_output_0"]:
[11/12/2024-04:06:40] [E] [TRT] ModelImporter.cpp:721: --- Begin node ---
[11/12/2024-04:06:40] [E] [TRT] ModelImporter.cpp:722: input: "/model/model.42/Constant_8_output_0"
input: "/model/model.42/Cast_output_0"
input: "/model/model.42/Constant_9_output_0"
output: "/model/model.42/Range_output_0"
name: "/model/model.42/Range"
op_type: "Range"

[11/12/2024-04:06:40] [E] [TRT] ModelImporter.cpp:723: --- End node ---
[11/12/2024-04:06:40] [E] [TRT] ModelImporter.cpp:725: ERROR: builtin_op_importers.cpp:3170 In function importRange:
[8] Assertion failed: inputs.at(0).isInt32() && "For range operator with dynamic inputs, this version of TensorRT only supports INT32!"
[11/12/2024-04:06:41] [E] Failed to parse onnx file
[11/12/2024-04:06:41] [I] Finish parsing network model
[11/12/2024-04:06:41] [E] Parsing model failed
[11/12/2024-04:06:41] [E] Engine creation failed
[11/12/2024-04:06:41] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8001] # /usr/src/tensorrt/bin/trtexec --onnx=./models_onnx/infer-yolov9-e-qat-end2end.onnx --minShapes=images:1x3x640x640 --optShapes=images:4x3x640x640 --maxShapes=images:4x3x640x640 --fp16 --int8 --useCudaGraph --workspace=3129 --saveEngine=./models/yolov9-e-qat/1/infer-efficientNMS-yolov9-e-qat-max-batch-4.engine
Conversion of yolov9-e-qat ONNX model to TensorRT engine failed
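
For reference, the parser is rejecting a dynamic Range node whose inputs were exported as INT64; the TensorRT 8.0 ONNX importer only supports INT32 inputs for Range with dynamic inputs. One common workaround is to constant-fold the ONNX graph before building, which can collapse these Range subgraphs into constants. A minimal sketch using Polygraphy (the folded output filename is illustrative, not part of the repo):

pip install polygraphy onnx onnxruntime
polygraphy surgeon sanitize ./models_onnx/infer-yolov9-e-qat-end2end.onnx \
    --fold-constants \
    -o ./models_onnx/infer-yolov9-e-qat-end2end-folded.onnx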


levipereira (Owner) commented Nov 13, 2024

This repo is outdated and has some issues. I still need to integrate the new models; in the meantime, I suggest using the models from this repository: https://github.com/levipereira/yolo_e2e/releases/tag/v1.0.

The error occurs when TensorRT is run at a different resolution than the one the model was exported with in ONNX, likely due to a bug in the dynamic shape handling.

Feel free to use the export functionality from https://github.com/levipereira/yolo_e2e/ as well.

You can also try re-exporting the model from .pt to ONNX at a resolution of 640x640 and retrying.
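
For example, a minimal re-export sketch assuming the upstream yolov9 export.py (the exact flag names in levipereira/yolo_e2e may differ):

python export.py --weights yolov9-e.pt --imgsz 640 --include onnx_end2end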

jaffe-fly commented Dec 8, 2024

yolov8x still gets an error.

bash ./start-triton-server.sh  --models yolov8x --model_mode eval --plugin yoloNMS --opt_batch_size 4 --max_batch_size 4 --instance_group 1

The error:

[12/08/2024-11:57:31] [I] [TRT] No checker registered for op: YOLO_NMS_TRT. Attempting to check as plugin.
[12/08/2024-11:57:31] [E] [TRT] IPluginRegistry::getCreator: Error Code 4: API Usage Error (Cannot find plugin: YOLO_NMS_TRT, version: 1, namespace:.)
[12/08/2024-11:57:31] [E] [TRT] ModelImporter.cpp:949: While parsing node number 395 [YOLO_NMS_TRT -> "num_dets"]:
[12/08/2024-11:57:31] [E] [TRT] ModelImporter.cpp:950: --- Begin node ---
input: "/end2end/Unsqueeze_output_0"
input: "/end2end/Slice_4_output_0"
output: "num_dets"
output: "det_boxes"
output: "det_scores"
output: "det_classes"
output: "det_indices"
name: "/end2end/YOLO_NMS_TRT"
op_type: "YOLO_NMS_TRT"
attribute {
  name: "background_class"
  ints: -1
  type: INTS
}
attribute {
  name: "box_coding"
  ints: 1
  type: INTS
}
attribute {
  name: "class_agnostic"
  i: 0
  type: INT
}
attribute {
  name: "iou_threshold"
  f: 0.7
  type: FLOAT
}
attribute {
  name: "max_output_boxes"
  i: 300
  type: INT
}
attribute {
  name: "plugin_version"
  s: "1"
  type: STRING
}
attribute {
  name: "score_activation"
  i: 0
  type: INT
}
attribute {
  name: "score_threshold"
  f: 0.001
  type: FLOAT
}
domain: "TRT"

[12/08/2024-11:57:31] [E] [TRT] ModelImporter.cpp:951: --- End node ---
[12/08/2024-11:57:31] [E] [TRT] ModelImporter.cpp:954: ERROR: onnxOpCheckers.cpp:786 In function checkFallbackPluginImporter:
[6] creator && "Plugin not found, are the plugin name, version, and namespace correct?"
[12/08/2024-11:57:31] [E] Failed to parse onnx file
[12/08/2024-11:57:31] [I] Finished parsing network model. Parse time: 1.07307
[12/08/2024-11:57:31] [E] Parsing model failed
[12/08/2024-11:57:31] [E] Failed to create engine from model or file.
[12/08/2024-11:57:31] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v100400] [b26] # /usr/src/tensorrt/bin/trtexec --onnx=./models_onnx/eval-yolov8x-trt.onnx --minShapes=images:1x3x640x640 --optShapes=images:4x3x640x640 --maxShapes=images:4x3x640x640 --fp16 --useCudaGraph --memPoolSize=workspace:7390
./start-triton-server.sh: line 316: --saveEngine=./models/yolov8x/1/eval-yoloNMS-yolov8x-max-batch-4.engine: No such file or directory
Conversion of yolov8x ONNX model to TensorRT engine failed
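
For reference, IPluginRegistry::getCreator failing on YOLO_NMS_TRT means the custom NMS plugin library was never registered with TensorRT before parsing, so the ONNX fallback importer cannot find a creator for the node. A hedged sketch of loading it explicitly with trtexec on TensorRT 10 (the plugin .so path is an assumption and depends on where the repo builds it; older TensorRT releases use --plugins instead of --staticPlugins):

/usr/src/tensorrt/bin/trtexec \
    --onnx=./models_onnx/eval-yolov8x-trt.onnx \
    --minShapes=images:1x3x640x640 --optShapes=images:4x3x640x640 --maxShapes=images:4x3x640x640 \
    --fp16 --useCudaGraph --memPoolSize=workspace:7390 \
    --staticPlugins=/path/to/libyolo_nms_plugin.so \
    --saveEngine=./models/yolov8x/1/eval-yoloNMS-yolov8x-max-batch-4.engine

Separately, the final "No such file or directory" line suggests the line continuation before --saveEngine in start-triton-server.sh broke, so the shell tried to execute that flag as a command of its own.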
