07.09.2025, 13:50
Hi @Selur,
I have a few questions and suggestions.
First, could you add support for dynamic TensorRT models? Specifically, the ability to include min, opt, and max shapes during the model building process.
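For illustration, a dynamic-shape build with trtexec could look something like this (the input name and the min/opt/max resolutions are just placeholders):
trtexec --onnx=model.onnx --saveEngine=model.engine --minShapes=input:1x3x240x320 --optShapes=input:1x3x720x1280 --maxShapes=input:1x3x1080x1920 --fp16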
Second, would it be possible to stop disabling optimizations by default (like the JIT and cuDNN tactic sources), or alternatively, allow the user to select a "classic" build that passes only a precision flag like --fp16, --bf16, or --fp32? This would let trtexec find the best optimizations on its own.
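If I understand the trtexec options correctly, re-enabling those sources should just be a matter of flipping the sign in --tacticSources, e.g. (assuming the cuDNN/cuBLAS sources are still available in the TensorRT version Hybrid ships):
trtexec --onnx=model.onnx --saveEngine=model.engine --shapes=input:1x3x1080x1920 --fp16 --tacticSources=+CUDNN,+CUBLAS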
Here's why I'm asking: for some models, when I build the engine the same way Hybrid does by default, I get drastically slower FPS (or tile-FPS) than with a classic build using a simple command like:
trtexec --onnx=model.onnx --saveEngine=model.engine --shapes=input:1x3x1080x1920 --fp16 --verbose
This issue is particularly noticeable with more complex and demanding GAN models.
A third point: I'm not 100% certain, but I believe the author of the vs-mlrt addon managed to enable fp16 for tensorrt_RTX. I'm not sure whether they did this by first quantizing the model with NVIDIA's ModelOpt or through a standard onnxconverter-common conversion. As we know, the standard tensorrt_rtx build doesn't natively support the --bf16 and --fp16 flags.
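If it was the onnxconverter-common route, I'd expect something like this minimal Python sketch (the file names are placeholders, and I haven't verified this is actually what vs-mlrt does):
# Convert an fp32 ONNX model to fp16 with onnxconverter-common,
# keeping the network inputs/outputs in fp32 for compatibility.
import onnx
from onnxconverter_common import float16

model = onnx.load("model.onnx")
model_fp16 = float16.convert_float_to_float16(model, keep_io_types=True)
onnx.save(model_fp16, "model_fp16.onnx")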
Thanks a lot for all your work, @Selur.
Best regards.