07.09.2025, 16:25
The main goal and advantage of generating dynamic TensorRT models is to avoid having to build a new engine for every single input resolution, which would save a lot of engine-build time when working with sources of varying sizes.
I noticed that the author of VideoJaNai (which also uses vsmlrt as a backend) is doing something similar.
![[Image: 1.png]](https://i.postimg.cc/fSJL1fLx/1.png)
![[Image: 2.png]](https://i.postimg.cc/bdJsGpTs/2.png)
![[Image: 3.png]](https://i.postimg.cc/0rBrDFWZ/3.png)
https://github.com/the-database/VideoJaNai/
To create a single dynamic model that supports a wide range of resolutions (from an 8×8-pixel input all the way up to 1920×1080), they use the following trtexec command:
--fp16 --minShapes=input:1x3x8x8 --optShapes=input:1x3x1080x1920 --maxShapes=input:1x3x1080x1920 --inputIOFormats=fp16:chw --outputIOFormats=fp16:chw --tacticSources=+CUDNN,-CUBLAS,-CUBLAS_LT --skipInference
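For illustration, here is a minimal sketch of how that command could be wrapped programmatically. The shape flags are taken straight from the command above; the function name is made up, and `--onnx`/`--saveEngine` are just the standard trtexec options for specifying the model and output engine, nothing VideoJaNai-specific:

```python
import subprocess

def build_dynamic_engine(onnx_path: str, engine_path: str,
                         min_hw=(8, 8), opt_hw=(1080, 1920), max_hw=(1080, 1920)):
    """Build one TensorRT engine accepting any 1x3xHxW input whose H/W fall
    between min_hw and max_hw (values mirror VideoJaNai's command above)."""
    def shape(hw):
        return f"input:1x3x{hw[0]}x{hw[1]}"
    subprocess.run([
        "trtexec",
        f"--onnx={onnx_path}",
        f"--saveEngine={engine_path}",
        "--fp16",
        f"--minShapes={shape(min_hw)}",
        f"--optShapes={shape(opt_hw)}",
        f"--maxShapes={shape(max_hw)}",
        "--inputIOFormats=fp16:chw",
        "--outputIOFormats=fp16:chw",
        "--tacticSources=+CUDNN,-CUBLAS,-CUBLAS_LT",
        "--skipInference",
    ], check=True)
```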
This dynamic command is in contrast to their static model creation, which is tied to a specific resolution:
--fp16 --optShapes=input:%video_resolution% --inputIOFormats=fp16:chw --outputIOFormats=fp16:chw --tacticSources=+CUDNN,-CUBLAS,-CUBLAS_LT --skipInference
I was wondering: could you implement a similar approach? It would be great if Hybrid and vsmlrt could use these dynamic models, making them universal for almost all resolutions.
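If I'm reading vsmlrt.py right, Backend.TRT already seems to expose a `static_shape` switch together with `min_shapes` / `opt_shapes` / `max_shapes`, so something along these lines might already be possible. This is an untested sketch; the exact field behavior and the tuple ordering are my assumptions from a quick look at the source, not verified:

```python
# Untested assumption: Backend.TRT exposes static_shape plus min/opt/max_shapes
# that map onto the trtexec --minShapes/--optShapes/--maxShapes flags above.
# The (width, height) ordering is my guess; please check vsmlrt.py.
from vsmlrt import Backend

# Today: a static engine, locked to the clip's exact resolution.
static_backend = Backend.TRT(fp16=True)  # static_shape defaults to True

# Hoped-for: one dynamic engine covering everything from 8x8 up to 1920x1080.
dynamic_backend = Backend.TRT(
    fp16=True,
    static_shape=False,
    min_shapes=(8, 8),
    opt_shapes=(1920, 1080),
    max_shapes=(1920, 1080),
)
```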
The ideal workflow would be: once a dynamic model has been created for a specific ONNX file, the system would reuse that same TensorRT engine for any future processing with that ONNX, as long as the new input material fits within the dimensional range the engine was built for (i.e., between minShapes and maxShapes).
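As a rough illustration of that reuse check, here is a hypothetical sketch. The filename scheme and helper names are invented for the example and are not how Hybrid or vsmlrt actually cache engines:

```python
import os

# Hypothetical naming scheme: encode the built shape range into the engine
# filename so a later run can tell whether an existing engine covers the clip.
# Dimensions are (height, width), matching the 1x3xHxW trtexec shapes above.
def engine_path_for(onnx_path: str, min_hw: tuple, max_hw: tuple) -> str:
    base, _ = os.path.splitext(onnx_path)
    return f"{base}_min{min_hw[0]}x{min_hw[1]}_max{max_hw[0]}x{max_hw[1]}.engine"

def reusable_engine(onnx_path: str, clip_h: int, clip_w: int,
                    min_hw=(8, 8), max_hw=(1080, 1920)):
    """Return the cached dynamic engine if the clip's dimensions fall inside
    the [minShapes, maxShapes] range it was built for, else None (rebuild)."""
    path = engine_path_for(onnx_path, min_hw, max_hw)
    fits = min_hw[0] <= clip_h <= max_hw[0] and min_hw[1] <= clip_w <= max_hw[1]
    return path if fits and os.path.exists(path) else None
```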
If this is too complicated to look into right now, no worries at all—maybe something to consider for the future? I'm not exactly sure which values would need to be changed, but I thought it was a promising idea to share!

