Feature Request: Dynamic TensorRT models and build options - Printable Version

+- Selur's Little Message Board (https://forum.selur.net)
+-- Forum: Hybrid - Support (https://forum.selur.net/forum-1.html)
+--- Forum: Problems & Questions (https://forum.selur.net/forum-3.html)
+--- Thread: Feature Request: Dynamic TensorRT models and build options (/thread-4207.html)
Feature Request: Dynamic TensorRT models and build options - mikazmaj - 07.09.2025

Hi @Selur, I have a few questions and suggestions.

First, could you add support for dynamic TensorRT models? Specifically, the ability to include min, opt, and max shapes during the model-building process.

Second, would it be possible to stop disabling certain optimizations by default (like -Jitt cudnn...), or alternatively, allow the user to select a "classic" build using only a flag like --fp16, --bf16, or --fp32? This would let trtexec find the best optimizations on its own.

Here's why I'm asking: for some models, when I create them the same way the Hybrid build does by default, I get drastically lower FPS or tile FPS compared to a classic build using a simple command like:

trtexec --onnx=model.onnx --saveEngine=model.engine --shapes=input:1x3x1080x1920 --fp16 --verbose

This issue is particularly noticeable with more complex and demanding GAN models.

A third point: I'm not 100% certain, but I believe the author of the vs-mlrt addon managed to enable fp16 for tensorrt_RTX. I'm not sure whether they did this by first quantizing the model with NVIDIA's ModelOpt or through a standard onnxconverter-common process. As we know, the standard tensorrt_rtx build doesn't natively support the --bf16 and --fp16 flags.

Thanks a lot for all your work, @Selur. Best regards.

RE: Feature Request: Dynamic TensorRT models and build options - Selur - 07.09.2025

Quote: First, could you add support for dynamic TensorRT models? Specifically, the ability to include min, opt, and max shapes during the model building process.

Assuming you are referring to vs-mlrt and TRT_RTX:

static_shape: bool = True

a. What are the allowed values for those tuples?
b. Are they independent of the tiling settings?
c. What are the defaults for min_shapes and opt_shapes?

=> If you can answer these, I can look into adding support for non-static shapes for TRT & TRT_RTX.
Quote: Second, would it be possible to turn off disabled optimizations by default (like -Jitt cudnn...), or alternatively, allow the user to select a "classic" build using only a flag like --fp16, --bf16, or --fp32? This would let trtexec find the best optimizations on its own.

I don't see how I could while using vsmlrt.py and its convenient wrappers.
=> If you can tell me how to do this, I can think about adding support for it.

Quote: A third point: I'm not 100% certain, but I believe the author of the vs-mlrt addon managed to enable fp16 for tensorrt_RTX. I'm not sure if they did this by first quantizing the model with NVIDIA's ModelOpt or through a standard onnxconvert common process. As we know, the standard

afaik:
v15.13.cu13 (latest TensorRT libraries): does not support TRT_RTX at all
v15.13.ort (latest ONNX Runtime libraries): fp16 inference for RIFE v2 and SAFA models, as well as fp32/fp16 inference for some SwinIR models, is not currently working in TRT_RTX

Hybrid itself has allowed setting FP16 with TRT_RTX in the dev versions for a week or so. It allows calling SCUNet, for example, with:

clip = vsmlrt.SCUNet(clip=clip, model=4, overlap=16, backend=Backend.TRT_RTX(fp16=True, device_id=0, verbose=True, use_cuda_graph=True, num_streams=3, builder_optimization_level=3, engine_folder="J:/TRT"))

Cu Selur

RE: Feature Request: Dynamic TensorRT models and build options - mikazmaj - 07.09.2025

The main goal and advantage of generating dynamic TensorRT models is to avoid having to create a new model for every single input resolution. This would make things much more efficient. I noticed that the author of VideoJaNai (which also uses vsmlrt as a backend) is doing something similar.
https://github.com/the-database/VideoJaNai/

To create a single dynamic model that supports a wide range of resolutions (from an 8-pixel video all the way up to 1080p), they use the following trtexec flags:

--fp16 --minShapes=input:1x3x8x8 --optShapes=input:1x3x1080x1920 --maxShapes=input:1x3x1080x1920 --inputIOFormats=fp16:chw --outputIOFormats=fp16:chw --tacticSources=+CUDNN,-CUBLAS,-CUBLAS_LT --skipInference

This is in contrast to their static model creation, which is tied to a specific resolution:

--fp16 --optShapes=input:%video_resolution% --inputIOFormats=fp16:chw --outputIOFormats=fp16:chw --tacticSources=+CUDNN,-CUBLAS,-CUBLAS_LT --skipInference

I was wondering: could you implement a similar approach? It would be great if Hybrid and vsmlrt could use these dynamic models, making them universal for almost all resolutions. The ideal workflow would be: once a dynamic model has been created for a specific ONNX file, the system would reuse that same TensorRT model for any future processing with that ONNX, as long as the new input material fits within the dimensional range the model was built for (i.e., between minShapes and maxShapes).

If this is too complicated to look into right now, no worries at all; maybe something to consider for the future? I'm not exactly sure which values would need to be changed, but I thought it was a promising idea to share!
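For comparison, the difference between the two builds boils down to which shape flags are passed to trtexec. As a rough illustration (a hypothetical helper, not part of Hybrid, vsmlrt, or VideoJaNai), the two command lines above could be assembled like this:

```python
def trtexec_cmd(onnx_path, engine_path, dynamic=False,
                min_hw=(8, 8), opt_hw=(1080, 1920), max_hw=(1080, 1920)):
    """Assemble a trtexec command line for an fp16 engine build.

    Shapes are given as (height, width); trtexec expects NxCxHxW.
    A dynamic build passes a min/opt/max range, a static build a single shape.
    """
    cmd = ["trtexec", f"--onnx={onnx_path}", f"--saveEngine={engine_path}", "--fp16"]
    if dynamic:
        cmd += [
            f"--minShapes=input:1x3x{min_hw[0]}x{min_hw[1]}",
            f"--optShapes=input:1x3x{opt_hw[0]}x{opt_hw[1]}",
            f"--maxShapes=input:1x3x{max_hw[0]}x{max_hw[1]}",
        ]
    else:
        # Static build: the engine is tied to the one optimized shape.
        cmd += [f"--optShapes=input:1x3x{opt_hw[0]}x{opt_hw[1]}"]
    cmd += [
        "--inputIOFormats=fp16:chw", "--outputIOFormats=fp16:chw",
        "--tacticSources=+CUDNN,-CUBLAS,-CUBLAS_LT", "--skipInference",
    ]
    return " ".join(cmd)
```

With `dynamic=True` and the defaults above this reproduces the VideoJaNai-style dynamic command; with `dynamic=False` it reproduces the static one.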
RE: Feature Request: Dynamic TensorRT models and build options - Selur - 07.09.2025

Okay, that does not answer my questions clearly. But after some reading,...

a. min_shapes in vsmlrt is the minimum resolution of the content that the trained engine can handle.
b. max_shapes in vsmlrt is the maximum resolution of the content that the trained engine can handle.
c. opt_shapes in vsmlrt is the resolution for which the engine actually gets trained.

The more min & max differ from opt, the less reliable/efficient the model is. So if the contents you apply a model to only come in a few resolutions, you get better results training the model for each resolution. If the resolutions of your content cluster around a few fixed resolutions, creating multiple engine files around those fixed resolutions is a good idea. Using something like:

min_shapes = (720, 480) # smallest resolution

If you use tiling, note that the tile size is the resolution, so if you always use 256x256 tiling it does not make sense to use dynamic shapes. Likewise, if a model requires a specific minimum resolution, it does not make sense to use a min_shapes value below that resolution.

It would be interesting to read about tests and experiences regarding how much divergence is 'okay' between opt<>min and opt<>max before one really should create a separate engine file.

Cu Selur

Ps.: Sent you a link to a test version => let me know whether it works as expected.

RE: Feature Request: Dynamic TensorRT models and build options - mikazmaj - 07.09.2025

Thanks for the test version. It looks like you've successfully implemented the dynamic TensorRT build. I don't think there's any visual difference in the output file when using the dynamic model. It just runs a bit slower when the resolution isn't optimal, and the performance difference can vary from model to model. I tested the SCUNet vsmlrt plugin by creating a single engine for resolutions from 8x8 to 4096x4096.
I then tried it with clips ranging from 176x144 up to 4K input, and it worked perfectly without needing to rebuild the TensorRT engine. The dynamic engine did its job as expected. I assume that DPIR and the other internal vsmlrt filters are working correctly too.

However, I'm getting a syntax-related error in the resize section; please see the screenshot and the error text.

# Imports

Quote: 2025-09-07 20:09:39.260

The same thing happens when I try to enable vsmlrt with dynamic dimensions under "VS-Others-Vsmlrt". I'm guessing it's a small and subtle bug in the code. Let me know if you need a more detailed log. I have to admit, I forgot where the debug log files are saved, but I do know how to enable the option in Hybrid.
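The reuse rule exercised in this test (any clip whose dimensions fall between the engine's minShapes and maxShapes runs on the same engine, no rebuild needed) can be sketched as a small check; `engine_covers` is a hypothetical helper for illustration, not part of Hybrid or vsmlrt:

```python
def engine_covers(clip_hw, min_hw, max_hw):
    """Return True if a dynamic engine built for the (height, width) range
    [min_hw, max_hw] can process a clip of size clip_hw without a rebuild."""
    h, w = clip_hw
    return min_hw[0] <= h <= max_hw[0] and min_hw[1] <= w <= max_hw[1]

# An 8x8-to-4096x4096 engine covers everything from 176x144 up to 4K:
assert engine_covers((144, 176), (8, 8), (4096, 4096))
assert engine_covers((2160, 3840), (8, 8), (4096, 4096))
# A 1080p-capped engine would need a rebuild for a 4K clip:
assert not engine_covers((2160, 3840), (8, 8), (1080, 1920))
```

This is the whole decision: inside the built range, reuse the cached engine (possibly at reduced speed away from opt_shapes); outside it, a new engine must be built.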
RE: Feature Request: Dynamic TensorRT models and build options - Selur - 07.09.2025

Typo, there were too many commas. => I updated the Hybrid_dynamic_shapes download, which hopefully fixes the problem.

Cu Selur