Quote: First, could you add support for dynamic TensorRT models? Specifically, the ability to include min, opt, and max shapes during the model building process.
Assuming you are referring to vs-mlrt's TRT and TRT_RTX backends, both of which expose these parameters:
Backend.TRT:
static_shape: bool = True
min_shapes: typing.Tuple[int, int] = (0, 0)
opt_shapes: typing.Optional[typing.Tuple[int, int]] = None
max_shapes: typing.Optional[typing.Tuple[int, int]] = None

Backend.TRT_RTX:
static_shape: bool = True
min_shapes: typing.Tuple[int, int] = (0, 0)
max_shapes: typing.Optional[typing.Tuple[int, int]] = None
opt_shapes: typing.Optional[typing.Tuple[int, int]] = None
a. What are the allowed values for those tuples?
b. Are these independent of tiling settings?
c. What are the defaults for min_shapes and opt_shapes?
=> if you can answer these, I can look into adding support for non-static shapes for TRT & TRT_RTX, roughly along the lines of the sketch below.
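Just a sketch of what I have in mind; I'm assuming the tuples are (width, height) pairs and that static_shape=False is what switches the engine build to dynamic shapes (which is exactly what questions a-c are about), and all concrete values are placeholders:

import vapoursynth as vs
import vsmlrt
from vsmlrt import Backend

core = vs.core
clip = core.std.BlankClip(format=vs.RGBS, width=1920, height=1080)  # stand-in for the real source clip

# Assumed dynamic-shape configuration; the meaning of the shape tuples and of
# static_shape is part of what is being asked above.
backend = Backend.TRT_RTX(
    fp16=True,
    device_id=0,
    static_shape=False,       # do not bake a fixed input shape into the engine (assumption)
    min_shapes=(64, 64),      # smallest tile/frame the engine should accept
    opt_shapes=(1920, 1080),  # shape the engine is optimized for
    max_shapes=(1920, 1080),  # largest tile/frame the engine should accept
)

clip = vsmlrt.SCUNet(clip=clip, model=4, overlap=16, backend=backend)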
Quote: Second, would it be possible to not disable optimizations by default (like -Jitt cudnn...), or alternatively, to allow the user to select a "classic" build using only a flag like --fp16, --bf16, or --fp32? This would let trtexec find the best optimizations on its own.
I don't see how I could do that while using vsmlrt.py and its convenience wrappers.
Quote: Here's why I'm asking: for some models, when I build them the same way the Hybrid build does by default, I get drastically slower FPS or tile-FPS. This is in comparison to a classic build using a simple command like:
trtexec --onnx=model.onnx --saveEngine=model.engine --shapes=input:1x3x1080x1920 --fp16 --verbose
This issue is particularly noticeable with more complex and demanding GAN models.
=> if you can tell me how to do this, I can think about adding support for it.
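For what it's worth, the quoted "classic" build is easy to script on its own; a minimal sketch, with the paths, input name and shape simply copied from the quoted command (the open part is how to hand the resulting engine back to vsmlrt.py):

import subprocess

# Sketch: run the quoted "classic" trtexec build from Python.
# Paths, input tensor name and shape are placeholders copied from the quoted command.
subprocess.run(
    [
        "trtexec",
        "--onnx=model.onnx",
        "--saveEngine=model.engine",
        "--shapes=input:1x3x1080x1920",
        "--fp16",
        "--verbose",
    ],
    check=True,
)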
Quote: A third point: I'm not 100% certain, but I believe the author of the vs-mlrt addon managed to enable fp16 for tensorrt_RTX. I'm not sure if they did this by first quantizing the model with NVIDIA's ModelOpt or through a standard onnxconverter-common process. As we know, the standard tensorrt_rtx build doesn't natively support --bf16 and --fp16 flags.
afaik:
v15.13.cu13: latest TensorRT libraries: does not support TRT_RTX at all
v15.13.ort: latest ONNX Runtime libraries: fp16 inference for RIFE v2 and SAFA models, as well as fp32/fp16 inference for some SwinIR models, are not currently working in TRT_RTX.
Hybrid itself has allowed setting FP16 with TRT_RTX in the dev versions for a week or so.
For example, it allows calling SCUNet with:
clip = vsmlrt.SCUNet(clip=clip, model=4, overlap=16, backend=Backend.TRT_RTX(fp16=True, device_id=0, verbose=True, use_cuda_graph=True, num_streams=3, builder_optimization_level=3, engine_folder="J:/TRT"))
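As for the "standard onnxconverter-common process" mentioned in the quote: that conversion is done offline on the ONNX file before the engine is built, roughly like this (just a sketch; the file names are placeholders, and whether the vs-mlrt author actually did it this way is an assumption):

import onnx
from onnxconverter_common import float16  # pip install onnxconverter-common

# Sketch: convert an fp32 ONNX model to fp16 ahead of the engine build.
# File names are placeholders; whether vs-mlrt uses this exact path is an assumption.
model = onnx.load("model.onnx")
model_fp16 = float16.convert_float_to_float16(model)
onnx.save(model_fp16, "model_fp16.onnx")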
Cu Selur
----
Dev versions are in the 'experimental'-folder of my GoogleDrive, which is linked on the download page.