Quote: First, could you add support for dynamic TensorRT models? Specifically, the ability to include min, opt, and max shapes during the model building process.
Assuming you are referring to vs-mlrt's TRT and TRT_RTX backends, both of which expose these parameters:
Backend.TRT:
static_shape: bool = True
min_shapes: typing.Tuple[int, int] = (0, 0)
opt_shapes: typing.Optional[typing.Tuple[int, int]] = None
max_shapes: typing.Optional[typing.Tuple[int, int]] = None

Backend.TRT_RTX:
static_shape: bool = True
min_shapes: typing.Tuple[int, int] = (0, 0)
max_shapes: typing.Optional[typing.Tuple[int, int]] = None
opt_shapes: typing.Optional[typing.Tuple[int, int]] = None
a. What are the allowed values for those tuples?
b. Are these independent of tiling settings?
c. What are the defaults for min_shapes and opt_shapes?
=> if you can answer these, I can look into adding support for non-static shapes for TRT & TRT_RTX, roughly along the lines of the sketch below.
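Just a sketch of what I have in mind; I'm assuming the tuples are (width, height) pairs and that static_shape=False is what switches the engine build to dynamic shapes (which is exactly what questions a-c are about), and all concrete values are placeholders:

import vapoursynth as vs
import vsmlrt
from vsmlrt import Backend

core = vs.core
clip = core.std.BlankClip(format=vs.RGBS, width=1920, height=1080)  # stand-in for the real source clip

# Assumed dynamic-shape configuration; the meaning of the shape tuples and of
# static_shape is part of what is being asked above.
backend = Backend.TRT_RTX(
    fp16=True,
    device_id=0,
    static_shape=False,       # do not bake a fixed input shape into the engine (assumption)
    min_shapes=(64, 64),      # smallest tile/frame the engine should accept
    opt_shapes=(1920, 1080),  # shape the engine is optimized for
    max_shapes=(1920, 1080),  # largest tile/frame the engine should accept
)

clip = vsmlrt.SCUNet(clip=clip, model=4, overlap=16, backend=backend)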
Quote: Second, would it be possible to not disable optimizations by default (like -Jitt cudnn...), or alternatively, to allow the user to select a "classic" build using only a flag like --fp16, --bf16, or --fp32? This would let trtexec find the best optimizations on its own.
I don't see how I could do that while using vsmlrt.py and its convenience wrappers.
Quote: Here's why I'm asking: for some models, when I build them the same way the Hybrid build does by default, I get drastically slower FPS or tile-FPS. This is in comparison to a classic build using a simple command like:
trtexec --onnx=model.onnx --saveEngine=model.engine --shapes=input:1x3x1080x1920 --fp16 --verbose
This issue is particularly noticeable with more complex and demanding GAN models.
=> if you can tell me how to do this, I can think about adding support for it.
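For what it's worth, the quoted "classic" build is easy to script on its own; a minimal sketch, with the paths, input name and shape simply copied from the quoted command (the open part is how to hand the resulting engine back to vsmlrt.py):

import subprocess

# Sketch: run the quoted "classic" trtexec build from Python.
# Paths, input tensor name and shape are placeholders copied from the quoted command.
subprocess.run(
    [
        "trtexec",
        "--onnx=model.onnx",
        "--saveEngine=model.engine",
        "--shapes=input:1x3x1080x1920",
        "--fp16",
        "--verbose",
    ],
    check=True,
)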
Quote: A third point: I'm not 100% certain, but I believe the author of the vs-mlrt addon managed to enable fp16 for tensorrt_RTX. I'm not sure if they did this by first quantizing the model with NVIDIA's ModelOpt or through a standard onnxconverter-common process. As we know, the standard tensorrt_rtx build doesn't natively support --bf16 and --fp16 flags.
afaik:
v15.13.cu13: latest TensorRT libraries: does not support TRT_RTX at all
v15.13.ort: latest ONNX Runtime libraries: fp16 inference for RIFE v2 and SAFA models, as well as fp32/fp16 inference for some SwinIR models, are not currently working in TRT_RTX.
Hybrid itself has allowed setting FP16 with TRT_RTX in the dev versions for a week or so.
For example, it allows calling SCUNet with:
clip = vsmlrt.SCUNet(clip=clip, model=4, overlap=16, backend=Backend.TRT_RTX(fp16=True, device_id=0, verbose=True, use_cuda_graph=True, num_streams=3, builder_optimization_level=3, engine_folder="J:/TRT"))
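As for the "standard onnxconverter-common process" mentioned in the quote: that conversion is done offline on the ONNX file before the engine is built, roughly like this (just a sketch; the file names are placeholders, and whether the vs-mlrt author actually did it this way is an assumption):

import onnx
from onnxconverter_common import float16  # pip install onnxconverter-common

# Sketch: convert an fp32 ONNX model to fp16 ahead of the engine build.
# File names are placeholders; whether vs-mlrt uses this exact path is an assumption.
model = onnx.load("model.onnx")
model_fp16 = float16.convert_float_to_float16(model)
onnx.save(model_fp16, "model_fp16.onnx")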
Cu Selur
----
Dev versions are in the 'experimental'-folder of my GoogleDrive, which is linked on the download page.