I released the new RC3 (attached).
If no bugs are found, this will also become the initial release.
This is the (final) header:
Code:
def propainter(
clip: vs.VideoNode,
length: int = 100,
clip_mask: vs.VideoNode = None,
img_mask_path: str = None,
mask_dilation: int = 8,
neighbor_length: int = 10,
ref_stride: int = 10,
raft_iter: int = 20,
mask_region: tuple[int, int, int, int] = None,
weights_dir: str = model_dir,
enable_fp16: bool = True,
device_index: int = 0,
inference_mode: bool = False
) -> vs.VideoNode:
"""ProPainter: Improving Propagation and Transformer for Video Inpainting
:param clip: Clip to process. Only RGB24 format is supported.
:param length: Sequence length that the model processes (min. 12 frames). Higher values will
increase the inference speed but will also increase the memory usage. Default: 100
:param clip_mask: Clip mask, must be the same size and length as the input clip. Default: None
:param img_mask_path: Path of the mask image. Default: None
:param mask_dilation: Mask dilation for video and flow masking. Default: 8
:param neighbor_length: Length of local neighboring frames. Low values decrease the
memory usage. Default: 10
:param ref_stride: Stride of global reference frames. Higher values reduce the
memory usage and increase the inference speed. Default: 10
:param raft_iter: Iterations for RAFT inference. Lower values will increase the inference
speed but could affect the output quality. Default: 20
:param mask_region: Allows restricting the region of the mask, format: (width, height, left, top).
The region must be big enough to allow the inference. Available only if clip_mask
is specified. Default: None
:param enable_fp16: If True, use fp16 (half precision) during inference, recommended
for RTX 30 series or above. Default: True
:param device_index: Device ordinal of the GPU (if -1, CPU mode is enabled). Default: 0
:param inference_mode: Enable/Disable torch inference mode. Default: False
"""
I added some interesting parameters.
1) clip_mask: now it is possible to pass a clip mask. You can test it using the provided sample and the following code:
Code:
# build clip mask
clip_mask = core.imwri.Read(["running_car_mask.png"])
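# repeat the single mask frame to match the source clip length, then set the frame rate (25 fps here) to match the source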
clip_mask = core.std.Loop(clip=clip_mask, times=clip.num_frames)
clip_mask = core.std.AssumeFPS(clip=clip_mask, fpsnum=25, fpsden=1)
# remove mask using propainter
from vspropainter import propainter
clip = propainter(clip, length=96, clip_mask=clip_mask)
In my tests it seems that using a clip mask, even if it contains only one frame, improves the speed.
2) mask_region: it is possible to define a smaller region on which to apply the ProPainter mask. You can test it using the provided sample and the following code:
Code:
# build clip mask
clip_mask = core.imwri.Read(["running_car_mask.png"])
clip_mask = core.std.Loop(clip=clip_mask, times=clip.num_frames)
clip_mask = core.std.AssumeFPS(clip=clip_mask, fpsnum=25, fpsden=1)
# remove mask using propainter
from vspropainter import propainter
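# mask_region is (width, height, left, top): trim 68 px from each side and 28 px from top/bottom (values assume the 596x336 sample)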
clip = propainter(clip, length=96, clip_mask=clip_mask, mask_region=(596-68*2, 336-28*2, 68, 28))
3) inference_mode: if True, torch inference mode will be enabled. I haven't noticed any speed increase or decrease by enabling it.
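For reference, this is just torch's torch.inference_mode() context manager; conceptually it wraps the model call, roughly like the following sketch (not the actual vspropainter code, "model", "frames" and "masks" are only illustrative):
Code:
import torch

# inference_mode disables autograd tracking entirely, similar to no_grad but stricter
with torch.inference_mode():
    output = model(frames, masks)  # hypothetical model call, for illustration only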
I was able to significantly speed up the inference using this code:
Code:
# build clip mask
clip_mask = core.imwri.Read(["running_car_mask.png"])
clip_mask = core.std.Loop(clip=clip_mask, times=clip.num_frames)
clip_mask = core.std.AssumeFPS(clip=clip_mask, fpsnum=25, fpsden=1)
# remove mask using propainter
from vspropainter import propainter
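# a larger ref_stride and fewer RAFT iterations trade some quality for speed; mask_region restricts processing to the area around the mask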
clip = propainter(clip, length=96, clip_mask=clip_mask, mask_region=(596-88*2, 336-38*2, 88, 38), ref_stride=25, raft_iter=15, inference_mode=True)
have fun!
Dan