Okay, strangely, commenting out the '@torch.inference_mode()' lines gets the speed up to 3.8 fps again.
Side note: additionally increasing ref_stride to 100 raises the speed to 3.99 fps, while increasing raft_iter to 30 slows the encoding down to 2.02 fps.
Keeping the '@torch.inference_mode()' lines and using ref_stride=100 increases the speed to 5.14 fps.
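For context, a minimal sketch of the pattern being toggled (the function name is illustrative, not the actual vs-propainter code):

import torch

# decorating the per-call function with torch.inference_mode() disables
# autograd tracking during inference; whether that helps or hurts throughput
# evidently depends on the setup (see the timings above)
@torch.inference_mode()
def run_model(model, frames):
    return model(frames)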
Cu Selur
PS: I did a small test with a non-transparent logo (see attachment):
import vapoursynth as vs
core = vs.core

# clip: YUV420P8 source, assumed to be loaded earlier in the script
# change range from limited to full for ProPainter
clip = core.resize.Bicubic(clip, range_in_s="limited", range_s="full")
# set the color range property to PC (full) range
clip = core.std.SetFrameProps(clip=clip, _ColorRange=0)
# convert from YUV420P8 to RGB24 for ProPainter
clip = core.resize.Bicubic(clip=clip, format=vs.RGB24, matrix_in_s="709", range_s="full")
# Define the region coordinates and size
x = 80
y = 70
width = 400
height = 256
# Create a blank clip for the mask of the same size as the original clip
mask = core.std.BlankClip(clip=clip, color=[255, 255, 255])
mask = core.std.SetFrameProps(clip=mask, _ColorRange=1)
# crop the white clip down to the region of interest
mask = core.std.CropRel(mask, left=x, top=y, right=clip.width - (x + width), bottom=clip.height - (y + height))
# pad it back to the original size with black borders, leaving a white rectangle on black
mask = core.std.AddBorders(mask, left=x, top=y, right=clip.width - (x + width), bottom=clip.height - (y + height))
# Convert the mask to grayscale
mask = core.resize.Bicubic(mask, format=vs.GRAY8, matrix_s="709")
# Binarize the mask
mask = core.std.BinarizeMask(mask, 1)
# Crop to the region of interest
cropped_clip = core.std.CropRel(clip, left=x, top=y, right=clip.width - (x + width), bottom=clip.height - (y + height))
# Apply propainter to the cropped region
from vspropainter import propainter
processed_cropped_clip = propainter(cropped_clip, length=250, mask_path="running_car_mask_cropped_400x256.png", device_index=0, enable_fp16=True)
# Pad the processed region back to the original size
padded_clip = core.std.AddBorders(processed_cropped_clip, left=x, top=y, right=clip.width - (x + width), bottom=clip.height - (y + height))
# Merge the processed region back into the original frame
final_clip = core.std.MaskedMerge(clip, padded_clip, mask, planes=[0,1,2])
# undo the range change
final_clip = core.resize.Bicubic(final_clip, range_in_s="full", range_s="limited")
# convert from RGB24 back to YUV420P8
final_clip = core.resize.Bicubic(clip=final_clip, format=vs.YUV420P8, matrix_s="709", range_s="limited")
# set the output frame rate to 25 fps (progressive)
final_clip = core.std.AssumeFPS(clip=final_clip, fpsnum=25, fpsden=1)
final_clip.set_output()
If no bugs are found, this should also become the initial release.
This is the (final) header:
def propainter(
    clip: vs.VideoNode,
    length: int = 100,
    clip_mask: vs.VideoNode = None,
    img_mask_path: str = None,
    mask_dilation: int = 8,
    neighbor_length: int = 10,
    ref_stride: int = 10,
    raft_iter: int = 20,
    mask_region: tuple[int, int, int, int] = None,
    weights_dir: str = model_dir,
    enable_fp16: bool = True,
    device_index: int = 0,
    inference_mode: bool = False
) -> vs.VideoNode:
"""ProPainter: Improving Propagation and Transformer for Video Inpainting
:param clip: Clip to process. Only RGB24 format is supported.
:param length: Sequence length that the model processes (min. 12 frames). High values will
increase the inference speed but will increase also the memory usage. Default: 100
:param clip_mask: Clip mask, must be of the same size and lenght of input clip. Default: None
:param img_mask_path: Path of the mask image: Default: None
:param mask_dilation: Mask dilation for video and flow masking. Default: 8
:param neighbor_length: Length of local neighboring frames. Low values decrease the
memory usage. Default: 10
:param ref_stride: Stride of global reference frames. High values will allow to
reduce the memory usage and increase the inference speed. Default: 10
:param raft_iter: Iterations for RAFT inference. Low values will decrease the inference
speed but could affect the output quality. Default: 20
:param mask_region: Allow to restirct the region of the mask, format: (width, height, left, top).
The region must be big enough to allow the inference. Available only if clip_mask
is specified. Default: None
:param enable_fp16: If True use fp16 (half precision) during inference. Default: fp16 (for RTX30 or above)
:param device_index: Device ordinal of the GPU (if = -1 CPU mode is enabled). Default: 0
:param inference_mode: Enable/Disable torch inference mode. Default: False
"""
I added some interesting parameters.
1) clip_mask: it is now possible to pass a clip mask. You can test it using the provided sample and the code sketched below.
In my tests, using a mask clip seems to improve the speed, even if the mask clip contains only one frame.
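A possible sketch, reusing the region geometry from the crop example above (the mask construction and values are assumptions, not the author's test code):

import vapoursynth as vs
core = vs.core
from vspropainter import propainter

# clip: RGB24 source, prepared as in the script above
white = core.std.BlankClip(clip, width=400, height=256, format=vs.GRAY8, color=255, length=clip.num_frames)
# pad to full frame size: a white rectangle on black marks the inpainting region
mask = core.std.AddBorders(white, left=80, top=70, right=clip.width - 480, bottom=clip.height - 326, color=0)
out = propainter(clip, clip_mask=mask, device_index=0, enable_fp16=True)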
2) mask_region: it is possible to define a smaller area to which the ProPainter mask is applied. You can test it using the provided sample and the code sketched below.
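A possible sketch (the region values are illustrative; per the docstring, the region must be big enough to allow the inference, and clip_mask must be specified):

# same clip and mask as in the clip_mask sketch above
out = propainter(clip, clip_mask=mask, mask_region=(480, 320, 60, 50))  # (width, height, left, top)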
Nice. Not sure whether I'll get around to it today, but tomorrow I'll create a Hybrid dev version with support for ProPainter.
Will send you a link via PM once I'm ready.