Selur's Little Message Board

Full Version: Deoldify Vapoursynth filter
Hello Selur,

  sorry to bother you again about this filter, but I just released a new version: https://github.com/dan64/vs-deoldify/rel...tag/v1.1.5

  Following the Model Comparison analysis (concluded today) included in the README, which confirmed the positive effect of model combination, I changed the default values of the parameters used by the function ddeoldify(). Now dd_weight is set to 0.5 by default. Please note that in this version of DDColor I changed the range of values of dd_strength so that it is equivalent to render_factor in DeOldify (both parameters now have the default value 24). You should update the GUI to use the new defaults; the parameter dd_strength should be named strength, or even better render_factor, and not input size as it is now. In the new version the relationship between render_factor and input size is the following:

Code:
input_size = render_factor * 16

In effect, both parameters are related to the resolution used to perform the inference: bigger values imply that the inference uses a bigger matrix, improving the quality of the result but slowing down the encoding process. A good range for these parameters, which does not slow down the encoding too much (of course it also depends on the available GPU), is 23-33.
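To make the relationship concrete, here is a tiny helper (the function name is mine, not part of vs-deoldify) that applies the formula above:

```python
def render_factor_to_input_size(render_factor: int) -> int:
    # Relationship from v1.1.5: input_size = render_factor * 16
    return render_factor * 16

# The new default render_factor = 24 corresponds to an input size of 384,
# and the upper end of the suggested range (33) corresponds to 528.
```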

Thanks,
Dan
Yeah, got a notification, but I'm too groggy to look into it atm (long, stressful work day and a tiring way home). I'll look at it tomorrow after work.

Cu Selur
(06.03.2024, 17:12)zspeciman Wrote: @Dan64, that was a nice test you ran.  I was colorizing some photos and videos as well to see the difference.  DDColor photo images are stunning, but in the videos, in some parts it works very well (more robust color than standalone DeOldify) and in other parts it looks more like 60s psychedelic colors.  The merge concept is a brilliant idea to combine the stability of DeOldify with the color pop of DDColor.

I also observed this effect; I noted that it happens more frequently on dark scenes. I'm working on a solution to remove it; an adaptive merge could probably solve the problem.

Dan

(06.03.2024, 17:12)zspeciman Wrote: 1. In DDColor, what is Input size about?  FP16?  Artistic Model vs ModelScope?
2. I wasn't sure what the Streams setting was about either, but when I changed 1 to 4, the video had several corrupted images, so I stuck with 1
3. In DeOldify with Simple Merge enabled, in the DDColor settings on the right, what is that input size about?

Answers:

1. FP16 reduces the size of the data used during inference, thus speeding it up; Artistic is better than ModelScope (see my models comparison)
2. yes, it is better to set streams=1
3. See my previous post: https://forum.selur.net/thread-3595-post...l#pid21545


Dan
@Dan64: sent you a link to an adjusted Hybrid dev version.
(not updating the torch addon until the next public release)

Cu Selur
Thanks for the new release, I updated Hybrid's screenshot in the README.md

Dan
You might note somewhere which version works with which Hybrid release, since the current public release does not work with v1.1.5 as expected. Also, the current torch addon does not cover v1.1.5.

Cu Selur
Hello Selur,

  I found a way to stabilize DDColor, with a method that I called AdaptiveMerge; the problem is that it is too slow.
  Using this method the encoding speed is reduced by 50%, even though the method is very simple; I was unable to find a way to speed up the computation.
  I don't know VapourSynth very well and the available documentation is "poor". Maybe there is a way to get a faster encoding...
  The current code is the following:

  
Code:
def AdaptiveMerge3(clipa: vs.VideoNode = None, clipb: vs.VideoNode = None, clipb_weight: float = 0.0) -> vs.VideoNode:
    # VapourSynth version
    def merge_frame(n, f):
        # single-frame clips for frame n
        clip1 = clipa[n]
        clip2 = clipb[n]
        # measure the average luma of the DDColor frame
        clip2_yuv = clip2.resize.Bicubic(format=vs.YUV444PS, matrix_s="709", range_s="limited")
        clip2_avg_y = vs.core.std.PlaneStats(clip2_yuv, plane=0)
        luma = clip2_avg_y.get_frame(0).props['PlaneStatsAverage']
        #vs.core.log_message(2, "Luma(" + str(n) + ") = " + str(luma))
        # scale the merge weight by brightness, with a minimum floor
        brightness = min(1.5 * luma, 1)
        w = max(clipb_weight * brightness, 0.15)
        clip3 = vs.core.std.Merge(clip1, clip2, weight=w)
        # building a filter graph and calling get_frame() for every frame
        # inside the selector is what makes this slow
        return clip3.get_frame(0)
    clipm = clipa.std.ModifyFrame(clips=clipa, selector=merge_frame)
    return clipm

  Since DDColor shows a psychedelic effect on dark scenes (try to use it with the attached video), the adaptive merge weights the merge_weight parameter by the brightness of the image.
  In order not to penalize DDColor too much, I multiply this value by 1.2.
  For example, if the brightness is 45% and the merge_weight is 50%, the effective weight used in the merge is 50% * (1.2 * 45%) = 27%.
  This computation must be executed at frame level.
  Do you have any idea how it would be possible to speed up the function AdaptiveMerge3?
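For reference, the weighting described above can be sketched as a plain function (the function and parameter names are mine; the 1.2 boost and 0.15 floor are taken from the posted code and example):

```python
def adaptive_weight(luma: float, merge_weight: float,
                    boost: float = 1.2, floor: float = 0.15) -> float:
    # Scale the merge weight by the frame brightness so that DDColor
    # contributes less on dark frames (where it tends to misbehave),
    # but never drops below a minimum floor.
    brightness = min(boost * luma, 1.0)
    return max(merge_weight * brightness, floor)
```

With luma = 0.45 and merge_weight = 0.5 this reproduces the 27% from the example; a fully dark frame falls back to the 0.15 floor.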

Thanks,
Dan
Got a few ideas how to speed that up, not sure whether they will work.
Will look at it after work.
What formats are the clipa and clipb you feed to AdaptiveMerge3? Some more context on how you use it would be helpful.

Cu Selur
I ended up using Pillow and OpenCV.
This version is only 5% slower (input format of the clips is RGB24)

Code:
import vapoursynth as vs
import numpy as np
import cv2
from PIL import Image

def AdaptiveMerge4(clipa: vs.VideoNode = None, clipb: vs.VideoNode = None, clipb_weight: float = 0.0) -> vs.VideoNode:
    # Python version with constants hard-coded
    def merge_frame(n, f):
        # frame_to_image()/image_to_frame() are helpers converting between
        # VapourSynth frames and PIL images
        img1 = frame_to_image(f[0])
        img2 = frame_to_image(f[1])
        luma = get_pil_brightness(img2)
        #vs.core.log_message(2, "Luma(" + str(n) + ") = " + str(luma))
        brightness = min(1.2 * luma, 1)
        w = max(0.5 * brightness, 0.15)
        img_m = Image.blend(img1, img2, w)
        return image_to_frame(img_m, f[0].copy())
    clipm = clipa.std.ModifyFrame(clips=[clipa, clipb], selector=merge_frame)
    return clipm


def get_pil_brightness(img: Image) -> float:
    # average of the HSV "V" channel, normalized to [0, 1]
    img_np = np.asarray(img)
    hsv = cv2.cvtColor(img_np, cv2.COLOR_RGB2HSV)
    brightness = np.mean(hsv[:, :, 2])
    return (brightness / 255)
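A possible micro-optimization for get_pil_brightness (a sketch, not tested against the full pipeline): since the HSV "V" channel is just max(R, G, B) per pixel, the cv2.cvtColor call can be replaced by a single NumPy reduction.

```python
import numpy as np

def get_np_brightness(img_np: np.ndarray) -> float:
    # HSV "value" is max(R, G, B) per pixel, so the full RGB->HSV
    # conversion can be skipped when only brightness is needed.
    return float(np.max(img_np, axis=2).mean() / 255.0)
```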

But I'm not happy with this solution, because it is only a patch for DDColor.
I'm thinking of developing a Temporal Chroma Smoother for DDColor, but I still have to find a way to implement it.

Thanks,
Dan
Quote:This version is only 5% slower (input format of the clips is RGB24)
Slower compared to what?