Hello there!

Sometimes I like to give myself little challenges... my latest was looking for an HD version of "Titan A.E.", discovering that no Blu-ray was ever released, and trying to upscale it myself from DVD using AviSynth (I still have no knowledge of Python, so it won't be VapourSynth for me).
First rather "conventional" attempt: NNEDI3. Quick and not too dirty.
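For reference, that NNEDI3 attempt was essentially a one-liner, assuming the nnedi3 plugin and the nnedi3_rpow2 wrapper script are available (the cropping and parameters here are just an illustration, not necessarily my final script):
Code:
LWLibavVideoSource("TitanAE.mkv")
Crop(0, 80, -0, -80)  # remove the letterbox bars
nnedi3_rpow2(rfactor=4, cshift="Spline16Resize", fwidth=1920, fheight=800)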
Second attempt: looking for specific "anime upscalers" in the list of external AviSynth plugins. I ran into a little trouble which I tried to solve in the doom9 forum, until I was bluntly asked why I wasn't using mlrt. Well ... because I did not know it yet. So I had to learn about it!
Asd-g released a partial AviSynth+ port of vs-mlrt, along with a useful archive of models and convenient wrapper scripts with simpler parameter lists. Specifically for upscaling, I found CUGAN and RealESRGAN models to be available.
The ESRGAN model turned out to be quite useful for upscaling cartoon material, and I got it working on my GeForce 1660 Super (6 GB VRAM) using the ncnn Vulkan backend. The ort backend for CUDA was harder to run and caused a lot of errors, until I discovered methods to prevent them, e.g. by applying memory-saving parameters. So today I would like to recommend one of these methods to blow up video by a factor of 4:
Code:
LWLibavVideoSource("TitanAE.mkv")  # output of MakeMKV reading the DVD main PGC
ColorMatrix("Rec.601->Rec.709")    # SD->HD matrix; this plugin is 8-bit only, so apply it before ConvertBits
ConvertToYV24()                    # 4:4:4 first, to keep chroma resolution through the RGB conversion
Crop(0, 80, -0, -80)               # remove the letterbox bars (720x416 remains)
ConvertBits(32)                    # the mlrt filters expect 32-bit float ...
ConvertToPlanarRGB()               # ... planar RGB input
mlrt_RealESRGAN(model=2, backend=["ncnn", "fp16=true"])
#mlrt_RealESRGAN(model=2, backend=["ort", """provider="CUDA" """]) # just for documentation, hardly used; without an explicit CUDA provider, ort may fall back to the CPU
Spline16Resize(1920, 800)          # downscale the 4x result (2880x1664) to the HD target size
ConvertBits(8)
ConvertToYV12()                    # back to 8-bit 4:2:0 for encoding
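If the ort/CUDA backend still runs out of memory, tiling should help there too. This is only a hedged sketch: I am assuming the mlrt_RealESRGAN wrapper accepts the same tiles/overlap_w parameters as the mlrt_CUGAN wrapper; check the parameter list of your version of the scripts.
Code:
# assumption: tiles and overlap_w are also accepted here, as in mlrt_CUGAN
mlrt_RealESRGAN(model=2, tiles=2, overlap_w=24, backend=["ort", """provider="CUDA" """, "fp16=true"])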
For more natural video content, CUGAN may be the more generic model. But it is also a lot more demanding: it needs much more memory and often caused an "inference error" in the ncnn interface. I had to learn that splitting the frame into tiles is very useful to avoid that. A full PAL DVD video source (720×576) required a set of parameters like this:
Code:
mlrt_CUGAN(noise=2, scale=2, tiles=3, overlap_w=24, backend=["ncnn", "fp16=true"])
At least I hope this is a sensible combination: it splits the 720-pixel width into 3 tiles of 256 pixels each, with 24 pixels of overlap (3 × 256 − 2 × 24 = 720). It may be wasteful; I am not sure which tile widths are optimal.
In general I am not overly enthusiastic about these AI models. Their results look quite artificial to me; they denoise and flatten a lot. I found that I prefer a 50:50 merge with the NNEDI3 result, plus a final addition of some subtle noise, to look more credible.
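Put together, that final combination could look like the sketch below. It assumes nnedi3_rpow2 and the AddGrain plugin are loaded; I have omitted the ColorMatrix step for brevity, and the grain strength of 1.5 is just a guess, not a tuned value.
Code:
src = LWLibavVideoSource("TitanAE.mkv").Crop(0, 80, -0, -80)
nn  = src.nnedi3_rpow2(rfactor=4, cshift="Spline16Resize", fwidth=1920, fheight=800)
esr = src.ConvertToYV24().ConvertBits(32).ConvertToPlanarRGB() \
         .mlrt_RealESRGAN(model=2, backend=["ncnn", "fp16=true"]) \
         .Spline16Resize(1920, 800).ConvertBits(8).ConvertToYV12()
Merge(esr, nn, 0.5)  # 50:50 blend of both upscales (same size and colorspace)
AddGrainC(1.5)       # add some subtle grain to look more credible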