Selur's Little Message Board

Full Version: Using mlrt in AviSynth+ on non-RTX GPU
Hello there!

Sometimes I like to give myself little challenges... my latest was looking for an HD version of "Titan A.E.", discovering that no Blu-ray has been released, and trying to upscale it myself from DVD using AviSynth (I still have no knowledge of Python, so it won't be VapourSynth for me).

First rather "conventional" attempt: NNEDI3. Quick and not too dirty.

Second attempt: Looking for specific "Anime upscalers" in the list of external AviSynth plugins. Had a little trouble I tried to solve in the doom9 forum until I was asked bluntly why I don't use mlrt. Well ... because I did not yet know it. So I had to learn about it!

Asd-g released a partial AviSynth+ port of vs-mlrt with a useful archive of models including convenient wrapper scripts with simpler parameter lists. Specifically for upscaling, I found CUGAN and RealESRGAN models to be available.

The ESRGAN model turned out to be quite useful for upscaling cartoon material, and I got it working on my GeForce 1660 Super (6 GB VRAM) using the ncnn Vulkan backend. The ort backend for CUDA was a bit harder to run and caused a lot of errors until I discovered ways to prevent them, e.g. by applying memory-saving parameters. So today I would like to recommend one of these methods for blowing up video by a factor of 4:

Code:
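LwLibavVideoSource("TitanAE.mkv") # output of makeMKV reading the DVD main PGC
ColorMatrix("Rec.601->Rec.709") # SD (Rec.601) to HD (Rec.709) coefficients
ConvertToYV24() # 4:4:4 chroma
Crop(0, 80, -0, -80) # remove 80 lines each from top and bottom
ConvertBits(32) # 32-bit float samples, which the mlrt interface requires
ConvertToPlanarRGB() # planar RGB input for the model
mlrt_RealESRGAN(model=2, backend=["ncnn", "fp16=true"]) # 4x upscale via ncnn (Vulkan)
#mlrt_RealESRGAN(model=2, backend=["ort", """provider="CUDA" """]) # just for documenting, hardly used; not using CUDA explicitly may fall back to using CPU
Spline16Resize(1920, 800) # resize the 4x result to the target frame size
ConvertBits(8) # back to 8 bit
ConvertToYV12() # back to 4:2:0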

For more natural video content, CUGAN may be the more generic model. But it is also a lot more demanding, needs much more memory, and often caused an "inference error" in the ncnn interface. I learned that splitting the video into tiles is very useful to avoid that. A full PAL DVD video source (720×576) required a set of parameters like this:

Code:
mlrt_CUGAN(noise=2, scale=2, tiles=3, overlap_w=24, backend=["ncnn", "fp16=true"])

At least I hope this is a sensible combination: splitting the width of 720 pixels into 3 tiles of 256 pixels with 24 pixels of overlap at each seam (3 × 256 − 2 × 24 = 720). It may be wasteful; I am not sure which tile widths are optimal.

In general I am not overly enthusiastic about these AI models. They look quite artificial to me, and they denoise and flatten a lot. I found that I prefer a 50:50 merge with NNEDI3 and a final addition of some subtle noise, because it looks more credible.
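
A minimal sketch of such a 50:50 merge, for illustration only. It assumes the NNEDI3 branch is done with nnedi3_rpow2, that the noise comes from the AddGrainC plugin, and that both branches are brought back to 8-bit YV12 before merging; none of these choices is stated above, and the grain strength is an arbitrary starting value.

Code:
src = LwLibavVideoSource("TitanAE.mkv") # 720x576 PAL DVD source (YV12)
# AI branch: 2x CUGAN upscale in 32-bit float planar RGB, then back to 8-bit YV12
ai = src.ConvertToYV24().ConvertBits(32).ConvertToPlanarRGB()
ai = ai.mlrt_CUGAN(noise=2, scale=2, tiles=3, overlap_w=24, backend=["ncnn", "fp16=true"])
ai = ai.ConvertBits(8).ConvertToYV12()
# NNEDI3 branch (assumption): 2x upscale, cshift corrects the centre shift of nnedi3_rpow2
nn = src.nnedi3_rpow2(rfactor=2, cshift="Spline36Resize")
# Both clips are now 1440x1152 YV12, so they can be averaged 50:50
Merge(ai, nn, 0.5)
AddGrainC(var=1.0) # subtle luma grain; the strength is a guess, tune to taste

Merge() needs both clips in the same format and size, which is why the AI branch is converted back before merging rather than after.
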
Given that your source is already good quality and is anime/cartoon, you might want to try some of the models from https://openmodeldb.info/ (compact models if possible, since they are faster).
(Using chaiNNer you can convert most of them to ONNX.)
I often get better results using one of those, followed by NNEDI3 (+ maybe another model or filtering) + some line darkening and/or luma sharpening.
For noisy cartoon/anime I would recommend first doing some cleanup with AviSynth, then (masked) SCUNet, and then the above.
(Side note: I would approach it a bit differently with VapourSynth.)

About 'natural' content: so far I'm no fan of machine learning upscaling.
For general filtering, machine learning can help a lot, but I would always recommend keeping the possibility of 'masked filtering' in mind when using machine learning based filters.
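
One possible reading of 'masked filtering', sketched with MaskTools2 (an assumption; no specific plugin is named above). Blur() is only a stand-in for the machine learning filter, and the edge thresholds are arbitrary:

Code:
src = LwLibavVideoSource("TitanAE.mkv") # 8-bit YV12 source
filtered = src.Blur(1.0) # placeholder for the machine learning filter
# Edge mask: white where the source has edges, slightly widened
edges = src.mt_edge(mode="sobel", thY1=20, thY2=40).mt_expand()
# Take the source where the mask is white, the filtered clip elsewhere
mt_merge(filtered, src, edges, luma=true)

Swapping the first two arguments of mt_merge would do the opposite and filter only the edges.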

---
Code:
ConvertBits(32)
ConvertToPlanarRGB()
mlrt_RealESRGAN(model=2, backend=["ncnn", "fp16=true"])
Instead of 'ConvertBits(32)', 'ConvertBits(16)' should be fine when using fp16, but it shouldn't really make a speed difference.

Cu Selur

Ps.: Welcome to the forum.
Quote: Instead of 'ConvertBits(32)', 'ConvertBits(16)' should be fine when using fp16

No, the interface explicitly requires 32-bit floating point colour components; it does not accept 16-bit integer, even though it may convert the data down internally later.
Ah, good to know, I mainly use vs-mlrt.