Selur's Little Message Board
Deoldify Vapoursynth filter - Printable Version




RE: Deoldify Vapoursynth filter - Selur - 01.03.2024

using
C:\Users\Selur>"F:\Hybrid\64bit\Vapoursynth\vspipe.exe" --progress "J:\tmp\encodingTempSynthSkript_2024-03-01@20_35_55_0010_0.vpy" NUL -c y4m
I see:
Information: Note: NumExpr detected 32 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
Note: NumExpr detected 32 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
Information: NumExpr defaulting to 8 threads.
NumExpr defaulting to 8 threads.
Warning: F:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=ResNet101_Weights.IMAGENET1K_V1`. You can also use `weights=ResNet101_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)

Warning: F:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch\nn\utils\weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")

Script evaluation done in 10.00 seconds
Output 429 frames in 33.14 seconds (12.95 fps)
These python outputs might be the root of the problem,...


RE: Deoldify Vapoursynth filter - Dan64 - 01.03.2024

I performed another test

Test 3: ffmpeg with Windows pipe using raw output instead of y4m

D:\PProjects\vs-deoldify_dev>"D:\Programs\Hybrid\64bit\Vapoursynth\ffmpeg.exe" -f vapoursynth -i "D:\PProjects\vs-deoldify_dev\encoding.vpy" -vcodec rawvideo -pixel_format yuv420p -strict -1 -f rawvideo -   | "D:\Programs\Hybrid\64bit\x265.exe" --input-res 1280x692 --fps 23.976 - -o "D:\PProjects\vs-deoldify_dev\VideoTest1_720p-1.265"
ffmpeg version N-113112-g548ceb9b8f-gf5f414d9c4+3 Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 13.2.0 (Rev3, Built by MSYS2 project)
  configuration:  --pkg-config=pkgconf --cc='ccache gcc' --cxx='ccache g++' --ld='ccache g++' --extra-cxxflags=-fpermissive --extra-cflags=-Wno-int-conversion --disable-autodetect --enable-amf --enable-bzlib --enable-cuda --enable-cuvid --enable-d3d11va --enable-dxva2 --enable-iconv --enable-lzma --enable-nvenc --enable-zlib --enable-sdl2 --enable-ffnvcodec --enable-nvdec --enable-cuda-llvm --enable-libmp3lame --enable-libopus --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-libdav1d --enable-libaom --disable-debug --enable-fontconfig --enable-libass --enable-libbluray --enable-libfreetype --enable-libmfx --enable-libmysofa --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvo-amrwbenc --enable-libwebp --enable-libxml2 --enable-libzimg --enable-libshine --enable-gpl --enable-avisynth --enable-libxvid --enable-libopenmpt --enable-version3 --enable-librav1e --enable-libsrt --enable-libgsm --enable-libvmaf --enable-libsvtav1 --enable-openal --enable-vapoursynth --enable-opencl --enable-opengl --enable-mbedtls --extra-cflags=-DLIBTWOLAME_STATIC --extra-libs=-lstdc++ --extra-cflags=-DLIBXML_STATIC --extra-libs=-liconv --disable-w32threads --extra-cflags=-DAL_LIBTYPE_STATIC --extra-cflags='-IH:/mabs/local64/include' --extra-cflags='-IH:/mabs/local64/include/AL'
  libavutil      58. 36.100 / 58. 36.100
  libavcodec     60. 36.100 / 60. 36.100
  libavformat    60. 20.100 / 60. 20.100
  libavdevice    60.  4.100 / 60.  4.100
  libavfilter     9. 14.101 /  9. 14.101
  libswscale      7.  6.100 /  7.  6.100
  libswresample   4. 13.100 /  4. 13.100
  libpostproc    57.  4.100 / 57.  4.100
yuv  [info]: 1280x692 fps 23976/1000 i420p8 unknown frame count
raw  [info]: output file: D:\PProjects\vs-deoldify_dev\VideoTest1_720p-1.265
x265 [info]: HEVC encoder version 3.5+115-88fd6d3ad
x265 [info]: build info [Windows][GCC 13.2.0][64 bit] 8bit+10bit+12bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
x265 [info]: Main profile, Level-3.1 (Main tier)
x265 [info]: Thread pool created using 20 threads
x265 [info]: Slices                              : 1
x265 [info]: frame threads / pool features       : 4 / wpp(11 rows)
x265 [warning]: Source height < 720p; disabling lookahead-slices
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
x265 [info]: ME / range / subpel / merge         : hex / 57 / 2 / 3
x265 [info]: Keyframe min / max / scenecut / bias  : 23 / 250 / 40 / 5.00
x265 [info]: Lookahead / bframes / badapt        : 20 / 4 / 2
x265 [info]: b-pyramid / weightp / weightb       : 1 / 1 / 0
x265 [info]: References / ref-limit  cu / depth  : 3 / off / on
x265 [info]: AQ: mode / str / qg-size / cu-tree  : 2 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress            : CRF-28.0 / 0.60
x265 [info]: tools: rd=3 psy-rd=2.00 early-skip rskip mode=1 signhide tmvp
x265 [info]: tools: b-intra strong-intra-smoothing deblock sao

Input #0, vapoursynth, from 'D:\PProjects\vs-deoldify_dev\encoding.vpy':
  Duration: 00:01:48.19, start: 0.000000, bitrate: 0 kb/s
  Stream #0:0: Video: wrapped_avframe, yuv420p10le, 1280x692, 23.98 tbr, 23.98 tbn
Stream mapping:
  Stream #0:0 -> #0:0 (wrapped_avframe (native) -> rawvideo (native))
Press [q] to stop, [?] for help
Output #0, rawvideo, to 'pipe:':
  Metadata:
    encoder         : Lavf60.20.100
  Stream #0:0: Video: rawvideo (Y3[11][10] / 0xA0B3359), yuv420p10le(bt709, progressive), 1280x692, q=2-31, 318555 kb/s, 23.98 fps, 23.98 tbn
      Metadata:
        encoder         : Lavc60.36.100 rawvideo
[out#0/rawvideo @ 000001ebba96d540] video:6731430kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000000%
frame= 2594 fps=6.1 q=-0.0 Lsize= 6731430kB time=00:01:48.19 bitrate=509688.1kbits/s speed=0.254x
x265 [info]: frame I:     24, Avg QP:34.23  kb/s: 32112.72
x265 [info]: frame P:   2205, Avg QP:36.14  kb/s: 13681.62
x265 [info]: frame B:   2959, Avg QP:38.59  kb/s: 16072.83
x265 [info]: Weighted P-Frames: Y:3.4% UV:3.3%

encoded 5188 frames in 446.18s (11.63 fps), 15130.72 kb/s, Avg QP:37.53

This time the encoding worked. It seems that the problem is limited to the "yuv4mpegpipe"/"y4m" format.

Dan


RE: Deoldify Vapoursynth filter - Selur - 01.03.2024

did you check the output?
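
The logs above give a concrete reason to check: ffmpeg wrote 2594 frames as yuv420p10le, while x265 was given no bit depth, reported i420p8 input, and encoded 5188 frames. Below is a minimal sketch of the arithmetic, assuming 10-bit samples are stored as 2 bytes each, 4:2:0 subsampling, and that ffmpeg's "kB" means 1024 bytes. The function name is made up for illustration.

```python
# Frame-size arithmetic for the raw pipe in Test 3 (numbers copied from the logs).
WIDTH, HEIGHT = 1280, 692
FRAMES_WRITTEN = 2594            # ffmpeg: "frame= 2594"
BYTES_WRITTEN = 6731430 * 1024   # ffmpeg: "video:6731430kB", assuming kB = 1024 bytes


def frame_bytes(width, height, bytes_per_sample):
    """Size of one 4:2:0 frame: full-size luma plus two quarter-size chroma planes."""
    return width * height * 3 // 2 * bytes_per_sample


size_10bit = frame_bytes(WIDTH, HEIGHT, 2)  # 10-bit samples stored in 16 bits
size_8bit = frame_bytes(WIDTH, HEIGHT, 1)   # what x265 assumes for i420p8

print(FRAMES_WRITTEN * size_10bit == BYTES_WRITTEN)  # True
print(BYTES_WRITTEN // size_8bit)                     # 5188
```

If those assumptions hold, x265 read every 10-bit frame as two 8-bit frames, so the 5188 frames and the fps it reports would not describe a valid encode. Passing --input-depth 10 to x265 would be the usual way to tell it the real bit depth.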


RE: Deoldify Vapoursynth filter - Selur - 01.03.2024

NVEncC reports:
y4m: failed to parse y4m header.
failed to initialize file reader(s).
Failed to open input file.



RE: Deoldify Vapoursynth filter - Dan64 - 01.03.2024

(01.03.2024, 21:46)Selur Wrote: These python outputs might be the root of the problem,...

I will release a version in which these warnings are removed.
It is enough to add the following code to __init__.py:

import warnings
warnings.filterwarnings("ignore", category=UserWarning, message="Arguments other than a weight enum or `None`.*?")
warnings.filterwarnings("ignore", category=UserWarning, message="torch.nn.utils.weight_norm is deprecated.*?")
I already tested it and the warnings are fully suppressed.
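
To see what these filters do in isolation, here is a small sketch that needs neither torch nor torchvision. It emits stand-in warnings with the same opening text and counts how many get through. The helper name and the shortened message texts are made up for illustration; only the two patterns come from the post.

```python
import warnings

# The two patterns from the post; each is a regex matched against the start
# of the warning text.
PATTERNS = [
    r"Arguments other than a weight enum or `None`.*?",
    r"torch.nn.utils.weight_norm is deprecated.*?",
]

# Stand-in messages: two that mimic the library warnings, one unrelated.
MESSAGES = [
    "Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13",
    "torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.",
    "some unrelated warning",
]


def count_emitted(messages, install_filters):
    """Emit each message as a UserWarning and return how many were not ignored."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        if install_filters:
            for pattern in PATTERNS:
                # Inserted at the front of the filter list, so it wins over "always".
                warnings.filterwarnings("ignore", category=UserWarning, message=pattern)
        for text in messages:
            warnings.warn(text, UserWarning)
        return len(caught)


print(count_emitted(MESSAGES, install_filters=False))  # 3
print(count_emitted(MESSAGES, install_filters=True))   # 1
```

Only the unrelated warning survives, so other UserWarnings stay visible. The filters have to be installed before the code that triggers the warnings runs.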

But the encoding problem is due to "y4m" formatting.

Dan


RE: Deoldify Vapoursynth filter - Selur - 01.03.2024

Yeah, but without the lines
from vsdeoldify import ddeoldify
clip = ddeoldify(clip=clip, model=0)
the script, with exactly the same color conversions, has no such problem, so it's not a general issue but something triggered by vsdeoldify.
The output of vspipe --info "path to file" also looks correct.


RE: Deoldify Vapoursynth filter - Selur - 01.03.2024

using:
"F:\Hybrid\64bit\Vapoursynth\vspipe.exe" "J:\tmp\encodingTempSynthSkript_2024-03-01@20_35_55_0010_0.vpy" g:\test.y4m -c y4m
and looking at the start of the y4m shows no issues and the file is playable. (and contains deoldify output)
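
For anyone who wants to repeat that check, here is a rough sketch of inspecting the start of a y4m stream. It relies only on the basic YUV4MPEG2 layout: the signature "YUV4MPEG2", then space-separated tagged fields, ended by a newline. The function name and the sample header are made up for illustration, and NVEncC's own parser may well be stricter.

```python
def parse_y4m_header(data: bytes) -> dict:
    """Parse the first line of a YUV4MPEG2 stream into {tag letter: value}."""
    magic = b"YUV4MPEG2 "
    if not data.startswith(magic):
        # Anything written to stdout before the header (e.g. a log line) ends up here.
        raise ValueError(f"stream does not start with {magic!r}: {data[:32]!r}")
    end = data.find(b"\n")
    if end == -1:
        raise ValueError("header line is not terminated by a newline")
    fields = data[len(magic):end].decode("ascii").split()
    # First character is the tag (W, H, F, I, A, C, X), the rest is its value.
    # Repeated X tags overwrite each other here, which is fine for a quick look.
    return {field[0]: field[1:] for field in fields}


# Hypothetical header for a 640x352, 25 fps, 10-bit 4:2:0 clip.
sample = b"YUV4MPEG2 W640 H352 F25:1 Ip A1:1 C420p10\nFRAME\n"
print(parse_y4m_header(sample))

# On a real file, something like:
#   with open(r"g:\test.y4m", "rb") as f:
#       print(parse_y4m_header(f.read(256)))
```

A header that fails here because of leading text would point to something else writing to stdout ahead of vspipe's output.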
Hmm,...
F:\Hybrid\64bit\Vapoursynth>"F:\Hybrid\64bit\Vapoursynth\vspipe.exe" --filter-time "J:\tmp\encodingTempSynthSkript_2024-03-01@20_35_55_0010_0.vpy" - -c y4m | "F:\Hybrid\64bit\NVEncC.exe" --y4m -i - --fps 25.000 --codec av1 --sar 1:1 --output-depth 10 --vbr 0 --vbr-quality 23.00 --aq --aq-strength 5 --aq-temporal --gop-len 0 --ref 7 --multiref-l0 3 --multiref-l1 3 --bframes 3 --bref-mode auto --mv-precision Q-pel --preset quality --colorrange limited --colormatrix bt470bg --cuda-schedule sync --output "J:\tmp\test_1_2024-03-01@20_35_55_0010_02.av1"
--------------------------------------------------------------------------------
J:\tmp\test_1_2024-03-01@20_35_55_0010_02.av1
--------------------------------------------------------------------------------

Information: Note: NumExpr detected 32 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
Information: NumExpr defaulting to 8 threads.
y4m: failed to parse y4m header.
failed to initialize file reader(s).
Failed to open input file.
Warning: F:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=ResNet101_Weights.IMAGENET1K_V1`. You can also use `weights=ResNet101_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)

Warning: F:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch\nn\utils\weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")

Error: fwrite() call failed when writing frame: 0, plane: 0, errno: 22
Output 33 frames in 4.14 seconds (7.97 fps)
Filtername           Filter mode   Time (%)   Time (s)
ModifyFrame          parreq          99.88       4.14
Bicubic              parallel         0.56       0.02
LWLibavSource        unordered        0.44       0.02
Bicubic              parallel         0.24       0.01
SetFrameProps        parallel         0.00       0.00
SetFrameProp         parallel         0.00       0.00
AssumeFPS            parallel         0.00       0.00
SetFrameProp         parallel         0.00       0.00
AssumeFPS            parallel         0.00       0.00
The
Error: fwrite() call failed when writing frame: 0, plane: 0, errno: 22
is probably the issue.

Limiting the VapourSynth threads with 'core.num_threads = 1' doesn't help either.


RE: Deoldify Vapoursynth filter - Dan64 - 01.03.2024

I cannot debug the formatting of "y4m".
I performed another test.

Test 4: vsPipe using raw output.

D:\PProjects\vs-deoldify_dev>"D:\Programs\Hybrid\64bit\Vapoursynth\vspipe.exe" "D:\PProjects\vs-deoldify_dev\encoding.vpy" -   | "D:\Programs\Hybrid\64bit\x265.exe" --input-res 1280x692 --fps 23.976 - -o "D:\PProjects\vs-deoldify_dev\VideoTest1_720p-1.265"
yuv  [info]: 1280x692 fps 23976/1000 i420p8 unknown frame count
raw  [info]: output file: D:\PProjects\vs-deoldify_dev\VideoTest1_720p-1.265
x265 [info]: HEVC encoder version 3.5+115-88fd6d3ad
x265 [info]: build info [Windows][GCC 13.2.0][64 bit] 8bit+10bit+12bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
x265 [info]: Main profile, Level-3.1 (Main tier)
x265 [info]: Thread pool created using 20 threads
x265 [info]: Slices                              : 1
x265 [info]: frame threads / pool features       : 4 / wpp(11 rows)
x265 [warning]: Source height < 720p; disabling lookahead-slices
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
x265 [info]: ME / range / subpel / merge         : hex / 57 / 2 / 3
x265 [info]: Keyframe min / max / scenecut / bias  : 23 / 250 / 40 / 5.00
x265 [info]: Lookahead / bframes / badapt        : 20 / 4 / 2
x265 [info]: b-pyramid / weightp / weightb       : 1 / 1 / 0
x265 [info]: References / ref-limit  cu / depth  : 3 / off / on
x265 [info]: AQ: mode / str / qg-size / cu-tree  : 2 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress            : CRF-28.0 / 0.60
x265 [info]: tools: rd=3 psy-rd=2.00 early-skip rskip mode=1 signhide tmvp
x265 [info]: tools: b-intra strong-intra-smoothing deblock sao

Information: Note: NumExpr detected 20 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
Information: NumExpr defaulting to 8 threads.
Output 2594 frames in 419.82 seconds (6.18 fps)
x265 [info]: frame I:     24, Avg QP:34.23  kb/s: 32112.72
x265 [info]: frame P:   2205, Avg QP:36.14  kb/s: 13681.62
x265 [info]: frame B:   2959, Avg QP:38.59  kb/s: 16072.83
x265 [info]: Weighted P-Frames: Y:3.4% UV:3.3%

encoded 5188 frames in 440.83s (11.77 fps), 15130.72 kb/s, Avg QP:37.53

vsPipe also worked using raw output.

What is interesting is that the pipe version almost doubled the encoding speed reported by x265:

ffmpeg with pipe:
encoded 5188 frames in 446.18s (11.63 fps), 15130.72 kb/s, Avg QP:37.53

vsPipe:
encoded 5188 frames in 440.83s (11.77 fps), 15130.72 kb/s, Avg QP:37.53

ffmpeg.exe -f vapoursynth:
encoded 2594 frames in 416.57s (6.23 fps), 463.80 kb/s, Avg QP:32.24

This is a big improvement!
The encoding speed of the Jupyter version of Deoldify on my PC is about 5.6 fps.

Dan


RE: Deoldify Vapoursynth filter - Selur - 01.03.2024

setting
NUMEXPR_MAX_THREADS=1
works too

F:\Hybrid\64bit\Vapoursynth>set NUMEXPR_MAX_THREADS=1

F:\Hybrid\64bit\Vapoursynth>"F:\Hybrid\64bit\Vapoursynth\vspipe.exe" "J:\tmp\encodingTempSynthSkript_2024-03-01@20_35_55_0010_0.vpy" - -c y4m | "F:\Hybrid\64bit\NVEncC.exe" --y4m -i - --fps 25.000 --codec av1 --sar 1:1 --output-depth 10 --vbr 0 --vbr-quality 23.00 --aq --aq-strength 5 --aq-temporal --gop-len 0 --ref 7 --multiref-l0 3 --multiref-l1 3 --bframes 3 --bref-mode auto --mv-precision Q-pel --preset quality --colorrange limited --colormatrix bt470bg --cuda-schedule sync --output "J:\tmp\test_1_2024-03-01@20_35_55_0010_02.av1"

--------------------------------------------------------------------------------
J:\tmp\test_1_2024-03-01@20_35_55_0010_02.av1
--------------------------------------------------------------------------------
Warning: F:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=ResNet101_Weights.IMAGENET1K_V1`. You can also use `weights=ResNet101_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)

Warning: F:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch\nn\utils\weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")

NVEncC (x64) 7.41 (r2681) by rigaya, Jan 22 2024 13:02:15 (VC 1929/Win)
OS Version     Windows 11 x64 (22631) [UTF-8]
CPU            AMD Ryzen 9 7950X 16-Core Processor [5.31GHz] (16C/32T)
GPU            #0: NVIDIA GeForce RTX 4080 (9728 cores, 2505 MHz)[PCIe4x16][551.61]
NVENC / CUDA   NVENC API 12.1, CUDA 12.4, schedule mode: sync
Input Buffers  CUDA, 20 frames
Input Info     y4m(yv12(10bit))->p010 [AVX2], 640x352, 25/1 fps
Vpp Filters    copyHtoD
Output Info    AV1 main 10bit @ Level auto
               640x352p 1:1 25.000fps (25/1fps)
Encoder Preset quality
Rate Control   VBR
Multipass      none
Bitrate        0 kbps (Max: 0 kbps)
Target Quality 23.00
Initial QP     I:20  P:23  B:25
QP range       I:0-255  P:0-255  B:0-255
QP Offset      cb:0  cr:0
VBV buf size   auto
Split Enc Mode auto
Lookahead      off
GOP length     250 frames
B frames       3 frames [ref mode: middle]
Ref frames     7 frames, MultiRef L0:auto L1:auto
AQ             on (spatial, temporal, strength 5)
Part size      max auto / min auto
Tile num       columns auto / rows auto
TemporalLayers max 1
Refs           forward auto, backward auto
VUI            matrix:bt470bg,range:limited
Others         mv:Q-pel
Output 429 frames in 30.94 seconds (13.87 fps)%

encoded 429 frames, 14.22 fps, 1135.94 kbps, 2.32 MB
encode time 0:00:30, CPU: 0.0%, GPU: 34.4%, VE: 0.1%, GPUClock: 2790MHz, VEClock: 2175MHz
frame type IDR   2
frame type I     2,  total size  0.03 MB
frame type P   108,  total size  0.01 MB
frame type B   319,  total size  2.28 MB

Setting NUMEXPR_MAX_THREADS=2 worked fine too,
as does NUMEXPR_MAX_THREADS=32.
-> So it only seems to crash when NUMEXPR_MAX_THREADS isn't set.
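
An alternative to "set NUMEXPR_MAX_THREADS=..." in the console would be to set the variable at the top of the .vpy script. This is a sketch under two assumptions: NumExpr reads the variable when it is first imported, and nothing in the script has imported it yet. The helper name and the default of 8 are arbitrary, and it is untested whether this avoids the y4m failure the same way.

```python
import os


def ensure_numexpr_threads(default: int = 8) -> str:
    """Set NUMEXPR_MAX_THREADS unless it is already set; return the effective value."""
    return os.environ.setdefault("NUMEXPR_MAX_THREADS", str(default))


# Call this before anything that pulls in numexpr.
effective = ensure_numexpr_threads()
print("NUMEXPR_MAX_THREADS =", effective)

# Only afterwards, in the real script:
#   from vsdeoldify import ddeoldify
```

Using setdefault leaves a value set in the console or by Hybrid untouched.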


RE: Deoldify Vapoursynth filter - Dan64 - 01.03.2024

So it seems that the problem is related to the message

Information: Note: NumExpr detected 32 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
Information: NumExpr defaulting to 8 threads.

Indeed, right after this message the following error appears (reported by NVEncC):

y4m: failed to parse y4m header.

You should provide an option in the Hybrid config to set the value of NUMEXPR_MAX_THREADS.

Dan