RE: Deoldify Vapoursynth filter - Selur - 01.03.2024
Using:
C:\Users\Selur>"F:\Hybrid\64bit\Vapoursynth\vspipe.exe" --progress "J:\tmp\encodingTempSynthSkript_2024-03-01@20_35_55_0010_0.vpy" NUL -c y4m
I see:
Information: Note: NumExpr detected 32 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
Note: NumExpr detected 32 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
Information: NumExpr defaulting to 8 threads.
NumExpr defaulting to 8 threads.
Warning: F:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=ResNet101_Weights.IMAGENET1K_V1`. You can also use `weights=ResNet101_Weights.DEFAULT` to get the most up-to-date weights.
warnings.warn(msg)
Warning: F:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch\nn\utils\weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
Script evaluation done in 10.00 seconds
Output 429 frames in 33.14 seconds (12.95 fps)
These Python outputs might be the root of the problem.
RE: Deoldify Vapoursynth filter - Dan64 - 01.03.2024
I performed another test
Test 3: ffmpeg with Windows pipe using raw output instead of y4m
D:\PProjects\vs-deoldify_dev>"D:\Programs\Hybrid\64bit\Vapoursynth\ffmpeg.exe" -f vapoursynth -i "D:\PProjects\vs-deoldify_dev\encoding.vpy" -vcodec rawvideo -pixel_format yuv420p -strict -1 -f rawvideo - | "D:\Programs\Hybrid\64bit\x265.exe" --input-res 1280x692 --fps 23.976 - -o "D:\PProjects\vs-deoldify_dev\VideoTest1_720p-1.265"
ffmpeg version N-113112-g548ceb9b8f-gf5f414d9c4+3 Copyright (c) 2000-2023 the FFmpeg developers
built with gcc 13.2.0 (Rev3, Built by MSYS2 project)
configuration: --pkg-config=pkgconf --cc='ccache gcc' --cxx='ccache g++' --ld='ccache g++' --extra-cxxflags=-fpermissive --extra-cflags=-Wno-int-conversion --disable-autodetect --enable-amf --enable-bzlib --enable-cuda --enable-cuvid --enable-d3d11va --enable-dxva2 --enable-iconv --enable-lzma --enable-nvenc --enable-zlib --enable-sdl2 --enable-ffnvcodec --enable-nvdec --enable-cuda-llvm --enable-libmp3lame --enable-libopus --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-libdav1d --enable-libaom --disable-debug --enable-fontconfig --enable-libass --enable-libbluray --enable-libfreetype --enable-libmfx --enable-libmysofa --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvo-amrwbenc --enable-libwebp --enable-libxml2 --enable-libzimg --enable-libshine --enable-gpl --enable-avisynth --enable-libxvid --enable-libopenmpt --enable-version3 --enable-librav1e --enable-libsrt --enable-libgsm --enable-libvmaf --enable-libsvtav1 --enable-openal --enable-vapoursynth --enable-opencl --enable-opengl --enable-mbedtls --extra-cflags=-DLIBTWOLAME_STATIC --extra-libs=-lstdc++ --extra-cflags=-DLIBXML_STATIC --extra-libs=-liconv --disable-w32threads --extra-cflags=-DAL_LIBTYPE_STATIC --extra-cflags='-IH:/mabs/local64/include' --extra-cflags='-IH:/mabs/local64/include/AL'
libavutil 58. 36.100 / 58. 36.100
libavcodec 60. 36.100 / 60. 36.100
libavformat 60. 20.100 / 60. 20.100
libavdevice 60. 4.100 / 60. 4.100
libavfilter 9. 14.101 / 9. 14.101
libswscale 7. 6.100 / 7. 6.100
libswresample 4. 13.100 / 4. 13.100
libpostproc 57. 4.100 / 57. 4.100
yuv [info]: 1280x692 fps 23976/1000 i420p8 unknown frame count
raw [info]: output file: D:\PProjects\vs-deoldify_dev\VideoTest1_720p-1.265
x265 [info]: HEVC encoder version 3.5+115-88fd6d3ad
x265 [info]: build info [Windows][GCC 13.2.0][64 bit] 8bit+10bit+12bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
x265 [info]: Main profile, Level-3.1 (Main tier)
x265 [info]: Thread pool created using 20 threads
x265 [info]: Slices : 1
x265 [info]: frame threads / pool features : 4 / wpp(11 rows)
x265 [warning]: Source height < 720p; disabling lookahead-slices
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
x265 [info]: ME / range / subpel / merge : hex / 57 / 2 / 3
x265 [info]: Keyframe min / max / scenecut / bias : 23 / 250 / 40 / 5.00
x265 [info]: Lookahead / bframes / badapt : 20 / 4 / 2
x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 0
x265 [info]: References / ref-limit cu / depth : 3 / off / on
x265 [info]: AQ: mode / str / qg-size / cu-tree : 2 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress : CRF-28.0 / 0.60
x265 [info]: tools: rd=3 psy-rd=2.00 early-skip rskip mode=1 signhide tmvp
x265 [info]: tools: b-intra strong-intra-smoothing deblock sao
Input #0, vapoursynth, from 'D:\PProjects\vs-deoldify_dev\encoding.vpy':
Duration: 00:01:48.19, start: 0.000000, bitrate: 0 kb/s
Stream #0:0: Video: wrapped_avframe, yuv420p10le, 1280x692, 23.98 tbr, 23.98 tbn
Stream mapping:
Stream #0:0 -> #0:0 (wrapped_avframe (native) -> rawvideo (native))
Press [q] to stop, [?] for help
Output #0, rawvideo, to 'pipe:':
Metadata:
encoder : Lavf60.20.100
Stream #0:0: Video: rawvideo (Y3[11][10] / 0xA0B3359), yuv420p10le(bt709, progressive), 1280x692, q=2-31, 318555 kb/s, 23.98 fps, 23.98 tbn
Metadata:
encoder : Lavc60.36.100 rawvideo
[out#0/rawvideo @ 000001ebba96d540] video:6731430kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000000%
frame= 2594 fps=6.1 q=-0.0 Lsize= 6731430kB time=00:01:48.19 bitrate=509688.1kbits/s speed=0.254x
x265 [info]: frame I: 24, Avg QP:34.23 kb/s: 32112.72
x265 [info]: frame P: 2205, Avg QP:36.14 kb/s: 13681.62
x265 [info]: frame B: 2959, Avg QP:38.59 kb/s: 16072.83
x265 [info]: Weighted P-Frames: Y:3.4% UV:3.3%
encoded 5188 frames in 446.18s (11.63 fps), 15130.72 kb/s, Avg QP:37.53
This time the encoding worked. It seems that the problem is limited to the "yuv4mpegpipe"/"y4m" format.
Dan
RE: Deoldify Vapoursynth filter - Selur - 01.03.2024
Did you check the output?
RE: Deoldify Vapoursynth filter - Selur - 01.03.2024
NVEncC reports:
y4m: failed to parse y4m header.
failed to initialize file reader(s).
Failed to open input file.
RE: Deoldify Vapoursynth filter - Dan64 - 01.03.2024
(01.03.2024, 21:46)Selur Wrote: These python outputs might be the root of the problem,...
I will release a version in which these warnings are suppressed.
It is enough to add the following code to __init__.py:
import warnings
warnings.filterwarnings("ignore", category=UserWarning, message="Arguments other than a weight enum or `None`.*?")
warnings.filterwarnings("ignore", category=UserWarning, message="torch.nn.utils.weight_norm is deprecated.*?")
I already tested it, and the warnings are completely gone.
The encoding problem, however, is due to the "y4m" formatting.
Dan
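As a minimal sketch of how these two filters behave: the `message` argument is a regular expression matched against the start of the warning text. The `warnings.warn` calls below are stand-ins for the torchvision and torch warnings from the log, and the function names are made up for illustration.

```python
import warnings


def install_filters():
    """Install the two 'ignore' filters quoted in the post above."""
    warnings.filterwarnings(
        "ignore", category=UserWarning,
        message="Arguments other than a weight enum or `None`.*?")
    warnings.filterwarnings(
        "ignore", category=UserWarning,
        message="torch.nn.utils.weight_norm is deprecated.*?")


def visible_warnings():
    """Emit stand-in warnings and return the texts that are still shown."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # known starting state
        install_filters()                # new filters take precedence
        # Stand-ins for the two messages seen in the vspipe log.
        warnings.warn("Arguments other than a weight enum or `None` for "
                      "'weights' are deprecated since 0.13", UserWarning)
        warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of "
                      "torch.nn.utils.parametrizations.weight_norm.",
                      UserWarning)
        # An unrelated warning must not be swallowed by the filters.
        warnings.warn("some unrelated warning", UserWarning)
        return [str(w.message) for w in caught]


visible = visible_warnings()
```

Only the unrelated warning should remain in `visible`. Note that this silences the warnings only; it does not by itself say anything about the y4m problem.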
RE: Deoldify Vapoursynth filter - Selur - 01.03.2024
Yeah, but without these lines:
from vsdeoldify import ddeoldify
clip = ddeoldify(clip=clip, model=0)
the color conversions are all the same, so it's not a general issue but something triggered by vsdeoldify.
The output of vspipe --info "path to file" also looks correct.
RE: Deoldify Vapoursynth filter - Selur - 01.03.2024
Using:
"F:\Hybrid\64bit\Vapoursynth\vspipe.exe" "J:\tmp\encodingTempSynthSkript_2024-03-01@20_35_55_0010_0.vpy" g:\test.y4m -c y4m
and looking at the start of the y4m file shows no issues; the file is playable and contains the DeOldify output.
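For reference, "looking at the start of the y4m" can be automated with a small check like the one below. It is a sketch only: a YUV4MPEG2 stream has to begin with the signature `YUV4MPEG2 `, and anything written before it would make a reader reject the header. The function names and the sample data are made up for illustration and were not captured from the pipe.

```python
import os
import tempfile

Y4M_MAGIC = b"YUV4MPEG2 "  # signature that opens every y4m stream


def starts_with_y4m_header(data: bytes) -> bool:
    """True if the data begins with the y4m signature."""
    return data.startswith(Y4M_MAGIC)


def check_y4m_file(path: str) -> bool:
    """Read only the first bytes of a file and test the signature."""
    with open(path, "rb") as handle:
        return starts_with_y4m_header(handle.read(len(Y4M_MAGIC)))


# Made-up sample: a plausible header for a 640x352, 25 fps, 10-bit clip.
clean = b"YUV4MPEG2 W640 H352 F25:1 Ip A1:1 C420p10\nFRAME\n"
# Hypothetical case: some text ends up in front of the header.
polluted = b"some stray text\n" + clean

with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.y4m")
    with open(sample_path, "wb") as handle:
        handle.write(clean)
    file_ok = check_y4m_file(sample_path)
```

Redirecting the piped output to a file (vspipe ... - -c y4m > file) and running the same check on it would show whether the piped stream differs from the file written directly.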
Hmm,...
F:\Hybrid\64bit\Vapoursynth>"F:\Hybrid\64bit\Vapoursynth\vspipe.exe" --filter-time "J:\tmp\encodingTempSynthSkript_2024-03-01@20_35_55_0010_0.vpy" - -c y4m | "F:\Hybrid\64bit\NVEncC.exe" --y4m -i - --fps 25.000 --codec av1 --sar 1:1 --output-depth 10 --vbr 0 --vbr-quality 23.00 --aq --aq-strength 5 --aq-temporal --gop-len 0 --ref 7 --multiref-l0 3 --multiref-l1 3 --bframes 3 --bref-mode auto --mv-precision Q-pel --preset quality --colorrange limited --colormatrix bt470bg --cuda-schedule sync --output "J:\tmp\test_1_2024-03-01@20_35_55_0010_02.av1"
--------------------------------------------------------------------------------
J:\tmp\test_1_2024-03-01@20_35_55_0010_02.av1
--------------------------------------------------------------------------------
Information: Note: NumExpr detected 32 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
Information: NumExpr defaulting to 8 threads.
y4m: failed to parse y4m header.
failed to initialize file reader(s).
Failed to open input file.
Warning: F:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=ResNet101_Weights.IMAGENET1K_V1`. You can also use `weights=ResNet101_Weights.DEFAULT` to get the most up-to-date weights.
warnings.warn(msg)
Warning: F:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch\nn\utils\weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
Error: fwrite() call failed when writing frame: 0, plane: 0, errno: 22
Output 33 frames in 4.14 seconds (7.97 fps)
Filtername Filter mode Time (%) Time (s)
ModifyFrame parreq 99.88 4.14
Bicubic parallel 0.56 0.02
LWLibavSource unordered 0.44 0.02
Bicubic parallel 0.24 0.01
SetFrameProps parallel 0.00 0.00
SetFrameProp parallel 0.00 0.00
AssumeFPS parallel 0.00 0.00
SetFrameProp parallel 0.00 0.00
AssumeFPS parallel 0.00 0.00
The
Error: fwrite() call failed when writing frame: 0, plane: 0, errno: 22
is probably the issue.
Limiting the VapourSynth threads with 'core.num_threads = 1' doesn't help either.
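A quick way to decode the errno value from that message is shown below. One possible reading, which is an assumption and not confirmed anywhere in this thread, is that NVEncC had already closed its end of the pipe after failing to parse the header, so the fwrite() error would be a consequence rather than the cause.

```python
import errno
import os

ERRNO_FROM_LOG = 22  # value reported by vspipe in the log above

# Symbolic name of the error code.
name = errno.errorcode[ERRNO_FROM_LOG]
# Human-readable description; the exact wording depends on the platform.
description = os.strerror(ERRNO_FROM_LOG)

print(f"errno {ERRNO_FROM_LOG} = {name}: {description}")
```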
RE: Deoldify Vapoursynth filter - Dan64 - 01.03.2024
I cannot debug the "y4m" formatting.
I performed another test.
Test 4: vsPipe using raw output.
D:\PProjects\vs-deoldify_dev>"D:\Programs\Hybrid\64bit\Vapoursynth\vspipe.exe" "D:\PProjects\vs-deoldify_dev\encoding.vpy" - | "D:\Programs\Hybrid\64bit\x265.exe" --input-res 1280x692 --fps 23.976 - -o "D:\PProjects\vs-deoldify_dev\VideoTest1_720p-1.265"
yuv [info]: 1280x692 fps 23976/1000 i420p8 unknown frame count
raw [info]: output file: D:\PProjects\vs-deoldify_dev\VideoTest1_720p-1.265
x265 [info]: HEVC encoder version 3.5+115-88fd6d3ad
x265 [info]: build info [Windows][GCC 13.2.0][64 bit] 8bit+10bit+12bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
x265 [info]: Main profile, Level-3.1 (Main tier)
x265 [info]: Thread pool created using 20 threads
x265 [info]: Slices : 1
x265 [info]: frame threads / pool features : 4 / wpp(11 rows)
x265 [warning]: Source height < 720p; disabling lookahead-slices
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
x265 [info]: ME / range / subpel / merge : hex / 57 / 2 / 3
x265 [info]: Keyframe min / max / scenecut / bias : 23 / 250 / 40 / 5.00
x265 [info]: Lookahead / bframes / badapt : 20 / 4 / 2
x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 0
x265 [info]: References / ref-limit cu / depth : 3 / off / on
x265 [info]: AQ: mode / str / qg-size / cu-tree : 2 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress : CRF-28.0 / 0.60
x265 [info]: tools: rd=3 psy-rd=2.00 early-skip rskip mode=1 signhide tmvp
x265 [info]: tools: b-intra strong-intra-smoothing deblock sao
Information: Note: NumExpr detected 20 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
Information: NumExpr defaulting to 8 threads.
Output 2594 frames in 419.82 seconds (6.18 fps)
x265 [info]: frame I: 24, Avg QP:34.23 kb/s: 32112.72
x265 [info]: frame P: 2205, Avg QP:36.14 kb/s: 13681.62
x265 [info]: frame B: 2959, Avg QP:38.59 kb/s: 16072.83
x265 [info]: Weighted P-Frames: Y:3.4% UV:3.3%
encoded 5188 frames in 440.83s (11.77 fps), 15130.72 kb/s, Avg QP:37.53
vsPipe also worked using raw output.
Interestingly, the pipe versions almost doubled the encoding speed:
ffmpeg with pipe:
encoded 5188 frames in 446.18s (11.63 fps), 15130.72 kb/s, Avg QP:37.53
vsPipe:
encoded 5188 frames in 440.83s (11.77 fps), 15130.72 kb/s, Avg QP:37.53
ffmpeg.exe -f vapoursynth:
encoded 2594 frames in 416.57s (6.23 fps), 463.80 kb/s, Avg QP:32.24
This is a big improvement!
The encoding speed of the Jupyter version of DeOldify on my PC is about 5.6 fps.
Dan
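Regarding "did you check the output?": the frame counts in the two raw-pipe tests can be cross-checked with some arithmetic. The sketch below assumes that x265, which reports i420p8 and was started without --input-depth, read the 10-bit stream (ffmpeg reports yuv420p10le) as 8-bit data. That is an assumption drawn from the logs, not something verified in this thread.

```python
def raw_frame_bytes(width: int, height: int, bits: int) -> int:
    """Size in bytes of one raw 4:2:0 frame at the given bit depth."""
    bytes_per_sample = 1 if bits <= 8 else 2
    luma = width * height
    chroma = 2 * (width // 2) * (height // 2)  # two subsampled planes
    return (luma + chroma) * bytes_per_sample


WIDTH, HEIGHT = 1280, 692
SOURCE_FRAMES = 2594       # "Output 2594 frames" from vspipe / ffmpeg
ENCODE_SECONDS = 446.18    # x265 run time in Test 3

# Total size of the raw stream if every frame is 10-bit.
total_bytes = SOURCE_FRAMES * raw_frame_bytes(WIDTH, HEIGHT, 10)
total_kib = total_bytes // 1024

# Number of frames a reader sees if it treats the same bytes as 8-bit.
frames_if_read_as_8bit = total_bytes // raw_frame_bytes(WIDTH, HEIGHT, 8)

# Speed relative to the real number of source frames.
source_fps = SOURCE_FRAMES / ENCODE_SECONDS

print(total_kib, frames_if_read_as_8bit, round(source_fps, 2))
```

Under that assumption the total matches ffmpeg's Lsize of 6731430kB and the 8-bit frame count matches the 5188 frames x265 reports, which would put the real throughput close to the 6.23 fps of the direct ffmpeg run and make the output worth inspecting.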
RE: Deoldify Vapoursynth filter - Selur - 01.03.2024
Setting NUMEXPR_MAX_THREADS works too:
F:\Hybrid\64bit\Vapoursynth>set NUMEXPR_MAX_THREADS=1
F:\Hybrid\64bit\Vapoursynth>"F:\Hybrid\64bit\Vapoursynth\vspipe.exe" "J:\tmp\encodingTempSynthSkript_2024-03-01@20_35_55_0010_0.vpy" - -c y4m | "F:\Hybrid\64bit\NVEncC.exe" --y4m -i - --fps 25.000 --codec av1 --sar 1:1 --output-depth 10 --vbr 0 --vbr-quality 23.00 --aq --aq-strength 5 --aq-temporal --gop-len 0 --ref 7 --multiref-l0 3 --multiref-l1 3 --bframes 3 --bref-mode auto --mv-precision Q-pel --preset quality --colorrange limited --colormatrix bt470bg --cuda-schedule sync --output "J:\tmp\test_1_2024-03-01@20_35_55_0010_02.av1"
--------------------------------------------------------------------------------
J:\tmp\test_1_2024-03-01@20_35_55_0010_02.av1
--------------------------------------------------------------------------------
Warning: F:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=ResNet101_Weights.IMAGENET1K_V1`. You can also use `weights=ResNet101_Weights.DEFAULT` to get the most up-to-date weights.
warnings.warn(msg)
Warning: F:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch\nn\utils\weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
NVEncC (x64) 7.41 (r2681) by rigaya, Jan 22 2024 13:02:15 (VC 1929/Win)
OS Version Windows 11 x64 (22631) [UTF-8]
CPU AMD Ryzen 9 7950X 16-Core Processor [5.31GHz] (16C/32T)
GPU #0: NVIDIA GeForce RTX 4080 (9728 cores, 2505 MHz)[PCIe4x16][551.61]
NVENC / CUDA NVENC API 12.1, CUDA 12.4, schedule mode: sync
Input Buffers CUDA, 20 frames
Input Info y4m(yv12(10bit))->p010 [AVX2], 640x352, 25/1 fps
Vpp Filters copyHtoD
Output Info AV1 main 10bit @ Level auto
640x352p 1:1 25.000fps (25/1fps)
Encoder Preset quality
Rate Control VBR
Multipass none
Bitrate 0 kbps (Max: 0 kbps)
Target Quality 23.00
Initial QP I:20 P:23 B:25
QP range I:0-255 P:0-255 B:0-255
QP Offset cb:0 cr:0
VBV buf size auto
Split Enc Mode auto
Lookahead off
GOP length 250 frames
B frames 3 frames [ref mode: middle]
Ref frames 7 frames, MultiRef L0:auto L1:auto
AQ on (spatial, temporal, strength 5)
Part size max auto / min auto
Tile num columns auto / rows auto
TemporalLayers max 1
Refs forward auto, backward auto
VUI matrix:bt470bg,range:limited
Others mv:Q-pel
Output 429 frames in 30.94 seconds (13.87 fps)%
encoded 429 frames, 14.22 fps, 1135.94 kbps, 2.32 MB
encode time 0:00:30, CPU: 0.0%, GPU: 34.4%, VE: 0.1%, GPUClock: 2790MHz, VEClock: 2175MHz
frame type IDR 2
frame type I 2, total size 0.03 MB
frame type P 108, total size 0.01 MB
frame type B 319, total size 2.28 MB
Setting NUMEXPR_MAX_THREADS=2 worked fine too,
as does NUMEXPR_MAX_THREADS=32.
-> So it only seems to crash when NUMEXPR_MAX_THREADS isn't set.
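The same variable could in principle be set from inside the .vpy script instead of the console. This is an untested sketch: it assumes NumExpr reads NUMEXPR_MAX_THREADS when it is first imported and that nothing imports it before these lines run. The helper name and the default of 8 are placeholders.

```python
import os


def ensure_numexpr_max_threads(default: int = 8) -> str:
    """Set NUMEXPR_MAX_THREADS unless a value is already present.

    Returns the value that is in effect afterwards.
    """
    return os.environ.setdefault("NUMEXPR_MAX_THREADS", str(default))


# Must run before anything imports numexpr, i.e. before the line
#   from vsdeoldify import ddeoldify
value = ensure_numexpr_max_threads(8)
```

Using setdefault keeps a value that was already chosen with `set NUMEXPR_MAX_THREADS=...` in the console.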
RE: Deoldify Vapoursynth filter - Dan64 - 01.03.2024
So it seems that the problem is related to the message
Information: Note: NumExpr detected 32 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
Information: NumExpr defaulting to 8 threads.
In effect, after this message, NVEncC reports the error
y4m: failed to parse y4m header.
You should provide an option in the Hybrid config to set the value of NUMEXPR_MAX_THREADS.
Dan
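For illustration, a launcher could pass the variable only to the vspipe process instead of changing the user's environment. The sketch below is not how Hybrid is implemented; the function names are hypothetical, and the vspipe arguments are the ones used earlier in this thread.

```python
import os
import subprocess


def env_with_numexpr_threads(threads: int, base=None) -> dict:
    """Return a copy of the environment with NUMEXPR_MAX_THREADS set."""
    env = dict(os.environ if base is None else base)
    env["NUMEXPR_MAX_THREADS"] = str(threads)
    return env


def start_vspipe(vspipe_exe: str, script: str, threads: int):
    """Start vspipe writing y4m to stdout with the adjusted environment.

    The caller connects the returned process's stdout to the encoder.
    """
    return subprocess.Popen(
        [vspipe_exe, script, "-", "-c", "y4m"],
        stdout=subprocess.PIPE,
        env=env_with_numexpr_threads(threads),
    )


# Building the environment has no side effects on the current process.
base_env = {"PATH": "example"}
child_env = env_with_numexpr_threads(4, base=base_env)
```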