Posts: 12.057 
	Threads: 66 
	Joined: May 2017
	
	 
 
	
	
		This is something related to vs-deoldify only. 
Your wrapper should have an option to set it, and always set it.
	 
	
	
---- 
Dev versions are in the 'experimental'-folder of my GoogleDrive, which is linked on the download page. 
 
	
	
 
 
	
	
	
		
	Posts: 987 
	Threads: 81 
	Joined: Feb 2020
	
	 
 
	
	
		This message is triggered by "numpy", which is used by Deoldify.
 
I will add an option in the filter to set the number of threads (default = 8).
 
It is enough to add the following code to the filter:
 os.environ['NUMEXPR_MAX_THREADS'] = '8'
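
A minimal sketch of how such an option could be wired in, assuming the variable has to be set before the package that reads it is first imported. The helper name `set_numexpr_threads` and its validation are hypothetical and not taken from vs-deoldify; only the `NUMEXPR_MAX_THREADS` variable and the default of 8 come from the post.

```python
import os


def set_numexpr_threads(n_threads: int = 8) -> str:
    """Set NUMEXPR_MAX_THREADS for this process and return the stored value.

    Hypothetical helper: call it before importing the packages that read
    the variable, on the assumption that it is only read at import time.
    """
    if n_threads < 1:
        raise ValueError("n_threads must be at least 1")
    # Environment values are always strings, so convert the integer.
    os.environ["NUMEXPR_MAX_THREADS"] = str(n_threads)
    return os.environ["NUMEXPR_MAX_THREADS"]


if __name__ == "__main__":
    print(set_numexpr_threads(8))
```

Setting the variable from the script itself has the same effect as `set NUMEXPR_MAX_THREADS=...` on the command line, but it travels with the filter, so the caller does not have to remember it.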
 
Dan
	  
	
	
	
	
 
 
	
	
	
		
	
	 
 
	
	
		top, going to bed now.   
	 
	
	
 
	
	
 
 
	
	
	
		
	
	 
 
	
	
		I tested vsPipe with "y4m": 
D:\PProjects\vs-deoldify_dev>set NUMEXPR_MAX_THREADS=10 
 
D:\PProjects\vs-deoldify_dev>"D:\Programs\Hybrid\64bit\Vapoursynth\vspipe.exe" "D:\PProjects\vs-deoldify_dev\encoding.vpy" - -c y4m   | "D:\Programs\Hybrid\64bit\x265.exe" --preset fast --input - --fps 24000/1001 --output-depth 10 --y4m --profile main10 --b-adapt 2 --crf 21.00 --psy-rd 2.00 --deblock=-1:-1 --psnr --ssim --range limited --sar 1:1 --output "D:\PProjects\vs-deoldify_dev\VideoTest1_720p.265" 
 
y4m  [info]: 1280x692 fps 24000/1001 i420p10 sar 1:1 unknown frame count 
raw  [info]: output file: D:\PProjects\vs-deoldify_dev\VideoTest1_720p.265 
x265 [info]: HEVC encoder version 3.5+115-88fd6d3ad 
x265 [info]: build info [Windows][GCC 13.2.0][64 bit] 10bit 
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2 
x265 [warning]: --psnr used with psy on: results will be invalid! 
x265 [warning]: --tune psnr should be used if attempting to benchmark psnr! 
x265 [info]: Main 10 profile, Level-3.1 (Main tier) 
x265 [info]: Thread pool created using 20 threads 
x265 [info]: Slices                              : 1 
x265 [info]: frame threads / pool features       : 4 / wpp(11 rows) 
x265 [warning]: Source height < 720p; disabling lookahead-slices 
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8 
x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra 
x265 [info]: ME / range / subpel / merge         : hex / 57 / 2 / 2 
x265 [info]: Keyframe min / max / scenecut / bias  : 23 / 250 / 40 / 5.00 
x265 [info]: Lookahead / bframes / badapt        : 15 / 4 / 2 
x265 [info]: b-pyramid / weightp / weightb       : 1 / 1 / 0 
x265 [info]: References / ref-limit  cu / depth  : 3 / on / on 
x265 [info]: AQ: mode / str / qg-size / cu-tree  : 2 / 1.0 / 32 / 1 
x265 [info]: Rate Control / qCompress            : CRF-21.0 / 0.60 
x265 [info]: tools: rd=2 psy-rd=2.00 rskip mode=1 signhide tmvp fast-intra 
x265 [info]: tools: strong-intra-smoothing deblock(tC=-1:B=-1) sao 
Output 2594 frames in 415.95 seconds (6.24 fps) 
x265 [info]: frame I:     14, Avg QP:19.85  kb/s: 8718.74   PSNR Mean: Y:48.786 U:51.412 V:51.627  SSIM Mean: 0.991435 (20.673dB) 
x265 [info]: frame P:    663, Avg QP:20.24  kb/s: 3345.85   PSNR Mean: Y:48.327 U:50.204 V:50.376  SSIM Mean: 0.991880 (20.904dB) 
x265 [info]: frame B:   1917, Avg QP:25.28  kb/s: 653.67    PSNR Mean: Y:47.784 U:48.858 V:48.975  SSIM Mean: 0.991540 (20.726dB) 
x265 [info]: Weighted P-Frames: Y:11.3% UV:10.4% 
 
encoded 2594 frames in 415.78s (6.24 fps), 1385.29 kb/s, Avg QP:23.96, Global PSNR: 48.267, SSIM Mean Y: 0.9916264 (20.771 dB)
 
Now the speed decreases to 6.24 fps. 
So what speeds up the encoding is not the pipe but the "raw" format. 
A 1.9x increase in speed is worth your attention.
 
You should consider abandoning the "y4m" format for vsPipe and switching to the "raw" format; please check whether you obtain the same increase in speed.
 
Thanks, 
Dan
	  
	
	
	
	
 
 
	
	
	
		
	
	 
 
	
		
		
		01.03.2024, 23:01 
(This post was last modified: 01.03.2024, 23:21 by Selur.)
		
	 
	
		Will do some testing tomorrow; the question is whether vspipe or x265 is faster with raw, and whether this is always the case. 
Also note that the main downside of raw video pipes is that any output to stdout will corrupt the stream,...
 F:\Hybrid\64bit>"F:\Hybrid\64bit\Vapoursynth\vspipe.exe" "J:\tmp\encodingTempSynthSkript_2024-03-01@20_35_55_0010_0.vpy" - -c y4m | "F:\Hybrid\64bit\NVEncC.exe" --y4m -i - --fps 25.000 --codec av1 --sar 1:1 --output-depth 10 --vbr 0 --vbr-quality 23.00 --aq --aq-strength 5 --aq-temporal --gop-len 0 --ref 7 --multiref-l0 3 --multiref-l1 3 --bframes 3 --bref-mode auto --mv-precision Q-pel --preset quality --colorrange limited --colormatrix bt470bg --cuda-schedule sync --output "J:\tmp\test_1_2024-03-01@20_35_55_0010_02.av1" 
-------------------------------------------------------------------------------- 
J:\tmp\test_1_2024-03-01@20_35_55_0010_02.av1 
-------------------------------------------------------------------------------- 
 
Warning: F:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=ResNet101_Weights.IMAGENET1K_V1`. You can also use `weights=ResNet101_Weights.DEFAULT` to get the most up-to-date weights. 
  warnings.warn(msg) 
 
Warning: F:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch\nn\utils\weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm. 
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.") 
 
NVEncC (x64) 7.41 (r2681) by rigaya, Jan 22 2024 13:02:15 (VC 1929/Win) 
OS Version     Windows 11 x64 (22631) [UTF-8] 
CPU            AMD Ryzen 9 7950X 16-Core Processor [5.50GHz] (16C/32T) 
GPU            #0: NVIDIA GeForce RTX 4080 (9728 cores, 2505 MHz)[PCIe4x16][551.61] 
NVENC / CUDA   NVENC API 12.1, CUDA 12.4, schedule mode: sync 
Input Buffers  CUDA, 20 frames 
Input Info     y4m(yv12(10bit))->p010 [AVX2], 640x352, 25/1 fps 
Vpp Filters    copyHtoD 
Output Info    AV1 main 10bit @ Level auto 
               640x352p 1:1 25.000fps (25/1fps) 
Encoder Preset quality 
Rate Control   VBR 
Multipass      none 
Bitrate        0 kbps (Max: 0 kbps) 
Target Quality 23.00 
Initial QP     I:20  P:23  B:25 
QP range       I:0-255  P:0-255  B:0-255 
QP Offset      cb:0  cr:0 
VBV buf size   auto 
Split Enc Mode auto 
Lookahead      off 
GOP length     250 frames 
B frames       3 frames [ref mode: middle] 
Ref frames     7 frames, MultiRef L0:auto L1:auto 
AQ             on (spatial, temporal, strength 5) 
Part size      max auto / min auto 
Tile num       columns auto / rows auto 
TemporalLayers max 1 
Refs           forward auto, backward auto 
VUI            matrix:bt470bg,range:limited 
Others         mv:Q-pel 
Output 429 frames in 29.80 seconds (14.40 fps)% 
 
encoded 429 frames, 14.48 fps, 1135.94 kbps, 2.32 MB 
encode time 0:00:29, CPU: 0.0%, GPU: 7.9%, GPUClock: 2805MHz, VEClock: 2175MHz 
frame type IDR   2 
frame type I     2,  total size  0.03 MB 
frame type P   108,  total size  0.01 MB 
frame type B   319,  total size  2.28 MB 
 
F:\Hybrid\64bit>"F:\Hybrid\64bit\Vapoursynth\vspipe.exe" "J:\tmp\encodingTempSynthSkript_2024-03-01@20_35_55_0010_0.vpy" - | "F:\Hybrid\64bit\NVEncC.exe" --raw --input-res 640x352 -i - --fps 25.000 --codec av1 --sar 1:1 --output-depth 10 --vbr 0 --vbr-quality 23.00 --aq --aq-strength 5 --aq-temporal --gop-len 0 --ref 7 --multiref-l0 3 --multiref-l1 3 --bframes 3 --bref-mode auto --mv-precision Q-pel --preset quality --colorrange limited --colormatrix bt470bg --cuda-schedule sync --output "J:\tmp\test_1_2024-03-01@20_35_55_0010_02.av1" 
-------------------------------------------------------------------------------- 
J:\tmp\test_1_2024-03-01@20_35_55_0010_02.av1 
-------------------------------------------------------------------------------- 
 
NVEncC (x64) 7.41 (r2681) by rigaya, Jan 22 2024 13:02:15 (VC 1929/Win) 
OS Version     Windows 11 x64 (22631) [UTF-8] 
CPU            AMD Ryzen 9 7950X 16-Core Processor [5.52GHz] (16C/32T) 
GPU            #0: NVIDIA GeForce RTX 4080 (9728 cores, 2505 MHz)[PCIe4x16][551.61] 
NVENC / CUDA   NVENC API 12.1, CUDA 12.4, schedule mode: sync 
Input Buffers  CUDA, 20 frames 
Input Info     raw(yv12)->nv12 [AVX2], 640x352, 25/1 fps 
Vpp Filters    copyHtoD 
               cspconv(nv12 -> p010) 
Output Info    AV1 main 10bit @ Level auto 
               640x352p 1:1 25.000fps (25/1fps) 
Encoder Preset quality 
Rate Control   VBR 
Multipass      none 
Bitrate        0 kbps (Max: 0 kbps) 
Target Quality 23.00 
Initial QP     I:20  P:23  B:25 
QP range       I:0-255  P:0-255  B:0-255 
QP Offset      cb:0  cr:0 
VBV buf size   auto 
Split Enc Mode auto 
Lookahead      off 
GOP length     250 frames 
B frames       3 frames [ref mode: middle] 
Ref frames     7 frames, MultiRef L0:auto L1:auto 
AQ             on (spatial, temporal, strength 5) 
Part size      max auto / min auto 
Tile num       columns auto / rows auto 
TemporalLayers max 1 
Refs           forward auto, backward auto 
VUI            matrix:bt470bg,range:limited 
Others         mv:Q-pel 
Warning: F:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=ResNet101_Weights.IMAGENET1K_V1`. You can also use `weights=ResNet101_Weights.DEFAULT` to get the most up-to-date weights. 
  warnings.warn(msg) 
 
Warning: F:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch\nn\utils\weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm. 
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.") 
 
Output 429 frames in 29.29 seconds (14.65 fps)0% 
 
encoded 858 frames, 21.79 fps, 2979.71 kbps, 12.19 MB 
encode time 0:00:39, CPU: 0.0%, GPU: 10.2%, GPUClock: 2796MHz, VEClock: 2171MHz 
frame type IDR   4 
frame type I     4,  total size   0.19 MB 
frame type P   216,  total size   0.07 MB 
frame type B   638,  total size  11.93 MB
  
Too sleepy,... this looks wrong; check whether the outputs in your examples are really playable correctly.
 
encoded 429 frames, 14.48 fps, 1135.94 kbps, 2.32 MB (429 is the correct frame count)
 
encoded 858 frames, 21.79 fps, 2979.71 kbps, 12.19 MB (same input but double the number of frames)
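
One possible explanation, offered as an assumption and not confirmed anywhere in this thread: the two "Input Info" lines in the logs differ. With y4m the input is read as `yv12(10bit)`, while with raw it is read as plain `yv12` and converted to `nv12`, which suggests it is treated as 8-bit. If each 10-bit sample is written in two bytes and the reader assumes one byte per sample, every real frame would be counted as two. The sketch below only does that arithmetic; the function names are made up for illustration.

```python
def yuv420_frame_bytes(width: int, height: int, bytes_per_sample: int) -> int:
    """Size of one planar 4:2:0 frame: full-size luma plus two quarter-size chroma planes."""
    luma = width * height
    chroma = 2 * (width // 2) * (height // 2)
    return (luma + chroma) * bytes_per_sample


def frames_seen_by_reader(true_frames: int, width: int, height: int,
                          written_bps: int, assumed_bps: int) -> int:
    """Frames a headerless (raw) reader would count from the same byte stream."""
    total_bytes = true_frames * yuv420_frame_bytes(width, height, written_bps)
    return total_bytes // yuv420_frame_bytes(width, height, assumed_bps)


if __name__ == "__main__":
    # Assumption: samples written as 2 bytes, reader expects 1 byte per sample.
    print(frames_seen_by_reader(429, 640, 352, written_bps=2, assumed_bps=1))
```

Under that assumption the 429-frame clip would be counted as 858 frames, which matches the second log. A y4m stream avoids the ambiguity because its header carries the resolution and pixel format.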
 
Cu Selur
	  
	
	
 
	
	
 
 
	
	
	
		
	
	 
 
	
	
		I released a new version:  https://github.com/dan64/vs-deoldify/rel...tag/v1.0.2
I applied the following changes:
- updated the readme
- filtered out the torch warnings
- added the new parameter "n_threads" to set the number of threads used by numpy (default=8)
 
These changes should enable the encoding using vsPipe.
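
For readers who want to do something similar in their own scripts, here is a minimal sketch of filtering those two warnings with the standard `warnings` module. It is not necessarily how vs-deoldify does it; the function name is hypothetical, and the message prefixes are copied from the log output earlier in the thread.

```python
import warnings


def silence_torch_deprecation_warnings() -> None:
    """Ignore the two UserWarnings seen in the vspipe logs.

    The `message` argument is a regular expression that is matched
    against the start of the warning text.
    """
    warnings.filterwarnings(
        "ignore",
        category=UserWarning,
        message=r"Arguments other than a weight enum",
    )
    warnings.filterwarnings(
        "ignore",
        category=UserWarning,
        message=r"torch\.nn\.utils\.weight_norm is deprecated",
    )


if __name__ == "__main__":
    silence_torch_deprecation_warnings()
    # This warning is now swallowed instead of being printed.
    warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of ...", UserWarning)
```

Matching on the message text keeps unrelated warnings visible, which is safer than ignoring every `UserWarning`.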
 
Dan
	  
	
	
	
	
 
 
	
	
	
		
	
	 
 
	
	
		The new dev version with vs-deoldify 1.0.2 is working.    
Thanks, 
Dan
  
 (01.03.2024, 22:17)Dan64 Wrote:  ffmpeg with pipe: 
encoded 5188 frames in 446.18s (11.63 fps), 15130.72 kb/s, Avg QP:37.53 
 
vsPipe: 
encoded 5188 frames in 440.83s (11.77 fps), 15130.72 kb/s, Avg QP:37.53 
 
ffmpeg.exe -f vapoursynth: 
encoded 2594 frames in 416.57s (6.23 fps), 463.80 kb/s, Avg QP:32.24
  
This is a big improvement!    
The encoding speed of the Jupyter version of Deoldify on my PC is about 5.6 fps 
 
Dan 
  The fps reported in raw mode is wrong. For some reason, raw mode reports 5188 encoded frames, while in reality there are only half as many, 2594. This is why the reported fps doubled.  
  But the total encoding time is almost the same: 446s, 440s, 416s.     
  Yesterday I was too tired to notice it. Raw mode does not introduce any increase in encoding speed.    
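
The correction can be checked with a few lines of arithmetic. This sketch only uses the frame counts and times quoted above; the function and variable names are made up for illustration.

```python
TRUE_FRAMES = 2594  # real number of frames in the clip

# (label, frames reported by the encoder, encoding time in seconds)
RUNS = [
    ("ffmpeg with pipe", 5188, 446.18),
    ("vsPipe", 5188, 440.83),
    ("ffmpeg.exe -f vapoursynth", 2594, 416.57),
]


def fps(frames: int, seconds: float) -> float:
    """Frames per second, rounded to two decimals like the encoder output."""
    return round(frames / seconds, 2)


if __name__ == "__main__":
    for label, reported_frames, seconds in RUNS:
        print(f"{label}: reported {fps(reported_frames, seconds)} fps, "
              f"real {fps(TRUE_FRAMES, seconds)} fps")
```

With the real frame count, the two raw runs come out at roughly 5.8 to 5.9 fps, in the same range as the 6.23 fps of the third run, so the apparent doubling comes entirely from the frame count.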
Dan
	  
	
	
	
	
 
 
	
	
	
		
	
	 
 
	
		
		
		02.03.2024, 13:37 
(This post was last modified: 02.03.2024, 13:39 by Selur.)
		
	 
	
		I agree, I also did some tests and I too can't detect any real speed difference (that isn't in the normal error range). 
 
Cu Selur 
 
Ps.: also includes vs-deoldify in the torch-addon.
	 
	
	
 
	
	
 
 
	
	
	
		
	
	 
 
	
		
		
		02.03.2024, 15:48 
(This post was last modified: 02.03.2024, 16:00 by Selur.)
		
	 
	
		Btw., using Merge to combine ddcolor and deoldify surprisingly does look interesting: 
file
![[Image: grafik.png]](https://i.ibb.co/p3Xm0P1/grafik.png) 
(not good enough to get integrated into Hybrid)
 
Cu Selur
	  
	
	
 
	
	
 
 
	
	
	
		
	
	 
 
	
	
		If you are looking for the perfect colorizer, I think it will be necessary to wait many more years. 
To me the result looks good enough. In Stable Diffusion it is possible to set a weight for every "filter", which they call "visibility". 
I understand that implementing this feature in Hybrid for every filter would be a mess. 
But you could at least consider the possibility of merging the 2 filters in some way.
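
To make the idea of a merge weight concrete, here is a small sketch in plain Python of a weighted blend between two colorized results. It is an illustration only: the function name is hypothetical, pixels are plain integer lists, and the convention that a weight of 0 returns the first input and 1 returns the second is assumed to match VapourSynth's `std.Merge`.

```python
from typing import List


def merge_pixels(a: List[int], b: List[int], weight: float = 0.5) -> List[int]:
    """Blend two equally sized pixel lists.

    weight = 0.0 returns `a` unchanged, weight = 1.0 returns `b` unchanged.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return [round(pa * (1.0 - weight) + pb * weight) for pa, pb in zip(a, b)]


if __name__ == "__main__":
    ddcolor_pixels = [0, 100, 200]   # made-up sample values
    deoldify_pixels = [100, 100, 0]  # made-up sample values
    print(merge_pixels(ddcolor_pixels, deoldify_pixels, weight=0.25))
```

A single weight between two filters is much simpler than a per-filter "visibility" setting, which is the compromise suggested in the post.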
 
Thanks, 
Dan
 
P.S. 
In the meantime, I posted this request to rigaya:  https://github.com/rigaya/NVEnc/issues/564
	 
	
	
	
	
 
 
	 