Hybrid job stalls at endcoding with Deoldify - wpsandy - 01.09.2024
Hi, love the tool and how easy it is to set up a workflow so far.
For whatever reason whenever I select DeOldify the job never starts processing at the encoding video step. I have processed jobs with just DDColor just fine. I see it fire up multiple threads, engage the GPU and run through the job successfully.
I tried to set up a bare-minimum flow with DeOldify (x264 average sample/single pass, QTGMC, no audio conversion) and it gets stuck in the same spot every time. 4 threads are consuming CPU time, but there isn't any indication of what's going on. On the first couple of attempts I left the job in this state for over an hour with no frames processed, because I wasn't sure whether it needed to do some background work first.
I did verify that the models for DeOldify do exist at C:\Program Files\Hybrid\64bit\Vapoursynth\Lib\site-packages\vsdeoldify\models.
I am running what I thought was the most recent stable release, plus I followed the instructions to install the Windows addons to be able to pick up the most recent versions of DDColor and DeOldify.
I have looked through the debug log but I'm not seeing what the issue might be. I'm chalking that up to my inexperience with the tool.
Any suggestions on what I have gotten screwed up?
RE: Hybrid job stalls at endcoding with Deoldify - Selur - 01.09.2024
Looking at the used script:
# Imports
import vapoursynth as vs
# getting Vapoursynth core
import ctypes
import sys
import os
core = vs.core
# Import scripts folder
scriptPath = 'C:/Program Files/Hybrid/64bit/vsscripts'
sys.path.insert(0, os.path.abspath(scriptPath))
# Loading Support Files
Dllref = ctypes.windll.LoadLibrary("C:/Program Files/Hybrid/64bit/vsfilters/Support/libfftw3f-3.dll")
# loading plugins
core.std.LoadPlugin(path="C:/Program Files/Hybrid/64bit/vsfilters/GrainFilter/RemoveGrain/RemoveGrainVS.dll")
core.std.LoadPlugin(path="C:/Program Files/Hybrid/64bit/vsfilters/GrainFilter/AddGrain/AddGrain.dll")
core.std.LoadPlugin(path="C:/Program Files/Hybrid/64bit/vsfilters/DenoiseFilter/DFTTest/DFTTest.dll")
core.std.LoadPlugin(path="C:/Program Files/Hybrid/64bit/vsfilters/DenoiseFilter/NEO_FFT3DFilter/neo-fft3d.dll")
core.std.LoadPlugin(path="C:/Program Files/Hybrid/64bit/vsfilters/Support/EEDI3m.dll")# vsQTGMC
core.std.LoadPlugin(path="C:/Program Files/Hybrid/64bit/vsfilters/ResizeFilter/nnedi3/vsznedi3.dll")
core.std.LoadPlugin(path="C:/Program Files/Hybrid/64bit/vsfilters/Support/libmvtools.dll")
core.std.LoadPlugin(path="C:/Program Files/Hybrid/64bit/vsfilters/Support/scenechange.dll")
core.std.LoadPlugin(path="C:/Program Files/Hybrid/64bit/vsfilters/Support/fmtconv.dll")
core.std.LoadPlugin(path="C:/Program Files/Hybrid/64bit/vsfilters/MiscFilter/MiscFilters/MiscFilters.dll")
core.std.LoadPlugin(path="C:/Program Files/Hybrid/64bit/vsfilters/DeinterlaceFilter/Bwdif/Bwdif.dll")
core.std.LoadPlugin(path="C:/Program Files/Hybrid/64bit/vsfilters/SourceFilter/LSmashSource/LSMASHSource.dll")
# Import scripts
import havsfunc
import validate
# Source: 'C:\Users\andys\Desktop\Greg\Original Clip.avi'
# Current color space: YUV422P8, bit depth: 8, resolution: 640x480, frame rate: 29.97fps, scanorder: bottom field first, yuv luminance scale: limited, matrix: 470bg
# Loading C:\Users\andys\Desktop\Greg\Original Clip.avi using LWLibavSource
clip = core.lsmas.LWLibavSource(source="C:/Users/andys/Desktop/Greg/Original Clip.avi", format="YUV422P8", stream_index=1, cache=0, prefer_hw=0)
frame = clip.get_frame(0)
# Setting detected color matrix (470bg).
clip = core.std.SetFrameProps(clip=clip, _Matrix=5)
# setting color transfer (170), if it is not set.
if validate.transferIsInvalid(clip):
  clip = core.std.SetFrameProps(clip=clip, _Transfer=6)
# setting color primaries info (to 470), if it is not set.
if validate.primariesIsInvalid(clip):
  clip = core.std.SetFrameProps(clip=clip, _Primaries=5)
# setting color range to TV (limited) range.
clip = core.std.SetFrameProps(clip=clip, _ColorRange=1)
# making sure frame rate is set to 29.97fps
clip = core.std.AssumeFPS(clip=clip, fpsnum=30000, fpsden=1001)
# making sure the detected scan type is set (detected: bottom field first)
clip = core.std.SetFrameProps(clip=clip, _FieldBased=1) # bff
# Deinterlacing using QTGMC
clip = havsfunc.QTGMC(Input=clip, Preset="Fast", TFF=False) # new fps: 29.97
# Making sure content is perceived as frame based
clip = core.std.SetFrameProps(clip=clip, _FieldBased=0) # progressive
clip = clip[::2] # selecting previously even frames
# changing color matrix from '470bg' to '709' for vsDeOldify
clip = core.resize.Bicubic(clip, matrix_in_s="470bg", matrix_s="709")
# changing range from limited to full range for vsDeOldify
clip = core.resize.Bicubic(clip, range_in_s="limited", range_s="full")
# setting color range to PC (full) range.
clip = core.std.SetFrameProps(clip=clip, _ColorRange=0)
# adjusting color space from YUV422P8 to RGB24 for vsDeOldify
clip = core.resize.Bicubic(clip=clip, format=vs.RGB24, matrix_in_s="709", range_s="full")
# adding colors using DeOldify
from vsdeoldify import HAVC_ddeoldify
clipRef = HAVC_ddeoldify(clip=clip, deoldify_p=[0, 24, 1, 0], ddcolor_p=[1, 24, 1, 0, True], ddtweak=True, sc_threshold=0.10, sc_min_freq=500)
from vsdeoldify import HAVC_deepex
clip = HAVC_deepex(clip=clip, clip_ref=clipRef, ref_merge=0, dark="True", smooth=True)
from vsdeoldify import HAVC_stabilizer
clip = HAVC_stabilizer(clip=clip, stab=True, colormap="none", render_factor=24)
# internally changing color matrix for YUV<>RGB to '470bg' undoing color matrix change for vsDeOldify
# changing range from full to limited range for vsDeOldify
clip = core.resize.Bicubic(clip, range_in_s="full", range_s="limited")
# adjusting output color from: RGB24 to YUV420P8 for x264Model
clip = core.resize.Bicubic(clip=clip, format=vs.YUV420P8, matrix_s="470bg", range_s="limited")
# set output frame rate to 29.97fps (progressive)
clip = core.std.AssumeFPS(clip=clip, fpsnum=30000, fpsden=1001)
# output
clip.set_output()
I see no problem, and the encoding call:
"C:\Program Files\Hybrid\64bit\Vapoursynth\vspipe.exe" "C:\Users\andys\AppData\Local\Temp\encodingTempSynthSkript_2024-08-31@20_18_48_1310_0.vpy" - -c y4m | "C:\Program Files\Hybrid\64bit\x264.exe" --preset veryfast --bitrate 1500 --profile high --level 5.1 --ref 3 --direct auto --b-adapt 0 --sync-lookahead 24 --ratetol 2.00 --qcomp 0.50 --rc-lookahead 40 --qpmax 51 --partitions i4x4,p8x8,b8x8 --no-fast-pskip --subme 5 --aq-mode 0 --vbv-maxrate 1500 --vbv-bufsize 300000 --sar 1:1 --non-deterministic --range tv --colormatrix bt470bg --demuxer y4m --input-range tv --fps 30000/1001 --output-depth 8 --output "C:\Users\andys\AppData\Local\Temp\2024-08-31@20_18_48_1310_02.264" -
seems fine too.
According to the log, x264 didn't get any frames (when you killed the thread after 15min), so I suspect the issue is with the processing of the Vapoursynth script.
I tested this with a similar source (same general resolution&co) and these settings, and here encoding worked fine. VRAM usage spiked at ~13.5GB, and encoding ran at 4.5fps on a Geforce RTX 4800 with 16GB RAM.
What GPU are you using? Depending on your available VRAM, you might have to tweak your settings to use less VRAM.
Does the Vapoursynth Preview work? (it should show some details if it does not)
You can also try the dev version (+ addons in the experimental folder; I used these)
Quote:4 threads are consuming CPU time, but there isn't any indication of what's going on.
For me, it takes ~25 seconds before the first image appears when using Vapoursynth Preview.
Before aborting, check the usage of your GPU, since the engine creation will mainly run on the GPU.
Cu Selur
Ps.: Also try whether disabling your antivirus software helps. There have been issues where some antivirus software unnecessarily stalled the processing.
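The suspicion above, that the Vapoursynth script itself never delivers a frame, can be checked without x264 in the pipeline. The following is only a minimal sketch: it assumes the vspipe path from the encoding call in this thread and a vspipe build that accepts `--start`, `--end`, `--progress` and `.` as "no output" (the `-c y4m` form used by Hybrid suggests a recent build, but that is an assumption). The function names and the timeout value are made up for illustration.

```python
import subprocess

# Path taken from the encoding call in this thread; adjust for other installs.
VSPIPE = r"C:\Program Files\Hybrid\64bit\Vapoursynth\vspipe.exe"


def build_probe_command(script_path, frames=1, vspipe=VSPIPE):
    """Build a vspipe call that renders the first `frames` frames and discards them."""
    if frames < 1:
        raise ValueError("frames must be >= 1")
    # --end is inclusive; "." as the output file suppresses writing any output.
    return [vspipe, "--progress", "--start", "0", "--end", str(frames - 1),
            script_path, "."]


def probe_script(script_path, timeout_s=300):
    """Return True if vspipe rendered the frames in time, False otherwise."""
    try:
        result = subprocess.run(build_probe_command(script_path),
                                capture_output=True, text=True,
                                timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # The script stalled before delivering the requested frames.
        return False
    # vspipe reports progress and script errors on stderr.
    print(result.stderr)
    return result.returncode == 0
```

If `probe_script` times out on the temporary .vpy file but succeeds once the DeOldify lines are commented out, that would point at the colorization step rather than at the source filter or QTGMC.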
RE: Hybrid job stalls at endcoding with Deoldify - wpsandy - 01.09.2024
(01.09.2024, 06:08)Selur Wrote:
Quote:According to the log, x264 didn't get any frames (when you killed the thread after 15min), so I suspect the issue is with processing of the Vapoursynth script.
Quote:Tested this with a similar source (same general resolution&co) and these settings and here encoding worked fine. VRAM usage spiked at ~13.5GB, encoding ran at 4.5fps on a Geforce RTX 4800 with 16GB RAM.
Quote:What GPU are you using? Depending on your available VRAM, you might have to tweak your settings to use less VRAM.
I'm using an RTX 3070 FE. When running the job, it would load up to 2.8 GB of 2.8 GB of Dedicated GPU Memory with 3.0 GB of GPU memory consumed (0.2 GB of Shared used).
Quote:Does the Vapoursynth Preview work? (it should show some details if it does not)
Quote:You can also try the dev version (+ addons in the experimental folder; I used these)
Quote:Quote:4 threads are consuming CPU time, but there isn't any indication of what's going on.
Quote:For me, it takes ~25 seconds before the first image appears when using Vapoursynth Preview.
Quote:Before aborting, check the usage of your GPU, since the engine creation will mainly run on the GPU.
The preview never shows up. The behavior is similar to the full job: CPU ramps up a few threads, I can see memory consumption on the GPU go up modestly along with a few blips of GPU processing, but then it just sits there. I tried this with and without antivirus and that didn't seem to change the behavior.
I'll give the dev version a go this evening when I get back to the house. Thank you for looking at this.
RE: Hybrid job stalls at endcoding with Deoldify - Selur - 01.09.2024
Fingers crossed.
Quote:The preview never shows up. It's a similar behavior.
If the Vapoursynth Preview doesn't work, neither will the encoding.
The RTX 3070 FE has 8GB of VRAM => try adjusting the settings to use less VRAM.
Cu Selur
Ps.: Please try not to quote whole posts, especially not with your own text in the middle; that really makes it hard to read.
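To compare the 8GB of the card with what is actually in use before and during a job, the numbers can be read from `nvidia-smi`. This is a rough sketch that assumes an NVIDIA driver with `nvidia-smi` on the PATH; the helper names are illustrative, and the values are MiB as reported by the tool.

```python
import subprocess


def parse_gpu_memory(csv_line):
    """Parse one 'total, used' line (MiB, no header, no units) from nvidia-smi."""
    total_mib, used_mib = (int(field.strip()) for field in csv_line.split(","))
    return total_mib, used_mib


def query_gpu_memory(gpu_index=0):
    """Ask nvidia-smi for total and used memory of one GPU, in MiB."""
    out = subprocess.run(
        ["nvidia-smi", f"--id={gpu_index}",
         "--query-gpu=memory.total,memory.used",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_memory(out.strip().splitlines()[0])


def free_gib(total_mib, used_mib):
    """Free memory in GiB, given total and used memory in MiB."""
    return (total_mib - used_mib) / 1024
```

Calling `free_gib(*query_gpu_memory())` right before starting the preview shows how much VRAM other applications have already taken from the card.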
RE: Hybrid job stalls at endcoding with Deoldify - wpsandy - 02.09.2024
Apologies about the quoting. I'll definitely keep that in mind.
Gave the dev version a go and got the same result. The debug log from that run is attached.
I made sure to turn the antivirus off, plus deactivated GPU acceleration in Windows as well.
When looking at the performance under task manager, the VRAM consumption of the process that was running never went over 1GB of use. As an experiment, I turned on "CUDA - Sysmem Fallback policy" in the Nvidia control panel and it did not seem to have an effect.
As for altering the settings that affect VRAM, I have no idea what I'm doing. I tried the very slow preset and the very fast preset, and both seemed to behave the same. Which of the settings would have the most impact on reducing VRAM consumption?
RE: Hybrid job stalls at endcoding with Deoldify - Selur - 02.09.2024
The debug output is from a job being processed again, not from you opening the Vapoursynth Preview, so it contains no additional info.
=> create a debug output where you try to open the Vapoursynth Preview, check in the taskmgr whether vsviewer is running, and don't close Hybrid until it's gone.
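Instead of watching the task manager by hand, the "wait until vsviewer is gone" step could be scripted. The sketch below uses the Windows `tasklist` command; the image name passed in (for example "vsViewer.exe") is an assumption and should be checked against what the task manager actually shows, and the polling interval and time limit are arbitrary.

```python
import csv
import io
import subprocess
import time


def parse_tasklist_csv(output, image_name):
    """Return the PIDs of rows in `tasklist /FO CSV /NH` output matching image_name."""
    pids = []
    for row in csv.reader(io.StringIO(output)):
        # Data rows look like: "name.exe","1234","Console","1","12,345 K"
        if len(row) >= 2 and row[0].lower() == image_name.lower() and row[1].isdigit():
            pids.append(int(row[1]))
    return pids


def running_pids(image_name):
    """Query tasklist for processes with the given image name."""
    out = subprocess.run(
        ["tasklist", "/FI", f"IMAGENAME eq {image_name}", "/FO", "CSV", "/NH"],
        capture_output=True, text=True,
    ).stdout
    return parse_tasklist_csv(out, image_name)


def wait_until_gone(image_name, poll_s=10.0, max_wait_s=3600.0):
    """Poll until no matching process is left; return False on timeout."""
    deadline = time.monotonic() + max_wait_s
    while time.monotonic() < deadline:
        if not running_pids(image_name):
            return True
        time.sleep(poll_s)
    return False
```

A `False` result from `wait_until_gone` after a generous limit means the viewer process never exited on its own, which is useful to state alongside the debug output.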
Quote: Which of the settings would have the most impact on reducing VRAM consumption?
No clue. Never tested it.
I did a quick test (with a 640x320 source):
- 'very fast' peaked at 14.4GB here
- 'faster' peaked at 14.3GB here
- 'fast' peaked at 15.2GB here
- 'medium' peaked at 14.6GB here
- 'slow' peaked at 15.3GB here
- 'veryslow' peaked at 14.6GB here
- 'placebo' peaked at 15.5GB here
so the presets all seem similarly high.
(Using a 4k source doesn't really change the peak VRAM usage during the initial phase; it just seems to increase the VRAM usage during the processing and slows things down.)
Dan64 might be able to help here. (I posted here trying to get his attention.)
Cu Selur
Ps.: using preset 'custom' and lowering the 'render factor' to 10 lowered the peak usage to 7.3 GB, so that might help
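Put next to the 8GB of the RTX 3070, the peaks measured above make the situation easy to see. The sketch below only restates the numbers from this post; the 0.5GB headroom is an arbitrary assumption, and peaks measured on one system will not transfer exactly to another card or driver.

```python
# Peak VRAM (GB) reported in this thread for a 640x320 source on Selur's system.
PEAK_VRAM_GB = {
    "very fast": 14.4,
    "faster": 14.3,
    "fast": 15.2,
    "medium": 14.6,
    "slow": 15.3,
    "veryslow": 14.6,
    "placebo": 15.5,
    "custom (render factor 10)": 7.3,
}


def settings_that_fit(vram_gb, headroom_gb=0.5):
    """Return the settings whose measured peak leaves at least `headroom_gb` free."""
    return [name for name, peak in PEAK_VRAM_GB.items()
            if peak + headroom_gb <= vram_gb]


print(settings_that_fit(8.0))
```

For an 8GB card this leaves only the custom preset with the lowered render factor, and even that with little room to spare.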
RE: Hybrid job stalls at endcoding with Deoldify - wpsandy - 02.09.2024
I missed the preview part of that. I really should slow down my reading.
I let the preview run for ~30 minutes and it never came back. I took a peek at the debug log, and it at least looks like it left some breadcrumbs about some unhandled exceptions (attached).
RE: Hybrid job stalls at endcoding with Deoldify - Selur - 02.09.2024
At the end, there are some warnings (a wrong char set is the reason they look strange, with every character spaced out, in the debug output):
C:\Program Files\Hybrid\64bit\Vapoursynth\Lib\site-packages\kornia\feature\lightglue.py:44: FutureWarning: `torch.cuda.amp.custom_fwd(args...)` is deprecated. Please use `torch.amp.custom_fwd(args..., device_type='cuda')` instead.
  @torch.cuda.amp.custom_fwd(cast_inputs=torch.float32)
C:\Program Files\Hybrid\64bit\Vapoursynth\Lib\site-packages\kornia\feature\lightglue.py:44: FutureWarning: `torch.cuda.amp.custom_fwd(args...)` is deprecated. Please use `torch.amp.custom_fwd(args..., device_type='cuda')` instead.
  @torch.cuda.amp.custom_fwd(cast_inputs=torch.float32)
C:\Program Files\Hybrid\64bit\Vapoursynth\Lib\site-packages\vsdeoldify\deepex\models\vgg19_gray.py:130: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  model.load_state_dict(torch.load(vgg19_gray_path))
C:\Program Files\Hybrid\64bit\Vapoursynth\Lib\site-packages\vsdeoldify\deepex\models\vgg19_gray.py:130: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  model.load_state_dict(torch.load(vgg19_gray_path))
C:\Program Files\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch\nn\utils\weight_norm.py:143: FutureWarning: `torch.nn.utils.weight_norm` is deprecated in favor of `torch.nn.utils.parametrizations.weight_norm`.
  WeightNorm.apply(module, name, dim)
C:\Program Files\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch\nn\utils\weight_norm.py:143: FutureWarning: `torch.nn.utils.weight_norm` is deprecated in favor of `torch.nn.utils.parametrizations.weight_norm`.
  WeightNorm.apply(module, name, dim)
which are also in my debug output, so those are not the problem.
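The strange, spaced-out look of such warnings in a debug log is typical of UTF-16 text being shown one byte at a time, where the zero byte after every ASCII character is rendered as a gap. As a rough sketch, and assuming that this is what happens here (the thread only says "wrong char set"), a raw fragment of the log could be made readable like this; the function names are made up for illustration.

```python
def looks_like_utf16le(raw):
    """Heuristic: ASCII text stored as UTF-16LE has a NUL in nearly every odd byte."""
    odd_bytes = raw[1::2]
    return (len(raw) >= 2 and len(raw) % 2 == 0
            and odd_bytes.count(0) > 0.9 * len(odd_bytes))


def repair_fragment(raw):
    """Decode a byte fragment as UTF-16LE if it looks like it, else as UTF-8."""
    if looks_like_utf16le(raw):
        return raw.decode("utf-16-le")
    return raw.decode("utf-8", errors="replace")


# Simulate how the warning ends up in the log: UTF-16LE bytes viewed bytewise.
raw = "lightglue.py:44: FutureWarning".encode("utf-16-le")
print(raw.decode("latin-1").replace("\x00", " "))  # spaced-out view
print(repair_fragment(raw))                        # readable view
```

The heuristic only covers mostly-ASCII text; a log that mixes encodings would have to be split into fragments before decoding.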
Did you try lowering the render factor to 10?
Quote:I let the preview run for ~30 minutes and it never came back.
Was it still running in the taskmgr?
Cu Selur
RE: Hybrid job stalls at endcoding with Deoldify - wpsandy - 02.09.2024
The viewer was still running...it was consuming CPU time, had ~ 1GB of VRAM allocated, ~3GB of system memory allocated and ~ 6% CPU use.
I'm letting it run with render factor 10 now. I'll report back, but at the moment, the memory and VRAM consumption looks the same.
RE: Hybrid job stalls at endcoding with Deoldify - Selur - 02.09.2024
At the beginning vram usage&co is low here too, but picks up after a bit.