10.02.2024, 00:32
10.02.2024, 11:27
Hello Selur,
I noticed that there are situations where the "retinaface" detector introduces strong artifacts.
You can find an example here:
https://imgsli.com/MjM5MDQz
the only combinations with no artifacts (for the sample used) are:
CF Detector: dlib
CF Detector: retinaface with only center
The second detector has the disadvantage of not enhancing the faces in the background.
You can find an example here:
https://imgsli.com/MjM5MDQ2
In this case "retinaface (only center)" is worse than "dlib"
The big problem is that "dlib" is very very very slow (speed 0.005 fps).
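For reference, in the generated script this choice is just the detector parameter of the CodeFormer call; a minimal sketch (the source filter is a placeholder, and which integer maps to which detector is an assumption here that should be checked against vs-codeformer):
Code:
import vapoursynth as vs
from vscodeformer import codeformer as CodeFormer

core = vs.core
clip = core.lsmas.LWLibavSource(source='sample.mp4')  # placeholder source filter/clip
# detector selects the face detector (retinaface, dlib, retinaface "only center");
# the index-to-detector mapping is an assumption - check the vs-codeformer docs
clip = CodeFormer(clip=clip, upscale=1, detector=0, weight=1.0, num_streams=1)
clip.set_output()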
Do you know if the version available in Hybrid is CUDA-enabled? I ask because I checked the flag dlib.DLIB_USE_CUDA and it is False.
Thanks,
Dan
P.S.
I attached a sample that shows the artifacts when encoded with CF.
10.02.2024, 11:43
HolyWu only links a dlib wheel file (see: https://github.com/HolyWu/vs-codeformer/releases),
which is probably from https://github.com/z-mahmud22/Dlib_Windows_Python3.x or https://github.com/sachadee/Dlib.
No clue whether those are compiled with CUDA or whether vs-codeformer uses any functionality of the library that is sped up through CUDA.
You would have to ask HolyWu.
Reading: https://github.com/eddiehe99/dlib-whl
I suspect that cuda is used since the CUDA libraries should be in the dll path.
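For what it's worth, the helper used for that is roughly the following (a sketch of what add_lib_to_dll_path in dlib's generated __init__.py seems to do, not the verbatim code):
Code:
import os

def add_lib_to_dll_path(path):
    # only extend the DLL search path if the file actually exists, so a missing
    # CUDA SDK does not break "import dlib" (os.add_dll_directory: Python 3.8+, Windows)
    if os.path.exists(path):
        os.add_dll_directory(os.path.dirname(path))

# example: make the folder holding the CUDA runtime DLLs visible to the loader
add_lib_to_dll_path('C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/vXX.X/lib/x64/cudart.lib')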
Cu Selur
10.02.2024, 19:21
In the script I added the following code
Code:
import dlib
# check whether the dlib wheel used by Hybrid was built with CUDA support
core.log_message(2, 'DLIB_USE_CUDA: ' + str(dlib.DLIB_USE_CUDA))
core.log_message(2, 'cuda.get_num_devices: ' + str(dlib.cuda.get_num_devices()))
and I got
Code:
DLIB_USE_CUDA: False
cuda.get_num_devices: 1
I suspect that CUDA is not used.
In the eddiehe99 dlib-whl version there is the following section regarding the CUDA configuration:
With CUDA
If you use CUDA, configure the code:
Code:
if "ON" == "ON":
    add_lib_to_dll_path(
        "C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/vXX.X/lib/x64/cudnn.lib"
    )
    add_lib_to_dll_path(
        "C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/vXX.X/lib/x64/cudart.lib"
    )
The XX.X depends on your situation, or the whole filepath may be different based on your installation configuration.
but in Hybrid both cudnn.lib and cudart.lib are missing.
So I think that CUDA is not enabled.
10.02.2024, 19:37
.lib files are only needed for compiling; 64bit\Vapoursynth\torch_dependencies\bin contains the DLLs that get loaded at runtime.
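If it helps, a quick way to list which CUDA DLLs that folder actually ships (just a convenience snippet; adjust the Hybrid root path):
Code:
import glob
import os

# path mentioned above, relative to the Hybrid installation
bin_dir = r'Hybrid\64bit\Vapoursynth\torch_dependencies\bin'
for pattern in ('cudart*.dll', 'cudnn*.dll', 'cublas*.dll'):
    for dll in sorted(glob.glob(os.path.join(bin_dir, pattern))):
        print(os.path.basename(dll))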
Like I wrote, you would have to ask HolyWu to be sure.
Cu Selur
10.02.2024, 20:02
11.02.2024, 16:57
Fingers crossed that HolyWu can help.
Cu Selur
12.02.2024, 22:11
Hello Selur,
I solved the problem by building "dlib" with CUDA enabled.
I attached the new dlib wheel, feel free to try it.
With this version the encoding speed increased from 0.05 fps to 2.5 fps -> 50x faster!
I built dlib with the CUDA compute capability set to 8.0 (good for RTX 30 and above).
I suspect that the few dlib builds with CUDA support that are available were compiled with compute capability 5.0 (because this is the default in CMake).
For compatibility reasons I compiled "dlib" against CUDA SDK v11.4 (good for RTX 30 and above).
In case the pip installer refuses to install the wheel because it is not compatible, you have to rename the wheel extension from ".whl" to ".zip" so that you can edit the file __init__.py in the dlib folder.
Change the following code:
Code:
if 'ON' == 'ON':
    add_lib_to_dll_path('C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.4/lib/x64/cudnn.lib')
    add_lib_to_dll_path('C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.4/lib/x64/cudart.lib')
to match your SDK installation. I have not tried it, but I hope it will work.
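If editing the wheel itself is awkward, the same change can also be made after installation by patching the installed __init__.py; a rough sketch (the path strings are examples, use whatever your __init__.py actually contains):
Code:
import pathlib
import dlib

# dlib.__file__ points at .../site-packages/dlib/__init__.py
init_py = pathlib.Path(dlib.__file__)
text = init_py.read_text()
# example values - replace the SDK path baked in at build time with your own
text = text.replace('CUDA/v11.4', 'CUDA/v12.1')
init_py.write_text(text)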
Dan
13.02.2024, 21:00
Will try it tomorrow evening.
Does it work if the files from Hybrid\64bit\Vapoursynth\torch_dependencies\bin are used?
(not planning to install the CUDA SDK)
Cu Selur
13.02.2024, 21:51
(13.02.2024, 21:00) Selur Wrote: Does it work if the files from Hybrid\64bit\Vapoursynth\torch_dependencies\bin are used?
(not planning to install the CUDA SDK)
I'm using the Hybrid environment, which should be based on CUDA 12.x, and the filter is using those libraries.
CUDA is backward compatible with previous versions, so CUDA 11.4 is a good starting point.
To be sure, I renamed the folder with my SDK installation (which was only necessary to compile dlib) to
Code:
C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.4
and the filter worked perfectly, so my answer to your question is that the files from Hybrid\64bit\Vapoursynth\torch_dependencies\bin are enough.
I'm not sure whether Python checks for the existence of the libraries during the installation of the wheel. But given that "python" is very lax about this kind of check, I expect that the installation should proceed smoothly.
Dan
Hello Selur,
fixing the CUDA problem raised another issue.
There are situations (or clips) where VSpipe fails to fully encode the movie.
The error is the following:
Error: fwrite() call failed when writing frame: xxx, plane: 0, errno: 32
I was unable to fix this issue.
This problem started to happen when I enabled CUDA in dlib.
I suspect that some library raises some kind of error.
I tried to fix the problem by adding the following code to the script:
Code:
# Blind Face Restoration using CodeFormer
from vscodeformer import codeformer as CodeFormer
try:
    # preferred detector first
    clipEx = CodeFormer(clip=clip, upscale=1, detector=1, weight=1.0, num_streams=1)  # 720x390
except Exception as e:
    # log the error and fall back to the other detector
    vs.core.log_message(2, 'Codeformer Error: ' + str(e))
    clipEx = CodeFormer(clip=clip, upscale=1, detector=0, weight=1.0, num_streams=1)
finally:
    clip = clipEx
# note: the call above only sets up the filter; errors that occur while frames
# are actually rendered are raised later (during piping) and are not caught here
Maybe the error is raised before or after this block; I have not checked all the code.
But what is really strange is that when using "VsViewer" to encode the same video, the encoding proceeds smoothly.
I obtained the same result using the original Codeformer script:
Code:
inference_codeformer.py
So it seems that the problem is limited to VSpipe.
But given that "VsViewer" also uses pipes to perform the encoding, I expected to see the same problem with "VsViewer".
I know that you worked on "VsViewer"; maybe this is the reason why "VsViewer" works better than "VSpipe".
Do you have any idea how the problem can be fixed?
Is it theoretically possible to extend "VsViewer" so that it can be launched from the command line like "VSpipe"?
My guess is that "VsViewer" is more robust because the pipe is directly controlled by the program, while with "VSpipe" it is the OS that controls the pipe.
I attached an archive containing:
1) the script used
2) the clip used
3) the log.
I hope that this can help.
Thanks,
Dan