Selur's Little Message Board - Deoldify Vapoursynth filter


@Dan64: could you create a new spatial_correlation_sampler build for Python 3.13 and cu130?

Cu Selur

Which Torch version?

torch 2.10.0+cu130
see: pip list
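
For reference, a version string like the one above can be split into the PyTorch release and the CUDA build tag. This is only a minimal sketch, assuming the usual `<release>+<local tag>` wheel naming; `split_torch_version` is a hypothetical helper name, not part of PyTorch.

```python
def split_torch_version(version: str) -> tuple[str, str]:
    """Split a string such as '2.10.0+cu130' into (release, build tag)."""
    release, _, local = version.partition("+")
    return release, local


def report_installed_torch() -> None:
    """Print the installed torch version and its CUDA version, if torch is importable."""
    try:
        import torch
    except (ImportError, OSError):
        print("torch could not be imported in this environment")
        return
    print("torch:", torch.__version__)
    print("CUDA used for the build:", torch.version.cuda)


if __name__ == "__main__":
    print(split_torch_version("2.10.0+cu130"))
    report_installed_torch()
```

Running it with the Python interpreter inside the Vapoursynth folder should report the same build that `pip list` shows.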

Cu Selur

Hello Selur,

I attached the zipped wheel.
I cannot test it because I have not yet switched to Python 3.13 and torch 2.10 with CUDA 13.0.
Let me know if it works on your side.

Dan

Works!
Thanks!
One less hurdle.
(HolyWu's filter doesn't work with TRT+FP16, and the CUDA build of dlib doesn't work either.)

Cu Selur

Hello Selur,

let me know if the attached dlib build is working on your side.

Dan

Thanks, but sadly not:

Code:
2026-02-15 14:24:46.364
Error on frame 0 request:

Traceback (most recent call last):
File "src/cython/vapoursynth.pyx", line 3222, in vapoursynth.publicFunction
File "src/cython/vapoursynth.pyx", line 3224, in vapoursynth.publicFunction
File "src/cython/vapoursynth.pyx", line 837, in vapoursynth.FuncData.__call__
File "F:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch\utils\_contextlib.py", line 124, in decorate_context
return func(*args, **kwargs)
File "F:\Hybrid\64bit\Vapoursynth\Lib\site-packages\vscodeformer\__init__.py", line 122, in inference
face_helper[local_index].get_face_landmarks_5(
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
only_center_face=only_center_face, resize=640, eye_dist_threshold=5
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "F:\Hybrid\64bit\Vapoursynth\Lib\site-packages\vscodeformer\face_restoration_helper.py", line 218, in get_face_landmarks_5
return self.get_face_landmarks_5_dlib(only_keep_largest)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
File "F:\Hybrid\64bit\Vapoursynth\Lib\site-packages\vscodeformer\face_restoration_helper.py", line 182, in get_face_landmarks_5_dlib
det_faces = self.face_detector(self.input_img, scale)
RuntimeError: Error while calling cudaOccupancyMaxPotentialBlockSize(&num_blocks,&num_threads,K) in file D:\PProjects\dlib\dlib\cuda\cuda_utils.h:164. code: 209, reason: no kernel image is available for execution on the device

Cu Selur
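
The CUDA error above (code 209, "no kernel image is available for execution on the device") generally indicates that a binary was not compiled for the compute capability of the installed GPU. Below is a minimal sketch for reporting that capability so it can be passed on to whoever builds the wheel. It assumes PyTorch with CUDA support is installed; `sm_tag` and `report_gpu_arch` are hypothetical helper names, not PyTorch functions. Note that `torch.cuda.get_arch_list()` describes the torch build only, not the dlib build.

```python
def sm_tag(capability: tuple[int, int]) -> str:
    """Turn a compute capability such as (8, 9) into an 'sm_89' style tag."""
    major, minor = capability
    return f"sm_{major}{minor}"


def report_gpu_arch() -> None:
    """Print the name and compute capability of GPU 0, if torch can see one."""
    try:
        import torch
    except (ImportError, OSError):
        print("torch could not be imported in this environment")
        return
    if not torch.cuda.is_available():
        print("torch does not see a CUDA device")
        return
    capability = torch.cuda.get_device_capability(0)
    print("GPU:", torch.cuda.get_device_name(0))
    print("compute capability:", capability, "->", sm_tag(capability))
    # Architectures compiled into this torch build (not into dlib).
    print("torch arch list:", torch.cuda.get_arch_list())


if __name__ == "__main__":
    report_gpu_arch()
```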

Try the attached script and let me know.

Dan

P.S.

Also check whether the following paths exist on your side (from the file dlib\__init__.py):

if 'ON' == 'ON':
    add_lib_to_dll_path('C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v13.0/lib/x64/cudnn.lib')
    add_lib_to_dll_path('C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v13.0/lib/x64/cudart.lib')
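
One way to check those paths is sketched below. It only tests whether each file exists and does not touch the DLL search path; `missing_paths` is a hypothetical helper, and the two paths are copied from the `dlib\__init__.py` excerpt above.

```python
from pathlib import Path

# Paths copied from the dlib\__init__.py excerpt above.
CUDA_LIBS = [
    "C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v13.0/lib/x64/cudnn.lib",
    "C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v13.0/lib/x64/cudart.lib",
]


def missing_paths(paths):
    """Return the entries of `paths` that do not exist as files."""
    return [p for p in paths if not Path(p).is_file()]


if __name__ == "__main__":
    missing = missing_paths(CUDA_LIBS)
    if missing:
        for path in missing:
            print("missing:", path)
    else:
        print("all listed files exist")
```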

I placed the .dat into the 'Hybrid\64bit\Vapoursynth\Lib\site-packages\vscodeformer\models' folder and renamed it to 'mmod_human_face_detector-4cb19393.dat' to replace the existing one.
Tried Vapoursynth:

Code:
2026-02-15 16:02:50.773
Error on frame 0 request:

Traceback (most recent call last):
File "src/cython/vapoursynth.pyx", line 3222, in vapoursynth.publicFunction
File "src/cython/vapoursynth.pyx", line 3224, in vapoursynth.publicFunction
File "src/cython/vapoursynth.pyx", line 837, in vapoursynth.FuncData.__call__
File "F:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch\utils\_contextlib.py", line 124, in decorate_context
return func(*args, **kwargs)
File "F:\Hybrid\64bit\Vapoursynth\Lib\site-packages\vscodeformer\__init__.py", line 122, in inference
face_helper[local_index].get_face_landmarks_5(
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
only_center_face=only_center_face, resize=640, eye_dist_threshold=5
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "F:\Hybrid\64bit\Vapoursynth\Lib\site-packages\vscodeformer\face_restoration_helper.py", line 218, in get_face_landmarks_5
return self.get_face_landmarks_5_dlib(only_keep_largest)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
File "F:\Hybrid\64bit\Vapoursynth\Lib\site-packages\vscodeformer\face_restoration_helper.py", line 182, in get_face_landmarks_5_dlib
det_faces = self.face_detector(self.input_img, scale)
RuntimeError: Error while calling cudaOccupancyMaxPotentialBlockSize(&num_blocks,&num_threads,K) in file D:\PProjects\dlib\dlib\cuda\cuda_utils.h:164. code: 209, reason: no kernel image is available for execution on the device

Then I tried:

Code:
F:\Hybrid\64bit\Vapoursynth>python test_dlib_cuda_13.py
Traceback (most recent call last):
  File "F:\Hybrid\64bit\Vapoursynth\test_dlib_cuda_13.py", line 1, in <module>
    import dlib
  File "F:\Hybrid\64bit\Vapoursynth\Lib\site-packages\dlib\__init__.py", line 28, in <module>
    from _dlib_pybind11 import *
ImportError: DLL load failed while importing _dlib_pybind11: The specified module could not be found.

Cu Selur

I included mmod_human_face_detector.dat because it is used by the script test_dlib_cuda_13.py. CodeFormer uses the same model, but under a different name and in a different location.
In this case, however, the test script already fails when dlib is imported. This means that CUDA 13 is either missing on your PC or not located in 'C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v13.0'.
If CUDA 13 is installed there, then cuDNN is missing.
You need to:

1) Download cudnn-windows-x86_64-9.15.1.9_cuda13-archive.zip.
2) Extract the cuDNN package (it is a .zip archive, not an installer) and copy the files directly into the CUDA directory:

Code:
REM Run from the folder into which the archive was extracted.
REM copy does not expand wildcards in folder names, so the folder is spelled out.

REM Headers
copy cudnn-windows-x86_64-9.15.1.9_cuda13-archive\include\cudnn*.h ^
     "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\"

REM Import libraries
copy cudnn-windows-x86_64-9.15.1.9_cuda13-archive\lib\x64\cudnn*.lib ^
     "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\lib\x64\"

REM DLLs (for CUDA 13.x, put them in bin\x64\)
copy cudnn-windows-x86_64-9.15.1.9_cuda13-archive\bin\cudnn*.dll ^
     "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\bin\x64\"

This should fix the issue.
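
To confirm that the copy steps above landed where expected, the target folders can be listed. This is a rough sketch rather than an official check: `find_cudnn_files` is a hypothetical helper, and the sub-folders and file patterns simply mirror the copy commands above.

```python
from pathlib import Path

# Sub-folder and file pattern for each kind of cuDNN file, mirroring the copy steps.
PATTERNS = {
    "headers": ("include", "cudnn*.h"),
    "import_libs": ("lib/x64", "cudnn*.lib"),
    "dlls": ("bin/x64", "cudnn*.dll"),
}


def find_cudnn_files(cuda_root):
    """Return the cuDNN file names found under `cuda_root`, grouped by kind."""
    root = Path(cuda_root)
    found = {}
    for label, (subdir, pattern) in PATTERNS.items():
        directory = root / subdir
        if directory.is_dir():
            found[label] = sorted(p.name for p in directory.glob(pattern))
        else:
            found[label] = []
    return found


if __name__ == "__main__":
    cuda_root = "C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v13.0"
    for label, names in find_cudnn_files(cuda_root).items():
        print(label, "->", names if names else "nothing found")
```

An empty list for any of the three groups would suggest that the corresponding copy step did not reach the CUDA directory.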

Dan

P.S.
As shown in the PyTorch release compatibility matrix:

[Image: attachment.php?aid=3501]

PyTorch 2.10 with CUDA 13.0 uses cuDNN 9.15.1.9 (note that CUDA 13.0 is still considered experimental).
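
To compare the cuDNN that torch actually loads against that version, the integer returned by `torch.backends.cudnn.version()` can be decoded. A minimal sketch, assuming cuDNN's usual encoding (major*10000 + minor*100 + patch from cuDNN 9 on, major*1000 + minor*100 + patch before that); `decode_cudnn_version` is a hypothetical helper, and the fourth component of 9.15.1.9 (the build number) is not part of this integer.

```python
def decode_cudnn_version(value: int) -> tuple[int, int, int]:
    """Decode a cuDNN version integer into (major, minor, patch)."""
    if value >= 90000:
        # cuDNN 9 and later: major*10000 + minor*100 + patch
        return value // 10000, (value // 100) % 100, value % 100
    # cuDNN 8 and earlier: major*1000 + minor*100 + patch
    return value // 1000, (value // 100) % 10, value % 100


def report_cudnn() -> None:
    """Print the cuDNN version seen by torch, if torch is importable."""
    try:
        import torch
    except (ImportError, OSError):
        print("torch could not be imported in this environment")
        return
    value = torch.backends.cudnn.version()
    if value is None:
        print("torch reports no cuDNN")
    else:
        print("cuDNN:", ".".join(str(part) for part in decode_cudnn_version(value)))


if __name__ == "__main__":
    report_cudnn()
```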