Using Stable Diffusion models for Colorization
#21
Sadly, there is no 'as soon as possible' until there really is a meaningful way to add it.
¯\_(ツ)_/¯
So, unless there is a VapourSynth filter that allows interactively connecting models the way ComfyUI does, or there is a better way to connect to ComfyUI from VapourSynth, I can't really add this to Hybrid.

Cu Selur
----
Dev versions are in the 'experimental'-folder of my GoogleDrive, which is linked on the download page.
Reply
#22
Tongue 
(09.02.2026, 19:57)Dan64 Wrote: By using only Python code and adding some optimization tricks, I was able to lower the coloring time from 22 sec/image (ComfyUI) to 4 sec/image (Python only). So speed is no longer really an issue. I decided not to release an HAVC extension with Qwen-iE because the HW requirements needed to reach this speed are very high (RTX 5070 Ti and 64 GB RAM). If in the future a DiT model with lower HW requirements is released that can colorize at 4 sec/image or better, I will evaluate the possibility of adding it to HAVC.

Dan

Hi Dan64,
if you find some free time and the desire, could you share the optimized Python code for coloring frames from folder to folder?

As far as I understand, you have been using this model lately - Nunchaku Qwen Image Edit 2511?

Congratulations on a job well done, looking forward to HAVC 5.8.0  Tongue
Reply
#23
Hi didris,

   I just released this project: DiTServerRPC
   It is an XML-RPC server that exposes a GPU-accelerated colorization pipeline for black-and-white images and video frames.
   It is built on top of the Nunchaku SVDQuant FP4/INT4 transformer and the `Qwen-Image-Edit-2511` diffusion model.

   The project includes instructions for installing the server.
   The server can use both FP4 (RTX 50-Series, Blackwell) and INT4 (RTX 30/40-Series, Ampere/Ada Lovelace) quantization.
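
   For illustration, a small helper along these lines could select the matching config automatically; the compute-capability threshold below is an assumption (Blackwell reporting major version >= 10, Ampere/Ada reporting 8.x), not something the server requires:

# Hypothetical helper: pick the pipeline config matching the installed GPU.
# Assumes Blackwell (RTX 50-Series) reports CUDA compute capability >= 10,
# while Ampere/Ada (RTX 30/40-Series) report 8.x.
import torch

def pick_pipeline_config() -> str:
    if not torch.cuda.is_available():
        raise RuntimeError("A CUDA GPU is required for DiTServerRPC")
    major, _minor = torch.cuda.get_device_capability(0)
    return "qwen_config_fp4.json" if major >= 10 else "qwen_config_int4.json"

print(pick_pipeline_config())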

   You can start the server with one of the following commands (select fp4 or int4):

.venv\Scripts\activate
# RTX 50-Series
python dit_rpc_server.py --load-pipeline --pipeline-config qwen_config_fp4.json

# RTX 30 / 40-Series
python dit_rpc_server.py --load-pipeline --pipeline-config qwen_config_int4.json

   Once the server is running, open another terminal in the project directory and run

.venv\Scripts\activate
# RTX 50-Series
python dit_client_pair_example.py --pipeline-config qwen_config_fp4.json --use-shm

# RTX 30 / 40-Series
python dit_client_pair_example.py --pipeline-config qwen_config_int4.json --use-shm

  Depending on your GPU, you should obtain an inference time of about 4-5 sec per image.

Dan

P.S.
If you like this project please star it.
Reply
#24
Congrats Big Grin
----
Dev versions are in the 'experimental'-folder of my GoogleDrive, which is linked on the download page.
Reply
#25
Hello Selur,

  I just released a DiT colorize server that performs colorization using DiT models (for the moment only Qwen-Image-Edit is supported), see post #23.
  It would be helpful if you could test it, just to know if it works properly on your end.

  If it works satisfactorily on your end, I could add DiT colorization in the next HAVC release.

  Since I implemented a client/server architecture, HAVC will only need to add the client part, which is lightweight and has no dependencies.
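
  As a rough illustration, such a client needs nothing beyond the Python standard library; the colorize_frame name, argument order and result fields below are assumptions for the sketch, the actual call is in dit_client_example.py in the repository:

# Minimal sketch of a dependency-free client (xmlrpc.client is stdlib).
# The colorize_frame signature and result fields shown here are assumptions;
# see dit_client_example.py in the repository for the exact call.
import xmlrpc.client

def colorize_file(in_path: str, out_path: str,
                  host: str = "127.0.0.1", port: int = 8765) -> None:
    server = xmlrpc.client.ServerProxy(f"http://{host}:{port}/",
                                       use_builtin_types=True)
    server.ping()  # raises if the server is not reachable
    with open(in_path, "rb") as f:
        result = server.colorize_frame(f.read(), "Colorize this photo", 0, 4)
    if not result["ok"]:
        raise RuntimeError(result["msg"])
    data = result["data"]
    raw = data.data if hasattr(data, "data") else data  # Binary or bytes
    with open(out_path, "wb") as f:
        f.write(raw)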

  Please let me know.

Dan
Reply
#26
Will try to test it end of next week,... busy with tons of (rather urgent) stuff in real life which will hopefully be all cleared up / finished mid next week, but I'll give it a spin then. (assuming nothing unexpected happens)

Cu Selur
----
Dev versions are in the 'experimental'-folder of my GoogleDrive, which is linked on the download page.
Reply
#27
Many thanks for your efforts, Dan

You are truly a wizard; I gave you a star, of course. I will try it next week.
I managed to get it working with my own Python code - currently it encodes at 5 seconds per frame, but the result for a football match is not very impressive: there is a lot of confusion in the team colors, so most likely a special prompt needs to be devised. I use qwen_image_edit_2511_fp8_e4m3fn_scaled_lightning_comfyui.safetensors. If you manage to integrate it into Hybrid it will be absolutely amazing.
Reply
#28
Back at home,..

Not wanting to directly mess with the current torch add-on, here's what I did:
  • opened a terminal inside 'Hybrid\64bit\Vapoursynth'
  • put the content of the repository into DiTServerRPC-main using:
    git clone https://github.com/dan64/DiTServerRPC.git DiTServerRPC-main
  • changed into 'DiTServerRPC-main' folder
    cd DiTServerRPC-main
  • installed virtualenv (portable Python usually isn't built with venv)
    ..\python -m pip install virtualenv
  • created the venv
    ..\python.exe -m virtualenv .venv
  • activated the venv:
    .venv\Scripts\activate
  • Installed the dependencies into the venv:
    • pip install torch==2.9.1+cu128 torchvision==0.24.1+cu128 torchaudio==2.9.1+cu128 --index-url https://download.pytorch.org/whl/cu128
    • pip install https://github.com/nunchaku-ai/nunchaku/..._amd64.whl
    • pip show nunchaku
    • python patch_nunchaku.py
    • python patch_nunchaku.py --check
    • pip install packages\diffusers-0.37.0.dev0-py3-none-any.whl
    • python -c "import diffusers; print(diffusers.__version__)"
    • pip install transformers==4.57.6 accelerate==1.12.0 huggingface_hub>=0.26.0 Pillow>=10.0.0
  • started the server
    python dit_rpc_server.py
  • stopped the server and started it again with the preload:
    python dit_rpc_server.py --load-pipeline --pipeline-config qwen_config_int4.json
    that started to download a bunch of other stuff,....
  • ran the test script
    Opened another terminal where I navigated to 'Hybrid\64bit\Vapoursynth' and called
    python DiTServerRPC-main\dit_client_example.py --pipeline-config DiTServerRPC-main\qwen_config_fp4.json --use-shm
that ended with:
[INFO] Connecting to http://127.0.0.1:8765/ ...
[INFO] Server is reachable.
[INFO] Transport: shared memory
[INFO] Pipeline already loaded on server.
[INFO] Reading input image: F:\Hybrid\64bit\Vapoursynth\DiTServerRPC-main\assets\santa_bw.png
[INFO] Colorizing (1184x880 px) ...
[INFO] Inference time : 11.89s
[INFO] Round-trip time: 11.93s
[INFO] Saved: F:\Hybrid\64bit\Vapoursynth\DiTServerRPC-main\assets\santa_colorized.png
So far so good.
I then stopped the server, and called start_server.cmd:
(.venv) F:\Hybrid\64bit\Vapoursynth\DiTServerRPC-main>start_server.cmd
Der Befehl "erver.cmd" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "age:" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "rt_server.cmd" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "rt_server.cmd" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "rt_server.cmd" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "it" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "-----------------------------------------------------------------------" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "CONFIGURATION" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "------------------------------------------------------------------------" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "nda" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "NDA_ENV" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "plicit" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "xample:" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "PYTHON_EXE" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "Directory" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "eave" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "SERVER_DIR" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "Host" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "HOST" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "PORT" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "Optional" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "xample:" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "LOGFILE" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "---------------------------------------------------------------------------" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "RGUMENT" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "---------------------------------------------------------------------------" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "t" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "/i" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "/i" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "/i" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "f" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
[ERROR] Unknown precision argument: "". Use "fp4" or "int4".
=> That didn't work.

Does it really make sense to add this to Hybrid? If yes, in what way?
Adding it to the torch add-on seems like a bad idea, since updates & co could break stuff too easily. (also it's huge)
So the only way this seems to make sense would be to create a separate add-on; depending on whether it is present or not, additional options could be made available in HAVC - assuming the plan is to use this in HAVC.
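For illustration, the presence check could be as simple as the following sketch; the add-on path and the port are assumptions taken from the examples above, not a defined interface:

# Illustrative sketch: enable the DiT colorization options in HAVC only when
# the separate add-on folder exists and the RPC server answers on its port.
# ADDON_DIR and port 8765 are assumptions based on the examples above.
import socket
from pathlib import Path

ADDON_DIR = Path(r"F:\Hybrid\64bit\Vapoursynth\DiTServerRPC-main")

def dit_addon_available(host: str = "127.0.0.1", port: int = 8765) -> bool:
    if not ADDON_DIR.exists():
        return False
    try:
        with socket.create_connection((host, port), timeout=0.5):
            return True
    except OSError:
        return False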

Cu Selur

Ps.: 'DiTServerRPC-main'-folder is ~5GB in size.
----
Dev versions are in the 'experimental'-folder of my GoogleDrive, which is linked on the download page.
Reply
#29
Installation was not complicated, but I could not get it to work that way either.
I installed it on embedded Python - there are quite a few models downloaded to C:\Users\YOUR_USERNAME\.cache\huggingface\hub\ - 30.8 GB in total.

Then I created two python files in the installation folder E:\DiTServerRPC
rpc_wrapper.py
from dit_client_example import main as _run_single
def colorize_image(image_path: str):
    """
    Wrapper around working CLI logic.
    Returns bytes of output image.
    """
    # It just uses the existing working flow
    # (we do not touch the RPC logic)
    return _run_single(image_path)

batch_colorize_turbo.py
import xmlrpc.client
import io
from pathlib import Path
from PIL import Image
import time
# =========================
# CONFIG
# =========================
HOST = "127.0.0.1"
PORT = 8765
INPUT_DIR = Path("assets")
OUTPUT_DIR = Path("output")
OUTPUT_DIR.mkdir(exist_ok=True)
SUPPORTED = {".png", ".jpg", ".jpeg", ".webp"}
PROMPT = "Colorize this photo, natural skin tones, cinematic lighting"
STEPS = 4
# =========================
# HELPERS
# =========================
def pil_to_bytes(img):
    buf = io.BytesIO()
    img.save(buf, format="PNG")
    return buf.getvalue()
def bytes_to_pil(data):
    raw = data.data if hasattr(data, "data") else data
    return Image.open(io.BytesIO(raw)).convert("RGB")
# =========================
# MAIN
# =========================
def main():
    print("🚀 Connecting...")
    server = xmlrpc.client.ServerProxy(
        f"http://{HOST}:{PORT}/",
        use_builtin_types=True
    )
    server.ping()
    print("✅ Server OK")
    images = sorted([
        p for p in INPUT_DIR.iterdir()
        if p.suffix.lower() in SUPPORTED
    ])
    print(f"🚀 Found {len(images)} images")
    start = time.perf_counter()
    # 🔥 IMPORTANT: SERIAL (GPU-safe)
    for img_path in images:
        print(f"[RPC] {img_path.name}")
        img = Image.open(img_path).convert("RGB")
        try:
            result = server.colorize_frame(
                pil_to_bytes(img),
                PROMPT,
                0,
                STEPS
            )
            if not result["ok"]:
                print(f"❌ ERROR: {result['msg']}")
                continue
            out = bytes_to_pil(result["data"])
            out_path = OUTPUT_DIR / img_path.name
            out.save(out_path)
            print(f"✅ Saved: {out_path}")
        except Exception as e:
            print(f"❌ ERROR {img_path.name}: {e}")
    print(f"\n⚡ TOTAL TIME: {time.perf_counter() - start:.2f}s")
if __name__ == "__main__":
    main()

I added this function to the existing dit_client_example.py
colorize_image()

I run it with these commands in PowerShell:
cd E:\DiTServerRPC

E:\python_embeded\python.exe dit_rpc_server.py --load-pipeline --pipeline-config qwen_config_fp4.json

then in another PowerShell window:
cd E:\DiTServerRPC

E:\python_embeded\python.exe batch_colorize_turbo.py

What it does is the following: it takes the frames one by one from the folder "E:\DiTServerRPC\assets", processes them one after the other automatically, and writes the results to the folder "E:\DiTServerRPC\output" while keeping the same name, resolution and JPG format.

The result is an average of 9 seconds per image, using an average of 23 GB of GPU memory and 39 GB of RAM during the process (my card is an RTX 5090). It is probably possible to improve the time, but it would be at the expense of quality. It does not colorize quite evenly - on different frames the same object is sometimes colored differently - probably a lot depends on what is set in the prompt.
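
For example (purely illustrative and untested), a more specific prompt might pin the team colors instead of the generic one used in the script above; the teams and colors below are placeholders and would have to match the actual footage:

# Hypothetical, more specific prompt for football footage; team names and
# colors are placeholders, not values tested with the server.
PROMPT = ("Colorize this photo of a football match, "
          "home team in red shirts, away team in white shirts, "
          "green grass pitch, natural skin tones, daylight")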

Once again, thanks to Dan and Selur for what they have done.
Reply
#30
.cache/huggingface is 'only' 26.4 GB here, but yes, for a portable version those downloads need to end up somewhere else,...
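
One way to do that (a sketch, assuming the standard huggingface_hub behaviour of honouring the HF_HOME environment variable; the launcher name and paths are made up for the example) would be a small wrapper that points the cache into the portable folder before starting the server:

# launch_dit_server.py - hypothetical launcher that keeps the Hugging Face
# downloads inside the portable folder instead of %USERPROFILE%\.cache.
# HF_HOME is the standard huggingface_hub cache override; paths are examples.
import os
import subprocess
import sys
from pathlib import Path

here = Path(__file__).resolve().parent
os.environ["HF_HOME"] = str(here / "hf_cache")  # model downloads land here

subprocess.run(
    [sys.executable, "dit_rpc_server.py",
     "--load-pipeline", "--pipeline-config", "qwen_config_int4.json"],
    cwd=here,
    env=os.environ.copy(),
)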
----
Dev versions are in the 'experimental'-folder of my GoogleDrive, which is linked on the download page.
Reply

