Using Stable Diffusion models for Colorization
#21
Sadly, there is no 'as soon as possible' until there really is a meaningful way to add it.
¯\_(ツ)_/¯
So, unless there is a VapourSynth filter that allows interactively connecting models the way ComfyUI does, or there is a better way to connect to ComfyUI from VapourSynth, I can't really add this to Hybrid.

Cu Selur
----
Dev versions are in the 'experimental'-folder of my GoogleDrive, which is linked on the download page.
Reply
#22
Tongue 
(09.02.2026, 19:57)Dan64 Wrote: By using only Python code and adding some optimization tricks, I was able to lower the coloring time from 22 sec/image (ComfyUI) to 4 sec/image (Python only). So speed is no longer really an issue. I decided not to release an HAVC extension with Qwen-iE because the HW requirements needed to reach this speed are very high (RTX 5070 Ti and 64 GB RAM). If in the future a DiT model with lower HW requirements is released that can colorize at 4 sec/image or better, I will evaluate the possibility of adding it to HAVC.

Dan

Hi Dan64,
if you find some free time and the desire, could you share the optimized Python code for coloring frames from folder to folder?

As far as I understand, you have been using this model lately - Nunchaku Qwen Image Edit 2511?

Congratulations on a job well done, looking forward to HAVC 5.8.0  Tongue
Reply
#23
Hi didris,

   I just released this project: DiTServerRPC
   It is an XML-RPC server that exposes a GPU-accelerated colorization pipeline for black-and-white images and video frames.
   It is built on top of the Nunchaku SVDQuant FP4/INT4 transformer and the `Qwen-Image-Edit-2511` diffusion model.

   The project includes instructions for installing the server.
   The server can use both FP4 (RTX 50-Series, Blackwell) and INT4 (RTX 30/40-Series, Ampere/Ada Lovelace) quantization.
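
   For illustration, a small helper along these lines could select the matching config automatically; the compute-capability threshold below is an assumption (Blackwell reporting major version >= 10, Ampere/Ada reporting 8.x), not something the server requires:

# Hypothetical helper: pick the pipeline config matching the installed GPU.
# Assumes Blackwell (RTX 50-Series) reports CUDA compute capability >= 10,
# while Ampere/Ada (RTX 30/40-Series) report 8.x.
import torch

def pick_pipeline_config() -> str:
    if not torch.cuda.is_available():
        raise RuntimeError("A CUDA GPU is required for DiTServerRPC")
    major, _minor = torch.cuda.get_device_capability(0)
    return "qwen_config_fp4.json" if major >= 10 else "qwen_config_int4.json"

print(pick_pipeline_config())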

   You can start the server with one of the following commands (select fp4 or int4):

.venv\Scripts\activate
# RTX 50-Series
python dit_rpc_server.py --load-pipeline --pipeline-config qwen_config_fp4.json

# RTX 30 / 40-Series
python dit_rpc_server.py --load-pipeline --pipeline-config qwen_config_int4.json

   Once the server is running, open another terminal in the project directory and run

.venv\Scripts\activate
# RTX 50-Series
python dit_client_pair_example.py --pipeline-config qwen_config_fp4.json --use-shm

# RTX 30 / 40-Series
python dit_client_pair_example.py --pipeline-config qwen_config_int4.json --use-shm

  Depending on your GPU, you should obtain an inference time of about 4-5 sec per image.

Dan

P.S.
If you like this project please star it.
Reply
#24
Congrats Big Grin
----
Dev versions are in the 'experimental'-folder of my GoogleDrive, which is linked on the download page.
Reply
#25
Hello Selur,

  I just released a DiT colorize server that performs colorization using DiT models (for the moment only Qwen-Image-Edit is supported), see post #23.
  It would be helpful if you could test it, just to know if it works properly on your end.

  If it works satisfactorily on your end, I could add DiT colorization in the next HAVC release.

  Since I implemented a client/server architecture, HAVC will only need to add the client part, which is lightweight and has no dependencies.
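
  As a rough illustration, such a client needs nothing beyond the Python standard library; the colorize_frame name, argument order and result fields below are assumptions for the sketch, the actual call is in dit_client_example.py in the repository:

# Minimal sketch of a dependency-free client (xmlrpc.client is stdlib).
# The colorize_frame signature and result fields shown here are assumptions;
# see dit_client_example.py in the repository for the exact call.
import xmlrpc.client

def colorize_file(in_path: str, out_path: str,
                  host: str = "127.0.0.1", port: int = 8765) -> None:
    server = xmlrpc.client.ServerProxy(f"http://{host}:{port}/",
                                       use_builtin_types=True)
    server.ping()  # raises if the server is not reachable
    with open(in_path, "rb") as f:
        result = server.colorize_frame(f.read(), "Colorize this photo", 0, 4)
    if not result["ok"]:
        raise RuntimeError(result["msg"])
    data = result["data"]
    raw = data.data if hasattr(data, "data") else data  # Binary or bytes
    with open(out_path, "wb") as f:
        f.write(raw)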

  Please let me know.

Dan
Reply
#26
Will try to test it end of next week,... busy with tons of (rather urgent) stuff in real life which will hopefully be all cleared up / finished mid next week, but I'll give it a spin then. (assuming nothing unexpected happens)

Cu Selur
----
Dev versions are in the 'experimental'-folder of my GoogleDrive, which is linked on the download page.
Reply
#27
Many thanks for your efforts, Dan

You are truly a wizard; I gave you a star, of course. I will try it next week.
I managed to get it working with my own Python code - currently it encodes at 5 seconds per frame, but the result for a football match is not very impressive: there is a lot of confusion in the team colors, so most likely a special prompt needs to be devised. I use qwen_image_edit_2511_fp8_e4m3fn_scaled_lightning_comfyui.safetensors. If you manage to integrate it into Hybrid it will be absolutely amazing.
Reply
#28
Back at home,..

Not wanting to directly mess with the current torch add-on, here's what I did:
  • opened a terminal inside 'Hybrid\64bit\Vapoursynth'
  • put the content of the repository into DiTServerRPC-main using:
    git clone https://github.com/dan64/DiTServerRPC.git DiTServerRPC-main
  • changed into 'DiTServerRPC-main' folder
    cd DiTServerRPC-main
  • installed virtualenv (portable Python usually isn't built with venv)
    ..\python -m pip install virtualenv
  • created the venv
    ..\python.exe -m virtualenv .venv
  • activated the venv:
    .venv\Scripts\activate
  • Installed the dependencies into the venv:
    • pip install torch==2.9.1+cu128 torchvision==0.24.1+cu128 torchaudio==2.9.1+cu128 --index-url https://download.pytorch.org/whl/cu128
    • pip install https://github.com/nunchaku-ai/nunchaku/..._amd64.whl
    • pip show nunchaku
    • python patch_nunchaku.py
    • python patch_nunchaku.py --check
    • pip install packages\diffusers-0.37.0.dev0-py3-none-any.whl
    • python -c "import diffusers; print(diffusers.__version__)"
    • pip install transformers==4.57.6 accelerate==1.12.0 huggingface_hub>=0.26.0 Pillow>=10.0.0
  • started the server
    python dit_rpc_server.py
  • stopped the server and started it again with the preload:
    python dit_rpc_server.py --load-pipeline --pipeline-config qwen_config_int4.json
    that started to download a bunch of other stuff,....
  • ran the test script
    Opened another terminal where I navigated to 'Hybrid\64bit\Vapoursynth' and called
    python DiTServerRPC-main\dit_client_example.py --pipeline-config DiTServerRPC-main\qwen_config_fp4.json --use-shm
that ended with:
[INFO] Connecting to http://127.0.0.1:8765/ ...
[INFO] Server is reachable.
[INFO] Transport: shared memory
[INFO] Pipeline already loaded on server.
[INFO] Reading input image: F:\Hybrid\64bit\Vapoursynth\DiTServerRPC-main\assets\santa_bw.png
[INFO] Colorizing (1184x880 px) ...
[INFO] Inference time : 11.89s
[INFO] Round-trip time: 11.93s
[INFO] Saved: F:\Hybrid\64bit\Vapoursynth\DiTServerRPC-main\assets\santa_colorized.png
So far so good.
I then stopped the server, and called start_server.cmd:
(.venv) F:\Hybrid\64bit\Vapoursynth\DiTServerRPC-main>start_server.cmd
Der Befehl "erver.cmd" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "age:" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "rt_server.cmd" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "rt_server.cmd" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "rt_server.cmd" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "it" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "-----------------------------------------------------------------------" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "CONFIGURATION" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "------------------------------------------------------------------------" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "nda" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "NDA_ENV" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "plicit" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "xample:" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "PYTHON_EXE" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "Directory" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "eave" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "SERVER_DIR" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "Host" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "HOST" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "PORT" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "Optional" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "xample:" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "LOGFILE" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "---------------------------------------------------------------------------" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "RGUMENT" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "---------------------------------------------------------------------------" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "t" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "/i" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "/i" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "/i" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Der Befehl "f" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
[ERROR] Unknown precision argument: "". Use "fp4" or "int4".
=> That didn't work.

Does it really make sense to add this to Hybrid? If yes, in what way?
Adding it to the torch add-on seems like a bad idea, since updates & co could break stuff too easily. (also it's huge)
So the only way this seems to make sense would be to create a separate add-on; depending on whether it is present or not, additional options could be made available in HAVC - assuming the plan is to use this in HAVC.
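For illustration, the presence check could be as simple as the following sketch; the add-on path and the port are assumptions taken from the examples above, not a defined interface:

# Illustrative sketch: enable the DiT colorization options in HAVC only when
# the separate add-on folder exists and the RPC server answers on its port.
# ADDON_DIR and port 8765 are assumptions based on the examples above.
import socket
from pathlib import Path

ADDON_DIR = Path(r"F:\Hybrid\64bit\Vapoursynth\DiTServerRPC-main")

def dit_addon_available(host: str = "127.0.0.1", port: int = 8765) -> bool:
    if not ADDON_DIR.exists():
        return False
    try:
        with socket.create_connection((host, port), timeout=0.5):
            return True
    except OSError:
        return False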

Cu Selur

Ps.: 'DiTServerRPC-main'-folder is ~5GB in size.
----
Dev versions are in the 'experimental'-folder of my GoogleDrive, which is linked on the download page.
Reply
#29
Installation was not complicated, but I could not get it to work that way either.
I installed it on embedded Python - there are quite a few models downloaded to C:\Users\YOUR_USERNAME\.cache\huggingface\hub\ - 30.8 GB in total.

Then I created two python files in the installation folder E:\DiTServerRPC
rpc_wrapper.py
from dit_client_example import main as _run_single
def colorize_image(image_path: str):
    """
    Wrapper around working CLI logic.
    Returns bytes of output image.
    """
    # It just uses the existing working flow
    # (we do not touch the RPC logic)
    return _run_single(image_path)

batch_colorize_turbo.py
import xmlrpc.client
import io
from pathlib import Path
from PIL import Image
import time
# =========================
# CONFIG
# =========================
HOST = "127.0.0.1"
PORT = 8765
INPUT_DIR = Path("assets")
OUTPUT_DIR = Path("output")
OUTPUT_DIR.mkdir(exist_ok=True)
SUPPORTED = {".png", ".jpg", ".jpeg", ".webp"}
PROMPT = "Colorize this photo, natural skin tones, cinematic lighting"
STEPS = 4
# =========================
# HELPERS
# =========================
def pil_to_bytes(img):
    buf = io.BytesIO()
    img.save(buf, format="PNG")
    return buf.getvalue()
def bytes_to_pil(data):
    raw = data.data if hasattr(data, "data") else data
    return Image.open(io.BytesIO(raw)).convert("RGB")
# =========================
# MAIN
# =========================
def main():
    print("🚀 Connecting...")
    server = xmlrpc.client.ServerProxy(
        f"http://{HOST}:{PORT}/",
        use_builtin_types=True
    )
    server.ping()
    print("✅ Server OK")
    images = sorted([
        p for p in INPUT_DIR.iterdir()
        if p.suffix.lower() in SUPPORTED
    ])
    print(f"🚀 Found {len(images)} images")
    start = time.perf_counter()
    # 🔥 IMPORTANT: SERIAL (GPU-safe)
    for img_path in images:
        print(f"[RPC] {img_path.name}")
        img = Image.open(img_path).convert("RGB")
        try:
            result = server.colorize_frame(
                pil_to_bytes(img),
                PROMPT,
                0,
                STEPS
            )
            if not result["ok"]:
                print(f"❌ ERROR: {result['msg']}")
                continue
            out = bytes_to_pil(result["data"])
            out_path = OUTPUT_DIR / img_path.name
            out.save(out_path)
            print(f"✅ Saved: {out_path}")
        except Exception as e:
            print(f"❌ ERROR {img_path.name}: {e}")
    print(f"\n⚡ TOTAL TIME: {time.perf_counter() - start:.2f}s")
if __name__ == "__main__":
    main()

I added this function to the existing dit_client_example.py
colorize_image()

I run it with these commands in PowerShell:
cd E:\DiTServerRPC

E:\python_embeded\python.exe dit_rpc_server.py --load-pipeline --pipeline-config qwen_config_fp4.json

then in another PowerShell window:
cd E:\DiTServerRPC

E:\python_embeded\python.exe batch_colorize_turbo.py

What it does is the following: it takes the frames one by one from the folder "E:\DiTServerRPC\assets", processes them one after the other automatically, and writes the results to the folder "E:\DiTServerRPC\output" while keeping the same name, resolution and JPG format.

The result is an average of 9 seconds per image, using an average of 23 GB of GPU memory and 39 GB of RAM during the process (my card is an RTX 5090). It is probably possible to improve the time, but it would be at the expense of quality. It does not colorize quite evenly - on different frames the same object is sometimes colored differently - probably a lot depends on what is set in the prompt.
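
For example (purely illustrative and untested), a more specific prompt might pin the team colors instead of the generic one used in the script above; the teams and colors below are placeholders and would have to match the actual footage:

# Hypothetical, more specific prompt for football footage; team names and
# colors are placeholders, not values tested with the server.
PROMPT = ("Colorize this photo of a football match, "
          "home team in red shirts, away team in white shirts, "
          "green grass pitch, natural skin tones, daylight")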

Once again, thanks to Dan and Selur for what they have done.
Reply
#30
.cache/huggingface is 'only' 26.4 GB here, but yes, for a portable version those downloads need to end up somewhere else,...
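
One way to do that (a sketch, assuming the standard huggingface_hub behaviour of honouring the HF_HOME environment variable; the launcher name and paths are made up for the example) would be a small wrapper that points the cache into the portable folder before starting the server:

# launch_dit_server.py - hypothetical launcher that keeps the Hugging Face
# downloads inside the portable folder instead of %USERPROFILE%\.cache.
# HF_HOME is the standard huggingface_hub cache override; paths are examples.
import os
import subprocess
import sys
from pathlib import Path

here = Path(__file__).resolve().parent
os.environ["HF_HOME"] = str(here / "hf_cache")  # model downloads land here

subprocess.run(
    [sys.executable, "dit_rpc_server.py",
     "--load-pipeline", "--pipeline-config", "qwen_config_int4.json"],
    cwd=here,
    env=os.environ.copy(),
)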
----
Dev versions are in the 'experimental'-folder of my GoogleDrive, which is linked on the download page.
Reply

