
Using Stable Diffusion models for Colorization
#5
Good News: I was finally able to build a working prototype.
Bad News: the hardware requirements are very high: an RTX 50xx GPU (Blackwell) and 64GB of RAM.

The main recent innovation in the diffusion model family is the introduction of Transformer technology (the same technology used by current LLMs).
Transformers have improved diffusion models so significantly that the new generation is called DiT (Diffusion Transformer).

But Transformer models are memory-hungry, as evidenced by the recent RAM shortage and the rather crazy increase in RAM prices (fortunately I upgraded my PC some months before the shortage).

I can do nothing to solve this problem myself; I hope that models with lower hardware requirements will be released in the coming months.

But the problem is not easy to solve. For example, these are the RAM requirements for the model I'm using:

[Image: attachment.php?aid=3463]

As you can see, even though the storage size of the model on disk is about 23GB (not too much for a DiT model), the RAM usage is about 2x that, i.e. 46GB. On top of this you have to add the RAM used by the OS and background programs, about 17GB, so the total RAM necessary to perform the colorization is about 63GB.
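To make the arithmetic explicit, here is a minimal Python sketch of the estimate (the 2x expansion factor and the 17GB OS overhead are just the figures observed on my system, not universal constants):

    def estimate_total_ram_gb(model_disk_size_gb: float,
                              expansion_factor: float = 2.0,
                              os_overhead_gb: float = 17.0) -> float:
        """Rough total system RAM needed to run the model, in GB."""
        model_ram_gb = model_disk_size_gb * expansion_factor  # weights expanded in memory
        return model_ram_gb + os_overhead_gb                  # plus OS and background programs

    print(estimate_total_ram_gb(23.0))  # -> 63.0, matching the ~63GB total above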

But the use of DiT models for colorization is a real game changer, as shown in the image below.

[Image: attachment.php?aid=3462]

 
I recently started colorizing my old B&W films with the latest version of HAVC. Naturally, I had to accept a lot of color compromises, but when I colorized the film Miracle on 34th Street (1947) and saw Santa Claus wearing a gray/brown costume, I rebelled and began to delve into the technology for colorizing photos with DiT models. The results I achieved are astonishing.

As a test, I tried colorizing the film Miracle on 34th Street (138687 frames) using the following pipeline (a code sketch of the whole flow follows the list):

1) export of the reference frames; in this case I obtained about 3000+ frames (time: 25m)
2) selection of the key reference frames; I obtained 994 key frames (time: 20m)
3) colorization of the key frames using Nunchaku Qwen Image Edit 2509 (pure Python code, not using ComfyUI), at a speed of 12 sec/frame (time: 3h30m)
4) colorization of the full movie using HAVC(ColorMNet) with Vivid=False (time: 2h40m)
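To show how the stages fit together, here is a minimal Python sketch of the pipeline. Every helper name below (export_reference_frames, select_key_frames, colorize_with_qwen_image_edit, havc_colormnet_colorize) is a hypothetical placeholder standing in for my own scripts and for the HAVC/Nunchaku calls; none of them is an actual API.

    # Hypothetical sketch of the four-stage pipeline described above.
    # All helpers are placeholders, not real HAVC or Nunchaku functions.

    def export_reference_frames(movie_path):       # stage 1: ~3000+ frames, ~25m
        raise NotImplementedError("placeholder for the reference-frame exporter")

    def select_key_frames(ref_frames):             # stage 2: 994 key frames, ~20m
        raise NotImplementedError("placeholder for the key-frame selector")

    def colorize_with_qwen_image_edit(frame):      # stage 3: ~12 sec/frame, ~3h30m
        raise NotImplementedError("placeholder for the DiT colorizer")

    def havc_colormnet_colorize(movie_path, key_frames, vivid, output):
        raise NotImplementedError("placeholder for HAVC(ColorMNet)")  # stage 4: ~2h40m

    def colorize_movie(bw_movie_path: str, out_movie_path: str) -> None:
        ref_frames = export_reference_frames(bw_movie_path)             # 1) export refs
        key_frames = select_key_frames(ref_frames)                      # 2) pick key frames
        colored_keys = [colorize_with_qwen_image_edit(f) for f in key_frames]  # 3) DiT pass
        havc_colormnet_colorize(bw_movie_path, colored_keys,            # 4) propagate colors
                                vivid=False, output=out_movie_path)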

The total time required to colorize the film was about 6h55m.

The time necessary to colorize the movie with HAVC alone (preset "slower") was 6h37m.

So the new approach requires only 18m more than the old HAVC approach.
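For anyone checking the numbers, the totals add up as follows (a trivial Python check, using only the per-stage times listed above):

    stage_minutes = {
        "export reference frames": 25,
        "select key frames": 20,
        "colorize key frames": 3 * 60 + 30,    # 3h30m
        "HAVC(ColorMNet) pass": 2 * 60 + 40,   # 2h40m
    }
    total = sum(stage_minutes.values())   # 415 minutes
    print(divmod(total, 60))              # (6, 55) -> 6h55m
    print(total - (6 * 60 + 37))          # 18 -> minutes more than HAVC alone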

You can watch the colorized film at this link (Santa Claus is perfectly colored): miracle-on-34th-street-colorized-1947

Dan

