Using Stable Diffusion models for Colorization
#95
(10.06.2026, 11:05)safshe Wrote: Based on my initial testing, here are my observations and a few feature suggestions:


1. Temporal Color Consistency Issue
While the model successfully colorizes and outputs all images, consistency across frames is a major issue. The model tends to colorize the same shot differently from frame to frame. For example, in a tracking shot where a man walks from a distance toward the camera, his shirt color shifts multiple times throughout the sequence.
2. Shot-Based Segmentation & Keyframe Reference (Feature Suggestion)
Since we already have scene detection capabilities, we could leverage it to fix this consistency problem. Here is a potential workflow:
  • Automated Batching: The system could automatically segment each detected shot into its own dedicated folder.
  • Keyframe Guidance: Once the first image (or a chosen keyframe) is colorized, the model could use it as a reference for the remaining frames in that folder. A prompt fallback like "colorize the remaining images using the color profile and references from the first image" could drastically improve uniformity.
  • Granular Control via GUI: Adding a dedicated GUI tab for shot management would be incredibly helpful. If a specific shot's colorization fails or looks off, we could easily navigate to that shot's folder via the interface and re-run the colorization for just that sequence.
3. Manual Reference & Control Net Tab (Feature Suggestion)
For scenes requiring high accuracy—such as maintaining the specific historical colors of an institutional logo, emblem, or uniform—we need a way to manually intervene.
  • It would be fantastic to have an additional tab where we can upload a specific external reference image.
  • We could then instruct the model with a prompt like: "colorize this sequence, but match the emblem's colors exactly to the attached reference image."
Implementing these features would give us the comprehensive, granular control needed to restore and colorize video content with professional accuracy.

Hi safshe,

Thank you for your observations. I have already tried to find solutions to some of the issues raised in your post; my thoughts are below.

1. Temporal Color Consistency Issue

The only "reasonable" solution to this problem is to enforce color consistency manually, by inspecting the colored frames in the folder "ref_qwen". If reference frames are missing, you can add them manually to the folder "ref_tht10" and re-run the colorization task. The program will colorize only the missing frames (there is no need to restart the colorization from scratch). If there are frames with inconsistent colors, you can remove or modify them.
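
To illustrate the "only the missing frames" behavior, here is a minimal sketch of how such a check could work. It is not the program's actual code: the function name, the file names, and the assumption that a colored frame in "ref_qwen" shares its base file name with the reference frame in "ref_tht10" are all hypothetical.

```python
from pathlib import Path
from typing import List


def frames_to_colorize(ref_dir: str, colored_dir: str) -> List[Path]:
    """Return the reference frames that have no colored counterpart yet.

    Assumption (not confirmed by the program): a colored frame has the
    same file stem as its reference frame, whatever the extension.
    """
    ref_files = {p.stem: p for p in Path(ref_dir).iterdir() if p.is_file()}

    colored_path = Path(colored_dir)
    if colored_path.is_dir():
        done = {p.stem for p in colored_path.iterdir() if p.is_file()}
    else:
        done = set()  # nothing has been colorized yet

    # Keep only the reference frames that still need colorization,
    # in a stable (sorted) order.
    return [ref_files[stem] for stem in sorted(ref_files) if stem not in done]
```

With a check like this, deleting a badly colored frame from "ref_qwen" makes it "missing" again, so the next run would colorize only that frame.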

2. Shot-Based Segmentation & Keyframe Reference (Feature Suggestion)

The program already implements a scene-detection algorithm that I consider quite good. The algorithm identifies scene boundaries by analyzing structural differences between frames rather than relying solely on raw pixel changes. The core method computes frame differences between temporally offset frames and enhances them using an edge mask built from Kirsch and TCanny operators. This produces an edge-weighted difference metric, which emphasizes meaningful structural changes (e.g., object boundaries) while reducing sensitivity to noise or flat regions. Scene changes are detected when both:
  • the global frame difference, and
  • the edge-weighted difference
exceed configurable thresholds, while also respecting a minimum distance between cuts. Additional safeguards include:
  • luma filtering, which rejects frames that are too dark or too bright,
  • override conditions for very strong changes or external detector hints.
Optionally, a second stage refines detections using SSIM and histogram comparison, removing false positives when consecutive frames are still perceptually similar.
The algorithm annotates each frame with scene-change flags and metadata, providing both detection results and information about the decision process.  
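
The first-stage decision rule described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the program's implementation: the per-frame metrics are assumed to be precomputed, all names and threshold values are hypothetical, and I assume here that the override replaces only the two-threshold test (the luma filter and the minimum distance still apply). The optional SSIM/histogram refinement stage is not covered.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FrameStats:
    """Per-frame measurements, assumed to be computed elsewhere."""
    global_diff: float  # difference against the temporally offset frame
    edge_diff: float    # the same difference, weighted by the edge mask
    luma: float         # average brightness of the frame (0..255)
    hint: bool = False  # True if an external detector flagged this frame


def detect_cuts(frames: List[FrameStats],
                diff_thr: float = 0.10,
                edge_thr: float = 0.15,
                override_thr: float = 0.50,
                luma_min: float = 16.0,
                luma_max: float = 235.0,
                min_gap: int = 10) -> List[int]:
    """Return the indices of frames flagged as scene changes."""
    cuts: List[int] = []
    last_cut: Optional[int] = None

    for i, f in enumerate(frames):
        # Respect the minimum distance between two cuts.
        if last_cut is not None and i - last_cut < min_gap:
            continue

        # Luma filter: reject frames that are too dark or too bright.
        if not (luma_min <= f.luma <= luma_max):
            continue

        # Normal rule: BOTH metrics must exceed their thresholds.
        both_exceeded = f.global_diff > diff_thr and f.edge_diff > edge_thr

        # Override: a very strong change or an external detector hint.
        override = f.global_diff > override_thr or f.hint

        if both_exceeded or override:
            cuts.append(i)
            last_cut = i

    return cuts
```

Requiring both metrics means that a large raw difference with little structural change (for example a flash or a global brightness shift) does not trigger a cut on its own, which matches the intent of weighting the difference by edges.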

I developed this algorithm because I was unable to find a good scene-detection algorithm in the open-source world. Shot-based segmentation and keyframe-reference colorization are already handled by two tasks: 1) Extract Reference Frames and 2) Colorize Frames. However, as I wrote in my previous answer, a perfect result requires manual adjustment. Do not expect to be able to do this automatically.

3. Manual Reference & Control Net Tab (Feature Suggestion)

I have already tried changing the prompt to enforce color consistency, but the results were bad. For example, because in one clip the car was colored both blue and red, I asked in the prompt to always colorize cars in blue; the result was that Qwen added a blue car even in frames where there was no car. I also tried giving Qwen two images as input and asking the model to colorize the first image using the colors in the second. The result was bad; it seems that Qwen was not trained to handle this type of prompt properly. Unless models trained to enforce color consistency become available in the future, the only viable solution is the one described in point 1.

Dan

Messages In This Thread
RE: Using Stable Diffusion models for Colorization - by Dan64 - 10.06.2026, 16:05