Selur's Little Message Board

Full Version: AV delay issue
Selur - I am having an issue with an encode that I have narrowed down to the audio/video sync. If I encode the video, the output file jumps to 15 seconds immediately upon loading, skipping over the opening text. When I reduce the delay to 0 ms, it opens and plays the text but plays a bunch of static until the 15 second mark. Debug output, output files, and input are attached.

Input/output videos - https://drive.google.com/drive/folders/1...sp=sharing
MediaInfo reports:
Code:
Audio #1 Count : 321 Count of stream of this kind : 2 Kind of stream : Audio Kind of stream : Audio Stream identifier : 0 Stream identifier : 1 StreamOrder : 1 ID : 2 ID : 2 ID in the original source medium : 189-128 ID in the original source medium : 189 (0xBD)128 (0x80) Unique ID : 16563033064968054267 Format : AC-3 Format : AC-3 Format/Info : Audio Coding 3 Format/Url : https://en.wikipedia.org/wiki/AC3 Commercial name : Dolby Digital Commercial name : Dolby Digital Format settings, Endianness : Big Codec ID : A_AC3 Duration : 15136.000000 Duration : 15 s 136 ms Duration : 15 s 136 ms Duration : 15 s 136 ms Duration : 00:00:15.136 Duration : 00:00:15.136 Bit rate mode : CBR Bit rate mode : Constant Bit rate : 448000 Bit rate : 448 kb/s Channel(s) : 6 Channel(s) : 6 channels Channel positions : Front: L C R, Side: L R, LFE Channel positions : 3/2/0.1 Channel layout : L R C LFE Ls Rs Samples per frame : 1536 Sampling rate : 48000 Sampling rate : 48.0 kHz Samples count : 726528 Frame rate : 31.250 Frame rate : 31.250 FPS (1536 SPF) Frame count : 473 Compression mode : Lossy Compression mode : Lossy Delay : 14948 Delay : 14 s 948 ms Delay : 14 s 948 ms Delay : 14 s 948 ms Delay : 00:00:14.948 Delay : 00:00:14.948 Delay, origin : Container Delay, origin : Container Delay relative to video : 14948 Delay relative to video : 14 s 948 ms Delay relative to video : 14 s 948 ms Delay relative to video : 14 s 948 ms Delay relative to video : 00:00:14.948 Delay relative to video : 00:00:14.948 Stream size : 847616 Stream size : 828 KiB (3%) Stream size : 828 KiB Stream size : 828 KiB Stream size : 828 KiB Stream size : 827.8 KiB Stream size : 828 KiB (3%) Proportion of this stream : 0.03317 Title : Surround 5.1 Language : en Language : English Language : English Language : en Language : eng Language : en Service kind : CM Service kind : Complete Main Default : Yes Default : Yes Forced : No Forced : No Original source medium : DVD-Video bsid : 8 Dialog Normalization : -27 
Dialog Normalization : -27 dB compr : -0.28 compr : -0.28 dB acmod : 7 lfeon : 1 cmixlev : -3.0 cmixlev : -3.0 dB surmixlev : -3 dB surmixlev : -3 dB mixlevel : 85 mixlevel : 85 dB roomtyp : Small dialnorm_Average : -27 dialnorm_Average : -27 dB dialnorm_Minimum : -27 dialnorm_Minimum : -27 dB dialnorm_Maximum : -27 dialnorm_Maximum : -27 dB dialnorm_Count : 372 compr_Average : 4.63 compr_Average : 4.63 dB compr_Minimum : 0.53 compr_Minimum : 0.53 dB compr_Maximum : 5.74 compr_Maximum : 5.74 dB compr_Count : 364 dynrng_Average : 4.62 dynrng_Average : 4.62 dB dynrng_Minimum : 0.00 dynrng_Minimum : 0.00 dB dynrng_Maximum : 5.88 dynrng_Maximum : 5.88 dB dynrng_Count : 372
for the source. The important part is:
Code:
Delay : 14948 Delay : 14 s 948 ms Delay : 14 s 948 ms Delay : 14 s 948 ms Delay : 00:00:14.948 Delay : 00:00:14.948 Delay, origin : Container Delay, origin : Container Delay relative to video : 14948 Delay relative to video : 14 s 948 ms Delay relative to video : 14 s 948 ms Delay relative to video : 14 s 948 ms Delay relative to video : 00:00:14.948 Delay relative to video : 00:00:14.948
When loading the source Hybrid reports:
" Using container length (30.084) instead of audio length(15.136)"
MediaInfo reports the video length as:
Code:
Duration : 29996.000000 Duration : 29 s 996 ms Duration : 29 s 996 ms Duration : 29 s 996 ms Duration : 00:00:29.996 Duration : 00:00:29:29 Duration : 00:00:29.996 (00:00:29:29)
the audio length as:
Code:
Duration : 15136.000000 Duration : 15 s 136 ms Duration : 15 s 136 ms Duration : 15 s 136 ms Duration : 00:00:15.136 Duration : 00:00:15.136
and the container:
Code:
Duration : 30084 Duration : 30 s 84 ms Duration : 30 s 84 ms Duration : 30 s 84 ms Duration : 00:00:30.084 Duration : 00:00:29:29 Duration : 00:00:30.084 (00:00:29:29)

Looking at the video frames there are ~1800 frames after bobbing, so 30 seconds seems correct.
So the general handling seems correct to me.
Audio is ~15 seconds, video is ~30 seconds. By design the audio starts ~15 seconds after the video.
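
The numbers in the MediaInfo reports above can be cross-checked with a little arithmetic. The following is only a rough sketch: the values are copied from the reports, while the variable and function names are made up for illustration and are not anything Hybrid or MediaInfo uses internally.

```python
# Values copied from the MediaInfo reports above (milliseconds unless noted).
AUDIO_DELAY_MS = 14948
AUDIO_DURATION_MS = 15136
CONTAINER_DURATION_MS = 30084

SAMPLES_PER_FRAME = 1536
FRAME_COUNT = 473
SAMPLING_RATE_HZ = 48000
BIT_RATE_BPS = 448000


def audio_end_ms(delay_ms, duration_ms):
    """Point on the timeline where the delayed audio track ends."""
    return delay_ms + duration_ms


# 473 AC-3 frames of 1536 samples each.
samples = SAMPLES_PER_FRAME * FRAME_COUNT
# Audio duration derived from the sample count.
duration_from_samples_ms = samples * 1000 // SAMPLING_RATE_HZ
# Stream size derived from the constant bit rate (bits -> bytes).
stream_size_bytes = BIT_RATE_BPS * AUDIO_DURATION_MS // 1000 // 8

print("samples:", samples)
print("audio duration (ms):", duration_from_samples_ms)
print("stream size (bytes):", stream_size_bytes)
print("audio ends at (ms):", audio_end_ms(AUDIO_DELAY_MS, AUDIO_DURATION_MS))
print("container duration (ms):", CONTAINER_DURATION_MS)
```

The delayed audio ends exactly at the reported container duration, which fits the "Using container length (30.084) instead of audio length(15.136)" message.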

You then tell Hybrid that the delay is 0ms not 14948ms.
Hybrid then does not pass through the delay to the output but still multiplexes ~30 seconds of video with ~15 seconds of audio.
Telling Hybrid to 'keep intermediate' files (or looking at the debug output), one can see that Hybrid does not change the audio length in any way.
The created audio file has the same length as the original audio file.
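
To make the effect of the delay setting concrete, here is a small sketch of where the unchanged ~15 second audio track sits on the ~30 second video timeline for both delay values. The function names are hypothetical and only model the timeline; they say nothing about how Hybrid or the muxer works internally.

```python
VIDEO_DURATION_MS = 29996   # from MediaInfo (video)
AUDIO_DURATION_MS = 15136   # from MediaInfo (audio)


def audio_window_ms(delay_ms, audio_ms):
    """Start and end of the audio track on the output timeline."""
    return (delay_ms, delay_ms + audio_ms)


def video_without_audio_ms(delay_ms, audio_ms, video_ms):
    """Milliseconds of video during which no audio data is present."""
    start, end = audio_window_ms(delay_ms, audio_ms)
    covered = max(0, min(end, video_ms) - max(start, 0))
    return video_ms - covered


for delay in (14948, 0):
    start, end = audio_window_ms(delay, AUDIO_DURATION_MS)
    gap = video_without_audio_ms(delay, AUDIO_DURATION_MS, VIDEO_DURATION_MS)
    print(f"delay {delay} ms: audio from {start} to {end} ms, "
          f"{gap} ms of video without audio")
```

With the original delay the audio covers the second half of the video; with a delay of 0 the same audio covers the first half instead, and the rest of the video has no audio.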

Funny thing is, when I simply set the audio delay to 0, I get a file without any noticeable noise: https://www.mediafire.com/file/4iycx7sbc...y.mp4/file
At least that is what I initially thought, but then I concentrated, and my old ears (really not what they used to be, and they were never really good) noticed a slight hissing in the audio; it's not totally silent.
Opening the original with e.g. Audacity and zooming in, you can see the audio isn't totally silent; there is some noise.
I then told Audacity to normalize the audio to 0.0 dB, and I see:
[Image: grafik.png]
I also did the same with the 5.1 audio:
[Image: grafik.png]
=> This is the static noise you are hearing.

So looking at the calls you used for the audio conversion, there were three:
Code:
encoder call: ffmpeg -y -threads 8 -f sox -i - -c:a aac -strict -2 -b:a 512k -ar 48000 -channel_layout stereo "C:\Users\Computer\AppData\Local\Temp\iId_2_aid_1_lang_en_DELAY_-43ms_2025-08-16@20_02_17_7510_02.aac"
decode call: ffmpeg -y -threads 8 -loglevel fatal -nostdin -i "C:\Users\Computer\AppData\Local\Temp\iId_2_aid_1_lang_en_DELAY_-43ms_2025-08-16@20_02_17_7510_01.ac3" -ac 2 -ar 48000 -f sox -
filtering call: sox --multi-threaded --temp "C:\Users\Computer\AppData\Local\Temp\2025-08-16@20_02_17_751001" --buffer 524288 -S -t sox - -b 32 -t sox - gain -n 4.00
The magic is in the filtering call. Its 'gain -n 4.00' effect applies a +4 dB gain with normalization (no clipping), which raises the level of the whole track and causes the near-silent hissing to become noticeable.
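
As a rough illustration of why normalization makes faint hiss so audible, the sketch below computes how much gain it takes to bring a very quiet signal up to a target peak level. This is not sox's implementation, just the generic peak-normalization arithmetic; the sample values are invented, and the real hiss level in the source was not measured here.

```python
import math


def peak_dbfs(samples):
    """Peak level of float samples (full scale = 1.0) in dBFS."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        raise ValueError("digital silence has no finite peak level")
    return 20 * math.log10(peak)


def normalization_gain_db(samples, target_dbfs=0.0):
    """Gain in dB needed so that the peak lands on target_dbfs."""
    return target_dbfs - peak_dbfs(samples)


# Hypothetical hiss: tiny samples, peaking at 0.001 of full scale.
hiss = [0.001, -0.0008, 0.0005, -0.001]

print(f"peak level: {peak_dbfs(hiss):.1f} dBFS")
print(f"gain applied by normalizing to 0 dBFS: {normalization_gain_db(hiss):.1f} dB")
```

The quieter the loudest sample in a track, the larger the boost normalization applies, so a track that contains little more than hiss ends up with loud hiss.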

So what happened is:
a. you removed the delay and told Hybrid to mux 15 seconds of audio with 30 seconds of video (so the audio should start at the beginning and end after 15 seconds)
b. you told Hybrid to boost the audio volume
=> Seems like Hybrid did everything as requested, and the audio normalization just increased the volume way more than you expected.


Cu Selur
Is there any way to just cut that section of audio out of the beginning in Hybrid?
No.