Quote:Did something significant / fundamental changed about the codecs ?
In the last few years: No.
Quote:Pretty shocked 4K is still a thing for modern cpu's even for gpu (Nvenc) !?
Converting 4K (HDR, H.265) -> 2K (SDR, H.264) using just the chips on the graphics card:
Code:
"F:\Hybrid\64bit\NVEncC.exe" --avhw -i "G:\TestClips&Co\files\HDR\HDR10\4K sun HDR test.mp4" --fps 25.000 --codec h264 --profile high --level auto --sar 1:1 --lookahead 32 --vbr 0 --vbr-quality 18.00 --aq --aq-strength 5 --aq-temporal --gop-len 0 --ref 6 --multiref-l0 3 --multiref-l1 3 --bframes 3 --direct auto --bref-mode auto --no-b-adapt --mv-precision Q-pel --cabac --deblock --preset quality --colorrange limited --colormatrix bt709 --vpp-colorspace hdr2sdr=mobius,source_peak=1000,ldr_nits=100,transition=0.3,peak=1 --vpp-resize auto --output-res 1920x1080 --vpp-gauss disabled --cuda-schedule sync --output "J:\tmp\4K sun HDR test_1_2024-06-30@07_13_41_7010_01.264"
--------------------------------------------------------------------------------
J:\tmp\4K sun HDR test_1_2024-06-30@07_13_41_7010_01.264
--------------------------------------------------------------------------------
NVEncC (x64) 7.57 (r2924) by rigaya, Jun 29 2024 14:09:42 (VC 1929/Win)
OS Version Windows 11 x64 (22631) [UTF-8]
CPU AMD Ryzen 9 7950X 16-Core Processor [5.52GHz] (16C/32T)
GPU #0: NVIDIA GeForce RTX 4080 (9728 cores, 2505 MHz)[PCIe4x16][555.99]
NVENC / CUDA NVENC API 12.2, CUDA 12.5, schedule mode: sync
Input Buffers CUDA, 44 frames
Input Info avcuvid: H.265/HEVC, 3840x2160, 25/1 fps
Vpp Filters colorspace: cspconv(p010 -> yuv444(16bit))
matrix:bt2020nc->GBR
transfer:smpte2084->linear
hdr2sdr(mobius): source_peak=1000.00 ldr_nits=100.00
transition 0.30, peak 1.00
desat base 0.18, strength 0.75, exp 1.50
prim:bt2020->bt709
transfer:linear->bt709
matrix:GBR->bt709
cspconv(yuv444(16bit) -> yv12)
resize(bicubic): 3840x2160 -> 1920x1080
cspconv(yv12 -> nv12)
Output Info H.264/AVC high @ Level auto
1920x1080p 1:1 25.000fps (25/1fps)
Encoder Preset quality
Rate Control VBR
Multipass none
Bitrate 0 kbps (Max: 162000 kbps)
Target Quality 18.00
QP range I:0-51 P:0-51 B:0-51
QP Offset cb:0 cr:0
VBV buf size auto
Split Enc Mode auto
Lookahead on, 32 frames, Adaptive I Insert
GOP length 250 frames
B frames 3 frames [ref mode: middle]
Ref frames 6 frames, MultiRef L0:3 L1:3
AQ on (spatial, temporal, strength 5)
VUI matrix:bt709,range:limited
Others mv:Q-pel cabac deblock adapt-transform:auto bdirect:auto
encoded 125 frames, 154.32 fps, 13202.87 kbps, 7.87 MB
encode time 0:00:00, CPU: 0.5, GPU: 82.0, VE: 53.0, VD: 49.0, GPUClock: 2505MHz, VEClock: 2055MHz
frame type IDR 1
frame type I 1, total size 0.20 MB
frame type P 31, total size 3.33 MB
frame type B 93, total size 4.34 MB
2024-06-30@07_13_41_7010_01_video finished after 00:00:01.719
finished...
I get ~150fps here, which seems like a decent speed.
If it's too slow for your taste, you should:
- figure out whether it's the decoding, filtering or encoding that is slow.
- if it's the decoding, try different decoders.
- if it's the filtering, try adjusting the filter order (for example, move filters behind the resizer so they run at the lower resolution) and try different filters and settings.
- if it's the encoding, try tweaking your settings.
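One way to narrow this down is to time each stage on its own. Below is a minimal sketch, not something Hybrid does for you: `measure_fps` is a hypothetical helper that simply pulls frames from any iterable and reports how fast they arrive. Measure the source filter alone first, then add the filters one by one, and compare the numbers.

```python
import time
from typing import Iterable, Optional, Tuple


def measure_fps(frames: Iterable, limit: Optional[int] = None) -> Tuple[int, float]:
    """Consume frames from any iterable and return (frame_count, frames_per_second).

    'frames' is an assumption of this sketch: anything that yields one item
    per decoded/filtered frame will do.
    """
    start = time.perf_counter()
    count = 0
    for _ in frames:
        count += 1
        # stop early if only a sample of the clip should be measured
        if limit is not None and count >= limit:
            break
    elapsed = time.perf_counter() - start
    fps = count / elapsed if elapsed > 0 else float("inf")
    return count, fps


if __name__ == "__main__":
    # Stand-in for a real clip: 125 dummy "frames".
    n, fps = measure_fps(range(125))
    print(f"{n} frames, {fps:.1f} fps")
```

With VapourSynth you could pass the clip's frame iterator (for example `clip.frames()`, assuming your VapourSynth version provides it) once after the source filter and once after each additional filter. The stage where the fps drops the most is the one worth tweaking.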
Using Vapoursynth:
Code:
# Imports
import vapoursynth as vs
# getting Vapoursynth core
import sys
import os
core = vs.core
# Import scripts folder
scriptPath = 'F:/Hybrid/64bit/vsscripts'
sys.path.insert(0, os.path.abspath(scriptPath))
# loading plugins
core.std.LoadPlugin(path="F:/Hybrid/64bit/vsfilters/Support/fmtconv.dll")
core.std.LoadPlugin(path="F:/Hybrid/64bit/vsfilters/ColorFilter/DGHDRtoSDR/DGHDRtoSDR.dll")
core.std.LoadPlugin(path="F:/Hybrid/64bit/vsfilters/SourceFilter/DGDecNV/DGDecodeNV.dll")
# Import scripts
import validate
# Source: 'G:\TestClips&Co\files\HDR\HDR10\4K sun HDR test.mp4'
# Current color space: YUV420P10, bit depth: 10, resolution: 3840x2160, frame rate: 25fps, scanorder: progressive, yuv luminance scale: limited, matrix: 2020ncl, transfer: smpte2084, primaries: bt.2020
# Loading G:\TestClips&Co\files\HDR\HDR10\4K sun HDR test.mp4 using DGSource
clip = core.dgdecodenv.DGSource("J:/tmp/mp4_103cd4c1d7cbc771969218d2162207ff_853323747.dgi")# 25 fps, scanorder: progressive
frame = clip.get_frame(0)
# Setting detected color matrix (2020ncl).
clip = core.std.SetFrameProps(clip=clip, _Matrix=9)
# setting color transfer (2084), if it is not set.
if validate.transferIsInvalid(clip):
    clip = core.std.SetFrameProps(clip=clip, _Transfer=16)
# setting color primaries info (to 2020), if it is not set.
if validate.primariesIsInvalid(clip):
    clip = core.std.SetFrameProps(clip=clip, _Primaries=9)
# setting color range to TV (limited) range.
clip = core.std.SetFrameProps(clip=clip, _ColorRange=1)
# making sure frame rate is set to 25fps
clip = core.std.AssumeFPS(clip=clip, fpsnum=25, fpsden=1)
# making sure the detected scan type is set (detected: progressive)
clip = core.std.SetFrameProps(clip=clip, _FieldBased=0) # progressive
# adjusting color using HDR to SDR (DG)
clip = core.dghdrtosdr.DGHDRtoSDR(clip=clip, impl="255", mode="pq", fulldepth=True)
# Resizing using 10 - bicubic spline
clip = core.fmtc.resample(clip=clip, kernel="spline16", w=1920, h=1080, interlaced=False, interlacedd=False) # resolution 1920x1080 before YUV420P16 after YUV420P16
# adjusting output color from: YUV420P16 to YUV420P8 for x264Model
clip = core.resize.Bicubic(clip=clip, format=vs.YUV420P8, range_s="limited", dither_type="error_diffusion")
# set output frame rate to 25fps (progressive)
clip = core.std.AssumeFPS(clip=clip, fpsnum=25, fpsden=1)
# output
clip.set_output()
and x264:
Code:
"F:\Hybrid\64bit\x264.exe" --preset veryfast --crf 18.00 --profile high --level 5.1 --ref 3 --direct auto --b-adapt 0 --sync-lookahead 48 --qcomp 0.50 --rc-lookahead 40 --qpmax 51 --partitions i4x4,p8x8,b8x8 --no-fast-pskip --subme 5 --aq-mode 0 --vbv-maxrate 300000 --vbv-bufsize 300000 --sar 1:1 --non-deterministic --range tv --colormatrix bt709 --demuxer y4m --input-range tv --fps 25/1 --output-depth 8 --output "J:\tmp\2024-06-30@07_15_12_2510_03.264" -
x264 [info]: using SAR=1/1
x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2 AVX512
x264 [info]: profile High, level 5.1, 4:2:0, 8-bit
x264 [info]: frame I:1 Avg QP:21.97 size: 57170
x264 [info]: frame P:31 Avg QP:21.27 size: 25223
x264 [info]: frame B:93 Avg QP:24.33 size: 8126
x264 [info]: consecutive B-frames: 0.8% 0.0% 0.0% 99.2%
x264 [info]: mb I I16..4: 32.8% 52.5% 14.7%
x264 [info]: mb P I16..4: 21.5% 0.0% 5.7% P16..4: 28.1% 4.1% 1.5% 0.0% 0.0% skip:39.2%
x264 [info]: mb B I16..4: 0.8% 0.0% 1.3% B16..8: 7.5% 1.9% 0.5% direct: 1.2% skip:86.8% L0:34.3% L1:43.1% BI:22.6%
x264 [info]: 8x8 transform intra:4.6% inter:30.7%
x264 [info]: direct mvs spatial:80.6% temporal:19.4%
x264 [info]: coded y,uvDC,uvAC intra: 28.6% 27.9% 11.4% inter: 3.5% 2.2% 0.0%
x264 [info]: i16 v,h,dc,p: 28% 38% 12% 22%
x264 [info]: i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 7% 36% 45% 2% 2% 1% 4% 1% 3%
x264 [info]: i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 9% 27% 17% 5% 10% 6% 13% 4% 9%
x264 [info]: i8c dc,h,v,p: 63% 26% 7% 3%
x264 [info]: Weighted P-Frames: Y:6.5% UV:0.0%
x264 [info]: ref P L0: 58.9% 23.4% 17.8%
x264 [info]: ref B L0: 88.3% 10.1% 1.5%
x264 [info]: ref B L1: 95.6% 4.4%
x264 [info]: kb/s:2551.74
encoded 125 frames, 100.89 fps, 2551.74 kb/s
2024-06-30@07_15_12_2510_03_video finished after 00:00:01.935
finished...
I get 100fps.
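To put the two numbers into perspective, here is a small sketch that converts the fps values from the logs above into a real-time factor for the 25fps source. It also checks the NVEncC size line; that check assumes kbps means 1000 bit/s and MB means MiB, which is my reading of the log and not something the log states.

```python
def realtime_factor(encode_fps: float, source_fps: float) -> float:
    """How many times faster than playback speed the encode runs."""
    return encode_fps / source_fps


def stream_size_mib(kbps: float, frames: int, source_fps: float) -> float:
    """Expected stream size in MiB, assuming 1 kbps = 1000 bit/s."""
    duration_s = frames / source_fps
    size_bytes = kbps * 1000 / 8 * duration_s
    return size_bytes / (1024 * 1024)


if __name__ == "__main__":
    # values taken from the NVEncC and x264 logs above
    print(f"NVEncC: {realtime_factor(154.32, 25):.1f}x real time")  # 6.2x
    print(f"x264:   {realtime_factor(100.89, 25):.1f}x real time")  # 4.0x
    # 125 frames at 25fps = 5 seconds of video
    print(f"NVEncC size: {stream_size_mib(13202.87, 125, 25):.2f} MiB")  # 7.87
```

So both runs are several times faster than real time, and the 7.87 MB in the NVEncC log is consistent with the reported bitrate under those unit assumptions.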
Quote:Furthermore, cpu utilizes like 30 - 50% , Gpu 12 - 25% .. so very low resources are required for some reason , if you ask me !?
No worries, I won't ask you, since you clearly didn't do any testing to figure out where your 'problems' are.
Using LibavSMASHSource(cpu) instead of DGDecNV, I get ~63fps.
Using LibavSMASHSource(gpu) instead of DGDecNV, I get ~63fps.
Using FFMS2 instead of DGDecNV, I get ~63fps.
I suspect the main slowdown is due to the HDR->SDR conversion, so here are some more tests (all with x264 encoding):
Using HDR to SDR (DG) with DGDecNV, I get ~100fps.
Using HDRToSDR with DGDecNV, I get ~11fps.
Using ToneMap with DGDecNV, I get ~48fps.
Using ToneMap (Placebo) with DGDecNV, I get ~50fps.
Using TimeCube with DGDecNV, I get ~80fps.
Using HDR to SDR (DG) with DGDecNV and using DGDec for Resizing, I get ~206fps. (this way resizing is done during the decoding)
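The fps numbers above are easier to compare as time per frame. A minimal sketch that converts them and sorts from cheapest to most expensive; the values are the ones I measured above, and each one is the cost of the whole chain (decode + HDR->SDR + resize + x264), not of the tone-mapping filter alone:

```python
# fps measured above, all with DGDecNV as source filter and x264 as encoder
measured_fps = {
    "HDR to SDR (DG)": 100,
    "HDRToSDR": 11,
    "ToneMap": 48,
    "ToneMap (Placebo)": 50,
    "TimeCube": 80,
    "HDR to SDR (DG) + DGDec resize": 206,
}


def ms_per_frame(fps: float) -> float:
    """Average wall-clock time spent per frame, in milliseconds."""
    return 1000.0 / fps


def rank_by_cost(results: dict) -> list:
    """Return (name, ms per frame) pairs, cheapest chain first."""
    pairs = [(name, ms_per_frame(fps)) for name, fps in results.items()]
    return sorted(pairs, key=lambda pair: pair[1])


if __name__ == "__main__":
    for name, ms in rank_by_cost(measured_fps):
        print(f"{ms:6.1f} ms/frame  {name}")
```

Seen this way, HDRToSDR costs roughly 91 ms per frame against 10 ms for HDR to SDR (DG), and letting DGDec do the resize roughly halves that again.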
Using additional filtering can slow things down again,... but with some testing you should be able to figure out where the bottlenecks are and what the alternatives are.
In the end it comes down to which filters & co. you want and what speed penalty you are willing to pay to use them.
=> Hybrid offers a variety of screws to turn and tweak stuff. You might benefit from testing different decoder, filter and encoder settings.
Cu Selur
PS: I adjusted the title, since yours was crap.