[HELP] perf. RTX 3060 ti slower than GTX 1070 - NVEnc
#17
Looking at the debug output:
The decoding call is:
"C:\Program Files\Hybrid\64bit\ffmpeg.exe" -y -loglevel fatal -noautorotate -nostdin -threads 9 -i "J:\Download 10 To\Film 4K UHD\The.Protege.2021.2160p.UHD-001.mkv" -map 0:0 -an -sn -vf zscale=rangein=tv:range=tv -pix_fmt yuv420p10le -strict -1 -vsync 0 -f yuv4mpegpipe -
the encoding call is:
"C:\Program Files\Hybrid\64bit\NVEncC.exe" --y4m -i - --fps 23.976 --codec h265 --profile main10 --level 5.1 --tier high --sar 1:1 --lookahead 32 --output-depth 10 --vbrhq 19769 --max-bitrate 10000 --gop-len 0 --ref 3 --bframes 0 --no-b-adapt --mv-precision Q-pel --preset quality --colorrange limited --colorprim bt2020 --transfer smpte2084 --colormatrix bt2020c --max-cll 1000,923 --master-display G(13250,34500)B(7500,3000)R(34000,16000)WP(15635,16450)L(10000000,1) --cuda-schedule sync --psnr --ssim --output "J:\Download 10 To\Film 4K UHD\The.Prote test 1_2022-01-06@19_41_20_0610_02.265"
NVEncC reports:
OS Version     Windows 8 x64 (9200) [UTF-8]
CPU            Intel Xeon(R) E5-2697 v3 @ 2.60GHz [TB: 3.40GHz] (14C/28T)
GPU            #0: NVIDIA GeForce RTX 3060 Ti (4864 cores, 1695 MHz)[PCIe3x16][497.29]
NVENC / CUDA   NVENC API 11.1, CUDA 11.5, schedule mode: sync
Input Buffers  CUDA, 41 frames
Input Info     y4m(yv12(10bit))->p010 [AVX2], 3840x2160, 24000/1001 fps
Vpp Filters    copyHtoD
               ssim psnr (yv12(10bit))
Output Info    H.265/HEVC main10 @ Level 5.1
               3840x2160p 1:1 23.976fps (24000/1001fps)
Encoder Preset quality
Rate Control   VBR
Multipass      2pass-full
Bitrate        19769 kbps (Max: 10000 kbps)
Target Quality auto
Initial QP     I:20  P:23  B:25
QP Offset      cb:0  cr:0
VBV buf size   auto
Lookahead      on, 32 frames, Adaptive I Insert
GOP length     240 frames
B frames       0 frames [ref mode: disabled]
Ref frames     3 frames, MultiRef L0:auto L1:auto
AQ             off
CU max / min   auto / auto
VUI            matrix:bt2020c,colorprim:bt2020,transfer:smpte2084,range:limited
MasteringDisp  G(0.265000 0.690000) B(0.150000 0.060000) R(0.680000 0.320000)
               WP(0.312700 0.329000) L(1000.000000 0.000100)
MaxCLL/MaxFALL 1000/923
Others         mv:Q-pel repeat-headers
and
NVEnc output: 10541 frames: 21.38 fps, 9682 kb/s, GPU 7%, VE 100%, VD 5%
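If you want to compare runs (for example before and after changing the decoding setting), the summary line can be pulled apart with a few lines of Python. This is only a sketch: the pattern is derived from the single line quoted above, so other NVEncC versions may format it differently, and `parse_status` is a made-up helper name, not part of NVEncC or Hybrid.

```python
import re

# Pattern for the NVEncC summary line quoted above.
# Assumption: the format is taken from that one sample only.
STATUS_RE = re.compile(
    r"(?P<frames>\d+) frames: (?P<fps>[\d.]+) fps, (?P<kbps>\d+) kb/s, "
    r"GPU (?P<gpu>\d+)%, VE (?P<ve>\d+)%, VD (?P<vd>\d+)%"
)


def parse_status(line):
    """Return the numeric fields of a summary line, or None if it does not match."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "frames": int(fields["frames"]),
        "fps": float(fields["fps"]),
        "kbps": int(fields["kbps"]),
        "gpu": int(fields["gpu"]),  # GPU load in percent
        "ve": int(fields["ve"]),    # VE load in percent
        "vd": int(fields["vd"]),    # VD load in percent
    }


line = "NVEnc output: 10541 frames: 21.38 fps, 9682 kb/s, GPU 7%, VE 100%, VD 5%"
stats = parse_status(line)
print(stats)
```

For the run above this gives VD at 5%, which fits the point that the hardware video decoder is barely used in this job.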

Options to speed up:
a. Enable "NVEnc->Hardware->Only use encoder" and recreate your job; changing the setting without creating a new job does nothing.
(This way the video decoder chip is used and the decoded content is sent directly to the encoder.)
or
b. Enable "Config->Input->Decoding->Use gpu for decoding"
(This way the video decoder chip is used through ffmpeg, and the decoded content is then processed and sent to the encoder.)

At the moment, the decoding is done with the ffmpeg software decoder (so on your CPU).
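To make the structure of the two calls more visible: ffmpeg decodes to a y4m stream on stdout and NVEncC reads that stream from stdin. Below is a rough Python sketch of such a pipe with a shortened argument list. The helper names and the `input.mkv`/`output.265` file names are made up for illustration, and `-hwaccel auto` is only my assumption of how hardware decoding through ffmpeg could be requested; it is not necessarily what Hybrid adds when option b is enabled.

```python
import subprocess


def build_decode_cmd(source, hw_decode=False, ffmpeg="ffmpeg"):
    """Build a shortened version of the ffmpeg decoding call shown above."""
    cmd = [ffmpeg, "-y", "-loglevel", "fatal", "-nostdin"]
    if hw_decode:
        # Assumption: "-hwaccel auto" stands in for whatever Hybrid really adds.
        # It is an input option, so it has to come before "-i".
        cmd += ["-hwaccel", "auto"]
    cmd += ["-i", source, "-map", "0:0", "-an", "-sn",
            "-pix_fmt", "yuv420p10le", "-strict", "-1",
            "-f", "yuv4mpegpipe", "-"]
    return cmd


def build_encode_cmd(target, nvencc="NVEncC.exe"):
    """Build a shortened version of the NVEncC call; it reads y4m from stdin."""
    return [nvencc, "--y4m", "-i", "-", "--codec", "h265",
            "--output-depth", "10", "--output", target]


def run_pipe(decode_cmd, encode_cmd):
    """Connect decoder stdout to encoder stdin and wait for both to finish."""
    dec = subprocess.Popen(decode_cmd, stdout=subprocess.PIPE)
    enc = subprocess.Popen(encode_cmd, stdin=dec.stdout)
    dec.stdout.close()  # lets the decoder notice if the encoder exits early
    enc.wait()
    dec.wait()
    return enc.returncode


software = build_decode_cmd("input.mkv")
hardware = build_decode_cmd("input.mkv", hw_decode=True)
print(" ".join(hardware))
```

Nothing is executed here apart from building and printing the argument list; `run_pipe(hardware, build_encode_cmd("output.265"))` would start the actual pipe if both tools are on the path.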

Cu Selur
----
Dev versions are in the 'experimental'-folder of my GoogleDrive, which is linked on the download page.
Messages In This Thread
RE: perf. RTX 3060 ti slower than GTX 1070 - NVEnc - by Selur - 06.01.2022, 21:46
