Hybrid runs at only 50% performance.
#1
Hi,

Currently I am trying to figure out GPU acceleration and 60 fps conversion via InterFrame (Avisynth).
I managed to make it work wonderfully, but I noticed a weird thing.

If I use the GPU alone via NVEnc x264 encoding, my GPU runs at 100% and I get 2000+ fps.
When I use InterFrame alone, it uses 100% of the CPU.

Now I combine those two: NVEnc plus the InterFrame filter, with the GPU box checked, aaaaand...
I get around 250+ fps, which is good, but my CPU load is at 50% and the GPU at 12%.

From what I read, InterFrame is only partially done on the GPU and relies on the CPU a lot. The thing is that it doesn't use 100% of the CPU when I use it with NVEnc, and the GPU seems to wait constantly for InterFrame to finish its work (that is why it sits at 12%).

OK, so my assumption was that Hybrid needs to do something, like shuffling memory or running some scripts, that keeps it from reaching 100%, and that this is normal, meaning that despite how it looks it is effectively running at full speed.

BUT in the job queue I can set up 2 parallel tasks and it... WORKS. I can convert two video files at the same time at the same speed, and then my CPU goes to 100% utilization and the GPU to around 30%.

So my question is: how can I increase the speed of a single conversion with NVEnc and the InterFrame filter from Avisynth? Obviously it has something to do with settings, but I can't figure out which ones...
Reply
#2
Quote:From what i read InterFrame is only partially done on GPU and it relies on CPU a lot.
If you don't enable the gpu option, InterFrame is done on the CPU alone.

Quote:If i use alone GPU via NVenc x264 encoding my GPU works on 100% 2000+ frames
I understand what you mean, but you encode to H.264 (MPEG-4 AVC), not x264. x264 is another encoder which also encodes to H.264. ;)

Quote:Thing is that it doesn't use 100% cpu when i use it with NVEnc and GPU seems to wait constantly for InterFrame to finish its work (that is why 12%)
Sounds about right.

Quote:So my question is how i can increase speed of single conversion with NVEnc and InterFrame filter from AviSynth ? Obviously it has something to do with settings but i can't figure out which ones ...
Warning: There might not be a solution to speed things up. :)
What comes to mind:
1. Using faster settings in InterFrame.
2. Assuming you do no other filtering in Avisynth, it might not be InterFrame which is the bottleneck. It might be the decoders used in Avisynth. (How is the speed when you enable 'Config->Internals->Always use Avisynth' and don't use InterFrame? This way Avisynth will be used even when it's not needed.)
Assuming the decoder is the bottleneck, using another source filter might help.

Since I normally don't use NVEncC, I tried the following with my Ryzen 7 1800X and my GeForce GTX 980 Ti.
I encoded one source (Full HD, 1920x1080, 23.976 fps, progressive H.264, 1080 frames, inside an mkv container).
Note that speed can fluctuate quite a bit, so small differences don't say much, and I only did one encode with one source for each filter, so no averaging or similar was done.
  • Decoder: ffmpeg + Avisynth (FFVideoSource as source filter) and I got: ~187fps.
  • Decoder: avs2yuv + Avisynth (FFVideoSource as source filter) and I got: ~207fps.
  • Decoder: ffmpeg + Avisynth (LWLibavVideoSource used as source filter) and I got: ~185fps.
  • Decoder: avs2yuv + Avisynth (LWLibavVideoSource as source filter) and I got: ~205fps.
  • Decoder: ffmpeg + Avisynth (FFMS2000 - FFVideoSource as source filter) and I got: ~186fps.
  • Decoder: avs2yuv + Avisynth (FFMS2000 - FFVideoSource as source filter) and I got: ~200fps.
  • Decoder: ffmpeg + Avisynth (DGDecNV - DGSource as source filter) and I got: ~195fps.
  • Decoder: avs2yuv + Avisynth (DGDecNV - DGSource as source filter) and I got: ~209fps.
  • Decoder: ffmpeg + Avisynth (FFVideoSource as source filter and InterFrame 60fps gpu enabled) and I got: ~65fps.
  • Decoder: avs2yuv + Avisynth (FFVideoSource as source filter and InterFrame 60fps gpu enabled) and I got: ~51fps.
  • Decoder: ffmpeg + Avisynth (LWLibavVideoSource used as source filter and InterFrame 60fps gpu enabled) and I got: ~63fps.
  • Decoder: avs2yuv + Avisynth (LWLibavVideoSource as source filter and InterFrame 60fps gpu enabled) and I got: ~49fps.
  • Decoder: ffmpeg + Avisynth (FFMS2000 - FFVideoSource as source filter and InterFrame 60fps gpu enabled) and I got: ~63fps.
  • Decoder: avs2yuv + Avisynth (FFMS2000 - FFVideoSource as source filter and InterFrame 60fps gpu enabled) and I got: ~52fps.
  • Decoder: ffmpeg + Avisynth (DGDecNV - DGSource as source filter and InterFrame 60fps gpu enabled) and I got: failed, InterFrame and DGDecNV don't play well with each other when using Avisynth
  • Decoder: avs2yuv + Avisynth (DGDecNV - DGSource as source filter and InterFrame 60fps gpu enabled) and I got: failed, InterFrame and DGDecNV don't play well with each other when using Avisynth
  • Decoder: vspipe + VapourSynth (DGDecNV - DGSource as source filter and InterFrame 60fps gpu enabled, intelligent overwrite) and I got: ~64fps
Since my CPU is rather fast, I also tried:
  • Decoder: ffmpeg + Avisynth (FFVideoSource as source filter and InterFrame 60fps gpu disabled) and I got: ~94fps.
  • Decoder: avs2yuv + Avisynth (FFVideoSource as source filter and InterFrame 60fps gpu disabled) and I got: ~68fps.
  • Decoder: ffmpeg + Avisynth (LWLibavVideoSource used as source filter and InterFrame 60fps gpu disabled) and I got: ~92fps.
  • Decoder: avs2yuv + Avisynth (LWLibavVideoSource as source filter and InterFrame 60fps gpu disabled) and I got: ~70fps.
  • Decoder: ffmpeg + Avisynth (FFMS2000 - FFVideoSource as source filter and InterFrame 60fps gpu disabled) and I got: ~96fps.
  • Decoder: avs2yuv + Avisynth (FFMS2000 - FFVideoSource as source filter and InterFrame 60fps gpu disabled) and I got: ~72fps.
  • Decoder: ffmpeg + Avisynth (DGDecNV - DGSource as source filter and InterFrame 60fps gpu disabled) and I got: ~98fps.
  • Decoder: avs2yuv + Avisynth (DGDecNV - DGSource as source filter and InterFrame 60fps gpu disabled) and I got: ~101fps.
  • Decoder: vspipe + VapourSynth (DGDecNV - DGSource as source filter and InterFrame 60fps gpu disabled, intelligent overwrite) and I got: ~108fps

So looking at all this, depending on your CPU, not using the GPU with InterFrame might help the most when it is used in combination with NVEncC.
Switching the source filter doesn't help much in general, so your best bet for a speed-up is probably using faster/other InterFrame settings.

Remember that this was only one source and only one encode per setting combination, so it might not tell much, especially when another CPU or GPU is used.
-> You should probably do some testing on your own unless someone has the same setup as you; benchmarks & co. can be quite misleading when combining GPU and CPU work with different tools and threading strategies.
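
To make the list above easier to compare, here is a small Python sketch that averages the posted numbers per decoder and InterFrame mode. The figures are copied from the single runs above (the failed DGDecNV runs and the vspipe runs are left out), so the averages inherit all the caveats already mentioned. The names `RESULTS` and `summarize` are made up for this illustration.

```python
from statistics import mean

# fps values copied from the single-run list above (Avisynth runs only;
# failed DGDecNV + InterFrame gpu runs and the vspipe runs are omitted).
RESULTS = {
    ("no InterFrame", "ffmpeg"): [187, 185, 186, 195],
    ("no InterFrame", "avs2yuv"): [207, 205, 200, 209],
    ("InterFrame gpu enabled", "ffmpeg"): [65, 63, 63],
    ("InterFrame gpu enabled", "avs2yuv"): [51, 49, 52],
    ("InterFrame gpu disabled", "ffmpeg"): [94, 92, 96, 98],
    ("InterFrame gpu disabled", "avs2yuv"): [68, 70, 72, 101],
}


def summarize(results):
    """Return the mean fps per (mode, decoder) pair, rounded to one decimal."""
    return {key: round(mean(values), 1) for key, values in results.items()}


SUMMARY = summarize(RESULTS)

for (mode, decoder), fps in SUMMARY.items():
    print(f"{mode:24s} {decoder:8s} ~{fps} fps")
```

With only one run per combination this is a convenience for reading the list, not a benchmark result in its own right.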

Cu Selur

Ps.: Sorry, this got a bit long, but I wanted to show that there is no easy answer to this. :)
----
Dev versions are in the 'experimental'-folder of my GoogleDrive, which is linked on the download page.
Reply
#3
(24.03.2018, 08:40)Selur Wrote: Warning: There might not be a solution to speed things up. Smile
What comes to mind:
1. using faster settings in InterFrame
2. Assuming you do no other filtering in Avisynth, it might not be InterFrame which is the bottleneck. It might be the decoders used in Avisynth (how is the speed when you enable 'Config->Internals->Always use Avisynth') and don't use Interframe. (This way Avisynth will be used, even when it's not needed.)
Assuming the decoder is the bottleneck using another source filter might help.

Thank you for the reply. I think you missed an important part of what I said.

When I do it the way I described, 50% of the CPU and about 12% of the GPU is used (single task).

With no GPU option, 100% of the CPU is used (no InterFrame).
With only the GPU, 100% of the GPU is used (no InterFrame).

Like I said, the weird part is that I can just run a SECOND task AT THE SAME TIME, which should not be possible if the whole thing were working as fast as it could. I can run a second task at the same time and the first one will not slow down at all. Which means Avisynth or InterFrame has some setting that limits how much it can do within one task.

So effectively I can do 2 video conversions with InterFrame, but when I run only one task it does it at 50% of the speed, as if limited by something.

EDIT:

I made video showing this issue:

https://www.youtube.com/watch?v=qxrenokE...e=youtu.be

EDIT2:

i5-3570K
GTX980
8GB ram

Some testing (converting a 720p video):
NVEnc (H.264):
InterFrame with GPU box checked: ~250 fps, 50% CPU, 12% GPU
InterFrame without GPU box checked: ~90 fps, 50% CPU
x264:
InterFrame with GPU box checked: ~250 fps, 50% CPU, 12% GPU
InterFrame without GPU box checked: ~90 fps, 50% CPU

So it clearly shows that InterFrame or Avisynth is limiting CPU usage somewhere.
If that weren't the case, I couldn't run a second task and have it work as fast as the first.

EDIT3:

Maybe the issue is here:
SetMemoryMax(768)
SetMTMode(5,4) # changing MT mode
LoadCPlugin("C:\PROGRA~1\Hybrid\32bit\AVISYN~1\ffms2.dll")
LoadPlugin("C:\PROGRA~1\Hybrid\32bit\AVISYN~1\svpflow1.dll")
LoadPlugin("C:\PROGRA~1\Hybrid\32bit\AVISYN~1\svpflow2.dll")
Import("C:\Program Files\Hybrid\32bit\avisynthPlugins\InterFrame2.avsi")
# loading source: C:\Users\Perkel\Desktop\movie.mp4
# input luminance scale tv
FFVideoSource("C:\Users\Perkel\Desktop\movie.mp4")
# current resolution: 720x480
SetMTMode(2) # changing MT mode
InterFrame(Preset="Faster",NewNum=60,NewDen=1,OverrideAlgo=23,Cores=8)
# filtering
return last


SetMemoryMax(768)
SetMTMode(5,4) # changing MT mode

It looks like it is using only 768 MB of memory, which maybe results in a full buffer and 50% performance? How can I increase it in Hybrid?

Also, the second SetMTMode (the one before InterFrame) has (2) instead of (5,4) like above. Maybe it limits the conversion to 2 cores?
Reply
#4
Quote:Like i said the weird part here is that i can just run SECOND task AT THE SAME TIME which should not be possible if whole thing would work as fast as it could.
That would only be true if there weren't a bottleneck.
Quote: I can run second task at the same time and first one will not slow down at all.
Since none of the tasks uses enough CPU/GPU power to limit the other.
Quote:So effectively can do 2 video conversion with interframe but when i only one task it does it at 50% of speed as if limited by something.
Do you understand what I mean when I use the word 'bottleneck'?
Quote:Which means Avisynth or InterFrame has some setting in it that limits amount of stuff it can do with one task.
Or the InterFrame/SVP process itself simply is the bottleneck, the way it is implemented at the moment.

-> Sadly, I don't see how anyone aside from the SVP developers can help you, since from what I can see the problem is InterFrame/SVP.
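
The bottleneck argument can be illustrated with a toy model in Python. This is only a sketch under loud assumptions: it supposes that one job can keep at most 2 of the 4 cores of the i5-3570K busy (a guess read off the reported ~50% CPU load, not something measured inside InterFrame/SVP) and that speed scales linearly with busy cores. `simulate` and the constants are hypothetical names for this illustration.

```python
TOTAL_CORES = 4            # i5-3570K: 4 cores, 4 threads
CORES_ONE_JOB_CAN_USE = 2  # ASSUMPTION: inferred from ~50% CPU load with one job
FPS_ONE_JOB = 250          # single-job speed reported in the thread


def simulate(jobs):
    """Estimate CPU load and speed when `jobs` conversions run in parallel."""
    # Each job is capped by its own internal limit; all jobs share the CPU.
    busy_cores = min(TOTAL_CORES, jobs * CORES_ONE_JOB_CAN_USE)
    cores_per_job = busy_cores / jobs
    per_job_fps = FPS_ONE_JOB * cores_per_job / CORES_ONE_JOB_CAN_USE
    return {
        "cpu_util": busy_cores / TOTAL_CORES,
        "per_job_fps": per_job_fps,
        "total_fps": per_job_fps * jobs,
    }


for n in (1, 2, 4):
    print(n, simulate(n))
```

Under these assumptions a second job runs at full speed because the first one never used the other half of the CPU, while further jobs only split the same total. The model says nothing about where the per-job limit comes from.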

Cu Selur
Reply
#5
(24.03.2018, 11:40)Selur Wrote: -> Sadly, I don't see how anyone aside from the SVP developers can help you since from what it looks to me the problem is Interframe/SVP.

Cu Selur

Hmm, I will try to ask there. Also, look at what I have in my last edit.

SetMTMode(2) # changing MT mode


For some reason the call before InterFrame has it at (2) instead of (5,4) like at the start of the script. Maybe this is a factor?

edit: I read the doc on SetMTMode and it looks like it is an SVP issue, as it has nothing to do with that. I'll try to write to those devs.
Reply
#6
You can disable MT, which should be faster than using mode 5 or higher.
+ I agree MT is probably not the issue, partly because this also happens with VapourSynth, where no MT is used. ;)
Reply
#7
(24.03.2018, 12:14)Selur Wrote: You can disable MT which should be faster than using mode 5 or higher.
+ I agree MT is probably not the issue. Partially because this also happens with Vapoursynth where no MT is used. Wink

Also, I think it is not actually an SVP issue but an InterFrame/Avisynth issue.
I mean, I can just open a player, load a 4K movie, and SVP will convert it to 60 fps on the fly, using around 100% of my CPU (which is understandable for 4K resolution).

So clearly the problem is something between SVP <----------> InterFrame/Avisynth.

IMO SVP is not the issue, so it should be either InterFrame or Avisynth.
Reply
#8
Just noticed there is a newer version of the InterFrame script available than the one currently used in Hybrid... maybe that changes things...
-> nope (if you want, you can replace the InterFrame2.avsi with the one from http://www.spirton.com/interframe/)
Reply
#9
(24.03.2018, 12:49)Selur Wrote: Just noticed, there is a newer version of the Interframe script available then the on currently used in Hybrid,... maybe that changes stuff,...
-> nope (if you want you can replace the InterFrame2.avsi with the one from http://www.spirton.com/interframe/)

Going back to Avisynth and Hybrid.
I noticed something.

In the InterFrame documentation, these are the example values:


Quote:
Cores=4
SetMemoryMax(512)
SetMTMode(3, Cores)
LoadPlugin("svpflow1.dll")
LoadPlugin("svpflow2.dll")
Import("InterFrame2.avsi")
dss2("video.mkv", fps=23.976).ConvertToYV12()
SetMTMode(2)
InterFrame(Cores=Cores)


There are 2 SetMTMode calls.

Here is what Hybrid does:


Quote:
SetMemoryMax(1536)
SetMTMode(5,4) # changing MT mode
LoadCPlugin("C:\PROGRA~1\Hybrid\32bit\AVISYN~1\ffms2.dll")
LoadPlugin("C:\PROGRA~1\Hybrid\32bit\AVISYN~1\svpflow1.dll")
LoadPlugin("C:\PROGRA~1\Hybrid\32bit\AVISYN~1\svpflow2.dll")
Import("C:\Program Files\Hybrid\32bit\avisynthPlugins\InterFrame2.avsi")
# loading source: C:\Users\Perkel\Desktop\videoplayback.mp4
# input luminance scale tv
FFVideoSource("C:\Users\Perkel\Desktop\VIDEOP~1.MP4")
# current resolution: 3840x2160
SetMTMode(2) # changing MT mode
InterFrame(GPU=true,Preset="Faster",NewNum=60,NewDen=1,OverrideAlgo=23,Cores=8)
# filtering
return last


In the SetMTMode documentation:
Quote:There are 6 modes:
  • Mode 1 is the fastest but only works with a few filters
  • Mode 2 should work with most filters but uses more memory
  • Mode 3 should work with some of the filters that don't work with mode 2 but is slower
  • Mode 4 is a combination of mode 2 and 3 and should work with even more filters but is both slower and uses more memory
  • Mode 5 is slowest (slower than not using SetMTMode) but should work with all filters that don't require linear frameserving (that is, the frames come in order (frame 0,1,2 ... last)).
  • Mode 6 is a modified mode 5 that might be slightly faster


Maybe this is the issue? Mode 5 is the slowest according to the doc. Though I am shooting blind here.
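
To compare the two scripts without eyeballing them, a short Python sketch can pull out the memory and threading settings and flag where they differ. The two script texts are copied from the quotes above; `extract` and its regular expressions are hypothetical helpers written for this post, not part of Hybrid, Avisynth, or InterFrame. It only makes the differences visible and does not show that any of them causes the 50% load.

```python
import re

# Script generated by Hybrid, copied from the quote above.
HYBRID_SCRIPT = r"""
SetMemoryMax(1536)
SetMTMode(5,4) # changing MT mode
FFVideoSource("C:\Users\Perkel\Desktop\VIDEOP~1.MP4")
SetMTMode(2) # changing MT mode
InterFrame(GPU=true,Preset="Faster",NewNum=60,NewDen=1,OverrideAlgo=23,Cores=8)
return last
"""

# Example from the InterFrame documentation, copied from the quote above.
DOC_SCRIPT = r"""
Cores=4
SetMemoryMax(512)
SetMTMode(3, Cores)
dss2("video.mkv", fps=23.976).ConvertToYV12()
SetMTMode(2)
InterFrame(Cores=Cores)
"""


def extract(script):
    """Collect SetMemoryMax, all SetMTMode arguments, and InterFrame's Cores."""
    mem = re.search(r"SetMemoryMax\((\d+)\)", script)
    mt_calls = re.findall(r"SetMTMode\(([^)]*)\)", script)
    cores = re.search(r"InterFrame\([^)]*\bCores=(\w+)", script)
    return {
        "memory_max_mb": int(mem.group(1)) if mem else None,
        "mt_modes": [args.replace(" ", "") for args in mt_calls],
        "interframe_cores": cores.group(1) if cores else None,
    }


hybrid, doc = extract(HYBRID_SCRIPT), extract(DOC_SCRIPT)
for key in hybrid:
    marker = "" if hybrid[key] == doc[key] else "  <- differs"
    print(f"{key}: hybrid={hybrid[key]!r} doc={doc[key]!r}{marker}")
```

Note that in the documentation example `Cores` is a script variable set to 4, so the sketch reports it by name rather than by value.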
Reply
#10
Quote:Maybe this is the issue ? 5 by doc is the slowest. Though i am shooting blind here.
You are.
Reply