
GPU Transcoding (Intel QuickSync and nVidia NVENC)


witteschnitte


techywarrior

I'll have to play a bit.  But what happens if the user starts to play a file and then fast-forwards three-quarters of the way through the video?  Does it just finish that one-minute segment and then jump to where the user currently is and start from there?

Not sure on that :) I've noticed that sometimes it can take a little bit to fast forward so it could even be as bad as requiring the transcode to reach that point (but I don't think that's the case).

 

I suppose I could try it out and tell you but I'm actually at work right now and supposed to be programming some other stuff :)


techywarrior

Btw, this proxy app idea sounds really cool. I'm interested to see how it progresses. I think this would also alleviate Luke's apprehension about including GPU transcoding, as it could be handled by the proxy and not require config settings in MBS. That way there would be no difference to MB with regard to the host OS (since not all will support the GPU option).

 

And if Luke is really opposed to adding in any config options at all there could always be an XML file that people could modify to control options in the proxy.


The proxy option is cleanest for MBS because right now this can't really be a core feature until it can be universally supported on all of our operating systems. I also don't really like putting custom code into the server for things that the server is not actually using. It does have some challenges, though. Rather than focus on what MBS uses from ffmpeg, I would instead make sure it can handle anything by first routing everything to ffmpeg and then opting into alternatives only under certain conditions. Otherwise, if we decide to use some new feature or parameter, it could cause this to break abruptly.
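The "route everything to ffmpeg, opt in only under certain conditions" idea can be sketched roughly as below. This is only an illustration in Python, not code from MBS or from any proxy in this thread: `route_args` and its `gpu_encoder` parameter are hypothetical names, and the single condition used here (an explicit libx264 video codec) is an assumption. Encoder-specific options such as -preset or -crf are deliberately left alone and might still need handling.

```python
def route_args(args, gpu_encoder=None):
    """Return the argument list to hand to the real ffmpeg.

    With gpu_encoder=None nothing is changed, so new or unknown options
    always reach ffmpeg exactly as the server wrote them.
    """
    routed = list(args)
    if gpu_encoder is None:
        return routed  # default path: pure pass-through
    for i in range(len(routed) - 1):
        # Opt in only when the server explicitly asked for libx264 video.
        if routed[i].startswith("-codec:v") and routed[i + 1] == "libx264":
            routed[i + 1] = gpu_encoder
    return routed
```

Called without a second argument, it returns the list unchanged; called with, say, "h264_qsv", only the value following a -codec:v option is swapped.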


Yeah, what I'm thinking of is a drop-in proxy that, for now in my testing, will do nothing but pass the command line on to the real ffmpeg. No playing with the command line or anything.  I first have to see how much this breaks things, to figure out the scope of what will be needed and how much redirection of stdin, stdout, and stderr is required, and then work that part out.

 

Once the above is working and MB3 can't tell the difference between the real ffmpeg and the proxy, a few people can drop it into their systems to verify "results as normal".  Then we can start to play with changing things to QuickSync on the fly, and things will get fun!

 

Carlo


Hi,

 

Sorry if this sidetracks the discussion, that's not what I mean to do, but I tried to set this up using NVENC on my 750 Ti (per the Wiki, which says to post here ... :(). I followed the instructions there, putting the API.dll in place along with the new ffmpeg and config.xml. Everything says it's working, and the log shows it using libnvenc (it said libx264 before), but CPU load doesn't really drop, and GPU load is very low (< 5%, per GPU-Z). I also see that with my Windows Phone client and with IE as a client, playback now fails entirely. I just changed config.xml back (keeping ffmpeg and API.dll), going back to libx264, and all is working again.

 

Thoughts?

 

Thanks!

 


Hi,

 

I'm running "Version 3.0.5518.28564". Could this mismatch be the issue? BTW, I do see right now that "Version 3.0.5519.2656 is now available for download.".

 

A bit confused about the CPU load comment. Isn't the purpose of using HW acceleration to offload the CPU (so CPU usage should drop substantially)?

 

Thanks!


Makes sense, will wait for the updated DLL - thanks for the pointer! BTW, where will it be noted (so I know when it's ready)?

 

FYI, I do see the FPS go up even now (though GPU load is still quite low). I guess I was going by the Wiki note "Hopefully you will see an increase in your FPS, or at least a fall in your CPU usage". I'm after the CPU usage drop, as pegging the CPU causes other issues on the machine ... :(.

 

Thanks again.


BTW, if you have a hint about extra throttling it would be appreciated ... ;).

 

It looks like setting the number of threads will do this, but not sure where to add this custom setting. The settings in the GUI are for quality / speed (performance) tradeoffs - they make sense, but I don't want to change quality, rather just don't grab all the CPU horsepower. I'm OK with lower FPS ...

 

Make sense?


FYI, yep - threads does the trick. Tried it, and I can limit the CPU usage (and directly the FPS, without changing quality). I just need to figure out how to implement this now ... ;-).

 

Also FYI, I tried manually running ffmpeg, same command line as MB uses - with both libx264 and libnvenc. The odd thing - the output (file) from libnvenc is broken it seems, but libx264 works fine (same parameters, only codec changed). Thoughts?

 

Thanks!!!


Hey guys, just wanted to drop you guys a note.

I had planned on working on the "proxy" for ffmpeg over the weekend but didn't get to it due to family stuff.  I did however start working on this today.

 

Spent the first 5 hours "wrestling" with C#, trying to pull the command line back exactly as entered, only to find that MS conveniently strips out double quotes and "escapes" some things "for you". :(

 

Finally got past this and was able to build a basic proxy that can take the command line (as in the log files); it calls ffmpeg correctly and ffmpeg does its thing as expected. So far so good.

 

Next up was to rename ffmpeg.exe to MBffmpeg.exe and then drop in the proxy as ffmpeg.exe so that MB3 would call the proxy.  As expected it doesn't work correctly YET.

The proxy does call ffmpeg and ffmpeg is busy in the background doing its thing.  However, the proxy right now is NOT doing any stderr, stdin, stdout redirection so MB3 isn't aware that it's working.  This is what I expected to happen based on what Luke said earlier.

 

I've been programming for about 14 hours (work stuff and then the proxy) so far today and want to take a break to grab some dinner, and then I'll play for a while.

 

I basically just wanted to drop you guys a note to let you know I've started it (few days later than expected) and what progress has been made.

 

Before going further on this I may pull down the code for MB3 which I haven't done yet.  Once I can compile MB3 and be able to look at the code calling ffmpeg.exe I'll be in a better position to figure out what needs to be done.

 

Later,

Carlo


scanner50
cayars, on 10 Feb 2015 - 7:02 PM, said:

Hey guys, just wanted to drop you guys a note.

I had planned on working on the "proxy" for ffmpeg over the weekend but didn't get to it due to family stuff.  I did however start working on this today.

 

Spent the first 5 hours "wrestling" with C#, trying to pull the command line back exactly as entered, only to find that MS conveniently strips out double quotes and "escapes" some things "for you". :(

 

Finally got past this and was able to build a basic proxy that can take the command line (as in the log files); it calls ffmpeg correctly and ffmpeg does its thing as expected. So far so good.

 

Next up was to rename ffmpeg.exe to MBffmpeg.exe and then drop in the proxy as ffmpeg.exe so that MB3 would call the proxy.  As expected it doesn't work correctly YET.

The proxy does call ffmpeg and ffmpeg is busy in the background doing its thing.  However, the proxy right now is NOT doing any stderr, stdin, stdout redirection so MB3 isn't aware that it's working.  This is what I expected to happen based on what Luke said earlier.

 

I've been programming for about 14 hours (work stuff and then the proxy) so far today and want to take a break to grab some dinner, and then I'll play for a while.

 

I basically just wanted to drop you guys a note to let you know I've started it (few days later than expected) and what progress has been made.

 

Before going further on this I may pull down the code for MB3 which I haven't done yet.  Once I can compile MB3 and be able to look at the code calling ffmpeg.exe I'll be in a better position to figure out what needs to be done.

 

Later,

Carlo

 

I love this guy's willingness and ability to attack this thing, just what I've been waiting for! @ebr: any chance you can send him that $40 donation I made recently? :)

Edited by rainking430

Luke and ebr are deserving of your donation, not me.  I just want to help in areas that they may not have time or where I may have a level of expertise that can benefit the project.  This GPU encoding isn't going to be used by everyone so it really would be a waste of their overall time (for now) since it's only available on Windows for the most part. For me on the other hand I need this BAD.  I typically stream 6 or more videos at a time during prime-time and it's only getting worse on my system (currently still mainly Plex).  So this will definitely help my specific situation and make the change to MB3 that much easier!

 

Here's a "bigger picture" of what I have in mind (eventually):

 

I had been working on a universal encoder/remuxer for Plex in Python (not a language I'm that familiar with, but add-ins for Plex are Python) and have run the last 2000 files added to my system through it.  In a nutshell, it will take almost any video file and convert it to MP4 with h.264.  It can create an AAC audio track if needed for universal compatibility, and can remove audio tracks not in a language you want. It can add or remove subtitles, and can pull subtitles from the video and create SRT files from them. It can either remux the file if the video is already in h.264 or transcode if needed. So it's basically a "universal" remux/encoder that will give you files in a standard format.  It uses ffmpeg as the guts and can, with settings, emulate Handbrake profiles (software or QuickSync).  This is currently working.

 

My end goal is to rewrite this in C#, where I'm far more comfortable, and add the ability for it to also generate the BIF (Roku trick) files. Even in Python I have this set up to use multiple computers on the LAN for doing the work. So the end goal would be the basic "proxy" or "Intercept" program that sits between MB3 and ffmpeg, along with the functionality just mentioned for those that want to use it.  In an ideal world all encoding would get handled/scheduled via the "Intercept/proxy", and it could manage real-time transcodes vs encoding files ready to be added to your library.  It will be able to dynamically adjust priority and/or cores used by each process to effectively control "background" vs "foreground/real-time" encodes, and hopefully be able to use multiple computers, multiple GPUs, multiple CPUs.

 

But the first step is just the basics of allowing the "Intercept" program to sit between MB3 and ffmpeg which is what I'm working on now and making additional progress.

 

Carlo

 

PS Thanks for mentioning donations as I'm off to make my donation right now! (edit: donation completed)

Edited by cayars

scanner50
cayars, on 11 Feb 2015 - 07:53 AM, said:

Luke and ebr are deserving of your donation, not me. 

 

Yeah, you're right, they do deserve it! I'm just ecstatic that someone like you has finally come along to try to make this a more mature feature, and that was my way of expressing my appreciation. :)

 

 

PS Thanks for mentioning donations as I'm off to make my donation right now! (edit: donation completed)

 

Yeah man, MB and its awesome developers and coder community are worth every penny.


Man I'm a dummy. :)  I actually had this working yesterday but needed to change one tiny little thing.  I had already set up redirection of stderr and stdout but, like a dummy, was writing both to stdout in my program.  So from the console it looked correct, because ffmpeg's stdout and stderr both appear on the console unless you specifically redirect them.  Today I opened up the code, looked at:

 

Console.WriteLine(line);

changed it to

Console.Error.WriteLine(line);

 

and now it works.  So basically this is what I've done: I went to C:\Users\cayars\AppData\Roaming\MediaBrowser-Server\ffmpeg\20150110, which is where the currently used version of ffmpeg.exe is located on my test system.

Renamed ffmpeg.exe to MBffmpeg.exe

Dropped in the intercept as ffmpeg.exe

done
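The drop-in arrangement described in this post can be sketched roughly as follows. This is a minimal Python illustration, not the C# intercept program itself: `run_intercept` and its `real_binary` parameter are hypothetical names, while the banner text and the MBffmpeg.exe file name are taken from this thread. It assumes the renamed ffmpeg can be found next to the proxy or on the PATH, and it relies on the child process inheriting stdin, stdout and stderr rather than redirecting them explicitly.

```python
import subprocess
import sys

BANNER = "ffmpeg intercept version 0.01"  # banner text as quoted in this thread


def run_intercept(args, real_binary="MBffmpeg.exe"):
    """Forward args untouched to the real ffmpeg and return its exit code."""
    command = [real_binary] + list(args)
    # Announce on stderr, which is where ffmpeg writes its own log output.
    print(BANNER, file=sys.stderr)
    print(" ".join(command), file=sys.stderr)  # shown for the log only, never re-parsed
    sys.stderr.flush()
    # A list avoids splitting the arguments again; with no redirection given,
    # the child inherits this process's stdin, stdout and stderr.
    completed = subprocess.run(command)
    return completed.returncode
```

A script saved as the replacement ffmpeg would end with something like `sys.exit(run_intercept(sys.argv[1:]))`, so that the caller sees the same exit code as the real ffmpeg. On Windows, Python rebuilds the command line from the list, so the quoting that the real ffmpeg sees may still differ from what the caller originally typed.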

 

Here's the first few lines of what shows up in the MB3 transcode log file:

 

C:\Users\cayars\AppData\Roaming\MediaBrowser-Server\ffmpeg\20150110\ffmpeg.exe -fflags +genpts -i file:"\\PLEX\F\Movies\#\5 Days of War (2011)\5 Days Of War (2011).mp4" -map 0:0 -map 0:1 -map -0:s -codec:v:0 libx264 -force_key_frames expr:gte(t,n_forced*5) -vf "scale=min(iw\,720):trunc(ow/dar/2)*2" -pix_fmt yuv420p -preset superfast -subq 0 -crf 23 -maxrate 808001 -bufsize 1616002 -vsync vfr -profile:v high -level 41 -map_metadata -1 -threads 0 -codec:a:0 aac -strict experimental -ac 2 -ab 192000 -af "aresample=async=1" -f mp4 -movflags frag_keyframe+empty_moov -y "C:\Users\cayars\AppData\Roaming\MediaBrowser-Server\transcoding-temp\a3273607535ac4132ce64dbe6a29045f.mp4"

ffmpeg intercept version 0.01
MBffmpeg.exe -fflags +genpts -i file:"\\PLEX\F\Movies\#\5 Days of War (2011)\5 Days Of War (2011).mp4" -map 0:0 -map 0:1 -map -0:s -codec:v:0 libx264 -force_key_frames expr:gte(t,n_forced*5) -vf "scale=min(iw\,720):trunc(ow/dar/2)*2" -pix_fmt yuv420p -preset superfast -subq 0 -crf 23 -maxrate 808001 -bufsize 1616002 -vsync vfr -profile:v high -level 41 -map_metadata -1 -threads 0 -codec:a:0 aac -strict experimental -ac 2 -ab 192000 -af "aresample=async=1" -f mp4 -movflags frag_keyframe+empty_moov -y "C:\Users\cayars\AppData\Roaming\MediaBrowser-Server\transcoding-temp\a3273607535ac4132ce64dbe6a29045f.mp4"

ffmpeg version N-68994-g4df01d5 Copyright © 2000-2015 the FFmpeg developers
built on Jan 9 2015 22:13:35 with gcc 4.9.2 (GCC)
configuration: --enable-gpl --enable-version3 --disable-w32threads --enable-avisynth --enable-bzlib --enable-fontconfig --enable-frei0r --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libfreetype --enable-libgme --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-librtmp --enable-libschroedinger --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvo-aacenc --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs --enable-libxvid --enable-lzma --enable-decklink --enable-zlib

<continues as usual>

 

The first and third parts are typically what you would see in the logs.  The 2nd part starting with ffmpeg intercept is the only section added by the intercept program to the output (hence logged) and just shows what is actually going to be executed. I figured I should add this so in the future we can tell if it's using ffmpeg directly or via the proxy/intercept program.

 

I haven't tried switching out any parameters yet, but that is next and I don't think there will be any issues.  Of course my dev machine is an i7 Dell notebook with ATI graphics. :)  So I'll have to build it, then test/run on a couple of other machines I have, some with nVidia GPUs and another with QuickSync graphics.

 

Before I "pollute" this thread too much, would you guys prefer I start a different thread or just keep updating this one?

 

Carlo

Edited by cayars

mjb2000,

Back in message 63 you mentioned:

So far I have made these changes:

If "-preset [libx264 speed]" is specified, this is honoured by h264_qsv (It was originally expecting 1-7)

If "-level 40" is specified, this is honoured by h264_qsv (it was originally expecting 4.0)

If "-crf ##" is specified, this is treated by h264_qsv as the -qpb -qpi -qpp parameters (QuickSync allows different quant settings for i and b frames)

 

Would it be possible to get a "virgin" build of ffmpeg.exe without the changes above?

Basically a version of ffmpeg with QuickSync and NVENC support but without any modification like the above.


I've been playing with this for a while from the command line.

First I played back a video inside of MB3 and went into the logs and grabbed what it was passing to ffmpeg.

 

Next I run this at the command prompt and watch the frames per second and CPU use.

Next I use the version of ffmpeg provided here in the thread and modify it enough to run.

 

On CPU only I get 118 fps, and with GPU QS I get 46 fps.

Next I strip the GPU command line down to the minimum and get around 150 fps, but then it's not doing what MB3 needs it to do.

 

In either case the CPU is close to maxed out.  On the GPU version I do see it being used when watching in GPU-Z.
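For comparisons like this, the frame rate can be read from ffmpeg's progress output instead of by eye. A minimal sketch, assuming the progress line carries a field of the form `fps= 46`; `parse_fps` is a hypothetical helper, and the sample line in the usage note is illustrative rather than copied from any run in this thread.

```python
import re

# Matches the fps field of a progress line, e.g. "fps= 46" or "fps=118.5".
_FPS_FIELD = re.compile(r"\bfps=\s*(\d+(?:\.\d+)?)")


def parse_fps(line):
    """Return the fps value from one ffmpeg progress line, or None if absent."""
    match = _FPS_FIELD.search(line)
    return float(match.group(1)) if match else None
```

For example, `parse_fps("frame=  250 fps= 46 q=28.0")` returns 46.0, and a line without an fps field returns None.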

 

So either the CPU option SUPERFAST is faster than GPU transcoding, or there is a problem with this version of ffmpeg.

 

Can any of you guys try this yourself and let me know what you find?  At this point I'm at a standstill.

 

Carlo


Hi,

 

You did exactly what I did ... :). Grabbed the command line, ran the conversion. I do see that the GPU is ~ 2x the CPU rate (with the CPU maxed out). What GPU do you have? And what CPU?

 

FYI, you can limit the CPU load, using threads. At least what I added to do this was "-threads 1", and it limited the CPU load to 50% (and cut FPS ~ 50% also). Make sense? Try this, hopefully it works for you.
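If a proxy were to apply this thread limit automatically, the argument rewrite might look like the sketch below. `set_threads` is a hypothetical helper, and placing a new -threads option just before the final argument assumes the last argument is the output path, as it is in the command line MB logs.

```python
def set_threads(args, count):
    """Return a copy of args with the -threads option set to count."""
    out = list(args)
    value = str(count)
    if "-threads" in out[:-1]:
        # Replace the value of the first existing -threads option.
        out[out.index("-threads") + 1] = value
        return out
    # No existing option: add one just before the final (output) argument.
    return out[:-1] + ["-threads", value] + out[-1:]
```

With the logged command line, which already contains "-threads 0", `set_threads(args, 1)` would change only that value and leave every other argument in place.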

 

BTW, are you able to get libnvenc to output a usable file? My output isn't working, and the metadata is messed up. If I change libnvenc to libx264, all is good. If you don't mind, what command line are you using to get a valid output?

 

Thanks!


I should have tried this before writing the intercept program :(

 

Really no point in limiting the threads if it cuts down on the FPS.

 

I didn't even try libnvenc after finding out that QuickSync isn't really working on two different machines, both of which run Handbrake much faster with little CPU in QuickSync mode. Hmm, now that I think about it, I haven't tried HB in SuperFast mode, so I could be comparing apples to oranges.

 

I tried a quick Google search but came up empty.  Anyone found another source of GPU enabled ffmpeg we could test?

If we can't find a GPU version that runs faster or uses less CPU than what we have now running SUPERFAST, there probably isn't a point to this.

 

Carlo

 

PS sorry forgot part of the answer to your question.  Both are i5 test machines with integrated Intel HD Graphics.

Edited by cayars

Sorry for the slow reply. Kids' events tonight, so I didn't get a chance to look at this. Will try more this weekend, but I'm after the same thing as you ... and did the same search ... ;). I also think that the GPU should be faster, and offload the CPU. That's the purpose of HW acceleration.

 

It does look like "stock" ffmpeg includes this support now, at least to some level. I'd be interested to try to build it, but it sounds like a bit of a mess (dependency-wise). 

 

Thoughts?
