
GPU Transcoding (Intel QuickSync and nVidia NVENC)


witteschnitte


nagle3092

This would be a dream for NASes like my QNAP TS-451, which uses a J1800. Granted, I encode everything with x264 for local playback, but this would be great for watching content away from my network. It's great to see the community doing so much for Media Browser. Keep up the great work, guys/gals.


I would really like to test this out but my media server is a Zino HD with all AMD hardware inside.

 

As for OpenCL, it is true that ffmpeg uses it for some filters, but libx264 should use it inside the encoding algorithm.


Just curious how this will eventually work. When Quick Sync is enabled, is that the only form of transcoding that will be done?

Or will the Media Browser server be able to use x264 and Quick Sync as needed?

My 4670K handles multiple streams very smoothly, but CPU usage is high for the first 10 minutes of a movie.

I know that using Quick Sync in HandBrake drops CPU usage from 100% to 30%, but I do not know how well Quick Sync handles multiple streams.


mjb2000

This is awesome!

Looking forward to it already :D

 

Don't just look forward to it - Give it a try :)

 

Take a look at the Wiki page to see how to customise your MediaBrowser installation to test out QuickSync encoding.

 

 

What do you guys think about OpenCL for guys like me? I do not have a QuickSync CPU :rolleyes:

 

As for OpenCL, it is true that ffmpeg uses it for some filters, but libx264 should use it inside the encoding algorithm.

 

I will take a look to see what OpenCL can offer us. So far I have read that it only affects the lookahead mechanism of x264. That will help, but it's the number-crunching encoding process that takes most of the processing power, so I doubt we'll see amazing improvements.


mjb2000

Just curious how this will eventually work. When Quick Sync is enabled, is that the only form of transcoding that will be done?

Or will the Media Browser server be able to use x264 and Quick Sync as needed?

My 4670K handles multiple streams very smoothly, but CPU usage is high for the first 10 minutes of a movie.

I know that using Quick Sync in HandBrake drops CPU usage from 100% to 30%, but I do not know how well Quick Sync handles multiple streams.

 

AFAIK It's not possible to seamlessly switch between libx264 and QuickSync within a single stream, so if both technologies were to be implemented simultaneously, then it would probably have to be along the lines of counting the number of transcode tasks and if it's above a certain level then launch the next transcode task using a different codec.

 

In terms of the spike at the start, I guess this is because ffmpeg tries to encode the movie as fast as possible. @@Luke - Under what circumstances is the -re option used (it reads the input at its native frame rate, i.e. real-time encoding), and what is the argument against using it for all transcodes?

 

My very basic J1900 chip can handle at least 3 simultaneous streams, but there is a software fallback included within QuickSync, so where the GPU hardware is unavailable the video is encoded using a software encoder (although I doubt it is as good as x264). I haven't tested this feature as I don't know how to pretend the GPU is busy and force h264_qsv to use software encoding.
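For anyone wondering what the task-counting idea might look like, here is a minimal Python sketch. It is not MediaBrowser's actual code: the function name, the parameter names and the default threshold of 3 (borrowed from the J1900 figure above) are all assumptions for illustration. Only the encoder names `h264_qsv` and `libx264` are real ffmpeg encoder names.

```python
def pick_video_encoder(active_jobs: int,
                       threshold: int = 3,
                       preferred: str = "h264_qsv",
                       fallback: str = "libx264") -> str:
    """Choose the encoder for the NEXT transcode job.

    The encoder is fixed for the lifetime of a job, because switching
    encoders in the middle of a stream is not seamless.
    `threshold` is a hypothetical, user-tunable limit on how many jobs
    the preferred encoder should handle at once.
    """
    if active_jobs < 0 or threshold < 0:
        raise ValueError("job counts must not be negative")
    # Below the limit: keep using the preferred encoder.
    if active_jobs < threshold:
        return preferred
    # At or above the limit: start the next job on the other encoder.
    return fallback
```

Swapping `preferred` and `fallback` gives the opposite policy (x264 first, QuickSync only once the CPU is busy), so the same function covers both variants.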


mjb2000

Sorry to have got your hopes up (I didn't start the original thread, so can't rename it I'm afraid!).

 

A few people have been looking for options involving OpenCL, which would be supported by AMD and other GPUs. I spent the afternoon looking into it but couldn't get some of the various prerequisites to work within my build environment, so for now I don't think it's going to happen. To be honest with you, I'm no expert at this stuff; I just pulled together a few resources from some GitHub projects and was able to get QuickSync working. Hopefully there are others out there who can take up the reins to extend GPU encoding to other platforms and chips.

 

One thing I'd say is that it seems OpenCL will only bring modest performance improvements, since it doesn't accelerate the h264 encoding process. Reading around, it seems AMD do have a solution, but I'm not sure if it's possible to integrate it into ffmpeg.


AFAIK It's not possible to seamlessly switch between libx264 and QuickSync within a single stream, so if both technologies were to be implemented simultaneously, then it would probably have to be along the lines of counting the number of transcode tasks and if it's above a certain level then launch the next transcode task using a different codec.

 

In terms of the spike at the start, I guess this is because ffmpeg tries to encode the movie as fast as possible. @@Luke - Under what circumstances is the -re option used (it reads the input at its native frame rate, i.e. real-time encoding), and what is the argument against using it for all transcodes?

 

My very basic J1900 chip can handle at least 3 simultaneous streams, but there is a software fallback included within QuickSync, so where the GPU hardware is unavailable the video is encoded using a software encoder (although I doubt it is as good as x264). I haven't tested this feature as I don't know how to pretend the GPU is busy and force h264_qsv to use software encoding.

Thanks for the reply.

Well, the spike in usage is OK. It never stutters or lags, even with 5 users (2 usually direct stream).

That was what I had in mind. It would be great if the CPU could run a few regular x264 transcodes and then run another using Quick Sync if needed, even if that were a setting the user controlled based on system performance and needs.

Edited by Wirerat

Still not working for me :(

 

Metadata:
    encoder         : libebml v0.7.9 + libmatroska v0.8.1
    creation_time   : 2010-01-18 14:26:10
  Duration: 02:37:49.47, start: 0.000000, bitrate: 7934 kb/s
    Stream #0:0: Video: h264 (High), yuv420p, 1280x534 [SAR 1:1 DAR 640:267], 23.98 fps, 23.98 tbr, 1k tbn, 47.95 tbc (default)
    Stream #0:1(eng): Audio: dts (DTS), 48000 Hz, 5.1(side), fltp, 1536 kb/s (default)
    Metadata:
      title           : English  DTS 5.1 @ 1.5 Mbps
    Stream #0:2(eng): Audio: ac3, 48000 Hz, stereo, fltp, 192 kb/s
    Metadata:
      title           : English Comment AC3 2.0 @ 192kbps
[h264_qsv @ 057fac00] MFXInit(): -3
Output #0, hls, to 'C:\Users\Administrator\AppData\Roaming\MediaBrowser-Server\transcoding-temp\streaming\e19d0293c598498dcdd0de51659f5344.m3u8':
    Stream #0:0: Video: h264, none, q=2-31, 128 kb/s, SAR 1:1 DAR 0:0, 23.98 fps (default)
    Metadata:
      encoder         : Lavc56.14.100 h264_qsv
    Stream #0:1: Audio: aac, 0 channels, 128 kb/s (default)
    Metadata:
      encoder         : Lavc56.14.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> h264 (h264_qsv))
  Stream #0:1 -> #0:1 (dts (dca) -> aac (native))
Error while opening encoder for output stream #0:0 - maybe incorrect parameters such as bit_rate, rate, width or height
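The line to look at in a log like this is `MFXInit(): -3`, which is printed before the generic "Error while opening encoder" message. Below is a small Python sketch for pulling that status code out of a transcode log. The helper name is hypothetical, and the code-to-name table is a partial list written from my recollection of the Intel Media SDK headers (where -3 is `MFX_ERR_UNSUPPORTED`), so treat it as an assumption and check it against `mfxdefs.h` before relying on it.

```python
import re
from typing import Optional

# Partial, assumed mapping of Intel Media SDK mfxStatus codes to names.
# Verify against mfxdefs.h; only a few codes are listed here.
MFX_STATUS_NAMES = {
    0: "MFX_ERR_NONE",
    -1: "MFX_ERR_UNKNOWN",
    -2: "MFX_ERR_NULL_PTR",
    -3: "MFX_ERR_UNSUPPORTED",
}

def find_mfxinit_status(log_text: str) -> Optional[int]:
    """Return the status code from the first 'MFXInit(): N' line, or None."""
    match = re.search(r"MFXInit\(\):\s*(-?\d+)", log_text)
    return int(match.group(1)) if match else None

def describe_status(code: int) -> str:
    """Map a status code to a name, falling back to the raw number."""
    return MFX_STATUS_NAMES.get(code, "unknown status %d" % code)
```

If the table is right, "unsupported" would point at the QuickSync session failing to initialise on that machine (driver or hardware), rather than at the bitrate or size parameters the generic ffmpeg error mentions.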

mjb2000

 

Still not working for me :(

 

Metadata:
    encoder         : libebml v0.7.9 + libmatroska v0.8.1
    creation_time   : 2010-01-18 14:26:10
  Duration: 02:37:49.47, start: 0.000000, bitrate: 7934 kb/s
    Stream #0:0: Video: h264 (High), yuv420p, 1280x534 [SAR 1:1 DAR 640:267], 23.98 fps, 23.98 tbr, 1k tbn, 47.95 tbc (default)
    Stream #0:1(eng): Audio: dts (DTS), 48000 Hz, 5.1(side), fltp, 1536 kb/s (default)
    Metadata:
      title           : English  DTS 5.1 @ 1.5 Mbps
    Stream #0:2(eng): Audio: ac3, 48000 Hz, stereo, fltp, 192 kb/s
    Metadata:
      title           : English Comment AC3 2.0 @ 192kbps
[h264_qsv @ 057fac00] MFXInit(): -3
Output #0, hls, to 'C:\Users\Administrator\AppData\Roaming\MediaBrowser-Server\transcoding-temp\streaming\e19d0293c598498dcdd0de51659f5344.m3u8':
    Stream #0:0: Video: h264, none, q=2-31, 128 kb/s, SAR 1:1 DAR 0:0, 23.98 fps (default)
    Metadata:
      encoder         : Lavc56.14.100 h264_qsv
    Stream #0:1: Audio: aac, 0 channels, 128 kb/s (default)
    Metadata:
      encoder         : Lavc56.14.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> h264 (h264_qsv))
  Stream #0:1 -> #0:1 (dts (dca) -> aac (native))
Error while opening encoder for output stream #0:0 - maybe incorrect parameters such as bit_rate, rate, width or height

 

 

Can you include the ffmpeg command line that was executed? It is at the top of the log file (obscure the file name if you wish).

Also, are you using the latest version of ffmpeg (which I uploaded around 9 hours ago)?


Did not have the new ffmpeg. Updated and here is another run. It always creates 2 identical transcoding logs:

 

C:\Users\Administrator\AppData\Roaming\MediaBrowser-Server\ffmpeg\20150110\ffmpeg.exe -fflags +genpts -i "http://localhost:8096/mediabrowser/videos/7ca84710f9dd279a0736d0c726715b15/stream?static=true&Throttle=true&mediaSourceId=7ca84710f9dd279a0736d0c726715b15&transcodingJobId=fdc7bdd2f0db4a5b864325505c64b1b9" -map_metadata -1 -threads 0 -map 0:0 -map 0:1 -map -0:s -codec:v:0 h264_qsv  -maxrate 5362000 -bufsize 10724000 -vsync vfr -level 40 -force_key_frames expr:gte(t,n_forced*6) -vf "scale=trunc(min(iw\,1920)/2)*2:trunc(min((iw/dar)\,1080)/2)*2" -copyts -flags -global_header -codec:a:0 aac -strict experimental -ac 6 -ab 510000 -af "adelay=1,aresample=async=1" -hls_time 6 -start_number 0 -hls_list_size 0 -y "C:\Users\Administrator\AppData\Roaming\MediaBrowser-Server\transcoding-temp\streaming\7da9824315f8bcb22d40a8309caa68ef.m3u8"
 
 
ffmpeg version N-68994-g713e3bb Copyright (c) 2000-2015 the FFmpeg developers
  built on Jan 13 2015 13:06:15 with gcc 4.9.2 (Rev2, Built by MSYS2 project)
  configuration: --enable-libmfx --enable-iconv --arch=x86 --disable-debug --disable-shared --disable-doc --disable-w32threads --enable-gpl --enable-version3 --enable-runtime-cpudetect --enable-avfilter --enable-bzlib --enable-zlib --enable-decklink --enable-librtmp --enable-gnutls --enable-avisynth --enable-frei0r --enable-filter=frei0r --enable-libbluray --enable-libcaca --enable-libopenjpeg --enable-fontconfig --enable-libfreetype --enable-libass --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libvo-amrwbenc --enable-libschroedinger --enable-libsoxr --enable-libtwolame --enable-libspeex --enable-libtheora --enable-libutvideo --enable-libvorbis --enable-libvo-aacenc --enable-libopus --enable-libvidstab --enable-libvpx --enable-libwavpack --enable-libxavs --enable-libx264 --enable-libx265 --enable-libxvid --enable-libzvbi
  libavutil      54. 16.100 / 54. 16.100
  libavcodec     56. 20.100 / 56. 20.100
  libavformat    56. 18.100 / 56. 18.100
  libavdevice    56.  3.100 / 56.  3.100
  libavfilter     5.  7.100 /  5.  7.100
  libswscale      3.  1.101 /  3.  1.101
  libswresample   1.  1.100 /  1.  1.100
  libpostproc    53.  3.100 / 53.  3.100
  Metadata:
    encoder         : libebml v0.7.9 + libmatroska v0.8.1
    creation_time   : 2010-07-03 01:38:03
  Duration: 01:41:36.13, start: 0.000000, bitrate: 6150 kb/s
    Stream #0:0(eng): Video: h264 (High), yuv420p, 1280x544, SAR 1:1 DAR 40:17, 23.98 fps, 23.98 tbr, 1k tbn, 47.95 tbc (default)
    Stream #0:1(eng): Audio: dts (DTS), 48000 Hz, 5.1(side), fltp, 1536 kb/s (default)
    Stream #0:2(eng): Subtitle: subrip (default)
[h264_qsv @ 05a16f00] MFXInit(): -3
Output #0, hls, to 'C:\Users\Administrator\AppData\Roaming\MediaBrowser-Server\transcoding-temp\streaming\7da9824315f8bcb22d40a8309caa68ef.m3u8':
    Stream #0:0: Video: h264, none, q=2-31, 128 kb/s, SAR 1:1 DAR 0:0, 23.98 fps (default)
    Metadata:
      encoder         : Lavc56.20.100 h264_qsv
    Stream #0:1: Audio: aac, 0 channels, 128 kb/s (default)
    Metadata:
      encoder         : Lavc56.20.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> h264 (h264_qsv))
  Stream #0:1 -> #0:1 (dts (dca) -> aac (native))
Error while opening encoder for output stream #0:0 - maybe incorrect parameters such as bit_rate, rate, width or height
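One quick thing a log like this does confirm is that the new build is in use: the `configuration:` line in the banner includes `--enable-libmfx`, the flag this QuickSync-enabled build was configured with. A rough Python sketch for checking that automatically follows; the function name is made up for illustration, and it only parses the banner text, it does not run ffmpeg.

```python
def configure_flags(banner: str) -> set:
    """Collect the configure flags from an ffmpeg startup banner.

    Looks for the line starting with 'configuration:' and splits the
    rest of it on whitespace. Returns an empty set if no such line exists.
    """
    prefix = "configuration:"
    for line in banner.splitlines():
        stripped = line.strip()
        if stripped.startswith(prefix):
            return set(stripped[len(prefix):].split())
    return set()


# Example with a shortened banner (flags taken from the log above).
banner = (
    "ffmpeg version N-68994-g713e3bb\n"
    "  configuration: --enable-libmfx --enable-gpl --enable-libx264\n"
)
print("--enable-libmfx" in configure_flags(banner))
```

Since the flag is present here and `MFXInit()` still fails, the build itself is probably not the problem.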

mjb2000

That's interesting... 

 

See the "/2)*2" in the scale function. That suggests that this is a slightly older version of the MediaBrowser Dev branch. Can you check whether you are on the latest dev build, which included some changes for QuickSync handling? (It is possible you are on the latest version and I missed a function in the source code and need to update something else.) FYI, I am looking for that number "2" to be "32" if the command is being issued correctly for QSV.
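To make the difference between "2" and "32" concrete, here is a small Python sketch that mirrors the arithmetic of the scale expression from the logged command, `trunc(min(iw,1920)/N)*N : trunc(min(iw/dar,1080)/N)*N`. The function and parameter names are my own, and I am assuming the QSV variant is the same expression with N = 32. `Fraction` is used so that `iw/dar` is exact instead of picking up floating-point error.

```python
from fractions import Fraction

def scaled_size(src_width, dar, multiple, max_width=1920, max_height=1080):
    """Mirror the ffmpeg scale expression for a given rounding multiple.

    src_width -- input width in pixels (iw)
    dar       -- display aspect ratio as a Fraction, e.g. Fraction(640, 267)
    multiple  -- N in trunc(x/N)*N; 2 for libx264, 32 assumed for QSV
    """
    # Width: clamp to max_width, then round down to a multiple of N.
    width = (min(src_width, max_width) // multiple) * multiple
    # Height: iw/dar, clamped to max_height, rounded down the same way.
    exact_height = min(Fraction(src_width) / dar, max_height)
    height = int(exact_height // multiple) * multiple
    return width, height


# The 1280x534 file from the first log (DAR 640:267):
print(scaled_size(1280, Fraction(640, 267), 2))   # (1280, 534)
print(scaled_size(1280, Fraction(640, 267), 32))  # (1280, 512)
```

So with 32 the 534-line source would come out at 512 lines, while the 1280x544 file from the second log is already a multiple of 32 in both dimensions and stays as it is.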

 

If you are on the latest version, can you confirm what device you are streaming to?

 

M


mjb2000

OK - That should be working, so I must have missed something. I will take a look at the functions that are running here and get this pushed up for the next Dev release...


dark_slayer

One thing I'd say is that it seems OpenCL will only bring modest performance improvements, since it doesn't accelerate the h264 encoding process

mjb2000 this is just awesome that you've pulled this together without being an expert [emoji14]

 

As someone who's not an expert by any means myself it's quite inspiring. Also, without knowing this for certain, QS was designed by Intel to be a killer video encoder. It's a hardware circuit or ASIC (AFAIK) included in some of their processors.

 

If it was just GPU offloading there wouldn't be much interesting going on, since the Intel GPU is weak as pee. This is similar in my mind to those who were GPU mining bitcoins with AMD GPUs. Even a lowly 6870 from AMD could beat the Titan from Nvidia because AMD had a hardware circuit random number generator that kills at bitcoin hash generations. Same effect here as I see it, and simple GPU offloading won't give results as good without an overly powerful GPU. Long story short . . . I agree with your findings on OpenCL for all of those conjectures

 

 

Finally, nice wiki as well. Also noticed the mixed results in windows 8 comment, and it reminds me of how plexHT and their android app force a frame size at different bit rate settings. Off the top of my head it's something like 480p for 1Mbps and 720p for 2-4Mbps. They won't allow the frame to stay 1080p until you get up to around 6Mbps I think.


mjb2000

Cheers :)

 

Also, without knowing this for certain, QS was designed by Intel to be a killer video encoder. It's a hardware circuit or ASIC (AFAIK) included in some of their processors.

 

I think you're right, it's not just the GPU doing some number crunching - It's some dedicated hardware in there that it is able to offload h264 encoding to. This explains why it's only able to help with h264 and MPEG2 and also why other Intel chips with GPUs aren't able to do QuickSync (they have the GPU but not the extra bit required for hardware encoding).

 

Also noticed the mixed results in windows 8 comment, and it reminds me of how plexHT and their android app force a frame size at different bit rate settings. Off the top of my head it's something like 480p for 1Mbps and 720p for 2-4Mbps. They won't allow the frame to stay 1080p until you get up to around 6Mbps I think.

 

I don't think it is related. The issues don't come from bandwidth vs. frame size; it just seems that if ffmpeg passes the video straight to QuickSync without processing it first (scaling it), then every so often the output video jumps back exactly 4 frames. Very weird. It should be possible to resolve though, since these issues don't exist in HandBrake.

 

As for Plex matching frame size with bitrate, it does kind of make sense. For an average video there will always be a bits-per-pixel (bpp) sweet spot, but then every video is different. If there is very little motion in a video then 6 Mbps at 1080p might be fine, but in an action movie a very blocky encode at 1080p@6 Mbps will probably look a lot worse than 720p@6 Mbps. If you could do true variable bit rate (2 pass) then you could allocate more bandwidth to the brief action scenes, and 6 Mbps averaged over the length of the entire video would probably be OK. But this wouldn't work for streaming, where we can't have large spikes in the bitrate. So although the lack of control is frustrating, I can understand their reasoning behind it.
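To put rough numbers on the bpp point, here is a short Python sketch. The formula is the usual bitrate divided by pixels per second; the function name and the 24 fps figure are assumptions for illustration, and nothing here says where the "sweet spot" actually lies for a given video.

```python
def bits_per_pixel(bitrate_bps: float, width: int, height: int, fps: float) -> float:
    """Average number of encoded bits available per pixel per frame."""
    if width <= 0 or height <= 0 or fps <= 0:
        raise ValueError("width, height and fps must be positive")
    return bitrate_bps / (width * height * fps)


# Same 6 Mbps budget at two frame sizes, assuming 24 fps.
bpp_1080p = bits_per_pixel(6_000_000, 1920, 1080, 24)
bpp_720p = bits_per_pixel(6_000_000, 1280, 720, 24)
print(round(bpp_1080p, 3), round(bpp_720p, 3))
```

At the same bitrate, 720p gets 2.25 times as many bits per pixel as 1080p (the ratio of the two pixel counts), which is the headroom a high-motion scene benefits from.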


For the next dev build I will have added most of your changes. I am integrating them manually so I can review them closely. I have not looked at the rounding stuff yet, but everything else will be there.


mjb2000

I'm on 3.0.5491.1249, streaming to a Nexus 5

 

Sorry, mylle, I got confused. As Luke mentioned above, my code has kind of been accepted, but has not yet made it into the latest Dev release. When this happens I'll let you know and update the Wiki.

 

M


brendabryg

Sorry to have got your hopes up (I didn't start the original thread, so can't rename it I'm afraid!).

 

A few people have been looking for options involving OpenCL, which would be supported by AMD and other GPUs. I spent the afternoon looking into it but couldn't get some of the various prerequisites to work within my build environment, so for now I don't think it's going to happen. To be honest with you, I'm no expert at this stuff; I just pulled together a few resources from some GitHub projects and was able to get QuickSync working. Hopefully there are others out there who can take up the reins to extend GPU encoding to other platforms and chips.

 

One thing I'd say is that it seems OpenCL will only bring modest performance improvements, since it doesn't accelerate the h264 encoding process. Reading around, it seems AMD do have a solution, but I'm not sure if it's possible to integrate it into ffmpeg.

 

Hi, I've been watching this thread for a while now and registered to suggest another avenue. How about Nvidia's NVENC encoder? It is similar to QuickSync in that it is a dedicated hardware ASIC in their newest GPUs (Maxwell and up) for real-time transcoding. This dedicated hardware doesn't use the main part of the GPU at all, leaving it free for any other tasks.

 

There are some ffmpeg builds with NVENC now. My research indicates it is somewhere in between QuickSync and OpenCL as far as speed and quality go. This would be a perfect option for those without Intel CPUs or with older hardware. Pop in a $100 GTX 750 card and have transcoding offloaded to the dedicated encoder. One link mentioned 230 fps for a GTX 750 Ti encoding 1080p. That's good for a few simultaneous encodes.
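As a back-of-the-envelope check on "a few simultaneous encodes", the sketch below divides the quoted encoder throughput by the frame rate of the source. It is only an upper bound under stated assumptions: every stream is 1080p, the 230 fps figure holds when shared between streams, and decoding, scaling and audio cost nothing. The function name is made up for illustration.

```python
def max_realtime_streams(encoder_fps: float, source_fps: float) -> int:
    """Upper bound on streams that can each be encoded in real time.

    encoder_fps -- total frames per second the encoder can produce
    source_fps  -- frame rate each stream needs to keep up with playback
    """
    if encoder_fps < 0 or source_fps <= 0:
        raise ValueError("encoder_fps must be >= 0 and source_fps > 0")
    # Each stream consumes source_fps of the encoder's total throughput.
    return int(encoder_fps // source_fps)


print(max_realtime_streams(230, 24))  # 9
print(max_realtime_streams(230, 60))  # 3
```

Real-world numbers would be lower, but even so the quoted throughput leaves room for several 24 fps movie streams at once.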

 

Here are a few links from my research:

https://forums.plex.tv/index.php/topic/131490-nvenc-support-in-transcoder/

 

http://ffmpeg.org/doxygen/trunk/nvenc_8c_source.html

 

http://blog.medialooks.com/814EAo

 

http://forum.doom9.org/showthread.php?t=170915&page=2

 

Is this something that could be integrated similar to how you have done for QuickSync? I don't have an NVENC-capable card in my main MediaBrowser server right now, but I do in another machine I could possibly test with. Thanks for getting this hardware transcoding thing going again.


dark_slayer

That makes a lot of sense. My gaming PC has a maxwell GPU and it must encode like a beast. This was part of their game streaming addition, and this must be why the specific set of GPUs must be in use for their game streaming to work. I just tried it out with limelight on my Nexus Player and I left the game settings high enough to stress the GPU but it didn't impact streaming as you said. Didn't know that's what was going on with nvidia


denethor

Sorry to have got your hopes up (I didn't start the original thread, so can't rename it I'm afraid!).

 

A few people have been looking for options involving OpenCL, which would be supported by AMD and other GPUs. I spent the afternoon looking into it but couldn't get some of the various prerequisites to work within my build environment,

Thank you for your time @@mjb2000 . Really appreciated.


mjb2000

This would be a perfect option for those without Intel CPUs or with older hardware. Pop in a $100 GTX 750 card and have transcoding offloaded to the dedicated encoder.

 

That is a great point brendabryg. I think this will definitely add a lot of value.

 

I have been able to build ffmpeg with nvenc support, but so far I am not able to use it (I get stream 0:0 errors similar to those experienced by others above who are having difficulties with h264_qsv).

 

I've contacted the author in case he can provide some tips on the commands required to make it work.

 

If I can get a result with this then we might have a bit of an issue when it comes to licensing. I have had to configure ffmpeg with "--enable-nonfree" which means it can't be distributed as a binary. For people to be able to use it they'd have to download the source code and compile it themselves.

 

If we do make more progress I will see if anyone here can read through and make sense of the nVidia SDK license to see whether this really can be distributed. (I think the fact that MB is free software works in our favour.)

 
