GPU Transcoding (Intel QuickSync and nVidia NVENC)

GPU quicksync transcoding hardware acceleration

1453 replies to this topic

#81 mjb2000 OFFLINE  

mjb2000

    Advanced Member

  • Members
  • 98 posts
  • Local time: 06:23 PM
  • LocationUnited Kingdom

Posted 13 January 2015 - 09:32 AM

Just curious how this will eventually work. When QuickSync is enabled, is that the only form of transcoding that will be done?

Or will the Media Browser server be able to use x264 and QuickSync as needed?

My 4670K handles multiple streams very smoothly, but it has high CPU usage for the first 10 minutes of a movie.

I know using QuickSync in HandBrake drops the CPU usage from 100% to 30%, but I do not know how well QuickSync can handle multiple streams.

 

AFAIK It's not possible to seamlessly switch between libx264 and QuickSync within a single stream, so if both technologies were to be implemented simultaneously, then it would probably have to be along the lines of counting the number of transcode tasks and if it's above a certain level then launch the next transcode task using a different codec.
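
As a rough illustration of that idea, here is a minimal Python sketch of choosing the encoder for each new transcode task. The function name, the `threshold` parameter and the default of trying libx264 first are assumptions made for the example; this is not how Media Browser actually selects an encoder.

```python
def pick_encoder(active_transcodes, threshold,
                 primary="libx264", overflow="h264_qsv"):
    """Choose the encoder for the NEXT transcode task (hypothetical helper).

    A stream keeps the encoder it started with, since switching mid-stream
    is not seamless. Only new tasks are routed to the overflow encoder once
    `threshold` tasks are already running.
    """
    if active_transcodes < 0 or threshold < 0:
        raise ValueError("counts must be non-negative")
    return primary if active_transcodes < threshold else overflow


# Example: allow two software transcodes, then hand new tasks to QuickSync.
for running in range(4):
    print(running, pick_encoder(running, threshold=2))
```

Swapping the `primary` and `overflow` arguments gives the opposite policy (hardware first, software once the GPU is busy), which could be exposed as a user setting.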

 

In terms of the spike at the start, I guess this is because ffmpeg tries to encode the movie as fast as possible. @Luke - Under what circumstances is the -re option used (for real-time encoding), and what is the argument against using it for all transcodes?

 

My very basic J1900 chip can handle at least 3 simultaneous streams - but there is a software fallback included within QuickSync, so where the GPU hardware is unavailable the video is encoded using a software encoder (although I doubt it is as good as x264). I haven't tested this feature as I don't know how to pretend the GPU is busy and force h264_qsv to use software encoding.


  • dark_slayer and Wirerat like this

#82 ajplante OFFLINE  

ajplante

    Member

  • Members
  • 11 posts

Posted 13 January 2015 - 03:04 PM

Looks cool but then I saw it was based on Intel technology so I guess there is no chance of supporting this with AMD CPU's?



#83 mjb2000 OFFLINE  

mjb2000

    Advanced Member

  • Members
  • 98 posts
  • Local time: 06:23 PM
  • LocationUnited Kingdom

Posted 13 January 2015 - 03:51 PM

Sorry to have got your hopes up (I didn't start the original thread, so can't rename it I'm afraid!).

 

A few people have been looking for options involving OpenCL, which would be supported by AMD and other GPUs. I spent the afternoon looking into it but couldn't get some of the various prerequisites to work within my build environment, so for now I don't think it's going to happen. To be honest with you I'm no expert at this stuff, I just pulled together a few resources from some GitHub projects and was able to get QuickSync working - hopefully there are others out there who can take up the reins to extend GPU encoding to other platforms and chips.

 

One thing I'd say is that it seems OpenCL will only bring modest performance improvements, since it doesn't accelerate the h264 encoding process itself. Reading around, it seems AMD do have a solution, but I'm not sure if it's possible to integrate it into ffmpeg.


  • denethor likes this

#84 Wirerat OFFLINE  

Wirerat

    Newbie

  • Members
  • 2 posts

Posted 13 January 2015 - 04:49 PM

AFAIK It's not possible to seamlessly switch between libx264 and QuickSync within a single stream, so if both technologies were to be implemented simultaneously, then it would probably have to be along the lines of counting the number of transcode tasks and if it's above a certain level then launch the next transcode task using a different codec.

In terms of the spike at the start, I guess this is because ffmpeg tries to encode the movie as fast as possible. @Luke - Under what circumstances is the -re option used (for real-time encoding), and what is the argument against using it for all transcodes?

My very basic J1900 chip can handle at least 3 simultaneous streams - but there is a software fallback included within QuickSync, so where the GPU hardware is unavailable the video is encoded using a software encoder (although I doubt it is as good as x264). I haven't tested this feature as I don't know how to pretend the GPU is busy and force h264_qsv to use software encoding.

Thanks for the reply.

Well, the spike in usage is OK. It never stutters or lags even with 5 users (2 usually direct stream).

That was what I had in mind. It would be great if the CPU could run a few transcodes with x264 (regular) and then another using QuickSync if needed, even if that was a setting the user could control based on system performance/needs.

Edited by Wirerat, 13 January 2015 - 04:57 PM.


#85 mylle OFFLINE  

mylle

    Member

  • Members
  • 21 posts
  • Local time: 06:23 PM

Posted 13 January 2015 - 05:22 PM

Still not working for me :(

 

Metadata:
    encoder         : libebml v0.7.9 + libmatroska v0.8.1
    creation_time   : 2010-01-18 14:26:10
  Duration: 02:37:49.47, start: 0.000000, bitrate: 7934 kb/s
    Stream #0:0: Video: h264 (High), yuv420p, 1280x534 [SAR 1:1 DAR 640:267], 23.98 fps, 23.98 tbr, 1k tbn, 47.95 tbc (default)
    Stream #0:1(eng): Audio: dts (DTS), 48000 Hz, 5.1(side), fltp, 1536 kb/s (default)
    Metadata:
      title           : English  DTS 5.1 @ 1.5 Mbps
    Stream #0:2(eng): Audio: ac3, 48000 Hz, stereo, fltp, 192 kb/s
    Metadata:
      title           : English Comment AC3 2.0 @ 192kbps
[h264_qsv @ 057fac00] MFXInit(): -3
Output #0, hls, to 'C:\Users\Administrator\AppData\Roaming\MediaBrowser-Server\transcoding-temp\streaming\e19d0293c598498dcdd0de51659f5344.m3u8':
    Stream #0:0: Video: h264, none, q=2-31, 128 kb/s, SAR 1:1 DAR 0:0, 23.98 fps (default)
    Metadata:
      encoder         : Lavc56.14.100 h264_qsv
    Stream #0:1: Audio: aac, 0 channels, 128 kb/s (default)
    Metadata:
      encoder         : Lavc56.14.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> h264 (h264_qsv))
  Stream #0:1 -> #0:1 (dts (dca) -> aac (native))
Error while opening encoder for output stream #0:0 - maybe incorrect parameters such as bit_rate, rate, width or height


#86 mjb2000 OFFLINE  

mjb2000

    Advanced Member

  • Members
  • 98 posts
  • Local time: 06:23 PM
  • LocationUnited Kingdom

Posted 13 January 2015 - 05:38 PM

 


 

 

Can you include the ffmpeg command line that was executed (at the top of the log file - Obscure the file name if you wish).

 

Also - Are you using the latest version of ffmpeg (which I uploaded around 9 hours ago?)



#87 mylle OFFLINE  

mylle

    Member

  • Members
  • 21 posts
  • Local time: 06:23 PM

Posted 13 January 2015 - 06:06 PM

Did not have the new ffmpeg. Updated and here is another run. It always creates 2 identical transcoding logs:

 

C:\Users\Administrator\AppData\Roaming\MediaBrowser-Server\ffmpeg\20150110\ffmpeg.exe -fflags +genpts -i "http://localhost:809...64325505c64b1b9" -map_metadata -1 -threads 0 -map 0:0 -map 0:1 -map -0:s -codec:v:0 h264_qsv  -maxrate 5362000 -bufsize 10724000 -vsync vfr -level 40 -force_key_frames expr:gte(t,n_forced*6) -vf "scale=trunc(min(iw\,1920)/2)*2:trunc(min((iw/dar)\,1080)/2)*2" -copyts -flags -global_header -codec:a:0 aac -strict experimental -ac 6 -ab 510000 -af "adelay=1,aresample=async=1" -hls_time 6 -start_number 0 -hls_list_size 0 -y "C:\Users\Administrator\AppData\Roaming\MediaBrowser-Server\transcoding-temp\streaming\7da9824315f8bcb22d40a8309caa68ef.m3u8"
 
 
ffmpeg version N-68994-g713e3bb Copyright © 2000-2015 the FFmpeg developers
  built on Jan 13 2015 13:06:15 with gcc 4.9.2 (Rev2, Built by MSYS2 project)
  configuration: --enable-libmfx --enable-iconv --arch=x86 --disable-debug --disable-shared --disable-doc --disable-w32threads --enable-gpl --enable-version3 --enable-runtime-cpudetect --enable-avfilter --enable-bzlib --enable-zlib --enable-decklink --enable-librtmp --enable-gnutls --enable-avisynth --enable-frei0r --enable-filter=frei0r --enable-libbluray --enable-libcaca --enable-libopenjpeg --enable-fontconfig --enable-libfreetype --enable-libass --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libvo-amrwbenc --enable-libschroedinger --enable-libsoxr --enable-libtwolame --enable-libspeex --enable-libtheora --enable-libutvideo --enable-libvorbis --enable-libvo-aacenc --enable-libopus --enable-libvidstab --enable-libvpx --enable-libwavpack --enable-libxavs --enable-libx264 --enable-libx265 --enable-libxvid --enable-libzvbi
  libavutil      54. 16.100 / 54. 16.100
  libavcodec     56. 20.100 / 56. 20.100
  libavformat    56. 18.100 / 56. 18.100
  libavdevice    56.  3.100 / 56.  3.100
  libavfilter     5.  7.100 /  5.  7.100
  libswscale      3.  1.101 /  3.  1.101
  libswresample   1.  1.100 /  1.  1.100
  libpostproc    53.  3.100 / 53.  3.100
Input #0, matroska,webm, from 'http://localhost:809...325505c64b1b9':
  Metadata:
    encoder         : libebml v0.7.9 + libmatroska v0.8.1
    creation_time   : 2010-07-03 01:38:03
  Duration: 01:41:36.13, start: 0.000000, bitrate: 6150 kb/s
    Stream #0:0(eng): Video: h264 (High), yuv420p, 1280x544, SAR 1:1 DAR 40:17, 23.98 fps, 23.98 tbr, 1k tbn, 47.95 tbc (default)
    Stream #0:1(eng): Audio: dts (DTS), 48000 Hz, 5.1(side), fltp, 1536 kb/s (default)
    Stream #0:2(eng): Subtitle: subrip (default)
[h264_qsv @ 05a16f00] MFXInit(): -3
Output #0, hls, to 'C:\Users\Administrator\AppData\Roaming\MediaBrowser-Server\transcoding-temp\streaming\7da9824315f8bcb22d40a8309caa68ef.m3u8':
    Stream #0:0: Video: h264, none, q=2-31, 128 kb/s, SAR 1:1 DAR 0:0, 23.98 fps (default)
    Metadata:
      encoder         : Lavc56.20.100 h264_qsv
    Stream #0:1: Audio: aac, 0 channels, 128 kb/s (default)
    Metadata:
      encoder         : Lavc56.20.100 aac
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> h264 (h264_qsv))
  Stream #0:1 -> #0:1 (dts (dca) -> aac (native))
Error while opening encoder for output stream #0:0 - maybe incorrect parameters such as bit_rate, rate, width or height


#88 mjb2000 OFFLINE  

mjb2000

    Advanced Member

  • Members
  • 98 posts
  • Local time: 06:23 PM
  • LocationUnited Kingdom

Posted 13 January 2015 - 06:12 PM

That's interesting... 

 

See the "/2)*2" in the scale function - that suggests this is a slightly older version of the MediaBrowser Dev branch. Can you check whether you are on the latest dev build, which included some changes for QuickSync handling? (It is possible you are on the latest version and I missed a function in the source code and need to update something else.) FYI, I am looking for that number "2" to be "32" if the command is being issued correctly for QSV.
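
To show what changing that number does, here is a small Python sketch that mirrors the arithmetic of the scale expression from the log. The function name and parameters are made up for illustration, and it only reproduces the maths, not ffmpeg itself. Fractions are used so the aspect-ratio division is exact.

```python
import math
from fractions import Fraction


def scaled_size(iw, dar, multiple, max_w=1920, max_h=1080):
    """Mirror scale=trunc(min(iw,1920)/M)*M : trunc(min(iw/dar,1080)/M)*M.

    `multiple` is M (2 in the log above, 32 for the QSV variant).
    floor() matches trunc() here because all values are positive.
    """
    def round_down(value):
        return math.floor(Fraction(value) / multiple) * multiple

    width = round_down(min(iw, max_w))
    height = round_down(min(Fraction(iw) / Fraction(dar), max_h))
    return width, height


# The two source files from the logs in this thread:
# 1280x544 (DAR 40:17) and 1280x534 (DAR 640:267).
print(scaled_size(1280, Fraction(40, 17), 2))     # (1280, 544)
print(scaled_size(1280, Fraction(40, 17), 32))    # (1280, 544)
print(scaled_size(1280, Fraction(640, 267), 2))   # (1280, 534)
print(scaled_size(1280, Fraction(640, 267), 32))  # (1280, 512)
```

So the 1280x544 file already has dimensions that are multiples of 32, while the 1280x534 file would be scaled down to 512 lines once the rounding uses 32.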

 

If you are on the latest version can you confirm what device you are streaming to?

 

M



#89 mylle OFFLINE  

mylle

    Member

  • Members
  • 21 posts
  • Local time: 06:23 PM

Posted 13 January 2015 - 06:20 PM

I'm on 3.0.5491.1249, streaming to a Nexus 5.

#90 mjb2000 OFFLINE  

mjb2000

    Advanced Member

  • Members
  • 98 posts
  • Local time: 06:23 PM
  • LocationUnited Kingdom

Posted 13 January 2015 - 06:22 PM

OK - That should be working, so I must have missed something. I will take a look at the functions that are running here and get this pushed up for the next Dev release...



#91 mylle OFFLINE  

mylle

    Member

  • Members
  • 21 posts
  • Local time: 06:23 PM

Posted 13 January 2015 - 06:23 PM

Thank you :)

#92 dark_slayer OFFLINE  

dark_slayer

    Advanced Member

  • Developers
  • 721 posts
  • Local time: 06:23 PM

Posted 13 January 2015 - 07:03 PM

One thing I'd say is that it seems OpenCL will only bring modest performance improvements, since it doesn't accelerate the h264 encoding process itself.


mjb2000 this is just awesome that you've pulled this together without being an expert

As someone who's not an expert by any means myself it's quite inspiring. Also, without knowing this for certain, QS was designed by Intel to be a killer video encoder. It's a hardware circuit or ASIC (AFAIK) included in some of their processors.

If it was just GPU offloading there wouldn't be much interesting going on since the Intel GPU is weak as pee. This is similar in my mind to those who were GPU mining bitcoins with AMD GPUs. Even a lowly 6870 from AMD could beat the Titan from nvidia because AMD had a hardware circuit random number generator that kills at bitcoin hash generations. Same effect here as I see it, and simple GPU offloading won't offer as great of results without an overly powerful GPU. Long story short . . . I agree with your findings on opencl for all of those conjectures


Finally, nice wiki as well. Also noticed the mixed results in windows 8 comment, and it reminds me of how plexHT and their android app force a frame size at different bit rate settings. Off the top of my head it's something like 480p for 1Mbps and 720p for 2-4Mbps. They won't allow the frame to stay 1080p until you get up to around 6Mbps I think.

#93 mjb2000 OFFLINE  

mjb2000

    Advanced Member

  • Members
  • 98 posts
  • Local time: 06:23 PM
  • LocationUnited Kingdom

Posted 13 January 2015 - 08:17 PM

Cheers :)

 

Also, without knowing this for certain, QS was designed by Intel to be a killer video encoder. It's a hardware circuit or ASIC (AFAIK) included in some of their processors.

 

I think you're right, it's not just the GPU doing some number crunching - It's some dedicated hardware in there that it is able to offload h264 encoding to. This explains why it's only able to help with h264 and MPEG2 and also why other Intel chips with GPUs aren't able to do QuickSync (they have the GPU but not the extra bit required for hardware encoding).

 

Also noticed the mixed results in windows 8 comment, and it reminds me of how plexHT and their android app force a frame size at different bit rate settings. Off the top of my head it's something like 480p for 1Mbps and 720p for 2-4Mbps. They won't allow the frame to stay 1080p until you get up to around 6Mbps I think.

 

I don't think it is related - the issues don't come from bandwidth vs frame size. It just seems that if ffmpeg passes the video straight to QuickSync without processing it first (scaling it), then every so often the output video jumps back exactly 4 frames - very weird. It should be possible to resolve though, as these issues don't exist in HandBrake.

 

As for Plex matching frame size with bitrate, it does kind of make sense. For an average video there will always be a Bits Per Pixel (bpp) sweet spot, but then every video is different - if there is very little motion in a video then 6 Mbps at 1080p might be fine, but in an action movie a very blocky encode at 1080p@6 Mbps will probably look a lot worse than 720p@6 Mbps. If you could do true variable bit rate (2 pass) then you could allocate more bandwidth to the brief action scenes, and 6 Mbps averaged over the length of the entire video would probably be OK - but this wouldn't work for streaming, where we can't have large spikes in the bitrate. So although the lack of control is frustrating, I can understand their reasoning behind it.
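
To put rough numbers on the bpp idea, here is a quick Python sketch. The helper name is made up, and the 23.976 fps frame rate is an assumption (taken from the 23.98 fps sources in the logs above), so treat the figures as illustrative only.

```python
def bits_per_pixel(bitrate_bps, width, height, fps):
    """Average number of bits available per pixel per frame."""
    return bitrate_bps / (width * height * fps)


# Same 6 Mbps budget at two frame sizes, assuming 23.976 fps.
sizes = {"1080p": (1920, 1080), "720p": (1280, 720)}
for label, (w, h) in sizes.items():
    bpp = bits_per_pixel(6_000_000, w, h, 23.976)
    print(f"{label}: {bpp:.3f} bits per pixel")
```

At the same bitrate, 720p gets 2.25 times as many bits per pixel as 1080p (roughly 0.27 vs 0.12 here), which is why the smaller frame can look cleaner in high-motion scenes.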


  • dark_slayer likes this

#94 Luke OFFLINE  

Luke

    System Architect

  • Administrators
  • 134369 posts
  • Local time: 02:23 PM

Posted 14 January 2015 - 12:15 AM

for the next dev build i will have added most of your changes. i am integrating them manually so i can review them closely. i have not looked at the rounding stuff yet, but everything else will  be there.


  • dark_slayer and mjb2000 like this

#95 mjb2000 OFFLINE  

mjb2000

    Advanced Member

  • Members
  • 98 posts
  • Local time: 06:23 PM
  • LocationUnited Kingdom

Posted 14 January 2015 - 07:14 AM

I'm on 3.0.5491.1249, streaming to a Nexus 5.

 

Sorry, mylle, I got confused. As Luke mentioned above, my code has kind of been accepted, but has not yet made it in to the latest Dev release. When this happens I'll let you know and update the Wiki

 

M



#96 brendabryg OFFLINE  

brendabryg

    Newbie

  • Members
  • 4 posts
  • Local time: 06:23 PM

Posted 14 January 2015 - 10:52 AM

Sorry to have got your hopes up (I didn't start the original thread, so can't rename it I'm afraid!).

 

A few people have been looking for options involving OpenCL, which would be supported by AMD and other GPUs. I spent the afternoon looking into it but couldn't get some of the various prerequisites to work within my build environment, so for now I don't think it's going to happen. To be honest with you I'm no expert at this stuff, I just pulled together a few resources from some GitHub projects and was able to get QuickSync working - hopefully there are others out there who can take up the reins to extend GPU encoding to other platforms and chips.

 

One thing I'd say is that it seems OpenCL will only bring modest performance improvements, since it doesn't accelerate the h264 encoding process itself. Reading around, it seems AMD do have a solution, but I'm not sure if it's possible to integrate it into ffmpeg.

 

Hi, I've been watching this thread for a while now and registered to suggest another avenue. How about Nvidia's NVENC encoder? It is similar to QuickSync in that it is a dedicated hardware ASIC in their newest GPUs (Maxwell and up) for real-time transcoding. This dedicated hardware doesn't use the main part of the GPU at all, leaving it free for any other tasks.

 

There are some ffmpeg builds with NVENC now. My research indicates it is somewhere in between QuickSync and OpenCL as far as speed and quality go. This would be a perfect option for those without Intel CPUs or with older hardware. Pop in a $100 GTX 750 card and have transcoding offloaded to the dedicated encoder. One link mentioned 230 fps for a GTX 750 Ti encoding 1080p. That's good for a few simultaneous encodes.
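
As a back-of-the-envelope check on "a few simultaneous encodes", here is a tiny Python sketch. It assumes the encoder's throughput divides evenly between streams and ignores decoding, scaling and audio, so it is an optimistic upper bound. The 230 fps figure is the one quoted above, and the helper name is made up.

```python
import math


def realtime_streams(encoder_fps, source_fps):
    """Upper bound on streams an encoder could keep up with in real time."""
    return math.floor(encoder_fps / source_fps)


# 230 fps of 1080p encoding shared between 23.976 fps movies.
print(realtime_streams(230, 23.976))  # 9
```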

 

Here's a few links from my research:

https://forums.plex....-in-transcoder/

 

http://ffmpeg.org/do..._8c_source.html

 

http://blog.medialooks.com/814EAo

 

http://forum.doom9.o...t=170915&page=2

 

Is this something that could be integrated similar to how you have done for quicksync?  I don't have a nvenc capable card in my main mediabrowser server right now, but do in another machine I could possibly test with.  Thanks for getting this hardware transcoding thing going again.



#97 dark_slayer OFFLINE  

dark_slayer

    Advanced Member

  • Developers
  • 721 posts
  • Local time: 06:23 PM

Posted 14 January 2015 - 02:37 PM

That makes a lot of sense. My gaming PC has a maxwell GPU and it must encode like a beast. This was part of their game streaming addition, and this must be why the specific set of GPUs must be in use for their game streaming to work. I just tried it out with limelight on my Nexus Player and I left the game settings high enough to stress the GPU but it didn't impact streaming as you said. Didn't know that's what was going on with nvidia

#98 denethor OFFLINE  

denethor

    Advanced Member

  • Members
  • 338 posts
  • Local time: 09:23 PM
  • LocationIstanbul,TR

Posted 14 January 2015 - 04:06 PM

Sorry to have got your hopes up (I didn't start the original thread, so can't rename it I'm afraid!).

 

A few people have been looking for options involving OpenCL, which would be supported by AMD and other GPUs. I spent the afternoon looking into it but couldn't get some of the various prerequisites to work within my build environment,

Thank you for your time @mjb2000 . Really appreciated.


  • mjb2000 likes this

#99 mjb2000 OFFLINE  

mjb2000

    Advanced Member

  • Members
  • 98 posts
  • Local time: 06:23 PM
  • LocationUnited Kingdom

Posted 14 January 2015 - 05:10 PM

This would be a perfect option for those without Intel CPUs or with older hardware. Pop in a $100 GTX 750 card and have transcoding offloaded to the dedicated encoder.

 

That is a great point brendabryg. I think this will definitely add a lot of value.

 

I have been able to build ffmpeg with nvenc support, but so far I am not able to use it (I get stream 0:0 errors similar to those experienced by others above who are having difficulties with h264_qsv).

 

I've contacted the author in case he can provide some tips on the commands required to make it work.

 

If I can get a result with this then we might have a bit of an issue when it comes to licensing. I have had to configure ffmpeg with "--enable-nonfree" which means it can't be distributed as a binary. For people to be able to use it they'd have to download the source code and compile it themselves.

 

If we do make more progress I will see if anyone here can read through and make sense of the nVidia SDK license to see whether this really can be distributed. (I think the fact that MB is free software works in our favour.)

 


  • dark_slayer and brendabryg like this

#100 Luke OFFLINE  

Luke

    System Architect

  • Administrators
  • 134369 posts
  • Local time: 02:23 PM

Posted 14 January 2015 - 06:41 PM

the changes i made are in the dev build i posted last night


  • mjb2000 likes this




