
GPU Transcoding (Intel QuickSync and nVidia NVENC)


witteschnitte


Gerrit507

I have a weird problem with live TV. One particular channel won't load, and it sometimes even crashes the whole server. The channel is "Sky Sport Bundesliga HD 2". All the other Sky Sport channels seem to work fine. The channel plays fine in Kodi and VLC. In the log you can see a ResourceNotFoundException right after I clicked play. This only happens with vaapi enabled.

 

Ubuntu Server 16.04, Kernel 4.8

Emby 3.2.50.7 beta (most recent deb)

TVHeadend Plugin 1.1.1.1

 

Thank you in advance

serverlog1.txt

Edited by Gerrit507

Gerrit507

Try refreshing the guide in case that might help.

It didn't help unfortunately

 

edit: my fault, the channel was off-air when I tried it the second time. I've also deleted and re-added the channel. It will be on air again today. Part of the problem could be that this channel is only available part of the time and sends nothing most of the time.

Edited by Gerrit507

Waldonnis

I ran across a situation that made me look into keyframe intervals when using hardware encoding and found some interesting stuff.  It looks like -force_key_frames is being ignored for the video stream when hardware encoders are used (at least on Windows; not sure about vaapi).  It seems that the default for both nvenc and qsv is an open GOP, so the GOP length can vary, which causes issues when segmenting if you're expecting a keyframe/new segment every 3s.

 

NVENC has a tidy solution with the -forced-idr option which seems to work well in light testing when also using -force_key_frames.  Example (forcing a keyframe every 1s instead of 3s for easier verification):

ffmpeg -i test.mkv -c:v h264_nvenc -forced-idr 1 -force_key_frames "expr:if(isnan(prev_forced_t),eq(t,t),gte(t,prev_forced_t+1))" out.mkv

I wasn't using segmenting when I ran across this, but gave it a shot to see what happened.  Segmenting without -forced-idr resulted in varied segment lengths, since the segmenter will only "cut" on GOP boundaries (the -segment_time is really just a suggestion that way unless you allow it to segment on non-keyframes).  Some segments generated from my sample without -forced-idr were 10s long, to give you an idea.
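
One way to check the keyframe spacing of an encode like the one above is to list the keyframe timestamps with ffprobe and print the gaps between them. This is only a sketch, not the method used in the post: the function names are made up for illustration, and it assumes ffprobe is on the PATH and that the decoder flags the forced frames as keyframes.

```shell
# Sketch: list keyframe timestamps (seconds) of the first video stream.
# keyframe_times is a hypothetical helper name; the flags are standard ffprobe options.
keyframe_times() {
  ffprobe -v error -select_streams v:0 -skip_frame nokey \
    -show_entries frame=pts_time -of csv=p=0 "$1"
}

# Sketch: read one timestamp per line on stdin and print the gap to the
# previous timestamp. -F, drops a trailing comma that csv output can carry.
keyframe_gaps() {
  awk -F, 'NF { t = $1 + 0; if (seen) printf "%.3f\n", t - prev; prev = t; seen = 1 }'
}

# Usage (out.mkv is the output file from the command above):
#   keyframe_times out.mkv | keyframe_gaps
```

With -forced-idr working as described, the printed gaps should all sit close to the requested interval; widely varying gaps would point to the variable GOP length mentioned above.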

 

I'm still looking for a way to do this with QuickSync without resorting to fixing GOP length with -g (it may be the only way, though, which makes things a little more complicated).
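
If fixing the GOP length with -g does turn out to be the only route, the value has to be derived from the source frame rate, which is what makes it more complicated. A minimal sketch of that arithmetic follows. gop_size is a hypothetical helper, and whether h264_qsv then produces keyframes exactly on that cadence is an assumption that was not tested here.

```shell
# Sketch: frames per GOP needed for one keyframe every N seconds.
# $1 = frame rate, either an integer (25) or a fraction (24000/1001)
# $2 = desired keyframe interval in seconds
gop_size() {
  awk -v fps="$1" -v s="$2" 'BEGIN {
    n = split(fps, p, "/")
    rate = (n == 2) ? p[1] / p[2] : p[1]
    printf "%d\n", rate * s + 0.5   # round to the nearest whole frame
  }'
}

# Usage (assumption: the encoder honours -g as a fixed GOP length):
#   ffmpeg -i test.mkv -c:v h264_qsv -g "$(gop_size 24000/1001 3)" out.mkv
```

Note that with fractional frame rates the rounded GOP length only approximates the target interval, so segment boundaries will drift slightly from exact 3s marks.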


Waldonnis

Zeranoe ffmpeg builds now include AMF encoding support.

 

h264_amf:

Encoder h264_amf [AMD AMF H.264 Encoder]:
    General capabilities: delay
    Threading capabilities: none
    Supported pixel formats: nv12 yuv420p d3d11
h264_amf AVOptions:
  -usage             <int>        E..V.... Encoder Usage (from 0 to 3) (default transcoding)
     transcoding                  E..V.... Generic Transcoding
     ultralowlatency              E..V....
     lowlatency                   E..V....
     webcam                       E..V.... Webcam
  -profile           <int>        E..V.... Profile (from 66 to 257) (default main)
     main                         E..V....
     high                         E..V....
     constrained_baseline              E..V....
     constrained_high              E..V....
  -level             <int>        E..V.... Profile Level (from 0 to 62) (default auto)
     auto                         E..V....
     1.0                          E..V....
     1.1                          E..V....
     1.2                          E..V....
     1.3                          E..V....
     2.0                          E..V....
     2.1                          E..V....
     2.2                          E..V....
     3.0                          E..V....
     3.1                          E..V....
     3.2                          E..V....
     4.0                          E..V....
     4.1                          E..V....
     4.2                          E..V....
     5.0                          E..V....
     5.1                          E..V....
     5.2                          E..V....
     6.0                          E..V....
     6.1                          E..V....
     6.2                          E..V....
  -quality           <int>        E..V.... Quality Preference (from 0 to 2) (default speed)
     speed                        E..V.... Prefer Speed
     balanced                     E..V.... Balanced
     quality                      E..V.... Prefer Quality
  -rc                <int>        E..V.... Rate Control Method (from -1 to 3) (default -1)
     cqp                          E..V.... Constant Quantization Parameter
     cbr                          E..V.... Constant Bitrate
     vbr_peak                     E..V.... Peak Contrained Variable Bitrate
     vbr_latency                  E..V.... Latency Constrained Variable Bitrate
  -enforce_hrd       <boolean>    E..V.... Enforce HRD (default false)
  -filler_data       <boolean>    E..V.... Filler Data Enable (default false)
  -vbaq              <boolean>    E..V.... Enable VBAQ (default false)
  -frame_skipping    <boolean>    E..V.... Rate Control Based Frame Skip (default false)
  -qp_i              <int>        E..V.... Quantization Parameter for I-Frame (from -1 to 51) (default -1)
  -qp_p              <int>        E..V.... Quantization Parameter for P-Frame (from -1 to 51) (default -1)
  -qp_b              <int>        E..V.... Quantization Parameter for B-Frame (from -1 to 51) (default -1)
  -preanalysis       <boolean>    E..V.... Pre-Analysis Mode (default false)
  -max_au_size       <int>        E..V.... Maximum Access Unit Size for rate control (in bits) (from 0 to INT_MAX) (default 0)
  -header_spacing    <int>        E..V.... Header Insertion Spacing (from -1 to 1000) (default -1)
  -bf_delta_qp       <int>        E..V.... B-Picture Delta QP (from -10 to 10) (default 4)
  -bf_ref            <boolean>    E..V.... Enable Reference to B-Frames (default true)
  -bf_ref_delta_qp   <int>        E..V.... Reference B-Picture Delta QP (from -10 to 10) (default 4)
  -intra_refresh_mb  <int>        E..V.... Intra Refresh MBs Number Per Slot in Macroblocks (from 0 to INT_MAX) (default 0)
  -coder             <int>        E..V.... Coding Type (from 0 to 2) (default auto)
     auto                         E..V.... Automatic
     cavlc                        E..V.... Context Adaptive Variable-Length Coding
     cabac                        E..V.... Context Adaptive Binary Arithmetic Coding
  -me_half_pel       <boolean>    E..V.... Enable ME Half Pixel (default true)
  -me_quarter_pel    <boolean>    E..V.... Enable ME Quarter Pixel (default true)
  -aud               <boolean>    E..V.... Inserts AU Delimiter NAL unit (default false)
  -log_to_dbg        <boolean>    E..V.... Enable AMF logging to debug output (default false)

...and hevc_amf:

Encoder hevc_amf [AMD AMF HEVC encoder]:
    General capabilities: delay
    Threading capabilities: none
    Supported pixel formats: nv12 yuv420p d3d11
hevc_amf AVOptions:
  -usage             <int>        E..V.... Set the encoding usage (from 0 to 3) (default transcoding)
     transcoding                  E..V....
     ultralowlatency              E..V....
     lowlatency                   E..V....
     webcam                       E..V....
  -profile           <int>        E..V.... Set the profile (default main) (from 1 to 1) (default main)
     main                         E..V....
  -profile_tier      <int>        E..V.... Set the profile tier (default main) (from 0 to 1) (default main)
     main                         E..V....
     high                         E..V....
  -level             <int>        E..V.... Set the encoding level (default auto) (from 0 to 186) (default auto)
     auto                         E..V....
     1.0                          E..V....
     2.0                          E..V....
     2.1                          E..V....
     3.0                          E..V....
     3.1                          E..V....
     4.0                          E..V....
     4.1                          E..V....
     5.0                          E..V....
     5.1                          E..V....
     5.2                          E..V....
     6.0                          E..V....
     6.1                          E..V....
     6.2                          E..V....
  -quality           <int>        E..V.... Set the encoding quality (from 0 to 10) (default speed)
     balanced                     E..V....
     speed                        E..V....
     quality                      E..V....
  -rc                <int>        E..V.... Set the rate control mode (from -1 to 3) (default -1)
     cqp                          E..V.... Constant Quantization Parameter
     cbr                          E..V.... Constant Bitrate
     vbr_peak                     E..V.... Peak Contrained Variable Bitrate
     vbr_latency                  E..V.... Latency Constrained Variable Bitrate
  -header_insertion_mode <int>        E..V.... Set header insertion mode (from 0 to 2) (default none)
     none                         E..V....
     gop                          E..V....
     idr                          E..V....
  -gops_per_idr      <int>        E..V.... GOPs per IDR 0-no IDR will be inserted (from 0 to INT_MAX) (default 60)
  -preanalysis       <boolean>    E..V.... Enable preanalysis (default false)
  -vbaq              <boolean>    E..V.... Enable VBAQ (default false)
  -enforce_hrd       <boolean>    E..V.... Enforce HRD (default false)
  -filler_data       <boolean>    E..V.... Filler Data Enable (default false)
  -max_au_size       <int>        E..V.... Maximum Access Unit Size for rate control (in bits) (from 0 to INT_MAX) (default 0)
  -min_qp_i          <int>        E..V.... min quantization parameter for I-frame (from -1 to 51) (default -1)
  -max_qp_i          <int>        E..V.... max quantization parameter for I-frame (from -1 to 51) (default -1)
  -min_qp_p          <int>        E..V.... min quantization parameter for P-frame (from -1 to 51) (default -1)
  -max_qp_p          <int>        E..V.... max quantization parameter for P-frame (from -1 to 51) (default -1)
  -qp_p              <int>        E..V.... quantization parameter for P-frame (from -1 to 51) (default -1)
  -qp_i              <int>        E..V.... quantization parameter for I-frame (from -1 to 51) (default -1)
  -skip_frame        <boolean>    E..V.... Rate Control Based Frame Skip (default false)
  -me_half_pel       <boolean>    E..V.... Enable ME Half Pixel (default true)
  -me_quarter_pel    <boolean>    E..V.... Enable ME Quarter Pixel  (default true)
  -aud               <boolean>    E..V.... Inserts AU Delimiter NAL unit (default false)
  -log_to_dbg        <boolean>    E..V.... Enable AMF logging to debug output (default false)

So far, it looks like it's only h.264 and HEVC (so no mpeg2 or vpx). I don't see any new decoders or hwaccels either, so they may just be relying on d3d11 for decoding acceleration so far.  I'm also a little curious about the short list of pixel formats in their HEVC implementation and wondering if Main 10 encoding is actually supported.  It doesn't appear to be, but it could be incomplete still and the library/sdk support may not be there yet.
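
To check whether a given ffmpeg build includes these encoders, the encoder list can be filtered for the _amf suffix. This is a small sketch: amf_encoders is a made-up helper name, and it assumes the usual `ffmpeg -encoders` layout where the encoder name is the second column.

```shell
# Sketch: read `ffmpeg -encoders` output on stdin and print the names
# of any encoders whose name ends in _amf.
amf_encoders() {
  awk '$2 ~ /_amf$/ { print $2 }'
}

# Usage:
#   ffmpeg -hide_banner -encoders | amf_encoders
# Option listings like the ones above come from:
#   ffmpeg -hide_banner -h encoder=h264_amf
#   ffmpeg -hide_banner -h encoder=hevc_amf
```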


  • 3 weeks later...
  • 2 weeks later...
Jennice

I have now moved my Emby server near my TV, and it displays the screen correctly. However, I still can't get my i7-8700K to offload the transcoding to its GPU.

With QuickSync enabled I get the same CPU load, plus stuttering.

Earlier, the need for an attached TV/monitor was mentioned, but that doesn't fix it for me.

 

I'm still on software transcoding... and for the record, I don't have other GPUs installed.


Guest asrequested


I wouldn't worry too much. With 6 cores and 12 threads, you have more than you need for transcoding.


Jennice


I know... 1st world problem. :)

 

I am just curious as to why it works so brilliantly for some, and fails miserably for others (like me).


Waldonnis


Intel's drivers are just finicky, and Windows is even more touchy that way.  I can see their rationale for not loading drivers for peripherals that aren't connected/active (security being pretty high on that list), but it can be inconvenient when the device in question has a secondary use like this.


Gerrit507


Use Linux and vaapi. I've never really got QS completely working in Windows. On Linux I haven't had any issues so far. It's working like a charm.


Jennice

It may work well on Linux, but with my complete lack of Linux knowledge, I'd be likely to make it fail. Also, I've got the CPU to deal with it, so QS on Win10 is just a nice-to-have. :)

Speaking of CPU power, my old 7MC computer (E8400 CPU, underclocked to 1 GHz due to cooling/fan noise) has a hard time handling the live stream from Emby. It dealt with the HD TV signal from 7MC fine, but not with the same channel when served from Emby.

Would I be OK to use the server (i7-8700K) as a client, while it still operates as the server for client transcodes?

My main reason for not going with a Shield or other "light" client is that I need the analogue audio out which the Windows/PC sound card gives me, and it enables me to switch between my TV (HDMI with audio) and a projector on VGA (analogue audio out to power amps). VGA is a compromise due to long cable runs which proved unreliable over HDMI and DVI, even with the "cheaper" extenders for HDMI over LAN cables.


Waldonnis

Bah, Linux has a learning curve, but it's worth it IMO.  I still boot over to Linux regularly because scripting some operations is just easier compared to Windows' rudimentary scripting support.  PowerShell is miles better than batch files, but having bash, perl, python, etc. available without having to install Cygwin or MinGW is just too nice, and compiling custom ffmpegs or other utilities is so much easier.  Admittedly, I'm an old *NIX salt, having dealt with it for over three decades now, but once you get the hang of it, it's hard to imagine life without it.  If you're curious, you can always do an install to a USB stick to play with it without committing a hard drive or partition to it, assuming your BIOS allows booting from USB devices.

 


Ugh, the long cable blues...I know it well.  How long is the cable run, if you don't mind me asking?

 

I'll defer the "client and server on the same machine" answers to people with hardware closer to yours that know the capabilities a bit better than I would.


Jennice


Hi Waldonnis,

 

I played a bit with Linux in the 90s, but it felt much less user-friendly at the time (compared to my needs). I know it's improved a lot since then, and some lab PCs at work run Linux.

I'm not the kind of person to make my own ffmpeg builds and such, so I haven't felt the need to dig into that level of detail of user control/possibilities.

I may give it a try on occasion, though, but not at the moment.

 

The HDMI cable is 15 meters I think, so it runs 1080i, and the projector is 20m VGA.


Waldonnis


That's a problematic distance for sure, especially for 1080p (even 1080i can be iffy depending on the cable), and without sinking more serious money into it, I can see how that would be hard to overcome.  I've seen the Cat6 extenders work pretty well, and fiber of course, but finding a good one isn't cheap (relatively) nor easy.  Interference can be a problem in certain environments as well, but it's generally not an issue.  Anyway, off-topic, but interesting stuff.


Gerrit507


Well, after installing Ubuntu you basically need 3 or 4 commands to set everything up, and you are done.

1. Download Emby
wget https://github.com/MediaBrowser/Emby/releases/download/3.2.60.0/emby-server-deb_3.2.60.0_amd64.deb

2. Install Emby
sudo dpkg -i emby-server-deb_3.2.60.0_amd64.deb

3. Install vaapi packages
sudo apt-get install i965-va-driver vainfo

4. Reboot the system and access the web interface as you know it...

That's it ;)

 

And a last note: Because your CPU is so new, make sure you have Kernel 4.8 or newer to have full compatibility
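
After the reboot, the setup can be sanity-checked from the shell. The following is a rough sketch rather than part of the original steps: both function names are made up, version_at_least relies on GNU `sort -V`, and the vainfo check assumes the driver reports H.264 encoding through an entrypoint whose name starts with VAEntrypointEncSlice.

```shell
# Sketch: succeed if version $1 is at least version $2 (needs GNU sort -V).
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Sketch: succeed if vainfo output on stdin lists an H.264 encode entrypoint.
has_h264_encode() {
  grep -q -E 'VAProfileH264.*VAEntrypointEncSlice'
}

# Usage on the server:
#   version_at_least "$(uname -r | cut -d- -f1)" 4.8 && echo "kernel OK"
#   vainfo 2>/dev/null | has_h264_encode && echo "H.264 VAAPI encode available"
```

If the second check fails, Emby has nothing to hand the encode to and will stay on software transcoding regardless of its settings.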

Edited by Gerrit507

Jennice


Well, then I also need to find tools to import EPG, etc. Maybe some day... but not right now. Thanks for the guide, though. :)


  • 2 weeks later...
Gerrit507

Hey Guys,

 

I switched to Firefox as my main browser lately because it somehow works better, faster and more efficiently on most websites.

I've found out that Emby doesn't work very well with Firefox, especially when HEVC and DTS are involved.

I've made a comparison with Chrome and Edge, where Chrome never seems to have any issues and Edge works randomly. Sometimes a movie transcodes fine on Edge, and the next day the transcode of the same movie is horribly slow.

I've snipped out some comparison pics of all browsers. I just don't understand the randomness of Edge and Firefox. For example, the same movie I watched on Edge last night without issues was now extremely slow in transcoding. The same goes for Firefox: if I pause the video and restart it from the same point, it sometimes starts transcoding fine.

edit: I just gave Edge another run and now it's blazing fast again, same movie and same bitrate. It just doesn't make any sense to me.

edit: Here are the log files from both Edge runs. The exact same transcoding properties were used...

post-191225-0-03548900-1517080283_thumb.jpg

post-191225-0-16851400-1517080288_thumb.jpg

post-191225-0-32810500-1517080295_thumb.jpg

post-191225-0-02055100-1517080392_thumb.jpg

post-191225-0-26323800-1517080396_thumb.jpg

post-191225-0-78216700-1517080429_thumb.jpg

post-191225-0-13703400-1517080755_thumb.jpg

log_edge_dts_1.txt

log_edge_dts_2.txt

Edited by Gerrit507

Gerrit507

Sounds like it was affected by whatever else was happening on your system at the time.

Well, the thing is, there is nothing happening on the system. The CPU is at 30% load and there is plenty of free RAM. It simply can't transcode fast enough to provide compatible media for Firefox, I guess. What I don't understand is that it's not reproducible at all, especially on Edge: most of the time it works, but sometimes it doesn't. What's also strange is that the same HEVC movie shows container mkv on Edge and container webm on Firefox as the original media info. Why?

Edited by Gerrit507

The browsers each support different formats, and we account for this by taking advantage of what is possible. Chrome and Edge can both direct play more frequently than Firefox.

 

Does this answer your question?


Gerrit507

Yes, I'm aware of that. I actually wanted to find out what's so taxing for the system when playing with Firefox.

All three browsers make Emby transcode HEVC to H.264. All three need transcoding when DTS is involved. Chrome gets mp3, Firefox gets aac and Edge gets ac3. Last but not least, Chrome and Firefox both get HLS and Edge gets "Video"...

So basically, for all three browsers everything has to be transcoded.  Why is it so slow for Firefox in particular, then? Does transcoding from dts to aac cut the whole performance in half? This just can't be.
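
One way to test the audio question in isolation would be to time an audio-only transcode for each target codec outside of Emby. This is only a sketch under assumptions: audio_bench is a made-up helper, movie.mkv is a placeholder file name, and the encoder names (libmp3lame, aac, ac3) are those of a stock ffmpeg build, which may differ from what Emby actually invokes.

```shell
# Sketch: transcode only the first audio track to the given encoder and
# discard the result, so the timing printed by -benchmark reflects audio alone.
# $1 = input file, $2 = ffmpeg audio encoder name
audio_bench() {
  ffmpeg -hide_banner -benchmark -i "$1" -map 0:a:0 -c:a "$2" -f null -
}

# Usage:
#   for enc in libmp3lame aac ac3; do
#     audio_bench movie.mkv "$enc"
#   done
```

If the three runs finish in similar time, the audio codec is unlikely to explain the slowdown, and the difference is more likely in the video path or the container/streaming mode.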


If I had to guess, it would appear the browser reports that it doesn't support mp3, and that's why Firefox is using aac and Chrome is using mp3.
