GPU Transcoding (Intel QuickSync and nVidia NVENC)


witteschnitte

mastrmind11

How do I confirm that transcoding is done on the GPU instead of the CPU?

The CPU load is lower.  You can also see it in the transcode log.


mastrmind11

Where are the logs on Windows?

Look at your server dashboard; there is a section called "Paths".
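For anyone who would rather check the transcode log with a script than by eye, here is a minimal sketch. It assumes the log contains the ffmpeg command line and that the video encoder is the last `-c:v` / `-codec:v` / `-vcodec` argument on it; the exact layout of Emby's log is an assumption on my part, and the function names are made up for illustration.

```python
import re

# ffmpeg hardware encoder suffixes and the API each one implies.
HW_SUFFIXES = {
    "nvenc": "NVIDIA NVENC",
    "qsv": "Intel QuickSync",
    "vaapi": "VAAPI",
    "amf": "AMD AMF",
}

# Matches the video codec argument of an ffmpeg command line,
# e.g. "-c:v h264_nvenc", "-codec:v:0 h264_qsv" or "-vcodec libx264".
CODEC_ARG = re.compile(r"-(?:c:v|codec:v|vcodec)(?::\d+)?\s+(\S+)")


def video_encoder(log_text):
    """Return the last video codec named in the log, or None.

    A codec option placed before the input selects a decoder, so the
    last occurrence is assumed to be the encoder.
    """
    matches = CODEC_ARG.findall(log_text)
    return matches[-1] if matches else None


def describe(log_text):
    """Classify a transcode log as hardware or software encoding."""
    encoder = video_encoder(log_text)
    if encoder is None:
        return "no video encoder found"
    suffix = encoder.rsplit("_", 1)[-1]
    if suffix in HW_SUFFIXES:
        return f"hardware ({HW_SUFFIXES[suffix]}): {encoder}"
    return f"software or copy: {encoder}"
```

Feed it the full text of a transcode log file, e.g. `describe(open(path).read())`. If it reports `libx264`, the server fell back to software encoding.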


  • 5 weeks later...
Waldonnis

Just dropping this in here for info, but it looks like some stuff was committed in the past couple of days for ffmpeg using AMF (AMD hardware acceleration):

 

Add HW H.264 and HEVC encoding for AMD GPUs based on AMF SDK

 

I have no idea if this commit was complete enough to get it working.  I also don't think I have the required header file, since I don't have the SDK installed and don't have the AMD hardware to test it anyway.  If someone manages to get a build together and has the hardware, though, I'd be curious to see if/how it works out.  FYI, definitely check the license on the header file before thinking of distributing or posting a build with it enabled, since I have no idea what license it's under.
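If someone does put a build together, a quick way to see whether the AMF encoders made it in is to look at what `ffmpeg -encoders` prints. This is only a sketch: the helper names are hypothetical, and it assumes the encoders are registered as `h264_amf` and `hevc_amf` and that rows have the usual "flags, name, description" layout.

```python
import subprocess


def amf_encoders(encoders_output):
    """Extract AMF video encoder names from `ffmpeg -encoders` output."""
    names = []
    for line in encoders_output.splitlines():
        fields = line.split()
        # Encoder rows look like: " V....D h264_amf   AMD AMF H.264 Encoder"
        if (
            len(fields) >= 2
            and fields[0].startswith("V")
            and fields[1].endswith("_amf")
        ):
            names.append(fields[1])
    return names


def installed_amf_encoders(ffmpeg="ffmpeg"):
    """Run the given ffmpeg binary and list the AMF encoders it offers."""
    result = subprocess.run(
        [ffmpeg, "-hide_banner", "-encoders"],
        capture_output=True,
        text=True,
        check=True,
    )
    return amf_encoders(result.stdout)
```

An empty list only tells you the build lacks the encoders; a non-empty list still says nothing about whether they work on your hardware.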


Jennice

I have now built my new i7-8700k based Emby rig, 16 GB RAM, SATA SSD, on-board GPU.

 

First tests indicate that QuickSync transcoding still causes more stutter than SW transcoding. The test material is 15 Mbps 1080i, transcoded to 720p at 2 Mbps.

 

The processor handles 3 transcodes at once, using SW transcode. It appears that hyperthreading is not that well utilized.

 

Does anyone know the maximum number of simultaneous QuickSync transcodes?


Waldonnis

I don't think there's a hard cap on simultaneous encoding sessions, but you're bound to run into a "soft limit" at some point (doubt it's as low as 3, though).

 

I'm not sure I understand what you meant about stuttering, though.  Is it not transcoding fast enough to prevent buffering, or is it a visual issue in the output?  I'd need to see a transcode log to know more about what's going on with that.  As for threading, I'm also confused as to what you meant about that.  Are you seeing load across all physical and virtual cores?  And is this with -threads 0?  Never mind, found your other post about this.


Jennice

Using QS, I just tend to get occasional freezing of the picture, just for a fraction of a second, but still annoying.

SW transcode works much better (I know, QS is not official).

Meanwhile, with my current settings (medium quality H264 preset, 1 transcoding thread set in Emby), I can transcode 4 channels at once (which is what my HD Homerun offers). CPU load is around 64%, but spread across all CPU threads/cores, except core 0.


Gerrit507

This could be helpful for anybody who uses hardware acceleration and runs into performance issues, especially QuickSync and VAAPI users:

Untick "Allow subtitle extraction on-the-fly" in the Transcoding options. After disabling this option, the transcoding speed at least quadrupled. I've tested this multiple times on a Pentium G4600 running Ubuntu 16.04 with VAAPI enabled, with repeatable results. The performance impact is there even when no subtitle is selected. In general, everything to do with subtitles seems to have a huge impact on performance when using VAAPI.

This could be a potential bug, @Luke.

Hi.  Did you read the text underneath that check box...?


Gerrit507

Yes, I did read it! It says nothing about performance going from perfect (70+ fps) to absolutely unusable (15 fps). Keep in mind that this option is enabled by default. As I said, there is a general problem with subtitles in VAAPI: every time I enable a subtitle, I get this performance drop.
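To compare runs with the option on and off, the fps figures can be pulled out of the transcode log rather than read off the screen. A rough sketch, assuming the log contains ffmpeg's usual progress lines (`frame= ... fps= ... speed=...x`); the function name is just for illustration.

```python
import re

# ffmpeg progress lines look like:
# frame= 1234 fps= 70 q=28.0 size=10240kB time=00:00:51.40 bitrate=... speed=2.92x
FPS = re.compile(r"\bfps=\s*([\d.]+)")
SPEED = re.compile(r"\bspeed=\s*([\d.]+)x")


def last_progress(log_text):
    """Return (fps, speed) from the last progress line, None where absent."""
    fps = FPS.findall(log_text)
    speed = SPEED.findall(log_text)
    return (
        float(fps[-1]) if fps else None,
        float(speed[-1]) if speed else None,
    )
```

A speed below 1.0 means the transcode is slower than playback, which is when the stalls start.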

Burning in subtitles is painful, so yes that can happen. We will update the help text.

Okay, this statement:

"On some systems this can take a long time and cause video playback to stall during the extraction process."

was attempting to convey the potential performance impact of the option.  I guess we need to make it more clear.  Thanks.


Gerrit507

Disabling it by default would help a lot I guess. Thank you.

 

I wasn't only talking about burning in subtitles. As soon as subtitles are transcoded for the web interface, performance drops whether they are burned in or not.


It is checked by default. Unfortunately, on-the-fly extraction is also very slow. That's why external subtitles are the most efficient option.


Yes, it is a bit of a catch-22 here because, depending on the exact makeup of your content and the subs, either checked or unchecked could be the slower option.


Gerrit507

Thank you for clearing that up. It just seems that VAAPI has issues with subtitles.


Jennice

I run my Emby server without monitor attached, and Intel's GPU settings show a "Virtual Display" as monitor when connecting over a remote desktop solution.

 

I can transcode 1 TV stream with QuickSync enabled. I don't know if it falls back to SW transcode; CPU load is 17%. But if I try to start a second transcode stream, it crashes the first one.

 

With QuickSync disabled, everything works, up to the 4 TV tuners I have connected (HDHomeRun) to my i7-8700k CPU...

 

I am close to giving up on QS for transcode, at least without a monitor attached.

 

Has anyone had luck tricking Win10 into thinking there's a display attached? I have seen HDMI dongles, but it's not worth that much money for me to experiment (haven't seen them locally in Denmark).

 

Jennice


Gerrit507

I've tested some video files to investigate the subtitle issue with VAAPI. Performance only drops with subtitles in PGS format; subtitles in SubRip format work well. When I disable the option in the settings and let the subtitles burn in, the transcode with PGS subs won't start at all. The affected files are UHD BD rips in MKV containers.

I've attached a log and media information. I hope it helps.
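To find out up front which files carry image-based subtitles such as PGS, ffprobe can list the subtitle codecs. A minimal sketch: the ffprobe options used are standard ones, but the helper names and the two codec sets are my own and are not exhaustive.

```python
import json
import subprocess

# Image-based formats have to be burned into the video; text-based ones do not.
IMAGE_CODECS = {"hdmv_pgs_subtitle", "dvd_subtitle", "dvb_subtitle"}
TEXT_CODECS = {"subrip", "ass", "ssa", "webvtt", "mov_text"}


def classify_subtitles(ffprobe_json):
    """Map stream index -> 'image', 'text' or 'unknown' from ffprobe JSON."""
    kinds = {}
    for stream in json.loads(ffprobe_json).get("streams", []):
        codec = stream.get("codec_name")
        if codec in IMAGE_CODECS:
            kinds[stream["index"]] = "image"
        elif codec in TEXT_CODECS:
            kinds[stream["index"]] = "text"
        else:
            kinds[stream["index"]] = "unknown"
    return kinds


def probe_subtitles(path, ffprobe="ffprobe"):
    """Run ffprobe on a media file and classify its subtitle streams."""
    cmd = [
        ffprobe, "-v", "error",
        "-select_streams", "s",
        "-show_entries", "stream=index,codec_name",
        "-of", "json",
        path,
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return classify_subtitles(result.stdout)
```

Streams reported as `image` are the ones likely to trigger the slow path described above.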

 

 

Attachments: post-191225-0-26788300-1513140402_thumb.jpg (media information), Log

Well they are burning in there, but at 17 fps that's not fast enough to be playable. Consider using text-based subtitle formats that don't require burning in.
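The "not fast enough" part is just the encode rate divided by the source frame rate. A tiny sketch of that arithmetic; the 24000/1001 (about 23.976 fps) source rate is an assumption for a typical UHD BD rip, not something taken from the attached log.

```python
def realtime_ratio(transcode_fps, source_fps):
    """How many seconds of video are produced per second of wall-clock time."""
    return transcode_fps / source_fps


def keeps_up(transcode_fps, source_fps, headroom=1.0):
    """True if the transcode is at least `headroom` times real time."""
    return realtime_ratio(transcode_fps, source_fps) >= headroom


# Assumed source frame rate for film content (about 23.976 fps).
FILM_FPS = 24000 / 1001

print(round(realtime_ratio(17, FILM_FPS), 2))  # 17 fps against film-rate source
print(keeps_up(17, FILM_FPS))
print(keeps_up(70, FILM_FPS))
```

In practice you want some headroom above 1.0 so that playback can build a buffer.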


Hi,

Just to let you know (and to congratulate you, of course) that GPU transcoding (NVENC) works well for me: CPU usage went from 50% to nearly 5%.

nvidia-smi shows a modest 25% GPU load for a 1080p 4 Mbps movie transcode at a rate of 70 to 90 fps.
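For logging these numbers over time instead of watching nvidia-smi by hand, the CSV query mode is easier to parse. A sketch, assuming the driver supports the `encoder.stats.*` query fields (older cards may report them as not available); the dictionary keys and function names are made up for illustration.

```python
import subprocess

QUERY = "name,utilization.gpu,encoder.stats.sessionCount,encoder.stats.averageFps"


def _number(text):
    """Convert a CSV cell to float, or None for values like '[N/A]'."""
    try:
        return float(text)
    except ValueError:
        return None


def parse_row(row):
    """Parse one line of `--format=csv,noheader,nounits` output."""
    name, gpu, sessions, fps = [cell.strip() for cell in row.split(",")]
    return {
        "name": name,
        "gpu_util_percent": _number(gpu),
        "encoder_sessions": _number(sessions),
        "encoder_avg_fps": _number(fps),
    }


def query_gpus(nvidia_smi="nvidia-smi"):
    """Run nvidia-smi once and return one dict per GPU."""
    result = subprocess.run(
        [nvidia_smi, f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [parse_row(line) for line in result.stdout.splitlines() if line.strip()]
```

An `encoder_sessions` value of 0 while a transcode is running is a strong hint that the encode is happening on the CPU.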

So it works fine except in two cases:

when the video is H.265 or UHD (even in H.264), it goes full CPU (and for UHD in H.265 it barely reaches 15 fps while transcoding).

But it appears that this is a limitation of my GPU (NVENC/NVDEC), judging by https://developer.nvidia.com/video-encode-decode-gpu-support-matrix

Can you confirm?

Do I need to upgrade to a more recent/powerful GPU like a Quadro P2000, and will Emby handle it?

System: Ubuntu

Dual Xeon X5675 with 48 GB

Transcode folder: dedicated 120 GB SSD

GPU: Quadro K2000

 

 

Best regards!


Gerrit507

Hi,

 

Yes, the K2000 is a Kepler card, and Kepler cards can't do HEVC decoding. You need a Pascal card, or one of the very few Maxwell cards that can also handle HEVC. I've used a 1050 Ti on Windows and HEVC worked; I haven't tested NVENC on Linux yet, but it should work too.

There was one format, though, where NVENC struggled; I think it's a bug in ffmpeg. It happens with HDR content that uses interlaced chroma subsampling, a.k.a. 4:2:0 Type 2, as found on HDR Blu-rays. This format produced artifacts and the decode didn't work.

Another limitation is that all GeForce cards and the cheaper Quadros can only handle 2 encodes at the same time. You would need at least a P2000.
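The rules of thumb above can be written down as a small lookup. This is only a sketch of what is stated in this post: the table is not the official support matrix, the names are hypothetical, and the session limit is taken as 2 from the text (it may differ with other drivers or cards, so it is a parameter).

```python
# HEVC decode support per architecture, as described in this thread.
# Maxwell is left out on purpose: only a few Maxwell cards support it,
# so the answer depends on the specific card.
HEVC_DECODE = {"kepler": False, "pascal": True}


def can_decode_hevc(architecture):
    """Return True/False for known architectures, None if it depends on the card."""
    return HEVC_DECODE.get(architecture.lower())


def exceeds_session_limit(simultaneous_encodes, session_limit=2):
    """True if the wanted number of encodes is above the card's session limit."""
    return simultaneous_encodes > session_limit


print(can_decode_hevc("Kepler"))   # Quadro K2000
print(can_decode_hevc("Maxwell"))  # depends on the exact card
print(exceeds_session_limit(4))    # e.g. four simultaneous streams
```

For a real purchase decision, check the exact card against NVIDIA's support matrix linked earlier in the thread.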

Ok thanks!

Yeah, I was aware of the stream limitation on GeForce and cheap Quadro cards; that's why I'm looking at the 2000 series, which doesn't have this artificial limitation.

Thanks a lot!

Best regards


Gerrit507

Honestly, the much cheaper option is an Intel CPU with integrated graphics. You get the HD 630 from the Pentium G4600 upwards, and it costs only about 70€ here in Germany. The transcoding capabilities of this little thing are amazing: it handles 3-4 HEVC streams and about 8 H.264 streams at the same time. I'm also not seeing the "bug" with interlaced chroma subsampling.

I need this kind of CPU for other purposes ;)

It would cost me a lot to get something similar in a more recent Intel CPU.
