
GPU Transcoding (Intel QuickSync and nVidia NVENC)


witteschnitte


lifespeed

Rather than look at a new CPU, why not consider an NVIDIA Pascal video card?

 

Because hardware-accelerated transcoding is still in beta, and reliability of the media server may be an important consideration.


I don't think it's coming out of beta anytime soon, but personally I find NVENC works better than QSV. It's a fairly low-cost experiment, and if it doesn't work you can always resell the card for a few dollars less than what you paid.


Guest asrequested

 

Can someone please show an example of what a 7th-gen i5 or i7 is capable of transcoding (at stock clock frequencies, not on LN2-cooled monsters)?   :D

Personally, my 3770 only manages 720p at 2 to 2.5 Mbit/s before it starts to stutter occasionally. The CPU usage meter in Win10 shows 40% to 50% load, but going higher causes stutter even though the CPU load monitor never hits 100% (!?).

Could it be that the load monitoring is just too slow to catch transcoding peak loads?

 

Is there any way to show the transcoding speed relative to "real time" (obviously this should never drop below a factor of 1)?
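
For what it's worth, ffmpeg prints a "speed=" field in its progress line, which is the transcode rate relative to real time. Below is a minimal Python sketch that pulls that value out of a log line. The sample line is made up for illustration and does not come from any log in this thread, and the helper names parse_speed and keeps_up are hypothetical.

```python
import re
from typing import Optional

# Matches the "speed=1.23x" field in an ffmpeg progress line.
# ffmpeg may pad the number with spaces, and prints "speed=N/A" early on.
_SPEED_RE = re.compile(r"speed=\s*([0-9]*\.?[0-9]+)x")


def parse_speed(line: str) -> Optional[float]:
    """Return the speed factor from one progress line, or None if absent."""
    match = _SPEED_RE.search(line)
    return float(match.group(1)) if match else None


def keeps_up(line: str) -> Optional[bool]:
    """True if the transcode runs at real time or faster (factor >= 1)."""
    speed = parse_speed(line)
    return None if speed is None else speed >= 1.0


# Illustrative progress line (an assumption, not taken from a real log).
sample = (
    "frame= 1200 fps= 48 q=23.0 size=   20480kB "
    "time=00:00:50.00 bitrate=3355.4kbits/s speed=2.0x"
)
print(parse_speed(sample), keeps_up(sample))
```

A factor that stays below 1 for any length of time means the client will eventually run out of buffer and stutter, whatever the CPU meter says.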

 

 

FWIW I set the bitrate to 10 Mb/s on 3 devices and played three movies. My server has an i7 6700K with Z170 chipset. Here are the results.

 

59c9d7371f1ce_Snapshot_233.jpg

 

 

With three transcodes.

 

59c9b85a07045_Servertranscoding3.png

 

 

With one transcode

 

59c9b8c811d63_Servertranscoding15.jpg

 

When starting the second transcode, the CPU maxes out. But the GPU still has room for more. I suspect that if I used my i7 7700k with the Z270 chipset, the results would be much better. I  have that in my HTPC, and when I load it up, it has more headroom.

Edited by Doofus

Guest asrequested

Only implemented decoding at this point.

 

I should have specified that I have 'convert to stream friendly format' enabled, but I believe the answer is the same. 


Rather than look at a new CPU, why not consider an NVIDIA Pascal video card?

I have a GTX 970 video card, but when I use it for the transcode, the CPU load graph looks like a fallen pine tree: many significant spikes at regular intervals (buffer data going back and forth?). I don't get any benefit with my i7-3770, just stutter that is in sync with the CPU load peaks. Maybe the CPU, the RAM, or the data bus isn't up to the task? The PC is using an SSD for Emby.

 

Right now, I'm testing Emby on my gaming PC, trying to figure out what kind of PC I would need for Emby to transcode a few 1080p streams in real time... plus some future-proofing. My movies are on a NAS, along with my old CD music, so the Emby PC would need room for the OS, Emby, and recorded TV.


FWIW I set the bitrate to 10 Mb/s on 3 devices and played three movies. My server has an i7 6700K with Z170 chipset. Here are the results.

...

When starting the second transcode, the CPU maxes out. But the GPU still has room for more. I suspect that if I used my i7 7700k with the Z270 chipset, the results would be much better. I  have that in my HTPC, and when I load it up, it has more headroom.

 

When the CPU maxes out, does that cause the playback/transcode to stutter, or does it "just" mean that the transcode isn't running as fast as its ffmpeg routine permits?


FWIW I set the bitrate to 10 Mb/s on 3 devices and played three movies. My server has an i7 6700K with Z170 chipset. Here are the results.

 

 

 

...

 

 

 

When starting the second transcode, the CPU maxes out. But the GPU still has room for more. I suspect that if I used my i7 7700k with the Z270 chipset, the results would be much better. I  have that in my HTPC, and when I load it up, it has more headroom.

I tried it with 3 transcoding movies on my i3 3225 with HWT and I get around 64% CPU usage.


maegibbons

FWIW I set the bitrate to 10 Mb/s on 3 devices and played three movies. My server has an i7 6700K with Z170 chipset. Here are the results.

 

...

 

When starting the second transcode, the CPU maxes out. But the GPU still has room for more. I suspect that if I used my i7 7700k with the Z270 chipset, the results would be much better. I  have that in my HTPC, and when I load it up, it has more headroom.

 

 

Any chance you can show us the ffmpeg transcode logs from when you ran the three-transcode test?

 

Many Thanks

 

Krs

 

Mark


maegibbons

Only implemented decoding at this point.

 

@Luke

 

Is this close to being implemented on the roadmap? I would love to try hardware encoding on record-transcode!

 

Is there a particular issue that makes it difficult?

 

Krs

 

Mark


I tried it with 3 transcoding movies on my i3 3225 with HWT and I get around 64% CPU usage.

 

 

WOW! What a difference!? Strange... compared to the i7-6700K setup mentioned above.

Edited by Jennice

WOW! What a difference!? Strange... compared to the i7-6700K setup mentioned above.

Yeah, and I think 4 streams shouldn't be an issue. And no stuttering whatsoever. My only problem could be my upload then.


maegibbons

maegibbons: Are you running Win10, and do you have any anti-virus or similar running (the tray area around your date/clock looks like it is hiding details)?

 

I was quoting @ 's post. I was also surprised by the CPU usage, which is why I was asking for the ffmpeg logs. You need to ask him about AV, as it's his machine.

 

Krs

 

mark


Guest asrequested

It's going to depend on what you're transcoding. I chose three big files that have Dolby Atmos. And at 10 Mb/s it's a lot of work. I should re-run the test and change the bitrate to 2 Mb/s. It's a lot less data to write. There should be a notable difference.
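
To put rough numbers on the "less data to write" point, here is a back-of-the-envelope Python sketch. It assumes the configured bitrate is the whole output stream (it ignores audio and container overhead) and uses decimal units (1 MB = 8 Mbit, 1 GB = 1000 MB). The function names are hypothetical.

```python
def megabytes_per_second(bitrate_mbit: float, streams: int = 1) -> float:
    """Combined output rate in MB/s for a number of simultaneous streams."""
    return bitrate_mbit * streams / 8


def gigabytes_per_hour(bitrate_mbit: float, streams: int = 1) -> float:
    """Combined amount of transcoded data produced per hour, in GB."""
    return megabytes_per_second(bitrate_mbit, streams) * 3600 / 1000


# The two cases discussed above: three streams at 10 Mb/s versus 2 Mb/s.
for rate in (10, 2):
    print(
        f"3 streams at {rate} Mb/s: "
        f"{megabytes_per_second(rate, 3):.2f} MB/s, "
        f"{gigabytes_per_hour(rate, 3):.1f} GB per hour"
    )
```

So three streams at 10 Mb/s come to 3.75 MB/s (13.5 GB per hour), against 0.75 MB/s (2.7 GB per hour) at 2 Mb/s, a factor of five in output data.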

 

I can post the transcode logs later tonight.

Edited by Doofus

Guest asrequested

When the CPU maxes out, does that cause the playback/transcode to stutter, or does it "just" mean that the transcode isn't running as fast as its ffmpeg routine permits?

Playback was smooth. I'm using the Emby ffmpeg build, and I have throttling enabled. Using HWA means that ffmpeg isn't doing the transcoding itself, only directing the stream to the GPU with the chosen parameters.


Anyone know if there are NVENC limitations on Server 2016? I got a 1050Ti to test hardware transcoding of 4K HEVC 10-bit (which the 1050Ti is supposed to have full hardware decode for), and I only get about 18 fps, even when just playing locally on the server with MPC-HC. I don't know if it has to do with the fact that I had to use Windows 10 drivers instead of specific Server ones, or that extra roles and features need to be included, or if it's just a limitation of the card itself... but I haven't had any luck.


Gerrit507

Anyone know if there are NVENC limitations on Server 2016? I got a 1050Ti to test hardware transcoding of 4K HEVC 10-bit (which the 1050Ti is supposed to have full hardware decode for), and I only get about 18 fps, even when just playing locally on the server with MPC-HC. I don't know if it has to do with the fact that I had to use Windows 10 drivers instead of specific Server ones, or that extra roles and features need to be included, or if it's just a limitation of the card itself... but I haven't had any luck.

I have a 1050Ti too. My card handles two 4K HEVC 10-bit streams at the same time with ease. However, I've found that NVDEC is unable to handle HDR streams with interlaced 4:2:0 chroma subsampling, a.k.a. 4:2:0 Type 2. It produces artifacts and choppy output. So I guess your issue is caused either by the input file or by a driver issue.

 

By the way: which hardware acceleration do you have selected in MPC? Emby uses NVDEC/NVENC, and I think the default for MPC is DXVA.
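
One way to take the player out of the equation is to let ffmpeg decode the file on the GPU and throw the output away, then read the fps and speed values it reports. The Python sketch below only builds such a command line; it is a sketch under assumptions, not what Emby itself runs. It assumes your ffmpeg build includes the hevc_cuvid decoder (not every build does), and the file name and the function name decode_benchmark_cmd are hypothetical.

```python
import shlex
from typing import List


def decode_benchmark_cmd(path: str, decoder: str = "hevc_cuvid") -> List[str]:
    """Build an ffmpeg command that decodes the video and discards the output.

    -c:v before -i selects the decoder for the input file,
    -an skips audio, and "-f null -" writes the result nowhere.
    """
    return [
        "ffmpeg", "-hide_banner",
        "-c:v", decoder,   # assumed to be available in this ffmpeg build
        "-i", path,
        "-an",
        "-f", "null", "-",
    ]


# Hypothetical input file; substitute the clip that plays at about 18 fps.
cmd = decode_benchmark_cmd("sample_4k_hevc_10bit.mkv")
print(shlex.join(cmd))
```

If the fps reported by that run is also well below the clip's frame rate, the bottleneck is in the decode path (card, driver, or file) rather than in MPC's DXVA setup.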

Edited by Gerrit507
