Hardware Transcoding


96 replies to this topic

#41 cayars OFFLINE  

cayars

    Advanced Member

  • Alpha Testers
  • 2936 posts
  • Local time: 10:31 PM

Posted 04 October 2018 - 02:21 PM

Actually this is not entirely true. On Windows we use the -dxva params, which let ffmpeg decide. I haven't tested, but I would think it would automatically choose whichever GPU is available.

I've had two GPUs in my server and only one ever gets used.  I specifically tested this a few different times.  I've since pulled the GPU to use in a different computer.  Maybe I'll throw it back in and test again.  BTW, same exact GPUs.

 

Do you remember roughly how long ago the "-dxva params" were added?

 

I certainly hope I'm wrong, because I'd like two GPUs in my server!

 

Oh BTW, for me previously it only ever ran on the "console" GPU.


Edited by cayars, 04 October 2018 - 02:21 PM.

  • Michaelhelps likes this

#42 Luke OFFLINE  

Luke

    System Architect

  • Administrators
  • 135862 posts
  • Local time: 10:31 PM

Posted 04 October 2018 - 02:44 PM

We did it throughout this summer.



#43 cayars OFFLINE  

cayars

    Advanced Member

  • Alpha Testers
  • 2936 posts
  • Local time: 10:31 PM

Posted 04 October 2018 - 03:21 PM

Cool, because I did the testing back in the late fall.

So this was added after I tested it!

 

Do you know if anyone has actually tested more than 2 HW encodes this way with NVIDIA?



#44 Luke OFFLINE  

Luke

    System Architect

  • Administrators
  • 135862 posts
  • Local time: 10:31 PM

Posted 04 October 2018 - 03:32 PM

That I'm not quite sure about. Thanks.



#45 Doofus ONLINE  

Doofus

    Advanced Member

  • Members
  • 12062 posts
  • Local time: 07:31 PM

Posted 04 October 2018 - 03:42 PM

I've been curious about this. I would think if they are in SLI, it should work?



#46 softworkz OFFLINE  

softworkz

    Advanced Member

  • Developers
  • 1663 posts
  • Local time: 04:31 AM

Posted 04 October 2018 - 04:09 PM

I've been curious about this. I would think if they are in SLI, it should work?

 

No, it's got nothing to do with SLI. SLI is not required for this. SLI allows two cards to work on the same 3D scene, but video operations via NVENC and NVDEC are separate from this.

When you have more than one GPU board, you don't need to connect them for SLI.



#47 Doofus ONLINE  

Doofus

    Advanced Member

  • Members
  • 12062 posts
  • Local time: 07:31 PM

Posted 04 October 2018 - 04:29 PM

No, it's got nothing to do with SLI. SLI is not required for this. SLI allows two cards to work on the same 3D scene, but video operations via NVENC and NVDEC are separate from this.

When you have more than one GPU board, you don't need to connect them for SLI.

 

Ah, thanks for that. I've never actually used it. So multiple GPUs should work?



#48 softworkz OFFLINE  

softworkz

    Advanced Member

  • Developers
  • 1663 posts
  • Local time: 04:31 AM

Posted 04 October 2018 - 06:09 PM

With DXVA: in theory yes, but Emby doesn't implement this at the moment.

 

There is no automatic load balancing or anything like it. The GPU adapter number would need to be specified.

 

With NVENC/NVDEC, I can't tell right now, but we'll come to that point...
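To illustrate what specifying the adapter number could look like, here is a minimal sketch that builds an ffmpeg command line using ffmpeg's `-hwaccel dxva2` and `-hwaccel_device` options, where the device value is the Direct3D 9 display adapter number. The helper names `build_dxva2_args` and `assign_adapters` are made up for this example, and the round-robin assignment is only an assumption about how jobs might be spread across adapters. It is not how Emby does it, and it ignores the actual load on each GPU.

```python
from itertools import cycle


def build_dxva2_args(input_path, output_path, adapter_index=0):
    """Build an ffmpeg argument list that pins DXVA2 decoding to one adapter."""
    if adapter_index < 0:
        raise ValueError("adapter_index must be non-negative")
    return [
        "ffmpeg",
        "-hwaccel", "dxva2",
        "-hwaccel_device", str(adapter_index),  # Direct3D 9 adapter number
        "-i", input_path,
        output_path,
    ]


def assign_adapters(jobs, adapter_count):
    """Hypothetical round-robin: pair each job with the next adapter index."""
    if adapter_count < 1:
        raise ValueError("adapter_count must be at least 1")
    adapters = cycle(range(adapter_count))
    return [(job, next(adapters)) for job in jobs]


if __name__ == "__main__":
    # Spread three example jobs over two adapters and print each command line.
    for job, adapter in assign_adapters(["a.mkv", "b.mkv", "c.mkv"], 2):
        print(" ".join(build_dxva2_args(job, job + ".mp4", adapter)))
```

Leaving `adapter_index` at its default of 0 corresponds to the single-GPU case, where every job lands on the same device.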


  • Doofus and Michaelhelps like this

#49 Michael K. OFFLINE  

Michael K.

    Advanced Member

  • Members
  • 160 posts
  • Local time: 10:31 PM

Posted 07 May 2019 - 11:01 PM

Hi,

Can you tell me if the new Emby 4.x will take advantage of more than one GPU, in this case NVIDIA NVENC/NVDEC?

 

If I were to have multiple GPUs, would Emby balance the load between them in some way?

 

I'm thinking of using 4 x NVIDIA P5000 GPUs for the encoding. Perhaps dual Xeon Scalable processors for the decode. Any thoughts?



#50 Luke OFFLINE  

Luke

    System Architect

  • Administrators
  • 135862 posts
  • Local time: 10:31 PM

Posted 08 May 2019 - 12:27 AM

Hi,

Can you tell me if the new Emby 4.x will take advantage of more than one GPU, in this case NVIDIA NVENC/NVDEC?

 

If I were to have multiple GPUs, would Emby balance the load between them in some way?

 

I'm thinking of using 4 x NVIDIA P5000 GPUs for the encoding. Perhaps dual Xeon Scalable processors for the decode. Any thoughts?

 

Hi, we don't have load balancing yet, but it's planned for the future. Thanks.



#51 Michael K. OFFLINE  

Michael K.

    Advanced Member

  • Members
  • 160 posts
  • Local time: 10:31 PM

Posted 13 May 2019 - 07:08 PM

Hi, we don't have load balancing yet, but it's planned for the future. Thanks.

 

Actually this is not entirely true. On Windows we use the -dxva params, which let ffmpeg decide. I haven't tested, but I would think it would automatically choose whichever GPU is available.

 

Hey, can you please clarify how these two comments fit together? Do the '-dxva params' work like load balancing across GPUs? I would like to use multiple GPUs (P5000s), and it sounds like Emby is set up to do that.

 

Similarly, would Emby/ffmpeg be able to leverage dual or quad CPUs using the Xeon Scalable processors?

 

I'm looking to configure the server as best-of-breed, with maximum transcoding performance (cost is not a concern). Any guidance would be greatly appreciated!

 

Thanks! 



#52 Luke OFFLINE  

Luke

    System Architect

  • Administrators
  • 135862 posts
  • Local time: 10:31 PM

Posted 13 May 2019 - 07:29 PM

The dxva comment is no longer accurate following our rewritten hardware transcoding support. Single-GPU usage is as finely tuned as it has ever been, but we do not have load balancing across multiple GPUs yet. Thanks.



#53 Michael K. OFFLINE  

Michael K.

    Advanced Member

  • Members
  • 160 posts
  • Local time: 10:31 PM

Posted 13 May 2019 - 07:32 PM

I see, that's clear, thanks. How about support for multiple CPUs for software-based transcoding?



#54 Luke OFFLINE  

Luke

    System Architect

  • Administrators
  • 135862 posts
  • Local time: 10:31 PM

Posted 13 May 2019 - 07:40 PM

We already support multi-threaded decoding and encoding for software-based transcoding.
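As a rough illustration of multi-threaded software transcoding at the ffmpeg level, the sketch below builds a command line that passes ffmpeg's `-threads` option once before the input (decoder threads) and once before the output (encoder threads). The function name `build_software_transcode_args`, the choice of `libx264`, and the default of one thread per logical CPU are assumptions made for this example; this is not a description of the command Emby actually generates.

```python
import os


def build_software_transcode_args(input_path, output_path, threads=None):
    """Build an ffmpeg argument list for a CPU-only, multi-threaded transcode."""
    if threads is None:
        # Assumed default: one thread per logical CPU reported by the OS.
        threads = os.cpu_count() or 1
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return [
        "ffmpeg",
        "-threads", str(threads),  # placed before -i: applies to decoding
        "-i", input_path,
        "-c:v", "libx264",         # software H.264 encoder
        "-threads", str(threads),  # placed before the output: applies to encoding
        output_path,
    ]


if __name__ == "__main__":
    print(" ".join(build_software_transcode_args("in.mkv", "out.mp4")))
```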



#55 softworkz OFFLINE  

softworkz

    Advanced Member

  • Developers
  • 1663 posts
  • Local time: 04:31 AM

Posted 13 May 2019 - 07:43 PM

The dxva comment is no longer accurate following our rewritten hardware transcoding support. Single-GPU usage is as finely tuned as it has ever been, but we do not have load balancing across multiple GPUs yet. Thanks.

 

Anyway, the dxva comment never implied any kind of load balancing. It just said that the first "available" device would be chosen, where "available" technically just meant "Device 0".

 

PS: In DXVA2 terms, "Device 0" means the device driving the primary display.


Edited by softworkz, 13 May 2019 - 07:44 PM.

  • Michael K. likes this

#56 softworkz OFFLINE  

softworkz

    Advanced Member

  • Developers
  • 1663 posts
  • Local time: 04:31 AM

Posted 13 May 2019 - 07:50 PM

Similarly, would Emby/ffmpeg be able to leverage dual or quad CPUs using the Xeon Scalable processors?

 

You need to be careful about multi-socket Xeons. Those usually don't include any GPU units at all!



#57 Michael K. OFFLINE  

Michael K.

    Advanced Member

  • Members
  • 160 posts
  • Local time: 10:31 PM

Posted 13 May 2019 - 08:50 PM

Anyway, the dxva comment never implied any kind of load balancing. It just said that the first "available" device would be chosen, where "available" technically just meant "Device 0".

 

PS: In DXVA2 terms, "Device 0" means the device driving the primary display.

 

That is super helpful, thanks! 

 

 

You need to be careful about multi-socket Xeons. Those usually don't include any GPU units at all!

 

Yes, in this case the CPUs would not have any GPU function. They would do software-based transcoding. There are so many ways to configure the transcoding, but one way that I was considering is CPU software-based transcoding for H.264 or lower, and then dedicating the GPUs to H.265.



#58 softworkz OFFLINE  

softworkz

    Advanced Member

  • Developers
  • 1663 posts
  • Local time: 04:31 AM

Posted 13 May 2019 - 09:14 PM

Yes, in this case the CPUs would not have any GPU function. They would do software-based transcoding. There are so many ways to configure the transcoding, but one way that I was considering is CPU software-based transcoding for H.264 or lower, and then dedicating the GPUs to H.265.

 

We need to differentiate between decoding and encoding. For live streaming, Emby is using H.264 encoding only (the HLS spec doesn't include H.265 anyway).

Encoding is the much harder part of the story in most cases.

 

Generally, if you're looking for high-performance transcoding, you shouldn't plan for software transcoding. There's an incredible difference in efficiency between GPU and CPU video processing.

CPU processing power will be required anyway for:

  • Audio transcoding
  • Subtitle burn-in
  • Color Conversion
  • Fallback transcoding when hardware transcoding fails

For the CPU setup: From my experience, I would say that a single Xeon with a GPU will be able to achieve significantly higher video processing throughput than 4 similar Xeons without a GPU.
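The list of CPU-bound work above can be written down as a small lookup. This is only a toy model of that list: the `TranscodeJob` class and `cpu_steps` function are invented for this example, and it makes no claim about how Emby actually divides work between CPU and GPU.

```python
from dataclasses import dataclass


@dataclass
class TranscodeJob:
    """Hypothetical description of what one playback session requires."""
    transcodes_audio: bool = False
    burns_in_subtitles: bool = False
    converts_color: bool = False
    hardware_failed: bool = False


def cpu_steps(job):
    """Return the steps that, per the list above, still need CPU time."""
    steps = []
    if job.transcodes_audio:
        steps.append("audio transcoding")
    if job.burns_in_subtitles:
        steps.append("subtitle burn-in")
    if job.converts_color:
        steps.append("color conversion")
    if job.hardware_failed:
        steps.append("fallback video transcoding")
    return steps


if __name__ == "__main__":
    job = TranscodeJob(transcodes_audio=True, burns_in_subtitles=True)
    print(cpu_steps(job))
```

An empty result means none of the listed CPU-side steps apply to that job. It does not by itself say where the video encode runs.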


  • Q-Droid likes this

#59 Michael K. OFFLINE  

Michael K.

    Advanced Member

  • Members
  • 160 posts
  • Local time: 10:31 PM

Posted 13 May 2019 - 10:50 PM

CPU processing power will be required anyway for:

  • Audio transcoding
  • Subtitle burn-in
  • Color Conversion
  • Fallback transcoding when hardware transcoding fails

For the CPU setup: From my experience, I would say that a single Xeon with a GPU will be able to achieve significantly higher video processing throughput than 4 similar Xeons without a GPU.

 

Very interesting. In my case, every title is always played with subtitle burn-in. Does this mean that GPU encoding would likely never be engaged, or is the load split between CPU and GPU somehow when using subtitles?

 

As for the Intel GPU (Quick Sync), I haven't been able to get it to work well with Emby. I've been trying with the i7-7700. I haven't tried the new Emby v4 yet, which I understand has significant GPU improvements.



#60 Doofus ONLINE  

Doofus

    Advanced Member

  • Members
  • 12062 posts
  • Local time: 07:31 PM

Posted 13 May 2019 - 11:04 PM

If you want to use CPU transcoding, you'll want to use something like a Threadripper. I have a 1st-gen one, and it handles everything pretty well. Not as fast as a GPU, but fast enough. The later versions will probably be much better.





