Transcoding clarification


Secundo°

Hello,

 

I'm not very familiar with transcoding. I want to use hardware acceleration for my new Emby server. My old Emby server was installed on a single-board computer. With 2 simultaneous streams the CPU was throttling and at 100% load. Because the CPU in the new server should mostly be used for other tasks, I decided to use a dedicated GPU for media transcoding.

My understanding was that Emby almost exclusively uses decoding and only needs encoding for tasks like converting files. But this seems to be wrong. I read in some threads (e.g. https://emby.media/community/index.php?/topic/62545-video-card-advice-for-hardware-transcoding/) that consumer-grade Nvidia cards only support 2 simultaneous encoding streams/sessions (decoding is unrestricted). After this hint I also noticed this in the support matrix on the Nvidia website. Now my question is: whenever transcoding is necessary, does Emby also need one encoding stream for each decoding stream? My new understanding is that Emby uses decoding to convert the file's current codec to raw video/audio data and then uses encoding to turn this raw video/audio data into a more compressed version supported by the streaming client, because the raw video/audio would use far too much bandwidth for streaming. And if the two GPU encoding streams are exhausted, does Emby then simply use the CPU for further encoding?
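The decode-then-encode idea described above can be sketched as an ffmpeg command line. This is only an illustration under assumptions: the thread does not show the command Emby actually runs, and the file names and bitrate here are hypothetical. The sketch builds the argument list without executing it; `-hwaccel cuda` requests GPU decoding and `-c:v h264_nvenc` requests GPU encoding, and the latter is the part that would occupy one NVENC encode session.

```python
def build_transcode_command(source, target, video_bitrate="4M"):
    """Return an ffmpeg argument list: GPU decode, then NVENC H.264 encode.

    The file names and bitrate are hypothetical placeholders.
    """
    return [
        "ffmpeg",
        "-hwaccel", "cuda",    # decode the source on the GPU
        "-i", source,
        "-c:v", "h264_nvenc",  # re-encode on the GPU; this uses an encode session
        "-b:v", video_bitrate,
        "-c:a", "aac",         # audio is re-encoded on the CPU
        target,
    ]


command = build_transcode_command("input.mkv", "output.mp4")
print(" ".join(command))
```

A stream that the client can play directly would need neither step, so it would not touch the encoder at all.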

 

Additional info / off topic: There will be a maximum of 8-10 simultaneous streams on my new server. I plan to buy the new GTX 1650, which was released 3 days ago. I will buy this card in around 1-2 months. I think the price of this card will drop a lot in the near future because it is a horrible value for gamers (the RX 570 is ~30% better at a ~10% lower price, and most reviews advise against this card for gaming and declare it 'dead on arrival'). But looking at the matrix of supported hardware codecs (https://developer.nvidia.com/video-encode-decode-gpu-support-matrix), this card fully supports every decoding codec (H.264/H.265/VP8/VP9/...) and almost every encoding feature (no HEVC B-frames). And what's quite important for me: this card draws only around 75W and is really silent.

 

Thank you

 

English is not my native language :rolleyes:


 

 

Whenever transcoding is necessary does emby also need one encoding stream for one decoding stream

 

Hi, no, it would just count as one stream towards the nvidia limit. Please let us know if this helps. Thanks !
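A minimal sketch of the counting described here: one transcode occupies one encode session, and a stream that is played directly occupies none. The `Stream` class, the stream names, and the fallback to the CPU once the limit is reached are assumptions made for illustration; this thread does not confirm how Emby behaves when the limit is exhausted.

```python
from dataclasses import dataclass


@dataclass
class Stream:
    name: str
    needs_transcode: bool


def assign_streams(streams, session_limit=2):
    """Map each stream name to 'direct', 'gpu' or 'cpu'.

    Each transcode counts as one encode session. Sending the overflow
    to the CPU is an assumed policy, not confirmed Emby behaviour.
    """
    plan = {}
    used_sessions = 0
    for stream in streams:
        if not stream.needs_transcode:
            plan[stream.name] = "direct"  # no decode or encode needed
        elif used_sessions < session_limit:
            plan[stream.name] = "gpu"
            used_sessions += 1
        else:
            plan[stream.name] = "cpu"
    return plan


streams = [
    Stream("tv", needs_transcode=False),
    Stream("phone", needs_transcode=True),
    Stream("tablet", needs_transcode=True),
    Stream("browser", needs_transcode=True),
]
plan = assign_streams(streams)
print(plan)
```

With the default limit of 2, the third transcoding stream is the first one that no longer fits on the GPU.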


Secundo°

Hi, no, it would just count as one stream towards the nvidia limit. Please let us know if this helps. Thanks !

So this means that the graphics card would only support 2 transcoding streams simultaneously? Is encoding required every time or sometimes only decoding (because decoding seems to have no limit/restriction)? And for my second question, I guess the cpu manages further encoding, when the nvidia limit is exhausted? Thank you


Q-Droid

What is the CPU in your new server? AMD GPUs do not have the 2-session limit that Nvidia enforces on consumer gaming cards.


mrfragger

Here’s a tidbit from Tom's Hardware. Avoid that 1650.

 

Turing's accelerated video encode capabilities carried over to the GeForce GTX 1660, but did not make it into the GeForce GTX 1650. Check out the following screen capture from Nvidia’s website, referring to the 1650’s NVENC engine as Volta-equivalent, making it similar to Pascal.

 

That means support for H.265 8K encode at 30 FPS is gone, along with the 25% bitrate savings for HEVC and up to 15% bitrate savings for H.264 that Nvidia touted when Turing launched.


Secundo°

What is the CPU in your new server? AMD GPUs do not have the 2-session limit that Nvidia enforces on consumer gaming cards.

 

The new CPU will probably be a Ryzen 5 3600, as soon as AMD releases the Zen 2 desktop processors. But there will be other servers running on the same machine (e.g. a GitLab server). My question is still whether the Nvidia limit of 2 encoding streams really restricts me to only 2 simultaneous transcoding streams, or whether sometimes only GPU decoding is necessary (which seems to be unlimited).

I could not find any good codec support matrix for AMD GPUs. The Nvidia codec support matrix is pretty clear. I would be happy if you know of any source for this or have a recommendation for an AMD graphics card for transcoding (price should be <200€).

 

Thank you for this advice! Now I'm very unsure whether I will go with the 1650 or rather the 1660. You are right that the encoder is more similar to Pascal than to Turing. For me the advantages of the 1650 are:

- the 1650 costs around 70€/80$ less, and this difference might increase in the next 2 months because the 1650 is quite bad value at the moment

- the 1650 has a 75W TDP instead of 120/130W and does not need an external power supply (all power over PCIe)

- this graphics card will only be used for transcoding, so 70€+ for an improved chip is a lot (and the Volta chip is still better than the Pascal chip of the 1050)


lightsout

There is a very easy driver mod on GitHub that allows you to remove the 2-stream limitation. I'm on my phone, but you can Google it. There's also a thread about it somewhere here in the forum.

 

I'll link to it later if you don't find it.

 

 


Q-Droid

If all 8-10 streams will be transcoding then you'll need discrete hardware that can handle it. I don't know enough about your GPU choices to say if they can, even if the limits are removed. But if only 2-3 streams need to be transcoded then that is a workload that a newish AMD APU or Intel CPU with iGPU should be able to handle. I don't know if Emby can manage multiple render devices and distribute the workload. If it can then a combination of GPU and APU/iGPU could be an option.


lightsout

Agreed, if that many are transcoding, that's a good-sized load. I have done around 8 with an Nvidia GPU, but that was just to test the driver patch with a bunch of browser windows. I am not sure how well the quality/framerate would hold up.

 

Anyway, here is the link to the patch that removes the two-stream limit.

https://github.com/keylase/nvidia-patch

 

With a powerful CPU and a good GPU you can definitely do a number of streams. I have not found any info measuring how many streams a given GPU can handle, since most people are working with the 2-stream limitation.


Secundo°

Thank you for the link to the patch. Normally there will be around 3 streams. The 8-10 I mentioned is the absolute worst case. I will probably go with the 1660 and try the patch.


  • 8 months later...
Secundo°

Let us know how you get on.

 

I'm here for the update. The server with Emby has been running for ~2 months now, and its performance is good enough for my requirements. But the planned hardware changed a lot. To make it short: I do not use hardware-accelerated transcoding for now.

Long Answer:

Due to increased requirements the server specs are: AMD 3900X, ASUS Pro WS X570-Ace, 64GB ECC, a lot of HDD space (+parity), a UPS ;). Because of the different server applications running, I decided to go with Proxmox VE as the operating system (a very good choice, looking back).

The current graphics card is a GTX 750Ti without external power supply, but the first card I tried was the GTX 1660(Ti) mentioned earlier in this thread.

The first problem I encountered was that I was not able to install Proxmox VE with the GTX 1660Ti installed. It took me some time to find out that it was because of the GPU. I borrowed a GTX 1060 and installed the OS.

After installing the 1660Ti again, I tried PCIe passthrough of the GPU to a Debian VM. This seemed to have worked, because I could detect the graphics card and GPU details within the VM, but I was not able to install the drivers correctly (they threw errors). Some time later I tried the same with a Win10 VM. Same problems. With utilities I could detect the graphics card and its specs (which could not be detected without PCIe passthrough). But in the Device Manager only the Basic Display Adapter was listed, and the driver installers for Windows didn't throw errors; they didn't start the installation at all.

These problems might have arisen due to insufficient hardware support for PCIe passthrough (which is not uncommon for consumer-grade hardware), because of the newish graphics card (with which I couldn't even install the OS), or due to my lack of knowledge about proper passthrough (I've only done it once, in Hyper-V to a Win VM with a GTX 1060, which worked). I don't know...

So in the end I sold the GTX 1660Ti and bought a used GTX 750Ti that is powered entirely over PCIe (only because I need a dedicated GPU for output to the terminal monitor).

Currently Emby is running inside a Debian VM with 12 CPU cores, 8GB RAM and 32GB disk space assigned, which I noticed is more than I actually need.

There hasn't really been a stress situation yet, but it seems to be able to handle ~2 4K H.265 or ~8 1080p H.264 streams at a time without a drop in quality.
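As a rough way to reason about mixed loads from these two figures, here is a small sketch. It assumes that cost scales linearly, so one 4K H.265 stream uses about 1/2 of the VM's capacity and one 1080p H.264 stream about 1/8. That linear model and the names in the code are assumptions for illustration, not measurements.

```python
# Approximate maximum simultaneous software transcodes, taken from the
# observation above (12 assigned CPU cores, no hardware acceleration).
MAX_STREAMS = {"4k_h265": 2, "1080p_h264": 8}


def load_fraction(streams):
    """Return the estimated share of capacity used (1.0 means full).

    `streams` maps a stream kind to the number of such streams.
    Assumes each stream costs 1 / MAX_STREAMS[kind] of the capacity.
    """
    return sum(count / MAX_STREAMS[kind] for kind, count in streams.items())


def fits(streams):
    """True if the estimated load does not exceed the capacity."""
    return load_fraction(streams) <= 1.0


mixed = {"4k_h265": 1, "1080p_h264": 4}
print(load_fraction(mixed), fits(mixed))
```

Under this model, one 4K stream plus four 1080p streams would sit right at the limit, and a fifth 1080p stream would exceed it.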

 

I think that says it all. I will not try PCIe passthrough again as long as the software transcoding performance is sufficient for my needs. If you have any additional questions, feel free to ask :)
