
Limit Hardware Transcode Streams?



Posted

So for some reason my GPU (AMD RX 570 8GB) is a little flaky when it comes to multiple hardware transcodes running at once.

It's OK with 2 or 3 HEVC-to-H.264 4K transcodes going at once, but any more and the system locks up.

This may be to do with the GPU being passed through to an Ubuntu 20.04 VM on Unraid, although strangely the same behaviour happens with an AMD Radeon 550 I've got lying around.

I've attempted to troubleshoot on the Unraid side, but haven't turned up any reason why I'd be seeing this behaviour. I've even tried the GPUs in an alternative Unraid server rig: same behaviour.

On both rigs, and on both cards, the VM has always been running Ubuntu 20.04 (tried both server and desktop) with the latest AMDGPU-PRO drivers, and latest stable Emby Server.

I'd really like to be able to solve the underlying issue, but it may take some time, and for now, stability is paramount.

 

So my question for now is: is there a way I can set a hard limit on GPU transcode streams, and fall back to CPU for any others?
I've not seen any options in the UI, but surely there must be a flag or config option exposed somewhere?

 

Just in case anyone has any insight into what might be causing the problem, I've attached the server log and one of the transcode logs from when the system fell over... The transcode logs look fine for all transcodes, until the 3rd or 4th parallel transcode, and then they are all full of the following lines:

amdgpu: amdgpu_cs_query_fence_status failed.
amdgpu: The CS has been cancelled because the context is lost.

 

Attachments: embyserver.txt, ffmpeg-transcode-8a80eced-c25a-4b79-9535-162f9a059ffc_1.txt

Posted

I've had some success by leaving all the Hardware Accelerated Decoders set to 'on', and setting all the Hardware Accelerated Encoders to 'off' (CPU only encoding).

I had 5 simultaneous transcodes running (4K HEVC to h264 @ 3Mbps) and the GPU didn't lock up, which is nice, as these are pretty heavy transcodes. CPU was maxed out, and there was some minor buffering, but this shouldn't be an issue as I expect the majority of playback will be Direct Play/Stream, as I have quite capable devices.

The most important thing is that there was no locking up, and I didn't have to reboot my server. Peace, stability, and rest return to the Shire.

Soon, Unraid are planning on bringing GPU drivers into the OS (6.10 I think), so hopefully I'll be able to return to my trusty Emby Docker container, with new and improved Hardware Acceleration.

 

Still, it would be nice to figure out what was causing the instability, so if someone is able to interpret the errors in the transcode logs above and provide any related insight, that would be amazing.

Posted

The first step for troubleshooting would be to rule out the VM and install Emby on the host OS.

If the problem still occurs, you could create a simple repro (a plain ffmpeg command) that reproduces it (e.g. when started 4 times in parallel) and report this to AMD.
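For example, something along these lines (a rough sketch; the device path, sample file and bitrate are placeholders to match your setup):

```bash
# Rough repro sketch: run the same VAAPI decode+encode 4 times in parallel.
# Device path, input file and bitrate are placeholders.
for i in 1 2 3 4; do
  ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 \
         -hwaccel_output_format vaapi \
         -i sample_4k_hevc.mkv \
         -c:v h264_vaapi -b:v 3M \
         -f null - &
done
wait
```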

PS: What happens when you try it the other way round: disable all decoders and enable the encoder?

Posted (edited)

@softworkz Sounds like a good plan - it would be good to see whether I get the same behaviour in Ubuntu on bare metal. Thanks for the direction.

I actually tried the same setup on 2 server rigs, both with an Ubuntu VM on top of Unraid, trying both the 550 and RX 570, all getting the same amdgpu errors in the transcode log.

I can't rip Unraid out of my production rig, but I can on my backup rig.

I've got a busy weekend already, so I doubt I'll get a chance to do the bare metal install, but I'll try and free up some time soon.

I've turned tail back to the Emby Docker for now, as I was getting all sorts of performance issues on the VM that seemed to be disk I/O related: Emby playback was taking 5-6 seconds to start (direct play), whereas on the Docker we're talking sub-0.5 sec. Playback would also fail randomly at any point in the video.

If I remember correctly, after disabling all decoders and enabling both the h264 and h265 encoders, Emby would attempt to transcode to h264 on the GPU, and after starting several transcodes (can't remember the exact number) I was seeing the same amdgpu errors as above.

Posted

@softworkz Thanks for your help before. I managed to free up some time this evening to follow this up and test out the transcoding on bare metal hardware.

 

On Ubuntu 20.04 with the Radeon RX570, I still see the same

amdgpu: amdgpu_cs_query_fence_status failed.
amdgpu: The CS has been cancelled because the context is lost.

errors in the transcode log, and the GPU locks up and the monitor blanks, but Emby and the OS continue to run (sans transcoding).

Sure sounds like dodgy AMD GPU Linux drivers to me.

 

To confirm this, I wiped the OS from the bare-metal setup, installed Win 10, the official AMD Radeon drivers, and the latest Emby stable, then ran several transcode jobs in parallel.

No errors, no GPU lockups - I can run about 2-3 transcodes at once (4k HEVC 5Mbps Source --> h264 ~2-3Mbps Target) before GPU performance becomes the bottleneck.

I'm going to set up a Win 10 VM on the Unraid system, pass through the GPU, and see if I can replicate this stable behaviour. I'm not super thrilled about running Windows for this purpose, given its overhead, but fingers crossed it works.

 

Assuming all goes well and the Win 10 VM behaves like the bare-metal system, I'm still limited to about 3 transcodes before performance drops and playback is no longer fluid. I know some Nvidia cards are limited to 2 simultaneous transcodes, and I've read here on the forums that Emby detects that the card won't allow any more and falls back to CPU transcoding... In my case Emby just keeps throwing more transcodes at the GPU, and they all get slower and slower...

Is there a way I can replicate that "Nvidia limiting" and tell Emby to use the GPU for up to maybe 3 transcodes, then fall back to CPU?

 

Thanks again for your help, I was almost beginning to think the RX570 was faulty (ironically it turns out my spare Radeon 550 had actually developed a fault).

Posted

Seems so, thanks for looking into this. Forgive me - I've read through the links you've posted and I'm having a bit of a hard time interpreting...

Are those kernel patches still to be released? Also, do those links mean it's a Debian (and derivatives) only bug, or, more likely given the context, a general Linux issue?

Thanks again for checking this out 👍

Posted
12 minutes ago, flexage said:

Are those kernel patches still to be released?

I'm afraid I can't tell (I'm not a Linux guy).

13 minutes ago, flexage said:

I've read through the links you've posted and I'm having a bit of a hard time interpreting...

I think a safe interpretation would be that the problem is neither specific to Emby, nor to your hardware setup and/or OS configuration, but is instead a bug on AMD's side.

My suggestion would be to contact AMD and ask for assistance. I'm sure that you won't be the first one.

Posted

Thanks, I can use Linux, but Linux dev goes right over my head.

Ironically, I've just seen that in the last 24 hrs, Unraid have added GPU drivers in their latest beta, the intent being that these can be passed through to Docker containers for h/w transcoding and such.

This would be my ideal set up, but assuming the Emby docker even has support for the AMD drivers (and after looking at the emby docker hub I'm not sure it does), I'd imagine I'd just run into the same issue we've been detailing here.

\_|shrugs|_/ I may try to gather some evidence, and craft something to send to AMD, but I get the feeling they will be less than responsive - will probably be a lot of work for zero payoff.

Cheers for all of your help, I appreciate it!

Posted
9 minutes ago, flexage said:

\_|shrugs|_/ I may try to gather some evidence, and craft something to send to AMD, but I get the feeling they will be less than responsive - will probably be a lot of work for zero payoff.

Why so negative?

Their latest drivers explicitly indicate support for Ubuntu 20.04 and RX570: https://www.amd.com/en/support/kb/release-notes/rn-amdgpu-unified-linux-20-20
BTW: did you install these drivers? On the host OS?

Also, the kernel patches were submitted by AMD guys:
[screenshot: the patch submissions and their authors]

I'm sure they will be able to tell in which Linux versions the patches from those guys will be included and I'm also sure that they know the symptoms of this very well already.

Also, why "a lot of work"? Just copy/paste everything relevant from this conversation.

PS: I would make a phone call or create a support ticket online, but not a forum post.

Posted

Sorry, default negative mode after a long night. Honestly didn't mean to sound ungrateful for your advice.

With the drivers, I followed the GPU selection tool on the AMD drivers page and installed the next revision up, 20.40; this was on both the bare-metal Ubuntu and previously in the Ubuntu VM.

[screenshot of the AMD driver selection tool]

I suppose I assumed they'd want more details than I had gathered here, and I'd already destroyed my previous test environments - but grabbing the information in this thread, including the references you kindly sourced, would go a long way.

I agree, support ticket or phone call would be the way to go, will check out their support portal tomorrow, and see if they get back with anything.

Sorry again for sounding ungrateful, just tired, there's been quite a bit of leg work (metaphorically speaking) on this one, will have to change my handle to Grandpa Flex 😁

Posted

You didn't sound ungrateful.

As long as it's about some hardware, support is often not as bad as one would expect.
(it's rather us - the software guys - who are responsible for bad support experiences).

A few years ago I called Logitech support. Once I got connected and specified my product and model, I said:

"Hello, I'd like to buy a 'K'".
After I explained that I needed the 'K' because it had fallen off the keyboard and my cleaner had probably disposed of it, she laughed and said that she was afraid K's were currently out of stock. But I could have L's, J's and C's.
And T's would even be on sale that week... (she was still laughing).

Eventually, they sent me a whole new DiNovo Edge for free (they don't sell individual letters), even though mine had been out of support/warranty for long.

Product support is often a dark area, but sometimes there's still light to be seen 😆

Posted
9 hours ago, softworkz said:

Eventually, they sent me a whole new DiNovo Edge for free (they don't sell individual letters), even though mine had been out of support/warranty for long.

That was a good story, gives me lots of hope! Good result!

I'm feeling refreshed after some rest, so I've been doing some digging and noticed that the Linux kernel in Ubuntu 20.04.1 is 5.4, which, while being the latest LTS kernel, is a fair bit behind, having been released on 24 Nov 2019.

Before I contact AMD support, I'm wondering if the AMDGPU kernel fix might have already made it into one of the newer, non-LTS kernels, since the latest, 5.9, was released just last month.

Unraid 6.9 beta 35, the one with the GPU drivers baked in, currently includes Linux kernel 5.8, soon to move up to 5.9, so it's possible the fix is already in there. I'm going to attempt to find out whether those patches from the AMD team you linked have been merged already, and when.
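One way I plan to check, assuming a local clone of the kernel repo:

```bash
# List commits touching the VCN code between the v5.4 and v5.8 tags;
# if the race-condition patches show up here, 5.8 carries the fix.
git log --oneline v5.4..v5.8 -- drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
```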

Ultimately, I'd like to give this a go in the official Emby Docker; there is the linuxserver.io Emby container that says they've added AMD VAAPI support, but I'm reluctant to stray.

There's some docs on the Emby Docker Hub for passing through GPUs for use with VAAPI, mainly setting the device to pass through and passing the PGID for the video group...
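The run command I've pieced together from those docs looks roughly like this (paths and IDs are from my own setup; 44 is the 'video' group GID on many distros, check `getent group video` on the host):

```bash
# Sketch based on the Docker Hub docs - adjust paths and the GID to your host.
docker run -d \
  --device /dev/dri:/dev/dri \
  -e GIDLIST=44 \
  -v /mnt/user/appdata/emby:/config \
  -v /mnt/user/media:/mnt/media \
  -p 8096:8096 \
  emby/embyserver
```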

Any tips on installing the AMD drivers in the Docker container? Primarily I'm unsure which version of the AMDGPU drivers to install, as I don't know which base OS the Emby container is built on.

Posted

I took a look at the source code for kernel 5.8, and it looks like the patch for the race condition is included! Good times - I can see all the code from the patch that adds the mutex lock to stop the threads competing. https://github.com/torvalds/linux/blob/bcf876870b95592b52519ed4aafcf9d95999bc9c/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c

Also, it seems that 5.9 has some more changes to the kernel amdgpu code, so maybe even more fixes for the future.

Very good stuff, thanks for digging up that info, it gave me a lot to go on!

I'm going to attempt to get some AMDGPU drivers installed on the official Emby Docker and see if I can get anything running.

Pretty sure I'll hit a wall, so will likely be back here asking for more help 😂

Posted

I tried to deduce which base OS container the Emby Docker uses, but I wasn't able to pin down a particular one, so I'm assuming we're not including a full base OS here.

I've found the AMD firmware binaries on my host Unraid system at `/lib/firmware/amdgpu` and attempted to pass them through to the same path in the container.
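Concretely, that meant adding a read-only bind mount to the run command:

```bash
# What I tried: same run command as before, plus a read-only bind mount
# of the host's AMD firmware into the container.
docker run -d \
  --device /dev/dri:/dev/dri \
  -e GIDLIST=44 \
  -v /lib/firmware/amdgpu:/lib/firmware/amdgpu:ro \
  -v /mnt/user/appdata/emby:/config \
  -p 8096:8096 \
  emby/embyserver
```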

Doesn't look like they're being picked up... is this the wrong approach to take? Or do I need another step to make the binaries active on the docker container?

Docker and Linux at this level aren't my forte, so I was drawing inspiration from this Ask Ubuntu post: https://askubuntu.com/questions/809081/how-to-run-opencl-program-in-docker-container-with-amdgpu-pro

Posted

After scouring some other Linux threads on here, I saw that @Luke mentioned that the Emby Docker container was based on Arch Linux.

Since I now have the AMDGPU kernel driver on my host, and the kernel is new enough not to suffer the race condition bug I'd experienced previously with Ubuntu 20.04, I wanted to have a stab at installing the AMDGPU-PRO driver - but I noticed that while we do have RPM installed, there are no package repositories configured in the Docker.

I had a look at the Arch Linux documentation and turned up some information about their AMDGPU-PRO driver (https://aur.archlinux.org/pkgbase/amdgpu-pro-installer/), but without a package manager I'm unsure how to obtain it. I looked for an RPM package but couldn't find one - I may have missed it; I'm not familiar with Arch, especially the cut-back version that may be running in the container.

Are any of you guys familiar enough to document the method for installing this driver on the Emby Docker? I'm surely not the only one to have attempted this, and I'd imagine an influx of fellow Unraid users will soon also wish to take advantage of the new AMD GPU support.

Posted
41 minutes ago, Luke said:

@alucryd may have some insight on that.

Awesome! Thanks for the referral 👍

Posted

@flexage Our Docker is not based on Arch Linux, but on BusyBox, so you won't be able to easily install a package on it. The best way around that is to manually extract a package from another distro on top of it - but the tools needed must be in BusyBox, which excludes RPMs; you could extract DEBs with ar, or Arch packages with tar, though.
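For a .deb, roughly (the package name is just a placeholder):

```bash
# Extract the .deb archive members with ar, then unpack the payload over
# the container's root fs. BusyBox tar needs xz support for data.tar.xz.
ar x some-driver-package.deb   # yields debian-binary, control.tar.*, data.tar.*
tar -xf data.tar.xz -C /
```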

That said, the only hwaccel you can get for AMD out of our current builds is VAAPI, so AMDGPU PRO wouldn't help much. However, we are working on getting AMF support, and I'll personally be getting my first AMD card in years so I can test it properly. As with Nvidia, you will still need to install the drivers on the host system (AMDGPU PRO is proprietary, we can't ship it ourselves) and expose them to Docker, but I don't have the specifics yet.

Posted
6 hours ago, alucryd said:

@flexage Our Docker is not based on Arch Linux, but on BusyBox

@alucryd Ahhhh, thanks for that, maybe I was reading an old thread that mentioned Arch.

6 hours ago, alucryd said:

The best way around that is to manually extract a package from another distro on top of it

Assuming I manage to pull the contents out of a deb package or something, where should they be placed inside the container? Do I need to modify any config or init files to load the driver package, or is simply placing them somewhere like `/lib/` enough?

6 hours ago, alucryd said:

the only hwaccel you can get for AMD out of our current builds would be VAAPI, so AMDGPU PRO

Forgive me if I'm mistaken, I thought the AMDGPU PRO drivers were required for using VAAPI with AMD cards with Emby?

6 hours ago, alucryd said:

we are working on getting AMF support, and I'll personally be getting my first AMD card in years so I can test it properly

That's exciting! I've heard the VAAPI isn't the best performance-wise when it comes to AMD... I'm assuming AMF may be better.

If you need it, I'd be happy to help test when the time comes!

6 hours ago, alucryd said:

As with Nvidia you will still need to install the drivers on the host system (AMDGPU PRO is proprietary, we can't ship it ourselves)

I use Unraid as the Host OS on my production system. Fortunately, over this last weekend, they have just added the AMD graphics drivers and Kernel support for exactly this purpose. AFAIK my host OS is now amenable to AMD H/W Acceleration within docker containers. This is why I've been exploring adding the AMDGPU PRO drivers to my local Emby Docker container. As far as I could tell, installing the drivers into the container was the only missing item in the chain to get HWA working, however my uninformed judgement isn't to be trusted on such matters. 😂

 

Thanks for taking the time to help me learn and understand, I'm sorry about having so many questions.

 

Also just wanted to give a big thanks to all involved. When running Emby virtualised, the Emby Docker is just waaaaay more performant compared to running it in a VM. It may be because of the mapped paths to local media folders (instead of having to use SMB inside of a VM), or maybe that the overhead is lower from sharing the host kernel / not running a full virtualised os, but Emby Docker really is sweet. AMD HWA would be the icing on the cake 👍

Posted
2 hours ago, flexage said:

@alucryd Ahhhh, thanks for that, maybe I was reading an old thread that mentioned Arch.

Assuming I manage to pull the contents out of a deb package or something, where should they be placed inside the container? Do I need to modify any config or init files to load the driver package, or is simply placing them somewhere like `/lib/` enough?

Since this is a very minimal docker, libs are directly in `/lib`, and VAAPI drivers are in `/lib/dri`.
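So after unpacking a package as described earlier, it's just a matter of copying things into place, along these lines (the source paths are a guess at an Ubuntu-style package layout - adjust to whatever the package actually contains):

```bash
# Hypothetical Ubuntu-style layout inside the extracted package.
cp -a usr/lib/x86_64-linux-gnu/libva*.so* /lib/
cp -a usr/lib/x86_64-linux-gnu/dri/*_drv_video.so /lib/dri/
```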

2 hours ago, flexage said:

Forgive me if I'm mistaken, I thought the AMDGPU PRO drivers were required for using VAAPI with AMD cards with Emby?

You might be right, I haven't delved really far into AMD hwaccel yet, having no hardware, but it was my understanding that VAAPI was provided by the mesa stack for AMD (and we did ship the mesa driver for a while, but since performance was meh at best, and building mesa was a pain, we dropped it with AMF in mind for the future of AMD). The Arch wiki page does mention that AMDGPU PRO supports VAAPI and VDPAU though, but my google-fu only points me towards AMF when looking into the pro stack.

2 hours ago, flexage said:

That's exciting! I've heard the VAAPI isn't the best performance-wise when it comes to AMD... I'm assuming AMF may be better.

If you need it, I'd be happy to help test when the time comes!

Will be sure to forward you some test images when the time comes, thanks!

2 hours ago, flexage said:

I use Unraid as the Host OS on my production system. Fortunately, over this last weekend, they have just added the AMD graphics drivers and Kernel support for exactly this purpose. AFAIK my host OS is now amenable to AMD H/W Acceleration within docker containers. This is why I've been exploring adding the AMDGPU PRO drivers to my local Emby Docker container. As far as I could tell, installing the drivers into the container was the only missing item in the chain to get HWA working, however my uninformed judgement isn't to be trusted on such matters. 😂

 

Thanks for taking the time to help me learn and understand, I'm sorry about having so many questions.

 

Also just wanted to give a big thanks to all involved. When running Emby virtualised, the Emby Docker is just waaaaay more performant compared to running it in a VM. It may be because of the mapped paths to local media folders (instead of having to use SMB inside of a VM), or maybe that the overhead is lower from sharing the host kernel / not running a full virtualised os, but Emby Docker really is sweet. AMD HWA would be the icing on the cake 👍

Yeah, that's to be expected: SMB is a crappy network file system; you'd be better off using NFS where possible. Also, containerization like Docker has far less overhead than virtualization, and as always, a native installation is best in terms of performance. And with proper sandboxing via systemd you get the same security as Docker, if not better. I'm not a huge fan of Docker personally and always prefer to run things natively with proper sandboxing.

Posted
1 hour ago, alucryd said:

Since this is a very minimal docker, libs are directly in `/lib`, and VAAPI drivers are in `/lib/dri`.

Awesome, thanks for the clarification! So no need to initialise the libraries in any conf files or /etc or anything?

 

1 hour ago, alucryd said:

You might be right, I haven't delved really far into AMD hwaccel yet, having no hardware, but it was my understanding that VAAPI was provided by the mesa stack for AMD (and we did ship the mesa driver for a while, but since performance was meh at best, and building mesa was a pain, we dropped it with AMF in mind for the future of AMD). The Arch wiki page does mention that AMDGPU PRO supports VAAPI and VDPAU though, but my google-fu only points me towards AMF when looking into the pro stack.

Yeah, I don't know much about it either - I was only following what I'd seen in the Emby Knowledge Base for Linux HWA with VAAPI (https://support.emby.media/support/solutions/articles/44001160207-hardware-acceleration-on-linux). I did read that the mesa drivers also worked with VAAPI, but that there had been issues with that method.

 

1 hour ago, alucryd said:

Will be sure to forward you some test images when the time comes, thanks!

Looking forward to it 👍

 

1 hour ago, alucryd said:

I'm not a huge fan of docker personally and always prefer to run things natively with proper sandboxing.

I agree - I'd love to run native, but I don't want to ditch Unraid, and it would be tricky at best to get Emby running natively on Unraid. Docker is the next best option available to me, and having run Emby in a Docker for the last year or two, I've been very impressed with both its stability and its performance. NFS is great, and I did run it that way for a while back when I had Emby in a Linux VM, but I really missed the ability to watch for realtime filesystem changes to the media collection. I had a good go at it, and ended up automating the filesystem watching with some proxy scripts 🤖
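(For anyone curious, the gist of those scripts was an inotifywait loop that pokes the Emby API - a stripped-down sketch, with the URL, path and API key obviously placeholders:)

```bash
#!/bin/bash
# Watch the media share and ask Emby to rescan the library on any change.
# Server URL, media path and API key are placeholders.
MEDIA=/mnt/media
EMBY=http://localhost:8096
API_KEY=xxxxxxxx

inotifywait -m -r -e create,delete,move,close_write "$MEDIA" |
while read -r _event; do
  curl -s -X POST "$EMBY/emby/Library/Refresh?api_key=$API_KEY" >/dev/null
done
```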

 

Thanks for the help dude, I appreciate it 😁

4 weeks later...
Posted

@alucryd Thanks for your help last month, I appreciate it.

I was unable to get the amdgpu-pro drivers installed on the official Emby Docker... I just couldn't get my head around the environment in use there.

I grabbed the driver package and spent a good few hours writing a script to extract all the debs and put them where I thought they needed to be, but I think my lack of understanding of the environment was holding me back.

As a test I installed the linuxserver.io Emby container - they had added AMD GPU drivers earlier in the year, and AMD hardware acceleration worked out of the box (sort of). Transcoding from h264 would always go via the GPU, but anything h265/hevc would end up being software transcoded. I looked into it a little and noticed that they had installed the open-source mesa drivers and were using the custom Emby build of FFmpeg. I remembered you saying that there were issues with those, so I thought I'd try removing them and installing the proprietary AMD drivers.
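(A quick way to check what the installed VAAPI driver can actually do is vainfo:)

```bash
# Lists the profiles/entrypoints the VAAPI driver exposes. Look for
# VAProfileHEVCMain: VAEntrypointVLD means HEVC decode is available,
# VAEntrypointEncSlice means HEVC encode.
vainfo
```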

It turned out to be quite easy, since they base their containers on Ubuntu LTS, which I'm pretty familiar with. I wrote a quick container startup script (they provide a mechanism for supplying your own startup scripts) that checks whether the drivers have been installed previously; if they haven't, it removes the mesa drivers and installs the official AMD ones. This way they are always put back after the container is updated or recreated.
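In case it helps anyone on the linuxserver.io container, here's the shape of the script (the installer path and flags are approximate, from the 20.40 bundle I used):

```bash
#!/bin/bash
# /custom-cont-init.d/90-amdgpu-pro.sh
# linuxserver.io images run scripts from /custom-cont-init.d at startup.
set -e

# Marker lives in the container fs (not /config), so recreating or
# updating the container wipes it and the install runs again.
MARKER=/.amdgpu-pro-installed
[ -f "$MARKER" ] && exit 0

# Drop the open-source mesa VAAPI driver the image ships with.
apt-get remove -y mesa-va-drivers || true

# Install the userspace half of AMDGPU-PRO (the kernel module comes from
# the Unraid host). The extracted bundle sits on a mounted share.
cd /config/drivers/amdgpu-pro-20.40-*
./amdgpu-pro-install -y --no-dkms

touch "$MARKER"
```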

Using the official drivers, I was actually able to transcode both h264 and hevc in hardware.

I've only just got it working, and I don't know if I'll stay with the linuxserver.io container, as I'd prefer to be on the official one, but I just wanted to report back my findings.

I wish I could wrap my head around how to install the drivers on the official emby container, and apply my driver re-installation hack on that instead, but it completely blows my head up trying to figure it out 😂

But yeah, just wanted to feed back that, at least on Unraid, having the latest AMD drivers in the container lets me use HW transcoding via VAAPI.

Posted

@flexage Thanks for the feedback - we're debating whether to bring the VAAPI drivers back, or to go with AMF instead (or both). Still waiting for that shiny new AMD graphics card; although I was able to purchase one, I still have no idea when I'll receive it, unfortunately :(

Posted

@alucryd I'm sure you'll do whatever is manageable with regards to driver type, just having any one of them will allow for hw transcoding which is a huge plus over nothing at all. 👍

I know how you feel with regard to delivery - the Radeon 550 I bought recently was faulty, and it took over 4 weeks for Amazon to get through the returns process. Ultimately, after they failed to collect it 3 times, they let me keep the card and issued a refund anyway. It's pretty unstable though, so no use to anyone 😂

It's great that you've managed to purchase one, I hope you don't have to wait very much longer, it's tedious!

I'm still happy to help with testing 👍

Just out of curiosity: I was looking over the public GitHub repository that's supposed to be for the Emby Docker (MediaBrowser/Emby.Build), and it hadn't been updated in quite a while. Is that repo perhaps obsolete, replaced with a private one? No problem if so - I just won't refer to it again when tinkering with the container. 😂
