
NVIDIA HW Enc/Dec fails randomly



Posted (edited)

I have Emby Server set up in docker on Ubuntu 24.04. NVIDIA HW encoding/decoding fails randomly. I attached 2 log files from playing the exact same video. The *-bad.txt log shows an error:

09:27:24.715 ffmpeg version 5.1-emby_2023_06_25 Copyright (c) 2000-2022 the FFmpeg developers and softworkz for Emby LLC
09:27:24.715   built with gcc 10.3.0 (crosstool-NG 1.25.0)
09:27:24.715 Execution Date: 2025-02-21 09:27:24
09:27:24.717 [AVHWDeviceContext @ 0x292c5a80] cu->cuInit(0) failed -> CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
09:27:24.718 Device creation failed: -542398533.
09:27:24.718 Failed to set value 'cuda=cuda:0' for option 'init_hw_device': Generic error in an external library
09:27:24.718 Error parsing global options: Generic error in an external library
09:27:24.718 EXIT

After restarting the embyserver docker container it starts working (*-good.txt log file).
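
For anyone comparing their own good and bad transcode logs, here is a minimal sketch in Python that looks for the failure signature quoted above. The function name `find_cuda_init_error` is made up for illustration, and the pattern is only based on the single log line in this post, so other ffmpeg builds may word the message differently.

```python
import re

# Pattern derived from the failing line in the *-bad.txt excerpt above.
# Assumption: other failures use the same "cuInit(0) failed -> CUDA_ERROR_..." wording.
CUDA_INIT_FAILURE = re.compile(r"cuInit\(0\) failed -> (CUDA_ERROR_\w+)")


def find_cuda_init_error(log_text):
    """Return the CUDA error name from the first cuInit failure, or None if there is none."""
    match = CUDA_INIT_FAILURE.search(log_text)
    return match.group(1) if match else None
```

Reading one of the attached log files into a string and passing it to `find_cuda_init_error` should be enough to tell a "lost GPU" transcode from a normal one.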

I purchased a monthly subscription to test HW enc/dec. My intention is to buy a lifetime license, but if this issue is not fixed, I'll have to pass on it.

Please, let me know if you need more information/logs to investigate the problem.

Thank you.

 

ffmpeg-transcode-bad.txt ffmpeg-transcode-good.txt

Edited by ZR1000A1
Typo
Posted (edited)

Anyone? It's still happening. Restarting Emby Server does not help; only restarting the embyserver docker container brings NVIDIA transcoding back.

Edited by ZR1000A1
Posted

Hi, how have you configured the docker container? Are you able to try our native package on the host machine directly and see how that compares?

Posted (edited)

I set it up following these instructions, including the video and render GIDs. I also (obviously) installed nvidia-container-runtime.

The weird thing is that HW transcoding may keep working fine for a few days and then the container just loses the GPU. The host still sees the NVIDIA GPU fine. A simple restart of the embyserver container brings it back to normal. I understand that these kinds of intermittent/random issues are extremely hard to solve, but maybe you have some ideas about what could cause it?

Thanks!

Update: attached docker compose file

docker-compose-embyserver.yaml

Edited by ZR1000A1
TravelAccount
Posted

 


What is your docker host?

 

Posted

It's in OP -> Ubuntu 24.04

Posted

@ZR1000A1 Are you able to try our native package on the host machine directly and see how that compares?

Posted (edited)

@Luke In theory I can do that, but unfortunately I need to run Emby Server in a docker container. There is good news, though: I think I have narrowed it down to system updates. The last 2 times, the problem happened right after

sudo apt upgrade

If this is the case, I can live with it. I'll postpone updates for a couple of weeks and see how it goes.

Thanks!  

P.S. If it matters, here's the list of packages whose update caused embyserver to lose the GPU:

2025-03-01 00:07:19 upgrade libopeniscsiusr:amd64 2.1.9-3ubuntu5.2 2.1.9-3ubuntu5.3
2025-03-01 00:07:19 upgrade open-iscsi:amd64 2.1.9-3ubuntu5.2 2.1.9-3ubuntu5.3
2025-03-01 00:07:20 upgrade initramfs-tools:all 0.142ubuntu25.4 0.142ubuntu25.5
2025-03-01 00:07:20 upgrade initramfs-tools-core:all 0.142ubuntu25.4 0.142ubuntu25.5
2025-03-01 00:07:20 upgrade initramfs-tools-bin:amd64 0.142ubuntu25.4 0.142ubuntu25.5
2025-03-01 00:07:20 upgrade libvirt-daemon-config-nwfilter:all 10.0.0-2ubuntu8.5 10.0.0-2ubuntu8.6
2025-03-01 00:07:20 upgrade libvirt-daemon-driver-qemu:amd64 10.0.0-2ubuntu8.5 10.0.0-2ubuntu8.6
2025-03-01 00:07:20 upgrade libvirt-clients:amd64 10.0.0-2ubuntu8.5 10.0.0-2ubuntu8.6
2025-03-01 00:07:20 upgrade libvirt-daemon:amd64 10.0.0-2ubuntu8.5 10.0.0-2ubuntu8.6
2025-03-01 00:07:20 upgrade libvirt-daemon-system:amd64 10.0.0-2ubuntu8.5 10.0.0-2ubuntu8.6
2025-03-01 00:07:20 upgrade libvirt-daemon-config-network:all 10.0.0-2ubuntu8.5 10.0.0-2ubuntu8.6
2025-03-01 00:07:21 upgrade libvirt-l10n:all 10.0.0-2ubuntu8.5 10.0.0-2ubuntu8.6
2025-03-01 00:07:21 upgrade libvirt-daemon-system-systemd:all 10.0.0-2ubuntu8.5 10.0.0-2ubuntu8.6
2025-03-01 00:07:21 upgrade libvirt0:amd64 10.0.0-2ubuntu8.5 10.0.0-2ubuntu8.6

 

Edited by ZR1000A1
Posted

Right, I wasn't asking you to change how you run Emby Server, just to fire it up briefly with the native package to compare.

Posted

I think I can confirm that

apt upgrade

causes embyserver docker container to lose NVIDIA GPU. Ran another upgrade today, which updated the following packages:

2025-03-03 10:45:01 upgrade libcryptsetup12:amd64 2:2.7.0-1ubuntu4.1 2:2.7.0-1ubuntu4.2
2025-03-03 10:45:01 upgrade cryptsetup-initramfs:all 2:2.7.0-1ubuntu4.1 2:2.7.0-1ubuntu4.2
2025-03-03 10:45:01 upgrade cryptsetup-bin:amd64 2:2.7.0-1ubuntu4.1 2:2.7.0-1ubuntu4.2
2025-03-03 10:45:01 upgrade cryptsetup:amd64 2:2.7.0-1ubuntu4.1 2:2.7.0-1ubuntu4.2
2025-03-03 10:45:01 upgrade libfwupd2:amd64 1.9.27-0ubuntu1~24.04.1 1.9.28-0ubuntu1~24.04.1
2025-03-03 10:45:01 upgrade fwupd:amd64 1.9.27-0ubuntu1~24.04.1 1.9.28-0ubuntu1~24.04.1
2025-03-03 10:45:01 upgrade libpackagekit-glib2-18:amd64 1.2.8-2ubuntu1.1 1.2.8-2ubuntu1.2
2025-03-03 10:45:01 upgrade gir1.2-packagekitglib-1.0:amd64 1.2.8-2ubuntu1.1 1.2.8-2ubuntu1.2
2025-03-03 10:45:01 upgrade packagekit-tools:amd64 1.2.8-2ubuntu1.1 1.2.8-2ubuntu1.2
2025-03-03 10:45:01 upgrade packagekit:amd64 1.2.8-2ubuntu1.1 1.2.8-2ubuntu1.2
2025-03-03 10:45:01 upgrade pollinate:all 4.33-3.1ubuntu1 4.33-3.1ubuntu1.1
2025-03-03 10:45:01 upgrade qemu-system-modules-spice:amd64 1:8.2.2+ds-0ubuntu1.5 1:8.2.2+ds-0ubuntu1.6
2025-03-03 10:45:01 upgrade qemu-system-modules-opengl:amd64 1:8.2.2+ds-0ubuntu1.5 1:8.2.2+ds-0ubuntu1.6
2025-03-03 10:45:01 upgrade qemu-system-gui:amd64 1:8.2.2+ds-0ubuntu1.5 1:8.2.2+ds-0ubuntu1.6
2025-03-03 10:45:02 upgrade qemu-block-extra:amd64 1:8.2.2+ds-0ubuntu1.5 1:8.2.2+ds-0ubuntu1.6
2025-03-03 10:45:02 upgrade qemu-utils:amd64 1:8.2.2+ds-0ubuntu1.5 1:8.2.2+ds-0ubuntu1.6
2025-03-03 10:45:02 upgrade qemu-system-x86:amd64 1:8.2.2+ds-0ubuntu1.5 1:8.2.2+ds-0ubuntu1.6
2025-03-03 10:45:02 upgrade qemu-system-common:amd64 1:8.2.2+ds-0ubuntu1.5 1:8.2.2+ds-0ubuntu1.6
2025-03-03 10:45:02 upgrade qemu-system-data:all 1:8.2.2+ds-0ubuntu1.5 1:8.2.2+ds-0ubuntu1.6
2025-03-03 10:45:02 upgrade cloud-init:all 24.4-0ubuntu1~24.04.2 24.4.1-0ubuntu0~24.04.1

and embyserver immediately stopped HW transcoding.

I think my question is: is it "fixable" in any way, or do I just need to make a habit of restarting the embyserver docker container every time `apt upgrade` is run?

Posted

Unrelated question. 

I used to be able to edit my own posts, but this option no longer seems to be available. Why?

Posted (edited)

@Luke OK, now I'm 100% convinced that the problem is caused by system upgrades! I was not that sure before, because I remembered noticing the HW transcoding issue without explicitly running `apt upgrade`. This morning embyserver lost the GPU again, although it was working last night. I checked the system logs and found that Unattended-Upgrades was enabled. I'm building a new server and forgot to disable it. The following packages got upgraded overnight:

2025-03-04 06:33:42 upgrade krb5-locales:all 1.20.1-6ubuntu2.4 1.20.1-6ubuntu2.5
2025-03-04 06:33:42 upgrade libgssapi-krb5-2:amd64 1.20.1-6ubuntu2.4 1.20.1-6ubuntu2.5
2025-03-04 06:33:42 upgrade libkrb5-3:amd64 1.20.1-6ubuntu2.4 1.20.1-6ubuntu2.5
2025-03-04 06:33:42 upgrade libkrb5support0:amd64 1.20.1-6ubuntu2.4 1.20.1-6ubuntu2.5
2025-03-04 06:33:42 upgrade libk5crypto3:amd64 1.20.1-6ubuntu2.4 1.20.1-6ubuntu2.5
2025-03-04 06:33:45 upgrade wpasupplicant:amd64 2:2.10-21ubuntu0.1 2:2.10-21ubuntu0.2

So, my verdict is that ANY system upgrade causes the embyserver docker container to lose the GPU.
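
To check the same correlation on another machine, a rough Python sketch like the one below can parse upgrade lines in the format pasted above and report the last package upgraded before a given failure time. It assumes the lines come from a dpkg-style log (on Ubuntu that is normally `/var/log/dpkg.log`); the helper names `parse_upgrades` and `last_upgrade_before` are invented for this example.

```python
from datetime import datetime


def parse_upgrades(log_text):
    """Return (timestamp, package, old_version, new_version) for each 'upgrade' line."""
    upgrades = []
    for line in log_text.splitlines():
        parts = line.split()
        # Expected shape: DATE TIME upgrade PACKAGE OLD_VERSION NEW_VERSION
        if len(parts) == 6 and parts[2] == "upgrade":
            stamp = datetime.strptime(parts[0] + " " + parts[1], "%Y-%m-%d %H:%M:%S")
            upgrades.append((stamp, parts[3], parts[4], parts[5]))
    return upgrades


def last_upgrade_before(upgrades, failure_time):
    """Return the most recent upgrade at or before failure_time, or None."""
    earlier = [u for u in upgrades if u[0] <= failure_time]
    return max(earlier, key=lambda u: u[0]) if earlier else None
```

Feeding it the log text and the timestamp of the first failed transcode shows how close the last upgrade was to the moment the GPU disappeared. This only shows timing, not cause.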

It would be nice if the developers could address the issue. Meanwhile, everybody should restart the embyserver container after every `apt upgrade`.
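
Until there is a proper fix, the restart can be automated. Below is a hedged Python sketch of a small watchdog: it runs `nvidia-smi` inside the container and restarts the container when that fails. It rests on several assumptions that are not confirmed in this thread: the container is named `embyserver`, `nvidia-smi` is available inside it (the NVIDIA runtime usually provides it, but not always), and `nvidia-smi` actually exits with an error once the container has lost the GPU. The function names are made up for illustration.

```python
import subprocess

CONTAINER = "embyserver"  # name used in this thread; change it to match your setup


def run(cmd):
    """Run a command, discard its output, and return its exit code."""
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode


def gpu_visible(container=CONTAINER, runner=run):
    # Assumption: nvidia-smi exists in the container and fails when the GPU is gone.
    return runner(["docker", "exec", container, "nvidia-smi"]) == 0


def restart_if_gpu_lost(container=CONTAINER, runner=run):
    """Restart the container only if the GPU check fails. Return True if restarted."""
    if gpu_visible(container, runner):
        return False
    runner(["docker", "restart", container])
    return True
```

Calling `restart_if_gpu_lost()` from a cron job or a systemd timer after upgrades would approximate the manual workaround. Note that a restart interrupts any active playback, so it may be worth running it only at quiet times.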

Edited by ZR1000A1
