Hardware detection fails within docker container on previously working system.



darkassassin07
Posted (edited)

I noticed last night that some items were converting very slowly and realized they were being transcoded on the CPU. Digging a bit more this morning, I found that Emby is hitting an error while running ffdetect during startup.

embyserver.txt:

2025-11-17 09:20:46.171 Info NvidiaCodecProvider: ProcessRun 'ffdetect_nvencdec' Execute: /bin/ffdetect -hide_banner -show_program_version -loglevel 48 -show_error -show_log 40 nvencdec -print_format json 
2025-11-17 09:20:46.172 Info NvidiaCodecProvider: ProcessRun 'ffdetect_nvencdec' Process exited with code 1 - Failed

When I run it manually using docker exec, I get:

ffdetect version 5.1-emby_2023_06_25_p4 Copyright (c) 2018-2022 softworkz for Emby LLC
  built with gcc 10.3.0 (crosstool-NG 1.25.0)
  configuration: --cc=x86_64-emby-linux-gnu-gcc --prefix=/home/embybuilder/Buildbot/x64/ffmpeg-x64/staging --disable-alsa --disable-doc --disable-ffplay --disable-gnutls --disable-libpulse --disable-librtmp --disable-libxcb --disable-openssl --disable-vdpau --disable-vulkan --disable-xlib --enable-chromaprint --enable-fontconfig --enable-gpl --enable-iconv --enable-libaribb24 --enable-libass --enable-libdav1d --enable-libfreetype --enable-libfribidi --enable-libmp3lame --enable-libopus --enable-libtheora --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libzvbi --enable-mbedtls --enable-pic --enable-version3 --enable-libtesseract --enable-cuda-llvm --enable-cuvid --enable-libdrm --enable-libmfx --enable-nvdec --enable-nvenc --enable-vaapi --enable-opencl --enable-cross-compile --cross-prefix=x86_64-emby-linux-gnu- --arch=x86_64 --target-os=linux --enable-shared --disable-static --pkg-config=pkg-config --pkg-config-flags=--static --extra-libs='-ldl -lm -lstdc++ -lsharpyuv -pthread' --disable-debug
  libavutil      57. 28.100 / 57. 28.100
Cannot load libcuda.so.1
Error loading CUDA functions
{
    "ProgramVersion": {
        "Version": "5.1-emby_2023_06_25_p4",
        "Copyright": "Copyright (c) 2018-2022 softworkz for Emby Llc",
        "Compiler": "gcc 10.3.0 (crosstool-NG 1.25.0)",
        "Configuration": "--cc=x86_64-emby-linux-gnu-gcc --prefix=/home/embybuilder/Buildbot/x64/ffmpeg-x64/staging --disable-alsa --disable-doc --disable-ffplay --disable-gnutls --disable-libpulse --disable-librtmp --disable-libxcb --disable-openssl --disable-vdpau --disable-vulkan --disable-xlib --enable-chromaprint --enable-fontconfig --enable-gpl --enable-iconv --enable-libaribb24 --enable-libass --enable-libdav1d --enable-libfreetype --enable-libfribidi --enable-libmp3lame --enable-libopus --enable-libtheora --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libzvbi --enable-mbedtls --enable-pic --enable-version3 --enable-libtesseract --enable-cuda-llvm --enable-cuvid --enable-libdrm --enable-libmfx --enable-nvdec --enable-nvenc --enable-vaapi --enable-opencl --enable-cross-compile --cross-prefix=x86_64-emby-linux-gnu- --arch=x86_64 --target-os=linux --enable-shared --disable-static --pkg-config=pkg-config --pkg-config-flags=--static --extra-libs='-ldl -lm -lstdc++ -lsharpyuv -pthread' --disable-debug"
    },
    "Error": {
        "Number": -1,
        "Message": "Operation not permitted"
    },
    "Log": [
        {
            "Level": 16,
            "Category": 0,
            "Message": "Cannot load libcuda.so.1"
        },
        {
            "Level": 16,
            "Category": 0,
            "Message": "Error loading CUDA functions"
        }

 

I'm not sure what could have changed to introduce a permission error, or how to fix it.
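
For anyone retracing this, the manual run can be sketched as below. The ffdetect arguments are copied from the embyserver.txt line above and the container name emby comes from the compose file; the run_ffdetect wrapper is only an illustrative name, not something Emby ships.

```shell
# Run Emby's NVIDIA hardware detection by hand inside the container.
# Arguments are copied from the 'ffdetect_nvencdec' log line in embyserver.txt.
# run_ffdetect is a hypothetical helper; the container name defaults to "emby".
run_ffdetect() {
  docker exec "${1:-emby}" /bin/ffdetect -hide_banner -show_program_version \
    -loglevel 48 -show_error -show_log 40 nvencdec -print_format json
}

# Usage:
#   run_ffdetect          # uses the container named "emby"
#   run_ffdetect my-emby  # any other container name
```

A non-zero exit code here corresponds to the "Process exited with code 1 - Failed" line that Emby logs at startup.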

 

This is within a docker container, so here is my compose file, which hasn't changed in a long time and was previously transcoding using hardware just fine:

 

services:
  emby:
    image: emby/embyserver:latest
    container_name: emby
    runtime: nvidia # Expose NVIDIA GPUs
    environment:
      - UID=1000 # The UID to run emby as (default: 2)
      - GID=1000 # The GID to run emby as (default 2)
      - GIDLIST=1000,44,104
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=all
      - TZ=America/Vancouver
    volumes:
      - /srv/emby/programdata:/config # Configuration directory
      - /mnt/Drobo/Media:/Media # Media directory
      - /mnt/Backups/Emby:/Backups #backup directory
      - /mnt/Cache/Emby_Cache:/Cache #cache
      - /mnt/Cache/Transcoding_Temp:/Transcoding_temp #transcoding directory
      #- type: tmpfs
        #target: /Transcoding_temp
    #    ports:
    #      - 8096:8096 # HTTP port
    devices:
      - /dev/dri:/dev/dri # VAAPI/NVDEC/NVENC render nodes
    networks:
      arr:
        ipv4_address: 172.18.0.94
    restart: unless-stopped
    labels:
      com.centurylinklabs.watchtower.enable: true
networks:
  arr:
    external: true

 

Emby Server v4.9.1.90 on Debian 12

Docker engine: 29.0.2

Containerd: 2.1.5

 

embyserver.txt hardware_detection-63898968046.txt

Edited by darkassassin07
Posted

Hi, is that the official emby docker container?

darkassassin07
Posted

I've been able to confirm other docker containers are successfully using hardware transcoding, namely Tdarr; it just seems to be the Emby container running into this issue.

I then tried reverting to 4.9.1.80 and then 4.9.1.36; both encountered the same error in their hardware detection logs.

Perhaps this is an issue/interaction with the latest nvidia-container-toolkit (v1.18.0-1), which I updated to on Nov 9th along with my usual 'apt-get upgrade'?

 

 

 

Has anyone got an Emby docker container with Nvidia hardware transcoding working that I can compare against? What versions are you running?
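
To compare setups like for like, the versions in question can be read off the Docker host with something like the sketch below. It assumes a Debian-based host where dpkg-query is available and the driver's nvidia-smi is on the PATH; show_gpu_stack_versions is a made-up helper name.

```shell
# Print the versions being compared in this thread. Run on the Docker host.
# Assumptions: Debian-based host (dpkg-query), NVIDIA driver installed (nvidia-smi).
# show_gpu_stack_versions is a hypothetical helper name.
show_gpu_stack_versions() {
  echo "GPU / driver: $(nvidia-smi --query-gpu=name,driver_version --format=csv,noheader)"
  echo "Container toolkit: $(dpkg-query -W -f='${Version}' nvidia-container-toolkit)"
  echo "Docker engine: $(docker version --format '{{.Server.Version}}')"
}

# Usage:
#   show_gpu_stack_versions
```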

Posted
Hardware transcoding works for me on 4.9.2.6 beta, using a Tesla T4 with nvidia-container-toolkit v1.18.0-1 and driver version 580.95.05.

Posted

Btw you do not need /dev/dri:/dev/dri when you're using nvidia in your docker compose.

darkassassin07
Posted (edited)

Hmm. That leaves me a little unsure where to look next.

 

I do seem to be running an older version of the driver (535.247.01-1~deb12u1) which is the newest apt finds, but I'm not sure why that would cause this or how to update beyond what apt found (I'm a bit fresh to Linux in general).

 

/edit: beta 4.9.2.6 throws the same permission error.

 

 

The test nvidia provides:

sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi

Works just fine. I don't know where else to look.

Edited by darkassassin07
Posted

Are you running the script for your 3080 to unlock the 2x stream limit?

darkassassin07
Posted
23 minutes ago, guunter said:

Are you running the script for your 3080 to unlock the 2x stream limit?

I don't recall doing anything like that. I haven't really needed more than 2 streams tho.

 

My main concern is just getting it to work at all rn.

 

I put all my focus into fixing my openvpn server last night; hopefully I'll have time to dive more into this in the next couple of days.

I'm going to attempt to re-install the drivers+container toolkit. Maybe that'll get me somewhere 🤷

I don't understand why tdarr is using the card to transcode just fine, but emby fails. I would think both would fail if it were a driver issue, but I don't know what else to do.

Posted

Hi, did you find anything?

darkassassin07
Posted

Sort of?

When I run 'ls -ln /dev/dri' on the host and via exec in my Tdarr container, card0 and renderD128 are in groups 44 and 105 respectively in both places. In the Emby container, however, both are in group 0.

I presume that's the cause of the permission errors, but I don't know why it's like that or how to fix it.
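
The comparison described above can be sketched as a small helper. The container names passed in are whatever yours are called (emby and tdarr are assumed here), and show_dri_groups is a hypothetical name used only for this example.

```shell
# List /dev/dri with numeric owner/group IDs on the host and inside each
# named container, so the groups of card0 and renderD128 can be compared.
# show_dri_groups is a hypothetical helper; container names are arguments.
show_dri_groups() {
  echo "== host =="
  ls -ln /dev/dri
  for name in "$@"; do
    echo "== container: $name =="
    docker exec "$name" ls -ln /dev/dri
  done
}

# Usage, assuming containers named emby and tdarr:
#   show_dri_groups emby tdarr
```

On the setup in this thread, the host and Tdarr listings showed groups 44 and 105, while the Emby listing showed group 0 for both devices.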

 

I've reinstalled the drivers and container-toolkit for good measure, but no change.

  • Solution
darkassassin07
Posted

Solved!

 

My Emby compose file had:

    runtime: nvidia # Expose NVIDIA GPUs

 

Tdarr, on the other hand, used:

    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities:
                - gpu

 

I had tried adding the deploy section to Emby, but hadn't removed the runtime arg. Once I properly swapped them out, Emby's container saw the GPU in the correct groups, and now both Emby and Tdarr are able to use the GPU as expected. (I had also tried stopping Tdarr to see if it was somehow 'reserving' the GPU and preventing Emby from using it, as the term 'reservations' above would imply, but that was not the case.)
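
To see which of the two forms a running container actually ended up with, docker inspect can print the relevant fields. This is a sketch under the assumption that 'runtime: nvidia' surfaces as HostConfig.Runtime and the deploy/reservations block surfaces as HostConfig.DeviceRequests; show_gpu_config is a hypothetical helper name.

```shell
# Print how each named container was granted GPU access.
# Assumption: `runtime: nvidia` appears as HostConfig.Runtime, while the
# deploy.resources.reservations.devices form appears as HostConfig.DeviceRequests.
# show_gpu_config is a hypothetical helper; container names are arguments.
show_gpu_config() {
  for name in "$@"; do
    echo "== $name =="
    docker inspect \
      --format 'runtime={{.HostConfig.Runtime}} device_requests={{json .HostConfig.DeviceRequests}}' \
      "$name"
  done
}

# Usage, assuming containers named emby and tdarr:
#   show_gpu_config emby tdarr
```

Compose only applies an edited service definition once the container is recreated, for example with 'docker compose up -d', so the check is worth repeating after that.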

 

@guunter Which of the two do you have, or are you using something different?

Posted
I use the linuxserver.io image.

---
services:
  emby:
    image: lscr.io/linuxserver/emby:beta
    container_name: emby
    runtime: nvidia
    environment:
      - PUID=1001
      - PGID=1001
      - TZ=America/Los_Angeles
      - NVIDIA_VISIBLE_DEVICES=all
    volumes:
      - /opt/appdata/emby:/config
      - /mnt/nas/media/:/mnt/nas/media/:ro
      - /dev/shm:/transcoding
    ports:
      - 8096:8096
      - 8920:8920 #optional
    restart: unless-stopped

 

 

 

 
