
Docker HWA Nvidia Instructions



byakuya32
 

Hello!

I'm running Debian Bullseye. Do you think I can follow the same guide you posted, @TallBoiDez?

At this point there is no NVIDIA driver passthrough for Docker on Debian 11, only Debian 10.


karlshea

Recent docker compose instructions on Ubuntu 20 (Docker v20)

This assumes you have the NVIDIA drivers installed.

https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker

1. Follow instructions to install docker-ce

2. Paste the code under "Setup the package repository and the GPG key"

3. DO NOT install nvidia-docker2; it is deprecated. Instead, install nvidia-container-toolkit:

apt install nvidia-container-toolkit

4. Install docker-compose-plugin

apt install docker-compose-plugin

5. Restart docker

systemctl restart docker

6. Configure your docker-compose.yml. Example:

version: '3.8'
services:
  emby:
    container_name: emby
    image: emby/embyserver:latest
    
    # Obviously you'll need to update these
    environment:
      - UID=1002
      - GID=1002

    network_mode: 'host'

    ports:
      - '8096:8096'

    volumes:
      - "./config:/config"

    restart: unless-stopped

    # This is how GPU support is configured with recent docker/docker compose:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

7. Start your container

docker compose up

(Notice: NO HYPHEN. The `docker-compose` binary coming from the Ubuntu 20 repos is outdated and can't use the device reservations above, so we're using the compose plugin via the `docker` command instead.)
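8. (Optional) Sanity-check GPU passthrough

A quick check, assuming the toolkit installed cleanly: NVIDIA's test container from the install guide linked above should print your GPU table.

docker run --rm --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi

If that works but the Emby container still doesn't see the GPU, the problem is likely in the compose file rather than the host setup.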


zfrenchy

Hi guys, can one of you tell me if this is applicable to unRAID?

I have the Emby docker installed, but it's not recognizing my Nvidia card.


TallBoiDez
21 minutes ago, zfrenchy said:

Hi guys, can one of you tell me if this is applicable to unRAID?

I have the Emby docker installed, but it's not recognizing my Nvidia card.

This might work for unRAID, but I'm not familiar with unRAID so I can't say for certain.


Arrowtron

Hello,

I'm having trouble getting NVIDIA hardware detection to work in my environment. (I had this working previously, on an earlier version of Debian and maybe a different GFX card; I can't remember.)

Debian 11 (bullseye)
OMV6
Linux 5.18.0-0.bpo.1-amd64
docker 5:20.10.17~3-0~debian-bullseye20.10.5+dfsg1-1+deb11u2
NVIDIA GeForce GTX 750 Ti
NVIDIA Drivers: 515.65.01

 

I've installed CUDA/NVIDIA drivers/toolkits etc. and
followed parts of this guide: https://forum.openmediavault.org/index.php?thread%2F38013-howto-nvidia-hardware-transcoding-on-omv-5-in-a-plex-docker-container%2F=
as well as the one in this thread:

add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/debian11/x86_64/ /"
apt-get install cuda
reboot

apt install nvidia-container-docker
apt install nvidia-docker2
apt install nvidia-smi

[screenshot: nvidia-smi output]

I haven't been successful in getting any running processes to show up, e.g. Plex (although I think Plex requires a license for HW transcoding, which I don't have) or the NVIDIA test containers.

/etc/docker/daemon.json

{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia" 
}
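(For reference, the NVIDIA test containers mentioned above are run along these lines, per NVIDIA's install guide; the exact image tag here is just an example:)

docker run --rm --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi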

I'm using portainer to manage docker containers

Environment variables (some of these get added back automatically; I had removed NVIDIA_VISIBLE_DEVICES & NVIDIA_DRIVER_CAPABILITIES for testing).
UID 1002 is the emby user/group.

UID=1002
GID=1002
GIDLIST=1002,44,107
VERSION=latest
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
LANG=en_US.UTF-8
HOME=/tmp
AMDGPU_IDS=/share/libdrm/amdgpu.ids
FONTCONFIG_PATH=/etc/fonts
LD_LIBRARY_PATH=/lib:/system
LIBVA_DRIVERS_PATH=/lib/dri
OCL_ICD_VENDORS=/etc/OpenCL/vendors
PCI_IDS_PATH=/share/hwdata/pci.ids
SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt
gpus=all
XDG_CACHE_HOME=/config/cache
LANGUAGE=en_US.UTF-8
TERM=xterm
S6_CMD_WAIT_FOR_SERVICES_MAXTIME=0
NVIDIA_VISIBLE_DEVICES=all
NVIDIA_DRIVER_CAPABILITIES=compute,video,utility

Devices set to /dev/dri:/dev/dri; privileged mode on or off didn't make a difference.

[screenshot: portainer emby runtime settings]

 

Emby's hardware detection logs appear to find the driver and card, but it seems like a permission issue? The permissions I've set in the docker appear to be correct.
Or is the GeForce GTX 750 Ti not supported? I do have a 1060 I could try.

I can't work it out. Does anyone have any ideas, or have I missed something obvious to try? The Emby server works fine; it just struggles when playing high-quality media that needs more resources, because HW acceleration isn't available.

logs attached:
"Message": "Failed to initialize VA /dev/dri/renderD128. Error -1"

ls -la /dev/dri

[screenshot: ls -la /dev/dri output]

hardware_detection-63796607500.txt


Arrowtron
13 hours ago, isamudaison said:

Since you're running Kernel version 5.18.x, you'll need to apply this fix to get the nvidia device to be accessible :

https://github.com/NVIDIA/open-gpu-kernel-modules/issues/256

Specifically, the `ibt=off` kernel parameter

Nice find. I tried this on my instance and it didn't make a difference:

GRUB_CMDLINE_LINUX_DEFAULT="quiet ibt=off", then update-grub, reboot, etc.

Did some more reading, and I think a lot of the people affected had different issues (they couldn't boot at all) and newer CPUs by the looks of it. I'm currently using an i3-3240 for this.

I installed different NVIDIA driver versions (470.129.06, CUDA 11.4) and hit the same issue, with the same error logs in Emby. I wanted to see if Plex could pick up the NVIDIA hardware, but I don't have a Plex Pass for that.


isamudaison

Interesting... we have similar setups (I'm running Ubuntu 22.04 server, but kernel 5.18, so it shouldn't matter *too* much). I recall there is some confusion out there re: installing the right docker packages from nvidia, so here are my installed packages (and it's working on my setup):

 

apt list --installed | grep nvidia

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

libnvidia-cfg1-515-server/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed,automatic]
libnvidia-compute-515-server/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed,automatic]
libnvidia-container-tools/bionic,now 1.10.0-1 amd64 [installed,automatic]
libnvidia-container1/bionic,now 1.10.0-1 amd64 [installed,automatic]
libnvidia-decode-515-server/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed,automatic]
libnvidia-encode-515-server/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed]
nvidia-compute-utils-515-server/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed,automatic]
nvidia-container-toolkit/bionic,now 1.10.0-1 amd64 [installed]
nvidia-dkms-515-server/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed,automatic]
nvidia-headless-515-server/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed]
nvidia-headless-no-dkms-515-server/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed,automatic]
nvidia-kernel-common-515-server/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed,automatic]
nvidia-kernel-source-515-server/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed,automatic]
nvidia-utils-515-server/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed]

I can paste my 'docker container inspect' results as well if you'd like that.


Arrowtron
On 8/30/2022 at 5:21 AM, isamudaison said:

Interesting... we have similar setups; here are my installed packages (and it's working on my setup): [package list quoted above]

I can paste my 'docker container inspect' results as well if you'd like that.

Yes please, feel free to paste your docker container inspect output, or any config that differs from mine that I can try.

Here's my mounts

Mounts

    0 { Destination: /config, Mode: , Propagation: rprivate, RW: true, Source: /srv/docker_app/emby/config, Type: bind }
    1 { Destination: /dev/dri, Mode: , Propagation: rprivate, RW: true, Source: /dev/dri, Type: bind }

Other things I've tried so far:

 

I have yet to try (when I get the time):

  • Installing Emby through flatpak or a deb install onto the OS to see if it can see the HW; if it does, then it has to be something to do with the docker/nvidia-container config.

Arrowtron

So I managed to get this to work by installing the local/normal Emby package for Linux. This creates the emby user/group, permissions, and other config.
What I originally had was the emby user/group created in OpenMediaVault; this does create the UID/GID on the Linux system, but my guess is it's missing other configuration that links to those groups.
Once the normal Emby install completed, I just disabled the service, got the IDs of the emby user and group it generated, and referenced those values in the docker container.

 

Some instructions if anyone is interested in the future:

Download Emby and install it using dpkg.

Get the NVIDIA drivers from the official NVIDIA site and follow the onscreen prompts:

wget https://us.download.nvidia.com/XFree86/Linux-x86_64/515.65.01/NVIDIA-Linux-x86_64-515.65.01.run
chmod +x NVIDIA-Linux-x86_64-515.65.01.run
sudo ./NVIDIA-Linux-x86_64-515.65.01.run   # (assumed step: run the installer to get the onscreen prompts)

If you get a Nouveau error, follow this:
    How to remove the Nouveau kernel driver (fix Nvidia install error)

    I had to add the below to
    /usr/lib/modprobe.d/nvidia-disable-nouveau.conf
    /etc/modprobe.d/nvidia-disable-nouveau.conf

    blacklist nouveau
    blacklist lbm-nouveau
    options nouveau modeset=0
    alias nouveau off
    alias lbm-nouveau off

    Restart after this.
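    (Note: on Debian the initramfs usually also needs rebuilding before the reboot so the blacklist takes effect; assuming a standard initramfs-tools setup:)

    sudo update-initramfs -u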
    
Run nvidia-smi to confirm that the drivers and card are working.

sudo apt install nvidia-container-toolkit nvidia-container-runtime nvidia-docker2

   
Change/edit the Daemon configuration file /etc/docker/daemon.json (acquired this info from here)
My example:

{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia",
    "data-root": "/srv/docker"
}

Change/edit this configuration file /etc/nvidia-container-runtime/config.toml (I've removed some of the #comments)
 

disable-require = false
#swarm-resource = "DOCKER_RESOURCE_GPU"
accept-nvidia-visible-devices-envvar-when-unprivileged = true
#accept-nvidia-visible-devices-as-volume-mounts = false

[nvidia-container-cli]
environment = []
load-kmods = true

[nvidia-container-runtime]
#debug = "/var/log/nvidia-container-runtime.log"
log-level = "info"

# Specify the runtimes to consider. This list is processed in order and the PATH
# searched for matching executables unless the entry is an absolute path.
runtimes = [
    "docker-runc",
    "runc",
]

mode = "auto"

    [nvidia-container-runtime.modes.csv]

    mount-spec-path = "/etc/nvidia-container-runtime/host-files-for-container.d"


sudo systemctl restart docker
or reboot (if you haven't already after installing the NVIDIA drivers).


You might need to do this if it still isn't showing up (acquired this info from here):
"Since Docker 5.20.10.2 (I think) there was a change in how docker gets access to hardware via cgroups. You need this workaround in the kernel boot parameters:"

echo 'GRUB_CMDLINE_LINUX=systemd.unified_cgroup_hierarchy=false' > /etc/default/grub.d/cgroup.cfg
update-grub
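After rebooting, you can confirm which cgroup driver and version docker is using (docker info reports both):

docker info | grep -i cgroup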

The docker environment variables are essentially the same as in my first post, except for the different UID/GID values.

Hope that helps

[screenshot: nvidia-smi running inside docker]

[screenshot: Emby transcoding]


cypher0117

I've been scratching my head for a couple of days now. I've read a lot of posts and tried a lot of things outlined on these forums as well as elsewhere, but I cannot get my docker implementation to initialize renderD128 -> "Message": "Failed to initialize VA /dev/dri/renderD128. Error -1"

Hopefully someone has an idea of what I'm missing.

Ubuntu 22.04 server (headless)
Docker version 23.0.1
Docker Compose version v2.16.0

daemon.json

{
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    },
    "default-runtime": "nvidia"
}

 

nvidia-smi output

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.161.03   Driver Version: 470.161.03   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro K4000        Off  | 00000000:01:00.0 Off |                  N/A |
| 30%   37C    P0    28W /  87W |      0MiB /  3011MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

 

docker-compose file:
 

services:
  emby:
    image: emby/embyserver:latest
    container_name: "emby"
    environment:
      - PUID=998 #emby system user
      - PGID=999 #emby system group
      - GIDLIST=998,44,109,1001 #emby,video,render,sambagroup
      - TZ='America/Chicago'
      - gpus=all
    volumes:
      - ./application/config:/config
      #- volumesremoved
    devices:
      - /dev/dri:/dev/dri
    networks:
     - traefik-net
    labels:
     - "traefik.http.routers.emby.rule=Host($HOST)"
     - "traefik.http.routers.emby.entrypoints=websecure"
     - "traefik.http.routers.emby.tls=true"

     # Set port to use
     - "traefik.http.services.emby.loadbalancer.server.port=8096"

     # Enable traefik
     - traefik.enable=true
    restart: unless-stopped
    deploy:
      resources:
        reservations:
          devices:
            - capabilities: [gpu]
              driver: nvidia

 

 

I've successfully run the test docker image:
sudo docker run --rm --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
from the directions posted here: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html

Since the test image worked, I'm thinking this is purely an Emby docker config/permissions issue? Hopefully someone can point me in the right direction to finally enable GPU acceleration.

 

hardware_detection-63811833879.txt


D34DC3N73R

You're using a mix of old information and docker run commands in a docker compose file. 

The default runtime is fine now. `- gpus=all` is not an environment variable, and 

    devices:
      - /dev/dri:/dev/dri

is only needed for Intel Quick Sync.

You're also using PUID and PGID, which are specific to the linuxserver emby image, but you're using the official emby image, which requires 'UID=' and 'GID='.

And your deploy section is missing the driver.

For reference, this is my compose file. Note: I'm using an .env file to define the variables (the parts that start with $); an example .env follows the compose file.

version: '3.8'
services:
  emby:
    image: emby/embyserver
    container_name: emby
    restart: unless-stopped
    ports:
      - 8096:8096
      - 8920:8920
    environment:
      - TZ=$TZ
      - UID=$PUID
      - GID=$PGID
    volumes:
      - $HOME/.config/emby:/config
      - $HOME/media:/media
    tmpfs:
      - /transcode:mode=770,size=16G,uid=$PUID,gid=$PGID
    deploy:
      resources:
        reservations:
          devices:
          - driver: nvidia
            count: all
            capabilities: [gpu]
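The .env file sits next to the compose file and just defines those variables, e.g. (placeholder values; use your own IDs and timezone):

TZ=America/New_York
PUID=1000
PGID=1000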

You should then get hardware accelerated encoding and decoding assuming you have it enabled in the settings.
[screenshot: Emby hardware transcoding options]


cypher0117

Thanks for the quick reply yesterday. I made some updates per your suggestions, but still no dice on getting past the renderD128 error message.

version: "3.7"

networks:
 traefik-net:
  name: traefik-net
  external: true

services:
  emby:
    image: emby/embyserver:latest
    container_name: "emby"
    environment:
      - UID=998 #1001
      - GID=998 #1002
      - TZ='America/Chicago'
    volumes:
      - $mounts
    networks:
     - traefik-net
    labels:
     - "traefik.http.routers.emby.rule=Host($HOST)"
     - "traefik.http.routers.emby.entrypoints=websecure"
     - "traefik.http.routers.emby.tls=true"

     # Set port to use
     - "traefik.http.services.emby.loadbalancer.server.port=8096"

     # Enable traefik
     - traefik.enable=true
    restart: unless-stopped
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

 

hardware_detection-63811926548.txt


cypher0117

Forgot to add: I've noticed that portainer reports the gidlist as 2 (bin) if I don't have it specified in my compose file. Specifying vs. not specifying the gidlist does update what I see in portainer, but it does not affect video card access.


D34DC3N73R

Yes, the default gidlist is 2, but you might need the GIDs of the video group and the render group.
I'm not sure what your host OS is, but you can check the ownership of /dev/dri:

$ ls -la /dev/dri
total 0
drwxr-xr-x   3 root root        100 Feb  6 14:28 .
drwxr-xr-x  21 root root       4720 Feb  6 14:29 ..
drwxr-xr-x   2 root root         80 Feb  6 14:28 by-path
crw-rw----+  1 root video  226,   0 Feb  6 14:28 card0
crw-rw----+  1 root render 226, 128 Feb  6 14:28 renderD128

With this you can see card0 has the group `video` and renderD128 has the group `render`.
You can find the group IDs to add with:
 

$ cat /etc/group | grep video
video:x:44:

$ cat /etc/group | grep render
render:x:109:

So if yours is the same, you would add
- GIDLIST=44,109
as an environment variable.

The other option would be adding your user to those groups.
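Something like this, assuming the user is literally named emby (restart the service afterwards so the new groups apply):

sudo usermod -aG video,render emby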


cypher0117

My video group is 44 and render is 109, same as you listed above. My emby system user is also a member of those groups.
 

$ groups emby
emby : emby video render


Using gidlist in my environment variables yields the same end result as not having it there.


cypher0117

This might shed some light on the situation: a docker inspect command on the emby container reveals this:

"DeviceRequests": [
                {
                    "Driver": "nvidia",
                    "Count": -1,
                    "DeviceIDs": null,
                    "Capabilities": [
                        [
                            "gpu"
                        ]
                    ],
                    "Options": null
                }
            ],

Conceptually, I would think the count and device IDs would be populated with non-bogus info if docker were actually finding the card/drivers, right?


Luke
6 hours ago, cypher0117 said:

Thanks for the quick reply yesterday. I made some updates per your suggestions, but still no dice on getting past the renderD128 error message. [compose file and hardware detection log quoted above]

Did you compare to our guide?

https://hub.docker.com/r/emby/embyserver


cypher0117

Luke, that's where I started from, and it worked great on my old hardware (Intel HD graphics, not nvidia). Since moving to my 'new' hardware, I haven't been able to get the nvidia graphics part of it working. I've looked at the nvidia-container-runtime guide, the official docker documentation, and plenty of forums. I'm honestly not sure what I'm doing wrong at this point.


D34DC3N73R
15 hours ago, cypher0117 said:

This might shed some light on the situation: a docker inspect command on the emby container reveals this: [DeviceRequests output quoted above]

Conceptually, I would think the count and device IDs would be populated with non-bogus info if docker were actually finding the card/drivers, right?

My inspect looks the same, and I actually have that same "Failed to open the drm device /dev/dri/renderD128" message in my hw detection logs, but hw transcoding still works fine. Have you actually enabled it in settings and tested?


cypher0117

[screenshot: transcoding settings]

 

I don't see anything in my transcoding menu.  I can play something that is being transcoded, yet I don't see any processes running on the GPU via nvidia-smi


cypher0117

Posting back because I finally figured it out...

I was missing the following packages on my server:

libnvidia-decode-470-server
libnvidia-encode-470-server

Once I installed those and restarted docker and the emby container, I was able to see transcode options in the emby settings. I was also able to verify that I can play a video that gets transcoded, and I could see the process in nvidia-smi.

For what it's worth, I followed the nvidia install directions from here: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker. That got the nvidia headless drivers installed and working correctly with docker. However, it did not install the decode/encode libraries; I'm not sure if that was because I chose a headless version of the driver or what. BUT my issue appears to be resolved. I do still get the "Message": "Failed to open the drm device /dev/dri/renderD128" error in my HW logs, but HW transcoding does indeed work.
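For anyone hitting the same wall, a quick way to check whether those libraries are actually installed (assuming an apt-based setup like mine):

apt list --installed 2>/dev/null | grep -E 'libnvidia-(decode|encode)'

If nothing comes back, the headless driver metapackage didn't pull them in.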

Thank you for the help and suggestions!


D34DC3N73R

I'm curious about which method you used to install the drivers initially.
For anyone else having trouble, I'd recommend the following on Debian-based distros (in this case Ubuntu).

Install the required dependencies

sudo apt install software-properties-common -y


Add the GPU PPA repository and update apt

sudo add-apt-repository ppa:graphics-drivers/ppa && sudo apt update


Run the following command to detect the recommended drivers

ubuntu-drivers devices

If you get a response like '-bash: ubuntu-drivers: command not found' run this command, then run the previous command again.

sudo apt install ubuntu-drivers-common

 

That should give you an output similar to this. Notice 'recommended' at the end of the line for the recommended driver.

$ ubuntu-drivers devices
== /sys/devices/pci0000:00/0000:00:03.1/0000:09:00.0 ==
modalias : pci:v000010DEd00001C30sv00001028sd000011B3bc03sc00i00
vendor   : NVIDIA Corporation
model    : GP106GL [Quadro P2000]
driver   : nvidia-driver-510 - distro non-free
driver   : nvidia-driver-418-server - distro non-free
driver   : nvidia-driver-450-server - distro non-free
driver   : nvidia-driver-525-server - distro non-free
driver   : nvidia-driver-390 - distro non-free
driver   : nvidia-driver-470 - distro non-free
driver   : nvidia-driver-470-server - distro non-free
driver   : nvidia-driver-515-server - distro non-free
driver   : nvidia-driver-515 - distro non-free
driver   : nvidia-driver-525 - third-party non-free recommended
driver   : xserver-xorg-video-nouveau - distro free builtin


Install the driver with

sudo ubuntu-drivers autoinstall

Or if you want to pick specifically

sudo apt install nvidia-driver-525

 

and finally, reboot the system for the drivers to take effect.

sudo reboot

 


cypher0117

I used the Nvidia CUDA repository for the install.

Pre-install:

sudo apt install dirmngr ca-certificates software-properties-common apt-transport-https dkms curl -y

Import GPG key:

curl -fSsL https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub | sudo gpg --dearmor | sudo tee /usr/share/keyrings/nvidia-drivers.gpg > /dev/null 2>&1

Import Nvidia repository:

echo 'deb [signed-by=/usr/share/keyrings/nvidia-drivers.gpg] https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/ /' | sudo tee /etc/apt/sources.list.d/nvidia-drivers.list

Update:

sudo apt update

Install driver:

sudo apt install nvidia-headless-470-server

 

After posting here and a bunch of searching, I also installed the following, which happened to be the secret sauce I needed in my situation:

sudo apt install libnvidia-decode-470-server libnvidia-encode-470-server

 

