
Docker HWA Nvidia Instructions



byakuya32
 

Hello!

I'm running Debian Bullseye. Do you think I can follow the same guide you posted, @TallBoiDez?

At this point there is no NVIDIA driver passthrough for Docker on Debian 11, only Debian 10.


karlshea

Recent docker compose instructions on Ubuntu 20 (Docker v20)

This assumes you have the NVIDIA drivers installed.

https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker

1. Follow instructions to install docker-ce

2. Paste the code under "Setup the package repository and the GPG key"

3. DO NOT install nvidia-docker2; it is deprecated. Instead, install nvidia-container-toolkit:

apt install nvidia-container-toolkit

4. Install docker-compose-plugin

apt install docker-compose-plugin

5. Restart docker

systemctl restart docker

6. Configure your docker-compose.yml. Example:

version: '3.8'
services:
  emby:
    container_name: emby
    image: emby/embyserver:latest
    
    # Obviously you'll need to update these
    environment:
      - UID=1002
      - GID=1002

    network_mode: 'host'

    ports:
      - '8096:8096'

    volumes:
      - "./config:/config"

    restart: unless-stopped

    # This is how GPU support is configured with recent docker/docker compose:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

7. Start your container

docker compose up

(Notice: NO HYPHEN. The `docker-compose` binary coming from the Ubuntu 20 repos is outdated and can't use the device reservations above, so we're using the compose plugin via the `docker` command instead.)
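8. (Optional) Sanity-check GPU passthrough

A quick check, assuming the toolkit installed cleanly: NVIDIA's test container from the install guide linked above should print your GPU table.

docker run --rm --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi

If that works but the Emby container still doesn't see the GPU, the problem is likely in the compose file rather than the host setup.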


zfrenchy

Hi guys, can one of you tell me if this is applicable to unRAID?

I have the Emby docker installed, but it's not recognizing my Nvidia card.


TallBoiDez
21 minutes ago, zfrenchy said:

Hi guys, can one of you tell me if this is applicable to unRAID?

I have the Emby docker installed, but it's not recognizing my Nvidia card.

This might work for unRAID, but I'm not familiar with unRAID so I can't say for certain.


Arrowtron

Hello,

I'm having trouble getting NVIDIA hardware detection to work in my environment. (I had this working previously, on an earlier version of Debian and maybe a different GFX card; I can't remember.)

Debian 11 (bullseye)
OMV6
Linux 5.18.0-0.bpo.1-amd64
docker 5:20.10.17~3-0~debian-bullseye20.10.5+dfsg1-1+deb11u2
NVIDIA GeForce GTX 750 Ti
NVIDIA Drivers: 515.65.01

 

I've installed CUDA/NVIDIA drivers/toolkits etc. and
followed parts of this guide: https://forum.openmediavault.org/index.php?thread%2F38013-howto-nvidia-hardware-transcoding-on-omv-5-in-a-plex-docker-container%2F=
as well as the one in this thread:

add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/debian11/x86_64/ /"
apt-get install cuda
reboot

apt install nvidia-container-docker
apt install nvidia-docker2
apt install nvidia-smi

[screenshot: nvidia-smi output]

I haven't been successful in getting any running processes to show up, e.g. Plex (although I think Plex requires a license for HW transcoding, which I don't have) or the NVIDIA test containers.

/etc/docker/daemon.json

{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia" 
}
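(For reference, the NVIDIA test containers mentioned above are run along these lines, per NVIDIA's install guide; the exact image tag here is just an example:)

docker run --rm --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi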

I'm using portainer to manage docker containers

Environment variables (some of these get added back automatically; I had removed NVIDIA_VISIBLE_DEVICES & NVIDIA_DRIVER_CAPABILITIES for testing).
UID 1002 is the emby user/group.

UID=1002
GID=1002
GIDLIST=1002,44,107
VERSION=latest
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
LANG=en_US.UTF-8
HOME=/tmp
AMDGPU_IDS=/share/libdrm/amdgpu.ids
FONTCONFIG_PATH=/etc/fonts
LD_LIBRARY_PATH=/lib:/system
LIBVA_DRIVERS_PATH=/lib/dri
OCL_ICD_VENDORS=/etc/OpenCL/vendors
PCI_IDS_PATH=/share/hwdata/pci.ids
SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt
gpus=all
XDG_CACHE_HOME=/config/cache
LANGUAGE=en_US.UTF-8
TERM=xterm
S6_CMD_WAIT_FOR_SERVICES_MAXTIME=0
NVIDIA_VISIBLE_DEVICES=all
NVIDIA_DRIVER_CAPABILITIES=compute,video,utility

Devices set to /dev/dri:/dev/dri; privileged mode on or off didn't make a difference.

[screenshot: portainer emby runtime settings]

 

Emby's hardware detection logs appear to find the driver and card, but it seems like a permission issue? The permissions I've set in the docker appear to be correct.
Or is the GeForce GTX 750 Ti not supported? I do have a 1060 I could try.

I can't work it out. Does anyone have any ideas, or have I missed something obvious to try? The Emby server works fine; it just struggles when playing high-quality media that needs more resources, because HW acceleration isn't available.

logs attached:
"Message": "Failed to initialize VA /dev/dri/renderD128. Error -1"

ls -la /dev/dri

[screenshot: ls -la /dev/dri output]

hardware_detection-63796607500.txt


Arrowtron
13 hours ago, isamudaison said:

Since you're running Kernel version 5.18.x, you'll need to apply this fix to get the nvidia device to be accessible :

https://github.com/NVIDIA/open-gpu-kernel-modules/issues/256

Specifically, the `ibt=off` kernel parameter

Nice find. I tried this on my instance and it didn't make a difference:

GRUB_CMDLINE_LINUX_DEFAULT="quiet ibt=off", then update-grub, reboot, etc.

Did some more reading, and I think a lot of the people affected had different issues (they couldn't boot at all) and newer CPUs by the looks of it. I'm currently using an i3-3240 for this.

I installed different NVIDIA driver versions (470.129.06, CUDA 11.4) and hit the same issue, with the same error logs in Emby. I wanted to see if Plex could pick up the NVIDIA hardware, but I don't have a Plex Pass for that.


isamudaison

Interesting... we have similar setups (I'm running Ubuntu 22.04 server, but kernel 5.18, so it shouldn't matter *too* much). I recall there is some confusion out there re: installing the right docker packages from nvidia, so here are my installed packages (and it's working on my setup):

 

apt list --installed | grep nvidia

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

libnvidia-cfg1-515-server/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed,automatic]
libnvidia-compute-515-server/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed,automatic]
libnvidia-container-tools/bionic,now 1.10.0-1 amd64 [installed,automatic]
libnvidia-container1/bionic,now 1.10.0-1 amd64 [installed,automatic]
libnvidia-decode-515-server/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed,automatic]
libnvidia-encode-515-server/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed]
nvidia-compute-utils-515-server/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed,automatic]
nvidia-container-toolkit/bionic,now 1.10.0-1 amd64 [installed]
nvidia-dkms-515-server/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed,automatic]
nvidia-headless-515-server/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed]
nvidia-headless-no-dkms-515-server/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed,automatic]
nvidia-kernel-common-515-server/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed,automatic]
nvidia-kernel-source-515-server/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed,automatic]
nvidia-utils-515-server/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed]

I can paste my 'docker container inspect' results as well if you'd like that.


Arrowtron
On 8/30/2022 at 5:21 AM, isamudaison said:

Interesting... we have similar setups; here are my installed packages (and it's working on my setup): [package list quoted above]

I can paste my 'docker container inspect' results as well if you'd like that.

Yes please, feel free to paste your docker container inspect output, or any config that differs from mine that I can try.

Here's my mounts

Mounts

    0 { Destination: /config, Mode: , Propagation: rprivate, RW: true, Source: /srv/docker_app/emby/config, Type: bind }
    1 { Destination: /dev/dri, Mode: , Propagation: rprivate, RW: true, Source: /dev/dri, Type: bind }

Other things I've tried so far:

 

I have yet to try (when I get the time):

  • Installing Emby through flatpak or a deb install onto the OS to see if it can see the HW; if it does, then it has to be something to do with the docker/nvidia-container config.

Arrowtron

So I managed to get this to work by installing the local/normal Emby package for Linux. This creates the emby user/group, permissions, and other config.
What I originally had was the emby user/group created in OpenMediaVault; this does create the UID/GID on the Linux system, but my guess is it's missing other configuration that links to those groups.
Once the normal Emby install completed, I just disabled the service, got the IDs of the emby user and group it generated, and referenced those values in the docker container.

 

Some instructions if anyone is interested in the future:

Download Emby and install it using dpkg.

Get the NVIDIA drivers from the official NVIDIA site and follow the onscreen prompts:

wget https://us.download.nvidia.com/XFree86/Linux-x86_64/515.65.01/NVIDIA-Linux-x86_64-515.65.01.run
chmod +x NVIDIA-Linux-x86_64-515.65.01.run
sudo ./NVIDIA-Linux-x86_64-515.65.01.run   # (assumed step: run the installer to get the onscreen prompts)

If you get a Nouveau error, follow this:
    How to remove the Nouveau kernel driver (fix Nvidia install error)

    I had to add the below to
    /usr/lib/modprobe.d/nvidia-disable-nouveau.conf
    /etc/modprobe.d/nvidia-disable-nouveau.conf

    blacklist nouveau
    blacklist lbm-nouveau
    options nouveau modeset=0
    alias nouveau off
    alias lbm-nouveau off

    Restart after this.
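    (Note: on Debian the initramfs usually also needs rebuilding before the reboot so the blacklist takes effect; assuming a standard initramfs-tools setup:)

    sudo update-initramfs -u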
    
Run nvidia-smi to confirm that the drivers and card are working.

sudo apt install nvidia-container-toolkit nvidia-container-runtime nvidia-docker2

   
Change/edit the Daemon configuration file /etc/docker/daemon.json (acquired this info from here)
My example:

{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia",
    "data-root": "/srv/docker"
}

Change/edit this configuration file /etc/nvidia-container-runtime/config.toml (I've removed some of the #comments)
 

disable-require = false
#swarm-resource = "DOCKER_RESOURCE_GPU"
accept-nvidia-visible-devices-envvar-when-unprivileged = true
#accept-nvidia-visible-devices-as-volume-mounts = false

[nvidia-container-cli]
environment = []
load-kmods = true

[nvidia-container-runtime]
#debug = "/var/log/nvidia-container-runtime.log"
log-level = "info"

# Specify the runtimes to consider. This list is processed in order and the PATH
# searched for matching executables unless the entry is an absolute path.
runtimes = [
    "docker-runc",
    "runc",
]

mode = "auto"

    [nvidia-container-runtime.modes.csv]

    mount-spec-path = "/etc/nvidia-container-runtime/host-files-for-container.d"


sudo systemctl restart docker
or reboot (if you haven't already after installing the NVIDIA drivers).


You might need to do this if it still isn't showing up (acquired this info from here):
"Since Docker 5.20.10.2 (I think) there was a change in how docker gets access to hardware via cgroups. You need this workaround in the kernel boot parameters:"

echo 'GRUB_CMDLINE_LINUX=systemd.unified_cgroup_hierarchy=false' > /etc/default/grub.d/cgroup.cfg
update-grub
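After rebooting, you can confirm which cgroup driver and version docker is using (docker info reports both):

docker info | grep -i cgroup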

The docker environment variables are essentially the same as in my first post, except for the different UID/GID values.

Hope that helps

[screenshot: nvidia-smi running inside docker]

[screenshot: Emby transcoding]


cypher0117

I've been scratching my head for a couple of days now. I've read a lot of posts and tried a lot of things outlined on these forums as well as elsewhere, but I cannot get my docker implementation to initialize renderD128 -> "Message": "Failed to initialize VA /dev/dri/renderD128. Error -1"

Hopefully someone has an idea of what I'm missing.

Ubuntu 22.04 server (headless)
Docker version 23.0.1
Docker Compose version v2.16.0

daemon.json

{
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    },
    "default-runtime": "nvidia"
}

 

nvidia-smi output

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.161.03   Driver Version: 470.161.03   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro K4000        Off  | 00000000:01:00.0 Off |                  N/A |
| 30%   37C    P0    28W /  87W |      0MiB /  3011MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

 

docker-compose file:
 

services:
  emby:
    image: emby/embyserver:latest
    container_name: "emby"
    environment:
      - PUID=998 #emby system user
      - PGID=999 #emby system group
      - GIDLIST=998,44,109,1001 #emby,video,render,sambagroup
      - TZ='America/Chicago'
      - gpus=all
    volumes:
      - ./application/config:/config
      #- volumesremoved
    devices:
      - /dev/dri:/dev/dri
    networks:
     - traefik-net
    labels:
     - "traefik.http.routers.emby.rule=Host($HOST)"
     - "traefik.http.routers.emby.entrypoints=websecure"
     - "traefik.http.routers.emby.tls=true"

     # Set port to use
     - "traefik.http.services.emby.loadbalancer.server.port=8096"

     # Enable traefik
     - traefik.enable=true
    restart: unless-stopped
    deploy:
      resources:
        reservations:
          devices:
            - capabilities: [gpu]
              driver: nvidia

 

 

I've successfully run the test docker image:
sudo docker run --rm --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
from the directions posted here: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html

Since the test image worked, I'm thinking this is purely an Emby docker config/permissions issue? Hopefully someone can point me in the right direction to finally enable GPU acceleration.

 

hardware_detection-63811833879.txt


D34DC3N73R

You're using a mix of old information and docker run commands in a docker compose file. 

The default runtime is fine now. `- gpus=all` is not an environment variable, and 

    devices:
      - /dev/dri:/dev/dri

is only needed for Intel Quick Sync.

You're also using PUID and PGID, which are specific to the linuxserver emby image, but you're using the official emby image, which requires 'UID=' and 'GID='.

And your deploy section is missing the driver.

For reference, this is my compose file. Note: I'm using an .env file to define the variables (the parts that start with $); an example .env follows the compose file.

version: '3.8'
services:
  emby:
    image: emby/embyserver
    container_name: emby
    restart: unless-stopped
    ports:
      - 8096:8096
      - 8920:8920
    environment:
      - TZ=$TZ
      - UID=$PUID
      - GID=$PGID
    volumes:
      - $HOME/.config/emby:/config
      - $HOME/media:/media
    tmpfs:
      - /transcode:mode=770,size=16G,uid=$PUID,gid=$PGID
    deploy:
      resources:
        reservations:
          devices:
          - driver: nvidia
            count: all
            capabilities: [gpu]
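The .env file sits next to the compose file and just defines those variables, e.g. (placeholder values; use your own IDs and timezone):

TZ=America/New_York
PUID=1000
PGID=1000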

You should then get hardware accelerated encoding and decoding assuming you have it enabled in the settings.
[screenshot: Emby hardware transcoding options]


cypher0117

Thanks for the quick reply yesterday. I made some updates per your suggestions, but still no dice on getting past the renderD128 error message.

version: "3.7"

networks:
 traefik-net:
  name: traefik-net
  external: true

services:
  emby:
    image: emby/embyserver:latest
    container_name: "emby"
    environment:
      - UID=998 #1001
      - GID=998 #1002
      - TZ='America/Chicago'
    volumes:
      - $mounts
    networks:
     - traefik-net
    labels:
     - "traefik.http.routers.emby.rule=Host($HOST)"
     - "traefik.http.routers.emby.entrypoints=websecure"
     - "traefik.http.routers.emby.tls=true"

     # Set port to use
     - "traefik.http.services.emby.loadbalancer.server.port=8096"

     # Enable traefik
     - traefik.enable=true
    restart: unless-stopped
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

 

hardware_detection-63811926548.txt


cypher0117

Forgot to add: I've noticed that portainer reports the gidlist as 2 (bin) if I don't have it specified in my compose file. Specifying vs. not specifying the gidlist does update what I see in portainer, but it does not affect video card access.


D34DC3N73R

Yes, the default gidlist is 2, but you might need the GIDs of the video group and the render group.
I'm not sure what your host OS is, but you can check the ownership of /dev/dri:

$ ls -la /dev/dri
total 0
drwxr-xr-x   3 root root        100 Feb  6 14:28 .
drwxr-xr-x  21 root root       4720 Feb  6 14:29 ..
drwxr-xr-x   2 root root         80 Feb  6 14:28 by-path
crw-rw----+  1 root video  226,   0 Feb  6 14:28 card0
crw-rw----+  1 root render 226, 128 Feb  6 14:28 renderD128

With this you can see card0 has the group `video` and renderD128 has the group `render`.
You can find the group IDs to add with:
 

$ cat /etc/group | grep video
video:x:44:

$ cat /etc/group | grep render
render:x:109:

So if yours is the same, you would add
- GIDLIST=44,109
as an environment variable.

The other option would be adding your user to those groups.
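Something like this, assuming the user is literally named emby (restart the service afterwards so the new groups apply):

sudo usermod -aG video,render emby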


cypher0117

My video group is 44 and render is 109, same as you listed above. My emby system user is also a member of those groups.
 

$ groups emby
emby : emby video render


Using gidlist in my environment variables yields the same end result as not having it there.


cypher0117

This might shed some light on the situation: a docker inspect command on the emby container reveals this:

"DeviceRequests": [
                {
                    "Driver": "nvidia",
                    "Count": -1,
                    "DeviceIDs": null,
                    "Capabilities": [
                        [
                            "gpu"
                        ]
                    ],
                    "Options": null
                }
            ],

Conceptually, I would think the count and device IDs would be populated with non-bogus info if docker were actually finding the card/drivers, right?


Luke
6 hours ago, cypher0117 said:

Thanks for the quick reply yesterday. I made some updates per your suggestions, but still no dice on getting past the renderD128 error message. [compose file and hardware detection log quoted above]

Did you compare to our guide?

https://hub.docker.com/r/emby/embyserver


cypher0117

Luke, that's where I started from, and it worked great on my old hardware (Intel HD graphics, not nvidia). Since moving to my 'new' hardware, I haven't been able to get the nvidia graphics part of it working. I've looked at the nvidia-container-runtime guide, the official docker documentation, and plenty of forums. I'm honestly not sure what I'm doing wrong at this point.


D34DC3N73R
15 hours ago, cypher0117 said:

This might shed some light on the situation: a docker inspect command on the emby container reveals this: [DeviceRequests output quoted above]

Conceptually, I would think the count and device IDs would be populated with non-bogus info if docker were actually finding the card/drivers, right?

My inspect looks the same, and I actually have that same "Failed to open the drm device /dev/dri/renderD128" message in my hw detection logs, but hw transcoding still works fine. Have you actually enabled it in settings and tested?


cypher0117

[screenshot: transcoding settings]

 

I don't see anything in my transcoding menu.  I can play something that is being transcoded, yet I don't see any processes running on the GPU via nvidia-smi


cypher0117

Posting back because I finally figured it out...

I was missing the following packages on my server:

libnvidia-decode-470-server
libnvidia-encode-470-server

Once I installed those and restarted docker and the emby container, I was able to see transcode options in the emby settings. I was also able to verify that I can play a video that gets transcoded, and I could see the process in nvidia-smi.

For what it's worth, I followed the nvidia install directions from here: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker. That got the nvidia headless drivers installed and working correctly with docker. However, it did not install the decode/encode libraries; I'm not sure if that was because I chose a headless version of the driver or what. BUT my issue appears to be resolved. I do still get the "Message": "Failed to open the drm device /dev/dri/renderD128" error in my HW logs, but HW transcoding does indeed work.
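For anyone hitting the same wall, a quick way to check whether those libraries are actually installed (assuming an apt-based setup like mine):

apt list --installed 2>/dev/null | grep -E 'libnvidia-(decode|encode)'

If nothing comes back, the headless driver metapackage didn't pull them in.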

Thank you for the help and suggestions!


D34DC3N73R

I'm curious about which method you used to install the drivers initially.
For anyone else having trouble, I'd recommend the following on Debian-based distros (in this case Ubuntu).

Install the required dependencies

sudo apt install software-properties-common -y


Add the GPU PPA repository and update apt

sudo add-apt-repository ppa:graphics-drivers/ppa && sudo apt update


Run the following command to detect the recommended drivers

ubuntu-drivers devices

If you get a response like '-bash: ubuntu-drivers: command not found' run this command, then run the previous command again.

sudo apt install ubuntu-drivers-common

 

That should give you an output similar to this. Notice 'recommended' at the end of the line for the recommended driver.

$ ubuntu-drivers devices
== /sys/devices/pci0000:00/0000:00:03.1/0000:09:00.0 ==
modalias : pci:v000010DEd00001C30sv00001028sd000011B3bc03sc00i00
vendor   : NVIDIA Corporation
model    : GP106GL [Quadro P2000]
driver   : nvidia-driver-510 - distro non-free
driver   : nvidia-driver-418-server - distro non-free
driver   : nvidia-driver-450-server - distro non-free
driver   : nvidia-driver-525-server - distro non-free
driver   : nvidia-driver-390 - distro non-free
driver   : nvidia-driver-470 - distro non-free
driver   : nvidia-driver-470-server - distro non-free
driver   : nvidia-driver-515-server - distro non-free
driver   : nvidia-driver-515 - distro non-free
driver   : nvidia-driver-525 - third-party non-free recommended
driver   : xserver-xorg-video-nouveau - distro free builtin


Install the driver with

sudo ubuntu-drivers autoinstall

Or if you want to pick specifically

sudo apt install nvidia-driver-525

 

and finally, reboot the system for the drivers to take effect.

sudo reboot

 


cypher0117

I used the Nvidia CUDA repository for the install.

Pre-install:

sudo apt install dirmngr ca-certificates software-properties-common apt-transport-https dkms curl -y

Import GPG key:

curl -fSsL https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub | sudo gpg --dearmor | sudo tee /usr/share/keyrings/nvidia-drivers.gpg > /dev/null 2>&1

Import Nvidia repository:

echo 'deb [signed-by=/usr/share/keyrings/nvidia-drivers.gpg] https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/ /' | sudo tee /etc/apt/sources.list.d/nvidia-drivers.list

Update:

sudo apt update

Install driver:

sudo apt install nvidia-headless-470-server

 

After posting here and a bunch of searching, I also installed the following, which happened to be the secret sauce I needed in my situation:

sudo apt install libnvidia-decode-470-server libnvidia-encode-470-server

 

