Nvidia hw transcoding not working (anymore)


Solved by joydashy

joydashy

Hi there,

I upgraded from a K620 to a GTX 1050. On the K620 I had hw transcoding working fine, but it lacked DisplayPort, which is why I had to upgrade.

I took the opportunity to upgrade the Nvidia drivers on the host as well as the (LXC) container:

NVIDIA-SMI 455.28       Driver Version: 455.28       CUDA Version: 11.1

nvidia-smi is working fine inside the container, and the relevant devices are present:

[15:34 root@texoma ~] > l /dev/dri/*
crw-rw---- 1 root video      226,   0 Oct 28 14:11 /dev/dri/card0
crw-rw---- 1 root messagebus 226, 128 Oct 28 14:11 /dev/dri/renderD128
[15:38 root@texoma ~] > l /dev/nvidia*
---------- 1 root root        0 Oct 28 15:27 /dev/nvidia-modeset
crw-rw-rw- 1 root root 234,   0 Oct 28 14:11 /dev/nvidia-uvm
crw-rw-rw- 1 root root 234,   1 Oct 28 14:11 /dev/nvidia-uvm-tools
crw-rw-rw- 1 root root 195,   0 Oct 28 14:11 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Oct 28 14:11 /dev/nvidiactl

Sadly, the hw transcoding is no longer working, and I have run out of things to try. Perhaps these drivers are too new?

Also, when I had the K620, I did not actually need the card0 & renderD128 devices, but now I see the hardware_detection log looking for them. Perhaps that is simply a difference between the K620 and the GTX 1050?
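For reference, a generic way to double-check the device setup on both sides (not taken from the attached logs) is:

# Run on the host and inside the container: the major:minor pairs must
# match, and the cgroup device allows must reference the same majors.
# Note that stat prints these numbers in hex.
stat -c '%n %A %t:%T' /dev/nvidia* /dev/dri/*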

 

Thanks in advance!

Attached: embyserver.txt, hardware_detection-63739495658.txt


softworkz

@joydashy - You looked at the wrong place in the hw detection log.

[screenshot: hardware detection log excerpt]

The driver seems to be installed, but it cannot initialize. Beyond that, I'm afraid I can't help much with virtualization setups.


joydashy

Ok, well, it has always worked perfectly fine within LXC. But perhaps I have to do something with the CUDA driver then, even though Emby won't use that, I assume... hmm


softworkz

Nvidia HW acceleration always uses a CUDA context. The message comes from attempting to initialize the cuvid library, which is used for decoding.

The log messages indicate that the driver libraries exist and can be loaded but not initialized.
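If an ffmpeg build with cuvid support is available inside the container, the same initialization path can be exercised outside Emby (the input file below is just a placeholder):

# A driver that loads but cannot initialize fails here with a CUDA
# error before any decoding starts.
ffmpeg -hide_banner -c:v h264_cuvid -i /path/to/sample.mp4 -f null -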


joydashy

I see, ok. I think the driver already includes CUDA support; I don't think I need to install the CUDA toolkit.

Also tried downgrading to Driver Version: 450.80.02, same issue :(


softworkz

Maybe you need to install the driver on the host OS as well?

Anyway, I would start there and see whether it works and then try to get it working in the guest OS.


Q-Droid

Did you notice the group ownership on your render device and is emby a member of that group?

Is "messagebus" correct for that device? I've seen video and render before.


joydashy
20 hours ago, Q-Droid said:

Did you notice the group ownership on your render device and is emby a member of that group?

Is "messagebus" correct for that device? I've seen video and render before.

Hi, thanks for the input. Emby is a member of video, render and messagebus. I noticed the change in group ownership of the render device; I guess it was caused by a Debian update. The host group ID 108 is render, but in the container it's messagebus. Otherwise I don't think that should cause issues...
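For completeness, the membership and the GID mapping can be compared on both sides; this assumes the Emby service runs as a user named emby:

# A differing group name (render vs. messagebus) is harmless as long
# as the numeric GID that owns the node is one of Emby's groups.
getent group video render messagebus
id emby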


joydashy

Well, not so far. I am contemplating installing Emby on the host to test (though I try to keep the Proxmox host as clean as possible), or otherwise reinstalling the previous GPU to make sure that one still works (and that I didn't somehow break something else)... otherwise I don't have any ideas.


softworkz

12 minutes ago, joydashy said:

I am contemplating installing Emby on the host to test (but I try to keep the Proxmox host as clean as possible),

You don't have to do any detailed setup. Just install, run, stop, look at the hw detection log and uninstall.
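Roughly (the package file name and log path below are assumptions about a typical Debian Emby install, adjust as needed):

# Throwaway test: install, run briefly, read the detection log, remove.
dpkg -i emby-server-deb_*_amd64.deb
systemctl start emby-server
sleep 30 && systemctl stop emby-server
grep -i cuda /var/lib/emby/logs/hardware_detection*.txt
apt-get remove --purge emby-server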


joydashy
44 minutes ago, softworkz said:

BTW, which drivers did you install? The ones from the Nvidia website?

Yes, currently on the run-file install from Nvidia, Latest Long Lived Branch Version.

I think I'll test it on the host then first.
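For reference, the usual split with the .run installer in an LXC setup (an assumption about this configuration, not something stated in the thread) is a full install on the host and a userspace-only install in the container:

# Host: full install, builds and loads the kernel module.
sh NVIDIA-Linux-x86_64-455.28.run
# Container: libraries only; the kernel module is shared from the host.
sh NVIDIA-Linux-x86_64-455.28.run --no-kernel-module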


joydashy
9 minutes ago, softworkz said:

What are you talking about?

All Nvidia codecs are perfectly detected:

[screenshot: all Nvidia codecs detected]

My bad, I had not loaded my Emby Premiere key on the host installation yet... it works now.

So the issue appears to be related to the container, that's good to know at least.


  • Solution
joydashy

And now I have fixed the issue as well. The device major number had changed by 1 after replacing the GPU (nvidia-uvm moved from 236 to 235), and I had overlooked this.

Wrong:

crw-rw-rw- 1 root root 195, 254 Oct 29 21:00 /dev/nvidia-modeset
crw-rw-rw- 1 root root 235,   0 Oct 29 18:57 /dev/nvidia-uvm
crw-rw-rw- 1 root root 235,   1 Oct 29 18:57 /dev/nvidia-uvm-tools
crw-rw-rw- 1 root root 195,   0 Oct 29 18:57 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Oct 29 18:57 /dev/nvidiactl

lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-modeset dev/nvidia-modeset none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file

lxc.cgroup.devices.allow: c 195:0 rwm
lxc.cgroup.devices.allow: c 195:255 rwm
lxc.cgroup.devices.allow: c 236:0 rwm
lxc.cgroup.devices.allow: c 236:1 rwm
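# (mismatch: the allows above reference major 236, but the nvidia-uvm
# nodes listed earlier are major 235; nvidia-modeset at 195,254 is also
# covered by neither 195:0 nor 195:255)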

Good:

crw-rw-rw- 1 root root 195, 254 Oct 29 21:00 /dev/nvidia-modeset
crw-rw-rw- 1 root root 235,   0 Oct 29 18:57 /dev/nvidia-uvm
crw-rw-rw- 1 root root 235,   1 Oct 29 18:57 /dev/nvidia-uvm-tools
crw-rw-rw- 1 root root 195,   0 Oct 29 18:57 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Oct 29 18:57 /dev/nvidiactl

lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-modeset dev/nvidia-modeset none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file

lxc.cgroup.devices.allow: c 195:* rwm
lxc.cgroup.devices.allow: c 235:* rwm

Also note that as before, it's not needed to add the card0/renderD128 devices to this configuration.
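For future card or driver swaps, the majors don't have to be guessed; they can be read from the host first:

# The number before the comma in the ls output is the major; the kernel
# also lists the registered nvidia majors directly.
ls -l /dev/nvidia*
grep nvidia /proc/devices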

Thanks softworkz, you got me back to the actual issue :)


  • 1 year later...
joydashy

Over a year later, I am back to this same problem after updating the host to Proxmox 7.x (Debian 11 based). But I have already updated the device IDs, and I have a working Nvidia driver inside the container:

Sat Jan 15 08:44:40 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.94       Driver Version: 470.94       CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:03:00.0 Off |                  N/A |
|  0%   38C    P0    N/A /  90W |      0MiB /  2000MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

And the hardware detection log:

https://paste.gg/p/anonymous/673ffb2f4b2b4390a9a5537ddfaa267c/files/353f078516a646a48dcce2884071f0e4/raw

I have tried several things to get it to work in Emby, but no luck. The only idea I have now is that perhaps the drivers are too new? I vaguely recall from some time ago that I also had to install older Nvidia drivers for something to work, but I don't even know anymore whether that was related to Emby, Plex, or something else...
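One hedged idea, given that the trigger was the Proxmox 7 upgrade: Proxmox 7 defaults to the unified cgroup v2 hierarchy, where the old cgroup v1 device keys may simply be ignored, so the allows may need the cgroup2 spelling (untested here):

# cgroup v2 form of the earlier allows; verify the majors against
# /proc/devices first, they may have moved again with the new driver.
lxc.cgroup2.devices.allow: c 195:* rwm
lxc.cgroup2.devices.allow: c 235:* rwm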

Has anyone got this combo to work yet? Or does anyone have any good ideas?


joydashy

I only just now noticed that in the OP, I also had a working driver inside the container, and I also suggested it was "too new", haha 😅.

So I am actually at the same spot as back then, but the solution I had then does not work anymore...

