
Kubernetes with GPU acceleration



Posted

Hi all,

 

I'm involved in the future of orchestration and would like this service to run in my test infrastructure at home to add some reliability and performance gains. I realize the app is a Windows app running through Mono on Linux in a container, so it's not particularly easy to adapt to this... but I figured a single-instance pod with some persistent storage would do.

 

To that end, I use a Deployment with a single replica. It mounts NFS for the big library share and has a separate "config" NFS persistent volume.
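
Something like this is what the config volume boils down to (a minimal sketch; the server address, export path, and size below are placeholders, not my real values):

```
apiVersion: v1
kind: PersistentVolume
metadata:
  name: emby-config-pv
spec:
  capacity:
    storage: 5Gi                 # placeholder size
  accessModes:
    - ReadWriteOnce
  storageClassName: ""
  nfs:
    server: 192.168.1.10         # placeholder NFS server
    path: /exports/emby-config   # placeholder export path
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: emby-config
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ""
  volumeName: emby-config-pv     # bind statically to the PV above
  resources:
    requests:
      storage: 5Gi
```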

 

That works great in Kubernetes. The web service is exposed and traffic is routed. I killed the node it was on and a new pod blinked into existence and took over. Perfect. Except...

 

The NVIDIA config in ffmpeg expects the device to be mounted with --device, which isn't how Kubernetes exposes the GPU. I have the NVIDIA device plugin container running, but Emby can't see the GPU.
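
For context, the device-plugin way is to request the GPU declaratively in the pod spec instead of passing --device; roughly like this sketch (the pod name is illustrative, the nvidia.com/gpu request and NVIDIA_* env vars are the relevant knobs):

```
apiVersion: v1
kind: Pod
metadata:
  name: emby-gpu-test            # illustrative name
spec:
  containers:
  - name: emby-server
    image: emby/embyserver
    resources:
      limits:
        nvidia.com/gpu: 1        # resource advertised by the NVIDIA device plugin
    env:
    - name: NVIDIA_VISIBLE_DEVICES
      value: "all"
    - name: NVIDIA_DRIVER_CAPABILITIES
      value: "compute,utility,video"
```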

 

I tried using the ffmpeg compiled in jrottenberg's repo with the NVIDIA driver, but it doesn't seem to work at all with the Emby image.

 

I used this Dockerfile to try to cobble them together, but I have a feeling the custom-compiled ffmpeg in the original image is doing stuff that can't just be ripped out and replaced.

 

```
FROM jrottenberg/ffmpeg:4.2-nvidia AS ffmpeg

# Emby Server
FROM emby/embyserver

ENV APP_NAME="emby-server" IMG_NAME="embyserver" TAG_NAME="beta" PKG_NAME="emby-server-beta" EDGE=0 UMASK=002
ENV LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/lib"

COPY --from=ffmpeg /usr/local/bin /usr/local/bin/
COPY --from=ffmpeg /usr/local/share /usr/local/share/
COPY --from=ffmpeg /usr/local/lib /usr/local/lib/

ADD putffmpeg.sh /putffmpeg.sh
RUN ./putffmpeg.sh

VOLUME [ "/config" ]
EXPOSE 8096 8920 7359/udp 1900/udp

ENTRYPOINT ["/init"]
```

 

If anyone has gotten this working or has an idea for a next step, I'd love to hear it. If I had access to the Emby base image, I'm pretty sure I could make it work... but that also seems to be a secret.

Posted

Hi, what is a Windows app running through Mono in a Linux container? What do you mean by that?

Posted (edited)

Hi, I'm actually trying to do the same, got the GPU mounted into the pod.

Where did you check that emby can't see the GPU?
 

(Beginner question, but I just bought my premiere license :-) )

 

Edit: added and removed stuff again :-)

Edited by ramonrue
Posted

Hi, what is a Windows app running through Mono in a Linux container? What do you mean by that?

Since it went closed source, it's kinda tough to make heads or tails of the thing. From deconstructing the container, I found a bunch of DLLs, Mono, and a bunch of Linux .so files... so not a cloud-native setup with a web server, database server, transcode worker, and ingress. It's not surprising, and I mean no insult.

 

I just want my GPU-enabled Kubernetes cluster to work.

 

Can I sign an NDA and get access to the active development channel so I can make the GPU stuff work on K8s?

Posted

Where did you check that emby can't see the GPU?

I dropped into the container with kubectl exec -it to try to run ffmpeg outright. The useless sh shell can't even run the binaries because of linker issues or something.

 

In the configuration, I look at the transcode options and can't select anything, which indicates there's no hardware transcoder available.

 

I also ran a transcode job and checked nvidia-smi on the node; nothing was using the GPU.

Posted

Alright, I can confirm that transcoding does not seem to use the GPU.

However, I have hardware acceleration enabled and can also set a Transcoding Thread Count, though I suspect that targets the CPU, a Xeon 1245 v6.

 

No dealbreaker for me, but it would be nice if you could get this working :-) 

Posted

We haven't tested Kubernetes ourselves. Can you please attach the Emby server and hardware detection logs? Thanks.

Posted (edited)

We haven't tested Kubernetes ourselves. Can you please attach the Emby server and hardware detection logs? Thanks.

 

I found the attachment uploader! 

 

hardware_detection-63708615507.txt

Edited by Emptee
Posted

And the Emby server log?

Posted

The reason it doesn't work is that "--device /dev/dri:/dev/dri" isn't a supported option in a pod definition. Kubernetes exposes GPUs through a device plugin that runs as a container on each node.
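
Roughly, that plugin is deployed as a DaemonSet along these lines (a sketch following the upstream nvidia/k8s-device-plugin manifest; the image tag here is illustrative, not the exact version I'm running):

```
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nvidia-device-plugin-daemonset
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: nvidia-device-plugin-ds
  template:
    metadata:
      labels:
        name: nvidia-device-plugin-ds
    spec:
      containers:
      - name: nvidia-device-plugin-ctr
        image: nvidia/k8s-device-plugin:1.0.0-beta6   # illustrative tag
        volumeMounts:
        - name: device-plugin
          mountPath: /var/lib/kubelet/device-plugins  # where the plugin registers with the kubelet
      volumes:
      - name: device-plugin
        hostPath:
          path: /var/lib/kubelet/device-plugins
```

Once it's running on a node, that node advertises the nvidia.com/gpu resource and pods can request it.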

Posted

Ok so how does this work exactly, do you know?

 

 

The NVIDIA config in ffmpeg expects the device to be mounted with --device, which isn't how Kubernetes exposes the GPU. I have the NVIDIA device plugin container running, but Emby can't see the GPU.
Posted

Ok so how does this work exactly, do you know?

 

 

I looked into it and I'm having trouble nailing it down and making it work. My attempt at running the Emby hardware-encoding container on one of the Kubernetes nodes with a direct docker run command isn't working because the Premiere key isn't being accepted.

 

I'll keep chipping away at this once I can get that far. 

Posted

Sounds like it's not able to send out the network requests for that?

Posted

So this is the result of doing everything the "regular Docker way", and it's still not using the GPU for transcoding, even though it can identify it.

 

Use this command to start the Emby server (host networking just bypasses the k8s junk):

 

docker run -d \
    --name emby-test \
    --restart unless-stopped \
    --network="host" \
    --volume /mnt/disk1/emby-test/config:/config \
    --volume /mnt/disk1/emby-test/storage:/storage \
    --device /dev/dri:/dev/dri \
    --publish 8096:8096 \
    --publish 8920:8920 \
    --env NVIDIA_VISIBLE_DEVICES=all \
    --env NVIDIA_DRIVER_CAPABILITIES=compute,utility,video \
    --env UID=112 \
    --env GID=117 \
    --env GIDLIST=1000,107,44 \
    emby/embyserver:latest

 
 
 
Get into the system using exec. 
 
docker exec -it emby-test /bin/ash
 
Ping google, check license = network is working fine. License accepted. 
 
Check the transcode settings in Emby and switch to "advanced": there are no codec selections. 
 
Check the hardware_detection log: it looks like it finds the GPU but then fails with "Failed creating CUDA context for device 0".  hardware_detection-63708956558.txt
 
Play a file and force transcode. 
 
Run watch nvidia-smi on the host and see:
 
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

 
 
Posted

Okay I'm actually not 100% sure, but doesn't this look like it's working? 

 

Excerpt from the hardware detection log:
{"ApplicationVersion":"4.3.0.22","VersionType":"Beta","ReleaseCode":"8fc4a582ac914a76a620289665ca03d2","OperatingSystem":"Linux","OperatingSystemName":"Unix","OperatingSystemVersion":"5.3.0.19 ","VideoDecoders":{"1: NVDEC GeForce GTX 760 - MPEG-2":{"IsEnabledByDefault":true,"DefaultPriority":30,"MaxMacroBlocks":65280,"NvidiaCodecInfo":{"Name":"NVDEC GeForce GTX 760 - MPEG-2","Description":"Adapter #0: 'GeForce GTX 760' ComputeCapability: 3.0","DeviceInfo":{"Adapter":0,"Name":"GeForce GTX 760","Desription":"Adapter #0: 'GeForce GTX 760' ComputeCapability: 3.0","ComputeCapability":{"Major":3,"Minor":0,"Build":-1,"Revision":-1,"MajorRevision":-1,"MinorRevision":-1}}

 

As you can also see, I switched to the beta container (because of another issue, though).

 

My setup is pretty standard apart from that. I'm using containerd with the custom nvidia-runtime.

Container Runtime Version:  containerd://1.2.10

Kubernetes Version: 1.16.2
 
5dcfb49a64683_gpu.png
Posted

Yes, it detected your NVIDIA GPU. That looks pretty good, no?

  • 1 month later...
Posted

Did you figure this out?

  • 4 months later...
Posted (edited)

Baby arrived on Armistice Day, so I took a break... back at it. 

 

Went through a Kubernetes 1.17.5 build and got things back where they need to be: NFS provisioner, ingress, etc. 

 

Demonstrate that Docker is running with the NVIDIA driver and runtime: 

$ sudo docker run -it --runtime=nvidia --rm nvidia/cuda:9.0-devel nvidia-smi
Wed May  6 05:02:08 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.82       Driver Version: 440.82       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1660    Off  | 00000000:07:00.0 Off |                  N/A |
| 33%   30C    P8     6W / 130W |      0MiB /  5944MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

 

Demonstrate that Kubernetes isn't in the way: 

pod1 definition:

apiVersion: v1
kind: Pod
metadata:
  name: pod1
spec:
  restartPolicy: OnFailure
  containers:
  - image: nvidia/cuda
    name: pod1-ctr
    command: ["sleep"]
    args: ["100000"]
    resources:
      limits:
        nvidia.com/gpu: 1

apply the pod and drop into it to get some sweet nvidia-smi output. 

 

% kubectl apply -f ../specs/998-nvidia-pod1.yml 
pod/pod1 created
% kubectl get pod
NAME                                      READY   STATUS              RESTARTS   AGE
dnsutils                                  1/1     Running             76         3d3h
nfs-client-provisioner-7445d86bbc-89n4l   1/1     Running             4          7d3h
pod1                                      0/1     ContainerCreating   0          5s
public-58db795fb6-8g8lv                   1/1     Running             4          5d1h
% kubectl exec -it pod1 -- /bin/bash
root@pod1:/# nvidia-smi
Wed May  6 05:04:52 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.82       Driver Version: 440.82       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 00000000:06:00.0 Off |                  N/A |
|  0%   36C    P8     8W / 185W |      0MiB /  8119MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

 

Check inside the K8s pod:

% kubectl exec -it -n media emby-server-6d6d4bd9b6-l2hcm -- /bin/sh

/ # ffdetect nvenc

ffdetect version 4.3.0-emby_2020_02_24 Copyright © 2018-2019 softworkz for Emby LLC

  built with gcc 8.3.0 (crosstool-NG 1.24.0)

  configuration: --cc=x86_64-unknown-linux-gnu-gcc --prefix=/home/embybuilder/Buildbot/x64/ffmpeg-x64/staging --disable-amf --disable-debug --disable-doc --disable-ffplay --disable-vdpau --disable-xlib --enable-fontconfig --enable-gnutls --enable-gpl --enable-iconv --enable-libass --enable-libfreetype --enable-libfribidi --enable-libmp3lame --enable-libopus --enable-libtheora --enable-libvorbis --enable-libwebp --enable-libx264 --enable-libx265 --enable-libzvbi --enable-version3 --enable-libsmbclient --enable-libdav1d --enable-libvpx --enable-cuda-llvm --enable-cuvid --enable-libmfx --enable-nvdec --enable-nvenc --enable-vaapi --enable-cross-compile --cross-prefix=x86_64-unknown-linux-gnu- --extra-libs='-lexpat -lfreetype -lfribidi -lfontconfig -liconv -lpng -lz -lvorbis -logg -lnettle -lhogweed -lgmp -laddns-samba4 -lasn1util-samba4 -lauthkrb5-samba4 -lCHARSET3-samba4 -lcliauth-samba4 -lcli-cldap-samba4 -lcli-ldap-common-samba4 -lcli-nbt-samba4 -lcli-smb-common-samba4 -lcom_err -lcommon-auth-samba4 -ldbwrap-samba4 -ldcerpc-binding -ldcerpc-samba-samba4 -ldl -lflag-mapping-samba4 -lgenrand-samba4 -lgensec-samba4 -lgse-samba4 -lgssapi_krb5 -llibcli-lsa3-samba4 -llibsmb-samba4 -linterfaces-samba4 -liov-buf-samba4 -lk5crypto -lkrb5 -lkrb5samba-samba4 -lkrb5support -lldb -lldbsamba-samba4 -lm -lmessages-dgm-samba4 -lmessages-util-samba4 -lmsghdr-samba4 -lmsrpc3-samba4 -lndr -lndr-krb5pac -lndr-nbt -lndr-samba-samba4 -lndr-standard -lreplace-samba4 -lsamba-cluster-support-samba4 -lsamba-credentials -lsamba-debug-samba4 -lsamba-errors -lsamba-hostconfig -lsamba-modules-samba4 -lsamba-security-samba4 -lsamba-sockets-samba4 -lsamba-util -lsamba3-util-samba4 -lsamdb -lsamdb-common-samba4 -lsecrets3-samba4 -lserver-id-db-samba4 -lserver-role-samba4 -lsmbconf -lsmbd-shim-samba4 -lsmb-transport-samba4 -lsocket-blocking-samba4 -lsys-rw-samba4 -ltalloc -ltalloc-report-samba4 -ltdb -ltdb-wrap-samba4 -ltevent -ltevent-util -ltime-basic-samba4 -lutil-cmdline-samba4 -lutil-reg-samba4 -lutil-setid-samba4 -lutil-tdb-samba4 -luuid -lwbclient -lwinbind-client-samba4 -ldrm' --arch=x86_64 --target-os=linux --pkg-config=pkg-config --enable-shared --disable-static

  libavutil      56. 36.100 / 56. 36.100

[DEVICE]

DeviceIndex=0

DEVICEINFO:Name=GeForce GTX 1660

DEVICEINFO:Description=GeForce GTX 1660

DEVICEINFO:ComputeCapMajor=7

DEVICEINFO:ComputeCapMinor=5

DEVICEINFO:PROPERTIES:ClockRate=1830000

DEVICEINFO:PROPERTIES:MultiprocessorCount=22

DEVICEINFO:PROPERTIES:Integrated=0

DEVICEINFO:PROPERTIES:CanMapHostMemory=1

DEVICEINFO:PROPERTIES:ComputeMode=0

DEVICEINFO:PROPERTIES:ConcurrentKernels=1

DEVICEINFO:PROPERTIES:PciBusId=7

DEVICEINFO:PROPERTIES:PciDeviceId=0

DEVICEINFO:PROPERTIES:TccDriver=0

DEVICEINFO:PROPERTIES:MemoryClockRate=4001000

DEVICEINFO:PROPERTIES:GlobalMemoryBusWidth=192

DEVICEINFO:PROPERTIES:AsyncEngineCount=3

DEVICEINFO:PROPERTIES:UnifiedAddressing=1

DEVICEINFO:PROPERTIES:PciDomainId=0

DEVICEINFO:PROPERTIES:ComputeCapabilityMajor=7

DEVICEINFO:PROPERTIES:ComputeCapabilityMinor=5

DEVICEINFO:PROPERTIES:ManagedMemory=1

DEVICEINFO:PROPERTIES:MultiGpuBoard=0

DEVICEINFO:PROPERTIES:MultiGpuBoardGroupId=0

ERROR:Number=801

ERROR:Message=Failed creating CUDA context for device 0

[/DEVICE]

 

That failed CUDA context thing is disappointing... what about just going straight into Docker?

 

sudo docker run -it emby/embyserver:4.4.2.0 -- /bin/sh

 

/ # ffdetect nvenc

ffdetect version 4.3.0-emby_2020_02_24 Copyright © 2018-2019 softworkz for Emby LLC

  built with gcc 8.3.0 (crosstool-NG 1.24.0)

  configuration: --cc=x86_64-unknown-linux-gnu-gcc --prefix=/home/embybuilder/Buildbot/x64/ffmpeg-x64/staging --disable-amf --disable-debug --disable-doc --disable-ffplay --disable-vdpau --disable-xlib --enable-fontconfig --enable-gnutls --enable-gpl --enable-iconv --enable-libass --enable-libfreetype --enable-libfribidi --enable-libmp3lame --enable-libopus --enable-libtheora --enable-libvorbis --enable-libwebp --enable-libx264 --enable-libx265 --enable-libzvbi --enable-version3 --enable-libsmbclient --enable-libdav1d --enable-libvpx --enable-cuda-llvm --enable-cuvid --enable-libmfx --enable-nvdec --enable-nvenc --enable-vaapi --enable-cross-compile --cross-prefix=x86_64-unknown-linux-gnu- --extra-libs='-lexpat -lfreetype -lfribidi -lfontconfig -liconv -lpng -lz -lvorbis -logg -lnettle -lhogweed -lgmp -laddns-samba4 -lasn1util-samba4 -lauthkrb5-samba4 -lCHARSET3-samba4 -lcliauth-samba4 -lcli-cldap-samba4 -lcli-ldap-common-samba4 -lcli-nbt-samba4 -lcli-smb-common-samba4 -lcom_err -lcommon-auth-samba4 -ldbwrap-samba4 -ldcerpc-binding -ldcerpc-samba-samba4 -ldl -lflag-mapping-samba4 -lgenrand-samba4 -lgensec-samba4 -lgse-samba4 -lgssapi_krb5 -llibcli-lsa3-samba4 -llibsmb-samba4 -linterfaces-samba4 -liov-buf-samba4 -lk5crypto -lkrb5 -lkrb5samba-samba4 -lkrb5support -lldb -lldbsamba-samba4 -lm -lmessages-dgm-samba4 -lmessages-util-samba4 -lmsghdr-samba4 -lmsrpc3-samba4 -lndr -lndr-krb5pac -lndr-nbt -lndr-samba-samba4 -lndr-standard -lreplace-samba4 -lsamba-cluster-support-samba4 -lsamba-credentials -lsamba-debug-samba4 -lsamba-errors -lsamba-hostconfig -lsamba-modules-samba4 -lsamba-security-samba4 -lsamba-sockets-samba4 -lsamba-util -lsamba3-util-samba4 -lsamdb -lsamdb-common-samba4 -lsecrets3-samba4 -lserver-id-db-samba4 -lserver-role-samba4 -lsmbconf -lsmbd-shim-samba4 -lsmb-transport-samba4 -lsocket-blocking-samba4 -lsys-rw-samba4 -ltalloc -ltalloc-report-samba4 -ltdb -ltdb-wrap-samba4 -ltevent -ltevent-util -ltime-basic-samba4 -lutil-cmdline-samba4 -lutil-reg-samba4 -lutil-setid-samba4 -lutil-tdb-samba4 -luuid -lwbclient -lwinbind-client-samba4 -ldrm' --arch=x86_64 --target-os=linux --pkg-config=pkg-config --enable-shared --disable-static

  libavutil      56. 36.100 / 56. 36.100

[DEVICE]

DeviceIndex=0

DEVICEINFO:Name=GeForce GTX 1660

DEVICEINFO:Description=GeForce GTX 1660

DEVICEINFO:ComputeCapMajor=7

DEVICEINFO:ComputeCapMinor=5

DEVICEINFO:PROPERTIES:ClockRate=1830000

DEVICEINFO:PROPERTIES:MultiprocessorCount=22

DEVICEINFO:PROPERTIES:Integrated=0

DEVICEINFO:PROPERTIES:CanMapHostMemory=1

DEVICEINFO:PROPERTIES:ComputeMode=0

DEVICEINFO:PROPERTIES:ConcurrentKernels=1

DEVICEINFO:PROPERTIES:PciBusId=7

DEVICEINFO:PROPERTIES:PciDeviceId=0

DEVICEINFO:PROPERTIES:TccDriver=0

DEVICEINFO:PROPERTIES:MemoryClockRate=4001000

DEVICEINFO:PROPERTIES:GlobalMemoryBusWidth=192

DEVICEINFO:PROPERTIES:AsyncEngineCount=3

DEVICEINFO:PROPERTIES:UnifiedAddressing=1

DEVICEINFO:PROPERTIES:PciDomainId=0

DEVICEINFO:PROPERTIES:ComputeCapabilityMajor=7

DEVICEINFO:PROPERTIES:ComputeCapabilityMinor=5

DEVICEINFO:PROPERTIES:ManagedMemory=1

DEVICEINFO:PROPERTIES:MultiGpuBoard=0

DEVICEINFO:PROPERTIES:MultiGpuBoardGroupId=0

ERROR:Number=801

ERROR:Message=Failed creating CUDA context for device 0

[/DEVICE]

 

 

Oof, disappointing. Will try again in the morning. 

Edited by Emptee
ramonrue
Posted

Just for completeness: Emby with GPU is working for me on the k8s cluster; currently I'm on Kubernetes 1.18.

 

I'm not quite sure how to interpret your log outputs, but maybe you could try using another container runtime?

I'm on containerd and added the following snippet to my containerd config:
 

  [plugins.linux]
    shim = "containerd-shim"
    runtime = "nvidia-container-runtime"
    runtime_root = ""
    no_shim = false
    shim_debug = false

I followed this blog post: https://josephb.org/blog/containerd-nvidia/

Also, you might need to specify the runtime in your podspec (I guess with this https://kubernetes.io/docs/concepts/containers/runtime-class/ ?), or update the default runtime for Docker to nvidia. 
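
For the RuntimeClass route, a minimal sketch would be something like the following. This assumes a containerd runtime handler named "nvidia" (pointing at nvidia-container-runtime) has been configured; the handler name is an assumption, not copied from my setup:

```
# Sketch: RuntimeClass mapping pods to the NVIDIA container runtime handler.
apiVersion: node.k8s.io/v1beta1
kind: RuntimeClass
metadata:
  name: nvidia
handler: nvidia    # must match the runtime handler name configured in containerd
```

The pod spec would then reference it with runtimeClassName: nvidia. If you're on Docker instead of containerd, the alternative mentioned above is setting "default-runtime" to nvidia in the Docker daemon config.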

Posted (edited)

Okay... I didn't do much and now it's working perfectly. 

apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/instance: emby
    app.kubernetes.io/name: emby
  name: emby-server-svc
  namespace: default
spec:
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: http-port
  selector:
    app: emby-server
  sessionAffinity: None
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: emby-server
  labels:
    app.kubernetes.io/instance: emby
    app.kubernetes.io/name: emby
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: emby-server
  template:
    metadata:
      labels:
        app: emby-server
    spec:
      containers:
      - name: emby-server
        image: emby/embyserver:4.4.2.0
        ports:
        - name: http-port
          containerPort: 8096
        - name: https-port
          containerPort: 8920
        volumeMounts:
        - mountPath: /config
          name: emby-config
        - mountPath: /mnt/library
          name: media-library
        resources:
          limits:
            nvidia.com/gpu: 1
        env:
        - name: UID
          value: "1002"
        - name: GID
          value: "100"
        - name: NVIDIA_VISIBLE_DEVICES
          value: "all"
        - name: NVIDIA_DRIVER_CAPABILITIES
          value: "compute,utility,video"
        livenessProbe:
          httpGet:
            path: /web/index.html
            port: 8096
          timeoutSeconds: 30
        readinessProbe:
          httpGet:
            path: /web/index.html
            port: 8096
          timeoutSeconds: 30
      volumes:
      - name: emby-config
        persistentVolumeClaim:
          claimName: emby-config
          readOnly: false
      - name: media-library
        nfs:
          server: <NFS Server>
          path: /path/to/Media
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: emby-server
  namespace: default
  labels:
    app: emby-server
    app.kubernetes.io/instance: emby
    app.kubernetes.io/name: emby
  annotations:
    kubernetes.io/ingress.class: "public"
    nginx.ingress.kubernetes.io/proxy-body-size: 500m
    nginx.org/client-max-body-size: 500m
    nginx.org/proxy-connect-timeout: 75s
    nginx.org/proxy-read-timeout: 60s
spec:
  rules:
  - host: <HOSTNAME>.domain.tld
    http:
      paths:
      - path: /
        backend:
          serviceName: emby-server-svc
          servicePort: http

I updated the deployment with a proper liveness/readiness check, since the / path returns a 301 redirect instead of a 200 OK. That makes the ingress grumpy sometimes. 

 

Next up is cert-manager, but that's outside the scope of Emby. Enjoy Kubernetes, everyone!

Edited by Emptee
Posted

That's great, thanks for the feedback.

  • 4 years later...
Posted
On 08/05/2020 at 06:35, Emptee said:

Okay... I didn't do much and now it's working perfectly. [...]

Hey dude, I've been trying to get my GPU available to my Emby, which is running on k3s/Rancher. 

Are you running single-node k8s? If not, how do you make the GPU available to each node for failover? 

Also, is there anything specific you had to do with the node and k8s to make the GPU available to the deployment, other than the environment variables? 

 

Thanks 
