Linux - GPU Passthrough - Multi-method Hardware Transcoding


#1 mcdonald9809 OFFLINE  

mcdonald9809

    Newbie

  • Members
  • 2 posts
  • Local time: 11:31 PM

Posted 05 January 2020 - 04:01 AM

Adventures in GPU Passthrough and Emby Server

 

I decided to post this as I am sure I am not the only person wondering about trying a server with multiple GPUs for transcoding. From my experimentation, each GPU used must have its own Emby Server instance/VM.

 

Over the last 3 months I have been experimenting with Linux, Emby and GPU passthrough. I have tried many distros and configurations. I started with Arch Linux to cut my teeth on GPU passthrough, as the documentation is excellent. My adventures have proven very educational. It does seem like every blog and post out there is missing at least one thing if you are doing a 'from scratch' setup.

Here is what I have currently settled on as my 'production' system (yes, using Ubuntu Server). This can be done using Arch, Fedora, CentOS ... however, I found this to be stable (for my use case). I have also tried Proxmox and ESXi. I was able to set this system up with Proxmox... EXCEPT I ran into the AMD GPU reset issue in the VM, which is pretty universal across ALL distros and configurations (in my experience). Therefore, I settled on the base system (the VM host) also being an Emby server running on Ubuntu 18.04 and using the AMD GPU for hardware acceleration in Emby transcoding. This works great: in the event you need to reset the AMD-based transcoding server, you won't run into the GPU hanging issue, since the entire server will be rebooted.

Below is the hardware I am using. I began this experiment with an i7 4790K, 16GB DDR3 and an ASUS Z97-A motherboard. That configuration worked, but when I decided I wanted to run ALL of the GPUs in ONE system, I found that I was running out of processors to dedicate to VMs, hence I went with the 8700K and NOT the 9700K (12 threads vs 8 threads). Another bonus of the processor upgrade is hardware HEVC encoding/decoding, which the older 4790K didn't have.

At heart I am a 'bang-for-the-buck' kinda person, so EVERY other computer/server I have is AMD-based. But as a technologist and realist, I realized that if I wanted a multi-method transcoding server in one box, I needed to go with Intel (at least until they release their GPU in the coming year(s)). I chose the ATX motherboard platform due to the number of full-size PCIe slots (even though not all of them are 16x slots). The motherboard I selected has 3 full-size PCIe slots, and I configured them as 8x/4x/4x. I have the AMD RX 570 in the 8x slot, the MSI GTX 1050Ti in a 4x slot and an Intel PRO/1000 quad-port card in the final 4x PCIe slot. In my VMs I assign each VM its own NIC port on the Intel quad-port card.
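
For anyone checking whether a GPU or NIC port can be handed to a VM on its own, a minimal sketch that lists which PCI devices share an IOMMU group. It assumes the standard Linux sysfs layout under /sys/kernel/iommu_groups; the function name list_iommu_groups is just an illustrative helper, not part of libvirt or Emby.

```python
from pathlib import Path


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group_number: [pci_address, ...]} read from sysfs.

    An empty dict usually means the IOMMU is disabled in firmware or
    was not enabled on the kernel command line.
    """
    base = Path(root)
    groups = {}
    if not base.is_dir():
        return groups
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        # Skip anything that is not a numbered group directory.
        if not group_dir.name.isdigit() or not devices_dir.is_dir():
            continue
        groups[int(group_dir.name)] = sorted(d.name for d in devices_dir.iterdir())
    # Sort by group number so the output is stable between runs.
    return dict(sorted(groups.items()))


if __name__ == "__main__":
    for number, devices in list_iommu_groups().items():
        print(f"group {number}: {', '.join(devices)}")
```

Devices that land in the same group generally have to be passed through together, so a GPU sharing a group with unrelated devices is worth knowing about before building the VMs.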

You may be wondering, "Where are the media files stored?" Well, I have a FreeNAS server (ZFS RaidZ2) that holds all the media and the Emby Server configuration backup files. Once I figured out how to use the Backup Configuration plugin, my experimentation became so much easier. Purchasing my Lifetime Emby Premiere License has been invaluable (No, I am not affiliated with Emby, other than being a user).

Multi-method transcoding Server

Intel I7 8700K (6core/12thread) w/UHD 630 graphics
Gigabyte Z390 UD motherboard
G.Skill Ripjaws V 16GB (2x8GB) DDR4-3200
Intel PRO/1000 PTQuadPort (PCIe 4x)
MSI GTX 1050Ti Gaming X 4GB
MSI RX 570 Armor 4GB OC
Inland Professional 120GB SSD (SATA) - Boot/OS
WD 1TB HDD (SATA) - VM & Storage


Base OS : Ubuntu Server 18.04.3
Add-ons : Libvirt, QEMU, OVMF, Cockpit Web Management, Emby-Media server (AMD-based transcoding)

VM1 (Intel-based Transcoding)
OS : Ubuntu Server 18.04.3
CPU : 2
Memory : 2GB

VM2 (NVIDIA-based Transcoding)
OS : Ubuntu Server 18.04.3
CPU : 2
Memory : 2GB

VM3 (Software Transcoding)
OS : Ubuntu Server 18.04.3
CPU : 4
Memory : 4GB

 

You will notice that the VMs that use hardware-accelerated transcoding only have two (2) CPUs assigned, as the GPU does all the heavy lifting and you don't need a bunch of CPU power. I have tested this using my cell phone to 'force' the GPU to transcode. I have noticed that on occasion, if the source is HEVC, it defaults to software transcoding for some inexplicable reason.
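
A quick sketch of the thread budget behind those numbers. It assumes one vCPU per host thread with no overcommit (my reading of "dedicate", not something libvirt enforces), and the helper name threads_left_for_host is made up for the example.

```python
# vCPU counts taken from the VM list above.
vm_vcpus = {
    "VM1 (Intel)": 2,
    "VM2 (NVIDIA)": 2,
    "VM3 (software)": 4,
}


def threads_left_for_host(total_threads, allocations):
    """Threads remaining for the host after dedicating one thread per vCPU."""
    used = sum(allocations.values())
    if used > total_threads:
        raise ValueError(f"{used} vCPUs requested, only {total_threads} threads")
    return total_threads - used


print(threads_left_for_host(12, vm_vcpus))  # i7-8700K, 12 threads -> 4
print(threads_left_for_host(8, vm_vcpus))   # i7-4790K, 8 threads  -> 0
```

With the older 4-core/8-thread CPU nothing is left over for the host and its own Emby instance, which matches running out of processors to dedicate.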

 

I have been able to rebuild this system numerous times with full transcoding functionality. FYI, I am using the Ubuntu drivers for ALL GPUs (Intel i915, NVIDIA and AMD)... if you are interested, I can make another post with links to the sources I used.


I will also be testing AMD-based Hardware accelerated transcoding using the AMD Ryzen 5-based APU (R5-3400G).

I have an AMD-based test server that I will be using to do the same as above (but without the Intel-based transcoding option).

For those interested.. my File('Media') server is:

AMD Ryzen 5 3400G (4core/8thread) w/Vega 11 graphics
ASRock Fatal1ty B450 Gaming-ITX/ac motherboard
G.Skill Aegis 16GB (2x8GB) DDR4-3000
LSI SAS9211-8i 8-port 6Gb/s SATA+SAS PCIe 2.0
2x Mini-SAS to SATA Cable (SFF-8087 to SATA Forward Breakout) - SAS to 4 SATA connectors
6x Hitachi HGST Ultrastar 7k4000 3TB 7200RPM SATA III
*NOTE: Storage is ZFS RAID-Z2 (tolerates 2 drive failures) - 10TB available storage.
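
A rough sanity check of the RAID-Z2 capacity for six 3TB drives. This is only the simple "data drives times drive size" estimate; the helper name raidz_usable_bytes is hypothetical, and it deliberately ignores ZFS metadata, padding and reserved space.

```python
TB = 10**12   # terabyte, as printed on the drive label
TiB = 2**40   # tebibyte, as many tools report it


def raidz_usable_bytes(drive_count, drive_bytes, parity):
    """Rough RAID-Z estimate: capacity of the non-parity drives only."""
    if drive_count <= parity:
        raise ValueError("need more drives than parity devices")
    return (drive_count - parity) * drive_bytes


usable = raidz_usable_bytes(drive_count=6, drive_bytes=3 * TB, parity=2)
print(usable / TB)             # 12.0
print(round(usable / TiB, 2))  # 10.91
```

The space actually available in the pool ends up below this estimate once filesystem overhead is taken out.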

I hope this is beneficial information. @softworkz

HiTekAgPilot



#2 softworkz ONLINE  

softworkz

    Advanced Member

  • Developers
  • 2408 posts
  • Local time: 07:31 AM

Posted 05 January 2020 - 09:57 PM

@mcdonald9809 - Thanks for sharing your experience.
 

From my experimentation, each GPU used must have its own Emby Server instance

 

 

As for the current situation: Emby supports and detects all available GPUs that provide hw acceleration.

Emby is also able to select and address a specific GPU, even when there are multiple ones of the same type available.

You can select and prioritize multiple GPUs - individually for each encoding and decoding codec type.

 

An example: You have an Intel CPU that can decode H.264 but only up to full HD resolution and you have an Nvidia GPU that can decode H.264 up to 8k.

When you put the Intel QuickSync/VAAPI decoder in the first position and Nvidia in the second position, Emby will always use the Intel acceleration for resolutions up to full HD, but Nvidia hw acceleration for resolutions greater than that.
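
To illustrate the idea, here is a minimal sketch of first-match selection over a priority list. It is not Emby's actual code: the Decoder class, pick_decoder and the resolution limits are hypothetical and simply mirror the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decoder:
    name: str
    codec: str
    max_width: int
    max_height: int

    def supports(self, codec, width, height):
        return (codec == self.codec
                and width <= self.max_width
                and height <= self.max_height)


def pick_decoder(priority_list, codec, width, height):
    """Return the first decoder in priority order that can handle the stream."""
    for decoder in priority_list:
        if decoder.supports(codec, width, height):
            return decoder.name
    return "software"  # nothing matched: fall back to CPU decoding


# Hypothetical limits taken from the example: Intel up to full HD, Nvidia up to 8k.
priority = [
    Decoder("Intel QuickSync/VAAPI", "h264", 1920, 1080),
    Decoder("Nvidia", "h264", 7680, 4320),
]

print(pick_decoder(priority, "h264", 1920, 1080))  # Intel QuickSync/VAAPI
print(pick_decoder(priority, "h264", 3840, 2160))  # Nvidia
print(pick_decoder(priority, "hevc", 1920, 1080))  # software
```

The same input always yields the same choice, because the list is walked in a fixed order and nothing depends on current load.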

 

The one thing that Emby isn't able to do (yet) is any kind of "load balancing" across multiple devices. It always follows the same rules and priorities, which means that the result is deterministic. We had stated quite a while ago that we wanted to deliver that soon, but it turned out that it can't be implemented as straightforwardly as expected. Determinism is one of the problems. Imagine you're watching a video, then you skip to another position and transcoding is restarted, but load balancing chooses a different hw acceleration that encodes video at a different quality level -> the user would wonder why the video quality changes after skipping.

(that's not "the" problem - it's just the best example I could think of - to illustrate that there are certain consequences you wouldn't have expected)

 

I would still really like to deliver that feature, but we need to make sure that we don't add a checkbox that sounds cool (load balancing) and whose primary effect for some users would be "sometimes it works, sometimes not, unable to reproduce reliably" (because different hw accelerations are chosen for each attempt).



#3 softworkz ONLINE  


Posted 05 January 2020 - 10:02 PM

From my experimentation, each GPU used must have its own Emby Server instance


So, yes, that's right when you want to leverage multiple hw accelerations to the full extent AND it doesn't matter that clients need to connect to multiple servers.

 

 

What's NOT necessarily needed though, is to run Emby server in separate virtual machines. At least not when it's about accessing hw accelerations.

It might need some Linux expertise to get multiple instances of Emby running in parallel, but it's no problem having one instance using hw1 and another one using hw2.



#4 mcdonald9809 OFFLINE  


Posted 06 January 2020 - 10:35 AM

In light of this information from Softworkz, I will begin testing this on my testbench and see what I get. Thank you for the information. This would seem like a much better configuration.

 

hitekagpilot






