
Emby system has high load average after playing videos


Solved by DarinM


Posted

top - 09:49:11 up 11 days, 6 min,  4 users,  load average: 4.06, 3.35, 1.81
Tasks: 354 total,   1 running, 353 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.1 us,  0.1 sy,  0.0 ni, 99.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  63431.6 total,   8439.2 free,   2356.9 used,  53461.9 buff/cache
MiB Swap:   5120.0 total,   5118.7 free,      1.2 used.  61074.7 avail Mem
 

System is running RockyLinux 10.1. This also happened on RockyLinux 8.10. I had hoped that updating to the newer OS would resolve the issue. If someone starts playing a video and it uses DirectPlay, everything is fine. If not, the person trying to watch gets a spinner for a long time until it errors out and tries a different method of playback. When that happens, it appears that threads are left hanging. The extra threads stay there, and the more this happens, the more threads are left hanging. I've seen 30 or more threads hanging on my system.

virtual logs]# ps -T -p 1001073
    PID    SPID TTY          TIME CMD
1001073 1001073 ?        00:00:01 EmbyServer
1001073 1001074 ?        00:00:00 .NET SynchManag
1001073 1001075 ?        00:00:00 .NET EventPipe
1001073 1001076 ?        00:00:00 .NET DebugPipe
1001073 1001077 ?        00:00:00 .NET Debugger
1001073 1001078 ?        00:00:00 .NET Finalizer
1001073 1001080 ?        00:00:00 .NET Long Runni
1001073 1001081 ?        00:00:00 .NET SigHandler
1001073 1001082 ?        00:00:00 .NET Sockets
1001073 1001084 ?        00:00:00 .NET TP Gate
1001073 1001086 ?        00:00:00 Kestrel Timer
1001073 1001090 ?        00:00:00 .NET Timer
1001073 1001096 ?        00:00:00 .NET TP Worker
1001073 1001098 ?        00:00:00 .NET File Watch
1001073 1001099 ?        00:00:00 .NET File Watch
1001073 1001110 ?        00:00:00 .NET File Watch
1001073 1001116 ?        00:00:00 .NET File Watch
1001073 1001119 ?        00:00:00 .NET File Watch
1001073 1001120 ?        00:00:00 .NET File Watch
1001073 1001121 ?        00:00:00 .NET File Watch
1001073 1001122 ?        00:00:00 .NET File Watch
1001073 1001150 ?        00:00:00 .NET File Watch
1001073 1001154 ?        00:00:00 .NET File Watch
1001073 1001159 ?        00:00:00 .NET File Watch
1001073 1001160 ?        00:00:00 .NET File Watch
1001073 1001161 ?        00:00:00 .NET File Watch
1001073 1001299 ?        00:00:00 .NET TP Worker
1001073 1001375 ?        00:00:00 .NET TP Worker
1001073 1001387 ?        00:00:00 .NET TP Worker
1001073 1002226 ?        00:00:00 .NET TP Worker
1001073 1002303 ?        00:00:00 .NET TP Worker
1001073 1002480 ?        00:00:00 .NET TP Worker
 

If you need any other data let me know.  

I have been trying to keep ahead of this by having cron restart Emby-server  a couple of times per day.

 

embyserver.txt ffmpeg-transcode-8ccf517a-c34b-4d83-a4e7-74a11af969ca_1.txt

Posted (edited)

Try disabling subtitles altogether.
If that works, then try disabling just the subtitle options in the transcode settings.

People often have problems with subtitles.

Edited by yocker
Posted

Hi. Can you try searching for our standard Android app (just "Emby" on Amazon and "Emby for Android" on Google) on the same device's app store and see how that compares?

Thanks.

 

Posted

Hello,

I disabled subtitles in the Transcode settings and made sure they were off before starting the video. I also switched to the Android version instead of the AndroidTV version. I still get the spinner that runs forever before playback finally starts. It did use DirectPlay, but I still have 4 threads hanging after playing the video. According to htop, those 4 threads are in uninterruptible sleep and cannot be woken up by signals.

 

 

 

Screenshot 2025-11-22 131038.png

embyserver.txt ffmpeg-remux-e5fbce46-9504-4870-bf94-61d905ab9378_1.txt

Posted

What kind of storage are you using? A status of D means the process is in a blocked state (uninterruptible sleep), waiting for an I/O operation to complete. It could be disk or network, and it could also explain the delays/hangs you've noticed. On Linux, threads in D state also count toward the load average, which would explain a load of 4 on a CPU that is 99.9% idle. Nothing that you've posted so far points to a server workload problem; CPU time on everything is pretty low.
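For anyone who would rather script the check than watch htop, here is a minimal Python sketch that lists the threads of one process currently in a given state. It reads the per-thread stat files under /proc, which is standard on Linux. The function names are made up for this example and are not part of Emby or any tool mentioned in this thread.

```python
from pathlib import Path


def parse_stat(line):
    """Return (tid, comm, state) from one /proc/<pid>/task/<tid>/stat line.

    The thread name sits in parentheses and may itself contain spaces or
    parentheses (e.g. ".NET TP Worker"), so split on the LAST ')'.
    """
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    tid = int(line[:open_paren])
    comm = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return tid, comm, state


def threads_in_state(pid, wanted="D", proc_root="/proc"):
    """List (tid, name) for every thread of `pid` whose state equals `wanted`."""
    task_dir = Path(proc_root, str(pid), "task")
    if not task_dir.is_dir():
        return []
    hits = []
    for stat_file in task_dir.glob("*/stat"):
        try:
            tid, comm, state = parse_stat(stat_file.read_text())
        except (OSError, ValueError):
            continue  # the thread exited while we were reading it
        if state == wanted:
            hits.append((tid, comm))
    return sorted(hits)
```

Calling threads_in_state(1001073) with the Emby server PID from the first post would return the thread IDs and names that are stuck in D state at that moment. An empty list means nothing is blocked.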

 

Posted

The storage is shared via NFS.  It is on a FreeBSD 14.3 server on a gigabit network and is a RaidZ2 ZFS array.  The NFS server is similar in that it has very low load.  Every TV in the house is tied into the NFS server.  Kodi loads videos just fine with no delay or slowness.  

Posted

Processes holding a prolonged D state are a sign of a system level problem, not application level. It's a system/kernel level operation that isn't completing fast enough. And since you're using NFS then client mount and server options along with the network would be good places to start looking into why they're hanging.

Are the media files the only stuff on NFS and the rest of the config and data for Emby are local?

 

Posted

Correct.  Config and Emby are all on RockyLinux. Video files are all on the other system.   I have used Kodi successfully without problem since the days of XBMC.  I set up Emby so that I could share my library with people outside my home.  Kodi has never ever had this issue.  

Posted (edited)

Is Kodi also running on the same RockyLinux host and using the same mount points as Emby?

 

Edited by Q-Droid
Posted (edited)

No. It's running on TV boxes like the ONN 4K Pro, but it is mounting the same NFS shared filesystem.

 

@Alexandria:/var/log # showmount
Hosts on localhost:
192.168.1.10
192.168.1.169
192.168.1.171
192.168.1.228
 

Edited by DarinM
Posted

Right. The only benefit of comparing or even mentioning Kodi is that those devices are not having problems with the NFS server. The host you're using for Emby does have something going on with some threads hanging on what looks like I/O operations. The process state of D points to a system level issue happening on the OS.

What mount options are you using for NFS?

 

Posted

From the fstab:

192.168.1.9:/media /mnt/alexandria nfs ro,_netdev 2 2

Posted

The dump and fsck pass fields (2 2) should be (0 0) for NFS. They are likely ignored in the fstab because of the filesystem type, but who knows. You could also add the soft option to allow for timeouts and perhaps better feedback/error reporting if the problem is with NFS. You might get something useful instead of the indefinite hang you get with the default hard option. Since you're mounting read-only, there's no danger of corruption from interrupting an operation and allowing it to time out and fail.

Running the mount command from the shell prompt should also show what defaults are in use for the NFS mount. Some are negotiated at mount time.

The modified entry would look like this:

192.168.1.9:/media /mnt/alexandria nfs ro,soft,_netdev 0 0

Also check your syslog to see if errors are reported around the time window when your server is having playback problems.
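If reading the mount output by eye gets tedious, the options that matter here can be pulled out with a few lines of Python. This is only a sketch: it assumes the usual /proc/mounts layout (device, mount point, type, options, dump, pass), and the function names are invented for illustration.

```python
def parse_mount_options(opts):
    """Split 'ro,vers=3,soft' into bare flags and key=value pairs."""
    flags, values = set(), {}
    for item in opts.split(","):
        if not item:
            continue
        key, sep, val = item.partition("=")
        if sep:
            values[key] = val
        else:
            flags.add(key)
    return flags, values


def nfs_mounts(mounts_text):
    """Map mount point -> (flags, values) for every nfs/nfs4 entry."""
    found = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[2] in ("nfs", "nfs4"):
            found[fields[1]] = parse_mount_options(fields[3])
    return found


def summarize(flags, values):
    """Report the negotiated version and whether soft / nolock are set."""
    return {
        "version": values.get("vers", "unknown"),
        "soft": "soft" in flags,
        "nolock": "nolock" in flags,
    }


if __name__ == "__main__":
    with open("/proc/mounts") as fh:
        for mount_point, (flags, values) in nfs_mounts(fh.read()).items():
            print(mount_point, summarize(flags, values))
```

Run on the Emby host, it prints one line per NFS mount, so it is easy to confirm whether an fstab change actually took effect after remounting.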

 

Posted

I modified the fstab and added the soft option. I also changed the dump and fsck pass fields to 0 and set the timeout and retrans options. I then unmounted and remounted the filesystem.

When I go to start a movie, there are no errors from the system.

This is output from mount:
192.168.1.9:/media on /mnt/alexandria type nfs (ro,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,soft,proto=tcp,timeo=50,retrans=5,sec=sys,mountaddr=192.168.1.9,mountvers=3,mountport=927,mountproto=udp,local_lock=none,addr=192.168.1.9,_netdev)
 

FWIW, I started out trying Plex and it wanted to start deleting files.  That's why I have things mounted read only.  That's also what prompted me to try Emby, which has been perfect in every other way.  I also have the lifetime premiere license.

 

Posted

Did the problems come back? If they did is it easy to reproduce and do you have logs for the times it happened?

 

Posted

Are you still getting threads stuck in a D state? I was thinking about tracing them if you're up for it. I can post the details later.

 

Posted

Yes, still getting hung threads.  What do I need to do to trace them?

Posted

I'm really curious about these stuck threads. They could be a red herring, but they should not happen on a system that's functioning, which is why I'd like to trace them. If anything, it's not normal, and it's the only thing to go on right now.

Stop your Emby server and make sure all related processes are gone, then start it again. If those hung threads remain then maybe reboot the box.

Log in as root or su - to the root user. Create a new directory and cd into it; it can be under root's home, e.g. /root/something.

Get the PID for the Emby server process, the main one.

From the shell prompt run: strace -p <emby pid> -ff -o embytrace

You can open a second terminal and monitor the threads with htop, etc.

Now launch Emby client and run through the media playback scenarios that reproduce the problems, including the stuck threads. When you see any with a state D that persists make a note of the PIDs. If your sessions get stuck and have enough hung threads you can ctrl-c from the strace command.

The command will generate a trace file for each thread with the PID as a suffix. Find the ones that match the hung threads and start looking from the bottom to see what the last few operations were. At this point you're looking for hints of what it might have been trying when it hung. You can also grep all of the trace files using a word that's part of the media file name you played to find the ones involved in transcoding and streaming. These might have hints also pointing to issues. Most of it will look like gibberish but it will also include low level operations, file names and paths, http calls, etc.
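Looking at the bottom of each trace file can be scripted. The sketch below assumes the files were written by strace with -ff -o embytrace, so they are named embytrace.<id>, and it treats a last line ending in "= ?" or "<unfinished ...>" as a call that never returned. The helper names are hypothetical.

```python
from pathlib import Path


def last_calls(trace_dir, prefix="embytrace"):
    """Map thread ID -> last meaningful line of its strace output file."""
    result = {}
    for path in Path(trace_dir).glob(prefix + ".*"):
        suffix = path.name[len(prefix) + 1:]
        if not suffix.isdigit():
            continue  # not a per-thread trace file
        lines = [ln.strip() for ln in path.read_text(errors="replace").splitlines()]
        # Drop blank lines and strace's own "+++ exited ... +++" markers.
        lines = [ln for ln in lines if ln and not ln.startswith("+++")]
        if lines:
            result[int(suffix)] = lines[-1]
    return result


def unfinished(calls):
    """Keep only threads whose last system call never produced a result."""
    return {
        tid: ln
        for tid, ln in calls.items()
        if ln.endswith("= ?") or ln.endswith("<unfinished ...>")
    }


if __name__ == "__main__":
    for tid, call in sorted(unfinished(last_calls(".")).items()):
        print(tid, call)
```

Run from the directory holding the trace files, it prints the thread IDs worth comparing against the ones noted as stuck in D state, along with the call each was waiting on.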

 

Posted

Were these threads stuck and for how long before you stopped the strace?

They were all doing the same thing. Trying to set a shared lock on the media file over NFS. The "?" at the end of the flock call means there was no response/result by the time you interrupted the strace.

clock_gettime(CLOCK_MONOTONIC, {tv_sec=1132078, tv_nsec=684739388}) = 0
clock_gettime(CLOCK_MONOTONIC, {tv_sec=1132078, tv_nsec=684794702}) = 0
mprotect(0x7f90938d5000, 4096, PROT_READ|PROT_EXEC) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED, 8, 0x431b000) = 0x7f910a774000
munmap(0x7f910a775000, 4096)            = 0
lstat("/mnt/alexandria/movies/Movies/A/Atomic Blonde.mkv", {st_mode=S_IFREG|0777, st_size=14208174769, ...}) = 0
openat(AT_FDCWD, "/mnt/alexandria/movies/Movies/A/Atomic Blonde.mkv", O_RDONLY|O_CLOEXEC) = 371
fstat(371, {st_mode=S_IFREG|0777, st_size=14208174769, ...}) = 0
flock(371, LOCK_SH|LOCK_NB)             = ?
+++ exited with 0 +++

A few things you could try:

- As a test remount the NFS share on the Emby server in read-write mode.
- Is NFSv4 an option for you on the FreeBSD server? It looks like it's using v3 now.
- There might be other services that need to run on the NFS server side to accommodate certain operations. But try the above first to see if anything changes.
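To check the locking theory without involving Emby at all, the same call can be reproduced from a small script. This is a rough sketch: it opens a file read-only and requests a shared, non-blocking flock, like the trace above, but does it in a child process so the parent can give up after a timeout. The function name and return values are made up for the example. As far as I know, Linux forwards flock on an NFSv3 mount to the server's lock manager unless nolock or a local_lock option is set, so treat that part as an assumption to verify.

```python
import subprocess
import sys

# Child script: mirrors the openat(O_RDONLY) + flock(LOCK_SH|LOCK_NB) pair
# seen in the strace output.
_CHILD = (
    "import fcntl, os, sys\n"
    "fd = os.open(sys.argv[1], os.O_RDONLY)\n"
    "try:\n"
    "    fcntl.flock(fd, fcntl.LOCK_SH | fcntl.LOCK_NB)\n"
    "except OSError as exc:\n"
    "    print('error:', exc)\n"
    "    sys.exit(1)\n"
    "print('locked')\n"
)


def probe_flock(path, timeout=10.0):
    """Return 'ok', 'error', or 'timeout' for a shared-lock attempt on path."""
    proc = subprocess.Popen(
        [sys.executable, "-c", _CHILD, path],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    try:
        proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Do not wait for the child here: a thread stuck in D state may not
        # react to the kill until the blocked operation finishes.
        proc.kill()
        return "timeout"
    return "ok" if proc.returncode == 0 else "error"


if __name__ == "__main__":
    print(probe_flock(sys.argv[1]))
```

Pointing it at a media file under /mnt/alexandria should return "ok" almost immediately when locking works. A "timeout" would match the hung flock call in the trace.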
 

Posted (edited)

You mean state = D for these?

Looks like the same thing hanging so at least it's consistent. Try remounting the share(s) with the nolock option.

192.168.1.9:/media /mnt/alexandria nfs ro,nolock,_netdev 0 0

Edit to add: If the above makes a difference then you might want to look into why NLM is not responding on your FreeBSD server. You might need to configure and start additional services. The NFSv3 server and client need rpcbind, rpc.lockd and rpc.statd running. NFSv4 doesn't need these.
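One way to see which of those services a host has registered is rpcinfo -p <host>, which prints rows of program number, version, protocol, port and service name. The sketch below parses that text and reports which lock-related programs are missing. It matches on the standard RPC program numbers (100000 rpcbind/portmapper, 100021 nlockmgr, 100024 status) rather than names, since the name column can differ between systems. The function names are hypothetical.

```python
# Standard ONC RPC program numbers involved in NFSv3 locking.
LOCK_PROGRAMS = {
    100000: "rpcbind",
    100021: "nlockmgr (rpc.lockd)",
    100024: "status (rpc.statd)",
}


def registered_programs(rpcinfo_text):
    """Collect the program numbers listed in `rpcinfo -p` output."""
    programs = set()
    for line in rpcinfo_text.splitlines():
        fields = line.split()
        # Data rows start with a numeric program number; the header does not.
        if len(fields) >= 4 and fields[0].isdigit():
            programs.add(int(fields[0]))
    return programs


def missing_lock_programs(rpcinfo_text):
    """Return the names of lock-related programs that are not registered."""
    have = registered_programs(rpcinfo_text)
    return [name for number, name in LOCK_PROGRAMS.items() if number not in have]


if __name__ == "__main__":
    import sys
    # Example: rpcinfo -p 192.168.1.9 | python3 this_script.py
    print(missing_lock_programs(sys.stdin.read()) or "all lock services registered")
```

It is worth running against both the NFS server and the client, because the lock and status daemons need to be reachable on each side for NFSv3 locking.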

 

Edited by Q-Droid
  • Solution
Posted

OK, I think that might have been the issue. I had rpc.lockd running but not rpc.statd. Now that both are running, movies start right away without delay, and I now see the following in /var/log/messages when I mount the share:

Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 not responding, still trying
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 not responding, still trying
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 not responding, still trying
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 not responding, still trying
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 not responding, still trying
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 not responding, still trying
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 not responding, still trying
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 not responding, still trying
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 not responding, still trying
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 not responding, still trying
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 OK
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 OK
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 OK
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 OK
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 OK
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 OK
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 OK
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 OK
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 OK
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 OK
Nov 24 17:18:49 virtual nfsrahead[1227644]: setting /mnt/alexandria readahead to 128
Nov 24 17:18:51 virtual kernel: lockd: unexpected unlock status: 9
Nov 24 17:18:52 virtual kernel: lockd: unexpected unlock status: 9
Nov 24 17:18:52 virtual kernel: lockd: unexpected unlock status: 9
Nov 24 17:18:52 virtual kernel: lockd: unexpected unlock status: 9
Nov 24 17:18:52 virtual kernel: lockd: unexpected unlock status: 9
Nov 24 17:18:52 virtual kernel: lockd: unexpected unlock status: 9
Nov 24 17:18:52 virtual kernel: lockd: unexpected unlock status: 9
Nov 24 17:18:52 virtual kernel: lockd: unexpected unlock status: 9
Nov 24 17:18:52 virtual kernel: lockd: unexpected unlock status: 9
Nov 24 17:18:52 virtual kernel: lockd: unexpected unlock status: 9
 

I also see the following on the FreeBSD server:

Nov 24 17:19:43 Alexandria rpc.statd[986]: Invalid hostname to sm_mon: virtual
Nov 24 17:19:43 Alexandria kernel: Local NSM refuses to monitor virtual

 

There are also no more threads in D state in htop. No more hangups so far. I've stopped and started multiple different videos now, and each one starts right away.

I think this is good. I appreciate your help and guidance.

 

Posted

Thanks for following up.
