DarinM 11 Posted November 22, 2025 Posted November 22, 2025

top - 09:49:11 up 11 days, 6 min, 4 users, load average: 4.06, 3.35, 1.81
Tasks: 354 total, 1 running, 353 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.1 us, 0.1 sy, 0.0 ni, 99.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 63431.6 total, 8439.2 free, 2356.9 used, 53461.9 buff/cache
MiB Swap: 5120.0 total, 5118.7 free, 1.2 used. 61074.7 avail Mem

System is running RockyLinux 10.1. This also happened on RockyLinux 8.10. I had hoped that updating to the newer OS would resolve the issue.

If someone starts playing a video and it uses DirectPlay, everything is fine. If not, the person trying to watch gets a spinner for a long time until it errors and tries a different method of playback. When that happens it appears that threads are left hanging. The extra threads stay there and the more this happens the more threads are left hanging. I've seen 30 or more threads hanging on my system.

virtual logs]# ps -T -p 1001073
PID SPID TTY TIME CMD
1001073 1001073 ? 00:00:01 EmbyServer
1001073 1001074 ? 00:00:00 .NET SynchManag
1001073 1001075 ? 00:00:00 .NET EventPipe
1001073 1001076 ? 00:00:00 .NET DebugPipe
1001073 1001077 ? 00:00:00 .NET Debugger
1001073 1001078 ? 00:00:00 .NET Finalizer
1001073 1001080 ? 00:00:00 .NET Long Runni
1001073 1001081 ? 00:00:00 .NET SigHandler
1001073 1001082 ? 00:00:00 .NET Sockets
1001073 1001084 ? 00:00:00 .NET TP Gate
1001073 1001086 ? 00:00:00 Kestrel Timer
1001073 1001090 ? 00:00:00 .NET Timer
1001073 1001096 ? 00:00:00 .NET TP Worker
1001073 1001098 ? 00:00:00 .NET File Watch
1001073 1001099 ? 00:00:00 .NET File Watch
1001073 1001110 ? 00:00:00 .NET File Watch
1001073 1001116 ? 00:00:00 .NET File Watch
1001073 1001119 ? 00:00:00 .NET File Watch
1001073 1001120 ? 00:00:00 .NET File Watch
1001073 1001121 ? 00:00:00 .NET File Watch
1001073 1001122 ? 00:00:00 .NET File Watch
1001073 1001150 ? 00:00:00 .NET File Watch
1001073 1001154 ? 00:00:00 .NET File Watch
1001073 1001159 ? 00:00:00 .NET File Watch
1001073 1001160 ? 00:00:00 .NET File Watch
1001073 1001161 ? 00:00:00 .NET File Watch
1001073 1001299 ? 00:00:00 .NET TP Worker
1001073 1001375 ? 00:00:00 .NET TP Worker
1001073 1001387 ? 00:00:00 .NET TP Worker
1001073 1002226 ? 00:00:00 .NET TP Worker
1001073 1002303 ? 00:00:00 .NET TP Worker
1001073 1002480 ? 00:00:00 .NET TP Worker

If you need any other data let me know. I have been trying to keep ahead of this by having cron restart Emby-server a couple of times per day.

embyserver.txt ffmpeg-transcode-8ccf517a-c34b-4d83-a4e7-74a11af969ca_1.txt
yocker 1248 Posted November 22, 2025 Posted November 22, 2025 (edited) Try disabling subtitles altogether. If that works, then try disabling just the subtitle options in the transcode settings. I often see people having problems with subtitles. Edited November 22, 2025 by yocker
Luke 42078 Posted November 22, 2025 Posted November 22, 2025 Hi. Can you try searching for our standard Android app (just "Emby" on Amazon and "Emby for Android" on Google) on the same device's app store and see how that compares? Thanks.
DarinM 11 Posted November 22, 2025 Author Posted November 22, 2025 Hello, I disabled subtitles in the Transcode settings and made sure they were off before starting the video. I also switched to the Android version instead of the AndroidTV version. I still get the spinner, which runs for a long time before playback finally starts. It did use DirectPlay, but I still have 4 threads hanging after playing the video. According to htop, those 4 threads are in uninterruptible sleep and cannot be woken up by signals. embyserver.txt ffmpeg-remux-e5fbce46-9504-4870-bf94-61d905ab9378_1.txt
Q-Droid 989 Posted November 22, 2025 Posted November 22, 2025 What kind of storage are you using? A status of D means the process is in a blocked state waiting for an I/O operation to complete. It could be disk or network, and it could also explain the delays/hangs you've noticed. Nothing that you've posted so far points to a server workload problem; CPU time on everything is pretty low.
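For anyone following along, the D-state check described above can also be done from a shell instead of watching htop. This is a minimal sketch, assuming a Linux host where each thread exposes a status file under /proc/<pid>/task/; the function name list_d_threads and its optional second argument (an alternate proc root, useful only for testing) are made up for illustration.

```shell
#!/usr/bin/env bash
# Print "TID<TAB>thread name" for every thread of a process whose state is
# D (uninterruptible sleep).
# Usage: list_d_threads <pid> [proc_root]
# proc_root is a hypothetical convenience argument; it defaults to /proc.
list_d_threads() {
    local pid="$1" root="${2:-/proc}" status tid
    for status in "$root/$pid"/task/*/status; do
        # Threads can exit while we iterate; skip anything unreadable.
        [ -r "$status" ] || continue
        tid=${status%/status}
        tid=${tid##*/}
        # The status file has tab-separated "Name:" and "State:" lines,
        # for example "State:<TAB>D (disk sleep)".
        awk -F'\t' -v tid="$tid" '
            $1 == "Name:"  { name = $2 }
            $1 == "State:" { state = substr($2, 1, 1) }
            END { if (state == "D") print tid "\t" name }
        ' "$status"
    done
}
```

With the PID from the first post this would be called as list_d_threads 1001073. A thread that shows up once is not necessarily a problem; it is the ones that stay listed across repeated runs that are worth noting.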
DarinM 11 Posted November 22, 2025 Author Posted November 22, 2025 The storage is shared via NFS. It is on a FreeBSD 14.3 server on a gigabit network and is a RaidZ2 ZFS array. The NFS server is similar in that it has very low load. Every TV in the house is tied into the NFS server. Kodi loads videos just fine with no delay or slowness.
Q-Droid 989 Posted November 22, 2025 Posted November 22, 2025 Processes holding a prolonged D state are a sign of a system level problem, not application level. It's a system/kernel level operation that isn't completing fast enough. And since you're using NFS then client mount and server options along with the network would be good places to start looking into why they're hanging. Are the media files the only stuff on NFS and the rest of the config and data for Emby are local?
DarinM 11 Posted November 22, 2025 Author Posted November 22, 2025 Correct. Config and Emby are all on RockyLinux. Video files are all on the other system. I have used Kodi successfully without problem since the days of XBMC. I set up Emby so that I could share my library with people outside my home. Kodi has never ever had this issue.
Q-Droid 989 Posted November 22, 2025 Posted November 22, 2025 (edited) Is Kodi also running on the same RockyLinux host and using the same mount points as Emby? Edited November 22, 2025 by Q-Droid
DarinM 11 Posted November 22, 2025 Author Posted November 22, 2025 (edited) No. It's running on TV boxes like the ONN 4K Pro, but they mount the same NFS-shared filesystem.

@Alexandria:/var/log # showmount
Hosts on localhost:
192.168.1.10
192.168.1.169
192.168.1.171
192.168.1.228

Edited November 22, 2025 by DarinM
Q-Droid 989 Posted November 22, 2025 Posted November 22, 2025 Right. The only benefit of comparing or even mentioning Kodi is that those devices are not having problems with the NFS server. The host you're using for Emby does have something going on with some threads hanging on what looks like I/O operations. The process state of D points to a system level issue happening on the OS. What mount options are you using for NFS?
DarinM 11 Posted November 22, 2025 Author Posted November 22, 2025 From the fstab: 192.168.1.9:/media /mnt/alexandria nfs ro,_netdev 2 2
Q-Droid 989 Posted November 23, 2025 Posted November 23, 2025 The dump and fsck pass fields (2 2) should be (0 0) for NFS. They are likely ignored in the fstab because of the filesystem type, but who knows.

You could also add the soft option to allow for timeouts and perhaps better feedback/error reporting if the problem is with NFS. You might get something useful instead of the indefinite hang you get with the default hard option. Since you're mounting read-only, there's no danger of corruption from interrupting an operation and allowing it to time out and fail.

Running the mount command from the shell prompt should also show what defaults are in use for the NFS mount. Some are negotiated at mount time.

The modified entry would look like this:

192.168.1.9:/media /mnt/alexandria nfs ro,soft,_netdev 0 0

Also check your syslog to see if errors are reported around the time window when your server is having playback problems.
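To confirm which options are actually in effect after a remount, the kernel's mount table can be read directly. The following is a rough sketch, assuming a Linux client where /proc/mounts is available; the function names nfs_mount_options and nfs_recovery_mode are hypothetical, and the optional second argument exists only so the functions can be pointed at a sample file.

```shell
#!/usr/bin/env bash
# Print the effective options of an NFS mount, one per line.
# Usage: nfs_mount_options <mount_point> [mounts_file]
nfs_mount_options() {
    local mount_point="$1" mounts_file="${2:-/proc/mounts}"
    # Fields in /proc/mounts: device, mount point, fs type, options, dump, pass.
    awk -v mp="$mount_point" '$2 == mp && $3 ~ /^nfs/ { print $4 }' \
        "$mounts_file" | tr ',' '\n'
}

# Report whether the mount uses soft or hard recovery behaviour.
# Usage: nfs_recovery_mode <mount_point> [mounts_file]
nfs_recovery_mode() {
    local opts
    opts=$(nfs_mount_options "$@")
    if [ -z "$opts" ]; then
        echo "not an NFS mount"
    elif printf '%s\n' "$opts" | grep -qx 'soft'; then
        echo "soft"
    else
        # hard is the default when soft is not listed
        echo "hard"
    fi
}
```

For the mount discussed here the call would be nfs_recovery_mode /mnt/alexandria. The same option list also shows the negotiated protocol version (vers=) and the timeo and retrans values.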
DarinM 11 Posted November 23, 2025 Author Posted November 23, 2025 I modified the fstab and added the soft option. I also changed the dump and fsck pass fields to 0, and set the timeo and retrans options. I then unmounted and remounted the filesystem. When I go to start a movie, there are no errors from the system. This is the output from mount:

192.168.1.9:/media on /mnt/alexandria type nfs (ro,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,soft,proto=tcp,timeo=50,retrans=5,sec=sys,mountaddr=192.168.1.9,mountvers=3,mountport=927,mountproto=udp,local_lock=none,addr=192.168.1.9,_netdev)

FWIW, I started out trying Plex and it wanted to start deleting files. That's why I have things mounted read-only. That's also what prompted me to try Emby, which has been perfect in every other way. I also have the lifetime Premiere license.
Q-Droid 989 Posted November 23, 2025 Posted November 23, 2025 Did the problems come back? If they did is it easy to reproduce and do you have logs for the times it happened?
DarinM 11 Posted November 23, 2025 Author Posted November 23, 2025 Yes, it still does it. Here are the logs. There were no mount errors in the messages log on either system. embyserver.txt hardware_detection-63899508892.txt
Q-Droid 989 Posted November 23, 2025 Posted November 23, 2025 Are you still getting threads stuck in a D state? I was thinking about tracing them if you're up for it. I can post the details later.
DarinM 11 Posted November 23, 2025 Author Posted November 23, 2025 Yes, still getting hung threads. What do I need to do to trace them?
Q-Droid 989 Posted November 23, 2025 Posted November 23, 2025 I'm really curious about these stuck threads. They could be a red herring, but they should not happen on a system that's functioning properly, which is why I'd like to trace them. If anything, it's not normal, and it's the only thing to go on right now.

Stop your Emby server and make sure all related processes are gone, then start it again. If those hung threads remain then maybe reboot the box.

Log in as root or su - to the root user.

Create a new directory and cd into it. It can be under root's home, so /root/something.

Get the PID for the Emby server process, the main one.

From the shell prompt run:

strace -p <emby pid> -ff -o embytrace

You can open a second terminal and monitor the threads with htop, etc.

Now launch the Emby client and run through the media playback scenarios that reproduce the problems, including the stuck threads. When you see any with a state D that persists, make a note of the PIDs.

If your sessions get stuck and you have enough hung threads, you can ctrl-c out of the strace command.

The command will generate a trace file for each thread with the PID as a suffix. Find the ones that match the hung threads and start looking from the bottom to see what the last few operations were. At this point you're looking for hints of what it might have been trying when it hung.

You can also grep all of the trace files using a word that's part of the media file name you played to find the ones involved in transcoding and streaming. These might have hints also pointing to issues. Most of it will look like gibberish, but it will also include low-level operations, file names and paths, HTTP calls, etc.
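Looking at the bottom of every trace file by hand gets tedious when there are dozens of threads. As a rough sketch, assuming the per-thread files were written with the embytrace prefix as in the command above, a small helper can list the calls that never returned. The function name unfinished_calls is made up for illustration. It relies on strace printing "= ?" (or "<unfinished ...>") for a call that has no result yet, and it skips exit and exit_group, which always end in "= ?".

```shell
#!/usr/bin/env bash
# List system calls that never returned in per-thread strace output files.
# Usage: unfinished_calls <trace_prefix>
# Example: unfinished_calls embytrace   (scans embytrace.<pid> files)
unfinished_calls() {
    local prefix="$1"
    # A call with no result is shown as "... = ?" or "... <unfinished ...>".
    # exit()/exit_group() never return by design, so they are filtered out.
    awk '
        /(= \?|<unfinished \.\.\.>)$/ && !/^exit(_group)?\(/ {
            print FILENAME ": " $0
        }
    ' "$prefix".* 2>/dev/null
}
```

Each line of output names the trace file, and therefore the thread ID in its suffix, along with the call it was stuck in. Matching those IDs against the threads noted in state D narrows things down quickly.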
DarinM 11 Posted November 24, 2025 Author Posted November 24, 2025 It took me a couple of tries to make sure I got it right, but here's some trace files. embytrace.1205073 embytrace.1205085 embytrace.1205138 embytrace.1205172
Q-Droid 989 Posted November 24, 2025 Posted November 24, 2025 Were these threads stuck and for how long before you stopped the strace? They were all doing the same thing: trying to set a shared lock on the media file over NFS. The "?" at the end of the flock call means there was no response/result by the time you interrupted the strace.

clock_gettime(CLOCK_MONOTONIC, {tv_sec=1132078, tv_nsec=684739388}) = 0
clock_gettime(CLOCK_MONOTONIC, {tv_sec=1132078, tv_nsec=684794702}) = 0
mprotect(0x7f90938d5000, 4096, PROT_READ|PROT_EXEC) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED, 8, 0x431b000) = 0x7f910a774000
munmap(0x7f910a775000, 4096) = 0
lstat("/mnt/alexandria/movies/Movies/A/Atomic Blonde.mkv", {st_mode=S_IFREG|0777, st_size=14208174769, ...}) = 0
openat(AT_FDCWD, "/mnt/alexandria/movies/Movies/A/Atomic Blonde.mkv", O_RDONLY|O_CLOEXEC) = 371
fstat(371, {st_mode=S_IFREG|0777, st_size=14208174769, ...}) = 0
flock(371, LOCK_SH|LOCK_NB) = ?
+++ exited with 0 +++

A few things you could try:
- As a test, remount the NFS share on the Emby server in read-write mode.
- Is NFSv4 an option for you on the FreeBSD server? It looks like it's using v3 now.
- There might be other services that need to run on the NFS server side to accommodate certain operations. But try the above first to see if anything changes.
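The same lock request can be tried outside of Emby, which helps separate an application problem from an NFS one. On Linux, flock() on an NFS file is emulated with byte-range locks, so on an NFSv3 mount it ends up depending on the server's lock manager. Below is a minimal sketch, assuming flock(1) from util-linux and timeout(1) from coreutils are installed on the client; the function name probe_shared_lock and its messages are made up for illustration.

```shell
#!/usr/bin/env bash
# Try to take a shared, non-blocking lock on a file, giving up after a deadline.
# Usage: probe_shared_lock <file> [seconds]
probe_shared_lock() {
    local file="$1" limit="${2:-10}" rc
    # SIGKILL is used because a task blocked inside the kernel does not act on
    # SIGTERM. A task that cannot be killed at all will stall this probe too,
    # which is itself a sign that the lock request is stuck.
    timeout -s KILL "$limit" flock --shared --nonblock "$file" true
    rc=$?
    case "$rc" in
        0)       echo "lock acquired" ;;
        1)       echo "lock conflict" ;;
        124|137) echo "no answer within ${limit}s" ;;
        *)       echo "unexpected exit status $rc" ;;
    esac
}
```

Running it against one of the affected media files, for example the path shown in the trace, should return "lock acquired" almost immediately on a healthy mount. A timeout here would point at the locking path rather than at Emby.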
DarinM 11 Posted November 24, 2025 Author Posted November 24, 2025 I switched to read/write for the mount, and the sleeping thread trace files are attached below. I also let it sit for a couple of minutes after stopping the video before killing the strace. Switching to NFSv4 will take me some time. It is apparently a completely different animal than NFSv3. embytrace.1221288 embytrace.1221292 embytrace.1221337 embytrace.1221410
Q-Droid 989 Posted November 24, 2025 Posted November 24, 2025 (edited) You mean state = D for these? Looks like the same thing hanging, so at least it's consistent. Try remounting the share(s) with the nolock option:

192.168.1.9:/media /mnt/alexandria nfs ro,nolock,_netdev 0 0

Edit to add: If the above makes a difference then you might want to look into why NLM is not responding on your FreeBSD server. You might need to configure and start additional services. The NFSv3 server and client need rpcbind, rpc.lockd and rpc.statd running. NFSv4 doesn't need these.

Edited November 24, 2025 by Q-Droid
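Whether the lock-related services are registered on the server can be checked from the client with rpcinfo -p, which lists the RPC programs known to the server's rpcbind. The sketch below assumes the usual program numbers (100021 for nlockmgr, served by rpc.lockd, and 100024 for status, served by rpc.statd); the function name missing_lock_services is made up for illustration.

```shell
#!/usr/bin/env bash
# Read "rpcinfo -p <server>" output on stdin and print the NFSv3 locking
# services that are NOT registered. Prints nothing when both are present.
# Usage: rpcinfo -p 192.168.1.9 | missing_lock_services
missing_lock_services() {
    awk '
        $1 == "100021" { nlm = 1 }   # nlockmgr: Network Lock Manager
        $1 == "100024" { nsm = 1 }   # status: Network Status Monitor
        END {
            if (!nlm) print "nlockmgr (rpc.lockd)"
            if (!nsm) print "status (rpc.statd)"
        }
    '
}
```

An empty result only shows that the services are registered with rpcbind. It does not prove they answer correctly, so the client's kernel log is still worth checking afterwards.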
Solution DarinM 11 Posted November 24, 2025 Author Solution Posted November 24, 2025 OK. I think that might have been the issue. I had rpc.lockd running but not rpc.statd. Now that both are running, movies start right away without delay, and I now see the following in /var/log/messages when I mount it:

Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 not responding, still trying
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 not responding, still trying
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 not responding, still trying
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 not responding, still trying
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 not responding, still trying
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 not responding, still trying
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 not responding, still trying
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 not responding, still trying
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 not responding, still trying
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 not responding, still trying
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 OK
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 OK
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 OK
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 OK
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 OK
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 OK
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 OK
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 OK
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 OK
Nov 24 17:18:26 virtual kernel: lockd: server 192.168.1.9 OK
Nov 24 17:18:49 virtual nfsrahead[1227644]: setting /mnt/alexandria readahead to 128
Nov 24 17:18:51 virtual kernel: lockd: unexpected unlock status: 9
Nov 24 17:18:52 virtual kernel: lockd: unexpected unlock status: 9
Nov 24 17:18:52 virtual kernel: lockd: unexpected unlock status: 9
Nov 24 17:18:52 virtual kernel: lockd: unexpected unlock status: 9
Nov 24 17:18:52 virtual kernel: lockd: unexpected unlock status: 9
Nov 24 17:18:52 virtual kernel: lockd: unexpected unlock status: 9
Nov 24 17:18:52 virtual kernel: lockd: unexpected unlock status: 9
Nov 24 17:18:52 virtual kernel: lockd: unexpected unlock status: 9
Nov 24 17:18:52 virtual kernel: lockd: unexpected unlock status: 9
Nov 24 17:18:52 virtual kernel: lockd: unexpected unlock status: 9

I also see the following on the FreeBSD server:

Nov 24 17:19:43 Alexandria rpc.statd[986]: Invalid hostname to sm_mon: virtual
Nov 24 17:19:43 Alexandria kernel: Local NSM refuses to monitor virtual

There are also no more threads in D state in htop. No more hangups so far. I've stopped and started multiple different videos now, and each one starts right away with no hang-ups. I think this is good. I appreciate your help and guidance.
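The thread does not show how the two services were enabled on the FreeBSD side. For readers landing here with the same symptom, a typical /etc/rc.conf fragment for an NFSv3 server with locking support looks roughly like the following. Treat it as a sketch based on the standard FreeBSD rc variables, not as the configuration used in this thread, and check it against your own setup.

```shell
# /etc/rc.conf on the FreeBSD NFS server (illustrative, not taken from the thread)
rpcbind_enable="YES"      # RPC port mapper, needed by NFSv3 and its helper services
nfs_server_enable="YES"   # the NFS server itself
rpc_lockd_enable="YES"    # NLM: file locking for NFSv3 clients
rpc_statd_enable="YES"    # NSM: host status monitoring, used by lockd for lock recovery
```

With those set, the services come up at boot. On a running system they can normally be started with service statd start and service lockd start.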