PLEASE HELP! Really not sure what is happening here.


TolkienBard

I'm really not sure at all what is going on with my server, so I'm going to just put it all out there with some included logs and hope that this issue (or perhaps these issues) can be worked through in steps. The problem I have run into is apparently a progressive one. I'm not certain exactly when it started though, as I originally thought the issue was Roku client related, so I paid the problem little mind. The long and the short of things though, before I get into a more detailed description of the issue as it is presenting, is that my Emby server is behaving as though it doesn't have the necessary horsepower to function properly. The thing is, I've been using Emby for a year now, and it used to run just fine. I could service 3-5 transcoding streams and run scheduled tasks like library scans all at the same time. Now, I am lucky to get one stream going to an Android client, and that still only fires up properly if the server is idle. If there happens to be a library scan going, forget about it. Library scans are now taking nearly 2.5 hours, even if there have been no changes made to the library in days. Furthermore, these library scans are getting held up fairly regularly around 36% or 67%. Eventually (almost always) the scan gets beyond whatever point it gets hung up at, but it is not out of the question for that to take 30+ minutes to happen. When this latest beta was released, the library scan took a bit over half a day to complete and the Roku thumbnails took an additional 18 hours. During those 31+ hours, the server was unable to service any clients at all.

 

If it were not for the fact that the server used to perform beautifully and provide several streams without issue, I would be less concerned. But, as it is, with all the improvements made over the past year that have largely gone to improving server performance, I am beginning to get more and more confused as to what could possibly be causing the error. The following is a more detailed description of how things are going awry. Hopefully the completeness helps to more rapidly assist someone in helping me track down the culprit.

 

Roku: As I said, this started off as a problem I thought was just within the Roku client. My brother's Roku, which is hardwired to the server's network, started having issues a bit over a month ago. He can launch the Roku's Emby channel, but that's about it. When he does, the folders that go across the top of the home page UI show up, or at least most of them usually do, but then the rest of the page never populates. It doesn't matter how long he lets the channel sit there trying to finish loading, nothing happens. We tried completely uninstalling and then installing fresh. Nothing seemed to help. Since we are on opposite schedules, we decided to put off spending dedicated time fixing it until the vacation he started this morning. Unfortunately, in that time, it seems the problem has spread.

 

Samsung: About 3 or 4 weeks ago, I started having some issues with the Samsung client slowing down horribly. I checked the Samsung section of the forums and couldn't find any mention of others having a similar issue, so I continued to poke about on my end trying to figure out what was wrong. Then, about 10 days ago, the client went from slow, but still usable to nearly broken. Now, if I can get it to load, it is a miracle. Furthermore, what used to be a somewhat slow, but still highly responsive client has become a plodding thing that cannot be trusted. When I go to log in using the Samsung app, it can now take 5-8 MINUTES for the initial sign in screen to finish loading. After a user is selected, it can take 30 seconds to as long as another 5 minutes for the app to finish transitioning to the user's home screen.  From that point, sometimes the app functions, but with plenty of lag. Far more often than not though, button presses can take 30 seconds to a few minutes to register. Unfortunately, it is impossible to tell when a button press has been registered or not. This leads to the button being pressed more than once sometimes. That would not necessarily be a big issue, except that sometimes the app is apparently responsive enough that it queues up the button presses. Then it gets entirely crazy as suddenly the app will go from seeming not responsive to suddenly jumping through 3 or 4 commands before becoming seemingly unresponsive again. For the last two days though, getting any sort of response beyond loading the initial screen has become almost impossible.

 

Windows App: About 2 weeks ago I started experiencing a problem with the Windows app loading. However, when I checked the forums, I discovered I was not the only person with this issue. There were a number of us having trouble with the eternally spinning loading circle. A quick fix was applied, and for a few days, everything was right as rain. Then, the app started slowing down again. Now, there is a significant, but not intolerable delay for the app to initially load. This load almost always results in a fully populated home screen. But then, if I select a category, such as "Movies" or "TV", I get another long delay before the new page loads. This page will only fully populate about 75% of the time. If I choose to watch media already listed on that page, I have about a 1 in 3 chance of the media loading at all. Almost always, I get a long screen of attempting to load before the app just suddenly closes. If I attempt to navigate to a deeper page beyond the folder, I get the irritating loading circle just staring at me forever - except for those times when the app then just decides to close itself.

 

Android: If the server is idle, my wife is able to log in and stream a show using the Android app on her tablet or mobile. If, by chance a library scan is running, she runs into the very same issue with the app constantly just reporting that it is loading. If she is streaming something, I can indeed launch other tasks via the server dashboard if she is watching via the Android app, but when the show ends, her menu screens become unresponsive and simply keep trying to load unless she just happens to be playing whatever is "up next".

 

Web Client: Despite all these issues, one thing I have always been able to count on is the web client. No matter what sort of issues I might be having with any other means of accessing the server, I could always count on the web client to make it possible for me to watch something. That is no longer the case. About 3 or 4 days ago now, the web client started slowing down as well. Now, if I attempt to log onto the media server using the web client, I get numerous long waits between page populations. It is not uncommon for the pages to simply fail to load at all. When they do load, they are slow to respond, and if I try to select something to watch, the client attempts futilely to load whatever it is I select. This behaviour persists whether I log in through direct connection via the web address, or if I use Emby Connect. The PC is hardwired to the server's network, so it is not an issue of lag. Meanwhile properly populated pages are becoming more and more of a rare occurrence.

 

Apple: I have not personally witnessed it, but I am being told that attempts to log in to the server now result in a blank loading screen that never evolves into a populated page.

 

Given how universal the issue seems to have become, I can only assume the problem is server side, and not client side. The thing is, given how strong the server used to perform before, this simply makes no sense.

 

Here are some specifics, followed by a few different logs.

 

Server PC:

Processor: Dual Xeon L5520 @ 2.27 GHz (This is by no means a world-beating machine, but should be far, far more than enough to get the job done and then some.)

Memory: 24 GB

OS: Windows 8.1

 

Client PC:

Processor: i5-4570 @3.2 GHz

Memory: 8 GB

OS: Windows 10 

 

Samsung TV:

PN60E550 running the latest client app as of 26 October 2015

 

 

When the server received a few patch updates during the last week, I ran into some unhandled exception errors when the server would restart in order to install them, but for something in the vicinity of 5 months, that has been the issue now.

 

Furthermore, when I open the process monitor and task manager, I am told that Emby Server is only using about 5-10% of the CPU, while memory holds steady at 13% usage. The total server load is usually under 10%. This tracks pretty well with what I remember before this problem started, and would seem to indicate that the server machine itself is indeed more than capable of handling the workload.

 

Going through the logs attached below, I can find instances of an error being thrown here or there, but I cannot find anything that seems to be a persistent culprit, nor am I savvy enough to figure out how to address the issues I am only sort of seeing.

 

 

 

Problem Example Log.txt

unhandled log.txt

Server-Current Log.txt

In these kinds of situations it's always recommended to remove plugins and go from there. You have about 20 plugins installed, many of which don't come from the core team so we're unable to manage what they do. CoverArt and Roku thumbnails both provide benefits at a cost of additional CPU usage. So I would start just using the core server and working your way up from there.

TolkienBard

That was what I was in the process of doing right after I posted. Stripped of absolutely every plugin, my performance when using the Windows app and trying to log in through the web client remains the same. I have attached the log for the server that should hopefully be less cluttered now that there are no plugins (including all those spiffy Radeon ones  :( )

 

Trying to log in using the web client brings up a blank Emby page once I finish logging in, and the Windows app still eventually just force closes itself.

Stripped Server.txt

TolkienBard

Have you fully rebooted the server since removing the plugins?

I selected "Restart Emby Server" after stripping all the plugins. Just to be sure though, I have now gone through and restarted the entire machine. Sadly, the results appear to be the same. Here are the server logs for after the total reboot.

Rebooted Server.txt

You have the library scan scheduled to run at startup which is no longer necessary, you might consider removing that.

 

You have some queries coming from the Windows app that are taking some time and also some image processing, so maybe those are adding up. I would not measure based on your first try after making these changes. Browse around a little in your apps and then re-assess once image caches have been built back up. Since you just removed cover art, the apps have to redownload images again, so the first time you browse through various sections there will be a little extra CPU load, and then it should improve after that.

Steven

 

Web Client: Despite all these issues, one thing I have always been able to count on is the web client. No matter what sort of issues I might be having with any other means of accessing the server, I could always count on the web client to make it possible for me to watch something. That is no longer the case. About 3 or 4 days ago now, the web client started slowing down as well. Now, if I attempt to log onto the media server using the web client, I get numerous long waits between page populations. It is not uncommon for the pages to simply fail to load at all. When they do load, they are slow to respond, and if I try to select something to watch, the client attempts futilely to load whatever it is I select. This behaviour persists whether I log in through direct connection via the web address, or if I use Emby Connect. The PC is hardwired to the server's network, so it is not an issue of lag. Meanwhile properly populated pages are becoming more and more of a rare occurrence.

 

I have been having a similar experience recently too and decided to delete all cache and rebuild it in an attempt to resolve slow loading times. This has for the most part worked but I do still have issues with nothing loading up at all when I open the web client and like you I've also felt this in the clients where it initially takes a while to connect. The majority of times all images will load instantly but other times it can take 10-20 seconds. I even cloned my server to an SSD to make sure it wasn't a slow disk.

 

I tested it just before posting this and after opening the web client I waited for over 2 minutes and nothing loaded at all. All I get is this:

 

562fdc123374f_Capture.jpg

 

Hitting refresh triggers it though and it was fine once in.

 

 

 

 

One thing I have wondered is why I have different web clients depending on whether I use Emby Connect or connect directly using my internal/external IP?

TolkienBard

You have the library scan scheduled to run at startup which is no longer necessary, you might consider removing that.

 

You have some queries coming from the Windows app that are taking some time and also some image processing, so maybe those are adding up. I would not measure based on your first try after making these changes. Browse around a little in your apps and then re-assess once image caches have been built back up. Since you just removed cover art, the apps have to redownload images again, so the first time you browse through various sections there will be a little extra CPU load, and then it should improve after that.

I eliminated the library scan trigger and restarted the server again. Even before removing all of the plugins, my CPU load never indicated anything beyond 10% of load, even when doing a scan or serving the one Android client. Until this problem started to creep in, the machine was easily able to handle much more than what is happening now. I logged in using the Windows app, and eventually the images did populate. Strangely, the background theme played for the movie I selected, despite the plugins no longer being there. However, when I attempted to play the  movie, the app once again simply stayed constant and never proceeded to load.

 

The web client was a different story. The pages populated, but then, when selecting a movie to play, I would get random trailers instead of the chosen film. If I select "Watch trailer" I get the proper one associated with the film. However, pressing the big green play arrow button results in unrelated trailers loading. Very odd indeed. 

 

Looking at CPU load while the server is doing its thing, I'm still sitting under 15% of load, even when something is playing.

 

Cover Art has never been much of an issue before. It was one of the very first plugins I added about a year ago. The only time it posed a problem was early on, and that was when almost everyone seemed to be having Cover Art issues.

New Server Log.txt

random trailers are playing because you've enabled cinema mode for movies.

 

this log shows some response times due to image processing although nothing that is extremely high. we're going to be looking at improving that soon, but in the meantime, this image resizing won't happen repeatedly so as the apps build up their image caches the times will go down.

TolkienBard

random trailers are playing because you've enabled cinema mode for movies.

 

 

Doh!

 

I'll continue using the web client then until I can get some sort of stable response from the Windows App. It still seems odd that CoverArt is suddenly causing this problem when it never created an issue before and the CPU load, even with all those plugins was very negligible.

i wouldn't necessarily blame cover art. all image scaling is like video transcoding. it is costly and if you look at TheMoviedb and Fanart, they are continuing to increase the resolution that they offer their images. They have 4k movie backdrops now and 2k posters, which are fantastic for the htpc but scaling them down to a small size for mobile is going to hammer your server. At this point we might have to think about options to download multiple copies in different sizes because anything that has any kind of cpu impact tends to be a real big problem for people.

TolkienBard

I appreciate the input. As I said, the server has not been showing any sort of load that would even mildly be considered taxing for the hardware. However, if it is the images causing the issue, then I'll just have to deal with it for now I guess. I have not had a chance to try the Samsung, Android, Apple, or Roku apps yet, but the web client is indeed working now. That's a solid first step in the right direction.

once the apps build up an image cache the problem largely goes away because then the only scaling required is for the occasional new image as opposed to every image on screen at once

Nathanio

Hi TolkienBard, 

 

I had a similar issue on my WHS2011 box (hardware is Xeon E3-1240, 8GB RAM, 6HDDs, gigabit network) where the pages would take ages to load and sometimes time out. I was getting frustrated about it, so I sat down and wrote down what my server was doing at the time. It appeared that it coincided with other activities on my server and it was in fact disk I/O that was the problem and not CPU/Memory related.

 

I was adding the burden of HTPC activities to it when traditionally it had been 'just' a file server with client backups. 

 

My solution (and it has massively improved the system) was to change the boot drive to a 256GB SSD and add an old 32GB SSD as a scratch drive. 

 

The server now responds (almost) straight away and live TV starts in 1.5s instead of 5 or 6. The scratch drive contains all of the metadata for speedy access and is also the temp directory for transcoding.

 

It has changed the server so much that I replaced the HTPC with an Amazon FireTV box, sold my PCIe DVB-S2 tuners and bought 2 x HD Homerun DVB-T2 for live TV. Gone is the Windows box in our bedroom, replaced with a FireTV Stick.

 

TL;DR, you're probably running out of disk IO, add SSD for boot drive and big rotational disks for storage.  
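One rough way to check whether disk I/O rather than CPU is the bottleneck is to time how long it takes to open and read a batch of small files from the metadata or image folder. The following is only a sketch in Python using the standard library; the function name `time_small_reads` and the folder you point it at are assumptions for illustration and have nothing to do with Emby itself.

```python
import os
import time


def time_small_reads(folder, limit=200, chunk=64 * 1024):
    """Read the first `chunk` bytes of up to `limit` files under `folder`.

    Returns (files_read, total_seconds). Unreadable files are skipped.
    """
    count = 0
    start = time.perf_counter()
    for root, _dirs, files in os.walk(folder):
        for name in files:
            if count >= limit:
                return count, time.perf_counter() - start
            try:
                with open(os.path.join(root, name), "rb") as fh:
                    fh.read(chunk)
            except OSError:
                continue  # skip files we are not allowed to open
            count += 1
    return count, time.perf_counter() - start


# Example call (the path is a placeholder, substitute your own folder):
# files_read, seconds = time_small_reads(r"D:\some\metadata\folder")
# print(f"{files_read} files in {seconds:.2f}s")
```

Run it while the server is busy and again while it is idle. A second run over the same folder will usually be faster because the operating system caches recently read files, so compare first runs only. If the timings balloon while CPU load stays low, the disks are the more likely suspect.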

Steven

Disk IO could definitely be a factor. I have Couch Potato, Sonarr and Headphones all monitoring my drives, so this could have an effect. And like I said, I recently put an SSD in and have also seen a big improvement all round.

 

The hanging issue at initial load seems to be something different though.

 

When watching the resource monitor on the disks, it is actually quite amazing to see Emby pull images from so many different drives and folders, and it does make me appreciate the speed it generally delivers. It also makes me wonder whether centralising the metadata location could improve it.

FrostByte

 

I tested it just before posting this and after opening the web client I waited for over 2 minutes and nothing loaded at all. All I get is this:

 

 

Hitting refresh triggers it though and it was fine once in.

 

 

I get that blank Home Page thing a lot when I first start Edge.  Refresh usually works as you said.  I usually don't use Connect, but may try to see if the problem goes away then

jluce50

i wouldn't necessarily blame cover art. all image scaling is like video transcoding. it is costly and if you look at TheMoviedb and Fanart, they are continuing to increase the resolution that they offer their images. They have 4k movie backdrops now and 2k posters, which are fantastic for the htpc but scaling them down to a small size for mobile is going to hammer your server. At this point we might have to think about options to download multiple copies in different sizes because anything that has any kind of cpu impact tends to be a real big problem for people.

 

Are there plans to add something like a "Maximum backdrop download width"? There is a setting for minimum, but with 4k images being available it would be nice to limit the max size. Anything above 1920x1080 would just be a drain on my system.
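To make the idea concrete, here is a minimal Python sketch of what such a cap could do to an image's dimensions. The `max_width` parameter and the function name are hypothetical and not an existing Emby setting; the sketch only shows the arithmetic of shrinking to a maximum width while keeping the aspect ratio.

```python
def capped_size(width, height, max_width=1920):
    """Scale (width, height) down so that width <= max_width.

    The aspect ratio is preserved. Images already within the limit
    are returned unchanged (no upscaling).
    """
    if width <= max_width:
        return width, height
    scale = max_width / width
    return max_width, round(height * scale)


print(capped_size(3840, 2160))  # (1920, 1080)
print(capped_size(1280, 720))   # (1280, 720)
```

A 3840x2160 backdrop capped this way has a quarter of the pixels of the original, which is the saving being asked for here.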

Happy2Play

Are there plans to add something like a "Maximum backdrop download width"? There is a setting for minimum, but with 4k images being available it would be nice to limit the max size. Anything above 1920x1080 would just be a drain on my system.

Or an automatic image resizer similar to how MCM does it.

 

563113193c260_art.jpg

Steven

I get that blank Home Page thing a lot when I first start Edge.  Refresh usually works as you said.  I usually don't use Connect, but may try to see if the problem goes away then

 

It generally only happens the first time I open the web client on a different device, it doesn't matter how long I leave it. Here's part of the log from earlier when it happened.

server-63581375271-copy.txt

Or an automatic image resizer similar to how MCM does it.

 

563113193c260_art.jpg

 

I think limiting the download size in the first place (if possible) would be much better.  Of course, if the provider doesn't provide that, then this could be a backup but they must have max image size as one of their parameters.

Or an automatic image resizer similar to how MCM does it.

 

563113193c260_art.jpg

 

we already have this, it just happens on demand, not at metadata download. the result is cached. we can also look at separating different kinds of caches and allowing image caches to live longer.

TolkienBard

Hi TolkienBard, 

 

I had a similar issue on my WHS2011 box (hardware is Xeon E3-1240, 8GB RAM, 6HDDs, gigabit network) where the pages would take ages to load and sometimes time out. I was getting frustrated about it, so I sat down and wrote down what my server was doing at the time. It appeared that it coincided with other activities on my server and it was in fact disk I/O that was the problem and not CPU/Memory related.

 

I was adding the burden of HTPC activities to it when traditionally it had been 'just' a file server with client backups. 

 

My solution (and it has massively improved the system) was to change the boot drive to a 256GB SSD and add an old 32GB SSD as a scratch drive. 

 

The server now responds (almost) straight away and live TV starts in 1.5s instead of 5 or 6. The scratch drive contains all of the metadata for speedy access and is also the temp directory for transcoding.

 

It has changed the server so much that I replaced the HTPC with an Amazon FireTV box, sold my PCIe DVB-S2 tuners and bought 2 x HD Homerun DVB-T2 for live TV. Gone is the Windows box in our bedroom, replaced with a FireTV Stick.

 

TL;DR, you're probably running out of disk IO, add SSD for boot drive and big rotational disks for storage.  

I only use the machine for server duties, but I do see where disk I/O could be an issue. I'll have a look at it this weekend (after recovering from Halloween) and see if that is indeed an issue. I had already been considering upgrading the boot drive to an SSD anyway. However, given the issues, I didn't want to drop the money and time on swapping out the boot drive if I was in fact going to need to replace the server itself in order to restore performance.
