Jump to content

memory leak in library scans since 7.8


Recommended Posts

Posted (edited)

as part of my library I have some home movie/photo libraries that are very large esp with many thousands of photos

when running a library scan the memory usage slowly increases and keeps increasing as emby parses over the various folder contents, and this memory usage continues to increase until the scan is completed and then it is freed again

the problem is so bad that I can not complete a full library scan, even if there are no changes made to the library - I have successfully scanned my entire library by individually selecting each sub-folder and scanning those one by one, by doing this I have imported all items and the library looks complete. if I do this across every sub folder, selecting each one and "scan library files" one by one, so they all get queued up, the memory usage usually goes up to around 1-1.5GB then drops again, rises to a similar level then drops, repeating this until it's all completed. it looks to me like the usage increases during each queued up scan then the memory is returned after each one is completed. with no changes made to the library, I can do this across my entire library and it doesn't take long, the entire process is completed in just a few minutes. so as a workaround, this is working for now

but even in this current state where my library is fully imported and there are no changes that need importing or deleteing, if I start a full library scan then it only reaches around 50% until the Emby process is consuming over 4GB of RAM (I have 6GB in my synology) at which point the system has to start using the swap file, which of course is mirrored across all drives so this is extremely slow - Emby becomes very slow and unresponsive (pages take many seconds to load, if they load at all), all the drives start thrashing and the scan slows to a crawl. eventually after hours of constant drive thrashing and the scan barely moving at all, synology DSM reports that it has terminated the process due to excessive memory usage, at this point I think the scan has still only reached about 60%. again there are no library contents that need updating - no video files are being inspected, no scraping is being done, no thumbnails being generated - Emby is just checking for changes and not finding any changes

I previously (on v7.5) was able to complete full library scans on the same library content without any such problem

I updated to 7.8 and started having a number of unrelated problems, eventually I deleted Emby and did a clean reinstall again to 7.8 which solved most of those however this memory problem persists. I've since updated to 7.9 beta and the problem remains there as well

I've attached a server log showing the full library scan taking place, as you can see there's very little in the log because there's no changes to the library content, but you can see some extremely slow response times which are indicative of the problem. not long after the last entry in this log I had no choice but to force stop the process as the server wouldn't respond to shutdown requests and the constant drive thrashing is not good for drive lifespan

 

Screenshot2024-04-30at11-37-40DiskStation-SynologyDiskStation.png.6ad463daa65f5548004b58dbdb7bf458.png

 

embyserver.txt

Edited by Statick
Posted (edited)

You are running emby server 4.8.3 - could you  switch to the current version 4.8.5 - see https://emby.media/download.html

If the problem remains when on 4.8.5, please enable debug logging and provide a fresh log file captured

A fresh launch of the server but running with debug logging enabled - enabled before the server is re-launched and problem problem reproduced and log captured,  Please indicate what the memory usage was at the time

See settings for Logs to enable debug logging

Not much showing in the log without debug

2024-04-29 00:42:43.599 Info TaskManager: Executing Scan media library

2024-04-29 00:42:43.599 Info Server: http/1.1 Response 204 to host1. Time: 3ms. POST http://host2:8096/emby/ScheduledTasks/Running/6330ee8fb4a957f33981f89aa78b030f?X-Emby-Client=Emby Web&X-Emby-Device-Name=Firefox Windows&X-Emby-Device-Id=9f858a49-a835-4b2f-a0b8-11b93c63228d&X-Emby-Client-Version=4.8.3.0&X-Emby-Token=x_secret1_x&X-Emby-Language=en-gb
2024-04-29 01:53:22.481 Info Server: http/1.1 Response 500 to host3. Time: 255ms. GET http://192.168.1.2:8096/emby/system/info/public?format=json
2024-04-29 01:53:22.481 Info Server: http/1.1 Response 500 to host3. Time: 275ms. GET http://192.168.1.2:8096/emby/system/info/public?format=json

 

 

 

Edited by sa2000
Posted (edited)

heya the log above was from 4.8.3 a couple of days ago, but I've since updated to 4.8.5 and then updated again to the latest beta and the problem remains on all of them. I just had a full log from a clean start to a crashed system saved from a few days ago so that's what I've uploaded here

I'm not at home currently so will get the debug logs over another time

memory usage is very low when not running library scans, usually around 150-200MB, and total system memory usage is under 1GB, as soon as I initiate a library scan the memory usage just goes up and up and up until either the scan completes or the whole system becomes unusable (or DSM terminates the process)

 

Edited by Statick
Happy2Play
Posted

@Lukeis there an issue with the failed sync as that is really the only thing shown in that log?

 

Posted
3 hours ago, Happy2Play said:

@Lukeis there an issue with the failed sync as that is really the only thing shown in that log?

 

I don't think that's related, but will investigate.

Posted

Hi, have you updated to Emby Server 4.8.5? Has that helped?

Statick
Posted

updated to 4.8.5 and then again to 4.9.0.16

no change

 

Posted
1 hour ago, Statick said:

updated to 4.8.5 and then again to 4.9.0.16

no change

 

Can you please provide a new server log example? Thanks.

Statick
Posted

okay here's a debug log attached

I didn't want to let the scan eat up all available RAM and start causing drive thrashing again so I let it run for 20 minutes with the debug log running and then cancelled the scan

I monitored the memory usage during this time in the Synology process monitor, here's how much RAM Emby was using for each minute since the server started, and the % of the library scan progress. as you can see the RAM usage just goes up and up for the entire duration of the library scan, upon cancelling the scan the RAM is freed up again and Emby returns to ~200mb usage

 

server start	200mb
01:00 	10% 	320mb
02:00 	11% 	430mb
03:00 	11% 	640mb
04:00 	12% 	820mb
05:00 	13% 	960mb
06:00	14%	1.06gb
07:00	15%	1.18gb
08:00	17%	1.32gb
09:00	18%	1.45gb
10:00	20%	1.59gb
11:00	21%	1.71gb
12:00	22%	1.82gb
13:00	23%	1.93gb
14:00	24%	2.08gb
15:00 	26%	2.24gb
16:00 	26%	2.36gb
17:00	27%	2.48gb
18:00	27%	2.56gb
19:00 	28%	2.67gb
20:00	29%	2.78gb

 

embyserver debug.txt

  • Thanks 1
Statick
Posted

oh one more point - as I'm not sure that this is visible in the server logs - I'm running DSM7 not DSM7.2 as my Synology is quite an old model (DS1815+) and cannot take 7.2. you do have different packages for these two DSM versions so I thought this may be important!

 

 

Posted

Hi, thanks for the update. We're looking into it. Thnaks.

  • 4 weeks later...
Statick
Posted (edited)

hey guys. I have increased the RAM in my synology to 16GB (it's supposed to be maximum 6GB but with the right RAM modules it will go higher) and updated to 4.9.0.22 beta

I can now complete library scans, during the scan Emby RAM usage will increase to over 9GB before it completes, so it's no wonder I was unable to complete scans with only 6GB installed

I have taken a screenshot of the RAM usage over time so you can see how it increases. it increases fairly steadily until the scan reaches around 60% at which point it still increases but more slowly (in the screenshot you can see the RAM usage goes down a bit before going back up again, roughly half way along the graph, this is at around 60-65% library scan). it then continues to increase at a slower rate until the scan hits 90%, then the usage remains constant until the scan completes. after the scan is completed, the RAM is returned to the system

 

 

 

Screenshot 2024-05-27 at 21-46-54 DiskStation - Synology DiskStation.png

Edited by Statick
DonMacaroni
Posted

"Task Manager" screenshot would be better as current screenshot does not tell anything, what (process) actually consumes memory and how much.
From developer perspective, it is useless...

  • Agree 1
Statick
Posted (edited)

that's the Emby process consuming almost all of that, as I've explained in my earlier posts. the reason for this screenshot was to show how the increase in memoy usage changes at different points in the library scan i.e it slows down when the scan reaches around 60% then levels off when the scan reaches 90% and does not increase anymore as the scan continues past 90%. I hoped this information may help track down exactly where the problem lies, as different stages of the library scan consume memory differently - obviously there's no problem with the part of the scan that happens beyond 90%, so developers trying to solve this can focus their investigations on the earlier stages of the library scan. showing this information seems like a helpful thing for tracking down this issue

I already noted down from the task manager how much RAM the Emby process consumed, every minute for 20 minutes, and posted the figures in a table along with what % of library scan the server had reached, so sharing a screenshot of a single moment of that same information isn't going to tell anyone anything I haven't already posted, because that is literally the same information I already posted

 

 

Edited by Statick
sa2000
Posted (edited)

Is this reproducible with a scan of one specific library? The debug log from the 1st May showed the schedule task for Scan Media Library being started and then after 24 minutes cancelled. Would be useful to have timestamps on the memory usage climb to 60% and to relate that to the debug server log timestamps. 

Also if this can be repro'ed for one library, a full directory listing of all media folders and files in that library would be useful. That can be sent to me by private message in a zip - alternatively a zip of the library database sent privately and indication as to which library was scanned - debug server log to go with the list or db file copies

/var/packages/EmbyServer/var/data/library.db
/var/packages/EmbyServer/var/data/library.db-wal
/var/packages/EmbyServer/var/data/library.db-shm

 

Edited by sa2000
  • 1 month later...
Posted (edited)

So, I'm running Emby Server on a debian machine (bookworm) and starting around end of may this year (upgrading from 4.7.11.0 to 4.8.7.0 on may 17th, upgrading to 4.8.8.0 on may 31), I experienced memory leaks of the EmbyServer processes, filling up the 16GB. I've obviously enabled logging with debug, disabled a few plugins and after a few weeks of diagnosing I came accross the fact that it is the scheduled « Scan media library » that is triggering the behavior. Set it to run every two hours, after a full night the process consumed ~22% of my 16GB.

While the scale make it a bit hard to see, you can observe plateaus on the chart from tonight: those are moments I disabled the scheduled media library scan. As soon as I reenabled it, it is obvious there's some slow leaking.

To the best of my experiments, disabling the scheduled task and manually running I can see e.g., the memory consumption of the process following the cinematic below:

  • startup ~1.8% RAM consumed
  • start media scan on one of the library, peaks ~2.1%
  • scan completed within ~20s, finished and memory goes back to ~2.0%
  • scan again, other library, peaks ~2.2%
  • scan completed within ~25s, finished and memory goes back to ~2.1%

Attached is a view of the leak from its beginning, as well as the experiment from tonight.

There's nothing actionable in the debug log, all that is visible is that it is performing media scan successfully, listing the content like that:

Quote

2024-07-16 10:41:54.185 Info TaskManager: Executing Scan media library
2024-07-16 10:41:54.225 Debug SqliteItemRepository: GetitemById AggregateFolder 15488 root
2024-07-16 10:41:54.492 Debug SqliteItemRepository: GetitemById Series 16683 [...]
2024-07-16 10:42:06.672 Debug ProviderManager: WaitForRefreshQueue waiting for 0 items in refresh queue
2024-07-16 10:42:06.687 Debug App: Running post-scan task CleanDatabaseScheduledTask
2024-07-16 10:42:06.751 Debug App: Running post-scan task CollectionLibraryPostScanTask
2024-07-16 10:42:06.753 Info TaskManager: Scan media library Completed after 0 minute(s) and 12 seconds
2024-07-16 10:42:06.755 Debug ProviderManager: WaitForRefreshQueue complete
2024-07-16 10:42:22.175 Info TaskManager: Executing Scan media library
2024-07-16 10:42:34.110 Debug ProviderManager: WaitForRefreshQueue waiting for 0 items in refresh queue
2024-07-16 10:42:34.110 Debug App: Running post-scan task CleanDatabaseScheduledTask
2024-07-16 10:42:34.143 Debug App: Running post-scan task CollectionLibraryPostScanTask
2024-07-16 10:42:34.143 Info TaskManager: Scan media library Completed after 0 minute(s) and 11 seconds
2024-07-16 10:42:34.151 Debug ProviderManager: WaitForRefreshQueue complete

 

Since you mention database, the settings I have are a cache of 256MB, 4000 rows for analysis and trying to optimize on server shutdown

Based on the charts and the upgrade log, I suspect it might be after 4.8.7.0 -> 4.8.8.0 upgrade that it started happening.

 

leak.png

leak-oneday.png

Edited by eldoctor
Posted
32 minutes ago, eldoctor said:

After a few weeks of diagnosing I came accross the fact that it is the scheduled « Scan media library » that is triggering the behavior. Set it to run every two hours, after a full night the process consumed ~22% of my 16GB.

Thanks for the feedback. I will see if I can repro what you observed. Will try on windows as that is my normal platform. Will see if frequent Scan Media Library shows a climb with VMMap.

Posted

My default value was ~8h I believe, changing to 2h for debugging purposes, even now sedt to every 15 mins, restarted the process when I posted the previous message and we're already at ~7.5% consumed.

 

I have also at the same time reduced the cache size to 64MB and lines to 400, after checking recommended settings. My database file is ~33MB

leak-now.png

Posted
22 hours ago, eldoctor said:

Set it to run every two hours, after a full night the process consumed ~22% of my 16GB.

 

20 hours ago, eldoctor said:

changing to 2h for debugging purposes, even now sedt to every 15 mins, restarted the process when I posted the previous message and we're already at ~7.5% consumed.

 

22 hours ago, eldoctor said:

There's nothing actionable in the debug log, all that is visible is that it is performing media scan successfully, listing the content like that:

I think it would still be useful to have a debug set of log files covering the period of the memory usage climbing (with say 1 hour or 15 minute run of the scheduled task) and to have with that a log file of memory usage by the EmbyServer process logged every 1 second or 10 seconds with a timestamp for each logged memory usage

The EmbyServer log would show what sore of action was being performed and if any new media files were picked and what type of media and this together with the detailed memory usage record should help narrow the problem area,

Having just the scan repeated running with no new media will not show the problem and the problem area may be very specific 

I am still going to run my windows EmbyServer with a scheduled library scan every 15 minutes and using powershell record the memory usage periodically - but I don't think it will reproduce the problem

Posted (edited)
24 minutes ago, sa2000 said:

The EmbyServer log would show what sore of action was being performed and if any new media files were picked and what type of media and this together with the detailed memory usage record should help narrow the problem area,

As I pasted above, there's 0 action taken, and no new media. Scan completes successfully within <12 seconds on both libraries. And yet it leaks.

The  log above is complete, I just removed the noisy HTTP requests and the actual filesystem directories that were listed.

The content i removed is:

Quote
2024-07-16 13:04:19.277 Debug SqliteItemRepository: GetitemById AggregateFolder 15488 root
2024-07-16 13:04:19.530 Debug SqliteItemRepository: GetitemById Series 16683 /media/b65826a5-641a-44cc-b656-a5c11154c434/Vidéos/Séries/XXX

[...]

There's nothing new discovered.

Regarding logs, the anonymized version still leaks quite a lot of personal data.

What I could try is:

  • Restart with schedule set to 15 min
  • monitor with e.g. `$ cat /proc/3196917/statm | ts`

This could produce a second-based output like:

Quote

$ cat /proc/3196917/statm | ts
juil. 17 11:40:03 1180823 84607 21330 22 0 110042 0

Edited by eldoctor
Posted (edited)
1 hour ago, eldoctor said:

As I pasted above, there's 0 action taken, and no new media. Scan completes successfully within <12 seconds on both libraries. And yet it leaks.

OK thanks - I will monitor on windows and if memory stays at same level, I will try on a NAS

1 hour ago, eldoctor said:

Regarding logs, the anonymized version still leaks quite a lot of personal data.

If you are ok with it, you can zip and send me by private message

 

1 hour ago, eldoctor said:

What I could try is:

  • Restart with schedule set to 15 min
  • monitor with e.g. `$ cat /proc/3196917/statm | ts`

I am not a linux expert. I know that I can see the memory used by Emby Server using commands like

ps | grep EmbyServer
top | grep EmbyServer

When I tried cat /proc/pid/statm on my WD NAS, I could not correlate the 1st or 2nd values displayed with the memory usage displayed by top or ps for the process.

VmSize in output from cat /proc/pid/status appears to correspond with the value in top or ps

So long as I get a log file with date/time and the total memory used by the process in kb or mb or bytes, I would be happy

Edit: OK - multiplying first value from statm by 4 gives the memory. Fine - thank you

Edited by sa2000
Posted (edited)

Shared you the server log in DM, I have to wait a bit before sharing the statm log. Consumption in htop went from ~1.8% at startup to ~15.5%, running every 15min for ~2h

Edited by eldoctor
Posted

Thank you @eldoctorfor your help and diagnostics. I am now running same capture on my MyCloudPR4100 NAS and will refer to the development.

I have already referred the high memory footprint on launch

Posted

@eldoctor I have referred the suspected memory leaks issue to the development team with diagnostiscs from your linux server and my MyCloudPR4100 NAS emby server. In your case there was a clear correlation between memory climbs of 512Mb at every 15 minute scheduled task Scan Media Library action. In my case, it was not that clear but the final outcome the same with memory getting too high and the NAS itself became unresponsive after 24 hours and the OS did an OOM Kill on the Emby Server process. I had two sets - one that crashed out after 2 hours of 15 minute scheduled task Scan Media Library and the second test took 24 hours to crash. The first was with the TuneIn plugin installed. I think the starting base line high memory footprint was probably a factor in the crashes.

the 15 minute frequency was just for testing to bring on the problem

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×
×
  • Create New...