
Large library imported overnight, no media displayed this morning



rbjtech
Posted

I believe @Luke has said there are some efficiencies in the new beta with regard to library scanning, so it may be worth waiting to see what that brings.

 

Posted

I know I mentioned that as I've been working with Luke on changes that greatly speed up scanning. I believe they will be in the next beta.

Posted
3 hours ago, Happy2Play said:

Not sure about the KB but could be in Tutorials/Guides section.

This should not be part of any documentation. Removing the library scan schedule is something that almost nobody should ever do, except in controlled scenarios where you're committing to running it manually as needed.

Neither should removing fetchers and putting them back on, because when you re-enable them later, it's not going to do anything for the content that was already scanned.

Posted
On 1/20/2023 at 11:57 AM, bruor said:

No, the media folder is a dynamically generated/updated set of STRM files that are ripped from an M3U. Out-of-date files and empty directories are purged each time the M3U is rescanned. I'd have to add handling to check for and skip deletion of the info files for which STRM files exist, etc.

I will likely implement a file creation throttle so that it doesn't generate the entire set in one run, instead only adding X new STRM files per run so I can slowly add content to the system.  

I've got gigabit here and will try a manual scan before I attempt to remove the STRM folders from the library and rescan to shrink it. 

If you're changing the time stamp on your strm files or directories, even if the content hasn't changed, you're making Emby reprocess them when there is nothing to do.

Adding files in batches with a vacuum between them helps, if for no other reason than knowing how large the batch of new files is and how long it should take. You can remove files no longer present in the m3u as well as add new entries, but you want to first check the URL inside files that already exist to see if it has changed before rewriting the strm file, so Emby Server won't need to reprocess it. Otherwise the library will be treated as if every item needs reprocessing, causing a lot of needless work.
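As a rough sketch of that compare-before-rewrite idea (a hypothetical helper, not part of any Emby tooling, assuming each .strm file holds a single URL):

```python
import os

def write_strm_if_changed(path: str, url: str) -> bool:
    """Write a .strm file only when the URL it contains has changed.

    Returns True if the file was (re)written, False if it was left
    alone, so Emby never sees a new mtime for unchanged content.
    """
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as f:
            if f.read().strip() == url.strip():
                return False  # same URL: don't touch the file or its mtime
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        f.write(url + "\n")
    return True
```

Because unchanged files are never opened for writing, their timestamps stay put and Emby's scan can skip them.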

I wrote my own parser that loads all m3us into a database in bulk. I can then bulk-load a new day's m3u into a temp table, generating some additional info such as the file name and path it will be written to on disk. I then run a couple of queries against this. First I compare existing rows previously entered in the database to the temp table, using the path as the comparison key. Any rows in this query are not in the new m3u file, so I can remove the directory (including all files in it) as well as remove the row from the existing database. Query 2 removes all rows from the temp table that match the URL of existing entries. This will be the bulk of entries: items not needing to be touched because they haven't changed since the last run. What's left is items that need to be written to disk, including the folder structure.

When you then run a scan in Emby Server, it only needs to remove items no longer present and process the new files that don't exist in its database.
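The two queries described above could look something like this in a standalone SQLite script (table and column names are made up for illustration; this is not the actual tool described in the post):

```python
import sqlite3

# "previous" holds last run's entries; "incoming" is the freshly parsed
# m3u loaded into a temp table. "path" is the on-disk strm path.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE previous (path TEXT PRIMARY KEY, url TEXT);
    CREATE TABLE incoming (path TEXT PRIMARY KEY, url TEXT);
""")

def diff(con):
    # Query 1: rows in previous but not in the new m3u
    # -> delete their directories from disk and drop them from the db.
    removed = con.execute(
        "SELECT path FROM previous WHERE path NOT IN (SELECT path FROM incoming)"
    ).fetchall()
    con.execute(
        "DELETE FROM previous WHERE path NOT IN (SELECT path FROM incoming)"
    )
    # Query 2: drop unchanged rows (URL already known) from the temp table.
    con.execute("DELETE FROM incoming WHERE url IN (SELECT url FROM previous)")
    # What's left in incoming is new content: write it to disk and
    # carry it over into previous for the next run.
    new = con.execute("SELECT path, url FROM incoming").fetchall()
    con.execute("INSERT OR REPLACE INTO previous SELECT * FROM incoming")
    return removed, new
```

After `diff` runs, `removed` drives the on-disk deletions and `new` drives the strm writes; everything else is untouched, so Emby only ever sees real changes.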

 

Posted
34 minutes ago, cayars said:

If you're changing the time stamp on your strm files or directories, even if the content hasn't changed, you're making Emby reprocess them when there is nothing to do.

Adding files in batches with a vacuum between them helps, if for no other reason than knowing how large the batch of new files is and how long it should take. You can remove files no longer present in the m3u as well as add new entries, but you want to first check the URL inside files that already exist to see if it has changed before rewriting the strm file, so Emby Server won't need to reprocess it. Otherwise the library will be treated as if every item needs reprocessing, causing a lot of needless work.

I wrote my own parser that loads all m3us into a database in bulk. I can then bulk-load a new day's m3u into a temp table, generating some additional info such as the file name and path it will be written to on disk. I then run a couple of queries against this. First I compare existing rows previously entered in the database to the temp table, using the path as the comparison key. Any rows in this query are not in the new m3u file, so I can remove the directory (including all files in it) as well as remove the row from the existing database. Query 2 removes all rows from the temp table that match the URL of existing entries. This will be the bulk of entries: items not needing to be touched because they haven't changed since the last run. What's left is items that need to be written to disk, including the folder structure.

When you then run a scan in Emby Server, it only needs to remove items no longer present and process the new files that don't exist in its database.

 

It doesn't seem to cause any issues now that the complete library metadata has been fetched. However, I'd be interested in a better way to update/maintain the strm files on disk. I'm unsure how I'd tap into the actual Emby database to set up something like you are describing; what database engine is in use?

Posted
1 hour ago, bruor said:

It doesn't seem to cause any issues now that the complete library metadata has been fetched. However, I'd be interested in a better way to update/maintain the strm files on disk. I'm unsure how I'd tap into the actual Emby database to set up something like you are describing; what database engine is in use?

SQLite. But what is the problem you're trying to solve? Are you trying to minimize reaction to changes? Could the strm process be updated to only be incremental and not rewrite existing files that aren't changing?

Posted
54 minutes ago, Luke said:

SQLite. But what is the problem you're trying to solve? Are you trying to minimize reaction to changes? Could the strm process be updated to only be incremental and not rewrite existing files that aren't changing?

So I'm ripping content from the provider into STRM files, but I don't have a way to track what they've removed. I suppose I could write something to scan the directory tree first, then do an if-not-exists check before I write the files, and then do another check to figure out what wasn't in the m3u so I can delete it from disk if needed.
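A minimal sketch of that delete pass (hypothetical helper name; assumes one .strm file per entry and that info files you want to keep don't end in .strm, so they block directory removal until they're gone too):

```python
import os

def prune_removed(root: str, expected: set[str]) -> list[str]:
    """Delete .strm files under root that are no longer in the m3u,
    then remove any directories left empty. Returns the deleted paths."""
    deleted = []
    # topdown=False visits children before parents, so directories can
    # be emptied and then removed in the same walk.
    for dirpath, _dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if name.endswith(".strm") and full not in expected:
                os.remove(full)
                deleted.append(full)
        if dirpath != root and not os.listdir(dirpath):
            os.rmdir(dirpath)
    return deleted
```

Here `expected` would be the set of on-disk paths derived from the current m3u, so only entries the provider dropped get removed.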

I just didn't want to be making emby do extra work on every nightly content refresh because I'm updating the modify time on the files. 

rbjtech
Posted
9 hours ago, Luke said:

This should not be part of any documentation. Removing the library scan schedule is something that almost nobody should ever do, except in controlled scenarios where you're committing to running it manually as needed.

Neither should removing fetchers and putting them back on, because when you re-enable them later, it's not going to do anything for the content that was already scanned.

Then let's hope the changes made to the beta implement a sensible scanning mode for brand-new 'out of the box' installs, as currently the perception it gives end users is 'it's slow', and they would be correct in that observation.

Rather than process each item in series (metadata, thumbs, intros, etc.), IMO Emby needs to batch the functions and do them one at a time: still in series, but in priority order, i.e. do all metadata, then do all thumbs, then do all intros.

That way, the libraries become 'populated/usable' much more quickly, and you'll get fewer reports of it being 'slow' when it really isn't.
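The phase-ordered idea could be sketched like this (purely illustrative; this is not how Emby's pipeline is actually structured):

```python
def process_library(items, phases):
    """Run each phase across every item before starting the next phase,
    so the whole library has metadata before any thumbnails are built."""
    order = []
    for phase in phases:                 # priority order, highest first
        for item in items:
            order.append((phase, item))  # in a real server: run the task here
    return order

# e.g. metadata for everything first, then thumbs, then intro detection
schedule = process_library(["Movie A", "Movie B"],
                           ["metadata", "thumbnails", "intro detection"])
```

The library becomes browsable after the first pass completes, while the lower-priority passes keep running in the background.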

Posted
15 minutes ago, Luke said:

Neither should removing fetchers and putting them back on, because when you re-enable them later, it's not going to do anything for the content that was already scanned.

I too agree with @rbjtech and give very similar instructions to people rebuilding libraries that already have NFO, BIF, graphics, etc. in the media folder. I've not tested this with the current or soon-to-be-released beta with the changes we've made, but this technique was cutting scanning time down by a factor of 10+. I'll retest this again once we have a Synology beta build.

I had just tested this before the recent changes to library scanning were made, and disabling the fetchers loaded 12 times faster! The pauses we add so as not to hammer the provider sites add up, but having the information already in the NFO files makes hitting the providers sort of pointless anyway. Why not skip all that and just use the info from the NFO file, as well as the graphics, BIF, and other files already present that Emby previously wrote?

Being able to turn off the fetchers and remove metadata providers can be quite handy with advanced setups as well, as it allows you to delegate the scanning to your backup server, with the primary using only the NFO and support files in the media folders.

 

Posted
On 3/15/2023 at 9:14 PM, bruor said:

I'm unsure how I'd tap into the actual Emby database to set up something like you are describing; what database engine is in use?

On 3/16/2023 at 12:04 AM, bruor said:

I just didn't want to be making emby do extra work on every nightly content refresh because I'm updating the modify time on the files. 

I wasn't suggesting you touch the Emby database in any way, but instead use a database of your own to compare the last run and the present run. If you bulk-load the m3u entries, you can easily delete rows from the present table that are the same as the previous one. Any rows in the previous table not in the present one need removing from the file system as well as from the previous table. What's left in the present table are new additions, so they get written to disk and copied to the previous table.
 

  • 9 months later...
Ugnaughts
Posted
On 3/15/2023 at 11:06 PM, Luke said:

SQLite. But what is the problem you're trying to solve? Are you trying to minimize reaction to changes? Could the strm process be updated to only be incremental and not rewrite existing files that aren't changing?

I agree with Luke: make sure your script doesn't replace the strm files each time it runs. I'm in the exact same scenario as you, adding about the same number of strm files. I created a script and then found out a lot of the problem was that it was replacing the existing strm files and then doing a full scan each time. Emby must have been picking up on that, maybe via a hash or the file's date/time. I changed my script to not replace a file if it was already there. Major improvement!

I also had my script do only a couple of m3us at a time, then I would vacuum and restart Emby. This helped keep Emby from being buggy/not working right.

I also agree with disabling some features/fetchers temporarily.

Also, since I'm on a PC with decent resources, I changed the database cache to 1024. Not sure if this helped, but I figured the default was low for NAS units.

BUT even then I still have a bug I can't fix or trace down at the moment. At one point I lost 40 titles, then at another point I lost 400. Out of 130,000 that's not bad as stats go, but watch it be the shows I watch, lol. It shows the total number of strms in the folder correctly, but in Emby, under episodes, it's short...

It still takes many hours to get these added for me so far. The good thing is, once it's done I'll back everything up; then it just has to process the changes.
