Will the embarrassingly slow initial scan ever get faster?


popeye246810

I used to use Emby as my main server, but when skip intro was released on Plex, well, I jumped ship. Now that Emby has skip intro I decided to come back and give it a whirl, and, well, wow, how embarrassing is the initial library scan??

 

I have done everything suggested on the forums to speed it up, but as an example:

I have a 4K movie folder mounted via rclone with about 1900 movies.

Plex takes about 2 hours to scan in

Jellyfin about an hour 

Emby 24+ hours

 

I mean, why so slow?

When I add the library, it takes almost half an hour just to list the movies I have, and then it scans just 1 at a time. Both Plex and Jellyfin immediately start scanning in the films.

I mean, I'm all for making sure everything is scanned in perfectly, but what on earth is Emby doing?

Whoever is responsible for the initial library scan should be really ashamed of this; it's simply no good for anyone with a large library. There seem to be plenty of complaints scattered all over the internet, yet no one does anything to fix it?


Happy2Play

Well every option you enable will make things slower.  Personally, I disable everything until after media is imported.

I assume you already have metadata for all media if the others import that fast, as you cannot spam providers that often and not get blacklisted.


popeye246810

No metadata stored at all for anything; it's literally just Emby. I keep trying it to see if it's any faster, but it never seems to improve.

 

I have a lot of content and it roughly equates to 2 movies per minute; going by that, it would take 17 hours to complete my smallest library. I have tried everything possible: having only 1 provider enabled, turning off the NFO reader, no subtitles, etc. It makes no difference at all. It's simply not for anyone with a large library.
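The arithmetic behind that estimate can be sketched as below. The item counts are assumptions: 2040 is simply what 17 hours at 2 movies per minute implies, and 1900 is the size of the 4K folder mentioned in the first post.

```python
def scan_hours(items: int, items_per_minute: float) -> float:
    """Estimated scan time in hours at a constant throughput."""
    if items_per_minute <= 0:
        raise ValueError("items_per_minute must be positive")
    return items / items_per_minute / 60


# 17 hours at 2 movies per minute implies roughly this many items.
implied_items = int(17 * 60 * 2)
print(implied_items)                   # 2040
print(round(scan_hours(1900, 2), 1))   # 15.8
```

This assumes the rate stays constant for the whole scan, which a real scan over a network mount may not do.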

 

My estimate would be over a week to scan in my content. Jellyfin took a day and Plex 2 days. I don't know what you have Emby doing in the background, but you really need to fix this age-old problem.

 

Both Plex and Jellyfin have skip intro enabled, chapter image extraction enabled, and all images set to download in advance, so they could possibly go even quicker. It's embarrassing, and you charge people for it too!

It's something that really needs to be looked at and not just passed off, or ignored.  

 


Happy2Play

Then I see no way they are not violating provider speed/request limits, as you can see TMDB requests being throttled in the Emby logs.

 


popeye246810

But surely if both Plex and Jellyfin can do it at such speeds without issue, Emby should definitely be able to speed up?

 

If I manually identify each item it goes much faster but that is such a ballache.

 

I know that this will be ignored, but it really does need someone to look into it.  It's clearly only an Emby issue as others can do it at much higher speeds.  


rbjtech

Log ?  Configuration ?

If it's generating video preview thumbnails then it will be slow, as that runs sequentially and it won't move on to the next item until it's complete.

I can scan a multi thousand film library in an hour or so if I turn off the thumbnail generation.


Happy2Play

@popeye246810 But the devs have done it faster in the past, and we got 429 errors and more complaints along the lines of "why am I not getting metadata?", as the provider blacklisted users for too many requests too fast. And the provider reached out to the devs, from my understanding.

My understanding is that one of the other guys has their own metadata mirrors also.

But overall, options are a major factor, as I can import over 8000 movies and 15,000 in less than a day. But everything is local also.
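For readers unfamiliar with 429 ("Too Many Requests") responses, here is a minimal sketch of how a client might back off when a provider answers that way. It is not Emby's actual code: `fetch_with_backoff`, the `fetch` callable and the `Response` tuple are hypothetical names, and the provider is simulated rather than contacted.

```python
import time
from typing import Callable, Optional, Tuple

# (status_code, retry_after_seconds or None, body); hypothetical shape
Response = Tuple[int, Optional[float], str]


def fetch_with_backoff(
    fetch: Callable[[], Response],
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch(), waiting and retrying whenever the provider answers 429.

    Any other status is returned as-is; real code would handle errors too.
    """
    for attempt in range(max_retries + 1):
        status, retry_after, body = fetch()
        if status != 429:
            return body
        if attempt == max_retries:
            break
        # Prefer the provider's Retry-After hint; otherwise back off exponentially.
        delay = retry_after if retry_after is not None else base_delay * 2 ** attempt
        sleep(delay)
    raise RuntimeError("provider kept answering 429; giving up")
```

The point of the sketch is that every 429 costs waiting time, so requesting faster than the provider allows can end up slower overall.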

 


popeye246810

But if that is the case, why does Jellyfin not suffer from this? I know Plex has their own metadata "servers" or whatever but I'm pretty sure Jellyfin won't have that option.

 

I know Jellyfin scans 6 items at the same time, so it will do roughly 18 movies per minute.
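As a rough illustration of what scanning 6 items at a time means, here is a minimal sketch using a thread pool. It is not Jellyfin's or Emby's implementation; `scan_concurrently` and `scan_one` are hypothetical names, and `scan_one` stands in for whatever per-item work (probing, metadata lookup) a server does.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def scan_concurrently(
    paths: Iterable[str],
    scan_one: Callable[[str], dict],
    workers: int = 6,
) -> List[dict]:
    """Run scan_one over paths with up to `workers` items in flight at once."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order even if items finish out of order.
        return list(pool.map(scan_one, paths))
```

With 6 workers, 18 movies per minute would correspond to about 20 seconds per item (6 × 60 / 18), assuming every worker stays busy and nothing else is the limit.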

 

I understand that with all my stuff on a Google drive it will take longer but not nearly a week.  

 

There has to be a solution to this issue and I know I'm not alone with it.

 


arrbee99

Something to do with the order of metadata providers maybe ? Do some not throttle and they're the ones used by Plex / Jellyfin ? Just wondering.


popeye246810

Like I said, I turned all options off and just used TMDB, but I read that it is slow due to their API and that I should consider just using TVDB, so I tried that too, and neither made any difference.

 

I just wish Emby would up their game with this; apart from their music section, everything else is spot on. I would really like to try it with skip intro, but I guess it's just a pipe dream, unless I'm willing to let my library scan at ridiculously slow speeds and wait over a week for it to scan in, then even longer for it to generate the skip intro data, not to mention possibly months for it to get image previews.

 

At this time Emby just isn't a viable option with this issue always present.

 

I just wish the devs would look at the other 2 and think, yeah, we really need to look into the initial scan and the way we do music. If these issues were sorted I'd have no issue giving my money to them.


Happy2Play

I can potentially see improvements for scans of existing metadata, but not from a requires-metadata standpoint. But in the end only the devs can comment on this.


rbjtech

So I just did a test - it was checking all providers, but I filtered the logs for 'MovieDbProvider'

Log filtered to show 1 minute of activity 20:51 - 20:52

I count 53 films in 1 minute ...
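A count like this can be reproduced with a short script. The sketch below is only an illustration: it assumes each log line starts with a `YYYY-MM-DD HH:MM:SS` timestamp, which may not match the real Emby log layout, and `provider_lines_per_minute` is a hypothetical helper. It counts matching lines, which is not necessarily the same as films if a provider logs more than one line per film.

```python
from collections import Counter
from typing import Iterable


def provider_lines_per_minute(
    lines: Iterable[str], needle: str = "MovieDbProvider"
) -> Counter:
    """Count log lines mentioning `needle`, grouped by minute.

    Assumes each line starts with 'YYYY-MM-DD HH:MM:SS' (hypothetical format).
    """
    counts: Counter = Counter()
    for line in lines:
        if needle in line and len(line) >= 16:
            counts[line[:16]] += 1  # key is 'YYYY-MM-DD HH:MM'
    return counts
```

Reading a log file line by line and passing the lines in would give one counter entry per minute of activity.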

[two attached screenshots]

 


popeye246810

I did go down the route of adding all the metadata to the files myself, and whilst this does help with the initial scan, performance is extremely poor: Emby does not copy the metadata to local storage (my M.2 drive) and instead leaves it on the network drive, so when the cache deletes itself it takes an exceptionally long time to show the images.

Even if they just had Emby copy all metadata to my superfast M.2 drive, I could use tinyMediaManager or similar to get all the metadata for my media, which only takes hours for my entire library.


popeye246810
1 minute ago, rbjtech said:

So I just did a test - it was checking all providers, but I filtered the logs for 'MovieDbProvider'

Log filtered to show 1 minute of activity 20:51 - 20:52

I count 53 films in 1 minute ...

[two attached screenshots]

 

But as you have already stated, all your stuff is local, so this is not relevant to my case.

All my stuff is mounted on an rclone drive, with each folder having a 1TB SSD cache drive. I also have a 1Gb internet connection.

And as stated before, this issue of slow library scans with my setup is not present in either Jellyfin or Plex.


popeye246810

Like I said, I know this issue won't even be looked at, but I do think it should be.

It's one of 2 things (the main one) stopping me and others with large libraries on network drives from moving away from Plex/Jellyfin.


rbjtech
8 minutes ago, popeye246810 said:

But as you have already stated, all your stuff is local, so this is not relevant to my case.

All my stuff is mounted on an rclone drive, with each folder having a 1TB SSD cache drive. I also have a 1Gb internet connection.

And as stated before, this issue of slow library scans with my setup is not present in either Jellyfin or Plex.

If you have a 1Gig internet connection (as do I) then I'm puzzled - it should technically be no different to using a 'nas' over 1gig (WAN latency aside).

If you can provide a log, PM it to the devs if you like (or PM me), then we can see where the bottleneck is. It will be CLEAR in the log where the delays are. It may be ffprobe, it may be the providers, or one in particular ..

If you are not willing to provide the info to help - then just saying it's slow vs JF or Plex is not going to help you or others resolve the issue.


rbjtech
16 minutes ago, Happy2Play said:

I am guessing from a jellyfin standpoint this is a factor also.

Library scan fanout concurrency & Library metadata refresh concurrency - Feature Requests - Emby Community

But I still don't see how they are not getting 429 errors from the provider.

Agree - I see provider responses being throttled now (it says so in the debug log) - it may be because I have a fast internet connection, but compounding the issue with a parallel scan is not going to improve the situation if your bottleneck is the provider.
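The point about the bottleneck can be put as a simple model: throughput is capped by whichever is lower, what the workers could do locally or what the provider allows. The sketch below uses made-up numbers; the 40 requests per minute limit and the 3 seconds per item are hypothetical, not real TMDB or Emby figures.

```python
def effective_items_per_minute(
    workers: int,
    seconds_per_item: float,
    provider_limit_per_minute: float,
) -> float:
    """Throughput when each item needs one provider request (simplified model)."""
    local_rate = workers * 60 / seconds_per_item
    return min(local_rate, provider_limit_per_minute)


# Hypothetical figures: 3 s of work per item, provider allows 40 requests/minute.
print(effective_items_per_minute(1, 3.0, 40.0))  # 20.0
print(effective_items_per_minute(6, 3.0, 40.0))  # 40.0
```

In this model, going from 1 to 6 workers only doubles throughput, because the provider limit takes over; if metadata is already stored locally, that cap does not apply.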


akacharos
1 minute ago, rbjtech said:

Agree - I see provider responses being throttled now (it says so in the debug log) - it may be because I have a fast internet connection, but compounding the issue with a parallel scan is not going to improve the situation if your bottleneck is the provider.

Well, it depends on the setup. I save all metadata in the same folders as the media content (pre-generated metadata with TMM), so there is no need to call any external metadata provider whatsoever. Emby should just read the metadata stored next to the media file.
So I also don't get why Emby is painfully slow (days) to do a full initial scan of my whole library (no thumbnail generation, skip intro, etc.). Indeed, the difference between Emby and Jellyfin is night and day for the initial scan (both using the same rclone mount).


In the log you sent me I can see that one of your mounts is currently unavailable, but the server still goes through the process top to bottom of trying to scan every single level.

This is something I've been meaning to adjust - to stop the scan at the top when we can't query the contents of the folder. So at least in the example you provided, that could provide a major improvement.
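A minimal sketch of that idea, under stated assumptions: it is not the server's actual code, `can_list` and `folders_to_scan` are hypothetical names, and `known_subfolders` stands in for whatever list of previously seen folders a server keeps.

```python
import os
from typing import Dict, Iterable, List


def can_list(folder: str) -> bool:
    """True if the folder's contents can be queried right now."""
    try:
        with os.scandir(folder):
            return True
    except OSError:
        # Covers missing paths, permission errors and unavailable mounts.
        return False


def folders_to_scan(
    roots: Iterable[str], known_subfolders: Dict[str, List[str]]
) -> List[str]:
    """Return folders worth scanning, dropping whole trees whose root is unavailable."""
    result: List[str] = []
    for root in roots:
        if not can_list(root):
            continue  # stop at the top: skip every known level beneath it
        result.append(root)
        result.extend(known_subfolders.get(root, []))
    return result
```

One failed check at the root then replaces a failed check for every known subfolder, which matters most when each check has to time out against a remote mount.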

Feel free to send other examples because that did not look like an initial library scan. Thanks.


justinrh
4 hours ago, popeye246810 said:

But as you have already stated all your stuff is local, so this is not relevant to my case.

I understand what you are saying about comparing the scans using the same config, but why not try a local library with Emby?  That would tell everyone if the problem is only Emby's algorithm or if it has something to do with the way Emby interacts with your hardware configuration.  That could be most valuable.

Oh, and don't forget the logs!  😄
