
More intelligent library change detection


embylad892746


If you change your library path names, the entire library is rebuilt even though the movie folders and files themselves are untouched. This can take considerable time.

e.g. changing /root1/movies1 to /root/movies

Would it not be possible to compute a quick hash of each file during a scan? Then, on a future rescan, Emby would recognise that it already has metadata for that file, so there would be no need to wipe that metadata and rebuild it.

I think you get the idea.
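To make the idea concrete, here is a rough Python sketch. It is only an illustration, not how Emby works internally: `quick_fingerprint`, `Library` and the in-memory `items` dict are made-up names standing in for the real scanner and database, and the "size plus hash of the first 64 KiB" fingerprint is an arbitrary choice.

```python
import hashlib
import os
from dataclasses import dataclass, field

CHUNK = 64 * 1024  # bytes hashed from the start of each file (arbitrary choice)


def quick_fingerprint(path: str) -> str:
    """Cheap fingerprint: file size plus SHA-256 of the first CHUNK bytes."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        head = f.read(CHUNK)
    return f"{size}:{hashlib.sha256(head).hexdigest()}"


@dataclass
class Library:
    # fingerprint -> {"path": ..., "metadata": ...}; hypothetical stand-in for the db
    items: dict = field(default_factory=dict)

    def scan(self, root: str) -> dict:
        """Walk root and re-link known files to new paths instead of rebuilding."""
        stats = {"new": 0, "relinked": 0, "unchanged": 0}
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                fp = quick_fingerprint(path)
                item = self.items.get(fp)
                if item is None:
                    # Unknown file: this is where the expensive metadata work would go.
                    self.items[fp] = {"path": path, "metadata": f"metadata for {name}"}
                    stats["new"] += 1
                elif item["path"] != path:
                    # Same content under a new path: keep the metadata, update the path.
                    item["path"] = path
                    stats["relinked"] += 1
                else:
                    stats["unchanged"] += 1
        return stats
```

Scanning /root1/movies1, renaming it to /root/movies and scanning again would count every file as "relinked" rather than "new". Two byte-identical copies of a file would collide in this sketch, so a real version would need to handle duplicates.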

I know some might suggest putting metadata alongside the media, but this is surely less optimal if your media is on slow storage.

If I’m wrong on anything please correct me. Thanks.


rbjtech

This is another reason why I always suggest using UNC path names instead of local paths, as it makes things 'portable' behind the scenes.

As an example, for as long as I can remember, my 'movie' path has always been the UNC path \\media\movies - what's 'behind' that share has been changed many times - but emby doesn't know any different, so it doesn't have to re-scan anything if and when I change it.  Could be a different local filesystem, different host - doesn't matter, the UNC stays the same.

You can also do something similar with symlinks (folder level), but that's restricted to the local machine - shares are universal.
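For the symlink variant, a minimal Python sketch of the indirection might look like the following. It assumes a POSIX system, and `repoint` is a hypothetical helper name, not an Emby or OS tool. The media server is only ever given the stable link path; what it points to can change underneath.

```python
import os


def repoint(link: str, target: str) -> None:
    """Point `link` at `target`, replacing an existing symlink in one rename."""
    tmp_link = link + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)  # clear a leftover temporary link from an earlier run
    os.symlink(target, tmp_link, target_is_directory=True)
    # Renaming over the old link swaps the target without the path ever vanishing.
    os.replace(tmp_link, link)


# Example (hypothetical paths): the library is configured with /movies only.
# repoint("/movies", "/mnt/disk_a/movies")
# ...later, after moving the files to another disk...
# repoint("/movies", "/mnt/disk_b/movies")
```

Because the configured path never changes, nothing in the library looks different after the move, which is the same effect the stable UNC share gives.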


embylad892746

I completely agree, @rbjtech, that this is "the way". The reason I have this mess is that I've just done the equivalent thing, but at the filesystem level using mergerfs - now all that Emby knows about is /movies and /tv, which point to different merged drives.

However, for those not yet using UNC/mergerfs it just seems to me that a small amount of extra intelligent logic could save thousands of scan hours for people in the future when these things inevitably happen again.

 

 


3 hours ago, rbjtech said:

This is another reason why I always suggest using UNC path names

And some sort of drive pooling and then you never have to have this issue.

Still, this is a valid request but it will probably come down to cost/benefit.  One of the most common complaints we get is around library scan speed and every feature/computation we add into that mix increases that.  Even when we make them optional and put red warnings on them, people still complain that their scan is slow and then we find they've enabled all these options that slow it down.

So it comes down to this: is the cost on every single scan worth it for a situation that is really going to occur less than 1% of the time?


adminExitium

The hash should only be calculated & used if the modtime & size match, both of which are present in the directory listing on every platform. And OSHash should be even faster than doing an ffprobe on most files since it needs to read even less data.
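As a sketch of what that could look like in Python: `oshash` below follows the OpenSubtitles-style hash as I understand it (file size plus the 64-bit sum of the first and last 64 KiB, read as little-endian words). `find_moved` and the shape of `index` are hypothetical, and rejecting files under 64 KiB is a simplification of mine, not part of any spec.

```python
import os
import struct

BLOCK = 64 * 1024            # OSHash reads the first and last 64 KiB
MASK = 0xFFFFFFFFFFFFFFFF    # keep the running sum within 64 bits


def oshash(path: str) -> str:
    """File size + sum of 64-bit little-endian words in the first and last 64 KiB."""
    size = os.path.getsize(path)
    if size < BLOCK:
        raise ValueError("file too small for this sketch")
    total = size
    with open(path, "rb") as f:
        for offset in (0, size - BLOCK):
            f.seek(offset)
            words = struct.unpack(f"<{BLOCK // 8}Q", f.read(BLOCK))
            total = (total + sum(words)) & MASK
    return f"{total:016x}"


def find_moved(index: dict, path: str):
    """index maps (size, mtime_ns) -> list of (oshash, record) for known items."""
    st = os.stat(path)
    candidates = index.get((st.st_size, st.st_mtime_ns), [])
    if not candidates:
        return None  # no size/modtime match: treat as new, nothing is hashed
    digest = oshash(path)  # only now read 128 KiB from the file
    for known, record in candidates:
        if known == digest:
            return record  # same file at a new path: reuse its record
    return None
```

On a rescan, most unknown paths fail the size/modtime lookup and are never hashed; only plausible matches cost the two 64 KiB reads.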


2 minutes ago, adminExitium said:

The hash should only be calculated & used if the modtime & size match, both of which are present in the directory listing on every platform. And OSHash should be even faster than doing an ffprobe on most files since it needs to read even less data.

The hash has to be calculated on initial ingestion or it will never exist to compare to.  And, no matter how much faster it is than any other operation, it is an additional operation.

So, again, valid request, but would need to be very carefully vetted.


Q-Droid

The suggested approach doesn't have to be followed for the request to remain valid. Basically if the media item can be identified by name and tracked atomically then its location is less relevant and can be decoupled. This may not account for multiple versions but that shouldn't be too much of a challenge to make part of the data. To me the big question is where is most of this time spent now? Is it removing the "old" and re-acquiring the metadata for the "new"? Does it repeat any processing for thumbs, chapters, intro-skip and the like? If so then that is time worth saving if one decides to move media around.

One thing about pooling and UNC paths (or any network shares) is that OSs like Linux lose the ability to do real-time monitoring (RTM) on them.

 


embylad892746

The only thing that I can repair after a library rebuild is the playlist.xml files, since I have these under version control. Luckily it's easy to search and replace the <path> elements with my updated paths to fix them... However, this brings up another point (in the spirit of this thread): surely the playlists should dynamically find the path names, using an id lookup in the db for example, and should not require hard-coded <path> elements in case they change? This to me is another example of a feature that breaks on library rebuilds but shouldn't.
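For anyone else in the same position, the search and replace can be scripted. This is a minimal Python sketch under assumptions: I have not checked the exact playlist.xml schema, so it simply rewrites the text of any element whose tag is "path" in any letter case, and `rewrite_paths` is a made-up helper name.

```python
import xml.etree.ElementTree as ET


def rewrite_paths(xml_text: str, old_prefix: str, new_prefix: str) -> str:
    """Replace old_prefix with new_prefix in the text of every <path> element."""
    root = ET.fromstring(xml_text)
    for el in root.iter():
        # Tag case is an assumption, so compare case-insensitively.
        if el.tag.lower() == "path" and el.text and el.text.startswith(old_prefix):
            el.text = new_prefix + el.text[len(old_prefix):]
    return ET.tostring(root, encoding="unicode")


# Example with the paths from the first post; the trailing slashes stop
# "/root1/movies1" from also matching a sibling such as "/root1/movies10".
# new_xml = rewrite_paths(old_xml, "/root1/movies1/", "/root/movies/")
```

ElementTree drops comments and the original XML declaration when it re-serialises, so it is worth diffing the result before committing it.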

My library is pretty modest compared to many people but for thumbnails, metadata, video thumbnails etc it's been scanning an entire day now and I'll be going to bed soon. I'd honestly be surprised if it's finished in the morning. 

I completely sympathize with @ebrin with regard to the question of adding another operation to the scanning workflow. I also agree that complete library rebuilds might be rare on average, but the energy and time wasted when one does happen are pretty significant IMO, as mentioned above. If this happens even once per user, then I personally think it's worth implementing or thinking about further. In reality, I suspect it happens more often than that.

Every time an emby user has to rebuild their library unnecessarily, the earth warms by 0.1 degrees....😛

Thanks for the discussion so far.


rbjtech
10 hours ago, embylad892746 said:

The only thing that I can repair after a library rebuild is the playlist.xml files, since I have these under version control. Luckily it's easy to search and replace the <path> elements with my updated paths to fix them... However, this brings up another point (in the spirit of this thread): surely the playlists should dynamically find the path names, using an id lookup in the db for example, and should not require hard-coded <path> elements in case they change? This to me is another example of a feature that breaks on library rebuilds but shouldn't.

On the beta at least, playlists have all been moved to db objects now - you can query them like any other item. The external presence of playlists is now a .m3u file - tbh I'm not 100% sure why, as it still contains the path, but also another value - which is neither the item id nor a ProviderId. Strange.

@Luke What is the value in the m3u - 6461 in the example below?

#EXTM3U
#PLAYLIST:PlayList_Public
#EXTINF:6461,2 Fast 2 Furious
file:\\media\Films\2 Fast 2 Furious (2003) [tmdbId=584]\2 Fast 2 Furious (2003) - WEBDL-1080p.mkv

Thanks.


10 hours ago, rbjtech said:

On the beta at least, playlists have all been moved to db objects now - you can query them like any other item. The external presence of playlists is now a .m3u file - tbh I'm not 100% sure why, as it still contains the path, but also another value - which is neither the item id nor a ProviderId. Strange.

@Luke What is the value in the m3u - 6461 in the example below?

#EXTM3U
#PLAYLIST:PlayList_Public
#EXTINF:6461,2 Fast 2 Furious
file:\\media\Films\2 Fast 2 Furious (2003) [tmdbId=584]\2 Fast 2 Furious (2003) - WEBDL-1080p.mkv

Thanks.

Is that the runtime in seconds?


rbjtech
13 hours ago, Luke said:

Is that the runtime in seconds?

It is - thanks.  Strange thing to add to the m3u? 🤔
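For reference, the number after #EXTINF: is the standard place for a track's runtime in extended M3U files. A small Python sketch for reading it back out, assuming whole-second durations as in the example above (`parse_m3u` and `hms` are just illustrative names):

```python
def parse_m3u(text: str) -> list:
    """Return one dict per entry: runtime in seconds, title and file location."""
    entries = []
    pending = None  # (seconds, title) from the most recent #EXTINF line
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#EXTINF:"):
            duration, _, title = line[len("#EXTINF:"):].partition(",")
            pending = (int(duration), title)
        elif line and not line.startswith("#"):
            seconds, title = pending if pending else (None, "")
            entries.append({"seconds": seconds, "title": title, "location": line})
            pending = None
    return entries


def hms(seconds: int) -> str:
    """Format a runtime in seconds as h:mm:ss."""
    minutes, secs = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"
```

For the playlist above, the single entry has seconds 6461, which `hms` formats as 1:47:41.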


  • 2 weeks later...
2 hours ago, lorac said:

In the *arr apps you can move your media so why not in emby?

Hi, what exactly are you asking?


lorac

The ability to move media within Emby / let Emby know the media has been moved w/o triggering a full scan. Using Metadata manager would seem the logical choice.
