
Locally cache metadata


Crusher21


Is the option to follow the three tier cache>metadata>remote-share-art scheme that online scraping uses being considered at all for offline scraped data as discussed here?

 

If not, I am not sure there is any value in hammering out any more details or symptoms of the existing two tier `cache>remote-share-art` system here when the core architecture won't do what's needed.

Better we expend effort looking at ways to work around the shortcoming for those that feel it worth the effort. I have some ideas but for most, if they manage to wade through this "wall of words", the best solution would be to simply not use offline art scraping at all with Emby and use URL only nfos as a middle ground.


I don't understand what the 3 tier thing is you're referring to.

There is nothing wrong with Emby but you need to set it up for YOUR environment. If you don't power down your discs and they are local then you can store your graphics with the media.

If you use remotely mounted discs or let them spin down then DO NOT store your graphics on the disc but allow Emby to manage this for you in the Meta-Data folder, which you can put wherever you want on any disc.

You can have the cache and meta-data folders on the same drive or different depending on your needs.

What doesn't make sense is putting the graphics on a disc you know won't be quickly available.

Right now you're trying to fight the system and you're losing.


 

8 minutes ago, cayars said:

I don't understand what the 3 tier thing is you're referring to.

I will do my best to explain.

When scraping art metadata from an internet source Emby employs a 3 tier system "internet > metadata folder in Emby install > cache folder in Emby install"

When scraping art metadata from a non-internet source, e.g. images beside media, Emby employs a 2 tier system "art beside media > cache folder in Emby install"

I am asking if Emby plans to employ the 3 tier version for both.

 

I appreciate it's much easier for you to just ask people to work around it outside of Emby, but can we (and I mean this with all humility) just for a moment accept that perhaps you don't fully understand how every storage design in the world works, and that there may be architectures and technologies you haven't been exposed to where functions other than Emby access the same storage, especially at scale. If we can accept this we can quickly realise that changing everything to suit just Emby is not always an option.

I really do welcome the advice and insights, but it feels like you have already decided this is how it all works for you, so that's how it must work for everyone.

 


No that is not how it works.

A) use the graphics in the media folder if existing
B) Pull the artwork and put it EITHER with your media or in the meta-data folder. It can either pull the graphics at time of media discovery OR only on first use (configurable per library). FULL STOP

When Emby apps display graphics for any media item, they get pulled from the cache.  If the data is not already rendered in the cache it gets pulled from the meta-data folder, rendered, saved in the cache location and returned to the app.
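The lookup order described above can be sketched as a read-through cache. This is only an illustration of the described behaviour, not Emby's actual code: `get_image`, `render`, and the two directory arguments are hypothetical names invented for the example.

```python
from pathlib import Path


def render(raw: bytes) -> bytes:
    """Stand-in for resizing/encoding; a real server would transform the image."""
    return raw


def get_image(name: str, cache_dir: Path, source_dir: Path) -> bytes:
    """Return the rendered image, filling the cache from the source on a miss."""
    cached = cache_dir / name
    if cached.exists():
        # Cache hit: the source folder (and the disk it lives on) is not touched.
        return cached.read_bytes()
    # Cache miss: read from wherever the artwork lives (central meta-data
    # folder or the media folder), render it, and store the rendered copy.
    rendered = render((source_dir / name).read_bytes())
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached.write_bytes(rendered)
    return rendered
```

The point relevant to this thread is the miss path: whichever folder holds the source image is the one that gets read once the cached copy is gone.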

So if you want to have discs that spin down when not in use BUT want the server to be able to display them you can't have the graphics on a disc not spinning (doesn't make sense) so you NEVER place graphics with media.  You set your Emby meta-data & Cache folders on a disc that doesn't sleep.  DONE

What you can't do is have the SOURCE of the graphics on a disc not spinning and expect to be able to load them as that doesn't make much sense. If you want to speed things up on your system turn on for each library the download in advance graphics option, do not store graphics with media.  Then delete all graphics from the library mount points and refresh the meta-data.  DONE

 


 

13 minutes ago, cayars said:

No that is not how it works.

....

When Emby apps display graphics for any media item, they get pulled from the cache.  If the data is not already rendered in the cache it gets pulled from the meta-data folder, rendered, saved in the cache location and returned to the app.

 

If this is the case then there is a bug because it can be demonstrated that once the cache purge happens, if you have art beside the media, it will bypass the metadata folder and re-cache it direct from the original art beside the media.

This has been observed previously as seen by this comment:

On 11/25/2020 at 4:57 PM, GrimReaper76 said:

It does use it for the Libraries that don't store artwork in media folders. For those that do, it doesn't because that art is not present in metadata folder. 

...

and consideration that it is not clear cut given by this post

On 11/25/2020 at 12:55 PM, Luke said:

It's possibly on demand.

It is also worth pointing out that Emby is just one app in a media centre ecosystem, and we have to be careful when suggesting solutions like "don't ever have art beside media" without realising that they will have an impact elsewhere.

I am pretty much resigned to the fact that Emby won't entertain any enhancements here, and that's OK, but the previous posts bring us full circle and suggest it may actually still be a bug (although I don't honestly think so; I think it is more likely that the last post claiming the 2 tier model is wrong was in error).


22 minutes ago, xe` said:

 

If this is the case then there is a bug because it can be demonstrated that once the cache purge happens, if you have art beside the media, it will bypass the metadata folder and re-cache it direct from the original art beside the media.

This is what you're not following.  The meta-data folder is wherever the graphics reside.  It can be the CENTRAL meta-data folder OR with the media.  It can be different for EVERY piece of media you have on your system.  If you have graphics with your media THAT FOLDER IS THE META-DATA folder for that media.

To complicate this, a piece of media can have graphics stored in both the central and media folders.  If, for example, you use a 3rd party app that downloads some but not all of the graphics Emby uses, then Emby will use what you provide from the media folder BUT, depending on your library settings, download the other graphics and store them according to your settings.

So again you adjust your libraries to NOT place graphics with media.  You manually remove graphics from the media libraries and rescan/refresh libraries to force it to put the graphics in the system defined meta-data folders.

You're overthinking this.  It is very clear cut how it works.


PenkethBoy

Yep - there is no bug that I can see, and Emby works the way @cayars has explained.

The simple issue here is that you have a fixation with HDDs spinning up to provide data - get past that issue and you would not care where the data comes from.

Insist that you have disks spin down and .......


OK everyone, thanks for the replies. This confirms that Emby uses two different tiering models, as I posted last (which was subsequently, and incorrectly, described as not true).

 

The subtlety of my points is now getting lost, with support replies skimming my posts and considering only very basic, old and small-scale storage designs, which admittedly represent the majority of current users and what they are used to. Seen in that context, my observations on Emby's back-end storage assumptions, whilst still 100% true, come at such a small cost that they are easy to wave off as minutiae.

@Luke If nothing else comes of this long thread please ponder this quote and what it really means next time you are thinking about this subsystem.

19 hours ago, xe` said:

It boils down to this basic, in my opinion, incorrect architecture assumption:

When Emby scrapes metadata from the internet it preserves a duplicate copy in the local `metadata` folder because online sources are unreliable, slow and expensive.

When Emby scrapes metadata from images saved beside the media files it does not duplicate a copy locally because it assumes this data is already reliable, fast and cheap. This is a demonstrably bad assumption especially at scale.

 

Mentioning disk spin down didn't help matters: whilst I hoped it would give an understandable small-scale example for demonstration purposes, it unfortunately backfired, locking everyone into a self-affirming way of thinking that their idea of storage design is the only way ... and not, as I hoped, showing that there are other factors to take into account at scale.

I thank everyone for all the advice and whilst it did get quite passionate I certainly know more now about how Emby works than I did when I started.

I will leave it there, but if you would like to expand your knowledge and have a sneak peek at the future, have a look into what vendors are doing with on-prem cold storage and related tiering technologies. Like F1, it will trickle down soon enough to the prosumer.

Good luck.


PenkethBoy
When Emby scrapes metadata from images saved beside the media files it does not duplicate a copy locally because it assumes this data is already reliable, fast and cheap. This is a demonstrably bad assumption especially at scale.

This is a very valid reason - and nothing to do with scale

The only time this might be valid is if you are using cloud storage as your primary location - but other options are available to make this a minimal issue.


  • 1 month later...

We are still really struggling living with this day to day.

My current workaround is to record a keyboard macro that pages slowly through the TV and movie views in the web interface, but this is ugly and consumes human time.

Is there any way I can directly tell Emby to check/refresh the cache of the top level views... open to any idea.

 

help :)

 

 


I'm not sure where the struggle is?

If you create a metadata folder on an SSD drive or fast HDD and also put the cache folder there you will have very quick loading.  Emby has features built in to accommodate these types of things but you have to use them properly.


We mostly use the Kodi addon here, which means that art caches out more often, as Kodi adds another layer of caching that Emby can't see (so it thinks the main view art hasn't been consumed).

When this happens, the next proper Emby GUI access causes a load of storage to wake up to grab new copies of the covers. During this time the performance of Emby tanks and it looks semi-broken, as art is blank until it is re-cached.

Unfortunately an SSD doesn't help here, as the images have cached out by design.

 

The only two options I know of are to:

stop using art beside media (which I am working on, but it's a big change to accommodate as Emby is not the only tool we use)

or

script something that simulates a user paging through the web GUI over and over (this is what I am doing, but it's an ugly solution).
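The second option could be scripted against the server's HTTP API instead of the web GUI. The sketch below is a rough starting point under several assumptions that are not confirmed anywhere in this thread: that the server serves images at `/emby/Items/{id}/Images/Primary`, that it accepts `maxWidth` and `api_key` query parameters, and that requesting an image keeps its cached copy fresh. The item ids are assumed to have been collected beforehand, and `warm_cache`, `image_url` and `http_get` are hypothetical helper names.

```python
from typing import Callable, Iterable
from urllib.parse import urlencode
from urllib.request import urlopen


def image_url(base_url: str, item_id: str, api_key: str, max_width: int = 400) -> str:
    """Build the (assumed) primary-image URL for one library item."""
    query = urlencode({"maxWidth": max_width, "api_key": api_key})
    return f"{base_url.rstrip('/')}/emby/Items/{item_id}/Images/Primary?{query}"


def http_get(url: str) -> bytes:
    """Fetch a URL and return the response body."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def warm_cache(base_url: str, item_ids: Iterable[str], api_key: str,
               fetch: Callable[[str], bytes] = http_get) -> int:
    """Request each item's cover once; return how many requests succeeded."""
    warmed = 0
    for item_id in item_ids:
        try:
            fetch(image_url(base_url, item_id, api_key))
            warmed += 1
        except OSError:
            # Network and HTTP errors from urllib derive from OSError; skip the item.
            continue
    return warmed
```

If cached renders are stored per requested size, the `max_width` used here would need to match what the clients actually request, otherwise the warmed entries are not the ones that get displayed.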

 

 

 


1 hour ago, xe` said:

We mostly use the Kodi addon here, which means that art caches out more often, as Kodi adds another layer of caching that Emby can't see (so it thinks the main view art hasn't been consumed).

When this happens, the next proper Emby GUI access causes a load of storage to wake up to grab new copies of the covers. During this time the performance of Emby tanks and it looks semi-broken, as art is blank until it is re-cached.

Unfortunately an SSD doesn't help here, as the images have cached out by design.

 

The only two options I know of are to:

stop using art beside media (which I am working on, but it's a big change to accommodate as Emby is not the only tool we use)

or

script something that simulates a user paging through the web GUI over and over (this is what I am doing, but it's an ugly solution).

Not sure what you don't understand but both of those options are very poor choices.  Put the metadata and cache on an SSD or fast HDD that doesn't spin down and your issues will vanish.  

Fix your setup vs putting yet another band-aid on the situation.


Unfortunately, even though I already have `metadata and cache on an SSD`, it will not help, since Emby does not use the metadata folder at all when the covers are of the `art beside video` source type. This means that when the images in the cache folder expire, they are re-cached from the original `art beside video` and not from the metadata folder.

I am not looking to change this model I am simply looking for a way to recache this data on a schedule other than the semi-random browsing of my kids or by having to manually page after page once a month.


22 minutes ago, cayars said:

those options are very poor choices

Just saying, this has worked for Kodi for decades without issues, and it is no fancy design: just keep the stuff you grabbed before in the cache. It's a basic feature, doesn't need anything fancy, and works even on super low-end hardware.

Using an SSD as a workaround is really odd. First, I really don't want my metadata outside of the actual folder (there is no proper reason to do it - it works everywhere), and I expect a cache to do what it does by definition. Also, how should that work? Manually copy everything to the SSD?

Caching it for 30 days and then just dropping it because 30 days are over is just strange. Can anyone point me to a use case where local metadata changes so frequently that this is needed? I get that you need some kind of cleanup so that the db doesn't grow endlessly, but 180 days or a similar value is likely more fitting here.

To be clear, I have no problem if the default is 30 days, as long as the setting is configurable.

Btw, the cache does not work at all even if you put it on the SSD. It still recaches everything because the 30 days are over; just the speed penalty is not there anymore.

Another way the cache would work super well in its current form is to change all NAS drives to SSDs. The cache still does not work properly, but you won't see the speed penalty.


1 hour ago, xe` said:

Unfortunately, even though I already have `metadata and cache on an SSD`, it will not help, since Emby does not use the metadata folder at all when the covers are of the `art beside video` source type. This means that when the images in the cache folder expire, they are re-cached from the original `art beside video` and not from the metadata folder.

I am not looking to change this model I am simply looking for a way to recache this data on a schedule other than the semi-random browsing of my kids or by having to manually page after page once a month.

BUT YOU DON'T have it set up this way and that's what we have been saying.  You can't have the graphics next to the media or they will ALWAYS get used. This is exactly the source of your issue.  You need to remove them and turn off the library option to save meta-data in the media folders.  At that point your graphics will be in the meta-data folder on your SSD and things will work much faster in your environment.

1 hour ago, CvH said:

Just saying, this has worked for Kodi for decades without issues, and it is no fancy design: just keep the stuff you grabbed before in the cache. It's a basic feature, doesn't need anything fancy, and works even on super low-end hardware.

Using an SSD as a workaround is really odd. First, I really don't want my metadata outside of the actual folder (there is no proper reason to do it - it works everywhere), and I expect a cache to do what it does by definition. Also, how should that work? Manually copy everything to the SSD?

Caching it for 30 days and then just dropping it because 30 days are over is just strange. Can anyone point me to a use case where local metadata changes so frequently that this is needed? I get that you need some kind of cleanup so that the db doesn't grow endlessly, but 180 days or a similar value is likely more fitting here.

To be clear, I have no problem if the default is 30 days, as long as the setting is configurable.

Btw, the cache does not work at all even if you put it on the SSD. It still recaches everything because the 30 days are over; just the speed penalty is not there anymore.

Another way the cache would work super well in its current form is to change all NAS drives to SSDs. The cache still does not work properly, but you won't see the speed penalty.

There is nothing wrong with Emby's setup or configuration, and it works great with Kodi or standalone.

The problem is that you can choose to have the meta-data with your media or in the meta-data folders.  I have it stored with my media BUT don't allow my drives to spin down so I do not experience any slow loading of cache reloads.

What you can't do is keep meta-data on drives that are not spinning and expect fast cache reloads.  It's physically impossible.  So you either keep the drives spun up or move the data you want to load fast off the non-spinning drives.  Emby can do this out of the box.

Meta-data doesn't have to get updated.  That's the ADMIN'S Choice in Library Settings.

[Screenshot: library settings]

Actors die, marry, divorce, have children, win awards, etc., and all of that becomes part of their bio. They age and have their pictures online changed to reflect this. So it only makes sense to update this information on some regular basis so it's not forever stale and outdated.

If you have your meta-data graphics on a drive spun up the reloading of the data in the cache is VERY, VERY FAST especially when on SSDs.

But again, a system not optimized or setup to handle spun down drives is going to have 10 to 15 seconds delay (time it takes to spin up drives) and their config is working against them. This delay should be expected because you're putting your data on a non spinning drive.

This is easily fixed BUT you MUST stop putting the graphics with your media and allow the system to store it in the meta-data folder instead.


All i want to do is have Emby not expire covers cached for the top level content views or increase the timeout greater than 30 days or reset the covers timeout without needing to page down manually over and over for all TV/Movie lists.

How can I achieve this?

 


Happy2Play
1 hour ago, xe` said:

All i want to do is have Emby not expire covers cached for the top level content views or increase the timeout greater than 30 days or reset the covers timeout without needing to page down manually over and over for all TV/Movie lists.

How can I achieve this?

 

I guess you could go into the API and change the hidden daily "Cache file cleanup" task trigger to say 180 day interval instead of every 24hrs or disable it but that will leave bloat behind if items are ever deleted/moved around.

So from 864000000000 to 155520000000000
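Both numbers fit an interval expressed in 100-nanosecond ticks (10,000,000 per second), which is an inference from the two values quoted rather than something stated in the thread. A small helper, with the hypothetical name `interval_ticks`, makes it easy to work out the value for any other interval:

```python
from datetime import timedelta

TICKS_PER_SECOND = 10_000_000  # assumption: one tick = 100 nanoseconds


def interval_ticks(interval: timedelta) -> int:
    """Convert a time interval to a tick count."""
    return int(interval.total_seconds()) * TICKS_PER_SECOND


print(interval_ticks(timedelta(hours=24)))  # 864000000000
print(interval_ticks(timedelta(days=180)))  # 155520000000000
```

For example, `interval_ticks(timedelta(days=365))` gives the value for a yearly interval.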


  • 2 months later...

Just to close this one down: I gave up. I tried everything, and whilst I started to get somewhere, I was straying so far into atypical setup land that I threw in the towel and accepted defeat.

I have now stopped using XML nfos and art beside media, converting to URL only nfos and the real world performance of Emby has visibly improved.

Thanks everyone for the advice


rbjtech

This is an interesting topic - so I looked into it in depth myself.

Using Sysinternals Process Monitor (part of a free Microsoft-developed toolset) in file mode, it is extremely easy to see in real time what files Emby is reading.

Just set the filters to show EmbyServer.exe, exclude any local disk access (ie cache) and it will then show you all other disk activity from EmbyServer.exe

By loading the top level screens, then clearing the browser cache, then reloading, you can clearly see that it is not accessing the remote drives - BUT there have been occasions when it has, and I'm not 100% sure why, as the image was clearly listed on the screen.

I do have a persistent offender - one item will NOT stay in the cache for some reason - and always shows (see below).

I'm digging deeper to find out why this is.

But this is a sure way to find out WHY remote disks are being accessed, if you believe they should not be.
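The same filtering can be repeated offline on a saved capture. The sketch below assumes the capture was exported to CSV with `Process Name` and `Path` columns (check the header row of your own export), and that "local" means paths starting with one of the prefixes you pass in. `remote_reads` and `remote_reads_from_csv` are hypothetical helper names.

```python
import csv
from typing import Dict, Iterable, Iterator, List, Sequence


def remote_reads(rows: Iterable[Dict[str, str]],
                 process: str = "EmbyServer.exe",
                 local_prefixes: Sequence[str] = ("C:\\",)) -> Iterator[str]:
    """Yield paths touched by `process` that are not under a local prefix."""
    prefixes = tuple(p.upper() for p in local_prefixes)
    for row in rows:
        if row.get("Process Name") != process:
            continue  # ignore activity from other processes
        path = row.get("Path", "")
        if path.upper().startswith(prefixes):
            continue  # local disk access (e.g. the cache) is excluded
        yield path


def remote_reads_from_csv(csv_path: str, **kwargs) -> List[str]:
    """Load an exported capture and return the non-local paths."""
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        return list(remote_reads(csv.DictReader(handle), **kwargs))
```

Counting how often each returned path appears would show which items keep being fetched from remote storage.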

 

[Screenshot: Process Monitor capture]

 


tl;dr Emby has two "caches" (excuse the terminology): an actual cache for everything delivered via the human interface, and a separate cache for things Emby scraped from the internet itself to deliver on demand (covers, people etc). This is the key point ... "art beside media" is not scraped from the internet by Emby, so it only makes use of the short-term true cache and not the second cache.

What does this mean? Unless you access every art asset within the true cache's TTL in order to keep it fresh, it will expire (rightly). This is no problem for stuff that's in the second "cache", since that typically sits on fast storage local to Emby (SSD etc). Art beside media, however, is not part of that scheme, so once the TTL has expired Emby goes back to the source to refresh the cache the next time the art is needed, and whammo, you are accessing slow bulk storage.

Because the cache expiry is driven by a fixed clock but also human access it can seem quite random even though it is not.
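A toy simulation shows why this can look random. It models the behaviour as described in this post, not Emby's implementation: a daily cleanup drops anything not accessed within the TTL, each access resets the clock, and each miss costs one read from the slow source. `ArtCache`, `simulate` and `TTL_DAYS` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable

TTL_DAYS = 30  # expiry window discussed in this thread


@dataclass
class ArtCache:
    last_access: Dict[str, int] = field(default_factory=dict)  # item -> day last viewed
    source_reads: int = 0  # how often the slow source had to be read

    def get(self, item: str, today: int) -> None:
        if item not in self.last_access:
            self.source_reads += 1  # miss: fetch from art beside media
        self.last_access[item] = today  # any access resets the clock

    def cleanup(self, today: int) -> None:
        """Daily task: drop entries not accessed within the TTL."""
        self.last_access = {
            item: day for item, day in self.last_access.items()
            if today - day < TTL_DAYS
        }


def simulate(views: Dict[str, Iterable[int]], days: int) -> int:
    """Run the daily cleanup plus the given viewing schedule; return source reads."""
    cache = ArtCache()
    schedule = {item: set(view_days) for item, view_days in views.items()}
    for today in range(days + 1):
        cache.cleanup(today)
        for item, view_days in schedule.items():
            if today in view_days:
                cache.get(item, today)
    return cache.source_reads
```

With `simulate({"a": [0, 20, 40], "b": [0, 40]}, 40)`, item `a` is read from the source once and item `b` twice, purely because of when each was last viewed.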

This is not a bug, it is by design and the reason I have abandoned art beside media.


  • 2 years later...
Statick
On 16/01/2021 at 22:02, Happy2Play said:

I guess you could go into the API and change the hidden daily "Cache file cleanup" task trigger to say 180 day interval instead of every 24hrs or disable it but that will leave bloat behind if items are ever deleted/moved around.

So from 864000000000 to 155520000000000

just resurrecting this thread to point out that if you change this in the API and then emby installs an update, the setting gets changed back. I too suffer from performance issues due to the cache being expired every 30 days, thought I'd sorted it with this somewhat annoyingly difficult to reach setting (can't just change it in the settings page - have to go digging into the API to do it) only to find a couple of months later the problem was back already, because the setting has been reset. can we at least have this setting be left alone in future please? I just want my cached artwork to persist

 

 


  • 2 weeks later...
Statick

update, this setting actually gets changed back every time you restart Emby. can we please just have a way to turn this off and it stays off please - every time the cache gets cleaned out, my emby slows to a crawl due to it pulling in all the artwork over a slow VPN, when that artwork was already cached and working just fine before that happened. I never delete any content from my libraries so there's literally no reason for me to need to clear out the cache more than once a year. it's really frustrating that this can't just be adjusted with a simple setting but we have to go digging into API commands, and even then whatever setting we choose just gets replaced with the default every time it restarts anyway

 

 

 


19 minutes ago, Statick said:

update, this setting actually gets changed back every time you restart Emby. can we please just have a way to turn this off and it stays off please - every time the cache gets cleaned out, my emby slows to a crawl due to it pulling in all the artwork over a slow VPN, when that artwork was already cached and working just fine before that happened. I never delete any content from my libraries so there's literally no reason for me to need to clear out the cache more than once a year. it's really frustrating that this can't just be adjusted with a simple setting but we have to go digging into API commands, and even then whatever setting we choose just gets replaced with the default every time it restarts anyway

 

 

 

Hi, options to control this are certainly possible, yes.


I really want this feature as well!

Ideally, I would never want it to auto-refresh the cache but instead do a cache update only when triggered by something in Emby Server itself such as a change picked up by scanning (RTM or Scheduled), graphic updates, sub downloads, bif, etc...

Put another way, if something hasn't been updated by Emby Server itself there is no need to update the cache just for the sake of it. 
Updating the cache for no reason just puts a toll on the server that isn't needed.  With a million videos, plus music, pics etc, spread equally over a 30 day period, that would be 33K+ graphic cache updates each day just for cover art!
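The figure follows from dividing the library size evenly across the expiry window. As a quick check, with the hypothetical helper `daily_refreshes` and assuming one cover image per item:

```python
def daily_refreshes(items: int, ttl_days: int) -> int:
    """Items whose cached cover expires per day if expiry is spread evenly."""
    return items // ttl_days


print(daily_refreshes(1_000_000, 30))  # 33333
```

Stretching the window to 180 days would bring the same library down to `daily_refreshes(1_000_000, 180)`, roughly 5.5K per day.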

Cache updates on smaller library systems can be done pretty fast but as your library grows and grows these types of automatic updates start to take their toll on performance as they no longer fit a "nightly window" timeframe.

Carlo

 

