
Show Intro Skip Option


Liquidfire88


I moved the database clean up to the end of each season during the second task.

It runs more often, but not by much: only when the season is complete (doesn't contain virtual/unaired items) and has the information readily available.
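
A minimal sketch of that gate, in Python for illustration only (the plugin itself is not written in Python). The `Episode` type and its fields are hypothetical, and "has the information readily available" is read here as "intro start and end are already stored", which is an assumption:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Episode:
    # Hypothetical stand-in for a library item; field names are assumptions.
    name: str
    is_virtual: bool                      # placeholder item with no media file (e.g. unaired)
    intro_start: Optional[float] = None   # seconds
    intro_end: Optional[float] = None     # seconds


def season_ready_for_cleanup(episodes: list[Episode]) -> bool:
    """Return True only when every episode is a real file with intro data stored."""
    if not episodes:
        return False
    return all(
        not ep.is_virtual
        and ep.intro_start is not None
        and ep.intro_end is not None
        for ep in episodes
    )
```

With a check like this, a season that still contains a single virtual/unaired episode is skipped until a later run.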

 

I also enabled the button to remove the entire season worth of data. But, we had better have a confirmation dialogue... 😳

I will do that now.

 

@Micael456 I'd love to know if you have a better experience with this upcoming release.  There are still ways to limit resources we use during the task. But it would make the task a lot longer. 

 

@rbjtech  Thank you! 

 

 

 

 

 

Edited by chef

2 hours ago, Micael456 said:

@rbjtech,

I think we also had a concept about a community database based on hashes and titles etc, probably as a step 8.

An online resource of chromaprints! Sorted by Tmdb IDs. 

We host an AWS resource API and database, that we can request the fingerprints from for each series. 

Snag the print, do the calculation, dispose the print.

Wouldn't that be something 🤔


rbjtech
28 minutes ago, chef said:

An online resource of chromaprints! Sorted by Tmdb IDs. 

We host an AWS resource API and database, that we can request the fingerprints from for each series. 

Snag the print, do the calculation, dispose the print.

Wouldn't that be something 🤔

It would - but it has a big 'but' on the version it is using.

As an example - for any episode, you may have -

1. Original air episode - likely has a 'previously' in it.

2. Edited air episode - 'previously' removed and not always started at the same I-frame.

3. DVD/BluRay version - will not have a previously in it.

So while we can accurately determine the Intro length, we cannot determine when it happens just from a Tvdb ID.. ?

What we can do, is verify the accuracy of the Intro length vs an online lookup - and if the Detect results are way off this, then it's likely wrong - and we can flag this.

It's an interesting idea for sure, but I'm not convinced it will negate the need for chromaprints from the original source.

I've added to this list !
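
The sanity check described above (compare the detected intro length with a looked-up one and flag large differences) could be sketched like this. The function name and the 5-second tolerance are assumptions chosen for illustration, not values from the plugin:

```python
def intro_length_is_suspect(
    detected_start: float,
    detected_end: float,
    reference_length: float,
    tolerance: float = 5.0,   # seconds; an assumed threshold
) -> bool:
    """Flag a detection whose length differs too much from the reference length.

    Only the length is compared, because the position of the intro depends
    on which version of the episode the file is.
    """
    detected_length = detected_end - detected_start
    return abs(detected_length - reference_length) > tolerance
```

For example, a detection of 0-30 s against a 90 s reference would be flagged, while 60-150 s would pass even though it starts later.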

Edited by rbjtech

rbjtech

One other interesting point that needs discussion/agreement is the saving of the IntroStart/End data and the ability to re-create on a dB rebuild.

We currently use the emby ID number for lookup - which is fine - but what happens if that ID changes.  This will happen during a rename, rebuild etc ? - so while the tvdb ref will never change, the emby ID may.

The chapter points are built/rebuilt from the media file (if available) - so my personal view is we also need to write the IntroStart/End chapters into either a) the original file and/or b) a <filename>-chapters.xml. 

This a) makes the Intro data survive a complete emby dB rebuild (assuming emby can read the chapters.xml) and b) makes the data available for other platforms.

The downside of course is emby will need write access to either the media file (to modify the header) or access to the directory to write the chapter.xml.

Any updates, will be re-read and saved into the emby chapter database as part of normal processing, so use of the Intro data can continue as normal.

What are people's thoughts here ? 


v_2.0.2.7

  • Remove Season Data button works
  • Confirmation dialogue before season data removal
  • DB clean up switched to second task
  • Code to implement chapter editing (disabled ATM).

IntroSkip_v2.0.2.7.zip

 

The DB doesn't change here, so if you want to try removing episodes, or even an entire season, and running the two tasks again, it would give an idea of how long an already-processed database takes to process new episodes.

Clear browser data to see the new confirmation dialogue.


Micael456
6 minutes ago, rbjtech said:

It would - but it has a big 'but' on the version it is using.

As an example - for any episode, you may have -

1. Original air episode - likely has a 'previously' in it.

2. Edited air episode - 'previously' removed and not always started at the same I-frame.

3. DVD/BluRay version - will not have a previously in it.

So while we can accurately determine the Intro length, we cannot determine when it happens just from a Tvdb ID.. ?

What we can do, is verify the accuracy of the Intro vs an online lookup - and if the Detect results are way off this, then it's likely wrong - and we can flag this.

It's an interesting idea for sure, but I'm not convinced it will negate the need for chromaprints from the original source.

I've added to this list !

@rbjtech, that's why anything has to have a hash in it. File is a different version = different hash. Each ID might have hundreds or thousands of hashes as we use different rip settings, but I guess fundamentally it would work. The odds of two different files of the same episode having a hash collision are almost infinitesimal, even with a quick and lazy md5.

So, broadly speaking, it goes like this in pseudo-code:

forEach (episodeId) {

    # Calculate the hash of the episode file.
    md5hash = hash(md5, episodeFile)

    # Check against the online DB.
    if (hashExists(episodeId, md5hash)) {

        # Exists: pull down the intro start and end times.

    } else {

        # Not found: fingerprint it locally, then share the result.
        uploadToDB(episodeId, md5hash, introStart, introEnd)

    }

} # end loop
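
A runnable Python sketch of the same loop body, for anyone who wants to try the idea locally. The `community_db` dict is only a stand-in for the online database, and `lookup_or_fingerprint` and the `fingerprint` callback are hypothetical names; only `hashlib.md5` is a real library call here:

```python
import hashlib
from pathlib import Path
from typing import Callable

# Stand-in for the online DB: {(episode_id, md5): (intro_start, intro_end)}
community_db: dict[tuple[str, str], tuple[float, float]] = {}


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large media files are not loaded into memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def lookup_or_fingerprint(
    episode_id: str,
    path: Path,
    fingerprint: Callable[[Path], tuple[float, float]],
) -> tuple[float, float]:
    """Return known intro times for this exact file, or compute and share them."""
    key = (episode_id, file_md5(path))
    if key in community_db:
        return community_db[key]      # exists: reuse the stored times
    times = fingerprint(path)         # not found: do the local detection
    community_db[key] = times         # "upload" the result
    return times
```

Note that hashing a whole media file means reading all of it from disk, so the lookup is not free even when the entry already exists.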

 

 

My main practical concern is the hosting cost of the database. Nothing comes without a price.

 


12 minutes ago, rbjtech said:

One other interesting point that needs discussion/agreement is the saving of the IntroStart/End data and the ability to re-create on a dB rebuild.

We currently use the emby ID number for lookup - which is fine - but what happens if that ID changes.  This will happen during a rename, rebuild etc ? - so while the tvdb ref will never change, the emby ID may.

The chapter points are built/rebuilt from the media file (if available) - so my personal view is we also need to write the IntroStart/End chapters into either a) the original file and/or b) a <filename>-chapters.xml. 

This a) makes the Intro data survive a complete emby dB rebuild (assuming emby can read the chapters.xml) and b) makes the data available for other platforms.

The downside of course is emby will need write access to either the media file (to modify the header) or access to the directory to write the chapter.xml.

Any updates, will be re-read and saved into the emby chapter database as part of normal processing, so use of the Intro data can continue as normal.

What are people's thoughts here ? 

Yeah, if emby changes the InternalId of the item, then the item would be scanned again. This could cause the entry in the database to be duplicated.

 

If the user removes a series, and adds it back, or changes the physical location of a series in the file system, does that change the InternalId in Emby?

 

We would need to handle the db entry somehow. Maybe we need to forget InternalIds and use providerIds instead? Not sure.


rbjtech

OK - each md5 hash is going to be unique for every possible variant of a file; as you know, a single differing byte produces a different md5.

Storing the fingerprints for hundreds of thousands (millions..?) of episode combinations will need a huge amount of storage ...

On a test of my library, 30K episodes needed a 3 GB db..

It's a cool idea for sure - but I have my doubts on how achievable /useful it would be vs just creating your own temporary fingerprints. 
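
A rough scaling of those numbers, as a sketch only. It reads the figure as 3 decimal gigabytes and assumes storage grows linearly with the number of stored file versions, which may not hold for a real database:

```python
library_episodes = 30_000
library_db_bytes = 3 * 10**9   # the 3 GB test database, taken as decimal gigabytes

# Average fingerprint storage per episode in the test library.
bytes_per_episode = library_db_bytes / library_episodes


def projected_db_gigabytes(file_versions: int) -> float:
    """Linear extrapolation of database size for a given number of file versions."""
    return file_versions * bytes_per_episode / 10**9
```

That works out to about 100 KB per episode, so one million distinct file versions would need roughly 100 GB under these assumptions.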


Okay, after some testing, the answer is yes, the internalId will change if the series is removed and put back in.

This causes an issue.

What I suggest we do is tie into emby's ItemRemoved event, and then follow suit with the item that is removed.

 

Every item removed in emby is removed from our data table as well.

What could go wrong? 😳😶🙃 Maybe too many items removed too quickly causes our table to become fragmented? ... ... 🤔

 

Edited by chef

rbjtech

Each episode also has a unique provider ID - could we link to that instead of the emby ID - as this never changes - thus on an emby dB rebuild, you will automatically link back to the correct FP ?

It also makes the db very portable - as it can be used on any instance of emby and it would work (using the SAME media source obviously..)

Edited by rbjtech

Just now, rbjtech said:

Each episode also has a unique provider ID - could we link to that instead of the emby ID - as this never changes - thus on an emby dB rebuild, you will automatically link back to the correct FP ?

This is also possible. Implementation would be super easy, barely an inconvenience.

Does the provider ID depend on which providers are selected in the dashboard?

If the user changes provider hierarchy, would that ID change?


rbjtech

Good call - I was just going by the metadata for an Episode.

If the metadata for External ID is blank - then I assume you could fall back to the embyID - but with the caveat that if this changes, you lose the link and need to recalc the FP ?  Same caveat if you change the provider - ie - if you use TheTVDBId and the user removes this, then they'll need to recalc the FP again ?

But all 3 external references will not change - so take your pick on which one to use I guess ?!
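
The fallback described above could look roughly like this. The provider names and their order in `PREFERRED_PROVIDERS` are assumptions for illustration (the thread only names TheTVDB explicitly), and `fingerprint_key` is a hypothetical helper:

```python
# Assumed provider names and preference order; adjust to whatever the metadata actually holds.
PREFERRED_PROVIDERS = ("Tvdb", "Imdb", "Tmdb")


def fingerprint_key(provider_ids: dict[str, str], internal_id: int) -> str:
    """Build a lookup key that survives a database rebuild where possible.

    The first non-blank external ID wins; the emby internal ID is used only
    as a last resort, since it can change on a rebuild.
    """
    for provider in PREFERRED_PROVIDERS:
        value = provider_ids.get(provider)
        if value:
            return f"{provider.lower()}:{value}"
    return f"emby:{internal_id}"
```

Prefixing the key with its source keeps rows that fell back to the internal ID easy to spot, so they can be re-keyed once external metadata becomes available.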

[Attached image: hmm.PNG]


Micael456
21 minutes ago, rbjtech said:

Each episode also has a unique provider ID - could we link to that instead of the emby ID - as this never changes - thus on a emby dB rebuild, you will automatically link back to the correct FP ?

It also makes the db very portable - as it can be used on any instance of emby and it would work (using the SAME media source obviously..)

@rbjtech, I'd be wary on that front - we don't necessarily know that's the same instance of the file (similar to the centralised DB point). e.g. Emby DB is rebuilt, scans and detects "Game of Thrones S01E01". Only thing is, at some point the underlying MKV was swapped out for a version without the HBO intro, and suddenly it's out of sync.

 

Speaking of, @chef, how does / does the plugin handle file replacements at the moment? Does emby automatically assign a new ID if the file is replaced?


Micael456
1 hour ago, rbjtech said:

Storing the fingerprints for hundreds of thousands (millions..?) of episode combinations will need a huge amount of storage ...

We wouldn't have to store the FPs themselves, just the start and end times. Not sure how much space that would save though, and I agree the DB would be v. large for a community effort.
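
To put a rough number on that, here is a sketch of a compact per-file record holding only the hash and the two times. The layout (raw 16-byte MD5 digest plus start and end in milliseconds as 32-bit integers) is a hypothetical choice, and the estimate ignores episode IDs, indexes and database overhead:

```python
import struct

# Hypothetical row layout: 16-byte MD5 digest, intro start (ms), intro end (ms).
RECORD = struct.Struct("<16sII")


def pack_record(md5_digest: bytes, intro_start_ms: int, intro_end_ms: int) -> bytes:
    """Serialise one file version's intro times into a fixed-size record."""
    return RECORD.pack(md5_digest, intro_start_ms, intro_end_ms)


def table_megabytes(rows: int) -> float:
    """Raw payload size in decimal megabytes for the given number of rows."""
    return rows * RECORD.size / 10**6
```

At 24 bytes per row, a million file versions come to about 24 MB of raw payload, far below what storing the fingerprints themselves would need.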


3 minutes ago, Micael456 said:

@rbjtech, I'd be wary on that front - we don't necessarily know that's the same instance of the file (similar to the centralised DB point). e.g. Emby DB is rebuilt, scans and detects "Game of Thrones S01E01". Only thing is, at some point the underlying MKV was swapped out for a version without the HBO intro, and suddenly it's out of sync.

 

Speaking of, @chef, how does / does the plugin handle file replacements at the moment? Does emby automatically assign a new ID if the file is replaced?

No, the ID doesn't change, even though the file was changed. 

Nice catch! Good work!

 

There is an ItemUpdated event that we can tie into as well.

That would tell us that it changed.

So, we have to tie into ItemUpdated, and also ItemRemoved.

ItemRemoved will remove the item from our data table.

ItemUpdated will remove the item from our data table and kick off the task.
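
The two handlers could be sketched as follows, using an in-memory SQLite table in Python purely for illustration. The table name, columns, `rescan_queue` and the handler names are all hypothetical and do not reflect the plugin's real schema or Emby's event signatures:

```python
import sqlite3

# Hypothetical stand-in for the plugin's data table.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE title_sequence ("
    "internal_id INTEGER PRIMARY KEY, intro_start REAL, intro_end REAL)"
)

# Items waiting to be fingerprinted again by the scheduled task.
rescan_queue: list[int] = []


def on_item_removed(internal_id: int) -> None:
    """Mirror a library removal by deleting the matching row."""
    with db:  # commits on success, rolls back on error
        db.execute("DELETE FROM title_sequence WHERE internal_id = ?", (internal_id,))


def on_item_updated(internal_id: int) -> None:
    """Drop the stale row, then queue the item so the task processes it again."""
    on_item_removed(internal_id)
    rescan_queue.append(internal_id)
```

Deleting a row that does not exist is harmless here, so a removal event for an item that was never scanned needs no special handling.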


rbjtech
18 minutes ago, Micael456 said:

@rbjtech, I'd be wary on that front - we don't necessarily know that's the same instance of the file (similar to the centralised DB point). e.g. Emby DB is rebuilt, scans and detects "Game of Thrones S01E01". Only thing is, at some point the underlying MKV was swapped out for a version without the HBO intro, and suddenly it's out of sync.

Thanks - a valid point :)

Once we get the IntroStart/End into the emby ChapterdB (or wherever it ends up) then the method of how we got it there becomes irrelevant.  This is why I think it's really important to get that info HARD linked to the media (and independent of emby) - somehow.

Taking an extreme example: I upgrade my PC, rebuild emby and restore the settings, but all the IDs have now changed, so my FP/Detect dB is no longer good. I have to do it all again - UNLESS the IntroData is stored with the media.  If the FP/Detect dB were linked to the external episode ID, it would still be valid.


5 hours ago, rbjtech said:

That's an interesting view when using Rich Text from an Excel table, I thought it would use a scrollbar like it did when I use the editor

Good day,

The problem here is the "fixed" table width, so it's best to use the editor's table option when posting a table.

My best


guardianali

This is why I always, always choose the option to store data with the media. No matter how nicely something is written, eventually it's nice to do a fresh install to clean things up. Like Windows itself. Get rid of unused orphan files, database entries etc.... 

So when you reinstall emby, it's super easy 'cause all the pictures, art and metadata are already in the media folders. Just a quick reindexing and no downloads. 

Make the plugin create a single txt file in the media directory. Call it something like Intro.nfo. Then have it put the start and end times of all of the season's episodes into it. Bam. One-time scan. Future reinstalls etc. would avoid any future scans. If you replace files with a fresh new rip, just delete the Intro.nfo file in each season folder and let it rescan. Then it's quick and short, as it's a single show only. 

And if you use something like Filebot or your renamer of choice to rename the new rip, using the naming convention you've already created for yourself, the new rip's file names will match the old ones anyway, so the info in the nfo would match fine. 
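
A sketch of what such a per-season file could look like. The JSON layout and the function names are assumptions made for this example; the post only proposes the file name Intro.nfo and that it holds start and end times per episode. Entries are keyed by file name so that a renamed new rip still matches:

```python
import json
from pathlib import Path


def write_season_intro_file(season_dir: Path, intros: dict[str, tuple[float, float]]) -> Path:
    """Write one Intro.nfo per season folder: {file name: {"start": s, "end": s}}."""
    target = season_dir / "Intro.nfo"
    payload = {
        name: {"start": start, "end": end}
        for name, (start, end) in sorted(intros.items())
    }
    target.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return target


def read_season_intro_file(season_dir: Path) -> dict[str, tuple[float, float]]:
    """Load stored intro times; an empty result means the season needs a scan."""
    target = season_dir / "Intro.nfo"
    if not target.exists():
        return {}
    raw = json.loads(target.read_text(encoding="utf-8"))
    return {name: (entry["start"], entry["end"]) for name, entry in raw.items()}
```

Deleting the file forces a rescan of that one season, which matches the workflow described for replaced rips.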

Edited by guardianali

I've got most of the library change sync'ing completed (added/removed items). 

 

It is a bit more difficult to add the ItemUpdated Event.

This event doesn't really tell us enough information to make any decision about  processing the file again.

I suppose that if an episode changes in the library to another encoding, there would be the chance the sequence data wouldn't match. At that point, the User is going to have to remove the sequence in the plugin UI, and scan the library again.

Everything else looks good.


44 minutes ago, guardianali said:

This is why I always, always choose the option to store data with the media. No matter how nicely something is written, eventually it's nice to do a fresh install to clean things up. Like Windows itself. Get rid of unused orphan files, database entries etc.... 

So when you reinstall emby, it's super easy 'cause all the pictures, art and metadata are already in the media folders. Just a quick reindexing and no downloads. 

Make the plugin create a single txt file in the media directory. Call it something like Intro.nfo. Then have it put the start and end times of all of the season's episodes into it. Bam. One-time scan. Future reinstalls etc. would avoid any future scans. If you replace files with a fresh new rip, just delete the Intro.nfo file in each season folder and let it rescan. Then it's quick and short, as it's a single show only. 

And if you use something like Filebot or your renamer of choice to rename the new rip, using the naming convention you've already created for yourself, the new rip's file names will match the old ones anyway, so the info in the nfo would match fine. 

This is an interesting idea. 

If your titlesequence.db is lost on a reinstall, you would have to rescan. 

Edited by chef

@Luke During the LibraryManager_ItemAdded event, can the ItemUpdateType.None refer to an item being added to the library?


Micael456
40 minutes ago, chef said:

It is a bit more difficult to add the ItemUpdated Event.

This event doesn't really tell us enough information to make any decision about  processing the file again.

Hmm. Maybe try the hash for the local DB as well? On an update event compare hashes to see if the file is still the same?

Though obviously an official Emby "FileHasChanged" tag would be better, as saving hashes increases DB size.
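
A sketch of that comparison on an update event. `stored_hashes` is a hypothetical local table and `file_changed_since_scan` a hypothetical helper; only `hashlib.md5` is a real library call:

```python
import hashlib
from pathlib import Path

# Hypothetical local table: {internal_id: md5 of the file when it was last checked}
stored_hashes: dict[int, str] = {}


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks to keep memory use flat."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def file_changed_since_scan(internal_id: int, path: Path) -> bool:
    """Return True when the file differs from the last recorded hash (or is new)."""
    current = file_md5(path)
    changed = stored_hashes.get(internal_id) != current
    stored_hashes[internal_id] = current   # remember the latest state
    return changed
```

The cost is one hex digest per item in the local DB plus a full read of the file on every update event, which is the trade-off raised above.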


3 minutes ago, Micael456 said:

Hmm. Maybe try the hash for the local DB as well? On an update event compare hashes to see if the file is still the same?

Though obviously an official Emby "FileHasChanged" tag would be better, as saving hashes increases DB size.

I'd have to check, but maybe the FileSystem interface has events we can use.


rbjtech

 

1 hour ago, guardianali said:

This is why I always, always choose the option to store data with the media. No matter how nicely something is written, eventually it's nice to do a fresh install to clean things up. Like Windows itself. Get rid of unused orphan files, database entries etc.... 

So when you reinstall emby, it's super easy 'cause all the pictures, art and metadata are already in the media folders. Just a quick reindexing and no downloads. 

Make the plugin create a single txt file in the media directory. Call it something like Intro.nfo. Then have it put the start and end times of all of the season's episodes into it. Bam. One-time scan. Future reinstalls etc. would avoid any future scans. If you replace files with a fresh new rip, just delete the Intro.nfo file in each season folder and let it rescan. Then it's quick and short, as it's a single show only. 

And if you use something like Filebot or your renamer of choice to rename the new rip, using the naming convention you've already created for yourself, the new rip's file names will match the old ones anyway, so the info in the nfo would match fine. 

I would like to try and keep to open and agreed standards where possible to make it fully portable for use with scripts and 3rd party utils.

chapter.xml is an agreed and documented standard for MKV and can be used for MP4 as well.

So why not just write a <filename>-chapter.xml (or .nfo) alongside the file - holding all the chapter points including the new Intro points ?  If people want to mux it back into the MKV, then they can easily do that outside of emby.  Write the emby chapter dB as well - as currently emby will not pick up the external chapter file.

As mentioned before, once we have this chapter file, all entries for that episode can be removed from the FP database - as that chapter file has a 1:1 mapping with the media file - the same as its current nfo file, for example.
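
A sketch of writing such a sidecar with Python's standard library. The element names follow the Matroska chapter XML layout (Chapters, EditionEntry, ChapterAtom and so on); the helper names, the "Intro" chapter title and the "eng" language code are assumptions, and whether a given muxer or emby accepts the result has not been tested here:

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def format_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS.mmm for chapter start/end times."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_chapter_xml(chapters: list[tuple[str, float, float]]) -> str:
    """Build chapter XML from (title, start_seconds, end_seconds) tuples."""
    root = ET.Element("Chapters")
    edition = ET.SubElement(root, "EditionEntry")
    for title, start, end in chapters:
        atom = ET.SubElement(edition, "ChapterAtom")
        ET.SubElement(atom, "ChapterTimeStart").text = format_timestamp(start)
        ET.SubElement(atom, "ChapterTimeEnd").text = format_timestamp(end)
        display = ET.SubElement(atom, "ChapterDisplay")
        ET.SubElement(display, "ChapterString").text = title
        ET.SubElement(display, "ChapterLanguage").text = "eng"  # assumed language
    return ET.tostring(root, encoding="unicode")


def write_sidecar(media_file: Path, chapters: list[tuple[str, float, float]]) -> Path:
    """Write <filename>-chapter.xml next to the media file."""
    target = media_file.with_name(f"{media_file.stem}-chapter.xml")
    header = '<?xml version="1.0" encoding="UTF-8"?>\n'
    target.write_text(header + build_chapter_xml(chapters), encoding="utf-8")
    return target
```

Calling `write_sidecar(Path("Show S01E01.mkv"), [("Intro", 60.0, 150.0)])` would produce `Show S01E01-chapter.xml` alongside the episode, which needs write access to the media directory as noted earlier in the thread.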

