
SERVER - Compare library name with file/folder path name


CharleyVarrick

Go to solution Solved by PenkethBoy,


CharleyVarrick

I first posted this in server support (and realized it was probably better to post it in Feature Request)

https://emby.media/community/index.php?/topic/41410-compare-library-name-with-filefolder-name/

 

Get this done and I'm purchasing a lifetime subscription in a heartbeat! Heck, promote it on the Emby Media home page and I'm sure you'll get hundreds, if not thousands, of additional subscriptions. This could be a differentiating factor for choosing Emby over anything the competition has to offer, because as far as I know, it's not offered elsewhere. Imagine being able to analyze thousands of media files and get a QUICK report of any scraping discrepancies. And I'm pretty sure it would not be that difficult for you coding experts to get it done.

 

Pretty please,

Free beer,


Hi, we base our priorities mostly on user demand so like all others we will wait and see what the community feedback is. Thanks.


  • 1 month later...
topbanana

Please can we have this ability!

Emby makes mistakes... It's always going to...  I've just found another 5 out of my 3k movies (found about 30 last year).  Often annoying ones, like different movie databases listing different years for the same movie. Grrrrr!

Can we have filename and path columns in the Reports feature?  It seems the most obvious place to put it as a simple implementation, so we can at least scan down the list of scraped names & years next to the filename and/or path (hiding all the columns we don't need).

It's the only practical way of finding most of the mistakes...  


  • 1 month later...
topbanana

Surely adding two columns to the Reports page doesn't take a huge amount of coding?  And it doesn't change the UI look 'n' feel, or behavior?

We need this functionality badly.

Emby is making mistakes all the time and there's no easy way of finding them.
 


@topbanana, can you please discuss some specific examples? You do realize we are just punching search text into the moviedb/tvdb API search engines and going with those results. What can we do to prevent you from falling into the perception of "Emby making mistakes"?


CharleyVarrick

If I may jump in, I would ballpark figure Emby (including whatever info it finds) is 99.5% accurate, which is excellent.

So someone with a small (100) collection will likely get a perfect accuracy score. But with thousands of items, errors will invariably creep in.

 

By the same reasoning, fine-combing through a small movie collection for identification errors is maybe an hour's job, but with tens of thousands of items it becomes an insurmountable task.

 

Even if someone has perfect file names & file paths, the accuracy would still never reach 100%, because db info is volatile (same movies with different names, a year off, etc). And a movie that's properly identified today might become wrongly tagged tomorrow because some user over at movieDB updates it with a different name/year. Hence a tool to efficiently check that everything is as it should be.


topbanana

...I would ballpark figure Emby (including whatever info it finds) is 99.5% accurate, which is excellent.

Ditto.

 

We're talking computers here, so Garbage In, Garbage out...

 

Emby does what it's designed to do well.

If we give it slightly misnamed movies, or if there are some movies with very similar names, or an obscure title that gets mis-ID'd as a similar mainstream title, or (the most common one) the movie's year is different depending on which db you look at...

 

Emby is therefore IDing some 'wrongly'

 

It's fair to say that 'IT's' not making the mistakes, granted, but it's finishing up with the wrong title for the movie file submitted.

 

How common is this?  You're thinking this is very rare i guess, you don't seem to acknowledge it as a problem, but i can assure you that IF you give us a simple tool to check the file against the ID'd title, almost every user will find errors. Especially those with larger collections.  My estimate is 2-4% with my 3k movie collection.

 

At the moment, Emby gives us no way of knowing/checking.

 

All we need is a table with 4 columns.

 

ID'd Name  |  ID'd Year  |  Filename  |  Path

 

This is such a simple feature to ask for (especially as you have the Report page existing)

 

Please give us the ability to check for errors.


CharleyVarrick

Come to think of it, report tool could offer even more:

1) Missing movie scanner: for an existing file that's not showing up in library

2) Missing file scanner: for a missing file for which a library item exists

3) Duplicate scanner: eg: to find abc.mkv and abc.mp4 within a single library entry
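
To make the three scanners a bit more concrete, here is a rough Python sketch of the idea, working only on two plain lists of paths (what the library knows about, and what is actually on disk). It is not how Emby does or would do it; the function names `compare_library_to_disk` and `find_duplicates` are invented for illustration, and how you obtain the two lists is left open.

```python
from collections import defaultdict
from pathlib import PureWindowsPath  # accepts both "\" and "/" separators


def compare_library_to_disk(library_paths, disk_paths):
    """Return (files on disk but not in the library, library items with no file)."""
    library, disk = set(library_paths), set(disk_paths)
    return sorted(disk - library), sorted(library - disk)


def find_duplicates(paths):
    """Group paths that share folder and base name but differ in extension."""
    groups = defaultdict(list)
    for path in paths:
        p = PureWindowsPath(path)
        # Key on folder + file name without extension, ignoring case.
        groups[(str(p.parent), p.stem.casefold())].append(path)
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

The first function covers scanners 1) and 2) with two set differences; the second covers 3) by treating abc.mkv and abc.mp4 in the same folder as one group. Paths are compared as exact strings, so a real tool would likely need to normalise case and drive letters first.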


bfir3


 

These are actually great ideas and would be really useful.


topbanana

So i decided to repeat an audit, which i'd done before, to check whether Emby's IDs line up with the filenames...
This takes several hours of concentration in Excel.

It involves exporting the Report as an Excel spreadsheet, deleting all the columns except Name and Year (columns A and B), then printing a DOS file listing to a file and importing it so that column D holds the filenames.
I'd already hacked emby to stop ignoring a, an & the, so that it sorted all the movies to resemble the file system's alphanumeric sorting (another feature missing/requested many times: sort by path/filename!)

 

So now i dragged blocks of filenames over from column D to C, shuffling as necessary.  Shuffling was needed a few times where the file system and emby sorted commas and symbols differently, so a few files were out of order...  Then there were several movies which have different names, usually foreign names.  I always include the original name in brackets in the filename of the movie, so there were a few of these too.  No problem.  All of these were identified correctly (as in, i'd had to use the Identify feature, as emby still cannot handle filenames with the original title in brackets, as commonly used (see RT's foreign films!) grrrrrr!).

But then there were another 10-15 files that were just wrong.  Incorrect identifications by emby.
This was on top of the 30+ that i think i found when i did this task a while ago, probably with about 2,000 movies.  Now there's over 3,000.  So 10-15 in the 1,000 that were added.

A few of these were the classic year-out problem, which sometimes meant that the title identified was actually a making-of documentary of the movie!!!  (i'd seen this before).  Or just the common discrepancy you get between the different databases.  Just rename the file to the year before, so it's in line with the movie databases that emby uses, and it should be good if i have to do another library rebuild in the future.
Others it just got plain wrong.

 

I would love to see someone else repeat a task like this to see how many mistakes they find in their libraries... Then perhaps Luke might up the priority due to more demand...  But actually, it's a lot of work.

 

 

We cannot just go on blindly, with no way of checking the accuracy of our libraries of movies, other than occasionally stumbling across the errors.

 

We just need two columns, filename & path, added to an existing Reports feature.


CharleyVarrick

All the work involved is crazy, believe me I feel your pain.

 

The worst thing is thinking (no, knowing) that however accurate your library is today, that guarantees nothing for tomorrow.

I am a regular contributor on themovieDB and know how easy it is to update a perfectly identified movie with wrong info, hence what's correctly ID'ed today might go south next week or next month.

 

I am shocked to see only 4 emby forum users hit the "Like this" button on this thread so far, in effect sending a very weak message to Emby brass, and I fail to understand why most people don't seem to care at all about scraping accuracy.

 

If it's complicated to fix the Report (and it looks like it is, seeing how funny it smells), why not a new plugin then?


  • Solution
PenkethBoy

Guys

 

There is a quicker way that will give you a list of what you need

 

The Emby DB is SQLite - so you can download the free "DB Browser for SQLite" and install it.

 

Make a copy of your library.db file (with Emby shut down so it's up to date) and point DB Browser at the copy of the db file

 

Then go to the Execute SQL tab and run the following

 

 

select path, name, sortname, cleanname, slugname
from TypedBaseItems
where TypedBaseItems.type in (
  'MediaBrowser.Controller.Entities.Movies.Movie',
  'MediaBrowser.Controller.Entities.TV.Episode',
  'MediaBrowser.Controller.Entities.MusicVideo',
  'MediaBrowser.Controller.Entities.Video'
)

 

 

This will give you a list of "video" files with their path and the various name fields within the database - virtually instantly :)

 

It's a very simple sql query that you can modify to add or subtract db fields to show more or less info

 

example (run on my test server which has the same file used in the various library types)

 

PATH  |  NAME  |  SORTNAME  |  CLEANNAME  |  SLUGNAME
"F:\EmbyTest\MovieTesting\Testing\testing.mp4"  |  "TED: Chinaka Hodge (2016 Women)"  |  "ted: chinaka hodge (0000002016 women)"  |  "ted: chinaka hodge (2016 women)"  |  "TED: Chinaka Hodge (2016 Women)"
"F:\EmbyTest\Unset\testing.mp4"  |  "TED: Chinaka Hodge (2016 Women)"  |  "ted: chinaka hodge (0000002016 women)"  |  "ted: chinaka hodge (2016 women)"  |  "TED: Chinaka Hodge (2016 Women)"
"F:\EmbyTest\MusicVid\Foo Fighters\testing.mp4"  |  "TED: Chinaka Hodge (2016 Women)"  |  "ted: chinaka hodge (0000002016 women)"  |  "ted: chinaka hodge (2016 women)"  |  "TED: Chinaka Hodge (2016 Women)"
"F:\EmbyTest\TVTesting\Black Adder\Season 1\testing.mp4"  |  "TED: Chinaka Hodge (2016 Women)"  |  "001 - TED: Chinaka Hodge (2016 Women)"  |  "ted: chinaka hodge (2016 women)"  |  "TED: Chinaka Hodge (2016 Women)"
"F:\EmbyTest\HomeVid\testing.mp4"  |  "TED: Chinaka Hodge (2016 Women)"  |  "ted: chinaka hodge (0000002016 women)"  |  "ted: chinaka hodge (2016 women)"  |  "TED: Chinaka Hodge (2016 Women)"
 
If you want to sort the list just add "Order By TypedBaseItems.Path" to the end of the sql query above.
 
Have fun :)
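
For anyone who would rather script this than click through DB Browser, the same query can be run from Python's built-in sqlite3 module. This is only a sketch: the table, column and type names come from the query above, while `fetch_video_items`, `VIDEO_TYPES` and the `library-copy.db` file name mentioned below are made up for illustration. Run it against a copy of library.db, not the live file.

```python
import sqlite3

# Type names copied from the query in this post.
VIDEO_TYPES = (
    "MediaBrowser.Controller.Entities.Movies.Movie",
    "MediaBrowser.Controller.Entities.TV.Episode",
    "MediaBrowser.Controller.Entities.MusicVideo",
    "MediaBrowser.Controller.Entities.Video",
)


def fetch_video_items(conn):
    """Return (path, name, sortname) rows for video items, sorted by path."""
    placeholders = ", ".join("?" for _ in VIDEO_TYPES)
    sql = (
        "SELECT path, name, sortname FROM TypedBaseItems "
        f"WHERE type IN ({placeholders}) ORDER BY path"
    )
    return conn.execute(sql, VIDEO_TYPES).fetchall()
```

Usage would be something like `conn = sqlite3.connect("library-copy.db")` followed by `fetch_video_items(conn)`; the rows can then be written out with the csv module and opened in Excel.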

schmitty

Perhaps, if a search has multiple matches, it could show a list of matches... similar to the identify feature.

 

 


CharleyVarrick

 


 

Thanks,

I might try your suggestion, but I am generally not too fond of 3rd party solutions; it might be simple stupid for you to use "DB Browser for SQLite", but I was driven away from stand-alone Kodi because of its lack of a centralized DB; I was told to use SQLite as it was so easy and yada-yada-yada, and it turned out to be Klingon algebra to me.

 

A half-decent coder within team Emby should be able to come up with an internal solution in 15 minutes' work, my gut feeling tells me...


PenkethBoy

Yep, it's easy to code this into emby, as it would be to drastically improve the reporting functions generally - but as there are only two devs other things get priority

 

would also be nice to know what some of those priorities were  


CharleyVarrick

@PenkethBoy

 

As @topbanana said, we need to be able to compare side-by-side:

 

A ) folder path + year (in parenthesis)

B ) file name + year (in parenthesis)

 

to

 

C ) library name

D ) library year

 

With your previously proposed query, I'm getting A ) & B ) together

(I can live with that, but I'd prefer seeing them separated, as sometimes the file name and path name can differ slightly)

 

4 versions of C )

In your proposed query I see you're asking for Name, SortName, CleanName and SlugName

The first 2 are self-explanatory, but I am unsure what the last 2 are: can you explain what they are?

 

 

But not D )

Can you update the Klingon algebra  ;)  so I get the library year?

 

Thanks again buddy!

PS: I saw what you mean by getting results virtually instantly, 23k+ in less than half a sec!

PPS: I can copy/paste the cells in Excel and sort/annotate to my heart's content from there
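
Once the rows are out of the database, the A/B versus C/D comparison does not have to be done by eye. Below is a small Python sketch of one way to do it, assuming each row is a (path, library name, library year) tuple and that files and folders are named like "Title (1999)". The helper names `split_title_year` and `find_mismatches` are invented here, and the title check is a plain case-insensitive equality test, so a file with an original foreign title in brackets will be flagged even when the match is fine.

```python
import re
from pathlib import PureWindowsPath

YEAR_RE = re.compile(r"\((\d{4})\)")


def split_title_year(text):
    """Split 'Title (1999)' into ('title', 1999); year is None if absent."""
    match = YEAR_RE.search(text)
    year = int(match.group(1)) if match else None
    title = text[: match.start()] if match else text
    return title.strip().casefold(), year


def find_mismatches(rows):
    """Return (path, library name, library year, reason) for suspicious rows."""
    flagged = []
    for path, name, year in rows:
        p = PureWindowsPath(path)
        file_title, file_year = split_title_year(p.stem)
        _, folder_year = split_title_year(p.parent.name)
        reasons = []
        if file_title != name.strip().casefold():
            reasons.append("title")
        if file_year is not None and file_year != year:
            reasons.append("year")
        if folder_year is not None and folder_year != year:
            reasons.append("folder year")
        if reasons:
            flagged.append((path, name, year, ", ".join(reasons)))
    return flagged
```

Only the flagged rows then need a human look, which should be a much shorter list than the full 23k+.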


CharleyVarrick

would also be nice to know what some of those priorities were  

I can't count how many times I've seen great feature requests and suggestions on this forum getting an enthusiastic thumbs-up from Luke and others ("this is a great idea", Like This button, etc etc), or some response along the lines of "this is an area we need to improve in a future release"

but then what happens? I'm afraid it gets buried under more, newer, day-to-day stuff.


CharleyVarrick

OK, I added "productionyear" to the select list and I have this working

 

What I can't seem to find is how to add "filename" (so I can compare path and file name)


PenkethBoy

@jlr19

 

to split the filename and path i would do that in excel - Text to Columns would do that

 

the filename is not stored as a separate field in the db - only the path + filename

 

cleanname and slugname are for internal use - i just listed them for completeness

 

good that you are learning Klingon :)
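
If Excel's Text to Columns feels fiddly, the split can also be done in a few lines of Python, which gets close to the four-column table asked for earlier in the thread (ID'd name, ID'd year, filename, path). A sketch only: `report_rows` is a made-up name, the `productionyear` column is the one mentioned a couple of posts up, and `PureWindowsPath` assumes Windows-style paths like the ones in the example output (a Linux server would want `PurePosixPath`).

```python
import sqlite3
from pathlib import PureWindowsPath

MOVIE_TYPE = "MediaBrowser.Controller.Entities.Movies.Movie"


def report_rows(conn):
    """Return (library name, library year, file name, folder) per movie."""
    sql = (
        "SELECT name, productionyear, path FROM TypedBaseItems "
        "WHERE type = ? ORDER BY path"
    )
    rows = []
    for name, year, path in conn.execute(sql, (MOVIE_TYPE,)):
        if path is None:
            continue  # nothing to split for items without a path
        p = PureWindowsPath(path)
        # The db stores only path + filename together, so split it here.
        rows.append((name, year, p.name, str(p.parent)))
    return rows
```

As before, point it at a copy of library.db, for example `report_rows(sqlite3.connect("library-copy.db"))`, where the file name is just a placeholder.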


CharleyVarrick

yIn 'u' Hoch 'ej

all of you, and the universe of life

 

:unsure:  yes, of course!

 

We need a better universal translator... SCOTTY!!!
