Compare library name with file/folder name


CharleyVarrick

Hello,

(first the big picture, then the question...)

I have a pretty big movie collection (12k) that I try to keep error-free, meaning no unmatched or mismatched titles. I've been using xbmc (now Kodi) for over 3 years, with tmDB as the main scraper; I found this setup near perfect at accurately identifying movies.

The main issue with tmDB is volatile data: because it is open source/community driven, there is a lot of opportunity for imperfect metadata. It is astounding how many different titles (AKAs) some movies can hold. Release dates also often vary within tmDB, as registered users can easily enter a wrong or irrelevant title/date. Finally, tmDB and imDB often disagree on an "official" title and release year.

Adding to this, I always had 2 Kodi clients, one on the living room HTPC and the other on my workhorse home office PC. My media actually resides on a 3rd networked PC crammed with multiple big hard drives. So close maintenance of my data always meant doing everything (at least) twice, because of the 2 Kodi clients. With this in mind, a year ago I tried PLEX but quickly abandoned it because of its dismal overall match rate, for me anyway.

Just about a week ago, I discovered Emby for Kodi and I am already thrilled by the server/client approach + the ease of remote access. But all my media has been re-scraped in the switchover process, and sure enough, I am finding some unmatched/mismatched here and there, not a lot, maybe between 1 and 2%, but on the scale of my collection, I have a huge task in front of me. Without a better way to do it, I am probably looking at 200 hours of work.

Now the question!

How could I automate or streamline the process of comparing movie library names (scraper results) with the actual file and folder names in my directories? This is what I need to achieve:

  1. A list of every discrepancy between file/folder movie names and library movie names
  2. A list of every file/folder name missing from the library
  3. A list of all duplicate files, e.g. The X-Files (1998).mkv and The X-Files (1998).mp4
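
A rough sketch of what producing these three lists could look like in Python. It assumes the "Title (yyyy).ext" naming convention used in this thread, and the shape of the library records (name, year, path) is hypothetical: it stands for whatever export of the library is available, not for any Emby or Kodi API.

```python
import re
from collections import defaultdict
from pathlib import PurePath

# Assumed naming convention: "Title (yyyy)" once the extension is removed.
NAME_YEAR = re.compile(r"^(?P<title>.+?)\s*\((?P<year>\d{4})\)$")

def parse_name(path):
    """Split 'The X-Files (1998).mkv' into ('The X-Files', 1998).

    The year is None when the name does not end in '(yyyy)'.
    """
    stem = PurePath(path).stem
    match = NAME_YEAR.match(stem)
    if match:
        return match.group("title").strip(), int(match.group("year"))
    return stem.strip(), None

def compare(files, library):
    """Return (mismatched, missing, duplicates).

    files:   iterable of file paths found on disk
    library: iterable of (name, year, path) tuples (hypothetical export shape)
    """
    by_path = {path: (name, year) for name, year, path in library}
    by_stem = defaultdict(list)
    mismatched, missing = [], []
    for f in files:
        # Group by file name without extension to spot .mkv/.mp4 twins.
        by_stem[PurePath(f).stem.casefold()].append(f)
        if f not in by_path:
            missing.append(f)
            continue
        title, year = parse_name(f)
        lib_name, lib_year = by_path[f]
        if title.casefold() != lib_name.casefold() or year != lib_year:
            mismatched.append((f, lib_name, lib_year))
    duplicates = [sorted(group) for group in by_stem.values() if len(group) > 1]
    return mismatched, missing, duplicates
```

The comparison is an exact, case-insensitive match on title and year, so alternate titles will still show up as discrepancies and need a human look.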

Thanks in advance for your help.


I was just asking for a similar list yesterday. I had a look at the Reports but I wasn't able to find a field showing the media path. My other required field was the IMDB unique ID. Of course I saw neither of them!

 

You may want to analyse your data by exporting your directory structure to a text file and from there to a spreadsheet. It's not a magic solution, but it might at least reduce your 200 hours of work until your required fields are included in the reports, if ever!
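
One possible way to do that export, as a minimal Python sketch. The function name, the CSV column names and the list of video extensions are all assumptions made for illustration; adjust them to your own collection.

```python
import csv
import os

# Assumed set of video extensions; extend it to match your files.
VIDEO_EXTS = {".mkv", ".mp4", ".avi", ".m4v"}

def export_tree(root, out_csv):
    """Write one CSV row per video file under root and return the row count."""
    rows = 0
    with open(out_csv, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["folder", "file", "extension"])
        for folder, _dirs, names in os.walk(root):
            for name in sorted(names):
                ext = os.path.splitext(name)[1].lower()
                if ext in VIDEO_EXTS:
                    writer.writerow([folder, name, ext])
                    rows += 1
    return rows
```

The resulting file opens directly in a spreadsheet, where the folder and file columns can be sorted and compared against whatever the library reports.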

 

Good luck with that anyway.


CharleyVarrick

Before Emby, I was asking for a similar feature on the Kodi forum; they suggested a few 3rd party solutions that "might" fill the need...

 

1) MySQL

2) Texture Cache Maintenance Utility

3) Ember Media Manager

4) Media Companion

 

While some or all of them "might" indeed work, I just haven't gotten around to giving any of them an honest try yet.


CharleyVarrick

I played a bit with Ember Media Manager today. It looks a little busy (daunting, distracting), but I guess it could fill my need. I'm not sure I'll save that much time over just going through Emby media one by one. I saw an "Ember for Emby" node on the skin support side, but I need to learn more about it.


CharleyVarrick

I had to abandon Ember Media Manager and continue going through my collection manually, one movie detail at a time, comparing the path (folder (yyyy)/file (yyyy)) with how the scrapers named it. I'm down to movie #3801 of 12700... It's been over a week since I started; I'm finding around 0.75% mismatches, mostly because about 1 in 150 files is a year off.

 

If only I knew this huge task would ensure no more mismatches, but with the volatility of data entry/editing on IMDB, tmDB and all, this will not happen.

There has to be a more efficient way to do this. I'm wondering what you big-collection holders do to make sure everything is properly named and matched.


  • 2 years later...

Hello,

I don't know if this issue was resolved before. A couple of years ago I requested a report that shows the media library info with 2 fields that are mandatory for me: the media path and its corresponding IMDB id.

Ever since that time I never got a report that does that. I even tried downloading the "Statistics plugin". Unfortunately it doesn't solve the issue.

 

Then I decided to search for the fields in the db files. The following steps are what I did to generate the report. Please proceed at your own risk. I cannot be held responsible for any damage that happens to your Emby server, media library or any part of your system.

 

1. Download DB Browser for SQLite from: https://sqlitebrowser.org/ . Choose the version that suits you and install it.

2. In Windows, open Run and enter '%appdata%\Emby-Server\programdata\data' without quotes

3. Copy file library.db to desktop

4. Launch DB Browser and go to File --> Open Database... then browse to your desktop and choose the library.db that was copied in step 3. I would urge you never to open the db file from its original location, to avoid error messages due to read/write locks or corruption of Emby or its files.

5. Go to Execute SQL tab and copy the following script:

select Name, OfficialRating, [Path], Genres, SortName, ProductionYear,
  datetime(PremiereDate,'unixepoch') as 'PremiereDate', datetime(DateCreated,'unixepoch') as 'DateCreated',
  datetime(DateModified,'unixepoch') as 'DateModified', datetime(DateLastRefreshed,'unixepoch') as 'DateLastRefreshed',
  UnratedType, Images, ProviderIds,
  case when instr(ProviderIds,'Imdb=')=0 then '' else substr(ProviderIds, instr(ProviderIds,'Imdb=')+5, 9) end as 'IMDB'
from "MediaItems"
where UnratedType=0 --and instr(ProviderIds,'Imdb=')<>0
order by path

6. Press F5 or Ctrl+R to execute this SQL query, which will produce a spreadsheet-like table.

7. The table can be copied to the clipboard by left clicking on the top/right most cell in the header, then right clicking and choosing 'Copy With Headers' from the context menu that appears.
8. Now paste it into the spreadsheet app of your choosing.
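
For anyone who prefers a script over the GUI, steps 5 to 8 could be sketched in Python with the standard sqlite3 and csv modules. This is only a sketch: the function name and output file are hypothetical, it uses a shortened version of the query above, and it relies on the same table and column names, which may differ between Emby versions. Run it against the desktop copy of library.db, never the live file.

```python
import csv
import sqlite3

# Shortened version of the query from step 5 (same table and column names).
# As in the original, 9 characters are taken after 'Imdb=' (tt + 7 digits),
# so a longer IMDB id would be truncated.
QUERY = """
SELECT Name, Path, ProductionYear,
       CASE WHEN instr(ProviderIds, 'Imdb=') = 0 THEN ''
            ELSE substr(ProviderIds, instr(ProviderIds, 'Imdb=') + 5, 9)
       END AS IMDB
FROM MediaItems
WHERE UnratedType = 0
ORDER BY Path
"""

def export_library(conn, out_csv):
    """Run QUERY on an open sqlite3 connection, write a CSV, return the rows."""
    cur = conn.execute(QUERY)
    headers = [col[0] for col in cur.description]
    rows = cur.fetchall()
    with open(out_csv, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(headers)
        writer.writerows(rows)
    return rows
```

Usage would be along the lines of export_library(sqlite3.connect("library.db"), "library.csv"), with both paths pointing at the copy made in step 3.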
 
I hope this works for you and helps you manage your library more efficiently.
Edited by mkrbu50

For me, the IMDB id is very important and it's nowhere to be found in the existing reports.

I've been a registered user of IMDB since 2000. Yup, that's A.D. for all you millennials! And I have my movie ratings stored over there. I usually export my ratings and my Emby library to a spreadsheet, then I cross-reference them using the IMDB ID.
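
The same cross-reference can be sketched outside a spreadsheet. The snippet below assumes both exports are CSV files, that the IMDB ratings export keeps the id in a column named "Const", and that the library export has a column named "IMDB"; both names are parameters because they are assumptions and may not match your files.

```python
import csv

def cross_reference(ratings_csv, library_csv, ratings_id="Const", library_id="IMDB"):
    """Pair library rows with rating rows that share an IMDB id.

    Returns (matched, unrated): matched is a list of (library_row, rating_row)
    pairs, unrated is the list of library rows with no rating found.
    """
    # utf-8-sig reads plain UTF-8 and also tolerates a leading BOM.
    with open(ratings_csv, newline="", encoding="utf-8-sig") as fh:
        ratings = {row[ratings_id]: row for row in csv.DictReader(fh)}
    matched, unrated = [], []
    with open(library_csv, newline="", encoding="utf-8-sig") as fh:
        for row in csv.DictReader(fh):
            rating = ratings.get(row[library_id])
            if rating:
                matched.append((row, rating))
            else:
                unrated.append(row)
    return matched, unrated
```

Library rows with an empty IMDB id end up in the unrated list, which doubles as a quick list of items the scraper did not tie to IMDB.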


CharleyVarrick

For me, the IMDB id is very important and it's nowhere to be found in the existing reports.

I've been a registered user of IMDB since 2000. Yup, that's A.D. for all you millennials! And I have my movie ratings stored over there. I usually export my ratings and my Emby library to a spreadsheet, then I cross-reference them using the IMDB ID.

You're a Maniac ! I like you :D

