ginjaninja, posted January 2, 2023 (edited January 2, 2023)

Background

The script logs movies it believes Emby has matched to the wrong TMDB entry. The preference algorithm is a work in progress. Currently, highest preference first:

1. Nearest year among exact title matches (title, original title and alternative titles); otherwise
2. Best word match count (title, original title and alternative titles)
3. Best Levenshtein distance (title, original title and alternative titles)
4. Highest vote count

It works better on a library where the filenames match TMDB and strictly follow the Emby movie naming scheme. Emby's default TMDB choice is better when the filename disagrees with TMDB, especially on year, but perversely Emby's default choice was not always as reliable when the media exactly matches TMDB. So far I would characterise the algorithm as a useful check on Emby's choice rather than an improvement.

Requirements

- PowerShell 7.
- A TMDB API key, added to the config (keys are free for personal use).
- Edit config.psd1 to your preferences and add your username and password.

Sample log showing 35 misidentifications (from 3500): log.csv

The log contains links for easier review and actioning of suggested corrections:

- item in Emby
- item in TMDB (current id)
- item in TMDB (the suggested id)

Risks are low, as the only changes made are to a log file.

Issues and suggestions are welcome, and I am interested to hear people's results. I am particularly interested in ideas/properties to improve the matching algorithm. If I can make it reliably better than Emby, I would hook it up to a ScriptX "media added" event to update the default choice.

v0.0.0.2: corrected Emby item URL in log

Attachments: CheckIdentity v0.0.0.2.zip, CheckIdentity v0.0.0.1.zip
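For readers who want to experiment with the preference order in the first post, here is a minimal sketch in Python. It is not taken from the CheckIdentity script, which is PowerShell. The `Candidate` class, `sort_key` and `best_match` are hypothetical names, and treating the four criteria as a strict tie-break chain is only one possible reading of the post.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical stand-in for one TMDB search result."""
    tmdb_id: int
    titles: list[str]      # title, original title and alternative titles
    year: int | None
    vote_count: int


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def sort_key(c: Candidate, name: str, year: int | None) -> tuple:
    """Smaller tuples are preferred; assumed reading of the post's order."""
    q = name.casefold()
    titles = [t.casefold() for t in c.titles]
    exact = q in titles
    # Year only separates candidates that match the title exactly.
    if exact and year is not None and c.year is not None:
        year_gap = abs(c.year - year)
    else:
        year_gap = 9999 if exact else 0
    q_words = set(q.split())
    words = max((len(q_words & set(t.split())) for t in titles), default=0)
    dist = min((levenshtein(q, t) for t in titles), default=len(q))
    return (not exact, year_gap, -words, dist, -c.vote_count)


def best_match(cands: list[Candidate], name: str, year: int | None) -> Candidate:
    """Return the most preferred candidate for a file's title and year."""
    return min(cands, key=lambda c: sort_key(c, name, year))
```

A typical call would be `best_match(results, "Heat", 1995)`, where `results` holds the candidates built from a TMDB search. Comparing the returned `tmdb_id` with the id Emby currently holds is then the misidentification check.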
ginjaninja (author), posted January 27, 2023 (edited January 28, 2023)

v0.0.0.3

Changes

- Reduced the weighting of year to lowest (except edge cases), save when an exact match with a certain level of vote count is found.
- Added a check to stop further API requests once a 100-confidence match has been found (slight speed improvement).
- Added a preference function with sample rules to handle edge cases and assess a confidence score (early days). Amendments are straightforward.
  - Rule to address the limited filesystem character set unfairly diminishing legitimate matches, e.g. 50/50.
  - Rule to increase the weight of year discrepancy in low vote count scenarios, e.g. The Angel with the Trumpet.
- Verbose mode: displays the preference engine output.
- Reduced the TMDB API page count to 2 when a year exists; 3+ never helped in my test set (slight speed improvement).
- Logs incorrect filenames on the filesystem in relation to the matched title/year.
- Wired up native and preferred languages in case they become useful in disambiguation; not currently in use.

This version keeps the weighting for exact matches whilst being similarly accommodating of Emby's tolerance of year discrepancy. In time, the confidence score might allow for making changes to Emby and the filesystem in high-confidence scenarios, whilst alerting the user to lower-confidence scenarios. I hope one day to turn this into a plugin.

Attachment: CheckIdentity v0.0.0.3.zip
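The two sample rules and the early stop described in the v0.0.0.3 post could look roughly like the Python sketch below. It is not the script's PowerShell code. The function names, the score values (90/40 base, 10 bonus), the per-year penalties and the `min_votes` threshold are all invented for illustration, and `pages` stands in for any iterable that fetches one TMDB result page at a time.

```python
import re

# Characters that common filesystems do not allow in filenames.
FS_FORBIDDEN = r'[\\/:*?"<>|]'


def fs_normalize(title):
    """Make a TMDB title comparable with a filename-safe version of it."""
    t = re.sub(FS_FORBIDDEN, " ", title)      # "50/50" -> "50 50"
    t = re.sub(r"[\s\-_.]+", " ", t)          # "50-50" -> "50 50"
    return t.strip().casefold()


def confidence(file_title, file_year, cand_title, cand_year,
               vote_count, min_votes=50):
    """Return a 0..100 score; all weights here are assumed, not the script's."""
    score = 90 if fs_normalize(file_title) == fs_normalize(cand_title) else 40
    if file_year is None or cand_year is None:
        return score
    gap = abs(file_year - cand_year)
    if gap == 0:
        score += 10
    else:
        # A year discrepancy counts for more when few people have voted.
        per_year = 2 if vote_count >= min_votes else 10
        score -= gap * per_year
    return max(0, min(100, score))


def search(pages, file_title, file_year):
    """Scan result pages lazily and stop once a 100-confidence match appears."""
    best, best_score = None, -1
    for page in pages:                        # the next page is only fetched on demand
        for cand in page:
            s = confidence(file_title, file_year, cand["title"],
                           cand["year"], cand["vote_count"])
            if s > best_score:
                best, best_score = cand, s
            if s == 100:
                return best, best_score       # no further API requests needed
    return best, best_score
```

Passing a generator as `pages` is what makes the early stop save requests: a page that is never iterated is never fetched.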