[Prototype] Tag Pacifier - Simple Python Script using local OMDB


Killface69

While looking up Terminator 2 on OMDB.org, I found the information provided very elaborate and well ordered, especially the nested genres 😉

Unfortunately, OMDBAPI doesn't provide all of that information. I thought of creating a channels plugin; I have no idea how well that would work, but nested information would be possible.

So I did the next best thing: I've started working on a Python script that gets that information and puts it into tags. The website provides CSV files of the database, which I use for local scraping.

The script currently reads the nfo file for a film, gets the imdb id, looks up several csv files, and outputs something like this:

Categories for the film 'The Terminator': ['Action', 'Cult Movie', 'Time Travel', 'Apocalypse & Post-Apocalypse', 'Thriller', 'Destiny', 'Low Budget Film', 'Independent Film', 'Guy Movie', 'Adult', 'SciFi', 'Popcorn Movie', 'Classic', 'Franchise Film', 'Exciting', 'Thrilling', 'Funny', 'Serious', 'Rough', 'National Film Registry']
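For readers who want to picture the lookup chain described above (nfo file, then IMDb ID, then CSV files), here is a minimal sketch. It is not the actual script: the function names and the three-file CSV layout (links, movie-to-category, category names) are assumptions made for illustration, and the real OMDB dumps may be organised differently.

```python
import csv
import io
import re

# IMDb IDs look like "tt" followed by 7 or 8 digits.
IMDB_ID = re.compile(r"tt\d{7,8}")


def imdb_id_from_nfo(nfo_text):
    """Return the first IMDb ID found anywhere in the nfo text, or None."""
    match = IMDB_ID.search(nfo_text)
    return match.group(0) if match else None


def categories_for(imdb_id, links_csv, movie_categories_csv,
                   category_names_csv, language="en"):
    """Resolve an IMDb ID to category names.

    Hypothetical column layouts (assumptions, not the real dump format):
      links_csv:            movie_id, imdb_id
      movie_categories_csv: movie_id, category_id
      category_names_csv:   category_id, name, language
    """
    movie_ids = {row[0] for row in csv.reader(links_csv)
                 if len(row) > 1 and row[1] == imdb_id}
    category_ids = {row[1] for row in csv.reader(movie_categories_csv)
                    if len(row) > 1 and row[0] in movie_ids}
    return sorted(row[1] for row in csv.reader(category_names_csv)
                  if len(row) > 2 and row[0] in category_ids
                  and row[2] == language)


# Tiny in-memory example with made-up IDs and rows.
nfo = '<movie><uniqueid type="imdb">tt0088247</uniqueid></movie>'
links = io.StringIO("42,tt0088247\n")
movie_categories = io.StringIO("42,1\n42,2\n")
category_names = io.StringIO("1,Time Travel,en\n2,Cult Movie,en\n")
print(categories_for(imdb_id_from_nfo(nfo), links, movie_categories,
                     category_names))
```

Searching the raw nfo text with a regular expression avoids depending on one particular nfo schema, at the cost of possibly picking up an ID from an unrelated field.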

As my tags are a mess (the usual TMDB tags), I decided to kill 'em all. Long story short, after successful testing I ran the script against my UHD collection of 70 films.
The result: 167 different tags instead of 905.

[Image: Tags for Terminator 1]

I guess it would be more elegant and faster to update the database directly. I also haven't thought about keeping the tags that are actually useful. My regular movie database has 11,556 well-curated tags like 'accountant' and 'alchemist'. Any last words and/or suggestions before I begin the large-scale testing?


Killface69
3 hours ago, Luke said:

Hi, did you mean to post the script for others to try?

Yes I will, possibly later today. I just wanted to get some additional ideas and thoughts before trying it on a larger number of films, as the process is destructive.


Killface69
1 hour ago, Oracle said:

I'm happy to test this out on my library, @Killface69

Looking forward to the script being posted.

That's the spirit, and here we go!

OMDB local CSV scraper 2024-02-20

A Python script designed to populate the tags with information from OMDB, like awards, production, etc.

HowTo: Download the file and unzip it. Edit OMDB_GenreScraper.py and change line 6, directory = "M:\Movies", to match your film folder.
The terms are scraped in English if available. You can set any available language in line 35 => if row[language_column] == 'en': I suggest selecting either de or en, as most terms are available in those two languages. See category_names.csv for reference. There's currently no fallback for missing translations.
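Since there is no fallback for missing translations yet, here is one way it could be sketched. This is only an illustration: `load_category_names` is a hypothetical helper, and the column positions (ID, name, language) are assumptions rather than the real layout of category_names.csv.

```python
import csv
import io


def load_category_names(csv_file, preferred="de", fallback="en",
                        id_column=0, name_column=1, language_column=2):
    """Map category ID -> name, preferring one language and
    falling back to another when the preferred one is missing."""
    preferred_names = {}
    fallback_names = {}
    needed = max(id_column, name_column, language_column)
    for row in csv.reader(csv_file):
        if len(row) <= needed:
            continue  # skip blank or short rows
        if row[language_column] == preferred:
            preferred_names[row[id_column]] = row[name_column]
        elif row[language_column] == fallback:
            fallback_names[row[id_column]] = row[name_column]
    # Later keys win, so preferred names override fallback names.
    return {**fallback_names, **preferred_names}


# Made-up sample rows: category 2 has no German name.
sample = io.StringIO("1,Zeitreise,de\n1,Time Travel,en\n2,Cult Movie,en\n")
print(load_category_names(sample))
```

Category 1 keeps its German name, while category 2 falls back to English instead of being dropped.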

The script identifies a film by its IMDb ID or, failing that, its title. If a film is not found, there might be no entry on OMDB.org.
The bundled CSV files were downloaded from OMDB.org a few days ago, so the data won't update unless you replace the CSV files.
This script is destructive. It will delete all tags in an nfo file in any case and either repopulate them or leave them empty if no match is found.
If in doubt, use xcopy /s/d D:\YOURMOVIEFOLDER\*.nfo E:\BACKUPFOLDER to back up your nfo files with the folder structure intact.
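For anyone not on Windows, roughly the same backup can be sketched in Python. `backup_nfo_files` is a hypothetical helper, not part of the script; it copies every .nfo file into a mirrored folder tree and, similar to xcopy's /d switch, skips files whose backup is already at least as new.

```python
import shutil
from pathlib import Path


def backup_nfo_files(source_root, backup_root):
    """Copy all .nfo files below source_root to backup_root,
    keeping the relative folder structure. Returns the copied paths."""
    source_root = Path(source_root)
    backup_root = Path(backup_root)
    copied = []
    for nfo in source_root.rglob("*.nfo"):
        target = backup_root / nfo.relative_to(source_root)
        # Skip files whose backup is already up to date.
        if target.exists() and target.stat().st_mtime >= nfo.stat().st_mtime:
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(nfo, target)  # copy2 also preserves timestamps
        copied.append(target)
    return copied
```

A call might look like `backup_nfo_files(r"D:\YOURMOVIEFOLDER", r"E:\BACKUPFOLDER")`. Keep the backup folder outside the movie folder, otherwise the backups themselves would be picked up on the next run.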

Please consider donating and/or contributing to OMDB.org if you like my script.

Results - Main movie library: 4,582 films - 11,556 TMDB tags => 491 OMDB tags

ToDo: Maybe remove the genre from the tags, maybe add a hierarchy through channels?

omdb_local_csv_scraper.zip

Edited by Killface69
Added results

  • 1 month later...
Killface69

[Image: screenshot of the formatted tags]

Still working on the script. I decided to add TMDB tags and format them with Unicode icons for better grouping, and came up with something like what is depicted above.

Here are my formatted tags for 'Who Framed Roger Rabbit' as an example.
1940s, cartoon, los angeles, california, movie business, neo-noir, whodunit, ⭐ Classic, ⭐ Mainstream, ⭐ Milestone, ⭐ National Film Registry, ⭐ Popcorn Movie, ⭐🥇 Blockbuster, ⭕ Literature, ⭕📑 Novel, 🎞 Film in Film, 🎞 Happy End, 🏆 Academy Award - Winner, 🏭 Hollywood Film, 💰 Big Budget Film, 🦄 Hommage, 🦥 Maneater, 🧸 Funny, 🧸 Understated, 🧸 Uplifting

[Image: Some famous film]

It currently uses OMDB data combined with TMDB tags. The script adds a blockbuster/flop tag according to the film's budget and revenue, and it categorizes B-movies and low-budget films according to the budget. It's possible to set a minimum number of films for genres and tags. I've added a globe icon to countries, but am hesitant to also include e.g. "los angeles, california", as it would add lots of locations.
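To make the budget/revenue logic and the minimum film count concrete, here is a rough sketch. Every threshold and both function names are assumptions chosen for illustration; the script's real cut-offs are not stated in this thread, and the icons are left out to keep the example short.

```python
from collections import Counter

# Illustrative thresholds only (assumptions, not the script's values).
BIG_BUDGET = 100_000_000      # at or above: "Big Budget Film"
LOW_BUDGET = 5_000_000        # at or below: "Low Budget Film"
BLOCKBUSTER_RATIO = 3.0       # revenue of at least 3x the budget
FLOP_RATIO = 1.0              # revenue below the budget


def money_tags(budget, revenue):
    """Derive budget and box-office tags; unknown values (<= 0) yield no tag."""
    tags = []
    if budget <= 0:
        return tags
    if budget >= BIG_BUDGET:
        tags.append("Big Budget Film")
    elif budget <= LOW_BUDGET:
        tags.append("Low Budget Film")
    if revenue > 0:
        ratio = revenue / budget
        if ratio >= BLOCKBUSTER_RATIO:
            tags.append("Blockbuster")
        elif ratio < FLOP_RATIO:
            tags.append("Flop")
    return tags


def drop_rare_tags(tags_per_film, min_films=2):
    """Remove tags that are used by fewer than min_films films."""
    counts = Counter(tag for tags in tags_per_film.values()
                     for tag in set(tags))
    return {film: [tag for tag in tags if counts[tag] >= min_films]
            for film, tags in tags_per_film.items()}
```

Using the ratio of revenue to budget rather than absolute revenue keeps small films from being labelled flops just because their takings are modest.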

Any suggestions or ideas?

I'll provide a test version, but only if someone asks for it. I'm lazy 😛

Some random lines of code to get an overview:

icon_chacrater_tag = "🦥"
icon_superpowers_tag = "🦸"
icon_genres_tag = "🎭"
icon_source_literature_type_tag = "⭕📑"

    'based on memoir or autobiography': ['⭕ True Story'],
    'vampire': ['🧬 Vampire 🧛'],
    'zombie': ['🧬 Zombie 🧟'],
    'alien': ['🧬 Alien 👽'],
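To show how a mapping like the one above might be applied, here is a small sketch. The dictionary entries are taken from the lines above, but the names `TAG_MAP` and `format_tags` and the `keep_unmapped` option are hypothetical; the real script may work differently.

```python
# Entries copied from the snippet above; the variable name is an assumption.
TAG_MAP = {
    'based on memoir or autobiography': ['⭕ True Story'],
    'vampire': ['🧬 Vampire 🧛'],
    'zombie': ['🧬 Zombie 🧟'],
    'alien': ['🧬 Alien 👽'],
}


def format_tags(tmdb_tags, tag_map, keep_unmapped=True):
    """Replace known TMDB tags with their formatted versions.

    Unknown tags are kept in lower case (or dropped when
    keep_unmapped is False). Duplicates are removed.
    """
    formatted = set()
    for tag in tmdb_tags:
        key = tag.strip().lower()
        if key in tag_map:
            formatted.update(tag_map[key])
        elif keep_unmapped:
            formatted.add(key)
    return sorted(formatted)


print(format_tags(["Vampire", "1940s", "alien"], TAG_MAP))
```

Because the result is sorted by code point, plain tags come first and tags sharing an icon end up next to each other, which is what produces the grouping effect.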


user24

Hi there, I just wanted to say "nice job" with what you have achieved so far! It's very interesting...

I'm currently investigating a similar approach (nothing cleverly automated though) for grouping together Music Genres and Tags. At the moment I'm just using simple color-coded squares: 🟥🟦🟧🟨🟩🟪 to devise a workable system. I posted the basic idea here: Custom sort order for Genres and Tags.

So far, it's working reasonably well, and rapidly evolving, but I have a long way to go!

You've now got me thinking that it may be possible to automatically import Music Genres from MusicBrainz or Discogs (which Emby uses) or an equivalent/similar website (e.g. rateyourmusic.com) into Emby and add the prefix icons. Do you think this would be technically achievable, based on what you have managed to do so far with Movies and OMDB? I'm not asking you to implement it (unless you want to), I'm just curious about it, as I don't know Python and would need to learn a lot more to get started on something like this for myself.

Anyway, "keep up the good work" and you've given me some new ideas to explore!


Killface69

Cheers, I'm also having issues with music and proper genre tagging. (I must say that I switched to Plex for music only, as Plexamp is a really good player.)

I had the issue of too many single-use tags, and some genres like Rock are far too generic (Elvis Presley right next to Arch Enemy). I started using Beets for tagging, currently with the plugin whatlastgenre for genres. It gets the genres from Discogs, MusicBrainz and Last.fm and calculates the most fitting ones, with an optional whitelist and a maximum genre count.

It also has a find/replace function for the tags, so changing Rock to '🟥 Rock' and whitelisting this is possible.
There are currently two downsides: the first is that my Lidarr instance loses track of retagged music, and the second is that it works at the file-tag level and requires a rescan of the metadata.
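The general idea (merge weighted genre tags from several sources, apply find/replace, then whitelist and cap the count) can be sketched in plain Python as below. This is not whatlastgenre's actual algorithm or API; `pick_genres`, the weights and the "[R]" stand-in prefix are all made up for illustration.

```python
from collections import Counter


def pick_genres(source_tags, replacements, whitelist=None, max_genres=3):
    """Combine weighted genre tags from several sources.

    source_tags:  {source name: {genre: weight}}
    replacements: {genre: replacement}, e.g. to add a prefix icon
    whitelist:    optional set of allowed genres (after replacement)
    """
    scores = Counter()
    for tags in source_tags.values():
        for tag, weight in tags.items():
            name = tag.strip().title()           # normalise spelling
            name = replacements.get(name, name)  # find/replace step
            scores[name] += weight
    if whitelist is not None:
        scores = Counter({genre: score for genre, score in scores.items()
                          if genre in whitelist})
    return [genre for genre, _ in scores.most_common(max_genres)]


# Made-up weights; "[R] Rock" stands in for an icon-prefixed genre.
tags = {
    "discogs": {"rock": 2, "pop": 1},
    "musicbrainz": {"Rock": 3, "metal": 1.5},
}
print(pick_genres(tags, {"Rock": "[R] Rock"},
                  whitelist={"[R] Rock", "Metal"}, max_genres=2))
```

Applying the replacement before the whitelist check is why the whitelist has to contain the prefixed name, matching the "changing Rock and whitelisting this" step described above.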

Let me know if you need more information.


Killface69

/Edit: If you just want to edit the genre, you can skip Beets and directly use whatlastgenre. Always do dry-runs until you are satisfied with the result.


user24
8 hours ago, Killface69 said:

I started using Beets for tagging, currently with the plugin whatlastgenre for genres. It gets the genres from Discogs, MusicBrainz and Last.fm and calculates the most fitting ones, with an optional whitelist and a maximum genre count.

1 hour ago, Killface69 said:

If you just want to edit the genre, you can skip Beets and directly use whatlastgenre. Always do dry-runs until you are satisfied with the result.

This is good info that I didn't know about, and I'll aim to research it a bit more over the weekend. I'll also reply to some of your other comments when I get a bit of time... Thank you!


user24

Yes, Plex|Plexamp have some nice things that Emby does not have (yet). E.g. they have fields for Country and Style, above and below Genre. I've been setting up my Emby Genre|Tag fields with these (and Decades) and it's sufficient, but could be so much better with dedicated fields.

I like your icons. I was also thinking about using a world|globe symbol for Country. E.g. 🌐 could cover everything, but 🌏🌎🌍 could be used to group the world into three regions:

  • 🌎 North America and South America
  • 🌍 Europe and Africa
  • 🌏 Asia and Australasia

With my music, I guess about 50% is from USA, so I add States as well as Country for these Album Artists, but don't worry too much about this level of detail for other countries. Perhaps something similar could apply for movies?

Also, you could use one of these icons for Space 🛰️🚀🌌☄️ to separate Space from your Earth locations, if you wanted to.

I had a brief look at whatlastgenre and also came across another tool called bliss. Both look like they could automate genre tagging for me, if I decide I want to go down this path (thanks again for your info). Whenever I look at genre mapping, I nearly always come to the same conclusion: there is no "proper" way to tag genres, just what works best for the individual. So, do what you want and have fun with it!

Bliss has some old (but still very relevant) blog posts about weighted genre trees, and more, that are very interesting. I've been reading these to help consolidate my thinking about how I want to ultimately set up my music genres. I fully understand your "Elvis Presley" example and expect that many people would have a Pareto-like distribution of genres in their libraries (as I do); I have circa 10 top-level genres, but 20% of these genres (Blues|Rock) have 80% of my music. Therefore it makes sense for me to split these two into many sub-genres|styles and keep the rest at a single high level. However, someone with a Classical|Jazz focus would split these, but keep Blues|Rock at a single level. This then starts to get into another off-topic area of Styles (but there are other threads about this).

Anyway, as my icons and naming classifications evolve, I can at least relatively easily find|replace|change|update with MP3tag, if I don't get to the stage of using whatlastgenre or bliss. Cheers!
