Guest topbanana, posted July 20, 2023

To keep all my media filenames human readable, I get Filebot to replace all the illegal characters that Windows, etc., doesn't like with lookalikes. My code:

n.replace('<':'﹤','>':'﹥',':':'꞉','"':'“','/':'⁄','|':'⼁','?':'?','*':'﹡','\\':'∖')

This is a very common job for Filebot, and common in a lot of people's formats.

https://www.filebot.net/forums/viewtopic.php?p=42216#p42216
https://www.filebot.net/forums/viewtopic.php?p=53079#p53079

The colon is the most common illegal character that is needed in movie and TV show names. But when adding these items to emby, it often fails to identify them. Removing the colon replacement makes them immediately identifiable.

Is there a way to fudge emby to ignore these characters? And could emby be made aware of these common alternatives so that they're no longer a problem?

Cheers.
Luke, posted July 20, 2023

Hi there, can you please provide a specific example? Thanks.
Guest topbanana, posted July 21, 2023

I'm having to rebuild my whole server as I'm migrating from lots of external HDDs to BIG internal HDDs... And we don't have the ability to just change the path to the library... yet...

Here's a few:

Jump Shot꞉ The Kenny Sailors Story (2019) (720p).mkv
Kick Out the Jams꞉ The Story of XFM (2022) (720p).mkv
Life on Us꞉ A Microscopic Safari (2014) (720p).mkv
March of the Penguins 2꞉ The Next Step (L'empereur) (2017) (720p).mkv
Mercy, Love & Grace꞉ The Story of Force Blue (2017) (720p).mkv
Luke, posted July 21, 2023

That really shouldn't cause any problems with identification. Can you please attach the server log from importing one of these? Thanks.
Guest topbanana, posted July 21, 2023

Some do ID ok... But almost all of the unidentified items have the colon replacement character...

Here's the snippet of the humongous log that it's currently creating... The items either side of this one ID'd ok and got their metadata and images, etc.

2023-07-21 02:42:18.618 Info MediaProbeManager: ProcessRun 'ffprobe' Execute: C:\Users\Rich\AppData\Roaming\Emby-Server\system\ffprobe.exe -i file:"G:\Documentary Films\Jump Shot꞉ The Kenny Sailors Story (2019) (720p).mkv" -threads 0 -v info -print_format json -show_streams -show_chapters -show_format -show_data
2023-07-21 02:42:20.152 Info MediaProbeManager: ProcessRun 'ffprobe' Process exited with code 0 - Succeeded
2023-07-21 02:42:20.155 Info App: MovieDbProvider: Finding id for item: Jump Shot꞉ The Kenny Sailors Story
2023-07-21 02:42:20.156 Info HttpClient: GET https://api.themoviedb.org/3/search/movie?api_key=x_secret1_x&query=Jump Shot꞉ The Kenny Sailors Story&language=en&year=2019
2023-07-21 02:42:20.438 Info HttpClient: GET https://private.omdbapi.com?apikey=x_secret2_x&plot=full&r=json&y=2019&t=Jump Shot꞉ The Kenny Sailors Story&type=movie
2023-07-21 02:42:21.411 Info HttpClient: GET https://api4.thetvdb.com/v4/search?type=movie&q=Jump Shot꞉ The Kenny Sailors Story&year=2019
rbjtech, posted July 21, 2023 (edited)

I just use a '-' in place of these (illegal) separation type chars, or remove them entirely. It makes no difference to the emby identification and emby then uses the downloaded metadata to 'name' it (with whatever chars the provider has used) - it's just the filename that has them removed.

If you are trying to use a 'lookalike' extended character, then you are adding complications to the detection, as none of the providers will be using these.

Edited July 21, 2023 by rbjtech
Guest topbanana, posted July 21, 2023

6 hours ago, rbjtech said: "I just use a '-' in place of these (illegal) separation type chars, or remove them entirely. It makes no difference to the emby identification and emby then uses the downloaded metadata to 'name' it (with whatever chars the provider has used) - it's just the filename that has them removed."

I used to use '-' to replace all the illegal characters. Or just remove them, like with '?'... But then the filenames look wrong, read wrong. So I simply took one of the many, many examples of Filebot users' code from their forums to replace them with lookalikes. (I trust you know of Filebot? It's truly amazing!)

As I said in my OP, I name my media files to be human readable first, with the basic naming convention that is used everywhere that mentions movies, etc. Name & Year... With only resolution and audio channels tacked onto the end:

Jump Shot꞉ The Kenny Sailors Story (2019) (720p).mkv

So my HDD of movies can be perused and it all just looks like a simple, clean list of movie names.

6 hours ago, rbjtech said: "If you are trying to use a 'lookalike' extended character - then you are adding in complications to the detection - as none of the providers will be using these."

Sure. But as Luke hints at, they shouldn't interfere with the identification... Hence his amazement. And they don't need to complicate the detection, as they can be 'managed' by both emby and the providers. Stripping certain characters out of search strings before submitting them is pretty standard stuff. Normal programming stuff.

Emby perhaps just needs a few more characters added to the list of 'illegal' characters that it strips out, or this character stripping feature added. The characters I'm now using are commonplace replacements for characters that are illegal on the various filesystems and in programming languages.

Everyone uses different, weird and wonderful naming schemes and emby's goal, and our wish, is that it 'just works', within reason.
Luke, posted July 21, 2023 (marked as Solution)

OK yes I do see one improvement that can be made here. Thanks.
rbjtech, posted July 22, 2023 (edited)

I have no need to use 'filebot' as all naming rules are dealt with fully automatically by the source applications - they remove/replace most illegal characters, and automatically add filename details, provider id and codecs into the filename (from provider metadata) before even writing it to the file system (and thus before emby even sees them).

Extended 'fake' characters just complicate things imo - in fact things like the fake apostrophes really screw up some common utilities (such as ffmpeg, mkvtoolnix etc) in scripting code as they then don't comply with UTF-8 for example.

imho - what is displayed on the emby user screens is what's important - the filesystem just needs to display a functional name only - compatibility and portability are more important than how it looks.

Edited July 22, 2023 by rbjtech
Guest topbanana, posted July 22, 2023

16 minutes ago, rbjtech said: "I have no need to use 'filebot' as all naming rules are dealt with fully automatically by the source applications - they remove/replace most illegal characters, and automatically add filename details, provider id and codecs into the filename (from provider metadata) before even writing it to the file system (and thus before emby even sees them)."

It's ok, mate. This little issue probably doesn't affect you at all then.

Anyhoo... As replacement characters are sometimes used, along with other characters that might cause problems, it seems logical, and fairly easy, to make emby aware of them, and to make it such that they simply don't matter.

It's the amazing power of software... It can be made to deal with whatever we throw at it.
rbjtech, posted July 22, 2023

1 minute ago, topbanana said: "It's the amazing power of software... It can be made to deal with whatever we throw at it."

True - and I have no issues with your thoughts, but sometimes what you think might be a 'good idea' impacts others in ways you may never have considered. Keeping to agreed standards (ASCII etc) is always the best route for simple, predictable logic.
Guest topbanana, posted July 22, 2023

1 hour ago, rbjtech said: "Keeping to agreed standards (ASCII etc) is always the best route for simple, predictable logic."

People who speak different languages, with different alphabets, also make movies... So we're 'stuck' with Unicode going forward I guess.
pwhodges, posted July 22, 2023

2 hours ago, rbjtech said: "Extended 'fake' characters just complicate things imo - in fact things like the fake apostrophes really screw up some common utilities (such as ffmpeg, mkvtoolnix etc) in scripting code as they then don't comply with UTF-8 for example."

Eh? Unicode (of which UTF8 is merely the commonest representation) is the standard for characters - ASCII (which you mention in another post) is a historical artefact now. If any utilities are confused by characters they don't explicitly make decisions on (i.e. 99% of Unicode), then they simply need fixing.

Paul
rbjtech, posted July 22, 2023

2 minutes ago, pwhodges said: "Eh? Unicode (of which UTF8 is merely the commonest representation) is the standard for characters - ASCII (which you mention in another post) is a historical artefact now. If any utilities are confused by characters they don't explicitly make decisions on (i.e. 99% of Unicode), then they simply need fixing. Paul"

My point is that a char was used that the modern utilities did not like - they converted it into a multi-char 'version' which then failed on a lookup. Maybe the utility was only UTF8 aware, and this was something different.

Anyway - it doesn't really matter - the point of the discussion is: don't try to use 'system or illegal' characters in filenames - it just causes unnecessary issues.
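One way a single lookalike can end up as a multi-char 'version' can be sketched in Python. This is only an illustration under an assumption: the thread does not say which utility or encoding was involved, so the Latin-1 misread below is a guess at the failure mode, not a diagnosis. The replacement colon (U+A789) is ordinary Unicode and takes three bytes in UTF-8; a tool that reads those bytes with a single-byte encoding sees three unrelated characters instead of one.

```python
# The replacement colon from the filenames in this thread: U+A789.
colon_lookalike = "\ua789"

# Encode it the way a UTF-8 filesystem path or command line would carry it.
utf8_bytes = colon_lookalike.encode("utf-8")
print(utf8_bytes.hex(" "))  # shows the individual UTF-8 bytes

# Assumed failure mode: the same bytes read with a single-byte encoding.
misread = utf8_bytes.decode("latin-1")
print(len(colon_lookalike), "character became", len(misread))

# Decoding with the right encoding round-trips cleanly.
assert utf8_bytes.decode("utf-8") == colon_lookalike
```

If that is what happened, the character itself is valid UTF-8 and the problem lies in how the bytes were decoded.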
Luke, posted July 28, 2023

On 7/21/2023 at 2:26 AM, topbanana said: "Some do ID ok... But almost all of the unidentified items have the colon replacement character... Here's the snippet of the humongous log that it's currently creating... The items either side of this one ID'd ok and got their metadata and images, etc."

2023-07-21 02:42:18.618 Info MediaProbeManager: ProcessRun 'ffprobe' Execute: C:\Users\Rich\AppData\Roaming\Emby-Server\system\ffprobe.exe -i file:"G:\Documentary Films\Jump Shot꞉ The Kenny Sailors Story (2019) (720p).mkv" -threads 0 -v info -print_format json -show_streams -show_chapters -show_format -show_data
2023-07-21 02:42:20.152 Info MediaProbeManager: ProcessRun 'ffprobe' Process exited with code 0 - Succeeded
2023-07-21 02:42:20.155 Info App: MovieDbProvider: Finding id for item: Jump Shot꞉ The Kenny Sailors Story
2023-07-21 02:42:20.156 Info HttpClient: GET https://api.themoviedb.org/3/search/movie?api_key=x_secret1_x&query=Jump Shot꞉ The Kenny Sailors Story&language=en&year=2019
2023-07-21 02:42:20.438 Info HttpClient: GET https://private.omdbapi.com?apikey=x_secret2_x&plot=full&r=json&y=2019&t=Jump Shot꞉ The Kenny Sailors Story&type=movie
2023-07-21 02:42:21.411 Info HttpClient: GET https://api4.thetvdb.com/v4/search?type=movie&q=Jump Shot꞉ The Kenny Sailors Story&year=2019

Moviedb does return results for this example, even with the colon.
Guest topbanana, posted November 17, 2023

Emby continues to have problems with the colon replacement characters.

And yes, if I manually Identify the item, copying and pasting the exact name from the file, with the replacement colon in it, it often does find the proper match! BUT. It just doesn't find it when emby is doing it automatically...

How is the process different between identifying media as it is added to emby and manually identifying it?

As there is a finite list of these replacement characters used by us media hoarders (as detailed in the FileBot forums), could emby just have a short character replacement list that it uses to correct the names before submitting the queries to the Moviedb, et al? So if it finds a replacement colon, it replaces it with a normal colon... If it finds a replacement question mark, it replaces it with a normal question mark?

My OP has the common selection of replacement characters that are used. I could help search to find the others that are used.

I've had to rebuild my doco season library (due to a path change! lol), which is over 1000 shows... I've now got dozens of shows with colons to manually ID. This is not fun.
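The replacement list suggested above can be sketched in a few lines of Python. This is a minimal illustration, not Emby's code: the names `LOOKALIKES` and `normalize_title` are hypothetical, and the table only covers lookalikes quoted in this thread. The question-mark lookalike is left out because the mapping in the opening post does not show a distinct character for it. The lookalikes are mapped back to plain characters in the search string only; the filename on disk is left alone.

```python
# Hypothetical lookup table: lookalike character -> plain character.
# Entries come from the FileBot mapping in the opening post, plus the
# big solidus (U+29F8) that another poster in the thread uses for "/".
LOOKALIKES = {
    "﹤": "<",
    "﹥": ">",
    "꞉": ":",
    "“": '"',
    "⁄": "/",
    "⧸": "/",
    "⼁": "|",
    "﹡": "*",
    "∖": "\\",
}

# str.maketrans accepts a dict whose keys are single characters.
_TABLE = str.maketrans(LOOKALIKES)


def normalize_title(name: str) -> str:
    """Return the name with known lookalikes mapped back to plain characters."""
    return name.translate(_TABLE)


# The title from the log excerpt earlier in the thread.
print(normalize_title("Jump Shot꞉ The Kenny Sailors Story"))
```

One caveat with a fixed table like this: a title that legitimately contains one of these characters (a curly opening quote, say) would also be rewritten, so a real implementation might apply it only as a fallback when the first search returns nothing.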
Luke, posted November 17, 2023

Hi there, can you please provide a specific example? See: How to Report a Problem. Thanks.
blgentry, posted November 17, 2023

@topbanana As @rbjtech has tried to illustrate, your problem is caused by your insistence on using weirdo unicode characters. Make no mistake: Substituting obscure characters that resemble others is bound to cause problems.

If you can get over the idea that your file names have to be "really really correct", you will have a much easier time with everything. Just eliminate the problem characters altogether. Or map them to a common easily recognized character.

I can hear you yelling at the screen that I'm wrong. My intention here is to make your life easier. Would you rather be pure and frustrated or make a compromise and have things be easy for you? The choice is yours.

Brian.
Guest topbanana, posted November 19, 2023

On 18/11/2023 at 01:47, blgentry said: "@topbanana As @rbjtech has tried to illustrate, your problem is caused by your insistence on using weirdo unicode characters. Make no mistake: Substituting obscure characters that resemble others is bound to cause problems. If you can get over the idea that your file names have to be 'really really correct', you will have a much easier time with everything. Just eliminate the problem characters altogether. Or map them to a common easily recognized character. I can hear you yelling at the screen that I'm wrong. My intention here is to make your life easier. Would you rather be pure and frustrated or make a compromise and have things be easy for you? The choice is yours. Brian."

Our intention with these forums is to make emby better. That includes making it 'just work'. Software is all about doing this.

Should we just give up and settle for the 1 or 2 ways that it just has to be? Or should emby adapt to what people tend to use out in the real world?

If you were to look at the Filebot forums, and the renaming codes they use, you'd notice that replacing illegal characters is quite common... So I'm not 'unique' in doing this.

I'm sure the emby team would love their product to be the best available... To be able to identify everything we throw at it...

@Luke I've given you examples. I've got dozens and dozens of series and movies that emby can't ID during normal adding. And, again, replacing these characters is something the data/media hoarders often do... I guess it's up to you if you can be arsed to make emby better in this way...

... OK. I give up.
pwhodges, posted November 19, 2023

Do you want Emby to concentrate on playing media efficiently in a huge range of circumstances, or do you want it to compromise that feature by spending time duplicating the excellent name-handling abilities of programs like FileBot, which already specialise in doing that?

Emby, like all groups, has a finite amount of effort available, and would prefer to use most of its efforts on its core functionality, leaving unusual name handling to others, at least for now. And many Emby users also use a program like FileBot to help them curate their library.

Note, I said "unusual", because it is clear that most Emby users have little problem naming their files in ways that Emby already handles just fine.

Paul
viking19, posted November 19, 2023

@topbanana I'm with you in doing this. I've been replacing illegal characters in my music library for years, mostly / to ⧸ (29F8) and : to ꞉ (A789). I pre-tag everything with MusicBrainz or Mp3Tag so Emby doesn't need to identify anything.

If you added the IMDB ID to the folder name, would that help? Probably a real pain to do with a large library unless there is an add-on or you use a script.
Luke, posted November 19, 2023

OK we do already normalize a number of these characters but you've identified a few more that we can add, so we'll do that. Thanks.