Episode metadata detection for TV series


softworkz

Hi,

 

I have quite a few TV series where the episodes are named like series-s01e01-1080p.mkv.

This leads MB3 to think that this is a multi-episode file covering episodes 1 to 108. As a consequence, the episode titles are set to a comma-separated string of all episodes.
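To illustrate where the "108" comes from, here is a minimal Python sketch. The pattern is a hypothetical simplification, not the actual expression MB3 uses. It only assumes that the optional ending episode number is capped at three digits, which would explain why "-1080p" is read as ending episode 108.

```python
import re

# Hypothetical, simplified pattern (NOT the real MB3 expression).
# Assumption: the optional "-NNN" ending episode accepts at most three digits.
NAIVE_PATTERN = re.compile(
    r"[sS](?P<season>\d{1,2})[eE](?P<episode>\d{1,3})"
    r"(?:-[eE]?(?P<ending>\d{1,3}))?"
)


def parse_naive(file_name):
    """Return (season, episode, ending_episode_or_None), or None if no match."""
    match = NAIVE_PATTERN.search(file_name)
    if match is None:
        return None
    ending = match.group("ending")
    return (
        int(match.group("season")),
        int(match.group("episode")),
        int(ending) if ending is not None else None,
    )


print(parse_naive("series-s01e01-1080p.mkv"))  # (1, 1, 108)
print(parse_naive("series-s01e01-e02.mkv"))    # (1, 1, 2)
print(parse_naive("series.s01e01.1080p.mkv"))  # (1, 1, None)
```

In this sketch the first three digits of "1080" are captured as the ending episode, and nothing checks what follows them.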

 

I know that I could rename all files to fix this, but I am using Auto-Organize and would like to avoid the renaming. Is there any way to make this kind of file name work correctly?

 

I tried to change the Multi-Episode pattern in the Auto-Organize settings to a random pattern, but this does not seem to be relevant for metadata updates. How can I work around this? Possible solutions for me would be:

  • Disable Multi-Episode handling globally
  • Change the detection pattern for Multi-Episodes through configuration
  • Require a separator or at least non-letter character to be present after the second episode number (which would rule out things like 720p, 1080i or 4k)
  • Disallow episode spanning of more than 100 Episodes

Thanks for any help,

 

softworkz


Under the auto organize tab you should be able to monitor a folder.

 

Once your folder is being monitored, place your MKV file into that folder and run the auto organize task.

 

Go to the auto organize tab in the server.

 

Your file will probably be highlighted in red.

 

On the left you will see an "X" button. Press the button next to it and it will allow you to edit which series the file belongs to, and it will also allow you to edit the dual season option. Clear the "final episode" text field, which will probably contain the number "108".

 

Press "OK"

 

 

It is possible that the auto organize feature could be fixed; it is a bug I have noticed as well.

Edited by chef

I think you are going to need to rename them because the format you are using matches our multi-episode convention (as you found out).  Probably as simple as changing the '-' to a '.'.


yea i can foresee eventual exposing of some of these advanced settings but probably not in the immediate short term, so Ebr is correct


@Luke

 

I think we all know here that he hasn't got control over the naming of these files.

 

Is there any way to ignore the string: "-1080"?

 

After all, unless it is some sort of daytime soap opera, I have never seen a series with that many episodes. I don't think the auto organize would break if that string were omitted.


i mean what did i just say previously. there's no bug here. there's no way to change the parsing rules right now so if he needs an immediate answer, do what Ebr said.


Hi,

 

thanks for all your replies!

 

It would surely be great to be able to configure the parsing rules, which is quite a bit of work, I understand that.

 

But @Luke: How about my suggestion "Require a separator or at least non-letter character to be present after the second episode number (which would rule out things like 720p, 1080i or 4k)"? This would take just a few lines of code and no configuration UI or similar. And it wouldn't break anything; it would just harden the parsing code.
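A minimal Python sketch of that suggestion, under stated assumptions: the pattern is hypothetical (not the project's real expression), and the three-digit cap on the ending number is assumed. Because of that cap, the guard in this sketch rejects a following digit as well as a following letter; otherwise "1080p" would still yield 108 followed by "0".

```python
import re

# Hypothetical pattern (NOT the project's real expression).
# The negative lookahead (?![0-9A-Za-z]) only accepts an ending episode number
# that is followed by a separator or by the end of the string.
HARDENED_PATTERN = re.compile(
    r"[sS](?P<season>\d{1,2})[eE](?P<episode>\d{1,3})"
    r"(?:-[eE]?(?P<ending>\d{1,3})(?![0-9A-Za-z]))?"
)


def parse_hardened(file_name):
    """Return (season, episode, ending_episode_or_None), or None if no match."""
    match = HARDENED_PATTERN.search(file_name)
    if match is None:
        return None
    ending = match.group("ending")
    return (
        int(match.group("season")),
        int(match.group("episode")),
        int(ending) if ending is not None else None,
    )


print(parse_hardened("series-s01e01-1080p.mkv"))  # (1, 1, None)
print(parse_hardened("series-s01e01-720p.mkv"))   # (1, 1, None)
print(parse_hardened("series-s01e01-4k.mkv"))     # (1, 1, None)
print(parse_hardened("series-s01e01-e02.mkv"))    # (1, 1, 2)
```

Genuine multi-episode names such as "s01e01-e02" still parse as a range, because the ending number there is followed by a "." separator.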

 

Thanks again,

 

softworkz


It's great to have new contributors. In this case it depends on how you're going to do it. If you're going to work on it within the vision of what i'm planning on then yes, but if you're just looking to add a one-off option such as that, then no. Reason being it is all based on regular expressions. The shared library is configurable and the configuration needs to be exposed in the server's web interface. It is the expressions themselves that are going to be exposed, for instance, all of these:

 

https://github.com/MediaBrowser/MediaBrowser.Naming/blob/master/MediaBrowser.Naming/Common/NamingOptions.cs#L276

 

So adding that type of option is difficult given that the end goal is to expose the actual expressions. The other thing is the NamingOptions object is quite large. I really haven't taken the time to think about where in the web interface the configuration would live, and also questions about whether parsing the library vs. auto-organize should get their own parsing settings, in which case it needs to be re-usable for both. So there's complexity at different levels here. 


Yes, I have just noticed that there is a bit of confusion with the auto-organize patterns. I have just entered 'abc' as multi-episode pattern, but when I click on "Organize file" for a failed auto-organize item, it still assumes 720p as the end-episode.

I think a decision on whether auto-organize can have its own patterns must be made first (and if yes, then they should also be used when clicking the button on a failed auto-organize item).

 

Since you are using regular expressions, it is not as easy as I thought. An easy solution could be an option to just disable multi-episode handling.

 

I would really like to contribute, but it would not make any sense to me to implement a change that does not get merged to main, so of course it is up to you to decide what would be an acceptable solution (e.g. if you would feel comfortable with a "disable multi-episode handling" option). We could discuss other options privately as well.


  • 4 weeks later...
SirBobLXIX

I'm having a different issue with naming convention, and can't find any info in my searches.

 

I have lots of media with a leading "The" moved to the end, for better file system sorting.

  ex: The Usual Suspects = Usual Suspects, The

 

Is there any way to have Media Browser identify these items automatically without either:

  1.    renaming the media from the existing structure
  2.    manually identifying the shows and movies in Media Browser
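For illustration only, the kind of name normalization that would make such titles matchable could look like the Python sketch below. It is not a Media Browser feature or API. The helper name and the list of articles are assumptions, and names with a trailing year or other suffix are not handled.

```python
import re

# Hypothetical helper, not part of Media Browser.
# Assumption: only the English articles "The", "A" and "An" are moved back.
TRAILING_ARTICLE = re.compile(
    r"^(?P<title>.+),\s*(?P<article>The|A|An)$", re.IGNORECASE
)


def restore_leading_article(name):
    """Turn 'Usual Suspects, The' back into 'The Usual Suspects'."""
    cleaned = name.strip()
    match = TRAILING_ARTICLE.match(cleaned)
    if match is None:
        return cleaned  # nothing to move, keep the name unchanged
    return f"{match.group('article')} {match.group('title')}"


print(restore_leading_article("Usual Suspects, The"))  # The Usual Suspects
print(restore_leading_article("Alien"))                # Alien
```

The normalized name would only be used for the metadata lookup, so the files on disk could keep their sort-friendly names.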

If this has been answered, or I erroneously hijacked a thread, my search-fu was weak.

Edited by SirBobLXIX

SirBobLXIX

2 is implemented, just open up the media-editor and press identify.

 

Any way to have Media Browser without either:

  1.    renaming the media from the existing structure
  2.    manually identify the shows and movies in Media Browser

 

 

I don't want to have to manually identify hundreds, if not thousands of items, in Media Browser. If that's a current limitation, then I'll have to wait until it develops more robust identify options.

 

I've edited my original post, as it may have been unclear due to poor phrasing.

[Image: archer-phrasing.jpg]

The image is an attempt to lighten my mood. I originally started to identify items manually, but severely underestimated the number of items to be fixed.


  • 7 months later...

Hi Guys,

 

sorry to bring this topic up again, but the automatic detection of 720 or 1080 (=> 108) as the ending episode number is still bugging me over and over again.

Not only does it require manual correction, it also imposes a serious performance and resource penalty, since any episode 'n' is treated as a complete stream comprising episodes 'n' to 108, and all episode information is aggregated onto a single item.

 

I learned from the previous discussion that it wouldn't be a trivial task to change this behaviour, since the parsing is performed by evaluating regular expressions.

 

BUT: Recent investigations have cast some doubt on this claim. This is my current Episode file pattern setting:

 

[Screenshot: 55f5c5d488429_Emby_SS1.png]

 

 

Since my Multi-Episode-Pattern is empty, I would expect that Emby no longer detects any episodes as multi-episodes.

But this is not the case. Instead, Emby keeps categorizing ordinary episodes as mega-multi-episodes comprising tens or hundreds of episodes just because of this "-720" or "-1080" substring.

 

I understand that it might be difficult to intervene programmatically in the RegEx behaviour. But on the other hand, I am quite sure that there shouldn't be any difficulty in disabling the "End-Episode-Number" detection in cases where the user has not configured a respective RegEx pattern for this case (like me).

 

I wonder anyway: In my filename configuration patterns, there is no ending episode token. So why would Emby perform a different detection and what kind of rules would it apply in this case?

 

Additionally, I noticed that this (probably erroneous) behaviour is applied not only during "Auto-Organize" but also to episode files which are copied directly into the corresponding season folders.

 

I am quite sure that we could find a viable remedy without redesigning the mechanism itself. In the simplest implementation, any end-episode number would be ignored when there is no "multi-episode-pattern" defined.

 

Best,

 

softworkz


The pattern is the output, not the input. The input is not customizable at this time; however, it matches what other apps like Kodi support. As far as blanking out the pattern goes, I'm surprised the UI let you do that. I will have to make sure it doesn't. Blanking it out was never intended, therefore I'm sure there are unexpected consequences.


Hey Luke,

 

aha, now I understand! I always thought that this would be the pattern for parsing.

 

Hm, I am not quite sure why you seemingly do not consider the described problem a priority issue (probably you are not facing it yourself?).

 

In previous postings we talked about changing the parsing rules, which would still be a good idea to implement, but would take some time without any doubt.

 

But a command line parameter causing Emby to not evaluate any ending episode number, or to never look for or identify any ending episode number, would really help.

 

In day-to-day usage I am dealing with countless series episodes that need some manual correction for the ending episode number because they were a "victim" of the "720" or "1080" parsing problem...

 

softworkz


you haven't given any examples. it works fine with sonarr patterns and that right now is the leading app that names this way, or one of them at least.


Hi Luke,

 

just some content selected from current streams:

 

Folder: Bones S09E09 Die Wut der Geschworenen GERMAN DL DUBBED 1080p WebHD x264-TVP_{{

Filename: tvp-bones-s09e09-1080p.mkv

 

Folder: Marvels.Agents.of.S.H.I.E.L.D.S02E14.Love.in.the.Time.of.Hydra.GERMAN.DL.DUBBED.1080p.WebHD

Filename: tvp-shield-s02e14-1080p.mkv

 

These are just two current examples. Each of these two episodes is treated as a multi-episode epic, which it is not, of course.


There are really many files with this kind of naming scheme.

I could even live with completely deactivating multi-episode detection. It's a benefit in very rare cases but an annoyance in many others...

 

Thanks,

 

softworkz


  • 3 weeks later...
  • 8 months later...
Warsen

You made a mistake, @softworkz. The code you introduced in commit 8e71102 (Source) assumes that there is a next character after the ending episode group match, so line 99 of TV/EpisodePathParser.cs can throw an ArgumentOutOfRangeException when nextIndex is already at the end of the string.

 

So what happened afterwards is that Luke added additional tests in commit a90005e (Source). He ran into that exception and created his own solution to the exception which is a hack that doesn't actually solve the underlying problem. In TV/EpisodePathParser.cs he appended ".mp4" to folder path parsing.

 

Just keep in mind that before you check the next character in a string, you have to make sure you are not already at the end of the string.
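As a rough Python analogue of the problem (the real code is C#, and this is not a copy of it): indexing one position past the end of a string fails, so the bound has to be checked first. The function names and the set of "resolution" characters below are assumptions made for this sketch.

```python
# Characters that suggest a resolution such as "1080p" or "720i" rather than an
# ending episode number. Assumption for this sketch, not taken from the source.
RESOLUTION_FOLLOWERS = "0123456789iIpP"


def ending_is_plausible_unsafe(name, next_index):
    # Flawed version: raises IndexError when next_index == len(name).
    return name[next_index] not in RESOLUTION_FOLLOWERS


def ending_is_plausible(name, next_index):
    # Check the bound first. At the end of the string nothing follows the
    # number, so it cannot be part of a resolution and is accepted.
    return next_index >= len(name) or name[next_index] not in RESOLUTION_FOLLOWERS


folder_name = "Show S01E01-02"  # folder-style name without a file extension
try:
    ending_is_plausible_unsafe(folder_name, len(folder_name))
except IndexError:
    print("unsafe check failed at end of string")

print(ending_is_plausible(folder_name, len(folder_name)))  # True

file_name = "series-s01e01-1080p.mkv"
print(ending_is_plausible(file_name, file_name.index("108") + 3))  # False
```

The short-circuiting `or` keeps the index access from ever running when the match ends exactly at the end of the string.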

I'll try to fix it and create a pull request. I'm new at using GitHub (contributing to any open source) and need to figure out how to do this process anyways.


Good find, @Warsen!

 

At the time of writing that code, I honestly didn't even think of the possibility that one could have episodes in separate folders where the folder name is used for identification.

All the unit tests we had were using filenames, not folder names, so the bug didn't show up.

 

What you found is a rare case (probably even impossible case after the appended mp4 extension), but still a flaw in the code without question.

Feel free to create and commit a fix for it!

 

You can contact me if you need help with the process around GitHub!

 

PS: One of your assumptions is probably wrong though: I don't think that @Luke added the ".mp4" as a "hacky" fix. I suppose he added the extension because some of the regular expressions used for parsing wouldn't work correctly for a string without an extension.
