
OpenSubtitles API call - how to identify filename from moviehash?


Solved by SolidSnake289


SolidSnake289
Posted (edited)

Good day Emby team,

I'm currently running Emby 4.8.10.0 with a Lifetime Premiere license. I have a standard OpenSubtitles account, so I'm limited to 100 downloads per day (no problem there).

The problem is that Emby repeatedly tries to download subtitles for files it apparently cannot find matches for, which needlessly consumes my daily quota. Further, some files are named in the logs, while others show only the "moviehash" in the API call. This brings me to 2 specific questions:

  1. How do I identify the exact filenames in my Library associated with an API call such as the example below? I went to the (old) OpenSubtitles website (.org) and found "MovieHash" in their "Very-Advanced Search", but none of the MovieHash IDs from my log (I checked 10) return any results. So I'm at a loss for identifying these files.
    2025-03-09 11:48:11.587 Info HttpClient: GET https://api.opensubtitles.com/api/v1/subtitles?languages=en&moviehash=a57367d2c203332f

     

  2. As noted above, some filenames are specifically called out in the logs, as seen below. I thought I could get around this by creating an empty (0-kilobyte) SRT file in the same directory with a matching filename (example attached). However, Emby continues searching OpenSubtitles for these files despite the empty file. Do the SRT files have to contain some minimum text/content for Emby to consider the subtitle "real" and skip the OpenSubtitles search?
    2025-03-09 11:48:02.717 Info HttpClient: GET https://api.opensubtitles.com/api/v1/subtitles?imdb_id=249854&languages=en&moviehash=7e8920bfd056ee2f&query=George Carlin - 1977 - On Location at USC.mp4
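
To compare the two kinds of log line above, the request parameters can be pulled out of each one. This is a minimal Python sketch, assuming the log lines look exactly like the two examples in this post; the function name `parse_subtitle_request` is made up for illustration and is not part of Emby or OpenSubtitles.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def parse_subtitle_request(line: str) -> Optional[dict]:
    """Extract moviehash/query/imdb_id from an 'HttpClient: GET <url>' log line.

    Returns None for lines that are not subtitle search requests.
    """
    marker = "HttpClient: GET "
    start = line.find(marker)
    if start == -1:
        return None
    url = line[start + len(marker):].strip()
    parts = urlsplit(url)
    if not parts.path.endswith("/subtitles"):
        return None
    # The logged URL is not percent-encoded, so a filename containing '&' or '#'
    # would be cut short here (assumption based on the examples above).
    params = parse_qs(parts.query)
    return {
        "moviehash": params.get("moviehash", [None])[0],
        "query": params.get("query", [None])[0],
        "imdb_id": params.get("imdb_id", [None])[0],
    }
```

Lines where `query` comes back as `None` are the ones that carry only a hash, which are the hard ones to trace back to a file.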

I recognize I could "stop" the sub search by disabling subtitles on a given library/content folder, but without being able to identify the affected files, it would be a game of whack-a-mole: disable subs on a certain library, wait for a library scan (because of the 24-hour quota), see if the log is full of overruns again, etc.

 

Many thanks!

GeorgeCarlin_SRT_file.png

Edited by SolidSnake289
Added server version
Posted

Hello SolidSnake289,

** This is an auto reply **

Please wait for someone from staff support or our members to reply to you.

It's recommended to provide more info, as explained in this thread:


Thank you.

Emby Team

GrimReaper
Posted
28 minutes ago, SolidSnake289 said:

I recognize I could "stop" the sub search by disabling subtitles on a given library/content folder

You could also set subtitle age for search (per-library setting), for example limit to items added during last week or last month?

Posted

If you've set your download language to English try renaming the srt to en.srt instead, this should prevent it from searching since it thinks there's already an English subtitle (even if the file is empty).
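
For anyone who wants to script the renaming, here is a rough Python sketch of that naming convention. It assumes the sidecar file sits next to the video and is named `<video name>.en.srt`; the helper names are made up, and whether Emby accepts an empty file as an existing subtitle is only the suggestion above, not something confirmed in this thread.

```python
from pathlib import Path


def sidecar_subtitle_path(video: Path, lang: str = "en") -> Path:
    """Return '<video name without extension>.<lang>.srt' in the video's folder."""
    return video.with_name(f"{video.stem}.{lang}.srt")


def ensure_placeholder(video: Path, lang: str = "en") -> bool:
    """Create an empty language-tagged .srt unless one already exists.

    Returns True if a new (empty) file was created.
    """
    target = sidecar_subtitle_path(video, lang)
    if target.exists():
        return False
    target.touch()
    return True
```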

Posted

You can also set the max age of files to search for in library options. That's another way to get it to stop searching repeatedly for the same content over and over.

SolidSnake289
Posted
19 minutes ago, Lessaj said:

If you've set your download language to English try renaming the srt to en.srt instead, this should prevent it from searching since it thinks there's already an English subtitle (even if the file is empty).

Thank you for the suggestion, that's a good point, I forgot about the ".en" naming convention.  I've added it to the files that are explicitly named in the logs, I'll see if it makes a difference after tonight. 

For those that suggested the max age of files - thank you for the input, that's a great workaround for "get the errors out of the log file / brush the dust under the carpet".  However, it does not help me identify which files are actually missing subtitles for whatever reason.  So yes, while the log file would become error-free, I'm still likely to bump into content one day that's missing subtitles, and I won't know the problem exists until the moment I hit "Play" and have the unpleasant discovery. 

Is anyone actually familiar with the OpenSubs API call and how the hash is generated from a specific filename?  Or is it closed-source to the OpenSubs team?  Meaning, Emby just consumes OpenSub's code and its functionality is a blackbox to everyone here? 

Cheers,

-Snake

Posted

OK well you hadn't mentioned any errors. It sounded like you just wanted to get the number of requests down.

SolidSnake289
Posted
23 minutes ago, Luke said:

OK well you hadn't mentioned any errors. It sounded like you just wanted to get the number of requests down.

Fair point of specificity. There are no errors (up until the RateLimitExceededException 🙃), so "error-free" was a misnomer.

Clarified and re-stated:  yes, my goal is getting the number of requests down, by properly identifying the filenames that continually and repeatedly consume my daily request quota. 
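
One way to see which requests repeat is to count how often each moviehash shows up across several days of logs. A small Python sketch, assuming the `moviehash=` parameter always appears as a 16-digit hex value as in the log lines quoted in this thread; `count_hash_requests` is a made-up name.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the moviehash parameter in a logged subtitle search URL.
MOVIEHASH_RE = re.compile(r"[?&]moviehash=([0-9a-fA-F]{16})")


def count_hash_requests(lines: Iterable[str]) -> Counter:
    """Count subtitle search requests per moviehash across log lines."""
    counts: Counter = Counter()
    for line in lines:
        match = MOVIEHASH_RE.search(line)
        if match:
            counts[match.group(1).lower()] += 1
    return counts
```

Feeding it the lines from several nightly logs and looking at `counts.most_common()` would show the hashes that are requested night after night.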

SolidSnake289
Posted
19 hours ago, adminExitium said:

Thank you for the search effort.  However, that seems possibly deprecated (part of the .org feature sunset in 2023).  They're now using the RESTful API:

https://forum.opensubtitles.org/viewtopic.php?t=17930

Which directs you to here:

https://opensubtitles.stoplight.io/docs/opensubtitles-api/e3750fd63a100-getting-started

Which then has this link for "OSDB MovieHash Source Code", except it's a dead link on their .org page (adds the "/projects/" node to the URL you provided - maybe just a busted URL on their part?): 

https://trac.opensubtitles.org/projects/opensubtitles/wiki/HashSourceCodes

Even then (if the URL you found is the most up-to-date hash logic) - that page provides support/examples for 30+ programming languages, many of which include "FileName" as the input parameter to the hash generation function (e.g. "private static byte[] ComputeMovieHash(string filename) such as ComputeMovieHash("C:\test.avi")"  ). 

Thus, back to my original question/concern: at some point in the process, Emby has to identify which files on my hard drive (C:\test.avi) get submitted to the hash function and ultimately to the API.

If nobody is able to answer the question precisely and succinctly in this moment, would the Emby team at least add a feature similar to the "Missing Episode" logic? 

Analogously, a "Missing Subtitle" feature, so that when the OpenSubtitles task runs, it documents the precise filenames and disk location of the files for which the API returned no results?  This could be documented in the log file, or in some new Library UI similar to the Season > ... > "View Missing Episodes" list. 

 

Thank you all for your time,

Snake

Posted

You can already do this using the Filters, just set Subtitles to No. Per library filter, for TV you'll want to use the Episodes tab.

image.png.fba1a842f2c5693dbfad5616958e1e04.png

SolidSnake289
Posted
38 minutes ago, Lessaj said:

You can already do this using the Filters, just set Subtitles to No. Per library filter, for TV you'll want to use the Episodes tab.

image.png.fba1a842f2c5693dbfad5616958e1e04.png

Thank you for the suggestion - and, again from a specificity standpoint, you indeed addressed the broad ask of "show content missing subtitles". 

However - applying that filter reveals that I have 1,072 episodes missing subtitles.  And since Emby keeps re-queuing the same 100 API requests every night (and failing to return any results), that probably also explains why a manual execution of the Subtitles task doesn't do anything and I never make any progress.  I can pick an episode at random from this filtered list, for example:

image.png.0967a2d641243983cffe307d77a3d90c.png


And search for a subtitle (one episode at a time....) and find a result:

image.png.3ed897009742023dc68947e7d45e62b9.png



And I don't have hash matching enabled in the subtitles settings, so I would imagine the Emby task would pick one of these results, IF and WHEN this exact episode was submitted over the API. 

So, the core issue remains - I need to find out which 100 files (out of 1,072) Emby / the API is choking on, so I can exclude them or troubleshoot them, so the next time the job runs, it actually finds 100 subs 🙂

Posted

I roughly understand how their hash works: it combines the size of the file with a checksum of its first and last 64 KB. I took their example bash script and confirmed that the hash matches for one of my files, but I'm not sure reverse engineering from the hash back to the original file is particularly possible or easy. Maybe more information is available in the logs with debug mode enabled. Since you know which files are missing subs, you could use the PowerShell script. At that point, though, you should ask how much your time is worth, because a VIP sub is not very expensive and would solve the problem, even with the cheapest one-month option.

Quote

2025-03-11 14:27:01.477 Info HttpClient: GET https://vip-api.opensubtitles.com/api/v1/subtitles?imdb_id=8134742&languages=en&moviehash=e3ab26369ea38745&query=3 from Hell (2019).mkv

./hash.sh "/mnt/raid/Storage/Movies/3 from Hell (2019)/3 from Hell (2019).mkv"
e3ab26369ea38745
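
For reference, the same calculation can be sketched in Python. This follows the publicly documented OSDB scheme (file size plus the 64-bit little-endian sums of the first and last 64 KiB, truncated to 64 bits). It is not Emby's own code, and the function name is made up. Rejecting files under 128 KiB is a simplification chosen for this sketch.

```python
import struct
from pathlib import Path

CHUNK = 64 * 1024            # 64 KiB read from each end of the file
MASK = 0xFFFFFFFFFFFFFFFF    # keep the running total within 64 bits


def opensubtitles_hash(path) -> str:
    """Return the 16-hex-digit OSDB-style hash for a file."""
    size = Path(path).stat().st_size
    if size < 2 * CHUNK:
        # Simplification for this sketch: tiny files are not handled.
        raise ValueError("file is smaller than 128 KiB")
    total = size
    with open(path, "rb") as f:
        for offset in (0, size - CHUNK):
            f.seek(offset)
            block = f.read(CHUNK)
            # Interpret the block as 8192 little-endian unsigned 64-bit integers.
            total += sum(struct.unpack(f"<{CHUNK // 8}Q", block))
    return f"{total & MASK:016x}"
```

Because the hash depends only on the file's contents and size, renaming a file does not change it.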

Posted
55 minutes ago, SolidSnake289 said:

Thank you for the suggestion - and, again from a specificity standpoint, you indeed addressed the broad ask of "show content missing subtitles". 

However - applying that filter reveals that I have 1,072 episodes missing subtitles.  And since Emby keeps re-queuing the same 100 API requests every night (and failing to return any results), that probably also explains why a manual execution of the Subtitles task doesn't do anything and I never make any progress.  I can pick an episode at random from this filtered list, for example:

image.png.0967a2d641243983cffe307d77a3d90c.png


And search for a subtitle (one episode at a time....) and find a result:

image.png.3ed897009742023dc68947e7d45e62b9.png



And I don't have hash matching enabled in the subtitles settings, so I would imagine the Emby task would pick one of these results, IF and WHEN this exact episode was submitted over the API. 

So, the core issue remains - I need to find out which 100 files (out of 1,072) Emby / the API is choking on, so I can exclude them or troubleshoot them, so the next time the job runs, it actually finds 100 subs 🙂

Hi, did you read through this:

Automatic Subtitle Downloads

It will explain why you might see results there even though you don't get anything automatically. (Hint: your library options).

SolidSnake289
Posted
24 minutes ago, Luke said:

Hi, did you read through this:

Automatic Subtitle Downloads

It will explain why you might see results there even though you don't get anything automatically. (Hint: your library options).

Hey Luke - thank you for the article reference.  I have, indeed, read through the documentation.  As I noted in the post you quoted, I already don't have hash matching enabled. 

What other setting, as configured below, would cause Emby to queue up the same files over and over?  Besides, of course, the "videos older than..." setting, which we've already discussed.

 

image.png.8ee3d813a26d26cfebf005861bcbbde0.png

 

Thank you,

Snake

Posted

You should probably enable Skip if the video already contains embedded subtitles matching the download language.

  • 3 weeks later...
  • Solution
SolidSnake289
Posted
On 3/11/2025 at 2:36 PM, Lessaj said:

I roughly understand how their hash works: it combines the size of the file with a checksum of its first and last 64 KB. I took their example bash script and confirmed that the hash matches for one of my files, but I'm not sure reverse engineering from the hash back to the original file is particularly possible or easy. Maybe more information is available in the logs with debug mode enabled. Since you know which files are missing subs, you could use the PowerShell script. At that point, though, you should ask how much your time is worth, because a VIP sub is not very expensive and would solve the problem, even with the cheapest one-month option.

Just wanted to post an update here, since there was sadly no official response/commitment to the feature request to identify/log these failed/"unanswered" API requests so they include filenames/paths. 

Your post inspired me, so I did indeed grab their hash code and compiled a quick PowerShell script that iterated through all my shows and movies and kicked out the filename and hash value.  The example hash in my original post happened to be an "Extras" episode from a TV series, which I guess nobody created subs for because I manually queried OpenSubs for it and found nothing.  And now I have a record of all existing file hashes to compare against in the future. 
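
The PowerShell script itself isn't posted here, so the following is only a Python sketch of the same idea, not a reconstruction of it: walk the library, hash every video file, and keep a hash-to-path table to look up hashes seen in the log. The names `build_hash_index` and `lookup`, the extension list, and the `hasher` parameter are all assumptions; `hasher` can be any function that returns the OpenSubtitles hash for a file.

```python
import os
from pathlib import Path
from typing import Callable, Dict, Iterable, List

# Assumed set of video extensions; adjust to match the library.
VIDEO_EXTENSIONS = {".mkv", ".mp4", ".avi", ".m4v"}


def build_hash_index(
    root,
    hasher: Callable[[Path], str],
    extensions: Iterable[str] = VIDEO_EXTENSIONS,
) -> Dict[str, List[Path]]:
    """Map each hash to the video file(s) under root that produce it."""
    wanted = {ext.lower() for ext in extensions}
    index: Dict[str, List[Path]] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if path.suffix.lower() not in wanted:
                continue
            try:
                digest = hasher(path)
            except (OSError, ValueError):
                continue  # unreadable file, or one the hasher rejects
            index.setdefault(digest, []).append(path)
    return index


def lookup(index: Dict[str, List[Path]], hashes: Iterable[str]) -> Dict[str, List[Path]]:
    """Return the matching paths (possibly none) for each hash from the log."""
    return {h: index.get(h, []) for h in hashes}
```

Looking up a hash such as `a57367d2c203332f` in the resulting table would then point at the file behind a hash-only log line.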

Posted
On 3/28/2025 at 4:12 PM, SolidSnake289 said:

Just wanted to post an update here, since there was sadly no official response/commitment to the feature request to identify/log these failed/"unanswered" API requests so they include filenames/paths. 

Your post inspired me, so I did indeed grab their hash code and compiled a quick PowerShell script that iterated through all my shows and movies and kicked out the filename and hash value.  The example hash in my original post happened to be an "Extras" episode from a TV series, which I guess nobody created subs for because I manually queried OpenSubs for it and found nothing.  And now I have a record of all existing file hashes to compare against in the future. 

Hi, we can add a logging statement when no results are found. Thanks.

