Subtitle problem for accented characters


christophe.ferrandon

Server OS: Debian 7, last updated 2015-07-22

Emby Server version: 3.0.5681.29291

Player OS: OS X 10.10.4. Web browsers: Safari, Firefox (39.0), Chrome (41.0.2272.104), Opera (30.0.1835.125).

 

Hi,

 

After upgrading Emby to the latest beta version, I found a small problem with subtitles: accented characters like "é" are displayed as "?" on screen (I'm French).

Another small problem with subtitles: I have to manually select another subtitle language and then switch back to my preferred one to get subtitles on screen.

Last: Firefox doesn't seem happy with subtitles; none appear on screen.

Best regards, and many thanks for your great work.

 

 

post-47425-0-06331100-1437596995_thumb.png

christophe.ferrandon

Subtitles were automatically downloaded by Emby (.srt from OpenSubtitles.org) and worked perfectly in the previous Emby version (mine was 3.0.5662.32894).

I tried many other films with the same result.

It seems to be an encoding problem. A .srt file saved as ANSI shows the same problem; the file must be UTF-8 encoded.

Because all my .srt files were OK in the previous version, I believe the problem is in how the file's encoding is decoded.
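
A quick way to check this theory is to test whether each file actually decodes as UTF-8. The sketch below is only an illustration and assumes `iconv` is installed; the function names `is_utf8` and `list_non_utf8_subs` are made up for this example and have nothing to do with Emby itself.

```shell
# Return success (0) if the file decodes cleanly as UTF-8, failure otherwise.
# iconv exits non-zero on a byte sequence that is invalid in the source
# encoding, so a UTF-8 to UTF-8 "conversion" doubles as a validity check.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Print every .srt file under a directory (default: current directory)
# that is NOT valid UTF-8.
list_non_utf8_subs() {
    find "${1:-.}" -type f -name '*.srt' | while IFS= read -r fn; do
        is_utf8 "$fn" || echo "$fn"
    done
}
```

Something like `list_non_utf8_subs /path/to/films` would then list the files worth a closer look. Note that a plain ASCII file also counts as valid UTF-8.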

 

 

PS: did you notice that Firefox doesn't display any .srt subtitles?

 

Best

 

Christophe Ferrandon

post-47425-0-87251400-1437653285_thumb.png

Not sure if this will help, but I had the same problem with my Spanish external subs. Converting all of them to UTF-8 without BOM solved it. This is easily done with a shell script on Linux or a PowerShell script on Windows (I wrote both, since I run Emby on Linux but my file server runs Windows), run from a cron job or a scheduled task. I haven't had any other problem with subs since then.
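
To illustrate the "without BOM" part (this is not the script mentioned above, just a sketch): a UTF-8 BOM is the three bytes EF BB BF at the very start of the file, and it can be dropped with standard tools. The function name `strip_bom` is hypothetical.

```shell
# Print a file to stdout without a leading UTF-8 byte order mark (EF BB BF).
strip_bom() {
    # Dump the first three bytes as hex, with all whitespace removed.
    first3=$(head -c 3 "$1" | od -An -tx1 | tr -d ' \t\n')
    if [ "$first3" = "efbbbf" ]; then
        tail -c +4 "$1"   # start output at byte 4, skipping the BOM
    else
        cat "$1"          # no BOM: pass the file through unchanged
    fi
}
```

Usage would look like `strip_bom movie.srt > movie.nobom.srt`; writing to a separate file avoids truncating the input while it is still being read.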

 

 

  • 1 month later...
danergo

My question about this:

Can I set up a job that runs right after the subtitles are downloaded?

What do you think?

I'd rather not stress the system with a periodic cron job if I can schedule the converter script to run after the downloader.

 

 

Thank you.


Not sure if you can do that from Emby itself.

 

In my case I run the script over my whole video library every 30 minutes, directly on the file server that serves Emby.

It takes about 1 minute 30 seconds for the script to analyze all my subs (1559 files at the moment) and convert the ones that need it to UTF-8 without BOM.

Of course, the first run will take longer if you have a lot of subs to convert.
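
One way to keep a periodic run cheap, sketched here as an assumption rather than as what the script described above does, is to look only at recently modified files. The names `recent_subs` and `convert_subs.sh` and the `/Media` path are hypothetical; `-mmin` is supported by GNU and BSD `find` but is not part of POSIX.

```shell
# List only the .srt files modified within the last N minutes (default 30),
# so a periodic job can skip subtitles it has already looked at.
recent_subs() {
    find "${1:-.}" -type f -name '*.srt' -mmin "-${2:-30}"
}

# Hypothetical crontab line that runs a converter script every 30 minutes:
# */30 * * * * cd /Media && ./convert_subs.sh >> /tmp/convert_subs.log 2>&1
```

A freshly converted file gets a new modification time, so it will be listed once more on the next run; a converter that checks the encoding first will simply skip it.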

 

Hope this helps you.

Edited by fc7
  • 2 weeks later...
MiChAeLoKGB

Hi,

 

I have a problem with Central European (Slovak/Czech) subtitles: they contain special characters such as š, č, ť, ľ, ž, which don't display properly in the web client, although they work fine when I play the movie with KMP. The encoding we usually use for subtitles is Windows-1250.
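
For reference, a Windows-1250 file can be re-encoded by hand with `iconv`. This is a minimal sketch, assuming `iconv` is available and knows the charset under the name `CP1250`; the function name `cp1250_to_utf8` is made up for the example.

```shell
# Convert one subtitle file from Windows-1250 (CP1250) to UTF-8 on stdout.
cp1250_to_utf8() {
    iconv -f CP1250 -t UTF-8 "$1"
}
```

For example, `cp1250_to_utf8 episode.srt > episode.utf8.srt` leaves the original untouched and writes the converted copy next to it.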

 

 

Also, could you make the OPEN button on a series open either the series screen (where you can select which season to watch) or at least the season (yes, the season, not the episode) that contains the next unwatched episode? Right now it opens the latest added episode, even when I have 4 seasons downloaded and have only watched the first episode of season 1... It's extremely annoying and dumb to have it that way, as I always have to click my way back from season 4 to season 1.

 

Thanks.

post-4619-0-45937500-1442178278_thumb.png

Continuum.S01E01.zip

MiChAeLoKGB

I know that (I'm a web dev), but I won't do it for a couple hundred SRT files. It's not that hard for them to support multiple encodings.

@fc7:

 

I have found a solution for accented letters. I downloaded the source and modified the detection of the subtitle encoding.

There was a function responsible for detecting the encoding, and it always returned UTF-8 (no matter what the actual encoding was).

So I modified this function, and now it also works perfectly with non-UTF-8 subtitles.

I also posted about it on GitHub, but no one responded...
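
The actual change to Emby's source is not shown in this thread. Purely to illustrate the idea of detecting the encoding instead of assuming UTF-8, here is a shell sketch: try UTF-8 first and fall back to a legacy charset only when that fails. The function name `to_utf8_with_fallback` and the default fallback `CP1252` are assumptions for the example, not anything taken from Emby.

```shell
# Print a subtitle file as UTF-8. If the file is already valid UTF-8 it is
# passed through unchanged; otherwise it is decoded with a fallback legacy
# charset (second argument, default CP1252).
to_utf8_with_fallback() {
    if iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1; then
        cat "$1"
    else
        iconv -f "${2:-CP1252}" -t UTF-8 "$1"
    fi
}
```

The fallback has to be chosen per language (for instance `CP1250` for Czech or Slovak files), because single-byte encodings cannot be reliably told apart from the bytes alone.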

Sounds cool. Let's see if this change can make it into the code so everyone can benefit. :)

 

 

Can you share the script?

Sure. I will connect with my laptop later and share the code here.

 

 

  • 3 weeks later...

Same problem here with Spanish external .srt files.

Any news or solution?

Thanks!

Right now the first thing you should do is check your subs' encoding. If they aren't UTF-8, you need to convert them.

If the code change proposed by @danergo is included in a future Emby release, this will no longer be necessary.

 

 

Edited by fc7
Juanmanuelius

Hi fc7!

Can you please share the script you use to convert them?

Big thanks!

OK, here we go. This is a bash shell script; I also have a PowerShell version in case someone wants it:

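#!/bin/bash
# Source and target encodings; adjust to match your subtitles.
FROM=iso-8859-1
TO=utf-8

# Walk the tree below the current working directory and convert .srt files.
# IFS= and -r keep leading spaces and backslashes in file names intact.
find . -type f -name "*.srt" | while IFS= read -r fn; do

    # "file" reports the detected charset. Note that this check is hard-coded
    # to ISO-8859 and does not follow the FROM variable.
    if file "${fn}" | grep -qi 'iso-8859'; then

        echo "${fn} ---- Will be converted!"
        # Keep the original as a backup, then convert from the backup.
        cp "${fn}" "${fn}.bak"
        iconv -f "$FROM" -t "$TO" < "${fn}.bak" > "${fn}"

    else

        echo "${fn} ---- Will NOT be converted!"

    fi

done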

Basically, put this script at the root of your media folder tree and run it from there. It descends into each sub-folder, checks whether each sub file is in the "FROM" encoding, and converts it to the "TO" encoding.

Just set those variables according to the encodings you want to use.

The script also saves a backup of the original sub file before conversion, in case something goes wrong and you want to recover it.
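
If a conversion does go wrong, the `.bak` copies can be put back in bulk. This is a small sketch rather than part of the script above; the function name `restore_sub_backups` is hypothetical, and it assumes the backups follow the `name.srt.bak` pattern the script creates.

```shell
# Move every "<name>.srt.bak" found under a directory (default: current
# directory) back over "<name>.srt", undoing an earlier conversion.
restore_sub_backups() {
    find "${1:-.}" -type f -name '*.srt.bak' | while IFS= read -r bak; do
        # ${bak%.bak} strips the trailing ".bak" to get the original name.
        mv -- "$bak" "${bak%.bak}"
    done
}
```

Running `restore_sub_backups /Media` would overwrite the converted files with the originals, so it is worth trying on a single folder first.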

 

It can surely be improved, since I'm not using it anymore (I moved to PowerShell because my file server runs Windows Server), so feel free to improve it and share! :)

 

Cheers

Juanmanuelius

Works perfectly!

I'm running it on Mac OS X.

Thanks a lot, fc7!!!

Awesome. To be honest I'm surprised it just worked since I didn't know OS X included all the needed commands.

Juanmanuelius

I am not a developer; I have no idea about scripts and these things. But I always try :)

I'm running it from Automator, but I don't know how to specify a particular folder to process. I think it's processing the whole hard drive.  :lol:

Let's assume you have a Media folder that is organized in sub-folders like this:

 

/Media

/Media/Movies

/Media/Music

/Media/TV_Shows

 

In this case the script should be placed in /Media. It will then traverse all your sub-folders looking for subs and converting them. In any case, it's harmless to run it at the root of the drive; it will just take more time. :)
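
One thing to keep in mind: `find ./` starts from the current working directory of whatever launches the script, not from the folder where the script file is stored. A possible way around that, sketched here under the assumption that the media tree lives in `/Media`, is to pass the folder explicitly. `MEDIA_DIR` and `find_subs` are hypothetical names for this example.

```shell
# Hypothetical default; replace with the real location of the media tree.
MEDIA_DIR="${MEDIA_DIR:-/Media}"

# List the .srt files under an explicit folder instead of "./", so the
# result does not depend on the directory the script was started from.
find_subs() {
    dir="${1:-$MEDIA_DIR}"
    if [ ! -d "$dir" ]; then
        echo "Not a directory: $dir" >&2
        return 1
    fi
    find "$dir" -type f -name '*.srt'
}
```

Calling `find_subs /Media/Movies` limits the search to that one folder, no matter how or from where the script is launched.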

Juanmanuelius

Thanks!

I don't know what I'm doing wrong, but I can't process a specific folder, even if I put the script in the media folder (which I have organized as you said).

I think it's because I have to configure Automator to run the script properly, but I don't know how  :lol:.

 

Thanks a lot for your help!

I have all my srt working fine now!

Permissions? Do you see any errors in the console? I'm sorry, but although I'm familiar with OS X, I don't know how Automator works.
