phantasm79666 Posted August 16, 2019 (edited)

Hello! I get wrong characters in subtitles on my LG (Netcast) TV when playing over DLNA from Emby Server 4.2.1 on Windows 10. The subtitles are downloaded from Open Subtitles and are stored in ANSI encoding:

2019-08-16 01:20:52.834 Info HttpClient: POST https://api.opensubtitles.org/xml-rpc
2019-08-16 01:20:52.968 Info SubtitleManager: Saving subtitles to D:\Sorozatok\Downton Abbey\S05\downton.abbey.s05e01.hdtv.x264.tla.hu.srt

When I play the video, accented characters are wrong. If I open the downloaded subtitle file in Notepad++ and save it as UTF-8 with BOM, it plays back correctly.

Is there a solution? I'd like to avoid manually editing all downloaded subtitles.

Attachments: orig.srt, utf8bom.srt, embyserver.zip
Luke Posted August 16, 2019

Hi there, please attach the Emby Server log. Thanks.
phantasm79666 Posted August 16, 2019

Attached: embyserver.zip
Luke Posted August 17, 2019

Can you please zip up the original subtitles and attach them here? Thanks.
phantasm79666 Posted August 17, 2019

Attached: subtitles-orig.zip
Luke Posted August 17, 2019

What time in the video do those screenshots correspond to?
phantasm79666 Posted August 17, 2019

1. The original, with wrong accents (by the way, all accented characters display wrong):

64
00:03:52,810 --> 00:03:56,520
Ó, de jó! Szívből gratulálok!

2. The UTF-8 with BOM version, which is identical to the screenshot (everything is correct for the whole video):

326
00:16:51,500 --> 00:16:52,810
Köszönöm!
phantasm79666 Posted August 20, 2019 (edited)

I have tested the original SRT and the modified UTF-8 with BOM SRT in the web player, in Safari on iOS 12.4 and in Chrome 76.0.3809.100 on Windows. The results are the same in both browsers. Both versions are readable; I noticed differences for only two characters:

In the original SRT, "û" is displayed instead of "ű".
In the original SRT, "ô" is displayed instead of "ő".

It looks to me like the web client also picks the wrong encoding: it uses ISO-8859-1 instead of ISO-8859-2. The difference is not major and the text is still readable, but the UTF-8 with BOM version is how it should really look in Hungarian.

According to my tests, UTF-8 with BOM makes all subtitles display correctly on both DLNA and web clients. It looks like the DLNA client in the TV assumes SRT is always UTF-8 encoded.

I researched a bit more, and it looks like I am having a problem similar to this issue: https://emby.media/community/index.php?/topic/66351-subtitle-encoding-issues-on-dlna-lg-netcast/ However, in my case the correct subtitle files are sent (the ones with the hu suffix); the only problem is the character encoding.

I'd prefer this to be sorted out in Emby if possible, rather than manually changing all non-UTF-8 SRT files to UTF-8. Is there a magic setting that does this, or a way to force subtitles to UTF-8 in the DLNA profile?
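The two-character difference reported above is consistent with ISO-8859-2 bytes being read as ISO-8859-1. Here is a minimal Python sketch of that effect, assuming the file really is ISO-8859-2; the helper name `misdecode` is made up for illustration.

```python
# Hypothetical helper: encode text as ISO-8859-2 (the assumed source encoding),
# then decode the same bytes as ISO-8859-1, as a client guessing wrong would.
def misdecode(text: str) -> str:
    return text.encode("iso-8859-2").decode("iso-8859-1")

hungarian = "áéíóöőúüű"
# Keep only the letters that change under the wrong decoding.
changed = {ch: misdecode(ch) for ch in hungarian if misdecode(ch) != ch}

for original, shown in changed.items():
    print(f"{original!r} is shown as {shown!r}")
```

Under this assumption only ő and ű change: ű becomes û, and ő becomes õ (o with tilde), which is easy to read as the ô mentioned in the post. The other Hungarian accented letters share the same byte value in both encodings, which would explain why the web player output stays readable.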
Luke Posted August 20, 2019

It's not an easy answer. You can't just force something to UTF-8. You have to know the encoding of the input file in order to be able to convert it.
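Luke's point can be illustrated with a short Python sketch: the conversion itself is one line, but it only gives the right text when the source encoding is known. The source encoding is passed in explicitly here; `to_utf8_bom` is an illustrative name, not an Emby function.

```python
import codecs

def to_utf8_bom(raw: bytes, source_encoding: str) -> bytes:
    """Re-encode subtitle bytes as UTF-8 with a BOM, given a known source encoding."""
    text = raw.decode(source_encoding)
    # "utf-8-sig" prepends the UTF-8 byte order mark when encoding.
    return text.encode("utf-8-sig")

# Sample line from the thread, stored as ISO-8859-2 for this example.
raw = "Ó, de jó! Szívből gratulálok!".encode("iso-8859-2")

good = to_utf8_bom(raw, "iso-8859-2")
bad = to_utf8_bom(raw, "iso-8859-1")  # wrong guess: no error, but wrong text

print(good.startswith(codecs.BOM_UTF8))
print(good.decode("utf-8-sig"))
print(bad.decode("utf-8-sig"))
```

The wrong guess does not raise an error, because every byte is valid in ISO-8859-1. It simply produces "Szívbõl" instead of "Szívből", so a blind conversion cannot tell that anything went wrong.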
phantasm79666 Posted August 22, 2019

Thanks Luke! I understand. It's clear to me now that this is a "fault" of the subtitles provided for the videos. The trouble is that most of the Hungarian subtitles I have show this issue. Maybe a plugin could be made to fix those files where the encoding can be detected; that could be useful for many of us with the same problem.

I'll experiment with some tools to detect the encoding and then write a script to convert my subtitles, probably running it as a scheduled task.
phantasm79666 Posted August 22, 2019

Hi Luke,

I just compiled uchardet on Windows: https://www.freedesktop.org/wiki/Software/uchardet/

It reliably outputs ISO-8859-2 for all the wrong subtitle files and UTF-8 for all the working ones.

A feature that converts subtitle files to UTF-8 when the detected encoding matches a list would probably solve this issue. Can you build something like this into Emby?
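The proposal above (convert only when the detected encoding is on a list) might look like the following Python sketch. The detected name is passed in as a plain string, for example whatever a detector such as uchardet reported; `ALLOWED_ENCODINGS` and `convert_if_listed` are hypothetical names, and this is not a description of how Emby's built-in detection works.

```python
from typing import Optional

# Assumption: encodings for which we trust the detector and want to convert.
ALLOWED_ENCODINGS = {"iso-8859-2", "windows-1250"}

def convert_if_listed(raw: bytes, detected: str) -> Optional[bytes]:
    """Return UTF-8 (with BOM) bytes if `detected` is on the allow-list, else None."""
    name = detected.strip().lower()
    if name not in ALLOWED_ENCODINGS:
        return None  # unlisted or already-UTF-8 input is left untouched
    return raw.decode(name).encode("utf-8-sig")

sample = "Szívből gratulálok!".encode("iso-8859-2")
converted = convert_if_listed(sample, "ISO-8859-2")
skipped = convert_if_listed("Köszönöm!".encode("utf-8"), "UTF-8")
```

Keeping the list short limits the damage from a misdetection: files whose detected encoding is not listed are never rewritten.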
Luke Posted August 23, 2019

We already have a C# port of that built into the server for encoding detection, and most of the time it is pretty accurate. Ours might be based on an older version, though.
Solution: phantasm79666 Posted August 24, 2019 (edited)

Attached are the scripts I made using the Windows versions of iconv and uchardet.

1. Unzip convertsrt.zip.
2. Edit bin\convertsrt.bat to specify the folders to scan and the encoding to convert to UTF-8.
3. Run bin\convertsrt.bat.

It should work on Windows 10.

Attachment: convertsrt.zip
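The attached convertsrt.zip is not reproduced here. As a rough illustration of the same idea (scan a folder and convert files from a given encoding to UTF-8), here is a Python sketch. It does not use iconv or uchardet; it swaps in a simpler check, treating any file that already decodes as UTF-8 as done and assuming every other file is in the configured encoding. The function name and the `.srt`-only filter are assumptions.

```python
from pathlib import Path

def convert_folder(folder: str, source_encoding: str = "iso-8859-2") -> list:
    """Rewrite non-UTF-8 .srt files under `folder` as UTF-8 with BOM.

    Returns the paths that were converted.
    """
    converted = []
    for path in sorted(Path(folder).rglob("*.srt")):
        raw = path.read_bytes()
        try:
            raw.decode("utf-8")  # already valid UTF-8: leave the file alone
            continue
        except UnicodeDecodeError:
            pass
        # Assumption: everything that is not UTF-8 is in `source_encoding`.
        text = raw.decode(source_encoding)
        path.write_bytes(text.encode("utf-8-sig"))
        converted.append(path)
    return converted
```

Files are overwritten in place, so it is worth running a sketch like this on a copy of the subtitle folder first.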