
Live-Video-Translation Plugin Beta using OpenAI Whisper


Solved by bobo99

Recommended Posts

Posted

Hi @bobo99 - I have tried to test it, but I have not been able to make it work. To be honest, it might just be that my machine is too slow.

When Emby switches to the translated stream, it errors with "No compatible stream available".

Please find logs attached here.

I've tried different streams, different IPTV providers, and different settings, but I more or less always end up with the same "No compatible stream available" error.

Let me know what you find in the logs, or if you need me to try anything else.

Thanks!

ffmpeg-transcode-4fb3bf9c-4dc5-4f3f-909a-95278c4dcd4f_1.txt embyserver (2).txt

Posted
8 hours ago, axelsl said:

Hi @bobo99 - I have tried to test it, but I have not been able to make it work. To be honest, it might just be that my machine is too slow.

When Emby switches to the translated stream, it errors with "No compatible stream available".

Please find logs attached here.

I've tried different streams, different IPTV providers, and different settings, but I more or less always end up with the same "No compatible stream available" error.

Let me know what you find in the logs, or if you need me to try anything else.

Thanks!

ffmpeg-transcode-4fb3bf9c-4dc5-4f3f-909a-95278c4dcd4f_1.txt 24.22 kB · 1 download embyserver (2).txt 8.46 MB · 0 downloads

So, the first time you tried to start playing HRT, I'm not actually seeing the plugin try to switch to the translated stream.

If you disable the plugin, does the IPTV work? (i.e., does the "No compatible stream available" error go away?)

Can you post a picture of the plugin's configuration?

Importantly, what hardware is the translation AND transcoding happening on?

Hvala. (Thanks.)

Posted

Hi @bobo99

Yes, the IPTV works when the plugin is disabled, and I only get the "No compatible stream available" error when it attempts to switch to the generated stream with subtitles.

And indeed, in that example I first tried to play HRT 2, I think, and I did not see the Whisper ASR API being targeted at all. Then I tried to play HRT 4, and it actually tried; I can see the Whisper ASR API being triggered, but playing the generated stream with subtitles then failed.

I attached the config of the plugin.

I tried a few configurations, but in that test I had the transcoding happen on one machine (Windows 10, Intel N95 @ 1700 MHz, 4 cores / 4 logical processors, integrated Intel UHD Graphics) and the transcription/translation on another (Docker on Ubuntu 24.10, Intel Core i5-8250U × 8, integrated Intel UHD Graphics 620). They're both pretty low performance and don't have a dedicated graphics card, so that might be why it is not working properly.
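For anyone else splitting the work across machines like this, a minimal sketch of how the translation side might be started, assuming the openai-whisper-asr-webservice image that bobo99 references later in the thread (the container name and model choice here are illustrative, not from the plugin's docs):

```shell
# Hypothetical sketch: run the Whisper ASR webservice in CPU mode on the
# translation machine. ASR_MODEL and port 9000 come from that image's
# documentation; "base" is a deliberately small model for weak hardware.
docker run -d --name whisper-asr \
  -p 9000:9000 \
  -e ASR_MODEL=base \
  -e ASR_ENGINE=openai_whisper \
  onerahmet/openai-whisper-asr-webservice:latest
```

The plugin would then be pointed at `http://<translation-machine>:9000`.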

I tried something else: instead of starting the live TV stream from my laptop, I tried from my iPhone. And in that case, the newly generated stream actually starts! But there are no subtitles at all in that stream (I can select WEBVTT for subtitles, but no subtitles appear on screen), and the generated stream is too slow (it takes about 30 seconds to generate a 5-second chunk). But I actually see the translation from the very beginning of the stream in the logs!

I am attaching the logs of that test too, if it can help.

 

Thanks!

Screenshot From 2025-04-24 19-03-03.png

embyserver (3).txt ffmpeg-transcode-3ac47d99-d3bc-4db8-ba3a-078e0b77b794_1.txt

BillOatman
Posted
On 3/2/2025 at 12:26 PM, AXIANT said:

I would love to use this on media in my library. Lots of things I have don’t have subtitles or they are not timed correctly. This would be fantastic!

This might work for you as well.

Posted

Hi @bobo99

I did more testing today. Moved the docker for the translation to a better machine - I am getting close to something really good!

I get the subtitles/translation (and it is brilliant!); my only problem now is performance - it outruns the buffer every minute or so. I'll play with settings and configurations and see if I can get something workable with my subpar machines. But it looks super promising!

Posted

A few observations:

- the chunk size setting does not seem to work; it always uses 5-second chunks.

- the subtitles only work in the web version of Emby. On iOS, the subtitles do not show. On Roku, the generated stream does not play at all.

Posted

Hi @bobo99

In order to cater for people with very weak machines (like me!), would it be possible to add an option to use the OpenAI API for Whisper translation instead of a locally hosted model? Users would add an OpenAI API key to the settings and would have to pay for usage, but there would be no need for a high-performance machine!

Let me know if you'd like to do that - I can definitely help, and can test it, of course.

Thanks!

BillOatman
Posted (edited)
17 minutes ago, axelsl said:

Hi @bobo99

In order to cater for people with very weak machines (like me!), would it be possible to add an option to use the OpenAI API for Whisper translation instead of a locally hosted model? Users would add an OpenAI API key to the settings and would have to pay for usage, but there would be no need for a high-performance machine!

Let me know if you'd like to do that - I can definitely help, and can test it, of course.

Thanks!

Good thought, but I suspect the latency of going out to that service for every line would be too much.
Plus, you will need a GPU for sure on whatever machine your translation Docker is running on.

Edited by BillOatman
Posted

@BillOatman I've used a Chrome extension to transcribe and translate the audio using the OpenAI Whisper API, and it is super fast - it can definitely keep up.

And that API both transcribes and translates, so you no longer need the translation Docker at all.
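For reference, a minimal sketch of what per-chunk use of the hosted API might look like (the helper names, file paths, and chunk length are my own illustration, not the plugin's code; the `audio.translations` endpoint with `whisper-1` is the call that transcribes and translates to English in a single request):

```python
# Hedged sketch: send fixed-length audio chunks to OpenAI's hosted Whisper
# "translations" endpoint. Chunk helper and file naming are illustrative.

CHUNK_SECONDS = 5  # the plugin currently hard-codes 5 s chunks

def chunk_starts(duration_s: float, chunk_s: int = CHUNK_SECONDS) -> list[int]:
    """Start offset (in seconds) of each chunk covering the audio so far."""
    return list(range(0, int(duration_s), chunk_s))

def translate_chunk(client, wav_path: str) -> str:
    """Send one audio chunk; returns the English translation text."""
    with open(wav_path, "rb") as f:
        result = client.audio.translations.create(model="whisper-1", file=f)
    return result.text

if __name__ == "__main__":
    from openai import OpenAI  # needs the `openai` package and OPENAI_API_KEY set
    client = OpenAI()
    for start in chunk_starts(30):
        text = translate_chunk(client, f"chunk_{start:03d}.wav")
        print(f"[{start:>4}s] {text}")
```

Latency per chunk would be the round-trip to the API rather than local inference time, which is why it can keep up even on weak hardware.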

Posted
On 25/04/2025 at 14:52, axelsl said:

Hi @bobo99

I did more testing today. Moved the docker for the translation to a better machine - I am getting close to something really good!

I get the subtitles/translation (and it is brilliant!); my only problem now is performance - it outruns the buffer every minute or so. I'll play with settings and configurations and see if I can get something workable with my subpar machines. But it looks super promising!

Great!

On 25/04/2025 at 17:26, axelsl said:

A few observations:

- the chunk size setting does not seem to work; it always uses 5-second chunks.

- the subtitles only work in the web version of Emby. On iOS, the subtitles do not show. On Roku, the generated stream does not play at all.

Chunk size is something that I thought of adding, but it brings some other headaches that I have to deal with. Currently, setting it in the GUI doesn't actually change anything; I default to 5 seconds in the back end. The benefit of a configurable chunk size is that you could increase it to 10+ seconds and get higher-precision translation (due to the longer context), but translation would take even longer, and you would have to wait the full chunk length before the audio even goes off for translation.
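To illustrate the tradeoff (this is not the plugin's actual code, and the stream URL is a placeholder), carving a live stream's audio into fixed-length chunks might look like this:

```shell
# Illustrative only: split a live stream's audio into fixed-length mono
# 16 kHz WAV chunks suitable for Whisper. A larger -segment_time gives
# Whisper more context per request (better translation), but delays the
# first subtitle by at least that many seconds.
ffmpeg -i "http://example.iptv/stream.ts" \
  -vn -ac 1 -ar 16000 -acodec pcm_s16le \
  -f segment -segment_time 5 \
  chunk_%03d.wav
```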

3 hours ago, axelsl said:

Hi @bobo99

In order to cater for people with very weak machines (like me!), would it be possible to add an option to use the OpenAI API for Whisper translation instead of a locally hosted model? Users would add an OpenAI API key to the settings and would have to pay for usage, but there would be no need for a high-performance machine!

Let me know if you'd like to do that - I can definitely help, and can test it, of course.

Thanks!

Thanks for the feedback. Yes, it looks like you're overrunning your buffer because the translating machine can't keep up.

I have tested it only on relatively "beefy" machines: a 3090, 1070, or P40 GPU.

I have had real-time success with a 5800X3D in CPU-only mode on the "medium" model, but you need plenty of RAM as well - about 5-12 GB depending on the model.

What model are you using for Whisper? Try dropping to the "small", "base", or even "tiny" model and see if you can get your performance up that way.

I will look into the OpenAI API and see whether it's realistic for someone to pay for that service for this application (how many credits per 30-minute video, etc.). I'll also see if it's easy to integrate and whether there's more interest in the plugin.

This isn't ideal, but you can test building your own buffer: simply click "pause" on the translated stream and wait a couple of minutes. If you overrun the stream, Emby's way of dealing with the back-end stream is basically to stop querying it, and the stream eventually dies. If you manage your own buffer health (keep pausing the video to maintain a buffer), then you can use it. Obviously not ideal.

3 hours ago, BillOatman said:

Good thought, but I suspect the latency of going out to that service for every line would be too much.
Plus, you will need a GPU for sure on whatever machine your translation Docker is running on.

I think they're suggesting that the translation be done in the cloud, so no local GPU would be required.

 

On 25/04/2025 at 13:09, BillOatman said:

This might work for you as well.

As mentioned previously in the thread, this plugin was really created for live TV. The code was easy to modify to allow for local videos, but there are better solutions for generating subtitles for your local videos.

 

Posted

@bobo99 thanks for the feedback!

If I change the model to tiny or base, I get close to the right performance, but the translation is terrible :D

In case you want help trying to integrate the OpenAI API to do the translation in the cloud, and want a guinea pig to test it (especially with regard to how much it would cost), I am ready! I don't know if you're willing to share your source code, but if you do, I can try and test a few things on my side.

Let me know!

Axel

  • 1 month later...
Posted

Attached is another updated version for testing.

Some slight performance improvements (it still needs a GPU for real-time performance).

 

LiveVideoTranslation.dll

  • 6 months later...
Posted

Hello! Does the latest version of Emby support this plugin? I can't seem to get the plugin to show up in Plugins. I am on version 4.9.1.90.

Thanks for this!

BillOatman
Posted
10 hours ago, Geeked said:

Hello! Does the latest version of Emby support this plugin? I can't seem to get the plugin to show up in Plugins. I am on version 4.9.1.90.

Thanks for this!

Pretty sure that since it is a beta, it is not in the catalog yet. You need to download the latest DLL from this thread and install it yourself.

BillOatman
Posted
On 4/26/2025 at 3:10 PM, axelsl said:

@BillOatman I've used a Chrome extension to transcribe and translate the audio using the OpenAI Whisper API, and it is super fast - it can definitely keep up.

And that API both transcribes and translates, so you no longer need the translation Docker at all.

Nice! What is the extension called?

Posted
20 hours ago, Geeked said:

Hello! Does the latest version of Emby support this plugin? I can't seem to get the plugin to show up in Plugins. I am on version 4.9.1.90.

Thanks for this!

It is in beta, and there isn't as much interest in this as I expected.

I can post another DLL if you'd like to test it; just let me know.

Posted
On 4/23/2025 at 6:14 PM, bobo99 said:

Here you are, it should work for 14 days.

Please let me know how it works for you!

Thanks. And what do I do after 14 days?

Posted
40 minutes ago, Geeked said:

Thanks. And what do I do after 14 days?

The DLL is only valid for 14 days. The next one I post will not have that limitation.

Posted
2 hours ago, bobo99 said:

The DLL is only valid for 14 days. The next one I post will not have that limitation.

Sure I'd like to try it out :)

Posted
On 19/12/2025 at 15:10, Geeked said:

Sure I'd like to try it out :)

 

1 hour ago, mrtj18 said:

Great! I would like to test it as well.

I haven't visited this project since April. This was the last build that was working.

Here is the DLL; this one is good for 14 days from today. Please let me know what you think.

 

LiveVideoTranslation.dll

Posted (edited)

Welp, from my experience with the plug-in, it actually works great. I think it's very cool that I can watch a Spanish broadcast, for example, with English captions from my IPTV provider.

 

The bad news is, the Whisper AI / WhisperLive Docker does not work with my current GPU. I have an RTX 5080, and no matter what I tried, I kept getting errors / forced CPU usage with the Docker container. That kills my performance across the board with the other Dockers I have installed, including the stream I'm attempting to watch within Emby.

 

The plug-in works great, and I'd love to use it daily with my setup. But until I can figure out how to get the Whisper Docker working properly, or until they patch the Docker to support my 5080 on my Unraid setup, I can't use it. 😐 But of course this has nothing to do with your development of this plugin.

Edited by mrtj18
Posted
47 minutes ago, mrtj18 said:

Welp, from my experience with the plug-in, it actually works great. I think it's very cool that I can watch a Spanish broadcast, for example, with English captions from my IPTV provider.

 

The bad news is, the Whisper AI / WhisperLive Docker does not work with my current GPU. I have an RTX 5080, and no matter what I tried, I kept getting errors / forced CPU usage with the Docker container. That kills my performance across the board with the other Dockers I have installed, including the stream I'm attempting to watch within Emby.

 

The plug-in works great, and I'd love to use it daily with my setup. But until I can figure out how to get the Whisper Docker working properly, or until they patch the Docker to support my 5080 on my Unraid setup, I can't use it. 😐 But of course this has nothing to do with your development of this plugin.

Hmm, did you run the Docker with the GPU flag?

 

onerahmet/openai-whisper-asr-webservice:v1.7.0-gpu

 

And pass through NVIDIA visible devices?
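Putting those two together, the GPU invocation might look like this (a sketch, assuming the image tag above; the container name and model are illustrative, and the `--gpus` flag requires the NVIDIA Container Toolkit on the host):

```shell
# Sketch of the GPU invocation described above. Note: if the image's
# bundled CUDA/PyTorch build predates your GPU's architecture (e.g. a
# very new card like the RTX 5080), the container can fall back to CPU,
# which matches the symptom reported earlier in the thread.
docker run -d --name whisper-asr-gpu \
  --gpus all \
  -e NVIDIA_VISIBLE_DEVICES=all \
  -e ASR_MODEL=medium \
  -p 9000:9000 \
  onerahmet/openai-whisper-asr-webservice:v1.7.0-gpu
```

If the container still reports CPU-only, checking `docker exec <container> nvidia-smi` is a quick way to confirm whether the GPU is visible inside it at all.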
