Emby Server/Apps: Support External Audio Tracks for Movies/TV


Solved by softworkz

  • 7 months later...
  • Solution

softworkz

There are features where you know all about them - how they can work, how they need to be implemented - and where you are sure that they will work out successfully.

Other features require a certain amount of research to evaluate whether they are doable, what effort they would take and whether the results would be reasonably useful.

And then there is a category of features where you know up-front that there's nothing to win: you can't deliver the feature in a way that satisfies users' expectations, and you know that it would eventually cause much more dissatisfaction than not providing it at all.

Unfortunately, this feature falls into the latter category. Even though it seems so simple, and even though some local players support it (somewhat), it would suck in too many cases, even with high effort put into it.

To be a bit more specific: there will be offsets between audio and video, and not only offsets but also drift. These two things cannot be fixed automatically by the server; they require a human to assess and adjust them. This in turn means that we would need controls for adjusting these two parameters - in all clients.
But beyond the adjustment controls, many clients can't apply such corrections locally during playback, so the server would need to do it - and the server, which is transcoding and/or muxing audio and video together, is often far ahead of the playback position, so adjustments made at the client won't take effect instantly. Yet only with instant feedback can these adjustments be dialed in properly.
Also, when changing audio/video offsets in a live-transcoding/live-streaming scenario, HLS clients may fail and stop playback, as players usually sync to audio, not to video. And for players that do support offsets, the possible range is quite limited, because the adjustment happens at a point where video frames are already decompressed, and raw video frames take a huge amount of memory per frame.
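To make those two parameters concrete, here is a rough, hypothetical sketch (made-up file names and values, and not our actual transcoding pipeline) of what correcting a fixed offset and a constant drift looks like at the ffmpeg level - the point being that a human has to find those two numbers per file, by watching and listening:

```python
# Hypothetical example only - not Emby's actual pipeline.
# Mux an external audio track against a video file with a manually chosen
# offset and drift factor.
import subprocess

video = "Movie (2019).mkv"                # made-up file names
ext_audio = "Movie (2019).commentary.mka"
offset_s = 0.350                          # audio leads the video by 350 ms -> delay it
drift = 1.0008                            # audio runs ~0.08% slow -> speed it up slightly

subprocess.run([
    "ffmpeg",
    "-i", video,
    "-itsoffset", str(offset_s),          # shifts the timestamps of the *next* input
    "-i", ext_audio,
    "-map", "0:v", "-map", "1:a",
    "-c:v", "copy",                       # video is just stream-copied
    "-filter:a", f"atempo={drift}",       # drift correction forces an audio re-encode
    "-c:a", "aac",
    "Movie (2019).synced.mkv",
], check=True)
```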

Then there's transcoding itself at the server. We support a huge number of different transcoding setups, and we know from subtitles that ffmpeg's data-flow behavior is fundamentally different when a stream comes from a separate external file. For subtitles it doesn't matter that much, because all subtitle frames can be loaded into memory at once; for audio it can be different. As always, we don't have a single case to consider but ALL cases, and there's no room for an additional factor and potential point of error.
It's also important to understand that you cannot compare a local player to what Emby Server is doing. Emby Server is an intermediate instance in the chain of media delivery to clients. It doesn't output AV directly to local hardware (display/sound); instead it re-encodes/remuxes media, which is finally demuxed, decoded and presented by client hardware/software that is usually a lot less capable (than VLC, for example).

The deeper you look into this, the more difficult it gets, and what I've mentioned so far is just scratching the surface. While everything is always possible in the future, I think we shouldn't make any false promises. It's too unlikely that this could ever become part of our playback procedures - not just because those are quite complex already, but also because this wouldn't work as a feature that matches our own standards of quality and reliability; it would cause an endless stream of user reports instead.

 

A more realistic feature might be some kind of "interactive remuxing", which would let you set offset and drift values interactively through some UI in the server administration area (or maybe through a plugin) and then mux the external audio track into the video file, so that eventual playback uses the prepared file.
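In the simplest case, where the external track is already in sync, that prepared file would just be a plain remux that stream-copies everything and appends the external audio as an additional track - roughly like this hypothetical sketch (made-up file names again; where offset/drift corrections are needed, the filter approach from the sketch above would apply):

```python
# Hypothetical sketch of the "prepare once, then play normally" idea:
# append an already-in-sync external audio track without re-encoding anything.
import subprocess

video = "Movie (2019).mkv"                # made-up file names
ext_audio = "Movie (2019).commentary.mka"

subprocess.run([
    "ffmpeg",
    "-i", video,
    "-i", ext_audio,
    "-map", "0",                          # keep every stream of the original file
    "-map", "1:a",                        # plus the external audio track
    "-c", "copy",                         # pure remux, no re-encoding
    "Movie (2019).prepared.mkv",
], check=True)
```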


sydlexius
Thanks for your detailed explanation. I haven't encountered any issues with the Rifftrax media, but I understand what you're driving at: it's impractical as a general solution.


softworkz

Yea, it's not something that can be "sold" as a feature. 

You are "the one guy" who says:

On 2/21/2018 at 12:14 AM, sydlexius said:

I wouldn't expect Emby to handle the offset calculations needed to sync the audio.  It's not too difficult for me to modify that now.

But all others will say: "Hey the feature isn't working, audio and video are out of sync!"


sydlexius
Yeah, my narrow use case does allow for a myopic view like that. :) I'm just going to work on developing a better workflow to integrate my audio tracks with my existing media. I still think there should be a "lazy-friendly" solution for media like this!


softworkz

One way would be some kind of super-charged client which requests both the main A/V item and the extra-audio item simultaneously - both for direct play. That would then be a situation comparable to the other players that were mentioned.

It's not a very flexible solution, but it's more doable - even though the same caveats I mentioned above apply:
The client player would need to have controls for offset and drift adjustment.
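For a rough idea of what such a client would have to do, mpv can already direct-play a video together with a separate audio file and apply a user-chosen offset - the hypothetical sketch below (made-up stream URLs and delay value) simply hands both items to mpv; a drift control would still be missing:

```python
# Hypothetical illustration using mpv as a stand-in for such a client:
# play the main item and the external-audio item together, with an offset.
import subprocess

subprocess.run([
    "mpv",
    "http://emby.example:8096/Videos/123/stream.mkv",              # made-up direct-play URL
    "--audio-file=http://emby.example:8096/Audio/456/stream.mka",  # made-up external audio URL
    "--audio-delay=0.35",                 # user-adjustable offset; drift is not covered
])
```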


  • 4 months later...

dfsdf

External audio tracks are working on Kodi with Embycon.

I tried it myself with these files:

  • Weathering With You (2019).51ch.mka
  • Weathering With You (2019).mkv
  • Weathering With You (2019).ass

I can select the 5.1 sound track while watching on Kodi.

People have been talking about this for five years and no progress.

Choose Kodi and Embycon!


sydlexius
Thanks for the report!  Too bad Kodi has too low a WAF/SAF to adopt in my household!


HawkXP71

+1 for this functionality.

However, I see this as a plugin.

The plugin, when run, would look for paired files: a *.mkv (or whatever video format) and the matching audio file.
Then it would remux the video, copying the video stream and adding the audio file as a new stream, and save the result as a new version of the mkv (e.g. *-Audio Mux.mkv).

Then, using the "Auto Movie Version Grouping" plugin, both versions of the mkv would be picked up and merged into a version pull-down for the video.


  • 7 months later...

External audio with direct play only seems like a good solution. I know it's possible to use other clients in some way, but it's not an elegant process and it's not well supported across platforms. Is it possible to add this functionality to Emby Theatre? This would be very meaningful for many use cases. Personally, I have many external audio files that are similar in size to their paired video files, which makes merging them not a good option. As far as I know, Jellyfin has external audio file support now, though I'm not clear whether it's achieved by its server or its client.

On 12/13/2022 at 2:51 AM, softworkz said:

One way would be some kind of super-charged client which requests both the main A/V item and the extra-audio item simultaneously - both for direct play. That would then be a situation comparable to the other players that were mentioned.

It's not a very flexible solution, but it's more doable - even though the same caveats I mentioned above apply:
The client player would need to have controls for offset and drift adjustment.

 

