
Feature Request: Audio Normalization


Netfool


You would not double your library, as all you do is add one track to existing files.  All other streams are copied.

Of course, you could also set it to convert any video to a specific format, or to convert interlaced video to progressive.  Basically, anything that would cause transcoding (bandwidth aside) can be fixed this way.  Since you're really just adding a track, you could remove that track later with a new remux and be back essentially to what you started with.
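
For anyone who wants to try the remux approach, here is a rough sketch of what such a command could look like, built in Python. It is not the poster's actual command. It assumes ffmpeg is installed, that the input has exactly one audio stream (so the added track becomes output audio stream 1), and that AAC with ffmpeg's loudnorm filter is an acceptable choice for the extra track. The function name and the loudness values are illustrative only.

```python
from typing import List


def build_remux_command(src: str, dst: str, target_lufs: float = -16.0) -> List[str]:
    """Build an ffmpeg command that copies every stream and appends
    one loudness-normalized audio track.

    Assumption: the input has exactly one audio stream, so the appended
    track is output audio stream index 1 (a:1).
    """
    return [
        "ffmpeg",
        "-i", src,
        "-map", "0",          # keep every stream from the input
        "-map", "0:a:0",      # map the first audio stream a second time
        "-c", "copy",         # copy all streams untouched by default
        "-c:a:1", "aac",      # re-encode only the appended audio track
        "-filter:a:1", f"loudnorm=I={target_lufs}:TP=-1.5:LRA=11",
        dst,
    ]
```

Passing the returned list to subprocess.run(cmd, check=True) would run it. Because every original stream is copied, dropping the extra track later with another remux gets you back to essentially the original file.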

This IS the solution as I doubt you'll ever see this in Emby clients across the board as I don't think it's possible.

This is just personal, but I try and make my media consistent so that it doesn't matter if I copy it to my phone, play it in Kodi, VLC or any other player.

It does take time, even just to remux things, but I did it to well over 100K files in under a month, so it's doable.


  • 1 year later...
frankmb

Bumping up Audio Normalization.

 

I constantly need to fight loud sections by lowering the volume, then raising it again to hear conversations. It would be great if Emby could reduce the range between soft and loud audio.

I'm using the Emby Roku app on a TCL Roku TV with its speakers. I don't have the "Volume Leveling" option that some Roku devices have that normalizes audio.  


PuffyToesToo
On 12/2/2020 at 1:54 PM, Gilgamesh_48 said:

Then leveling should be done in the client software, in this case Emby.

Yes, I hope they can add this. Not everyone wants or needs or CAN have an audio setup that includes a receiver etc.

 


Historically, the way to do this is process your files before adding them to your libraries.


48 minutes ago, cayars said:

Historically, the way to do this is process your files before adding them to your libraries.

Sounds like this feature would greatly benefit those who either don't have the time or the know-how to do that.

Pretty much everybody falls into at least one of those two categories.


lowdough
32 minutes ago, C.S. said:

Sounds like this feature would greatly benefit those who either don't have the time or the know-how to do that.

Pretty much everybody falls into at least one of those two categories.

Amen to that.  Of course, before Emby was created, historically, the way to perform its function was to use some other software, perform its functions manually, etc.

Look, we're asking for a feature to be added to Emby.  I'm sure many of us appreciate a reference to some convoluted workaround that may or may not accomplish the task, but as CS said, it's no substitute for Emby doing it, depending on whether the user checked a box or some similar easy selection process.

"This would be so much easier if these ^%$%^$# customers would just go away."  --  one of my buddies.

We are your customers.  We're giving you gold by telling you what features we value.  Ignore us at your peril.


Hi.  Actually, I would say this function is normally handled at the audio output end because modifying the actual media is global and permanent and this issue (too high of a dynamic range) is usually environmental and, therefore, conditional on exactly where/when you are playing the item. 

Many TVs and most receivers have a way to do this type of compensation which means the audience for the feature within Emby is fairly small (only those without devices that already account for this).  It is also not a simple thing to implement (well) so the cost/benefit is a bit difficult right at this time.  This may be easier to do in some apps than others as well (Roku would be one of the harder ones).

There is an open feature request for this though so please lend your support there.

Thanks.

And related:

 


50 minutes ago, ebr said:

Many TVs and most receivers have a way to do this type of compensation which means the audience for the feature within Emby is fairly small (only those without devices that already account for this).  It is also not a simple thing to implement (well) so the cost/benefit is a bit difficult right at this time.  This may be easier to do in some apps than others as well (Roku would be one of the harder ones).

You are thinking about this from your own (technically informed) perspective. When it comes to audio, most Emby users, i.e. our friends and family, know three things: Volume Up, Volume Down, and Mute. For the sound, all they have are TV speakers and sometimes soundbars.

I have a couple dozen users, and right now I can't think of a single one who hasn't mentioned the audio level issue. All they know is that some shows are loud, some are quiet, and it's kind of annoying.

As for implementation, would this really be done at the client level? Aren't we talking about the server transcoding audio?


8 minutes ago, C.S. said:

our friends and family, know three things: Volume Up, Volume Down, and Mute

Are you saying they may actually already have equipment that can do this?  If they don't look beyond the simple volume controls they have now, then they wouldn't look for an option for any type of normalization either.

10 minutes ago, C.S. said:

All they know is that some shows are loud, some are quiet,

However, I think we're talking about different things here.  There are two situations people have trouble with.  The one I was assuming was the issue of a particular soundtrack having a high dynamic range, so you feel like you have to fiddle with the volume constantly between scenes.  That is the scenario described in the post I most recently responded to.  What you describe above comes from different media from different sources with different types of audio and encoding/mastering or copying.  That is a different situation (and actually more difficult) than just high dynamic range in a single audio track.  A discussion of that is in the second link I provided above.

12 minutes ago, C.S. said:

As for implementation, would this really be done at the client level? Aren't we talking about the server transcoding audio?

Ideally it would be handled by the app (dynamic range compression - not normalization between content) because, any time something transcodes or processes in any way on the server, we have people complaining so we try to handle as much as we can at the app end.

Again, though, if your primary concern is differing audio between different media, then that is a much more difficult issue to try and compensate for.
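
To make the distinction concrete, here is a minimal sketch of what dynamic range compression does to levels within a single track. It is only an illustration, not anything Emby implements. The threshold and ratio are made-up defaults, and a real compressor would follow a smoothed signal envelope with attack and release times rather than acting on individual samples as this toy version does.

```python
import math


def compress_db(level_db: float, threshold_db: float = -20.0, ratio: float = 4.0) -> float:
    """Static compressor curve: levels above the threshold are reduced by `ratio`."""
    if level_db <= threshold_db:
        return level_db  # quiet material passes through unchanged
    return threshold_db + (level_db - threshold_db) / ratio


def compress_sample(sample: float, threshold_db: float = -20.0, ratio: float = 4.0) -> float:
    """Apply the static curve to one sample in the range [-1.0, 1.0]."""
    if sample == 0.0:
        return 0.0
    level_db = 20.0 * math.log10(abs(sample))
    out_db = compress_db(level_db, threshold_db, ratio)
    # Convert back to linear amplitude and restore the original sign.
    return math.copysign(10.0 ** (out_db / 20.0), sample)
```

With these example settings a full-scale peak at 0 dB comes out at -15 dB while anything below -20 dB is untouched, so loud scenes and quiet dialogue end up closer together. Normalization between different titles, by contrast, applies one gain to a whole track and leaves its internal range alone.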


visproduction

Interesting topic.  ffmpeg normalization is possible.  It should be an on / off option to help people with average computers not wanting to add processing time or alter the audio.  See: https://superuser.com/questions/323119/how-can-i-normalize-audio-using-ffmpeg

Normalization would probably make the audio sound a little clearer on small hardware playback. It can also fix bad volume audio.  But on a larger TV it will alter the sound enough to make it sound broken to an audiophile's ear.  Few people are that discriminating. For everyone else, the audio would end up being tiring to listen to after an hour or two.

I would guess for a 2 hour media program on an average computer, encoding time would increase by perhaps 10% to 25%. That extra time can be a problem.  So, Normalization must have an off switch.  I don't think you can normalize for live content because the method is to read the entire audio track for the whole length of the media and then adjust the volume and compression.  Perhaps it could be done in segments.  The result might not sound that great.
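
The "read the whole track first" point can be sketched as a two-pass peak normalizer. This is a simplified illustration in plain Python, not what ffmpeg does internally: ffmpeg's loudnorm filter targets perceived loudness rather than the raw peak. The function names are hypothetical.

```python
from typing import Iterable, List


def measure_peak(samples: Iterable[float]) -> float:
    """Pass 1: scan the whole track for the largest absolute sample."""
    return max((abs(s) for s in samples), default=0.0)


def peak_normalize(samples: List[float], target_peak: float = 1.0) -> List[float]:
    """Pass 2: apply one constant gain so the loudest sample hits target_peak."""
    peak = measure_peak(samples)
    if peak == 0.0:
        return list(samples)  # silence: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]
```

The gain cannot be chosen until pass 1 has seen every sample, which is why this style of normalization does not fit live content without falling back to per-segment estimates.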

It was mentioned the average user does not want to adjust any settings to make the audio work better.  It's sort of like saying you want a "Make it sound better" button.  Normalization is one step of changing the audio.  There are up to about a dozen steps of audio encoding in typical media to get the original recorded audio to the viewer.  Normalization is not used in professional playback setups. It can have its own issues.  A surround sound codec such as DTS or Dolby is expected to have a certain audio profile.  If the surround sound is normalized prior to being decoded by the TV, then the decoding will result in audio with improper EQ and maybe even incorrect volume per channel.  If you decide to normalize, then the encoded result needs to be either stereo or PCM audio so the TV doesn't adjust the audio EQ curve again.  After normalization, it could be that the TV no longer recognizes the surround sound codec anyway and just passes each audio speaker track along with no decoding.  I never ran into this question, because I would never treat audio like that.

The home or commercial theater installer instead adjusts all the steps of audio quality to be optimal.  So, when you go to the theater, you get great sound.  There are many things you can add to your home system to improve audio quality.  But that is not the topic of this forum post.  If you are trying to get better audio quality, it might make sense to consider upgrading audio playback and your encoding settings or process, as well as rejecting original media copies that have audio problems.




 


1 hour ago, ebr said:

However, I think we're talking about different things here.  There are two situations people have trouble with.  The one I was assuming was the issue of a particular soundtrack having a high dynamic range, so you feel like you have to fiddle with the volume constantly between scenes.  That is the scenario described in the post I most recently responded to.  What you describe above comes from different media from different sources with different types of audio and encoding/mastering or copying.  That is a different situation (and actually more difficult) than just high dynamic range in a single audio track.  A discussion of that is in the second link I provided above.

Ideally it would be handled by the app (dynamic range compression - not normalization between content) because, any time something transcodes or processes in any way on the server, we have people complaining so we try to handle as much as we can at the app end.

Again, though, if your primary concern is differing audio between different media, then that is a much more difficult issue to try and compensate for.

My bad. I may be referring to a different issue.

The most common problem I've seen goes like this:

User plays a movie with a 5.1 track. User has no surround sound setup, so the 5.1 track is mixed down to stereo. User turns the volume wayyyyyy up to compensate for the attenuation that comes with the mixdown. User finishes that movie and then starts one with a 2.0 track. User is blasted with sound. User wonders why things are this way. User asks me. I explain what's going on, and we all just live with it because whatayagonnado.

If somehow, with clipping protection, the attenuation from a 5.1 -> 2.0 mixdown could be addressed in the background, you guys would be heroes - underappreciated heroes for sure, but still heroes (to my parents at least).


frankmb
2 hours ago, C.S. said:

User plays a movie with a 5.1 track. User has no surround sound setup, so the 5.1 track is mixed down to stereo. User turns the volume wayyyyyy up to compensate for the attenuation that comes with the mixdown. User finishes that movie and then starts one with a 2.0 track. User is blasted with sound. User wonders why things are this way. User asks me. I explain what's going on, and we all just live with it because whatayagonnado.

This is me LOL. I have been running various HTPC clients and servers for 15+ years and I never knew why I almost need to max out the volume on my TV for some shows, and then that setting is way too loud for other things.


PuffyToesToo
21 hours ago, C.S. said:

My bad. I may be referring to a different issue.

The most common problem I've seen goes like this:

User plays a movie with a 5.1 track. User has no surround sound setup, so the 5.1 track is mixed down to stereo. User turns the volume wayyyyyy up to compensate for the attenuation that comes with the mixdown. User finishes that movie and then starts one with a 2.0 track. User is blasted with sound. User wonders why things are this way. User asks me. I explain what's going on, and we all just live with it because whatayagonnado.

If somehow, with clipping protection, the attenuation from a 5.1 -> 2.0 mixdown could be addressed in the background, you guys would be heroes - underappreciated heroes for sure, but still heroes (to my parents at least).

Yes, this right here. 


Tremas
22 hours ago, C.S. said:

My bad. I may be referring to a different issue.

The most common problem I've seen goes like this:

User plays a movie with a 5.1 track. User has no surround sound setup, so the 5.1 track is mixed down to stereo. User turns the volume wayyyyyy up to compensate for the attenuation that comes with the mixdown. User finishes that movie and then starts one with a 2.0 track. User is blasted with sound. User wonders why things are this way. User asks me. I explain what's going on, and we all just live with it because whatayagonnado.

If somehow, with clipping protection, the attenuation from a 5.1 -> 2.0 mixdown could be addressed in the background, you guys would be heroes - underappreciated heroes for sure, but still heroes (to my parents at least).

I haven't used this myself, but can't this specific case be controlled by adjusting the "Audio Boost When Downmixing" setting in the server Transcoding section?

[Attached image]


Happy2Play
7 minutes ago, Tremas said:

I haven't used this myself, but can't this specific case be controlled by adjusting the "Audio Boost When Downmixing" setting in the server Transcoding section?


But the default is 2, which should increase downmixed sound.  But the scenario provided suggests an issue with downmixing surround sound to stereo.


1 hour ago, Tremas said:

I haven't used this myself, but can't this specific case be controlled by adjusting the "Audio Boost When Downmixing" setting in the server Transcoding section?


I don't believe the scenario I described involves any transcoding on the server side. The client direct plays and downmixes the 5.1 audio to 2.0, forcing attenuation.


Tremas

Ok, just a thought. When you said it was downmixing because a user did not have a 5.1 setup, I thought you were suggesting that it was unable to direct play.


Happy2Play
12 minutes ago, C.S. said:

I don't believe the scenario I described involves any transcoding on the server side. The client direct plays and downmixes the 5.1 audio to 2.0, forcing attenuation.

Then this would be a specific client issue, as it is not downmixing correctly; I assume it is losing the center channel.

But normalization would require conversion and since the server is not doing it I would say it is out of Emby's hands.


2 hours ago, Happy2Play said:

Then this would be a specific client issue, as it is not downmixing correctly; I assume it is losing the center channel.

No I don't believe that is the case I'm describing. This is what I'm talking about:

https://bobpariseau.com/blog/2018/6/20/understanding-audio-downmix-and-surround-sound-processing-or-wait-ive-got-the-wrong-number-of-speakers

Quote

These rules, for example, detail "Downmix Attenuation".  A Digital Audio stream has a maximum volume it can represent -- called the Full Scale signal.  But if you are mixing two or more streams together as part of Downmixing, what if they are each ALREADY at Full Scale?  You'll end up "clipping" the result -- meaning the audio gets distorted.  So Downmix Attenuation is applied.  That is, the individual signals are each reduced in volume before they are summed together, to keep the result from clipping.  The AMOUNT of Downmix Attenuation you need is based on how many channels are being mixed together -- plus how much attenuation might already have been applied to "preserve the sound stage" as described above.  Again, this is all standardized.  From the user's perspective, this shows up as difference in volume according to the content you are playing.  For example if you play a Stereo track into Stereo speakers at a given Volume setting, it will sound LOUDER than if you play a 5.1 track into those same Stereo speakers at the same Volume setting. The 5.1 track has to be Downmixed, and Downmix Attenuation has to be applied.

The next section is VERY interesting:

Quote

(Alternatively, the AVR could process the Digital Audio stream using more bits per sample than were present in the original audio tracks.  That means Full Scale for the processed audio is larger than Full Scale for the content channels.  So there's more room to sum things together without clipping.  Thus no need for Downmix Attenuation, but at the cost of using more expensive electronics in the AVR.)

So all you guys need to do is emulate some fancy AVR magic. Piece of cake?😬
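
As a rough illustration of the attenuation described in the first quote, here is a sketch of a 5.1 to stereo fold-down with a fixed worst-case scale factor. The 0.707 (about -3 dB) coefficients for center and surrounds are a commonly cited convention and are an assumption here, as is dropping the LFE channel. Actual players and devices may use different values.

```python
from typing import Tuple

CENTER_GAIN = 0.707    # about -3 dB, assumed coefficient
SURROUND_GAIN = 0.707  # about -3 dB, assumed coefficient

# Worst case: front, center and surround are all at full scale at once,
# so the sum is scaled down far enough that it can never exceed 1.0.
DOWNMIX_ATTENUATION = 1.0 / (1.0 + CENTER_GAIN + SURROUND_GAIN)


def downmix_frame(l: float, r: float, c: float, lfe: float,
                  ls: float, rs: float) -> Tuple[float, float]:
    """Fold one 5.1 frame down to stereo; the LFE channel is dropped here."""
    left = (l + CENTER_GAIN * c + SURROUND_GAIN * ls) * DOWNMIX_ATTENUATION
    right = (r + CENTER_GAIN * c + SURROUND_GAIN * rs) * DOWNMIX_ATTENUATION
    return left, right
```

With these numbers a sound that sits only in the front left channel comes out at about 0.41 of its original amplitude, roughly 7.7 dB quieter than the same sound in a native stereo track. That gap is what makes people turn the volume up for 5.1 content and then get blasted by the next 2.0 title.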


Happy2Play

To me, everyone is still talking about something different, as Server and Client are totally different.  If an item is Direct Played, then the server is eliminated and anything that happens is the client's fault.


Tremas
13 hours ago, Happy2Play said:

To me, everyone is still talking about something different, as Server and Client are totally different.  If an item is Direct Played, then the server is eliminated and anything that happens is the client's fault.

One step further. If I understand correctly (and I fully accept that I may not understand correctly), if the media is direct played it is most likely using the native media player on the target device. It wouldn't be the Emby client's fault so much as how the Roku/Fire stick/Chromecast/Shield/Apple TV hardware handles that type of audio stream. If so, I think the only way for Emby to get the device to behave differently would be to force a transcode and do the downmix on the server. The Emby client is doing its job correctly based on what capabilities the hardware is reporting.


pwhodges
20 hours ago, C.S. said:

The next section is VERY interesting:

So all you guys need to do is emulate some fancy AVR magic. Piece of cake?😬

What is described there requires the use of floating point (as in many DAWs).  But there's no magic; if the summed audio goes "over the top" (requires the use of floating point) then it will still need to be attenuated before it can be played on any hardware.

The reason that real-time summing has to take a cautious approach is that there is no way of knowing in advance how loud the loudest part will be.

Paul
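
A small sketch of the point above, under the assumption of two equally weighted channels with samples in the range [-1.0, 1.0]. Summing in floating point lets the intermediate result go over full scale, but it still has to be brought back within range before playback. An offline process can measure the real peak first, while a real-time one has to assume the worst. The function names are hypothetical.

```python
from typing import List


def sum_then_fit(a: List[float], b: List[float]) -> List[float]:
    """Offline: sum in floating point, then scale only if the peak exceeds full scale."""
    mixed = [x + y for x, y in zip(a, b)]  # may exceed 1.0 while in float
    peak = max((abs(s) for s in mixed), default=0.0)
    gain = 1.0 if peak <= 1.0 else 1.0 / peak
    return [s * gain for s in mixed]


def sum_worst_case(a: List[float], b: List[float]) -> List[float]:
    """Real time: the peak is unknown in advance, so halve every sample up front."""
    return [(x + y) * 0.5 for x, y in zip(a, b)]
```

When the two inputs never add up past full scale, the offline version leaves the level alone while the real-time version is still 6 dB quieter. That is the cautious approach described above.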


roaku
16 minutes ago, pwhodges said:

What is described there requires the use of floating point (as in many DAWs).  But there's no magic; if the summed audio goes "over the top" (requires the use of floating point) then it will still need to be attenuated before it can be played on any hardware.

The reason that real-time summing has to take a cautious approach is that there is no way of knowing in advance how loud the loudest part will be.

Paul

Let's add a VST pipeline so we can put a brickwall limiter on the master to pump up the volume without clipping. Maybe add some nice analog tape simulation so everything sounds like it was recorded in 1983, since that's trendy right now.
