
Add DV, Atmos, DTS:X, HDR10+, HLG information to database


FrostByte


rbjtech
29 minutes ago, signde said:

I was surprised by that too, but upon further digging, Radarr no longer uses the Media Info library either and is strictly ffprobe-based.

https://github.com/Radarr/Radarr/releases/tag/v4.0.4.5922

The class names that are used in analyzing media in both apps are still named MediaInfo. I suspect that is both because the placeholders both arr apps use to rename files reference that exact name, and because in general "Media Info" is literally what that data is.

? That's a 2-year-old release - 5.2.5 is the latest - and that uses MediaInfo ...

Anyway - doesn't really matter - all we want is for emby to support all the video and audio codecs - not much to ask of a modern media player ... ;)


signde
10 minutes ago, rbjtech said:

? That's a 2-year-old release - 5.2.5 is the latest - and that uses MediaInfo ...

Yes, that release note is 2 years old. Radarr removed the MediaInfo library and moved to ffprobe back in v4. That remains the case today.

It just so happens there are still namespaces and classes in their code base called "MediaInfo", and what you are seeing in the release notes for v5 refers to changes in those areas, not to the "MediaInfo" library. The logic in Radarr to determine HDR information is the same as Sonarr's, and it explicitly uses ffprobe.

I've been writing .net apps for a very long time and that is what the arr apps are written in, so it's pretty easy for me to follow.


rbjtech
23 minutes ago, signde said:

Radarr removed the MediaInfo library and moved to ffprobe back in v4. That remains the case today.

I'm just looking at the code myself (I'm no coder, but I can understand the C# easily enough) - but I'm still not sure how it's pulling out that it's HDR10+ ☺️  I get the sidedata bit for DV - but I see no reference to 2094-40 in the ffprobe output, even though I see it referenced in the code ...  I'm genuinely interested to find out if it's possible, as my scripts all use a combo of ffprobe and MediaInfo - moving to one would make things easier .. !


signde
5 hours ago, rbjtech said:

...I'm still not sure how it's pulling out that it's HDR10+ ... I'm genuinely interested to find out if it's possible ...

Looking through the Servarr code, their ffmpeg.core library looks for the string "HDR Dynamic Metadata SMPTE2094-40 (HDR10+)" in the sidedata, which comes straight out of ffmpeg.

There is a separate ffprobe "GetFrameJson" command used to grab that (-read_intervals "%+#1" stops reading after the first packet), which ends up looking like this:

./ffprobe -i "/path to your movie.mkv" -print_format json -show_frames -v quiet -sexagesimal -read_intervals "%+#1" -select_streams v:0

When I run that on the DV/HDR10+ file you posted in that AFTV 4k max thread a while back, I get this:

{
                    "side_data_type": "HDR Dynamic Metadata SMPTE2094-40 (HDR10+)",
                    "application version": 1,
                    "num_windows": 1,
                    "targeted_system_display_maximum_luminance": "500/1",
                    "maxscl": "19740/100000",
                    "maxscl": "19740/100000",
                    "maxscl": "40981/100000",
                    "average_maxrgb": "232/100000",
                    "num_distribution_maxrgb_percentiles": 9,
                    "distribution_maxrgb_percentage": 1,
                    "distribution_maxrgb_percentile": "3/100000",
                    "distribution_maxrgb_percentage": 5,
                    "distribution_maxrgb_percentile": "9975/100000",
                    "distribution_maxrgb_percentage": 10,
                    "distribution_maxrgb_percentile": "97/100000",
                    "distribution_maxrgb_percentage": 25,
                    "distribution_maxrgb_percentile": "34/100000",
                    "distribution_maxrgb_percentage": 50,
                    "distribution_maxrgb_percentile": "87/100000",
                    "distribution_maxrgb_percentage": 75,
                    "distribution_maxrgb_percentile": "221/100000",
                    "distribution_maxrgb_percentage": 90,
                    "distribution_maxrgb_percentile": "539/100000",
                    "distribution_maxrgb_percentage": 95,
                    "distribution_maxrgb_percentile": "924/100000",
                    "distribution_maxrgb_percentage": 99,
                    "distribution_maxrgb_percentile": "10354/100000",
                    "fraction_bright_pixels": "0/1000",
                    "knee_point_x": "36/4095",
                    "knee_point_y": "76/4095",
                    "num_bezier_curve_anchors": 9,
                    "bezier_curve_anchors": "188/1023",
                    "bezier_curve_anchors": "410/1023",
                    "bezier_curve_anchors": "609/1023",
                    "bezier_curve_anchors": "784/1023",
                    "bezier_curve_anchors": "821/1023",
                    "bezier_curve_anchors": "850/1023",
                    "bezier_curve_anchors": "880/1023",
                    "bezier_curve_anchors": "911/1023",
                    "bezier_curve_anchors": "937/1023"
                }

There is also sidedata for "Dolby Vision Metadata", so it's pretty easy to see this is a combo DV/HDR10+ file.

I think this demonstrates it is very possible to detect HDR10+ straight out of ffmpeg. 
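If you want to script the same check, here is a minimal Python sketch (an illustration only, not Emby's or Servarr's code; it assumes ffprobe is on the PATH and simply wraps the command above):

```python
import subprocess
import sys

# The exact string the Servarr code matches on (from the dump above).
HDR10PLUS_MARKER = "HDR Dynamic Metadata SMPTE2094-40 (HDR10+)"

def has_hdr10plus(path: str) -> bool:
    # Same flags as the command above: decode only the first frame of
    # the first video stream and dump its side data as JSON.
    result = subprocess.run(
        ["ffprobe", "-i", path, "-print_format", "json", "-show_frames",
         "-v", "quiet", "-read_intervals", "%+#1", "-select_streams", "v:0"],
        capture_output=True, text=True)
    # The HDR10+ side data block repeats keys (see the dump above), so a
    # plain substring check is more robust than strict JSON parsing.
    return HDR10PLUS_MARKER in result.stdout

if __name__ == "__main__":
    print("HDR10+" if has_hdr10plus(sys.argv[1]) else "no HDR10+ side data in first frame")
```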

 


rbjtech
15 minutes ago, signde said:

Looking through the Servarr code, their ffmpeg.core library looks for the string "HDR Dynamic Metadata SMPTE2094-40 (HDR10+)" in the sidedata, which comes straight out of ffmpeg. [...] I think this demonstrates it is very possible to detect HDR10+ straight out of ffmpeg.

Thanks - that's very useful. Yes, the DV data is present with just a -show_streams; it was the side_data_type via the -show_frames parameter that is key here :)

Hopefully emby can use this to detect HDR10+ as well ... it could just be run as a subset of detecting HDR, so it doesn't need to be run for every ffprobe.

I'll test this over the weekend.
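Something like this rough sketch is what I mean by running it as a subset (Python, purely to illustrate; it assumes the stream-level color_transfer field is the gate - HDR10, and therefore HDR10+, reports smpte2084 there):

```python
import json
import subprocess

def probe_streams(path: str) -> dict:
    # The cheap stream-level probe that already happens for every file.
    out = subprocess.run(
        ["ffprobe", "-i", path, "-print_format", "json", "-show_streams",
         "-v", "quiet", "-select_streams", "v:0"],
        capture_output=True, text=True).stdout
    return json.loads(out)

def detect_hdr10plus(path: str) -> bool:
    streams = probe_streams(path).get("streams") or [{}]
    # Anything without the SMPTE ST 2084 (PQ) transfer cannot be HDR10+,
    # so those files skip the frame probe entirely.
    if streams[0].get("color_transfer") != "smpte2084":
        return False
    frames = subprocess.run(
        ["ffprobe", "-i", path, "-print_format", "json", "-show_frames",
         "-v", "quiet", "-read_intervals", "%+#1", "-select_streams", "v:0"],
        capture_output=True, text=True).stdout
    return "SMPTE2094-40 (HDR10+)" in frames
```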

@softworkz @ebr @Luke - FYI


softworkz

HDR10+ builds upon HDR10 (hence the plus 😉 ) - the unfortunate outcome of this design is that there is no header-included information about whether a file is HDR10+. The header only tells whether it's HDR10 - which every HDR10+ video is (by definition).

To identify whether an HDR10 video also provides HDR10+ information, it is required to read the side data of individual frames. This would be absolutely no problem when processing (probing) time doesn't matter - like for your command line tests - but in the case of Emby, it does matter a lot. The difficulties in probing for this are these:

HDR10+ information is usually provided on scene changes (or more specifically: changes to lighting). It's not unusual that a scene can last for many minutes or even longer, and you cannot take for granted that there will be HDR10+ information close to the beginning. So the first question here is: which duration do you want to probe for? 1min - 5min - 10min - longer?

The next problem is that ffprobe doesn't have an ability like "stop once you have found an HDR10+ data block". So you have to probe a fixed interval and analyze the results later. And when you haven't found any HDR10+ information, you can't be sure that there won't come any at a later time. So what to do? Abandon it? Or probe an additional interval?
As soon as we say we support HDR10+ detection, we need to be sure to detect it reliably - not in a "maybe, maybe not" manner. That would require probing the same files multiple times.
Also, the first ffprobe run is usually too short to detect it, because ffprobe stops as soon as it has all the information about the streams. We really cannot run it with a fixed interval because that would blow up library scanning time.
Everything we are talking about here increases library scanning time, and this has always been a sensitive area and a focus of user complaints for a long time. That's why there is no HDR10+ detection (at this time).
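To illustrate the trade-off: a fixed-interval probe would look something like this sketch (the 60-second interval is an arbitrary choice - which is exactly the problem):

```python
import subprocess

def probe_interval_for_hdr10plus(path: str, seconds: int) -> bool:
    # ffprobe has no "stop at the first HDR10+ block" switch, so it
    # decodes the whole requested interval regardless of what it finds.
    out = subprocess.run(
        ["ffprobe", "-i", path, "-print_format", "json", "-show_frames",
         "-v", "quiet", "-read_intervals", f"%+{seconds}",
         "-select_streams", "v:0"],
        capture_output=True, text=True).stdout
    return "SMPTE2094-40 (HDR10+)" in out

# A False here is inconclusive: the metadata may first appear on a scene
# change minutes in, so the caller has to decide whether to probe again.
found = probe_interval_for_hdr10plus("movie.mkv", 60)
```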

PS: While writing, I had a déjà vu - I think I've explained this some time ago already. Maybe somebody can find it - it was a bit more detailed IIRC.


rbjtech
11 hours ago, softworkz said:

To identify whether an HDR10 video also provides HDR10+ information, it is required to read the side data of individual frames. [...] Everything we are talking about here increases library scanning time, and this has always been a sensitive area and a focus of user complaints for a long time.

Thanks @softworkz

On every single file that I tested, using ffprobe to frame-analyse the very first packet (and only the first packet), it correctly identifies it as HDR10+.

Maybe as part of an informal HDR10+ spec it has to list this as HDR10+ in its first packet ... maybe I'm being lucky.

I get what you're saying about 'overhead' - but if emby has identified a file as HDR in its standard ffprobe (and got the HDR and DV data), then I hardly think it's going to slow things down by doing a 2nd pass to check the first packet to see if it's HDR10+ and, if it is, simply log that data as well. You already have the file in cache having just read its header - so we must only be talking microseconds extra here .. ?

I'm now interested in whether this can also be done for audio - is there extra metadata/object-based data for the taking there as well ... 🤔


softworkz

11 hours ago, rbjtech said:

maybe I'm being lucky.

I think so..

11 hours ago, rbjtech said:

...You already have the file in cache having just read its header - so we must only be talking microseconds extra here .. ?

It's more than that. Last year I had done something similar for certain subtitle types. In this case, we get the stream info from the initial probe run, but the header doesn't provide information about the target screen size - which is important for setting up transcoding (like burn-in). I had implemented such a second pass, probing for subtitle packets only.

It has been in the beta for only two or three versions, then Luke had to remove it due to user complaints about increased library scanning times.


rbjtech
11 hours ago, softworkz said:

I think so.. [...] It has been in the beta for only two or three versions, then Luke had to remove it due to user complaints about increased library scanning times.

ok - and that's fair enough.

Due to the usage of the MediaInfo Plugin, I do think there is demand to know this extra codec information - so maybe simply have it as an option in the library: tick this for extra mediainfo, with a warning and acknowledgement that it will increase scan times. Default is off, so an initial scan is not impacted by it on new systems.

 


softworkz

Don't get me wrong - I really want to do this, but it's about finding a good way.

What we've learned so far is that any kind of change to the library scanning and probing process which might increase its duration must be avoided by all means. We just don't want to see any more user posts saying that another software requires a lot less time for scanning. 😉

Beyond the regular library scanning, there's actually a bunch of information that we (and users) would like to get out of the media files; it just can't be part of the library scanning for the given reasons. Now, the way I see it is that if we "touch" all media files a second time (given the user has opted in to it), then it should be final and there shouldn't be a third or fourth time - ideally (if possible). I had proposed this years ago already, but it's been lying around for a while, waiting for reasons that turn it from a nice-to-have into a requirement. The idea is to have a kind of secondary/extended probing process which collects all the additional information that we want in a single run (a rough sketch of such a combined pass follows the list), including:

  • Subtitle target screen size (for certain graphical subtitles)
  • Subtitle availability detection 
    (tv signals often have streams without content)
  • Availability detection for closed captions
    (and if available, which sub-channels have actual content)
  • HDR10+ presence
  • HEVC keyframe times for better HLS streaming
  • [undisclosed idea]
  • [undisclosed idea]
  • Thumbnail Images
  • Chapter Images
    (in addition to thumbs rather than just one of the two)
  • Automatic Chapter Detection if chapters unavailable
    (even though, this would be rather a matter of thumbnail post-processing) 
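Purely to illustrate the single-run idea (a sketch only, covering just two of the points above; nothing here is the actual design):

```python
import subprocess

def extended_probe(path: str, seconds: int = 30) -> dict:
    # One pass over the opening interval of the file, for all streams:
    # video frames carry the HDR10+ side data, while packets reveal
    # whether subtitle streams actually contain anything. (Thumbnail and
    # chapter images would still need ffmpeg itself rather than ffprobe.)
    out = subprocess.run(
        ["ffprobe", "-i", path, "-print_format", "json", "-show_packets",
         "-show_frames", "-v", "quiet", "-read_intervals", f"%+{seconds}"],
        capture_output=True, text=True).stdout
    return {
        "hdr10plus": "SMPTE2094-40 (HDR10+)" in out,
        "subtitle_packets_seen": '"codec_type": "subtitle"' in out,
    }
```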

I'm not sure whether intro detection would fit into this (the kind of processing required is probably too different).
The basis would be the quick-extraction for images, but further ffmpeg customization will be required in order to make it as efficient as possible.

Many points are still rather on the nice-to-have side, so I can't say how this will be prioritized - only that when it happens, it would be all at once (where feasible).
I agree that it's unfortunate not being able to distinguish between HDR10 and HDR10+, but it's not as crucial as with DOVI where it's really a question of "plays or play not".

Regarding Atmos and DTS:X, it's merely a matter of ffmpeg abilities. I didn't follow ffmpeg progress during the past months. When something gets added there regarding those audio formats, we'll surely integrate it in our detection.
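If and when a build can detect them, ffprobe surfaces such formats in the stream's profile string (newer builds report e.g. "Dolby TrueHD + Dolby Atmos" or "DTS-HD MA + DTS:X" - the exact strings depend on the ffmpeg version and are an assumption here), so detection would be roughly a string check like this sketch:

```python
import json
import subprocess

def audio_profiles(path: str) -> list:
    # Profile strings for every audio stream in the file.
    out = subprocess.run(
        ["ffprobe", "-i", path, "-print_format", "json", "-show_streams",
         "-v", "quiet", "-select_streams", "a"],
        capture_output=True, text=True).stdout
    return [s.get("profile", "") for s in json.loads(out).get("streams", [])]

profiles = audio_profiles("movie.mkv")
has_atmos = any("Atmos" in p for p in profiles)
has_dtsx = any("DTS:X" in p for p in profiles)
```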


rbjtech

Sounds good 👍

The MediaInfo Plugin has matured over time as well, and we now add all sorts of 'extras' that improve filtering etc. - but it would be really nice to get the 'extra data' into the database alongside the standard metadata, as opposed to simply tags and/or additions to the track title - then it starts to become really useful. :)

Maybe you haven't seen the plugin - but some community-added ideas and enhancements are below that you may want to add to the 'nice to have' emby to-do list .. 😎

[screenshots: MediaInfo plugin options]

 

 


Well, I'd say that "Use RBJTECH style formatting" is clearly a must-have!


signde
3 hours ago, softworkz said:

I agree that it's unfortunate not being able to distinguish between HDR10 and HDR10+, but it's not as crucial as with DOVI where it's really a question of "plays or play not".

I would argue that for me it IS an issue of plays or not plays - hence how I ended up in this thread. The new Fire TV Stick 4K Max will not play combo HDR10+ / DoVI files.

Is that an Amazon issue that should be fixed? Yes. Are they going to fix it? Doubtful.

Was Plex able to fix this by detecting HDR10+ and altering the metadata on the fly? Yes.

On 1/21/2024 at 4:22 AM, rbjtech said:

On every single file that I tested, using ffprobe to frame-analyse the very first packet (and only the first packet), it correctly identifies it as HDR10+. ... maybe I'm being lucky.

I do not have the largest library in the world, but every single file I have tried in my library that I know is HDR10+ mirrors this. I'm probing the first, and only the first, packet.

If you look at the code I linked in the Servarr project, it seems this is all they are doing as well. That operation seems to complete in under 1s for me on the files I analyzed. On top of that, they only do this if certain conditions exist in the metadata (VideoTransferCharacteristics / ColorTransfer), so that is another short circuit to save time.

The Servarr products have thousands and thousands of users. That seems like quite a large testing pool that has already vetted this strategy.

Make it a new, optional, off-by-default scanning setting that only applies to new scans, and mark it as experimental. That could be a quicker win that would provide usefulness and, most importantly, user feedback about how well the strategy works.


softworkz

It's been two years since I had looked into it. Zillions of technical details have flown through my brain since that time, so I'm afraid I only remember the conclusion but not all the specific details. What I'm sure about is that it would definitely have required a second run. And the metrics for the additional subtitle probing that was retracted were that it had added between 300 and 1500ms per file - and just like in this case, it wasn't done for all files but only for some with specific subtitle formats.

I was the one who had thought it would be ok and acceptable - but it turned out that it wasn't.


signde
48 minutes ago, softworkz said:

...the metrics for the additional subtitle probing that was retracted were that it had added between 300 and 1500ms per file...

Are you saying that adding up to a second per file for analysis is unacceptable? That sounds like a relatively low hit to me, but what do I know.

I didn't actually measure the time for probing just the first singular packet - it could be 100ms for all I know; it just always returns immediately, to my perception.

Also, I'm not sure what you mean about requiring a second run?

The algorithm I linked to does this probing immediately after the metadata is grabbed, so you net out at two ffprobe calls, but that can all be part of the single analysis sweep.


ebr

2 hours ago, signde said:

Are you saying that adding up to a second per file for analysis is unacceptable?

For 5000 items (a fairly conservative number) that would add 83 minutes to an operation.


neik
15 minutes ago, ebr said:

For 5000 items (a fairly conservative number) that would add 83 minutes to an operation.

It is a one-time job to get the additional information though, isn't it?


signde
12 minutes ago, ebr said:

For 5000 items (a fairly conservative number) that would add 83 minutes to an operation.

5000 HDR video files in a collection is a conservative number? Good grief, emby must be the tool of choice for the world's most extreme data hoarders.

At any rate, I ran some actual numbers.

ffprobe reading intervals of only the first packet ranged from 50ms to 200ms, with the average being ~125ms.

The size of the file does not make a difference.

This is on a low-budget QNAP NAS in a Docker container, analyzing files on physical 5400rpm spinny disks - so a woefully slow setup.
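(For reference, roughly how such a measurement can be reproduced - a sketch that wall-clocks the same first-packet probe; the path is a placeholder:)

```python
import subprocess
import time

def time_first_packet_probe(path: str) -> float:
    # Wall-clock the first-packet frame probe discussed above, in ms.
    start = time.perf_counter()
    subprocess.run(
        ["ffprobe", "-i", path, "-print_format", "json", "-show_frames",
         "-v", "quiet", "-read_intervals", "%+#1", "-select_streams", "v:0"],
        capture_output=True)
    return (time.perf_counter() - start) * 1000

print(f"{time_first_packet_probe('/path/to/movie.mkv'):.0f} ms")
```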


rbjtech

Just looking at my TV episodes (~30K total), HDR is only approx 1% of them - 320 to be precise. So taking the NAS processing time above, that's 40,000ms or 40 seconds extra.

It's also worth mentioning that I had to filter by Tag (HDR10 + HDR10+) to get the results below, as emby has not even included 'HDR' as a common 'MediaInfo' filterable attribute .. :( but it does have a 3D option ... ☺️

[screenshot: episode list filtered by HDR10/HDR10+ tags]


ebr

7 hours ago, rbjtech said:

HDR is only approx 1% of them

Today... and the time assumptions are still somewhat suspect, based on SW's previous experience.

The bottom line is, we can't be adding things that make the initial scan even longer, so I would imagine this type of probe would be best left to just before first playback, like we do with strm.


GrimReaper
6 minutes ago, ebr said:

we can't be adding things that make the initial scan even longer so I would imagine this type of probe would be best left to just before first playback like we do with strm.

Or make it a scheduled task/plugin, as @TeamB did for strm files?


signde

I stand by the data I have provided for ffprobe interval timing, but I don't really care when the detection happens. I can certainly understand that library scanning duration is a sensitive subject.

I only came here to prove it can be done in ffmpeg, with the further goal that down the road the backend can make decisions on how to handle the metadata, given that many clients have playback issues depending on the HDR information available.


rbjtech
38 minutes ago, GrimReaper said:

Or make it a scheduled task/plugin as @TeamBdid for strm files?

Being frank - all emby needs to do is add HDR10+ as a valid ExtendedVideoType and/or ExtendedVideoSubType (the same as they did for DV profiles) and any plugin/API call can just write it as HDR10+ as part of a 3rd-party effort. I would add it as an option in the MediaInfo plugin .. ;)

I'm unsure if the clients would need updating to 'show' HDR10+, but I would hope that the name lookup comes from the database. I'm not 100% sure, though, because the full DV profile is still not shown even though emby has the full DV profile (in ExtendedVideoSubType).

 


1 hour ago, rbjtech said:

Being frank - all emby needs to do is add HDR10+ as a valid ExtendedVideoType and/or ExtendedVideoSubType (the same as they did for DV profiles) and any plugin/API call can just write it as HDR10+ as part of a 3rd-party effort.

That's okay for display but, for playback decisions, we need the core to be in charge of the information it uses for inputs.


6 hours ago, rbjtech said:

Being frank - all emby needs to do is add HDR10+ as a valid ExtendedVideoType and/or ExtendedVideoSubType (the same as they did for DV profiles) and any plugin/API call can just write it as HDR10+ as part of a 3rd-party effort.

This is done already: https://betadev.emby.media/reference/pluginapi/MediaBrowser.Model.Entities.ExtendedVideoTypes.html?q=ExtendedVideoType

