Dumb Question - Why Use ffprobe for Audio Files?

June 6, 2022

Title kinda says it all.

It seems to me that using ffprobe for scanning audio is wasteful of both time and cpu cycles. Would it not be much more efficient (and significantly faster . . . like by 100X) to simply read the embedded tags? I get that not all files are (properly) tagged but perhaps we could first try to grab the tags and only resort to ffprobe if embedded metadata wasn't present or if the user specifically requested a "deep scan". What info does Emby get from ffprobe that it wouldn't get from embedded tags?

Just asking because music and audiobook scans are so painfully long with ffprobe having to exit and reinit for each file. It seems like there has to be a better way.

Cheers.

June 6, 2022

Hi, the main thing is that if we have to transcode, the information we get from the probing process influences how the transcoding happens.

June 6, 2022

17 hours ago, SideCar said:

Just asking because music and audiobook scans are so painfully long with ffprobe having to exit and reinit for each file. It seems like there has to be a better way.

This is the one point that is true. ffprobe is run as an external process separately on each file.
There are multiple reasons why we are doing this:

stability
security
error resiliency
isolation of concerns
portability
etc.

I can't go into detail for each of those points; let's go with a simple and easily understandable explanation:

When we run ffprobe.exe as a separate process on a file and some kind of error occurs (maybe due to an incorrect file or a bug in ffprobe), we don't need to care much about it. The process ends, memory is freed, and all that Emby Server gets in return is an error exit code.

But if we had compiled the ffprobe code/library into Emby Server, or used it as a dynamic library, we wouldn't need to create a process for each file. In that case, though, everything that could go wrong would be our concern (our process, our concern). Memory that doesn't get freed would accumulate in the Emby Server process until it fails. And a segfault in the probing code would not simply give us an error return code: it would crash the whole Emby Server.
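To illustrate the isolation argument, here is a minimal Python sketch (not Emby's actual code). The helper name run_isolated is made up for this example, and the "crashing prober" is simulated with a Python child process that calls os.abort(), since a real ffprobe crash is hard to trigger on demand. The point is only that the parent sees a failed exit status and keeps running.

```python
import subprocess
import sys


def run_isolated(cmd, timeout=30):
    """Run cmd in a child process and return (ok, stdout, returncode).

    Hypothetical helper for illustration. A crash or hang in the child
    never takes the calling process down with it.
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child on timeout before raising.
        return False, "", None
    return result.returncode == 0, result.stdout, result.returncode


# A well-behaved child: exit code 0 and some output.
ok_good, out_good, code_good = run_isolated(
    [sys.executable, "-c", "print('ok')"]
)

# Stand-in for a prober that crashes hard (assumption: simulated with
# os.abort(), not a real ffprobe failure).
ok_bad, out_bad, code_bad = run_isolated(
    [sys.executable, "-c", "import os; os.abort()"]
)

print("good child:", ok_good, code_good)
print("crashed child:", ok_bad, code_bad)
print("parent is still running")
```

If the same crash happened inside a library loaded into the server process, there would be no return code to inspect; the server itself would be gone.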

17 hours ago, SideCar said:

I get that not all files are (properly) tagged but perhaps we could first try to grab the tags and only resort to ffprobe if embedded metadata wasn't present or if the user specifically requested a "deep scan"

What do you mean by "grab the tags only"?

Most container formats are different and need different code to read the tags. Also, we don't only need to know the tags. We need to know the codecs, channel layout, sampling rate, bit rate, and more for one of the primary things that Emby does: providing the media to clients in a form they can play.

But that's not a "deep scan". ffprobe just reads the headers. It reads more data only in some special cases (very rare for audio) or when it is instructed to do so.
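As a rough picture of what that probing data looks like, here is a Python sketch that pulls the properties mentioned above out of ffprobe's JSON output. The ffprobe options used (-v error, -show_format, -show_streams, -of json) are standard ones, but the helper names and the sample document are made up for illustration; the sample is hand-written, not captured from a real file, and this is not how Emby itself consumes the data.

```python
import json


def build_ffprobe_cmd(path):
    """Command line asking ffprobe for container and stream info as JSON."""
    return ["ffprobe", "-v", "error", "-show_format", "-show_streams",
            "-of", "json", path]


def _to_int(value):
    # ffprobe reports sample_rate and bit_rate as strings in JSON output.
    return int(value) if value is not None else None


def summarize_audio(probe):
    """Pick out the first audio stream's technical properties plus tags."""
    audio = next(
        (s for s in probe.get("streams", [])
         if s.get("codec_type") == "audio"),
        None,
    )
    if audio is None:
        return None
    return {
        "codec": audio.get("codec_name"),
        "channels": audio.get("channels"),
        "channel_layout": audio.get("channel_layout"),
        "sample_rate": _to_int(audio.get("sample_rate")),
        "bit_rate": _to_int(audio.get("bit_rate")),
        "tags": probe.get("format", {}).get("tags", {}),
    }


# Hand-written sample in the shape of ffprobe's JSON (illustrative values).
SAMPLE = json.loads("""
{
  "streams": [
    {"codec_type": "audio", "codec_name": "flac", "channels": 2,
     "channel_layout": "stereo", "sample_rate": "44100",
     "bit_rate": "900000"}
  ],
  "format": {"format_name": "flac",
             "tags": {"ARTIST": "Example Artist", "TITLE": "Example Title"}}
}
""")

summary = summarize_audio(SAMPLE)
print(summary)
```

With a real file you would run the command from build_ffprobe_cmd through subprocess and feed its stdout to json.loads. A tag-only reader would give you the "tags" part but none of the stream properties.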

17 hours ago, SideCar said:

Would it not be much more efficient (and significantly faster . . . like by 100X)

No, it wouldn't. The overhead is in process creation. No other tool would be significantly faster, and no other tool would give equally reliable and accurate data.
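If you want to see the per-launch cost on your own machine, a small Python sketch like the following can time it. The helper name mean_spawn_seconds is made up for this example, and the default command launches the Python interpreter as a stand-in, which is an assumption: its startup cost is not the same as ffprobe's, so substitute your own ffprobe command line to get a meaningful number.

```python
import subprocess
import sys
import time


def mean_spawn_seconds(cmd, runs=5):
    """Average wall-clock time to launch cmd and wait for it to exit."""
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    return (time.perf_counter() - start) / runs


# Stand-in command (assumption): a child that starts and exits immediately.
cost = mean_spawn_seconds([sys.executable, "-c", "pass"])
print(f"{cost * 1000:.1f} ms per launch")
```

Multiplying the result by the number of files in a library gives a rough lower bound for the scan time that comes from launching processes alone.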


SideCar 4

Luke 40068

softworkz 4566
