Multiple Thumbnail Extract Processes


runtimesandbox

runtimesandbox

Could we get the option to (or the system automatically detect and set) the number of thumbnail extraction processes that can be run?

 

Rebuilding my library is taking a long long time due to the fact that the thumbnail extract only runs on one media item at a time - server is currently sat with ~30% cpu usage. 

  • Like 8
  • Agree 1

  • 5 months later...
rechigo

This would be nice. I could probably have at least 10 extracts going and my server would still have enough resources free


  • 2 weeks later...
miniliQuid

If this can be done it will certainly help, would love this for any task.

That way I can run the tasks late at night and they will be finished before the family gets out of bed :D


  • 3 months later...
neik
On 7/27/2020 at 8:49 PM, Sammy said:

How much content do you should be able to utilize the GPU, right@softworkz ?

 

I am not even sure if it's worth using the GPU for this, using multiple threads would already speed things up by a lot.
Just adding a second thread would already halve the time needed.


8 hours ago, neik said:

I am not even sure if it's worth using the GPU for this, using multiple threads would already speed things up by a lot.
Just adding a second thread would already halve the time needed.

Except for the IO access to disc that would take place. It's actually a pretty fast process at least compared to how long it used to take.

I'm pretty sure 1 thread was selected on purpose because it's designed to work in the background and not interfere with other things as much as possible.


neik
13 minutes ago, cayars said:

It's actually a pretty fast process at least compared to how long it used to take.

That's definitely correct.

 

1 hour ago, cayars said:

Except for the IO access to disc that would take place.

My drive is mounted over the network and I don't have any issues with IO load, and I can't imagine that a second thread would overload a drive.

Especially for those of us using local drives and SSDs.


But what if you're already streaming 5 videos from the server? That's what I was getting at: it likely wants to keep IO down so that it's available for other things, just in case.


BTW, I'm all for being able to set the number of threads this uses. Heck, if you have 32 cores it would be nice to use 4 just for this, as an example.


On 10/6/2019 at 12:26 PM, runtimesandbox said:

Could we get the option to (or the system automatically detect and set) the number of thumbnail extraction processes that can be run?

 

Rebuilding my library is taking a long long time due to the fact that the thumbnail extract only runs on one media item at a time - server is currently sat with ~30% cpu usage. 

Why do you choose to extract thumbnails during library scan? That doesn't make a lot of sense. (I don't know why that option even exists)

 

 


I do that, softworkz, and it makes sense if you use the option correctly.

For example, I had it off to load the initial library but turned it on after that.

Now whenever I load a movie or two it finds them, creates the index and I've got a perfectly usable movie as soon as it shows up in the library. Less confusing for family members than explaining why this movie has images and this one doesn't.

Of course I'm not dumping 100 movies at it either.

 

  • Agree 1

On 7/27/2020 at 6:23 PM, wilson18 said:

Would be a great feature to have. Seems silly for it to be restricted to 1 thread.

We have a quick extraction that extracts thumbnails 10-20 times faster than normal. It cannot operate in parallel, though, and wouldn't run faster but "wronger" instead.

 

 

  • Thanks 1

1 minute ago, cayars said:

Now whenever I load a movie or two it finds them, creates the index and I've got a perfectly usable movie as soon as it shows up in the library. Less confusing for family members than explaining why this movie has images and this one doesn't.

But movies will have all images, except chapter and thumbnails - so you're actually concerned about those?


Yep, it's a consistency thing for me. I could surely get by without doing this or kicking off the schedule job manually but doing it when Emby picks up new files on my system makes sense for me.  Less stuff to do later during nightly processing. :)

If I'm about to add a new TV Series with multiple seasons and lots of episodes I'll turn this off, scan, turn back on then kick off the job.

  • Agree 1

2 minutes ago, cayars said:

Yep, it's a consistency thing for me. I could surely get by without doing this or kicking off the schedule job manually but doing it when Emby picks up new files on my system makes sense for me.  Less stuff to do later during nightly processing. :)

If I'm about to add a new TV Series with multiple seasons and lots of episodes I'll turn this off, scan, turn back on then kick off the job.

Ah, now I understand - the alternative (fixed schedule at night) is just not very attractive and I'd agree to that.

My primary point is that it should be decoupled from the library scan. So instead of doing:

Add Item >> 10 min extraction >> Add Item >> 10 min extraction  >> Add Item >> 10 min extraction 

Rather do:

>> Add Item >> Add Item >> Add Item || Check system usage || when idle, do: 10 min extraction >> 10 min extraction >> 10 min extraction 

...when checking whether the system is in use or not (idle detection), there's no need for a late night task schedule
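
The decoupled flow described above could be sketched roughly like this. It is only an illustration in Python, not Emby's actual code: `DeferredExtractor`, `is_idle` and `extract` are hypothetical names, and the idle check is whatever the caller supplies.

```python
from collections import deque
from typing import Callable, Deque, List


class DeferredExtractor:
    """Queue items during a scan and extract thumbnails later, only while idle.

    Hypothetical sketch: `is_idle` and `extract` are supplied by the caller.
    """

    def __init__(self, is_idle: Callable[[], bool], extract: Callable[[str], None]):
        self._is_idle = is_idle
        self._extract = extract
        self._pending: Deque[str] = deque()

    def add_item(self, path: str) -> None:
        # The scan only records the item; no extraction happens here.
        self._pending.append(path)

    def run_pending(self) -> List[str]:
        """Process queued items one at a time, stopping as soon as the system is busy."""
        done: List[str] = []
        while self._pending and self._is_idle():
            path = self._pending.popleft()
            self._extract(path)
            done.append(path)
        return done

    @property
    def pending(self) -> List[str]:
        # Items that are still waiting for an idle moment.
        return list(self._pending)
```

Because the idle check runs before every item, a stream that starts halfway through simply pauses the queue, and the remaining items wait for the next idle window instead of a fixed late-night schedule.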

 

But let's get back on topic: Image extraction is something that needs to happen in the background in the most efficient way possible and should not affect the users' system operations. This is not meant to be a benchmark or load test for your many-core CPUs.

There's not much CPU involved anyway (in case of quick extraction): We're mostly reading out the key frames and only when there are no suitable matches at a point in time, we're going into decoding a few frames in sequence. You should be much more concerned about your network usage when the media isn't local.

 

ffmpeg threads parameter: You can completely forget about that. This is not some kind of global setting. When you specify that parameter, it will be matched to the closest component that supports multi-threading, and only applied to that one. So it might affect the audio decoder one time, and when you have a different audio decoder next time that doesn't support multi-threading, it might be applied to the video encoder instead.

That setting shouldn't even exist in Emby and the parameter should not be used as long as you aren't hand-tuning a specific command line. But for us - supporting thousands of command line variations, it's impossible to manage the threads parameter. Eventually it will do more harm than anything else. You must scratch that idea once and for all.

 

Parallel extractions would be a realistic option, but as mentioned above, there are other resources than CPU cores and Emby would surely not do something that could lead to a negative user experience. So I say "realistic" but probably "not reasonable".

Maybe @Luke will share his thoughts about that..

 

 

 

 

  • Like 2

neik

@softworkz, thanks for adding the developer's view.

It would be a nice-to-have, but I also think there are other requests with more importance than this one.

  • Like 1

  • 3 months later...
On 7/29/2020 at 8:05 PM, softworkz said:

We have a quick extraction that extracts thumbnails 10-20 times faster than normal. It cannot operate in parallel, though, and wouldn't run faster but "wronger" instead.

I've been asked to better explain why we cannot use multiple threads for thumbnail extraction.

In addition to what I had written above, let me put it this way: there's not much to accelerate, because in most cases extraction is limited by IO rather than CPU processing.
Unlike in earlier times, when we did a full decode of the video to get the images, we are now doing keyframe-based extraction. Keyframes are very easy to decode and we need to decode only those where we want to create a thumbnail image.

We need to seek through the whole video file for this, which means the whole file needs to be transferred over the network (unless it's stored locally), while the keyframe decoding doesn't take much CPU. Also, that seeking needs to happen sequentially, so there isn't much that could be parallelized anyway.
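
For readers who want to see what a keyframe-only grab looks like in practice, here is a minimal Python sketch that builds a generic ffmpeg command line. It is an assumption for illustration, not the command Emby runs: the helper names, the 320-pixel width and the 10-second interval are made up, and it grabs one image per invocation rather than walking the file in a single sequential pass as described above. The flags themselves (`-ss`, `-skip_frame nokey`, `-frames:v`, `-vf scale`) are standard ffmpeg options.

```python
from typing import List


def keyframe_thumbnail_cmd(video: str, at_seconds: float, out_jpg: str,
                           width: int = 320) -> List[str]:
    """Build an ffmpeg command that grabs one thumbnail near `at_seconds`.

    Assumption: this is a generic ffmpeg invocation, not the one Emby uses.
    """
    return [
        "ffmpeg",
        "-ss", f"{at_seconds:.3f}",   # seek in the input before decoding starts
        "-skip_frame", "nokey",       # ask the decoder to skip non-keyframes
        "-i", video,
        "-an",                        # ignore audio
        "-frames:v", "1",             # stop after one video frame
        "-vf", f"scale={width}:-1",   # resize, keeping the aspect ratio
        "-y", out_jpg,
    ]


def thumbnail_times(duration: float, interval: float = 10.0) -> List[float]:
    """Timestamps (in seconds) at which thumbnails would be taken."""
    count = int(duration // interval)
    return [i * interval for i in range(count + 1)]
```

The command list can be passed to `subprocess.run` if ffmpeg is installed. Either way the file still has to be read from wherever it is stored, which is the IO cost the post is pointing at.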

  • Like 1

rbjtech

I think the only reason you would need multiple threads for this is an initial, brand-new scan where you have multiple libraries across multiple platters across multiple I/O controllers. I/O is then less of an issue and you can utilise the extra CPU you may have.

However, as this is such an edge case, I don't personally think this is worth any development effort.  If you really need the thumbnails that quickly, then I believe there is a standalone way to create these - and you could run multiple copies of this if you so wish.
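
Running several copies of a standalone extractor side by side could look roughly like the Python sketch below. The thread does not name the standalone tool, so `extract_one` is a hypothetical callable standing in for whatever launches it (for example a `subprocess.run` call); `extract_in_parallel` and the default of 2 workers are likewise assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def extract_in_parallel(videos: Iterable[str],
                        extract_one: Callable[[str], str],
                        workers: int = 2) -> Dict[str, str]:
    """Run `extract_one` on several files at once and map each video to its result."""
    videos = list(videos)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with `videos`.
        results = list(pool.map(extract_one, videos))
    return dict(zip(videos, results))
```

Threads are enough here because the real work would happen in an external process or in IO. This only pays off when the files sit on different disks or controllers, as in the edge case above; on a single drive the copies would just compete for the same IO.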

  • Agree 1

I guess we can never do it right:

For many years, the extraction process took a long time under high CPU load:

=> Users complained about long library scans and the CPU load (one of the reasons to introduce throttling)

For two years now, we have had the new lightweight extraction, which is 5 to 30 times faster without much CPU load

=> Now, users are concerned that the extraction process does not use all CPU processing resources...

  • Haha 4

  • 2 years later...
ryzen5000

I want to set mine to 100 MP/H and I have lots of CPU power to spare. Is it possible to use the GPU to extract thumbnails? I have two GPUs that are sitting there doing nothing; I could put them to work.


1 hour ago, ryzen5000 said:

I want to set mine to 100 MP/H and I have lots of CPU power to spare. Is it possible to use the GPU to extract thumbnails? I have two GPUs that are sitting there doing nothing; I could put them to work.

Did you even read this topic? I mean - it's not a long one. Maybe the last 3 posts at least - no? 🙂 

  • Haha 1

ryzen5000

I read it all twice, thanks. I do realize that some customers of EMBY are running on lower-power machines and even a Raspberry Pi. However, I know there are others like me who went all out on building a high-performance server, and I would like to see some additional settings for power users, for example:

1. We could select how many CPU cores we want to be used to do scanning and extractions. This could be an option for the larger machines that have lots of extra cores and ram.

Even a high-performance or high-load setting that could be used to set Emby to utilize more of the server's resources to complete tasks would be very useful for scanning pre-existing media on a new installation or adding lots of series and episodes at once. I download at 185 MB/s and it's not unusual for me to add 2 TB of series at a time.

2. The other thing I would benefit from is the ability to reach the dashboard settings from the EMBY Theater app for Windows. I often have to go into the web interface and press scan library to force it to scan faster, even when it's on an automatic setting.

3. We should be able to select more than one GPU for transcoding, for those of us who have servers with 2 or more GPUs.

4. We could use these GPUs to do extractions when not transcoding.

I know you said it's not worth the effort, but it would be really cool to see stuff like this for advanced and power users who plan on using Emby for life. We are asking for new features here because where else are we going to go to get them? EMBY is at the top of the list for media server software. I could not find better software unless I paid someone to design it for me specific to my needs, which is not really an option.

5. Library scans from the Android app. Useful because I am constantly adding new media and I want to scan from my remote.

6. The image of the TV that has the EMBY logo is an old-school flat screen; I mean the one shown with the chapter images in the Roku app and the Windows one. Anyway, the black TV needs an update. They don't make TVs that old anymore. I realize it was the original picture and you probably haven't gotten around to changing it yet, but that might be an easy one if it's just an image. I would like to see a more modern TV displayed, even an OLED or a nano or something with skinnier borders around the display.


2 minutes ago, ryzen5000 said:

I read it all twice, thanks.

Very good! 🙂 

2 minutes ago, ryzen5000 said:

1. We could select how many CPU cores we want to be used to do scanning and extractions. This could be an option for the larger machines that have lots of extra cores and ram.

For scanning please see my reply to your other post: https://emby.media/community/index.php?/topic/113885-chapter-images-not-displayed-in-tv-series/&do=findComment&comment=1201660

For extraction: This is a single-threaded operation. Decoding a keyframe is not a task that benefits from multiple cores. Neither does it benefit from GPU decoding. I think I have explained that already.
Faster IO is the only thing you could do to accelerate thumbnail extraction.
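
A back-of-the-envelope model makes the IO-bound argument concrete. The Python sketch below is an assumption for illustration only: it supposes that reading and decoding overlap so the slower of the two sets the total time, it generously lets CPU work split evenly across workers, and the figures in the usage note are made up rather than measured.

```python
def extraction_seconds(file_gb: float, io_mb_per_s: float,
                       cpu_seconds: float, cpu_workers: int = 1) -> float:
    """Rough estimate: reading and decoding overlap, so the slower one dominates."""
    io_seconds = file_gb * 1024 / io_mb_per_s
    return max(io_seconds, cpu_seconds / cpu_workers)
```

With a hypothetical 10 GB file read at 100 MB/s and 20 seconds of decoding, the estimate is about 102 seconds whether 1 or 4 workers are used; doubling the read speed to 200 MB/s is what brings it down, to about 51 seconds.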

6 minutes ago, ryzen5000 said:

2. The other thing I would benefit from is the ability to reach the dashboard settings from the EMBY Theater app for Windows. I often have to go into the web interface and press scan library to force it to scan faster, even when it's on an automatic setting.

@Luke will need to respond to that.

6 minutes ago, ryzen5000 said:

3. We should be able to select more than one GPU for transcoding, for those of us who have servers with 2 or more GPUs.

 

Yes, this has always been planned for and the architecture is there. You can also see that you can already select multiple GPUs for each codec in a priority order.
The only bit that is missing is some logic to determine under which conditions the other GPU should be chosen.
We're not far away from making that possible - technically.

9 minutes ago, ryzen5000 said:

4. We could use these GPUs to do extractions when not transcoding.

I can assure you that it wouldn't accelerate this. Not even a tiny bit.

It would accelerate the old way of extraction. But the old way with a GPU is still much slower than the new way (where CPU vs. GPU makes no difference).

  • Like 1
