
High CPU usage 4.2.0.40



jeffu231

Keep in mind that they use multiple processes whereas Emby only uses one, so it's not really fair to compare one of their processes to the Emby server process.

 

I was comparing the cumulative processes that make it up. It is definitely not a one-to-one comparison, but the overall resource load on the box is significantly less, which is my point. The overall load on the box for Emby under versions prior to 4.2 was also significantly less for me, which was also my point.


It is way more than a few percentage points. It is using 8-10 times what it used to use before I upgraded to 4.2. Multiply that by multiple streams or recordings and it adds up. If that is trivial to you, then we live in different worlds, and I am fine to cancel my subscription and move on rather than waste any more of your time and mine. Please excuse me for trying to report my observations to aid in improving the product. As a developer myself, I am always interested in my users' experiences with the software I write. CPU consumption may not be a big deal for folks who have money to waste on oversized hardware; I prefer to use that money on other things. There are still some of us who are concerned with writing efficient code that performs well. I started coding when CPU cycles and memory were expensive and you worked hard to manage the resources you used. Maybe that is a lost art these days.

 

I'm afraid you totally misunderstood the point.

 

The point was that you led us to believe there was a serious problem when there wasn't.

 

The current implementation will be completely replaced shortly, which is why there's no point in optimizing anything there.

Efficient code is in fact an important goal - but not for legacy code.


jeffu231


 

 

Yes, I clearly missed something. I did not intentionally lead you to believe anything. I presented what I observed, which was an increase in CPU usage between one version and another. I was more than willing to provide any information the team wanted to help determine what the problem was, or whether there was one at all. I was not hiding any information or trying to steer the evidence in any way to influence how you researched it. The difference between the versions is something I observed on my system, but if you expect that difference, then it may be okay. I don't believe in magic; any difference is the result of a change somewhere, intended or not.


What I meant is nothing like that. I meant the presentation (look at the title), as if it were a severe issue.

Also, it was well written and precise, which is why I wasn't as skeptical as in other cases and didn't look closely enough.

So let's forget about that. 

 

What I wanted to add is that we welcome users offering to help test and optimize new stuff, sometimes outside the official beta cycles.

If you like, I can put you on a list and we'll contact you when we have something that's really worth testing...

 

Best regards,

softworkz


sfatula

Not to prolong the discussion, but here's a point of reference. I am on Ubuntu 18.04.2 server, running 4.2.1.0. I am streaming live TV right now from an HDHomeRun QUATRO, which I believe is identical to what was asked about. I see an average of 40% of one core for the Emby server process. This is on a 720p channel; if I switch to a 1080i channel, it goes to around 80%. Of course, that's divided by 12 cores in my case. An SD channel goes to ~20%. Doesn't sound too different, but those are small numbers to me. I don't have a comparison point like the OP says he has. It is certainly acceptable, and I wouldn't call it high. Perhaps higher than it was.
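
If anyone wants to reproduce those numbers, here's a rough Python/psutil sketch that does the same math: percent of one core for the Emby process, then divided by the core count for the whole-machine share. The "EmbyServer" name match is only a guess for a typical install - adjust it to whatever your process is called.

```python
# Rough sketch using psutil (pip install psutil); the process name is an assumption.
import psutil

def emby_cpu_share(name_hint="EmbyServer", window=5.0):
    """Print CPU% of one core and of the whole machine for matching processes."""
    procs = [p for p in psutil.process_iter(["name"])
             if name_hint.lower() in (p.info["name"] or "").lower()]
    if not procs:
        print("no matching process found")
        return
    cores = psutil.cpu_count(logical=True) or 1
    for p in procs:
        p.cpu_percent(None)                 # prime the per-process counter
    psutil.cpu_percent(interval=window)     # block for the sample window
    for p in procs:
        pct = p.cpu_percent(None)           # percent of a single core; can exceed 100
        print(f"pid {p.pid}: {pct:.1f}% of one core, {pct / cores:.1f}% of the machine")

if __name__ == "__main__":
    emby_cpu_share()
```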

 

We always wish for better, of course. It sounds like some new stuff is coming, though.

 

Is there a roadmap anywhere for 4.3, assuming that is the next major version?


jad3675

So, not to pile on, but I'm seeing similar behavior with the latest stable when I have OTA HDHomeRun recordings going on. For instance, today (10/17 @ 8pm) 'Young Sheldon' started to record, then 'Perfect Harmony', then 'The Good Place', and then 'A Million Little Things'.

 

I didn't see this behavior earlier in the summer when I was testing the betas, but I also wasn't doing much in the way of HDHomeRun recording then, if any at all. I don't see this when I'm watching live HDHomeRun-based TV, only when recording. Actually, I take that back - I do see high-ish CPU utilization if I'm watching live HDHomeRun TV. I don't see high CPU if I'm watching an m3u-based stream.

 

Here's my CPU utilization and the process detail at that time.

 

[Screenshot: Emby process detail]

[Screenshot: CPU utilization graph]



 

Hi there, please attach the Emby server log. Thanks.


jad3675

How does sar look for the same time period? Graphs can be misleading when there's no legend or details.

The top graph is the CPU utilization for the Emby process itself, which is shown above the graph.

 

And yes, I cut off the legend of the second graph by mistake - nice catch.

Here's the full legend, covering the two hours that my Emby instance was recording the shows. I have comskip running after each show to mark the commercials.

[Screenshot: CPU load graph with full legend]


jad3675

Here's the CPU utilization of the ffmpeg commercial-marking run for the 'The Good Place' episode. As you can see, it's not consuming much in the way of CPU. I only allow one comskip process to run at a time.

[Screenshot: CPU utilization during 'The Good Place' commercial marking]


jad3675

I'll also point out that during this high CPU utilization time period, Emby is extremely slow at the client (Fire TV and browser), and it takes more than a minute for me to SSH into the device.

 

DVR recording and transcoding go to a local disk - it's an NVMe drive - so not exactly slow.

 

John


Q-Droid

You're showing significant I/O wait, so even though you're using a fast SSD there seems to be a bottleneck somewhere. Are all the relevant paths set to the SSD? Your CPU is not busy; it's waiting.
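
If you want to watch that split directly over time, a small Python/psutil sketch along these lines will do it (Linux only; the iowait field comes from psutil's cpu_times_percent):

```python
# Minimal sketch: sample the system-wide CPU time split; Linux exposes 'iowait'.
import psutil

def show_cpu_split(samples=10, interval=2.0):
    for _ in range(samples):
        t = psutil.cpu_times_percent(interval=interval)
        busy = t.user + t.system
        iowait = getattr(t, "iowait", 0.0)  # only present on Linux
        print(f"busy {busy:5.1f}%   iowait {iowait:5.1f}%   idle {t.idle:5.1f}%")

if __name__ == "__main__":
    show_cpu_split()
```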


jad3675


 

I noticed that too. Everything is on the same drive.

 

I don't think it is hardware though.

 

Here's an IPTV recording I did this morning, starting just after 8am; 9am is when it cut the commercials. I continued to watch IPTV streams after the recording was over, too.

You'll notice the I/O wait is minimal compared to the recording done via the HDHomeRun (The Good Place) in the prior posts. As near as I can tell, HDHomeRun recordings are not done via ffmpeg the way the IPTV ones are, though.

 

A thought - I cut the commercials with comskip, but I use the Emby version of ffmpeg to convert the .ts into a .mkv - that's probably where the spike below comes from. I'll try a recording using the non-Emby version of ffmpeg.

 

[Screenshot: CPU and I/O during the IPTV recording]
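
To make that Emby-vs-system ffmpeg comparison repeatable, I'll probably script it roughly like this. The bundled ffmpeg path and the input file name are just placeholders for my setup, and the remux is a plain stream copy:

```python
import subprocess
import time

# Both paths are assumptions for my box; the Emby bundled ffmpeg location in
# particular is a guess - adjust to wherever your install keeps it.
BINARIES = {
    "system": "/usr/bin/ffmpeg",
    "emby": "/opt/emby-server/bin/ffmpeg",
}
SOURCE = "recording.ts"  # example input file

def remux(binary, src, dst):
    """Stream-copy remux of a .ts into a .mkv and return the elapsed seconds."""
    start = time.monotonic()
    subprocess.run(
        [binary, "-y", "-i", src, "-map", "0", "-c", "copy", dst],
        check=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
    )
    return time.monotonic() - start

if __name__ == "__main__":
    for label, binary in BINARIES.items():
        elapsed = remux(binary, SOURCE, "out_%s.mkv" % label)
        print(f"{label} ffmpeg: {elapsed:.1f}s")
```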


Q-Droid

This latest graph looks normal. The comskip spike is mostly %user as expected.

 

The HDHR being networked could be contributing to the I/O waits, though I couldn't say to what extent. You could check to make sure that connection is running at full speed and full duplex.
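
If you'd rather not dig through ethtool output, psutil can report the link state too; a quick sketch (interface names will obviously differ on your box):

```python
# Sketch: report link speed and duplex per interface via psutil (Linux).
import psutil

def show_links():
    names = {psutil.NIC_DUPLEX_FULL: "full",
             psutil.NIC_DUPLEX_HALF: "half",
             psutil.NIC_DUPLEX_UNKNOWN: "unknown"}
    for name, st in psutil.net_if_stats().items():
        print(f"{name:10s} up={st.isup} speed={st.speed}Mb/s duplex={names[st.duplex]}")

if __name__ == "__main__":
    show_links()
```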


jad3675


 

The HDHR is on a switch downstream from where the Emby server is plugged in. It's at 100/full and the port has zero errors on it.

 

I'm running a manual ffmpeg on a .ts file right now - the same command I use for converting .ts to .mkv - while watching an OTA channel via Emby, and iowait is averaging 6%.

 

Very odd.

 

John


jad3675

Run sar -d for yesterday. I'd like to see that disk activity during the high wait time frame.

 

This is Ubuntu, so sysstat isn't enabled by default.

 

All I have is disk latency from Datadog.

[Screenshot: Datadog disk latency graph]
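
Since sysstat isn't there, the closest I could get to sar -d would be polling /proc/diskstats myself. Something like this rough sketch should print per-device await over each interval (the device names are guesses for my drives - swap in whatever lsblk shows):

```python
import time

# Field positions follow the kernel's /proc/diskstats layout:
#  [3] reads completed, [6] ms spent reading, [7] writes completed, [10] ms spent writing.
def read_diskstats():
    stats = {}
    with open("/proc/diskstats") as f:
        for line in f:
            p = line.split()
            ios = int(p[3]) + int(p[7])     # completed reads + writes
            ms = int(p[6]) + int(p[10])     # time spent on reads + writes (ms)
            stats[p[2]] = (ios, ms)
    return stats

def watch(devices=("sda", "nvme0n1"), interval=5.0):
    """Print average await (ms per completed I/O) per device for each interval."""
    prev = read_diskstats()
    while True:
        time.sleep(interval)
        cur = read_diskstats()
        for dev in devices:
            if dev not in cur or dev not in prev:
                continue
            d_ios = cur[dev][0] - prev[dev][0]
            d_ms = cur[dev][1] - prev[dev][1]
            await_ms = d_ms / d_ios if d_ios else 0.0
            print(f"{dev}: {d_ios} I/Os, avg await {await_ms:.2f} ms")
        prev = cur

if __name__ == "__main__":
    watch()
```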


Q-Droid

Whoa, are those values in ms? If so then something is wrong with /dev/sda or the controller or the drivers.

 

What fs are you using?


Q-Droid

Also, why are sda1 and sda3 nearly identical?

 

Edit: my bad, it's sda and sda3, so never mind.


jad3675

Whoa, are those values in ms? If so then something is wrong with /dev/sda or the controller or the drivers.

 

What fs are you using?

That's await time, so it's a combination of wait time and service time, i.e. the end-to-end time through the I/O scheduler, and it is in ms.

 

However... I am seeing similar numbers on a different server. The only commonality between them that would affect filesystem operations is fuser and rclone.

 

Looking back on the past five months of disk latency data, there's not much there until recently, when the fall TV season started and DVR'ing picked back up. The scale is a bit off, but the highest peak prior to this is pretty low at 109. Even in April/May, when it was still DVR'ing, it wasn't too bad.

 

I started on the 4.2 beta train at the end of May.

 

So maybe the NVMe drive could be going bad... but I doubt it.

 

[Screenshot: Datadog disk latency over the past five months]


Q-Droid

A 3-digit avg await would be considered bad even for HDD storage. A 4-digit await is horrible and you should never see those stats with NVMe. This is an OS/storage problem, not application. Keep in mind NVMe storage can handle 10s to 100s of kIOPS and crazy high throughput. Recording and transcoding are a tiny trickle. Latency should barely register for NVMe storage, with sub-ms values.


sfatula


 

Yeah, I have an HDD for Emby recordings and it's never recorded results that bad. Could be a driver thing. When I got my Samsung 970 Evo Plus NVMe, there were issues requiring firmware updates on my former machine (a NUC); otherwise the performance was spotty and crippled.

 

Emby runs fine for HDHomeRun recordings and never slows my machine in any detectable way. I can be running HandBrake at the same time it's recording, doing a backup, whatever.


jad3675

A 3-digit avg await would be considered bad even for HDD storage. A 4-digit await is horrible and you should never see those stats with NVMe. This is an OS/storage problem, not application. Keep in mind NVMe storage can handle 10s to 100s of kIOPS and crazy high throughput. Recording and transcoding are a tiny trickle. Latency should barely register for NVMe storage, with sub-ms values.

 

I'm well aware of that - the system would be unusable with those numbers, and if it were a bad driver/config I'd expect to see other issues on the system, but I don't.

I'm seeing a similar iowait/CPU/slow-system issue on two different systems, both with different storage subsystems (one is CentOS 7 and the other is Ubuntu 18.04; one is NVMe and the other is SSD), and I'm only seeing it when Emby is recording or, to a lesser extent, watching live TV via the HDHR. This issue wasn't apparent when watching or recording via an IPTV stream. The only commonality between the two is fuser and rclone (well, besides Emby).

 

I'll throw another drive in the server and move the transcoding location there and see if the problem persists.

 

I appreciate the help.

 

John


Q-Droid

Perhaps other I/O activity, like a synthetic benchmark tool, could give you repeatable numbers and help zero in on the issue - a way to either pinpoint or eliminate Emby/ffmpeg as the culprit.

 

As for unusable, you sort of hinted at that before when you mentioned the system became unresponsive while this I/O activity and latency were going on. There's definitely a bottleneck, but I don't know enough about fuse/fuser or rclone to offer any advice.
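
Even something as crude as timed, fsync'd writes from a short script would give you a repeatable number without Emby or ffmpeg in the picture - a sketch along these lines, run from the filesystem under test:

```python
# Crude synthetic write-latency test: timed 1 MiB fsync'd writes to a target path.
import os
import statistics
import time

def write_latency(path="iotest.bin", block=1024 * 1024, count=200):
    """Time `count` fsync'd writes of `block` bytes and print latency stats in ms."""
    buf = os.urandom(block)
    lat = []
    with open(path, "wb") as f:
        for _ in range(count):
            start = time.monotonic()
            f.write(buf)
            f.flush()
            os.fsync(f.fileno())
            lat.append((time.monotonic() - start) * 1000.0)
    os.remove(path)
    lat.sort()
    print(f"median {statistics.median(lat):.2f} ms, "
          f"p99 {lat[int(len(lat) * 0.99) - 1]:.2f} ms, "
          f"max {lat[-1]:.2f} ms")

if __name__ == "__main__":
    # Run this from (or point `path` at) the filesystem you want to exercise.
    write_latency()
```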

