Server GPU acceleration or CPU processing?


Guest asrequested

Waldonnis

Fantastic information! You seem to be confirming my suspicions that Intel is making what I consider to be the better architecture. I'd love to hear what you discover about the SL-X and KL-X CPUs. I'm pretty much sold on them, but you have such good information. I would do research on this myself, but I likely wouldn't understand a good portion of what I'd read.

 

Hehehe, it can be confusing.  So much has changed since my days in the hardware industry, but the science and theory behind it all really hasn't (thankfully).  After more reviewing of Threadripper and Ryzen in general, I'm actually rather impressed.  AMD has aggressively gone after a rather specific and profitable market with Ryzen (budget and midrange gamers) while also not being afraid to dip a whole leg in the workstation/prosumer/high-end-gaming market pool with TR.  They also did so in a way that keeps manufacturing costs lower (esp with TR) allowing for very competitive price points.

 

Intel's roadmap so far has been, frankly, exactly what I expected it to be...which is not a good thing.  I think they still hold the crown for IPC, but there's only so much they can do with their strategy and only so many cores they can fit on a die with their current processes before things like energy consumption and heat dissipation become a serious issue they'll need to overcome (to be fair, everyone has to deal with it, but Intel's position in the market puts more pressure on them).  Even worse for them, they seem to be still under the illusion that they're going to continue dominating the market with their pricing strategy on boxed desktop CPUs.  Granted, most of their processor revenue undoubtedly comes from OEM and laptop-type CPU sales (which has usually been a somewhat competitive segment) and sourcing/fabs costs a ton of money with their volume requirements, but geez.  For most tasks, KL only had a single digit performance gain over SL, yet they felt the need to increase prices anyway...not good for us or the industry or even Intel in the long term.  If AMD ends up taking even a decent percentage of market share away from them in even one of their market segments, then Intel's going to have to do something different than they did when they "owned" everything (read: most of the past decade)...and that's disruptive.

 

Having competition and alternatives is good for us consumers (and the industry as a whole) and I hope Intel doesn't fail to respond to it or they may find themselves in the same shoes AMD wore for the past bunch of years.  I'd hate for the crown to swap to AMD to the extent that we end up with another effective monopoly situation for a decade.  I'd rather see both companies produce solid offerings yearly that are actual upgrades at price/performance ratios that don't suck compared to prior generations.


Guest asrequested

So here's a question. Given that the TR is basically two CPUs, and we've found out that ffmpeg doesn't utilize more than one CPU, will the server actually use all of the TR or just half? It looks like the Intel will work, with the architecture being the way it is. The TR 1920X is $200 more than the Intel 7820X, with what looks to be very little gain. In fact, there's a loss in single/quad core performance. I'm wondering if the TR will actually be worse when working with ffmpeg. It's also 40W more power hungry. I really don't see any advantage there.


PenkethBoy

IIRC it's not ffmpeg that does the multithreading, it's the libraries it uses, so the key question is whether they get affected by the "two cpu" Zen setup. Also remember that anybody with a dual Xeon rig would potentially see issues if there are any.

 

Interestingly, it's almost impossible to find any reports of TR performance with ffmpeg. Handbrake, yes, there are several of those, and it does use all the cores.

 

It might be worth asking in the ffmpeg forums if there are any issues

 

A thing to consider is that as Zen is so new, things will change in the near and medium term - see the number of BIOS updates the m/b manufacturers are releasing, which have fixed some bugs and added more features - also remember AMD will refresh ZEN to ZEN2 next year.

 

Also the performance in a particular app by TR/Ryzen will improve as optimisations (which Intel has a huge lead on) are found and applied - so what works (or not) today could be quite different in spring next year.

 

Ha the fun of deciding when to buy new hardware

 

Ps where did you get the quad core performance being an issue - not seen reports of that?


Guest asrequested

Yeah, lots to consider. But if I wait for what comes next, I'll never buy anything ever again lol. I'm basically looking for an overall significant improvement over what I have now. This is going in my stable server, so I don't want to be too experimental. I may try that next year in another machine :D

 

So far as I can tell (yes I know there's all kinds of test results. I don't want to get mired), the Intel 7820X is about my best option, at the moment. Weighing the costs of motherboards, CPUs, usage etc. 

 

The single core/quad core results I was just getting from that site you want me to avoid  :P

 

[Attached image: 59d18c227bc5f_Snapshot_242.jpg]


Waldonnis

Actually, there was a thread about NUMA and ffmpeg on these forums before (CLICKY) where ffmpeg seemed capped at 16 threads, which was equal to one of the two CPUs' worth of cores, and it was only loading one CPU. Granted, that was in a VM situation, but the same principles apply. I agree that asking an actual TR owner is the best way to go, but finding them hasn't been easy, and since UMA is the default mode, someone would have to explicitly want to test it in NUMA mode. If I knew a gamer with one, I'd prod them, but even that would be an uphill fight since nearly every review says little beyond "NUMA = you only use half of your costly CPU", which really isn't true (but is a huge deterrent for many to even think about trying it out). There's no denying that UMA presentation of 2+ distinct CPUs has a performance impact. The question is whether it would be a) noticeable for most people, and b) worth the trade-off for a given workload. From the few benchmarks I've found today that tested NUMA vs. UMA, the increased latency has a very small impact on real-world application performance, which is a testament to AMD's interconnect (Infinity Fabric). I haven't seen enough of them to judge whether the impact would be different with other types of workloads (they love games for testing or use Handbrake with simple presets and easy-to-encode sources).
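For anyone who wants to check how the OS is actually presenting the CPU before testing, here is a minimal sketch in Python. It assumes a Linux host that exposes the standard sysfs NUMA directory; the function names are made up for illustration and nothing here comes from ffmpeg or the server software. In UMA mode you would expect a single node listing every logical CPU, and in NUMA mode one node per die.

```python
import glob
import os
import re


def parse_cpulist(text):
    """Expand a kernel cpulist string such as '0-3,8-11' into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # memory-only nodes can have an empty cpulist
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def numa_layout(sysfs_root="/sys/devices/system/node"):
    """Return {node_id: [cpu, ...]} as presented by the OS (empty if unavailable)."""
    layout = {}
    for path in sorted(glob.glob(os.path.join(sysfs_root, "node[0-9]*"))):
        node_id = int(re.search(r"node(\d+)$", path).group(1))
        with open(os.path.join(path, "cpulist")) as fh:
            layout[node_id] = parse_cpulist(fh.read())
    return layout


if __name__ == "__main__":
    for node, cpus in numa_layout().items():
        print(f"node {node}: {len(cpus)} logical CPUs -> {cpus}")
```

If an encode only ever loads the CPUs listed under one node, that would point to the same kind of single-node behaviour described in the earlier thread.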

 

Honestly, if the benchmarks available have all been run in UMA mode and they show TR to be the superior performance choice for a given budget, there's no reason not to buy it.  On reflection, my concern about latency is probably a bit pedantic compared to most people's, since NUMA would yield even better performance if the app can deal with it and allocate threads appropriately (x265 can, for example).  Call it a side-effect of my work history...that seemingly-small bit of latency mattered for our customers' applications and it just rubbed off on me  :P

 

A thing to consider is that as Zen is so new, things will change in the near and medium term - see the number of BIOS updates the m/b manufacturers are releasing, which have fixed some bugs and added more features - also remember AMD will refresh ZEN to ZEN2 next year.

 

Also the performance in a particular app by TR/Ryzen will improve as optimisations (which Intel has a huge lead on) are found and applied - so what works (or not) today could be quite different in spring next year.

This so much.  I meant to emphasise this earlier, but got distracted.  Microcode updates and compiler optimisation are only going to improve Ryzen/TR performance.  How much is entirely speculative, but it will get better.  One could say the same about the AVX-512 introduction on the Intel side as well.  Really, any new architecture goes through these growing pains.  The Zen refresh and generation bump should prove interesting, though.  Let's hope AMD doesn't pull a Kaby Lake on us  :P

 

There are other pros and cons for each that may/may not factor into any purchasing decision as well  (PCI-E lanes and thermal considerations, notably) that I didn't get into before, but are worth thinking about.  Early impressions of SL-X talked about some troubling heat generation, but those could've just been engineering samples or weren't using the final production cooler design.  That's something I'd definitely look into if you're leaning that way.


Waldonnis

One other thing on reflection.  I think I/we may have gotten a little too focused on TR/NUMA in general when it may not even be necessary beyond just talking about neat hardware stuff.  If the performance that TR or even SL-X offer is overkill, there's not much use in considering it unless your budget is generous enough to warrant buying additional performance you may not end up using regularly (like me owning a sports car in an area where the highest speed limit is 35mph and it's heavily patrolled...I never leave first gear!).  If you don't need it, better to save the extra cash by buying for your anticipated needs over the next year or two max and just earmarking the savings for refreshing your hardware more frequently if keeping up with new features is important.  On the other hand, performance gains have been pretty stagnant over the past few years, so a decent investment now may last you a while if you just need the raw processing power.

 

Bah, reminds me of why I've been perpetually wanting to upgrade for four years and haven't  :P   It used to be that you were always waiting thinking you'd be missing out on some huge performance gain about to drop.  Now it's more about core count, and even that was stagnant in the consumer market until AMD dropped the Ryzen bomb.


Waldonnis

Interesting you mention how many lanes. I read somewhere that TR has up to 64,  while the intel has only 28

 

It should still differ by model on the Intel side, but I haven't looked.  Think the i9 SKUs had more than the i7 did (44 vs 28, I think?  It's been a while since I glanced at it), but neither had more than TR offers.  I personally wouldn't have a use case for TR's lane count right now, but I can see that being a very large benefit for some people, and I can see myself being more interested in a higher lane count in the nearish future.  Multi-GPU (esp 3+) rigs would be the obvious beneficiaries, but there are other configurations that would love more available lanes.


PenkethBoy

TR has 128 lanes of which 64 are available to the "User"

 

4 are used for the chipset on the motherboard - i.e. to connect USB, SATA, NICs, etc. - same as Intel

 

The other 60 are available for M.2, PCI-e slots, etc. Most TR boards have 4-5 PCI-e slots, with at least two running at full x16 and a couple at x8. With Intel having 24 available you get fewer options, and it depends a lot on which slots you use and what you plug in: one x16 and one x8 and you are at your limit, which is why Intel boards usually have an x8/x8/x4 config, and why sometimes, if you use an M.2 slot, some other features on the board are turned off. It comes down to the choices of the m/b manufacturer.

 

Also, most TR boards have 2-3 M.2 slots, which means you can have M.2 RAID that is not limited by the 4 lanes of the chipset (it bypasses the chipset and goes direct to the CPU). On Intel boards, one Samsung 960 is at or about the speed of the 4 PCI-e lanes; in theory you could have three 960s running at full chat on TR in RAID - still waiting to see a benchmark of that!!!!

 

All these extra options are why TR boards are expensive at the moment - this will come down over time, but they won't be cheap as chips anytime soon :)

 

So future proofing with TR has its advantages over just the cpu speed :)
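The lane budgeting described in this post can be sketched as simple arithmetic. This is only an illustration in Python using the figures quoted above (64 lanes minus 4 for the chipset on TR, 24 available on the Intel platform); the device names and the two example builds are hypothetical, and real boards may split or share lanes differently.

```python
def lanes_remaining(usable_lanes, devices):
    """Return lanes left after assigning each device its link width.

    A negative result means the configuration is over budget, so the board
    would have to run some slots at a narrower width or disable something.
    """
    return usable_lanes - sum(devices.values())


# Figures quoted in the post above: 64 lanes on TR minus 4 for the chipset,
# and 24 available on the Intel platform being compared.
TR_USABLE = 64 - 4
INTEL_USABLE = 24

# Hypothetical builds, purely for illustration.
tr_build = {"gpu0": 16, "gpu1": 16, "hba": 8, "m2_a": 4, "m2_b": 4, "m2_c": 4}
intel_build = {"gpu0": 16, "hba": 8}

print("TR lanes left:", lanes_remaining(TR_USABLE, tr_build))
print("Intel lanes left:", lanes_remaining(INTEL_USABLE, intel_build))
print("Intel + one x4 M.2:", lanes_remaining(INTEL_USABLE, {**intel_build, "m2_a": 4}))
```

With these example builds the TR configuration still has lanes to spare, the Intel one is exactly at its limit, and adding a single x4 M.2 drive pushes it over, which matches the x8/x8/x4 compromise mentioned above.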


Jdiesel

I have a 1270v2 and have successfully achieved seven 30 Mbps 1080p to 5 Mbps 720p transcodes in a testing environment.
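For reference, a transcode of that shape could be reproduced by hand with something like the Python sketch below. This is an assumption about what such a test might look like, not the command the server actually runs: the choice of libx264, the file names and the helper names are all hypothetical. It only builds and prints the ffmpeg command lines; `launch_all` is provided for anyone who wants to start them in parallel.

```python
import shlex
import subprocess


def transcode_command(source, output, height=720, video_bitrate="5M"):
    """Build an ffmpeg argv list that scales to `height` and targets a bitrate.

    Assumption: software encoding with libx264 and the audio copied untouched.
    """
    return [
        "ffmpeg", "-i", source,
        "-vf", f"scale=-2:{height}",  # -2 keeps the aspect ratio with an even width
        "-c:v", "libx264", "-b:v", video_bitrate,
        "-c:a", "copy",
        output,
    ]


def launch_all(commands):
    """Start every command at once and return the Popen handles (not called here)."""
    return [subprocess.Popen(cmd) for cmd in commands]


if __name__ == "__main__":
    # Hypothetical file names for seven parallel jobs.
    jobs = [transcode_command(f"source_{i}.mkv", f"out_{i}.mkv") for i in range(7)]
    for cmd in jobs:
        print(shlex.join(cmd))
```

Seven such jobs mean reading roughly 7 x 30 = 210 Mbps of source video and writing 7 x 5 = 35 Mbps, so the load in a test like this is almost entirely CPU rather than disk or network.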

Link to comment
Share on other sites

I have a 1270v2 and have successfully achieved 7 30Mbps 1080p to 5 Mbps 720p transcodes in a testing environment.  

That's what I expected. Thanks!


PenkethBoy

@Waldonnis

 

I had forgotten that The AMD Ryzen Master software allows you (amongst other things) to control the NUMA/UMA features of a TR cpu - by using Game Mode

 

This gives you the option to turn off one Ryzen CPU when playing games - Legacy Mode - as games don't need lots of cores

 

But it also allows you to switch between NUMA and UMA mode, which gets you a reduction in latency and changes how the memory is shared by the CPUs

 

So as and when somebody gets a TR, it should be relatively easy to test if there is any noticeable latency - my prediction is that you won't notice, but we will see


Guest asrequested

Sold! 

 

Now I have to spend more money, lol


PenkethBoy

One for your Gigabyte addition  :P

 

 

there is a bit of linux stuff as well plus a view of the bios etc


Waldonnis

Yep.  That's one nice benefit of TR that isn't seen in traditional (read: multi-socket) NUMA configurations - control over how they're presented to the OS without resorting to virtualisation.  I agree that the latency probably won't be a big deal at all.  Even gaming benchmarks that actually did try NUMA with games that could handle it were only seeing low-single-digit fps difference.  Since the "sockets" in this case are electrically "closer" (and joined by Infinity Fabric), latency won't be nearly as bad as it would be with a two-socket motherboard configuration.

 

Things are getting crazy! lol

 

Woo!  Keep us updated when you get things set up.  If you are up for it, I could come up with some x265 tests for checking out NUMA vs UMA for grins (since x265 has --pools).
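As a rough idea of what such a test set might look like, here is a small Python sketch that builds x265 command lines with different `--pools` layouts for a two-node system. The pool strings follow the x265 documentation ("+" for a full pool on a node, "-" for none, "*" for all nodes); the layout names, file names and preset are placeholders chosen for illustration, not an agreed test plan.

```python
import shlex

# Thread-pool layouts for a two-node system, in x265 --pools syntax.
POOL_LAYOUTS = {
    "all_nodes": "*",      # default behaviour: pools on every NUMA node
    "node0_only": "+,-",   # full pool on node 0, no threads on node 1
    "node1_only": "-,+",   # no threads on node 0, full pool on node 1
}


def x265_command(source, output, layout, preset="medium"):
    """Build an x265 argv list that pins worker pools according to `layout`."""
    pools = POOL_LAYOUTS[layout]
    return [
        "x265", "--input", source,
        "--preset", preset,
        "--pools", pools,
        "--output", output,
    ]


if __name__ == "__main__":
    # Hypothetical source clip; run each command in both UMA and NUMA mode
    # and compare the encode speed that x265 reports.
    for name in POOL_LAYOUTS:
        print(shlex.join(x265_command("clip.y4m", f"{name}.hevc", name)))
```

Comparing the all-nodes run against the single-node runs in each memory mode should show whether crossing dies costs anything measurable for this kind of workload.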


Guest asrequested

It'll be about a month, but sure, I'll be looking at testing it :D

 

But switching between UMA and NUMA requires a reboot. This will be my server, so I won't want to do that very much.

