AMD CPU speculation... and expert conjecture

Page 721
Status
Not open for further replies.

Reepca

Honorable
Dec 5, 2012
156
0
10,680


I had a long post written up about dependencies between colliding objects between-updates that likely completely misinterpreted your post, so I scrapped it (partially resurrected later).

So if I understand this correctly, physics isn't part of the main game thread, but rather runs separately - constantly updating the state of all objects based on time passed and interactions between objects in a loop. In this way, it is task parallel. However each object is... well... its own object! An individual thread can process the interactions between each object and everything else. In this way it is data-parallel.

But the part that bugs me about this is that for any time-chunk, things happen within that time-chunk. Taking this to an extreme example, suppose that one time chunk lasted 10 seconds. Two people are moving towards each other and, simply checking their paths against each other, one would think they would collide, except that 1 second into this time chunk person A gets hit by a train and flies far away. If we only looked at the interactions between person B and the world, person A and the world, and the train and the world (individually, in parallel) we would end up with persons A and B colliding *and* the train hitting person A. The way I imagine it, nothing is "simultaneous" in a time chunk - something happening within 10 ms of another thing clearly doesn't mean they happen at the same time. Collisions should be detected, sorted by time, processed in order, and every time a collision is processed all objects are moved forward to that point in time (and the corresponding change in position) and any objects that thought they were going to collide with either of those involved in that collision must re-check for the soonest collision... in my head, at least. Due to dependencies between the objects, don't the interactions have to be processed in order?
 
Again, you're stuck thinking linearly, as though you need to calculate the effects of an object into the future. When you do the calculations the objects haven't moved yet; they only move after you've done the math. With each update you refresh the state of every object in the world. So in your case, with objects A and B on a collision course and object A being intercepted by object C, you would calculate each of those out as time progressed, not from life to death.

What I described is how physics simulators work, the kind of thing that calculates what happens at a nuclear level, and they are massively parallel. Large scientific experiments are run on very powerful computers using gigantic arrays of processors, and they are run using this method. You don't have a single core running at 40 GHz trying to calculate the particle interactions in a plasma; instead you have dozens of cores working simultaneously, calculating the interactions, collisions, and kinematic effects of billions of ions. Resolution is measured in time: how small the gap is between calculations of what has happened, not what will happen.
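The "refresh every object from the previous state" scheme described above can be sketched in a few lines. This is a toy 1-D example I'm making up for illustration, not any particular engine's code; the point is that every object's new state reads only the frozen snapshot, so the per-object updates are independent and could be handed to parallel workers:

```python
def step(objects, dt):
    """Advance every object by dt, reading only the start-of-tick snapshot."""
    # Freeze the world as it was at the start of this tick.
    prev = [dict(o) for o in objects]
    new_world = []
    for o in prev:
        # Each iteration depends only on `prev`, never on partially-updated
        # state, so this loop is the data-parallel part.
        new_world.append({"x": o["x"] + o["v"] * dt, "v": o["v"]})
    return new_world

world = [{"x": 0.0, "v": 1.0}, {"x": 10.0, "v": -1.0}]
world = step(world, 0.5)   # both objects advanced half a second
```

Shrinking `dt` is the "resolution measured in time" point: the smaller the tick, the less can go wrong inside one.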

Hopefully you're not trying to argue that physics is serial when every scientific simulation demonstrates otherwise on a daily basis.

In fact, here, I'm going to let the folks who own hardware physics-processing technology speak about it.

http://www.nvidia.com/object/physx_faq.html

Why is a GPU good for physics processing?
The multithreaded PhysX engine was designed specifically for hardware acceleration in massively parallel environments. GPUs are the natural place to compute physics calculations because, like graphics, physics processing is driven by thousands of parallel computations. Today, NVIDIA's GPUs have as many as 480 cores, so they are well-suited to take advantage of PhysX software. NVIDIA is committed to making the gaming experience exciting, dynamic, and vivid. The combination of graphics and physics impacts the way a virtual world looks and behaves.

Out of all the possible computation problems you could argue are serial, physics is the last place you should have started.
 


Ok, the power consumption isn't as bad as I expected (though there is a noticeable jump as it's overclocked; remember, it is only a dual-core part). The thing is, though, your example is still the absolute worst case for AMD. At the end of the day, would you (or anyone) honestly recommend a Core 2 Duo or Core 2 Quad over an FX 6XXX or FX 8XXX part for modern-day workloads?

In a few very specific scenarios (like x87 instructions), AMD aren't where they should be; however, looking at a wider set of benchmarks (including a good range of games), I'm 100% certain that even an FX 4XXX part is a worthwhile improvement over anything from Core 2 (and OK, you *might* be able to outrun an FX 4 part when the Core 2 is overclocked, but then you just need to overclock the FX to redress the balance).

The tougher question is when you compare the FX (which are actually quite old) to the latest i3 from Intel. The new i3 is capable of running most things as well as or better than an FX 8XXX part, with the possible exception of a few best-case applications, and all the while using significantly less power. I think 18 months ago the FX represented very good value compared to Sandy / Ivy parts, but I can't deny AMD have dragged this gen out a lot longer than they should have.

Zen may not be a home run like A64 was; however, AMD would have to screw up really badly for Zen to leave them in any worse position than they're in now in that space :p
 

truegenius

Distinguished
BANNED


i guess, they know how to do this much better than anyone else :whistle:
 
Ok, the power consumption isn't as bad as I expected (though there is a noticeable jump as it's overclocked; remember, it is only a dual-core part). The thing is, though, your example is still the absolute worst case for AMD. At the end of the day, would you (or anyone) honestly recommend a Core 2 Duo or Core 2 Quad over an FX 6XXX or FX 8XXX part for modern-day workloads?

The real question is, if someone comes asking whether it's worth replacing a 9xxx-based C2Q with an FX-based rig, whether there is justification to say "yes".

That's AMD's problem: high-tier C2Qs aren't significantly worse than the FX lineup, so you can't justify a system rebuild around FX if you currently have a C2Q; it's not justified on price/performance.
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


Never say never.
 
Gamerk316, there is more to a computer than gaming. You guys talk like nobody does anything with a computer but build gaming rigs. Just the opposite is true. You represent a small minority of computer users. And even many gamers use computers for other things. Not to mention that many of these tests are sanitized. For example, you're not streaming with a C2Q, but you easily could with most of the FX parts.
 


Actually, a C2Q is plenty for streaming. Streaming is actually not that hard on the CPU, all things considered. Games are one of the most stressful things a normal user could be doing.
 

blackkstar

Honorable
Sep 30, 2012
468
0
10,780
First, this forum software is awful. I had a long, nice post written out and it ate my post because I was logged out. Thanks for loading the posting form with AJAX for some stupid reason: when you navigate back to the page, what you typed in the textbox gets deleted, when it would normally be saved there by the browser.

Anyways, Intel is no saint in not making large IPC gains. http://www.tomshardware.com/reviews/processor-architecture-benchmark,2974-14.html

It's difficult to tell which ones see performance improvements from new instructions and which don't. Conroe only supports up to SSE3 and SB supports up to AVX. But there's a less-than-10% IPC improvement in Photoshop CS5 from Conroe to SB over the span of ~4 years. AMD is not the only one having trouble with general IPC improvements. But Conroe doesn't clock nearly as high as SB does, and the same can be said for Conroe and Piledriver. So IPC by itself is a moot point, which we've been over countless times.

http://techcrunch.com/2008/07/29/overclock-world-record-q6600-24ghz-run-at-51ghz/
The Q6600 took liquid nitrogen to hit 5.1 GHz, and I run 5.1 GHz 24/7 on my Piledriver with just a simple XSPC kit. And I'm far from the only person to hit 5.1 GHz on an FX chip. Just for fun, the Piledriver world-record overclock is 72% higher than the Q6600 overclock.
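For what it's worth, that 72% figure is consistent with the commonly cited Piledriver LN2 record of roughly 8.79 GHz (an approximate figure, not from the linked article) against the Q6600's 5.1 GHz:

```python
q6600_record_ghz = 5.1   # LN2 record from the TechCrunch link above
fx_record_ghz = 8.79     # approximate Piledriver (FX-8350) LN2 record

# Relative gain of the FX record over the Q6600 record.
gain = (fx_record_ghz - q6600_record_ghz) / q6600_record_ghz
print(f"{gain:.0%}")     # prints "72%"
```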

And to be fair to Intel, a good Q6600 overclock was in the high-3 GHz range, and that's what new Intel chips now ship at stock. Now, if we were looking at chips that were not improving top frequency or even stock frequencies, then IPC would be something to consider. You know, sort of like IPC not changing very much between SB and Haswell when using the same instruction set, yet SB clocking to 5 GHz under water and Haswell not making it that far.
 


A 6000 series, sure. A 9000 series clocked in the mid/high 3GHz range can still perform. Yes, FX is better, but not "I'm going to spend $600 building a new PC" better.

And yes, I still have an OC'd QX9650 rig lying around (790i platform, DDR3 RAM); it still does the job. Sluggish around the edges, but still in the mid-tier i5 range based on the performance I'm getting out of it. I'd imagine a bit worse for memory-dominated tasks, but it's not like it's a Pentium 4 or something.

Point being, I laugh every time I see people doing CPU sidegrades, like going SB->Haswell. There's literally no point to it anymore, and hasn't been since the late C2Q era.
 

logainofhades

Titan
Moderator
Yeah, Sandy to Haswell is pretty pointless, unless you have a hardware failure and have little choice in the matter. Hence why I went Sandy to Ivy: I used my i5 2400 setup to get my file server back up and running, and made a trip to Microcenter for my 3570K and board.
 

etayorius

Honorable
Jan 17, 2013
331
1
10,780




The Core 2 Quad probably has something like a 600 MHz advantage in terms of IPC compared to Piledriver, so the Q6600 is probably a good margin ahead of the FX in gaming, but I can see the FX winning in heavily threaded apps.

How old is the Q6600? Exactly. AMD have not managed to match the IPC of a 7/8-year-old CPU.
 

logainofhades

Titan
Moderator


3.6 GHz was easy with a Q6600, as most C2Qs could easily hit a 400 MHz FSB. It was beyond 400 that many had trouble. My X3210, which was the Xeon equivalent of a Q6400, ran at 3.6 GHz @ 450 FSB. It was quite the golden CPU.
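The arithmetic behind those overclocks: a Core 2 chip's clock is simply FSB times a fixed multiplier (the 9x and 8x multipliers below are the stock values I believe those parts used; treat them as assumptions):

```python
def core2_clock_mhz(fsb_mhz, multiplier):
    # Core 2 era overclocking: core clock = front-side bus x multiplier,
    # so raising the FSB raises the core clock in lockstep.
    return fsb_mhz * multiplier

q6600 = core2_clock_mhz(400, 9)   # the "easy" 3.6 GHz at a 400 FSB
x3210 = core2_clock_mhz(450, 8)   # same 3.6 GHz, but needing the 450 FSB
```

This is why the lower-multiplier X3210 needed a "golden" 450 FSB to reach the same 3.6 GHz the Q6600 got at 400.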


 
Gamerk316 wrote

A 6000 series, sure. A 9000 series clocked in the mid/high 3GHz range can still perform. Yes, FX is better, but not "I'm going to spend $600 building a new PC" better.

And yes, I still have an OC'd QX9650 rig lying around (790i platform, DDR3 RAM); it still does the job. Sluggish around the edges, but still in the mid-tier i5 range based on the performance I'm getting out of it. I'd imagine a bit worse for memory-dominated tasks, but it's not like it's a Pentium 4 or something.

Point being, I laugh every time I see people doing CPU sidegrades, like going SB->Haswell. There's literally no point to it anymore, and hasn't been since the late C2Q era.

I see your point now. I agree with most of this. I don't understand the "upgrades" from Sandy Bridge to Ivy Bridge to Haswell. They are so small, they just aren't worth the money.
 

Reepca

Honorable
Dec 5, 2012
156
0
10,680


I'm not trying to "argue" anything, I'm just trying to reconcile what you're saying with what I know/think. I'm not sure I'm communicating my question clearly, though.

What I think when I hear "parallel physics" is having a thread that runs X other threads to update X other objects, then repeats after they are all updated. It measures the amount of time between each update through this loop and updates the objects as far as time has passed. "Updating" each object involves checking for collisions/interactions with the rest of the system and changing the state accordingly.

Explain to me how my understanding is wrong thus far, if it is.

However, I would also think that when updating each object, surely it needs to be done in order based on time? If there are 10 seconds between updates and you don't process interactions in order, then you end up with everything that happened in those 10 seconds happening at the same time, which doesn't happen in the real world; there is no Planck length of time, it is continuous.

The processing of object A's collision with object C must occur before the processing of object B's collision with object A. If it doesn't, then the simulation will say that object B collided with an object that could not have possibly been there. I can't see any way around it.
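What Reepca is describing is essentially event-driven collision processing: predict every pairwise collision, then resolve the soonest one first. A toy 1-D sketch (hypothetical example of mine, constant velocities, no real engine implied):

```python
import heapq

def collision_time(a, b):
    """Time at which particles a and b meet (1-D, constant velocity),
    or None if their paths never cross in the future."""
    dv = a["v"] - b["v"]
    if dv == 0:
        return None
    t = (b["x"] - a["x"]) / dv
    return t if t > 0 else None

def first_collision(particles):
    # Predict every pairwise collision, then pull the soonest one.
    # After resolving it, a real simulator would re-predict events for the
    # two objects involved; that is exactly the re-check Reepca describes.
    events = []
    for i in range(len(particles)):
        for j in range(i + 1, len(particles)):
            t = collision_time(particles[i], particles[j])
            if t is not None:
                heapq.heappush(events, (t, i, j))
    return heapq.heappop(events) if events else None

# A and B head toward each other (they would meet at t=5), but C (the
# "train") catches A at t=1, so the A-B event must not be resolved first.
world = [{"x": 0.0, "v": 1.0},    # person A
         {"x": 10.0, "v": -1.0},  # person B
         {"x": -2.0, "v": 3.0}]   # the train, C
```

Here `first_collision(world)` picks the t=1 train impact, not the (now invalid) t=5 meeting, which is the ordering constraint the post argues for.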
 

jdwii

Splendid


Not to mention streaming is starting to be done on the GPU. I use ShadowPlay, for instance, and I see almost zero difference in performance. Plus, if it can game, it can easily do Facebook or Word; I've heard this argument before. Moar cores doesn't mean moar multitasking.
 

etayorius

Honorable
Jan 17, 2013
331
1
10,780


Suddenly, AMD's failures are all our fault... AMD is in a sh*t position, and every time they answer Intel's products it seems as if they throw a diarrhea attack.

Bulldozer was released in 2011, and their next arch comes in 2016... that's 5 years of Faildozer. Why on earth did it take them 5 years to understand Bulldozer was that bad? They should have done something immediately after Bulldozer; they were in a much, much better position then than they are now.
 

8350rocks

Distinguished


To be fair to AMD.

You are discussing IPC in a software code base for a single specific game, written with archaic code that was not much more relevant 7/8 years ago and is essentially over 30 years old now. AMD has never supported it well, and likely never will, since the only software to run x87 code since Nehalem launched is the following:

SuperPi
Skyrim

AMD cares not for worrying about either of those...

However, here is a cinebench R11.5 video comparing the 2: http://www.youtube.com/watch?v=DiXkg3-RA8c

Both are overclocked there, the Q6600 a lot more than the AMD is, though the AMD still holds the GHz advantage in terms of clockspeed.

All said and done... it depends on what you are doing. Even then, each architecture will have strengths and weaknesses. So you really cannot say AMD cannot beat C2Qs unless you add the caveat that it is only while running 30-year-old code.

By comparison:

A Core i7-5960X is incapable of processing punch-card instructions from old computers... does that mean the punch-card machines are faster than a 5960X?

Be reasonable.
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


You cannot compare Intel to AMD the way you are doing. You are thinking linearly, but the world is nonlinear. It is much, much easier and cheaper to improve the IPC of a slower chip such as Piledriver by 10% than to improve the IPC of a faster chip such as Haswell by 10%.

Some here have mentioned that Haswell's IPC is not an improvement over Ivy Bridge. That is right if you limit measurements to x86 software. If you consider AVX software, then Haswell brings up to 70% IPC gains over Ivy:

http://www.pugetsystems.com/blog/2013/08/26/Haswell-Floating-Point-Performance-493/
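The 70% figure is plausible on paper: Haswell added two FMA units, doubling peak per-core floating-point throughput over Ivy Bridge's separate AVX add and multiply ports. The numbers below are the usual textbook peak figures, not measurements:

```python
# Peak single-precision FLOPs per core per cycle (textbook peak figures).
ivy_flops = 8 + 8          # one 256-bit AVX add port + one multiply port
haswell_flops = 2 * 8 * 2  # two 256-bit FMA ports, 2 FLOPs per FMA lane

speedup = haswell_flops / ivy_flops   # 2.0x peak; ~1.7x measured is credible
```

Real FMA-heavy code lands below the 2x peak, which is roughly where Puget's ~70% gain sits.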
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


The development of Zen/K12 started when Jim Keller returned to AMD in 2012. According to Feldman, it takes a three-to-four-year time frame and $300 million to $400 million in development costs to build an x86-based server chip based on a new microarchitecture.

http://www.xbitlabs.com/news/cpu/display/20130709232003_AMD_Amazon_Facebook_and_Google_Could_Develop_Their_Own_Chips.html

Thus Zen cannot be ready before 2016-2017.
 

jdwii

Splendid
Juan, I thought Jim Keller said he didn't have to start from scratch, and that it was just going to take the best of Bulldozer, Jaguar, and Phenom with an updated process? If so, that probably cuts down the time.

Always a question for you guys: do you think that if Samsung/GlobalFoundries had 14-16nm ready, AMD would be able to get Zen/K12 out the door faster?
 