Intel/AMD Microarchitecture 2011

I think this will be a good thread next year after the products are out. There is too much that has not been disclosed and will not be out until launch.
 
I just wanted to point this out about Bulldozer:
http://blogs.amd.com/work/2010/10/25/the-new-flex-fp/

Read the post, specifically this part:
"One of these new instruction set extensions, AVX, can handle 256-bit FP executions. Now, let’s be clear, there is no such thing as a 256-bit command. Single precision commands are 32-bit and double precision are 64-bit. With today’s standard 128-bit FPUs, you execute four single precision commands or two double precision commands in parallel per cycle. With AVX you can double that, executing eight 32-bit commands or four 64-bit commands per cycle – but only if your application supports AVX. If it doesn’t support AVX, then that flashy new 256-bit FPU only executes in 128-bit mode (half the throughput). That is, unless you have a Flex FP."
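For anyone who wants the lane math in that quote spelled out, here is a minimal sketch. The `lanes` helper is a made-up name for illustration, not anything from AMD or Intel; it just divides the register width by the element width.

```python
def lanes(register_bits: int, element_bits: int) -> int:
    """Number of elements of a given width that fit in one vector register."""
    if register_bits % element_bits != 0:
        raise ValueError("element width must divide register width")
    return register_bits // element_bits


SINGLE, DOUBLE = 32, 64  # bits per single / double precision value

for width in (128, 256):
    # Prints: register width, single-precision lanes, double-precision lanes
    print(width, lanes(width, SINGLE), lanes(width, DOUBLE))
# 128 4 2
# 256 8 4
```

So the "eight 32-bit or four 64-bit" figure for AVX is just 256/32 and 256/64.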

It seems like when JF mentions the "flashy new 256-bit FPU" he is referring to Sandy Bridge's FPU (I'm assuming), so does this mean that Bulldozer might have a higher IPC than SB in floating point workloads because of its Flex FP?
 
It seems like when JF mentions the "flashy new 256-bit FPU" he is referring to Sandy Bridge's FPU (I'm assuming), so does this mean that Bulldozer might have a higher IPC than SB in floating point workloads because of its Flex FP?
Very possible if they are targeting the server/workstation/HPC market a lot with this release.
 
Hmm, if you compare an 8-core SB to an 8-core (4-module) BD, then it seems they would have the same number of FP units - eight - in 128-bit mode. However for AVX-enabled apps, the same SB would have 8 256-bit-wide FP units vs. BD's 4. It's only when you're comparing the MCM 16-core versions of BD to the 8-core versions of SB that BD would have the numerical advantage in 128-bit floating point ops. And I would imagine Intel will have MCM versions of SB as well, as they do with Nehalem-EX.
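A rough sketch of that unit count, using only the assumptions in this post: one 256-bit FP unit per SB core (also usable for 128-bit code), and two 128-bit units per BD module that pair up into one 256-bit path. The class and function names are hypothetical, and none of this is a confirmed spec.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FpCount:
    units_128: int  # FP units available to 128-bit (SSE) code
    units_256: int  # FP paths available to 256-bit (AVX) code


def sandy_bridge(cores: int) -> FpCount:
    # Assumption from the post: one 256-bit FPU per core.
    return FpCount(units_128=cores, units_256=cores)


def bulldozer(modules: int) -> FpCount:
    # Assumption from the post: two 128-bit units per module,
    # ganged together into a single 256-bit path for AVX.
    return FpCount(units_128=2 * modules, units_256=modules)


for label, count in [
    ("8-core SB", sandy_bridge(8)),
    ("8-core (4-module) BD", bulldozer(4)),
    ("16-core (8-module) BD", bulldozer(8)),
]:
    print(f"{label}: {count.units_128} x 128-bit, {count.units_256} x 256-bit")
```

On these assumptions the 4-module BD ties the 8-core SB at 128-bit and has half the AVX paths, while the 16-core part doubles the 128-bit count and draws level on AVX paths.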

If AVX yields as much performance boost as Intel says, then I would imagine there will be upgrades or updates for most software available pretty soon after SB launches, if not before, esp. for consumer apps. And IIRC both Intel and AMD provide assistance to various software firms to help them with various issues such as updating to the latest SSE versions.

As for enthusiasts, IIRC most games are mainly integer and not FP.
 



Thanks John.

I read your latest blog on the new Flex FP with interest and it seems to have generated plenty of debate across the main global hardware sites in various threads.

The 4 part series you put together on Bulldozer 20 questions was also interesting.

People if you want to read them simply click JF's sig and read on ... get a good cup of coffee first though !!

 
Well, not much info, but no flames yet, so that's good. Here's my opinion. I'd say they are both doing what the other should, in a way. AMD is right on the edge. Phenom set them back a generation and put them in some serious debt. Just now they are getting out of debt and getting back in the race. Intel, on the other hand, is filthy rich. And yet AMD is the one taking the risk here, and Intel is going with small tweaks on a tried and true system. It's just ironic to me. AMD can't really go through another Phenom, yet they are taking the risk on a very new arch, and Intel can go through 5 Netbursts and still have more money than they know what to do with, and they are playing it safe. Looking at it alternatively, perhaps AMD needs a home run to fully get back in the race, rather than staying a gen or half a gen behind in the low end, and Intel is just on autopilot.

As far as Sandy Bridge, we don't know much more than what the leaks and the review at AT told us: about 10% more performance clock for clock, 20% more with stock clocks. BD we know a lot less about. JF is doing a good job filling us in with the arch designs and details, so that helps, and after reading through most of what they have to say, all I can think is "wow, talk about going off the beaten path." AMD really seems to be thinking outside of the box with BD. It hardly seems recognizable compared to what we know. Everything about it seems to be targeting efficiency. Seems like AMD's new thing, efficiency, as shown with Barts.

All I know is BD likely won't be like much of anything we recognize, and it's either going to be great, or it could be just flat out bad. Either way, it seems like it might be coming a full 2 quarters after Sandy Bridge, which is unfortunate. We shall see...
 
AMD can't really go through another Phenom, yet they are taking the risk on a very new arch, and Intel can go through 5 Netbursts and still have more money than they know what to do with, and they are playing it safe.
In the business world, if you are the underdog who doesn't have a pile of cash to sit on top of, you really do need to try new things, or else your business goes belly up down the road.

Anyways, I do hope AMD can pull this off. Esp. in the 2P area as I will be in the market for one mid/late next year 😛.
 


IMO, Intel learned a valuable lesson with Netburst about taking too much risk with design (i.e., putting all your proverbial eggs in one basket). However, the Netburst effort wasn't wasted - there are a lot of similar concepts in the Sandy Bridge design, which is advanced enough to take advantage of them. And Intel is by no means risk-averse - take Itanium and Larrabee. Please! 😀 But jokes aside, notice that SB uses a ring bus like in Larrabee to interconnect the CPU and GPU.
 


That's because it's too expensive and too much of a hassle to upgrade. At least for most of them. I kind of feel like SB is like 6xxx, and BD is like Kepler. SB is only the first part of the lineup, and isn't so much a full upgrade as just a nice boost. IB should complete it. Then BD is entirely new, and an entire lineup. Not sure it's fair to compare them, as BD also appears to be coming out a full quarter later. I'm also getting the strange feeling that SB is going to dominate gaming and single/lightly threaded apps, and BD is going to destroy everything at multi-threaded/video editing/server work. Should be interesting to see how things play out.
 
P4 was really not Netburst. It was after P4 that Intel pursued the GHz project. It was initially called P5, but Intel later changed the cores and began the multi-core series... what you and I call the "Core" series.
 
The Core 2 series was based on the Pentium M (a souped-up P3 with low power in mind) and not the P4.

The P4 line was Netburst ... it was a good design but suffered a few flaws - the Replay function caused the cache to get flushed too often (cache misses were terrible) and the L1 caches were too small. The pipes were also very long ... adding to the cache miss problem.


 



Also, the original "Core" or Yonah was based on the Pentium M or Dothan.
 



No. The OEMs did not get AMD CPUs because of Intel. Intel used a practice of 'blackballing' AMD and offered cheaper prices on Intel products if the OEM went Intel. That is one reason Intel was caught with their pants down at the FTC.
 
Actually, Intel got their heads handed to them by the Europeans, not the USA. Just like Microsoft has been forced to "play nice" with the other browsers out there regarding disclosure of all the APIs that were at one time restricted to Microsoft products only (such as Office and IE). We may think that the US Government is keeping the big boys playing nice, but the reality is that the FTC and several other govt watchdogs are more than happy to let sleeping dogs lie as long as their backs get scratched too. The Europeans, on the other hand, actually have a constituency that they have to answer to, and have decided that it may not be in the best interests of Europe to allow American companies to run roughshod over competing European companies.
 
Oh, yeah, right. I forgot. The US has NO regulation. If someone (the person with money) does something, they get a slap on the hand. The Europeans de-ball the guy, for heaven's sake, for crap like that. Reaganomics screwed up this country. That is ONE reason Intel is expensive... minus their economic practices on their products.
 


In the server world it will be 16 cores on BD vs. 8 cores on SB. That means in 128-bit workloads we will have 16 FPUs to their 8 FPUs.

We both share to get to AVX. We share 2 FPUs to get a 256-bit AVX execution path. They take 128-bit away from the integer execution path to get to 256-bit.

Based on the fact that most workloads are integer-based, I'd personally rather share FPU resources than take away from my integer execution in order to do 256-bit AVX. But, hey, that is my opinion.
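To put rough numbers on the 16-core BD vs. 8-core SB comparison above, here is a back-of-the-envelope sketch. It assumes a deliberately simplistic model of one vector op per FP unit per cycle, with the unit counts taken from this post (16 x 128-bit units on BD pairing into 8 x 256-bit AVX paths, 8 x 256-bit units on SB). It ignores clock speed, FMA, and multiple issue ports, and the function name is made up for illustration.

```python
def sp_elements_per_cycle(units: int, width_bits: int) -> int:
    """Single-precision (32-bit) elements processed per cycle,
    assuming each unit retires one full-width vector op per cycle."""
    return units * (width_bits // 32)


BD_CORES, SB_CORES = 16, 8
BD_MODULES = BD_CORES // 2  # two cores share one Flex FP per module

results = {
    "BD 128-bit": sp_elements_per_cycle(BD_CORES, 128),    # 16 units x 4 lanes
    "SB 128-bit": sp_elements_per_cycle(SB_CORES, 128),    # 8 units x 4 lanes
    "BD 256-bit": sp_elements_per_cycle(BD_MODULES, 256),  # 8 paths x 8 lanes
    "SB 256-bit": sp_elements_per_cycle(SB_CORES, 256),    # 8 units x 8 lanes
}

for label, value in results.items():
    print(f"{label}: {value} SP elements/cycle")
```

Under that model BD comes out at twice SB's per-cycle figure for 128-bit code and level with it for AVX code, which is the shape of the argument being made here; real results would depend on everything the model leaves out.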