Die-Shot: 8-Core Bulldozer

From THG News: Die-Shot: Next-Gen 8-Core AMD Orochi Bulldozer

 

jf-amd

Distinguished
Mar 3, 2010
238
0
18,690
It is a real die, not a fake. We photoshopped parts of the die to keep the competition from counting pixels and doing the math on things like cache sizes, features and other things. It is a pretty common tactic in the semiconductor business.

We were not releasing the die shot, that comes at launch, but our foundry partner wanted us to show it off at their event.

It is an 8-core Orochi die, that is what you will find in desktop and server products.

Don't worry about modules, we aren't going to market them that way, you'll only see core counts.
 

sarwar_r87

Distinguished
Mar 28, 2008
837
0
19,060


But in theory, a 2-module, 4-"core" version (when it is released) will be worse than the PII X4 that we have now :eek:
That's gonna look bad, or am I missing something?

----------------


L3 cache looks a lil small. reminds me of Phenom I :whistle:
 

sarwar_r87



And you're missing MY point ENTIRELY. I know it's a new architecture and all. But in today's dual core there are two sets of integer units, decoders (6-wide in total), FP units, etc., one for each core. One BD module has two integer cores, one set of decoders (4-wide in total) and one FP unit. The L2 cache may be shared, and it may be a brand-new architecture, but unless AMD has figured out how to fuse two FP calculations into one thread, I just don't see how it will hold up IN MULTITASKING FP calculation. However, the IPC gains will help single-threaded apps, and two integer cores will eat up any encoding/decoding. But FP calculation? I'm not sure.

And let's not forget the reduced number of decoders. Unless the decoders in K10.5 were sitting idle most of the time, the decoders in BD might turn out to be a bottleneck (considering you wish to market 1 module as a "dual core"). And I never said BD has a small L3 cache. What I meant was that it reminds me of P1.

Either way, BD is something I am waiting for to shush the Intel fannies; there is nothing better than that.

@jaydeejohn: BD uses a deeper pipeline, so it should have less logic per stage than PII. But I guess AMD thinks clock speed won't be a problem and insists on similar, if not higher, clock speeds than PII. And if they can pull off the new prefetch designs, I'm sure IPC will increase. I'm not saying BD will be slower than PII in single thread. I'm asking whether it's fair to compare a 2-module (4-core) BD to a PII X4 or a Core i5/i7 (with 4 fully equipped cores).
 

And you know this.... how?
 

notty22

Distinguished

Follow the link in his signature, that's him.

I understand what sarwar_r87 is saying. IMO, they won't market a 2-module, 4-core version.
They sell a 6-core Thuban now, so they will market "the new 8-core BD", which is really 4 cores, but not quite: it's a new take on 4 [strike]cores[/strike] modules that AMD is calling 8 cores.
After all, that's how Windows will see it, lol
 

sarwar_r87



Maybe they won't. But still, AMD's approach to chip-level multiprocessing with modules may create a bottleneck. And given that AMD has always been behind Intel in FP computation (even with a PII @ 3.4 GHz vs. a C2Q @ 2.83 GHz), it is a big gamble they are taking.

Maybe they redesigned the FP schedulers to provide better performance, but still, I fear it won't be enough.

If they were to market them as 4 cores, they wouldn't have the first 8-core CPU, but it would let them compete against a four-core i7 with a 4-core (8-threaded) BD... and that would sound fancy, just like a four-core Intel wipes an X6 in some benchmarks.



OK, let's see if you can wrap your head around this:

2-core PII: 2x integer core, 2x floating-point scheduler, 2x decoder (3-wide each, 6-wide in total), 2x L2 cache
1-module BD: 2x integer core, 1x floating-point scheduler, 1x decoder (4-wide), 1x shared L2 cache

The only real advance for BD is the higher IPC from better prefetch and the shared L2, and of course the better architecture. That means integer computation will improve and your office software will open faster.
But things like ray computation (the kind of thing modern hardware still lags on) and games, which depend on floating-point calculation, are limited because there is less hardware to work with. Also, let's not forget that the total decode width is now narrower.
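
For anyone who wants the comparison above as numbers, here is a minimal Python sketch that divides each resource by the number of threads sharing it. The counts are the poster's figures from the two lines above, not official specs; the `Layout` class, the `relative` helper and the choice of two threads per configuration are assumptions made only for illustration. Raw unit counts also say nothing about how wide or fast each unit is, so this only restates the argument, it does not predict performance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    """Resource counts for one configuration (poster's figures, not official specs)."""
    name: str
    threads: int        # assumed: threads the OS would see
    int_cores: int
    fp_schedulers: int
    decode_width: int   # total decode width across the configuration
    l2_caches: int

    def per_thread(self) -> dict:
        """Divide every resource by the number of threads sharing it."""
        shared = ("int_cores", "fp_schedulers", "decode_width", "l2_caches")
        return {field: getattr(self, field) / self.threads for field in shared}


# Hypothetical names; the numbers come straight from the comparison in the post.
PII_DUAL = Layout("2-core PII", threads=2, int_cores=2,
                  fp_schedulers=2, decode_width=6, l2_caches=2)
BD_MODULE = Layout("1-module BD", threads=2, int_cores=2,
                   fp_schedulers=1, decode_width=4, l2_caches=1)


def relative(new: Layout, old: Layout) -> dict:
    """Per-thread resources of `new` expressed as a fraction of `old`."""
    old_pt, new_pt = old.per_thread(), new.per_thread()
    return {key: new_pt[key] / old_pt[key] for key in old_pt}


if __name__ == "__main__":
    for resource, ratio in relative(BD_MODULE, PII_DUAL).items():
        print(f"{resource:14s} {ratio:.2f}x per thread vs. PII")
```

On these counts the integer side is unchanged per thread, while the FP scheduler and L2 count are halved and decode width drops to two thirds, which is the whole worry in a nutshell.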
 
I can see where sarwar is coming from. In essence, Bulldozer's CMT is their version of SMT, only on a much larger scale. When it was first revealed how it worked, I thought they were adding FPUs for their CMT so that each core had 2 units, 1 for each thread.

But instead they are basically cutting the core down to half of what a Deneb core has and having the module share them.

If they do release a 2-module, 4-core part, I can see how it might not outperform something like a Deneb quad core.

Then again, we have to wait and see if the way AMD is going is the best way. They seem to be trying something totally off the wall, which is not good for a company that hasn't made a consistent profit over the last few years. Normally off-the-wall stuff is for someone like Intel, who can afford to not profit as much (Netburst).

Hopefully it does perform decently. We will have to wait and see though.

I am a little annoyed at them though. They are confusing the market with their new modules and core naming scheme. Hopefully we will be able to tell which CPU to test against which in SB vs BD so we can tell which is the better performer core per core and clock per clock.
 

sarwar_r87



1. Exactly. I do agree that AMD's answer to HT is far superior to Intel's. However, calling it 8-core may backfire if it can't hold its ground against an 8-core Intel, because the average Joe doesn't know that a module is more like 1.5 cores than 2, so they will compare it with an Intel 8-core. So if the AMD 8-core falls behind Intel's, AMD will only have their marketing policy to blame, even if the engineers have done an excellent job.

2. According to AMD's claim of 33% more cores providing a 50% performance boost, they want to position a 4-module BD against a six-core i7.

@jaydeejohn: it would be nice if you could share the slide :) or at least which site you saw it on :)
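
To put the "33% more cores, 50% more performance" figure from point 2 into per-core terms, here is a rough Python sketch. It assumes throughput scales linearly with core count, which real workloads rarely do, and the function name `implied_per_core_gain` is made up for this example. The percentages are the ones quoted in the post, not verified numbers.

```python
def implied_per_core_gain(core_increase: float, perf_increase: float) -> float:
    """Per-core throughput ratio implied by a core-count and a performance increase.

    Assumes performance scales linearly with the number of cores.
    Both arguments are fractions, e.g. 0.5 for "50% more".
    """
    return (1.0 + perf_increase) / (1.0 + core_increase)


# 8 cores vs. 6 cores is one third more cores (the "33%" in the post).
core_increase = (8 - 6) / 6
gain = implied_per_core_gain(core_increase, 0.50)

if __name__ == "__main__":
    print(f"Implied per-core gain: {gain:.3f}x")
```

Under that linear-scaling assumption the claim works out to roughly a 12.5% gain per core, so most of the quoted 50% would come from the extra cores rather than from each core being faster.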
 

jf-amd



FP will be a big jump. Keep an eye out for a Bulldozer blog about FP in the upcoming weeks. I am about halfway through it at this point. I have some engineers that are working with me on it because floating point micro ops are not my strong point, I am not an engineer.
 

sarwar_r87



That is excellent news :)