Is Core 2 Duo the first true, native dual core processor?


quartzlock

Distinguished
Apr 16, 2006
18
0
18,510
From Intel's doc

I can't guarantee that the data one core writes out to the cache won't land where the other core's data sits; I'm afraid I'm not well documented there, but the term "thrashing" is useful here. Imagine core 1's dataset is 3 MB, core 2's dataset is 2 MB, and all the code is cached in each core's L1. Core 1 is not hurting itself, but it could be hurting the other core's dataset, because together they don't fit in the 4 MB cache (C2D E6600, for example). From what I understand, Smart Cache doesn't do much to prevent thrashing. I can imagine Smart Cache seeing that some data belongs to one specific core and, in an attempt to avoid flushing the other core's data, flushing a line of its own space instead, but such a decision is based on prediction. Again, the worst case of such a prediction is that you cause thrashing for the other core. Thrashing can be avoided with the right programming techniques, though: the same multithreaded app could run flawlessly on a Q6600 or completely clog the FSB if it's badly written.
 

rninneman

Distinguished
Apr 14, 2007
92
0
18,630
That brings me to an interesting question that I haven't been able to find an answer to. In the case of a Q6600, which has 2 discrete shared L2 caches, does the OS or the application allocate a core off the first die first, then a core off the second die second, then the second core off the first die third, etc.? This would avoid cache thrashing if 2 threads are running on a quad-core CPU. It would also be effective in 2+ socket systems with dual-core chips, like a dual-socket dual-core Xeon workstation. (In that case, each thread would benefit from both the larger L2 cache and the full FSB bandwidth.) I would assume this would also benefit dual-core processors with Hyper-Threading. I would like to know whether it is best to allocate in logical core order (0, 1, 2, 3, etc.) or based on die or chip order (0, 2, 1, 3).

Ryan
 

SockPuppet

Distinguished
Aug 14, 2006
257
2
18,785
That brings me to an interesting question that I haven't been able to find an answer to. In the case of a Q6600, which has 2 discrete shared L2 caches, does the OS or the application allocate a core off the first die first, then a core off the second die second, then the second core off the first die third, etc.? This would avoid cache thrashing if 2 threads are running on a quad-core CPU. It would also be effective in 2+ socket systems with dual-core chips, like a dual-socket dual-core Xeon workstation. (In that case, each thread would benefit from both the larger L2 cache and the full FSB bandwidth.) I would assume this would also benefit dual-core processors with Hyper-Threading. I would like to know whether it is best to allocate in logical core order (0, 1, 2, 3, etc.) or based on die or chip order (0, 2, 1, 3).

Ryan

There is no such thing as "cache thrashing". It was a futile attempt by some AMD fanboys to cast the C2D in a negative light prior to launch.
 

quartzlock

Distinguished
Apr 16, 2006
18
0
18,510
Haha, well, yes and no. Let's say you have a Q6600 and you are running 2 threads. If these threads have a common dataset, you want them running on the same die (cores 1 and 2) so you can benefit from the shared L2, and not on cores 1 and 3, because the cache coherency traffic would have to run over the FSB. On the other hand, if you have 2 threads without a common dataset, you want them running on different dies, so each can use its local L2 to its maximum potential and reduce FSB coherency traffic.

What I do most of the time is get the number of sockets, dies, and cores with cpuid, so I know exactly what architecture my code is running on, and then use affinity selection to keep some threads together and keep others out of the rest's way.

There is no such thing as "cache thrashing". It was a futile attempt by some AMD fanboys to cast the C2D in a negative light prior to launch.

Assembler instructions like movntdq or movntq were made specifically to deal with the thrashing problem. Thrashing is not something specific to AMD or Intel, nor to multicore; even single cores suffer from it when you don't treat it with the right assembler optimizations (or intrinsics). Both Intel's and AMD's approaches have their pros and cons. Most optimized apps check what hardware they are running on and run code accordingly.
 

joefriday

Distinguished
Feb 24, 2006
2,105
0
19,810
I'd say the X2 was, since it was the first processor available with two cores, regardless of their cache structure or die type.

Wasn't it technically the Opterons? Servers hold priority over everything else.

Yes, Opteron was the first according to this article

http://en.wikipedia.org/wiki/Amd

Actually, Intel beat AMD to market by a few days with an x86 dual-core CPU, the Pentium EE 840. AMD's Opteron was the first dual-core server chip by quite a margin, though; Intel didn't release a dual-core Xeon until much later.

http://news.com.com/AMD+releases+dual-core+server+chips/2100-1006_3-5678562.html
 

rninneman

Distinguished
Apr 14, 2007
92
0
18,630
Generally, a native dual-core CPU is regarded as one that has both cores on the same die.

If you take this as the definition of native dual-core, then the Pentium D Smithfield was a native dual-core CPU as well.

I forgot Smithfield was on the same die. When I think of Pentium D, I usually think of Presler, which was an MCM. Thank you for pointing that out.