Scott2010au
Distinguished
Most software these days already benefits from multiple cores; it's just that every time the number of cores doubles, performance only scales by around +64% to +85% (in the good cases).
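To put numbers on that, here is a rough sketch of how the per-doubling figures compound. It assumes the +64% to +85% gain holds at every doubling, which is a simplification; the function name is made up for illustration.

```python
import math


def speedup_after_doublings(per_doubling_gain, cores):
    """Cumulative speedup over one core, assuming every doubling of the
    core count multiplies throughput by (1 + per_doubling_gain)."""
    doublings = math.log2(cores)
    return (1.0 + per_doubling_gain) ** doublings


# Compare the low (+64%) and high (+85%) cases against ideal linear scaling.
for cores in (2, 4, 8, 16):
    low = speedup_after_doublings(0.64, cores)
    high = speedup_after_doublings(0.85, cores)
    print(f"{cores:2d} cores: {low:.2f}x to {high:.2f}x (ideal {cores}x)")
```

Under that assumption, 8 cores land at roughly 4.4x to 6.3x a single core rather than 8x.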
You don't want games running at near 100% load on every CPU core, only 97% (or so) at most, and even then you won't get linear performance scaling in most cases.
What consumer machines really need, at least the ones high-end gamers want, is something like NUMA or NUMAlink.
- http://en.wikipedia.org/wiki/Non-Uniform_Memory_Access
- http://en.wikipedia.org/wiki/NUMAlink
- http://www.google.com (Google it!)
The reason is that the cache of one CPU module sits between its shared memory and another CPU module (and vice versa).
This can permit above +100% scaling, as the cache hit rate for memory access generally improves under a NUMA system. (It's hard to diagram in ASCII art).
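Since a diagram is awkward, a toy calculation may help instead. This is only a sketch: it assumes a fully memory-bound workload, and every hit rate and latency in it is invented for illustration, not measured on any real hardware.

```python
def avg_access_time(hit_rate, cache_ns, memory_ns):
    """Average memory access time: hits cost cache_ns, misses cost memory_ns."""
    return hit_rate * cache_ns + (1.0 - hit_rate) * memory_ns


def relative_throughput(modules, hit_rate, cache_ns=5.0, memory_ns=100.0):
    """Throughput of a memory-bound workload, taken as proportional to
    the number of CPU modules divided by the average access time."""
    return modules / avg_access_time(hit_rate, cache_ns, memory_ns)


# Hypothetical numbers: adding a second module (with its own cache and
# local memory) raises the overall hit rate from 90% to 95%.
one_module = relative_throughput(1, hit_rate=0.90)
two_modules = relative_throughput(2, hit_rate=0.95)
print(f"scaling from 1 to 2 modules: {two_modules / one_module:.2f}x")
```

With these made-up figures the second module more than doubles throughput, because each access also gets cheaper on average. Real workloads are rarely this memory-bound, so treat it as the best case.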
With the HyperTransport (and similar) 'bus' technologies we have today in consumer hardware, there is no reason why this can't be implemented when required. (Which will be when marketing says it is, not the engineers.)
The memory controller is already integrated in the CPU, making this even easier to do. AMD was doing it with the Opteron 200 series back in 2003. (Generally on server/workstation hybrids with Registered x4/ChipKill(tm) ECC DDR-SDRAM PC3200.)
It's awesome stuff; just don't expect software developers to make stuff geared towards NUMA systems for the mainstream market for another 15 to 50 years.