AMD Says That CPU Core Race Can't Last Forever

Most software these days already benefits from multiple cores; it's just that every time the number of cores doubles, performance only scales by around +64% to +85% (in the good cases).

You don't want games running at near 100% load on every CPU core, only 97% (or so) maximum, and even then it won't give linear performance scaling in most cases.
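
As a rough illustration of where figures like +64% to +85% per doubling can come from, here is a minimal C++ sketch of Amdahl's law; the parallel fractions of 0.90 and 0.95 are assumed purely for illustration, not measured from any real game, and the per-doubling gain shrinks as the core count grows.

[code]
#include <cstdio>

// Amdahl's law: speedup on n cores when a fraction p of the work is parallel.
static double amdahl(double p, int n) {
    return 1.0 / ((1.0 - p) + p / n);
}

int main() {
    const double fractions[] = {0.90, 0.95};   // assumed parallel fractions
    for (double p : fractions) {
        for (int n = 2; n <= 16; n *= 2) {
            // Gain from the most recent doubling of the core count.
            double gain = amdahl(p, n) / amdahl(p, n / 2);
            std::printf("p=%.2f %2d cores: %5.2fx total, +%2.0f%% over %2d cores\n",
                        p, n, amdahl(p, n), (gain - 1.0) * 100.0, n / 2);
        }
    }
    return 0;
}
[/code]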

What consumer machines, at least the ones high-end gamers want, really need is something like NUMA or NUMAlink.

- http://en.wikipedia.org/wiki/Non-Uniform_Memory_Access
- http://en.wikipedia.org/wiki/NUMAlink
- http://www.google.com (Google it!)

The reason for this is that the cache of one CPU module sits between its shared memory and another CPU module (and vice versa).

This can permit above +100% scaling, as the cache hit rate for memory access generally improves under a NUMA system. (It's hard to diagram in ASCII art).

With the HyperTransport (and similar) 'bus' technologies we have in consumer hardware today, there is no reason why this can't be implemented when required. (Which will be when marketing says it is, not the engineers.)

The memory controller is already integrated into the CPU, making this even easier to do. AMD was doing it with the Opteron 200 series back in 2003. (Generally on server/workstation hybrids with Registered x4/ChipKill(tm) ECC DDR-SDRAM PC3200.)
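
For anyone curious what NUMA awareness looks like from software today, here is a minimal Linux sketch using libnuma; it assumes a NUMA-capable kernel and linking with -lnuma, and only shows the basic idea of keeping a buffer on a specific node so threads running there get local, lower-latency accesses.

[code]
#include <numa.h>     // libnuma: numa_available, numa_alloc_onnode, numa_free
#include <cstdio>
#include <cstring>

int main() {
    if (numa_available() < 0) {
        std::fprintf(stderr, "No NUMA support on this system\n");
        return 1;
    }
    const size_t bytes = 64UL * 1024 * 1024;
    // Allocate the buffer on node 0, so threads pinned to node 0's cores
    // hit local memory instead of paying the remote-node penalty.
    void *buf = numa_alloc_onnode(bytes, 0);
    if (!buf) return 1;
    std::memset(buf, 0, bytes);   // touch the pages so they are actually placed
    std::printf("Allocated %zu MiB on NUMA node 0\n", bytes >> 20);
    numa_free(buf, bytes);
    return 0;
}
[/code]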

It's awesome stuff, just don't expect software developers to make stuff geared towards NUMA systems for the mainstream market for another 15 - 50 years.
 
Imagine .NET 1.1, Java 6 ME/SE, and parts of DirectX / WDDM v1.1 being integrated as 'instructions' in the processor, with firmware updates providing new, better-optimized microcode (including bug fixes).

We already have CRC32 as an 'instruction' on modern processors, and various AES acceleration instructions too (which I believe we have VIA/Cyrix to thank for, as they did it about five years before Intel or AMD).
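
For the curious, here is roughly what using the hardware CRC32 instruction looks like from C++ via the SSE4.2 intrinsics. It has to be built with SSE4.2 enabled (e.g. -msse4.2 on GCC/Clang), and note that the instruction implements the CRC-32C (Castagnoli) polynomial rather than the zip/Ethernet one.

[code]
#include <nmmintrin.h>   // SSE4.2 intrinsics, including _mm_crc32_u8
#include <cstdio>
#include <cstring>

// CRC-32C of a buffer using the dedicated CRC32 instruction, one byte at a time.
static unsigned int crc32c(const unsigned char *data, size_t len) {
    unsigned int crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; ++i)
        crc = _mm_crc32_u8(crc, data[i]);
    return crc ^ 0xFFFFFFFFu;
}

int main() {
    const char msg[] = "hello world";
    std::printf("crc32c = 0x%08x\n",
                crc32c(reinterpret_cast<const unsigned char *>(msg), std::strlen(msg)));
    return 0;
}
[/code]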

That seems to be the solution to 'the power input/heat output problem'.

It's not rocket science, that's for sure...
 
Here's an idea (maybe a prediction). Instead of "cores", let's focus on "computing entities". Put 16 cores on a chip, design the system with tons of RAM, drive space, etc., and let the hardware turn this one machine into 4 logical units, powering off CPUs, RAM, and other resources entirely whenever an "entity" is shut down. Make this box 4 perfectly independent machines (or even 8, or even 16) all on one motherboard, plus whatever other devices are necessary. Instead of putting a computer at every desk in cube-topia, just put one for every 8 cubes and connect all the peripherals wirelessly: keyboards, mice, displays, speakers, printers, and so on. Let this box become the "terminal server" of the cube, except that instead of sharing server space, you actually get a full-blown machine logically cut out of a box. This would be much like the server-blade concept, except that each "blade" would be physical CPUs in the box with their own dedicated (or dynamically shared) RAM and other resources. No competition for CPU time, it's all in one box, it saves power, removes the wires, and keeps it all independent. The major drawback would be that when the entire box fails (power supply or whatever), every machine goes down together.
 
Do the above with a fully virtualized HAL and have minimum/maximum sharing of CPU timeslices and cores/threads. (i.e. a minimum of 2 SMT threads and 2 GHz, or X cycles and/or X MIPS/GFLOPS, and a maximum of about 70% to 85% of the hardware's compute power and physical RAM.)

Wireless USB 1.1 is already here.

The main issue is getting monitors and display output to each of the 'cubicles'.

That's a fantastic idea though.
 
I think we've got to end the race with copyrights. We can't have all the companies copying each other's technology, or spending big bucks trying to put outrageous technology out there.
 
You can't make a 6-inch CPU, as the chance of a fault that makes it a non-sellable unit rises exponentially with area in a two-axis fabrication. That sends selling prices through the roof, since you would have to throw 85% to 99.999999999% of your processors away due to defects. (Even with defect prevention/redundancy, the figure would be huge.)
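
A back-of-the-envelope sketch of why yield collapses with die area, using the simple Poisson yield model Y = e^(-D*A); the defect density of 0.5 defects/cm^2 is an assumed, purely illustrative number, and ~180 cm^2 stands in for a 6-inch-diameter die.

[code]
#include <cmath>
#include <cstdio>

int main() {
    const double defects_per_cm2 = 0.5;                  // assumed defect density
    const double areas_cm2[] = {1.0, 3.0, 10.0, 180.0};  // ~180 cm^2 ~= a 6-inch die
    for (double area : areas_cm2) {
        double yield = std::exp(-defects_per_cm2 * area); // Poisson yield model
        std::printf("die area %6.1f cm^2 -> yield %.12f%%\n", area, yield * 100.0);
    }
    return 0;
}
[/code]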

They do have 'stackable' processor modules though, at least for mainframes. (Similar idea, but with much higher yields.)
 
[citation][nom]eddieroolz[/nom]I suppose someone had to pop our dreams at some point. Realistically I think we'd top out at around 24 cores, if we even get there at all. When we do top out though, what will they focus on next? Instructions per cycle? Pipeline width?[/citation]

We have 6-core 45nm chips, and we are supposed to be at 11nm by 2015. If my math is right, we could see 96 cores on a single CPU at current die sizes.
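
For anyone checking that math, a full node shrink scales transistor density roughly with the square of the feature-size ratio, which lands in the same ballpark; it's only a rough upper bound that ignores uncore, power, and yield.

[code]
#include <cstdio>

int main() {
    const double now_nm = 45.0, future_nm = 11.0;
    const int cores_now = 6;
    double linear = now_nm / future_nm;   // ~4.1x linear shrink
    double area   = linear * linear;      // ~16.7x more transistors in the same area
    double cores  = cores_now * area;     // roughly 100 cores in the same die area
    std::printf("linear %.2fx, density %.1fx, roughly %.0f cores\n",
                linear, area, cores);
    return 0;
}
[/code]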
 
More cores haven't increased my processing speed as much as faster processors and more data bits have. Yeah, why aren't they talking about a 128/256-bit microprocessor?
 
Here is a clue... developers are writing code that leverages multiple cores. Anything that was written to take advantage of thread pools in any of the .NET languages (VB.NET, C#, C++ in Visual Studio) will take advantage of multiple cores. This requires, though, a whole new framework that has to evolve, since each new thread you add to your execution requires checks to make sure the threads aren't stepping on each other's toes, and cleanup to make sure that the resources those threads are using are released to be used again. Granted, most modern IDEs go a long way toward building these techniques into their environments, but I envision something along the lines of the Grand Central Dispatch that Apple implemented in their last version of OS X. This gives the operating system full control over thread creation and garbage collection based on the resources available. All the programmer has to do is define which operations need to be completed, with whatever resources are available. Granted, the square brackets you have to use in ObjC aren't very pretty, but it does work, and it looks like the most efficient model I have seen so far. Far better than thread pools, anyway.
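
Not .NET or GCD, but here is a minimal C++11 sketch of the same "hand the runtime a task and let it worry about the threads" model, using std::async; the chunked-sum work function is made up purely for illustration.

[code]
#include <future>
#include <numeric>
#include <vector>
#include <cstdio>

// A made-up unit of work: sum one chunk of the data.
static long long sum_chunk(const std::vector<int>& v, size_t begin, size_t end) {
    return std::accumulate(v.begin() + begin, v.begin() + end, 0LL);
}

int main() {
    std::vector<int> data(1000000, 1);
    const size_t chunks = 8;
    std::vector<std::future<long long>> tasks;

    // Hand each chunk off as a task; a GCD-style dispatcher would multiplex
    // these onto a shared pool, here each one simply runs asynchronously.
    for (size_t i = 0; i < chunks; ++i) {
        size_t begin = i * data.size() / chunks;
        size_t end   = (i + 1) * data.size() / chunks;
        tasks.push_back(std::async(std::launch::async, sum_chunk,
                                   std::cref(data), begin, end));
    }

    long long total = 0;
    for (auto& t : tasks)
        total += t.get();   // wait for each task and collect its result
    std::printf("total = %lld\n", total);
    return 0;
}
[/code]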
 
[citation][nom]back_by_demand[/nom]The reason is not because of regulation or noise, as the Concorde fleet ran flawlessly for nearly 30 years in the lucrative trans-Atlantic market. The reason why no one travels supersonic anymore is that the Concorde fleet was retired due to the French not keeping their runways clean, and the fact that the majority of Concorde's regular passengers were killed on 9/11. Virgin Atlantic offered to buy the Concorde fleet and bring it up to 21st-century specs, but the UK Government refused to issue a license. On top of that, no one has the money to design and build a new fleet of supersonic airliners, so the focus has now become one of increasing the comfort level of passengers rather than reducing flight time.[/citation]
Very valid point here
http://en.wikipedia.org/wiki/Concorde#Retirement
As far as producing noise goes, it is very clear about that also:
http://en.wikipedia.org/wiki/Concorde#Environmental
Noise abatement laws were introduced despite, and I quote:
"Even before the launch of revenue earning services, it had been noted that Concorde was quieter than several aircraft already commonly in service at that time."
Sounds like a law pushed through specifically to target the Concorde because it harmed American airlines. Anyone ever seen the debacle with Pan Am?
 
Great, so AMD's CTO oversaw what is largely considered to be one of the worst chips Intel has ever released? Is AMD really THAT desperate to get their hands on ex Intel staffers?

OK, that aside. There are already 256-core CPUs in existence from multiple vendors, just not Intel or AMD. The Rapport Kilocore has 256 RISC cores per CPU, the ClearSpeed CSX700 has 192, and the Ambric Am2045 has 336. And... oh yeah, the GTX 480 in my box has 480 cores (240 in double precision); all NVIDIA needs to do is license x86 and add the extra instructions...

So what he should have said is: "Even though several other companies are already producing chips with this number of cores, AMD won't be doing it"
 
An addition to my previous comment... writing code in the future to take advantage of multi-core (and multi-resource) environments doesn't do anything for the untold millions of lines of legacy code out there that many companies will find impossible to replace due to time and budget constraints.

Instead, these companies need to look at moving this code off antiquated computers and onto virtual machines, which can be hosted on modern equipment that takes advantage of faster and more power-efficient processors. The only issue I can see is when proprietary hardware is involved, such as interface cards or hardware dongles.

I myself have written several applications for customers which do nothing more than act as a front end to applications that are decades old. Believe it or not, there is still a lot of money in doing custom one-offs like this, and you would be surprised at the companies that are doing it.

Granted, it did not fix any of the issues with having to use such antiquated code, but it did give the company some breathing room, so that when they bought a new accounting package a few months later, they could take their time setting it up and easing the implementation into place. Their original concern when they called me was that they had to go to Goodwill to get a replacement power supply.
 
[citation][nom]kelemvor4[/nom]Great, so AMD's CTO oversaw what is largely considered to be one of the worst chips Intel has ever released? Is AMD really THAT desperate to get their hands on ex Intel staffers?[/citation]

To be fair, they had great ideas that looked flawless on paper and in the math, but the real world doesn't always pan out to be what we expect. Look at NVIDIA: they planned for their first-gen Fermi GPUs to be about 50% faster than they turned out, and planned for a 10% loss in overall power, but ended up between 30-50% lower than they expected, putting them really close to AMD's performance.

[citation][nom]mhelm1[/nom]More cores haven't increased my processing speed as much as faster processors and more data bits. Yea, why aren't they talking about a 128/256 bit microprocessor.[/citation]

This is very simple for me to explain, because I was wondering why the hell we weren't doing 256-bit or higher already, since it seems faster and better.

Look at 64-bit, pre-Windows 7. 64-bit makes you use more resources for lesser tasks. Yeah, it may go faster, but it uses more resources even when it's unnecessary.

Computers will not benefit from this as much as, let's say, a console. Because a console's hardware is static, it needs to pull more from what it has, and if that means taking a hit early on until programmers adapt to using it, then so be it. It will give consoles an overall longer lifespan, and considering the graphics and tech they can push when things are optimized, it should surprise the hell out of you that they are using the equivalent of a GeForce 7000-something.

With a computer, we can throw better hardware at things, so it's not really necessary. A 128-bit CPU may be great compared to its 64-bit predecessor, but by the time software is optimized for that 128-bit chip, a new 64-bit chip would come out eclipsing the 128-bit performance increase.

Now here is where I am most likely wrong (and right; it's a mix): 128-bit should be two times faster than 64-bit, but in the real world it would only show something like a 5% increase in power.

It would be better to optimize for only one or the other, but you have to take into account legacy support and also the real-world impact of requiring three builds of any given program. Also, let's say that something was 32-bit: it overflows a bit, so 64-bit would be faster, but it doesn't overflow enough to make 128-bit faster. Sure, 128-bit can handle more, but it ends up about equal to the 64-bit version.
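
A small illustration of that overflow point, using the non-standard __int128 extension found in GCC and Clang: a value that fits comfortably in 64 bits gains nothing from being computed at 128 bits, because the wider type only starts to matter once the narrower one would overflow.

[code]
#include <cstdio>
#include <cstdint>

int main() {
    uint64_t a64 = 3000000000ULL;     // ~3e9: too big for 32 bits...
    uint64_t b64 = a64 * a64;         // ...but 9e18 still fits in 64 bits

    __uint128_t a128 = a64;
    __uint128_t b128 = a128 * a128;   // same value; the upper 64 bits stay zero

    std::printf("64-bit  result: %llu\n", (unsigned long long)b64);
    std::printf("128-bit result (low 64 bits): %llu\n", (unsigned long long)b128);
    // Until a computation actually overflows 64 bits, doing it at 128 bits
    // is just extra work for the same answer.
    return 0;
}
[/code]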

And everyone, please correct me if I'm wrong on any part; I love learning more about these kinds of topics.
 
[citation][nom]dragoon190[/nom]/Off topicThe reason why passenger jets don't travel any faster now is because they would go (locally) supersonic if they fly any faster, and that would cause all sorts of noise and regulation problem (think Concord and how it's only allowed to fly over the ocean)./endOffTopicAnyway, I don't really see the CPUs going to 128 cores when the majority of the programs nowadays barely even utilize more than 2 cores.[/citation]

I'm waiting for compilers to start multithreading our programs for us. That will be the day...
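
The closest things available today are compiler options like GCC's -ftree-parallelize-loops=N and annotation-driven approaches like OpenMP. Here is a small OpenMP sketch where a single pragma asks the compiler and runtime to spread a loop across cores; build with something like g++ -fopenmp, and without that flag the pragma is simply ignored and the loop runs serially.

[code]
#include <cstdio>
#include <vector>
#include <cmath>

int main() {
    const int n = 1 << 20;
    std::vector<double> out(n);

    // One annotation; the compiler and the OpenMP runtime handle thread
    // creation, work splitting, and joining.
    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
        out[i] = std::sqrt(static_cast<double>(i));

    std::printf("out[%d] = %f\n", n - 1, out[n - 1]);
    return 0;
}
[/code]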
 
Unless someone comes up with an algorithm to break an inherently serial task into a parallel one, some types of programs, specifically those that are inherently serial in nature, will never perform better on a multi-core system. It is not necessarily a matter of programmers writing more programs that use multi-core cpus. There are classes of programs that run better serially, and there are classes of programs that would run better in parallel and benefit from multi-core cpus.

The main type of program that would benefit from multi-core CPUs is one where there is a large number of similar sets of data on which the same calculation needs to be done. This is the kind of task that you can easily break up into many smaller tasks and run each task on its own core. This is why GPUs are so effective at rendering these days: the nature of the task is to have many sets of data where each set requires the same calculation. So, you break it up into many tasks and run each one on its own core.
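
To make the distinction concrete, here is a small sketch with two loops: the first has completely independent iterations and splits cleanly across cores (same -fopenmp caveat as the sketch above), while the second carries a dependency from one iteration to the next, so extra cores cannot help it no matter how the code is written.

[code]
#include <cstdio>
#include <vector>
#include <cmath>

int main() {
    const int n = 1 << 20;
    std::vector<double> data(n, 1.0), out(n);

    // Data-parallel: every iteration touches only its own element, so the
    // work can be split across as many cores (or GPU threads) as exist.
    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
        out[i] = std::sqrt(data[i]) * 2.0;

    // Inherently serial: each step needs the previous step's result first,
    // so it runs at single-core speed no matter how many cores are present.
    double x = 1.0;
    for (int i = 0; i < n; ++i)
        x = std::sin(x) + data[i];

    std::printf("%f %f\n", out[n - 1], x);
    return 0;
}
[/code]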

For most programs that consumers run, like e-mail, word processing, etc., it would be more inefficient to break them up into parallel tasks. In fact, doing so would likely cause these programs to run more slowly than they do. So, it does not make sense to parallelize these types of tasks. In some cases a program like Excel might run faster on a multi-core system, but that would be highly dependent on the data that one is crunching.

So, I will not be expecting a treasure trove of programs that are coded to take advantage of multi-core systems simply because many programs would become more inefficient if they were coded that way.

If anyone is interested in learning more about a task that takes just as long to run on a multi-core setup as it does on a single core, you will find some interesting references regarding the Seti@Home work unit known as the VLAR work unit. One of those work units takes just as long to run on a GPU as it does on a CPU, simply because the nature of the data is serial, not parallel.
 
[citation][nom]Houndsteeth[/nom]An addition to my previous comment...writing code in the future to take advantage of multicore (and multiresource) environments doesn't do anything for the untold millions of lines of legacy code out there that many companies will find it impossible to replace due to time and budget constraints.Instead these companies need to look to moving this code off of antiquated computers and moving it on to virtual machines which can be hosted on modern equipment that takes advantage of faster and more power efficient processors. The only issue I can see is when proprietary hardware is involved, such as interface cards or hardware dongles.I myself have written several applications for customers which do nothing more than act as a front end to applications that are decades old. Believe it or not, there is still a lot of money doing custom one-offs like this, and you would be surprised at the companies which are doing this.Granted, it did not fix any of the issue with having to use such antiquated code, but it did give the company some breathing room so that when they bought a new accounting packages a few months later, they could take their time setting it up and easing the implementation into place. Their original concern when they called me was that they had to go to Goodwill to get a replacement power supply.[/citation]
What people fail to realise is that a single core in a modern multi-core processor is already far more powerful than the single-cored processors of years past.

By all means use a multi-core processor for the programs that use it, but older legacy code will be handled by a single core with ease.
 
[citation][nom]molo9000[/nom]How about replacing x86 instead of adding more and more stuff to it?I'm not very familiar with x86, but is a 32year old instruction set still useful?The number of transistors in a processor has grown from 29 thousand to well over a billion in those 32 years.[/citation]

Exactly. They should try to think "outside-the-box" and build a new design from the ground up for efficiency and speed. Yes, it would require all new programming, but so be it.
 
What about 128-bit chips? Or 256-bit CPUs? I think the integrated GPU will add more float processing, but what about a physics chip? I think it would be cool to have physics built into the OS, doing all kinds of fun things with windows and applications.
 
Why not build a computer with two CPUs? I mean, we already have SLI and CrossFire technology. But I really hope that at the end of the day NVIDIA enters the CPU race; having only two choices really makes us consumers unhappy.
 
[citation][nom]dragoon190[/nom]/Off topicThe reason why passenger jets don't travel any faster now is because they would go (locally) supersonic if they fly any faster, and that would cause all sorts of noise and regulation problem (think Concord and how it's only allowed to fly over the ocean)./endOffTopicAnyway, I don't really see the CPUs going to 128 cores when the majority of the programs nowadays barely even utilize more than 2 cores.[/citation]

You make a good point, though...
The laws of physics and the physical properties of elements do not change. Technology designed to work with them can continually adapt, but in the end it is subject to those laws.

A similar argument: road vehicles can travel much faster than the posted speed limits. The tech is there. So why not?
Oh, wait... safety concerns / human factors must be added in.
 