AMD: You Want More Cores? OK, You've Got It!

Status
Not open for further replies.
Well, it is easily possible to get a CPU past 3.5 GHz... if you have the proper cooling. Heat is one of the biggest problems they face, and metallization technology is another. As the transistors keep shrinking, the wires must also keep getting smaller, but the smaller a wire gets, the higher its parasitic capacitance and resistance become, which leads to many issues. Software is starting to catch up, though, as more people take advantage of thread-level parallelism.
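A minimal sketch of that thread-level parallelism in Python (illustrative only; note that on CPython, threads don't actually speed up CPU-bound work because of the GIL, so for real speedups you'd use processes, but the decomposition pattern is the same):

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(chunk):
    # Each worker sums its own slice independently -- no shared state.
    return sum(chunk)

def parallel_sum(data, workers=4):
    # Split the data into one contiguous chunk per worker, then fan out.
    step = (len(data) + workers - 1) // workers
    chunks = [data[i:i + step] for i in range(0, len(data), step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))

print(parallel_sum(list(range(1000))))  # 499500, same as the serial sum
```

The point is simply that the work has to be split into independent pieces before extra cores can help at all.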
 
[citation][nom]superjunaid[/nom]This article is apparently geared towards businesses and professionals working on workstations rather than consumers using computers to read the news, write emails and create documents in word.Faster cpus with 2-4 cores is more what consumers want. But for businesses and server environments more cores make sense.[/citation]
Agreed
 
I for one can NOT wait until we reach the point of diminishing returns as you continue to add more cores.

Remember, people: more cores = more chances for a bottleneck in a particular application, especially one that's threaded too heavily.
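For what it's worth, diminishing returns from extra cores is exactly what Amdahl's law predicts: the serial fraction of a program caps the speedup no matter how many cores you throw at it. A quick sketch (the 90%-parallel figure is an arbitrary assumption, not a measurement):

```python
def amdahl_speedup(cores, parallel_fraction):
    # Amdahl's law: speedup = 1 / (serial + parallel/cores).
    # The serial fraction caps overall speedup regardless of core count.
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)

# For a program that is 90% parallelizable, 64 cores give well
# under a 10x speedup: 1.0, 1.82, 3.08, 4.71, 6.4, 8.77.
for n in (1, 2, 4, 8, 16, 64):
    print(n, round(amdahl_speedup(n, 0.9), 2))
```

Even in the limit of infinitely many cores, a 90%-parallel program tops out at 10x.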
 
So for server workloads (web services, DB, etc.), Oracle (Sun) just last week announced at Oracle OpenWorld the 3rd generation of their Chip Multi-Threading processor for general server workloads (i.e., not a GPU or a specialized Cell-based part), called the T3. (OK, no one said they're creative at naming products, but they do serious innovation in HW and SW.) It doubles the previous 1.6 GHz 8-core part to a 1.65 GHz part that is the world's first 16-core (again, for general server workloads), and each core still has 8 threads with two pipelines per core (i.e., from the OS point of view, that's 128 CPUs per physical processor). Each core also has a built-in FPU and an SPU (Security, as in a crypto accelerator) that accelerates 11 of the most popular ciphers (the previous gen did 10), including ones with big keys. It also has PCIe 2.0 and 2x 10GigE built into the processor instead of on the motherboard, plus a built-in free virtualization hypervisor (Logical Domains) that you can turn on, or you can run a single OS on bare metal. It comes in single-socket server and blade versions as well as 2-socket and 4-socket servers, all available in the next couple of weeks to 3 months depending on the version of the server. BTW, it's binary compatible with all other SPARC processors and versions of Solaris going back to the mid '90s.

Yes, this is not an x86-type processor, and AMD and Intel make the desktop procs that are of interest to this site, but many assume that AMD and Intel make the only server procs and are always in the lead in terms of innovations like more cores and threads per proc, which they are not. Sun (now Oracle) has been leading since Dec 2005, when the 1st gen of the T-Series was released by Sun with 8 cores, 4 threads per core, and 1 pipeline per core, using very little power and cooling even at 100% load in a single-socket server.
 
[citation][nom]gamerk316[/nom]I for one can NOT wait until we reach the point of deminishing returns as you continue to add more cores.Remember people: More cores = more chances for a bottleneck for a particular application, especially one thats threaded too much.[/citation]
Is this sarcastic? Sounds like it. Now, if you said a system that's not balanced, as in not enough I/O (name your favorite: mem, net, PCI cards, etc.) to feed and drain the processor, I'd agree; or if the SW can't take advantage of all the threads, again I'd agree. But threaded too much?? Examples, please: how are they threaded too much, and which bottlenecks are you referring to?
 
So many applications are written to use only 1 core. People underestimate the power of a good clock speed, even though the clock isn't a perfect indicator of speed. I DO think clock speed IS a bit more important today. The computer I am using right now (at work) is a 3.2 GHz... SINGLE CORE! On the other hand, it has 4GB of RAM.

Mine (at home) is 3.6 GHz x 4
 
What use are extra cores and clock speed if the initial algorithms and instruction sets are half-hearted?
 
If they push more cores, then they would need to create a proxy app that takes multithreaded, non-optimized apps and fully load-balances their work onto the extra cores; otherwise they need to wait for MS or the app developers to get jiggy with it, and that might happen only when pigs fly.

I sure can use a few hundred more threads and would love to see VMware Workstation become ATI Stream capable, so we can do with VMware Workstation what we did ages ago with VirtualBox or Xen: offload onto GPUs and use junk like the Intel i7-980X / i7-975 as the glorified, crappy memory controllers that they are.

Hopefully VMware / 3ds Max / other rendering apps will get to make use of these extra cores soon.

PS> Intel had 24 cores back in the '90s and is only now getting to make use of it, so why can't AMD lead with 64 or 128 cores @ 2 threads minimum per core, or more threads for fewer cores?

Hopefully (Sleepless in AMD/ATI) will read this and start to build something useful: "proxy app + hardware"
 
[citation][nom]hawkwindeb[/nom]Is this sarcastic? Sound like it. Now if you said a system that's not balanced as in enough I/O (name your favorite like mem, net, PCI cards, etc) to feed and send to/from the processor, I'd agree or if SW that can't take advantage of all the threads again I'd agree. But threaded too much?? Examples please and how they are threaded too much and what examples of bottlenecks are you referring?[/citation]

Yup: a native Gentoo source compile on a dual-core Intel T2600 laptop, make.conf at -j16, normal emerge, or -j4.
Or a Gentoo source compile on an Intel Q9550, XP x64 + a VMware guest, at make.conf "-j6 -s" then emerge -j16, just to keep the junk busy in VMware while doing low-load stuff on the XP x64 host (VMware 4-core guest). Just need to give XP x64 RAIDed HDs or SSDs and then it can be pushed even more.

Have to do this just to get the CPUs to run optimally, and that's not even at "nice" level yet.

Not sure where the bottleneck comes in, but I could see a valid point about hot-rodders + bottlenecks and Windows issues.
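Those -jN flags control how many compile jobs make runs at once. As a rough sketch of why -j4 helps on a quad core, here is a greedy simulation of make's job slots (the per-file compile times are made up for illustration, not measured):

```python
import heapq

def makespan(job_times, jobs_flag):
    # Simulate `make -jN`: N job slots, and each new compile job goes
    # to whichever slot frees up first (greedy list scheduling).
    slots = [0.0] * jobs_flag          # time at which each slot is free
    heapq.heapify(slots)
    finish = 0.0
    for t in job_times:
        start = heapq.heappop(slots)   # earliest-free slot
        end = start + t
        finish = max(finish, end)
        heapq.heappush(slots, end)
    return finish

jobs = [4, 3, 3, 2, 2, 1, 1]           # hypothetical compile-unit times (s)
print(makespan(jobs, 1))               # 16.0 -- serial build (-j1)
print(makespan(jobs, 4))               # 4.0  -- same work with -j4
```

Of course the real win also depends on I/O; with -jN well above the core count, jobs mostly just queue behind each other.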
 
[citation][nom]schmich[/nom]You wanted to point out > in price right?[/citation]

Being that 12-core AMDs do not exist, it is a moot point, but... in most things, the similarly priced quad from Intel beats the hexa from AMD.
 
In 2006, I was speaking to a Microsoft exec who said that Intel had told Microsoft they were no longer working on the speed of the processor. Instead of speed, they were working on more cores, so that each individual task within the software would be on its own core. This exec explained that if there were 100+ cores, the data would be processed far faster compared to waiting on an overloaded clock cycle on one or two cores.
 
The problem with everyone quoting benchmarks and "performance" is that they're only looking at it from a single application's point of view: X application runs Y percent faster on Z CPU type stuff. This is all well and good, assuming you only ever run a single application at any point in time. But the moment you have more than one application running, the advantage of multiple cores starts to show itself. VMs do this by virtualizing HW, so you can have various OSes running, each thinking it's the only thing going.

The problem is that modern OSes and system implementations don't work for parallelism. We still can only have one keyboard/mouse and a single system "console". Modern OSes only support one "primary" display, with all other displays being secondary. This needs to change. You need to be able to have as many primary displays as you have monitors, and as many keyboards/mice/user interfaces as you have displays. Each should be a fully realized user console within the OS.

Try to imagine a single "system" with two or three displays in different rooms. Each display has some form of user interface attached to it and functions as a full user console. Allow each display to also have its own separate audio device. Each display can run its own fully accelerated DX11 game, browse the internet, download movies, and so on. It's the merging of a PC with the mainframe concept, except instead of a terminal or a virtual display (remote desktop), it's a real HW interface.
 
[citation][nom]CptTripps[/nom]Beings 12 core AMD do not exsist it is a moot point but... In most things the similar priced quad from intel beats hexa from AMD.[/citation]

Ummm, 12-core AMDs have been around for a while now.

http://products.amd.com/en-us/opteroncpuresult.aspx?f1=AMD+Opteron%E2%84%A2+6100+Series+Processor&f2=&f3=Yes&f4=&f5=512&f6=G34&f7=D1&f8=45nm+SOI&f9=&f10=6400&f11=12&
 
TOM - these NEW HP ADS won't go away! They are covering up your articles. You guys are screwing up with these ADS... Either horrible SCREAMING "MOOD LIGHTING" or "Lets cover your article" slide-outs.

What gives.
 
Yet I heard CryEngine 2 uses up to 8 cores; that means gaming is catching up, like it has caught up with 4 cores by now. No one can say they don't see better frame rates in multi-core-enabled games on multi-core CPUs.
So what would make anyone think that games won't benefit from 8-12 or even 16 cores in 3-4 years? We can see that games benefit a lot from multiple cores; the problem is that every added core requires rewriting the program from scratch. So I say, let as many cores come into the mainstream as fast as possible, and the game programmers will decide when they have enough cores.
 
[citation][nom]husker[/nom]I understand why people say this, but I'm sorry to say software on multiple cores doesn't really work that way. This is like saying you want to get to a destination faster, so instead of driving one car at 60 mph, you will take 2 cars that each go 60 mph. That just doesn't make sense: You cannot divide the process of taking one person between 2 cars and expect to get there any faster. There is no shortcut to fool single core software into running faster by using multiple cores just as there is no shortcut by taking one person in multiple cars.[/citation]


Your analogy is fundamentally flawed. A CPU's task isn't as simple as taking one person from A to B; it's like moving the entire population of Canada to Mexico and back again. If you have 2 buses (a dual-core CPU) but only 1 bus driver (software not optimised for multithreading), you could stick one bus on top of the other (make 2 cores appear as 1) and move twice as many people.

If AMD had a 2.8 GHz Athlon X6 with inverse hyper-threading, only 3 cores would be visible to the OS (just as twice the actual cores are visible with hyper-threading), but it would perform nearly twice as fast as a 2.8 GHz Athlon X3. It would be great for games like StarCraft 2.
 
It's too bad that AMD doesn't rebrand their server chips and market them as the Phenom II X12. I realize that would probably cost both arms and legs, but it would give AMD a contender against the i7-980X.
 
The market data he's reporting makes sense: business is doing most of the buying, and it's these types of businesses that are doing it and need as many multicore machines as they can get. I can see SETI maxing out 16 cores in every machine they could get their hands on very easily.
 
[citation][nom]Zingam[/nom]4 cores with 2 threads per core is more than enough for the general purpose desktop for the next 10 years...[/citation]

"640K ought to be enough for anybody." -Bill Gates
 
[citation][nom]L0tus[/nom]Why the thumb-downs for this comment. As far as performance, this is just FACT.The level of AMD (and ATI!) fanboyism on this site is beginning to concern me.[/citation]

You could say the same about Ferrari vs. Toyota, Mazda, and any other company that makes normal cars, if you were only going to talk about performance. But in the real world, that's not all that matters. Guess you've traded in your car for a racecar, have you? As far as performance goes, your car is out of the picture.
 
To those trying to espouse some sort of inverse hyper-threading: it's simply not possible with the way today's software is encoded in binary. Instructions are sent to a CPU by the OS; those instructions then reside inside the CPU's cache until the CPU can get around to executing them. Two CPUs can't pull from the same pool of instructions, because each and every thread in a CISC system assumes it's the only one executing on the CPU. Hardware support for this kind of task switching goes back to the 80286/80386 era. Binary instructions are nothing but math operations, compares, and data movement done on registers inside the CPU. This context is unique to each thread, and multiple threads cannot share contexts. Every time a thread is switched in or out of the CPU, its context must be switched with it; this is extra data I/O that needs to take place.

Now let's look at a multi-core CPU, something with 12 cores for example. That is 12 separate sets of registers, meaning the system can maintain 12 different simultaneous contexts without having to swap. Take this further and you get Intel's hyper-threading, which is nothing more than giving each core two separate sets of registers/stacks and sharing the execution engines between those register sets. If you check at any point in time, you have a few hundred separate processes running, each usually with a few threads. That is a thousand contexts that must be tracked and swapped in and out for execution. It's how modern-day multitasking happens: your CPU is actually processing hundreds of threads every second while constantly swapping each thread's context in and out, which reduces efficiency. So having more cores is never a ~bad~ thing; even if no single core is running full steam, it's still dividing up the task-switching workload.
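A deliberately crude toy model of that bookkeeping (the numbers model nothing real; the point is only that context switches appear once runnable threads outnumber hardware contexts, and grow with the amount of sharing):

```python
def context_switches(n_threads, n_cores, quanta_per_thread):
    # Toy round-robin model. If every thread has its own core (or
    # hardware context), nothing ever has to be swapped out.
    if n_threads <= n_cores:
        return 0
    # Otherwise, roughly every scheduling quantum after the initial
    # placement on each core requires swapping a register context.
    total_quanta = n_threads * quanta_per_thread
    return total_quanta - n_cores

print(context_switches(8, 12, 10))   # 0   -- more cores than threads
print(context_switches(100, 4, 10))  # 996 -- 100 threads on 4 cores
```

Real schedulers are far smarter than this, but the asymmetry is the same: extra cores (or hyper-threaded register sets) directly reduce how often contexts must be saved and restored.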

Of course, none of this seems apparent when you're running single measured benchmarks, because you have a small handful of threads occupying all the CPU's time. But in real-world scenarios where the user has 12~16 tabs open (each tab is a thread), an email client, network sharing, virus scanning, malware scanning, and a game running in the foreground, those extra cores definitely get a workout. I'd love to start seeing benchmarks with two, four, and eight games running at the same time, just to see how different architectures handle such complicated workloads.

Of course, the absolute king of multitasking-oriented CPUs is the Sun SPARC T2 or T3. A single T2 has eight cores, each with two integer execution engines, one floating-point execution engine, one memory management unit, and eight sets of registers. Each CPU can maintain the context of 64 different threads while executing 16 integer and 8 floating-point operations at once. The T5440 has four of these CPUs inside it, along with 256 GB of memory. And the T3 they're working on promises to bring this to a whole new level. Of course, the CPU only runs at 1.6~2.0 GHz.
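The hardware-thread arithmetic in that paragraph checks out; as a back-of-envelope (figures taken from the post itself, not independently verified here):

```python
# T2 figures as stated in the post above.
cores_per_cpu = 8
threads_per_core = 8
cpus_in_t5440 = 4

contexts_per_cpu = cores_per_cpu * threads_per_core
print(contexts_per_cpu)                  # 64 thread contexts per T2
print(contexts_per_cpu * cpus_in_t5440)  # 256 across the 4-socket box
```

So to the OS, a single four-socket box looks like 256 schedulable CPUs.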
 
I don't quite agree with their 1-core-per-VM idea. At our company we don't pay any real attention to the number of CPU cores; the CPUs are so powerful that we're running out of system memory much faster than we run out of CPU power. So our systems are essentially limited by DIMMs, not CPUs.
 