
Intel to Share Next-Gen Poulson CPU Details

  • Thread starter: Guest
Status
Not open for further replies.
50MB of cache.

32KB x 8 L1 cache = 256KB
256KB x 8 L2 cache = 2MB
So the remaining cache is about 48MB of L3 cache?? That's freaking huge! These definitely wouldn't be for desktop use. These are server chips.
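
The arithmetic above can be checked with a short sketch. It assumes, as the post does, 8 cores with 32KB of L1 and 256KB of L2 each out of a 50MB total; those per-core figures are the poster's guess, not confirmed Poulson specifications, and the function name is made up for illustration.

```python
# Per-core cache sizes are the post's assumptions, not confirmed Poulson specs.
KB = 1024
MB = 1024 * KB


def remaining_cache(total_bytes, cores, l1_per_core, l2_per_core):
    """Return (total L1, total L2, whatever is left of the total) in bytes."""
    l1_total = cores * l1_per_core
    l2_total = cores * l2_per_core
    return l1_total, l2_total, total_bytes - l1_total - l2_total


l1, l2, l3 = remaining_cache(50 * MB, cores=8,
                             l1_per_core=32 * KB, l2_per_core=256 * KB)
print(f"{l1 // KB} KB L1, {l2 // MB} MB L2, {l3 / MB} MB left over")
```

Under those assumptions the remainder comes out slightly under 48MB, because the 256KB of L1 also has to be subtracted.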
 
[citation][nom]BulkZerker[/nom]What they won't tell you is this processor will set you back $1500 O.O[/citation]

...which is a bargain for server applications 😉
 
[citation][nom]BulkZerker[/nom]What they won't tell you is this processor will set you back $1500 O.O[/citation]

I'm sure it will be several times $1500.
 
[citation][nom]timothyburgher@gmailcom[/nom]Is this a server processor or the next step after Sandy Bridge?[/citation]

No, it is not the next step after Sandy Bridge; the Sandy Bridge E series is. Those will be very fast. This CPU is different.
 
Itanium processors cost much more than $1500. And Itanium L1/L2 caches are much larger than 256KB/2MB. What, no one here knows or remembers what Itanium is?
 
LOL, Itanium is for nothing but special datacenters. I wouldn't even put this in the same class as servers; this thing runs supercomputer nodes and things of that nature. Because it is not an x86 processor, it requires a special operating system built for the Itanium instruction set; I think they made a Server 2003 Itanium edition at one point. So yeah, the old ones would set you back 3500 dollars, and I don't know much about these new ones. When I left the eval lab at Intel back in 2003, we had tons of these things.
 
[citation][nom]gfdsgfds[/nom]Itanium processors cost much more than $1500. And Itanium L1/L2 caches are much larger than 256KB/2MB. What, no one here knows or remembers what Itanium is?[/citation]
Very big, very expensive and unique chips. Thought x86 was bloated? Itanium uses an EPIC instruction set (explicitly parallel instruction computing) to attempt to achieve a much higher instructions per clock ratio.

I have absolutely no use for it, but damn, I want one.
 
For those of you commenting on the price:
Itanium is not a CPU any one person would buy. This is a mainframe replacement,
meant to run an entire data warehouse.
So stop talking about the pricing as if Intel is trying to take advantage.
You think the price is high? Spec out a high-end Sun mainframe or a Cray, because this is what Itanium is meant for.
This is not for Crysis.
 
[citation][nom]jdamon113[/nom]For those of you commenting on the price: Itanium is not a CPU any one person would buy. This is a mainframe replacement, meant to run an entire data warehouse. So stop talking about the pricing as if Intel is trying to take advantage. You think the price is high? Spec out a high-end Sun mainframe or a Cray, because this is what Itanium is meant for. This is not for Crysis.[/citation]
Crysis on the cloud, perhaps!
That's what this thing is designed for, and I'm pretty sure Zuckerberg would buy one himself. Future evil dictators have to have a way to manage their covert ops and manipulation of information somehow! All pun intended.
Bad example, bad citing; just trying to keep my mind off the fact that George Soros makes Dick Cheney look like a two-bit player at a poker game.
 
SSDs are breaking the Gb/s mark now.
If you go up to a 256MB cache, you could probably get away with a 20Gb/s link to a super-fast SSD, and the RAM disappears.
 
This article completely misses the most important thing about this release. The current Itanium is a 6-issue processor (two bundles of three instructions each); Poulson will be a 12-issue processor, which could mean even greater IPC than is currently possible. I'm curious whether they're going to dynamically allocate bundles based on workload: if you have one thread, use all the resources for it, but if you have several, give each thread a bundle, or something in between.

It will be interesting to see if Intel can finally get performance superiority over the horrible x86 instruction set processors. They've always been behind in manufacturing technology, and then they did weird things like make the L1 cache accessible in only one clock cycle (severely limiting clock speed), but with manufacturing parity, and finally two clock cycle access to L1 cache, this processor had better be able to beat processors crippled by an obsolete, difficult and inefficient instruction set.

Otherwise, they should have just gone with RISC, instead of VLIW.
 
Itanium was very bad as a server CPU. It had its own special *Intel*-branded instruction set that you had to compile your OS / drivers / applications for. It used microcode emulation to run x86 instructions so that it appeared to be x86 compatible, but it was horribly inefficient at running them. Because of this, the Itanium sucked for anything that needed x86 instructions (90+% of the commodity server market), which left the "special use" systems that run an OS / apps programmed specifically for a special CPU architecture to run specialized software.

Itanium's competition isn't AMD / Pentium; it's things like Sun SPARC and IBM Power. And this is an arena where Intel gets spanked pretty badly. On one hand you have the Sun SPARC line, and the Sun T3 CPU, which is described as:
"A 16-core SPARC SoC processor enables up to 512 threads in a 4-way glueless system to maximize throughput. The 6MB L2 cache of 461GB/s and the 308-pin SerDes I/O of 2.4Tb/s support the required bandwidth. Six clock and four voltage domains, as well as power management and circuit techniques, optimize performance, power, variability and yield trade-offs across the 377mm2 die"

In reality this means a single T3 CPU can process 32 integer (2 integer units per core), 16 floating point (1 FPU per core) and 16 memory (1 MMU per core) operations per cycle. Each core has eight sets of register stacks, allowing it to process eight unique threads. Each CPU has four DDR3 memory channels to its own dedicated memory, two 10Gb Ethernet ports and its own set of I/O circuitry. Each core has its own built-in crypto circuitry for accelerating encryption and hashing. A single server would have four of these CPUs inside it, along with 128GB to 1TB of memory depending on configuration. The only downside is that each CPU is clocked at 1.65GHz, so single-thread performance is rather low compared to its IBM Power counterpart. These SPARC CPUs are designed to be used in databases and massively parallel servers: when you need to service thousands of users while processing hundreds of transactions per second, you use a SPARC.
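
To make the per-cycle and thread figures above easy to check, here is a minimal sketch. The class and field names are made up for illustration, and the unit counts are simply the ones quoted in this post, not values verified against vendor documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChipModel:
    """Hypothetical bookkeeping model; the numbers come from the post above."""
    cores: int
    threads_per_core: int
    int_units_per_core: int
    fp_units_per_core: int
    mem_units_per_core: int

    def threads(self, sockets=1):
        # Hardware threads across one or more sockets.
        return self.cores * self.threads_per_core * sockets

    def peak_ops_per_cycle(self):
        # Upper bound only: assumes every unit is busy on every cycle.
        return {
            "integer": self.cores * self.int_units_per_core,
            "floating_point": self.cores * self.fp_units_per_core,
            "memory": self.cores * self.mem_units_per_core,
        }


t3 = ChipModel(cores=16, threads_per_core=8, int_units_per_core=2,
               fp_units_per_core=1, mem_units_per_core=1)
print(t3.peak_ops_per_cycle())
print(t3.threads(sockets=4), "threads in a 4-socket system")
```

The 4-socket thread count lines up with the "up to 512 threads in a 4-way glueless system" figure in the quoted description.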

http://en.wikipedia.org/wiki/SPARC_T3

Their main competitor is IBM and its Power CPU, namely the POWER7.

POWER7 has these specifications:[5][6]

* 45 nm SOI process, 567 mm2
* 1.2 billion transistors
* 3.0 – 4.25 GHz clock speed
* max 4 chips per quad-chip module
  o 4, 6 or 8 cores per chip
    + 4 SMT threads per core (available in AIX 6.1 TL05 (releases in April 2010) and above)
    + 12 execution units per core:
      # 2 fixed-point units
      # 2 load/store units
      # 4 double-precision floating-point units
      # 1 vector unit supporting VSX
      # 1 decimal floating-point unit
      # 1 branch unit
      # 1 condition register unit
  o 32+32 kB L1 instruction and data cache (per core)[7]
  o 256 kB L2 Cache (per core)
  o 4 MB L3 cache per core with maximum up to 32MB supported. The cache is implemented in eDRAM, which does not require as many transistors per cell as a standard SRAM[4] so it allows for a larger cache while using the same area as SRAM.

What this means in reality is that you get four threads per core, with a maximum of four simultaneous instructions executed per core. Note that IBM Power / AIX instructions differ from SPARC instructions, so the two are very hard to compare. Power focuses more on getting a single task done as fast as possible, whereas SPARC focuses on getting as many tasks done at once as possible. Power CPUs are clocked at 3 to 3.8GHz (they can shut down cores to boost speed to 4.25GHz) and are many times bigger than a SPARC CPU, which often leads to unfair CPU vs CPU comparisons. Better comparisons have been done with system vs system competitions, and they each win at different things (T3 at web serving / database work, Power at financial calculations / simulations).
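
The POWER7 figures in the spec list can be tallied the same way. This is only a sketch over the numbers quoted above; the dictionary and function names are hypothetical.

```python
# Execution units per core, copied from the spec list quoted in this post.
power7_units = {
    "fixed-point": 2,
    "load/store": 2,
    "double-precision floating-point": 4,
    "vector (VSX)": 1,
    "decimal floating-point": 1,
    "branch": 1,
    "condition register": 1,
}
units_per_core = sum(power7_units.values())


def hardware_threads(cores_per_chip, smt_threads_per_core=4, chips=1):
    """Hardware threads for a given core count and number of chips."""
    return cores_per_chip * smt_threads_per_core * chips


print(units_per_core, "execution units per core")
for cores in (4, 6, 8):
    print(cores, "cores:", hardware_threads(cores), "threads per chip,",
          hardware_threads(cores, chips=4), "per quad-chip module")
```

The unit tally matches the "12 execution units per core" line, and an 8-core chip with 4-way SMT gives 32 hardware threads.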

These are the beasts that Itanium must compete against, not home gaming rigs and the low to medium server markets. Everyone rejected Itanium originally because of the horrible x86 performance; the commodity market doesn't want to recompile / redevelop its entire software base for a single CPU architecture.
 
OK, some pricing info. I'm very familiar with purchasing Sun systems, so I'll list the default quote off their site for a single system.

https://shop.sun.com/store/product/578414b2-d884-11de-9869-080020a9ed93
Config #3,
$177,057.00 Each,
4x SUN Sparc T3 CPU,
512 GB (64 x 8 GB DIMMs) Memory,
Internal Storage: 600 GB (2 x 300 GB 10000 rpm 2.5-Inch SAS Disks),
Max Internal Storage: 2.4TB (8 x 300GB 10000 rpm 2.5-Inch SAS Disks),
Ethernet: 4 x 1 Gb (10/100/1000 Mb/s) Integrated Ethernet Ports. Option Slot for 8 x 10 GbE XAUI Ports, 16 PCI Express module slots
Power: 4 PSUs @ 12.6 A @ 200 V AC
Space: 5RU, 8 systems per industry standard rack.

You need to purchase the 10GbE adapter separately; the circuitry already exists inside the CPU, but you need the physical connector to be either copper or fiber, your choice. And while the system itself is 177 grand a pop, the specialized software it will most likely be running will cost twice that.

I can't get a quote on an IBM Power 755 without contacting a sales agent, but I figure it will be in a similar range to the SPARC above. Bonus points to IBM for being very Linux friendly.
 
[citation][nom]timothyburgher@gmailcom[/nom]Is this a server processor or the next step after Sandy Bridge?[/citation]

It's a server processor for a specific area, mainly 64-bit. It uses IA-64, which was developed by Intel and released back in 2001 as their successor to x86. It didn't fare well, though, because it could only emulate x86 code, so it was slower than current x86 CPUs, but it is much faster in 64-bit. Instead we are stuck with x86-64 because of AMD, and while I understand it, it also stuck us with the inferior x86.

Still the masses are hard to switch.



Probably a bit more than that. It's only used in specific areas, and current high-end 4c/8t Itanium-based CPUs cost $3838.00 each in 1ku quantities. But as I said before, in the area it is in, it is a beast and hard to beat.

Still, it's nothing we will ever see, but the technology behind it is impressive.
 
Actually AMD64 (what Intel calls EM64T) was a brilliant idea from AMD. Instead of trying to push a new 64-bit instruction set, they just extended the current x86 instruction set. Don't lament it as some poor idea; there have been "64-bit" instruction sets out for years, and the commodity market didn't pick up on them for a very good reason. UltraSPARC is a good example (sorry, I'm mostly a Sun guy): it's a very old 64-bit RISC that has amazing performance. There was even PPC, which Apple used for years as a commodity platform. This isn't even an "OMFG Evil Microsoft" fault, because MS made a version of NT for the DEC Alpha, an extremely high performance 64-bit RISC CPU. It didn't sell well, DEC eventually got bought out, and MS dropped support for it and focused purely on its x86 software platform. When Intel released Itanium, MS got behind them and built an NT 5.0 (Windows 2000) kernel for it; it supported 64-bit and everything. The application performance on it was horrendous unless the application manufacturer recoded their application specifically for Intel's Itanium. Very few of them did this and Itanium languished; some consider it dead.

AMD's push to create AMD64 was good because it allowed the existing industry to slowly adopt / grow rather than try to force everyone over all at once while creating a gatekeeper scenario (Intel was very stingy with Itanium licenses to HW manufacturers). You ~need~ competition at all levels to keep people honest and for the industry as a whole to progress. AMD licensed their 64-bit technology to Intel; would Intel have done the same if they had created the 64-bit code? (No, they didn't.) How long have AMD64 CPUs been available? Application developers are just now including 64-bit binaries inside their programs; how long until their code is 64-bit exclusive? It takes application makers years, if not a decade or more, to migrate architectures.

So please, do not blame AMD for the current state of the commodity market. If anything you should be praising them; they are responsible for launching us out of the NT x86 world and have brought mainstream 64-bit computing to the home user. Intel took their shot with Itanium and lost.

If you don't like x86 or AMD64, then use SPARCv9 or PPC. I personally run a SunBlade 2000 with dual UltraSparc IIIi @ 1.2 GHz, 8GB memory (Sun), a 146GB FC-AL 10K RPM disk + a 76GB FC-AL 10K RPM disk, and an XVR-1200 graphics adapter. The OS is Solaris 10 with OpenGL support and a bunch of my own stuff running. Next to this I have my AMD64 machine that I use for gaming.
 
I use Itanium/OpenVMS daily and it gets owned by Xeons/Linux in about any algorithm. The Itanium compiler is an EPIC failure and no amount of HW tinkering is going to change that.
 
Here is another problem with VLIW and the Itanium architecture. The binary encoding of instructions is static, and the HW is incapable of executing them out of order. The original Itanium had six execution units; binaries compiled for that CPU can execute up to six instructions in one pass. But if the user later upgrades to a CPU with eight or twelve instruction units, the binary could still only execute on six, and the other units would be permanently stalled. You would have to go back and recompile ~everything~ to support the 12 instruction unit model. And if in the future they introduced a 16 or 24 unit model, then you'd have to do all that recompiling all over again.
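
The underutilization argument above can be sketched with a toy model. This is a simplified fixed-width VLIW machine as the post describes it; it does not attempt to model real IA-64 bundles, templates or stop bits, and the function names are made up for illustration.

```python
def pack_words(ops, compiled_width):
    """Statically pack independent ops into words of at most compiled_width ops."""
    return [ops[i:i + compiled_width] for i in range(0, len(ops), compiled_width)]


def run(words, hw_units):
    """Issue one word per cycle; return (cycles, fraction of unit slots used)."""
    if not words:
        return 0, 0.0
    if any(len(word) > hw_units for word in words):
        raise ValueError("a word is wider than the hardware")
    cycles = len(words)
    used_slots = sum(len(word) for word in words)
    return cycles, used_slots / (cycles * hw_units)


ops = [f"op{i}" for i in range(24)]          # 24 independent operations
binary_6wide = pack_words(ops, 6)            # "compiled" for 6 units

print(run(binary_6wide, hw_units=6))             # (4, 1.0)
print(run(binary_6wide, hw_units=12))            # (4, 0.5): same cycles, half the units idle
print(run(pack_words(ops, 12), hw_units=12))     # (2, 1.0): only after "recompiling"
```

In this toy model the 6-wide binary takes the same number of cycles on 12-unit hardware, and only a repacked binary gets the benefit.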

Now let's reverse it. Let's say MS goes out and compiles W2K8 for the *newer* Itanium with 12 instruction units. All the application makers go out and do this too; your entire software base goes out and does this. Guess what happens should you try to run those binaries on the older 6 instruction unit hardware? They will not execute properly, if at all. They are statically encoded to send up to 12 instructions to a 6 instruction system. It will work just fine until the code tries to send a 7th simultaneous instruction, and suddenly you will get an exception which will cause a non-maskable interrupt (NMI); most likely this will cause the system to crash. Avoiding this would require the compiler to produce binary code for multiple versions of the CPU and then have the binary check and determine whether it should execute 6, 12, 16 or 24 instruction code.
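
The "compile several variants and pick one at run time" idea in the last sentence might look roughly like this. It is a hypothetical sketch: the variant table, the names and the way the hardware width is passed in are all assumptions for illustration, not how any real Itanium toolchain works.

```python
def select_variant(variants, hw_units):
    """Pick the widest compiled variant that still fits the hardware."""
    usable = [width for width in variants if width <= hw_units]
    if not usable:
        raise RuntimeError(f"no variant fits hardware with {hw_units} units")
    return variants[max(usable)]


# One code blob per target width, keyed by the width it was compiled for.
variants = {
    6: "code_6wide",
    12: "code_12wide",
    16: "code_16wide",
    24: "code_24wide",
}

for hw_units in (6, 8, 12, 24):
    print(hw_units, "units ->", select_variant(variants, hw_units))
```

An 8-unit part would fall back to the 6-wide code here, which is exactly the wasted-hardware case the post complains about, and the binary grows with every width it has to support.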

This is all because a VLIW-based architecture is perfect when the software is being written directly for a very specific, known architecture, like the stuff used in DSPs or GPUs. It's absolutely a bad idea in a general purpose CPU, which can be upgraded or switched out, comes in multiple flavors, and may be expected to execute any random amount of code at any random time.
 