IBM Making Nvidia Fermi-filled BladeCenter Server

Status
Not open for further replies.
I think you guys talking about mainstream and i7 CPUs etc. are wasting your breath. This is not for you. This is for HPC in the datacenter. HPC as in high-performance computing (supercomputing-type stuff). Fermi-based blade systems can provide a petaflop of processing in only 10 racks of floor space. Other HPC solutions require much more.


 
[citation][nom]BulkZerker[/nom]Because then you have to pay to cool the room also. That's why the Seamicro server based on hundreds of Atom processors is honestly a pretty damn good idea. This server idea may be a new attempt at some brute force OMGNUMBERCRUNCHING! super computer but TBH I would shoot down the funding for it if the power consumption of that box suddenly doubles our power bill. And since this IS made with the industrial market in mind, power consumption is one of the top 3 priorities in making the initial purchase: price, performance, operating cost. And this server idea seems to be a little heavy on operating cost unless they are using cherry-picked GPUs out of the runs.[/citation]

I was responding to some comments that were not directly related to the article (read: Fermi at large). I would tend to agree with you from a business perspective, but some new server room designs reuse the heated air instead of cooling it in a closed circuit. Those blades look awesome for increasing your density.
 
[citation][nom]TommySch[/nom]I don't see why you are all whining about it being hot... Heat = power. I'm all for efficiency but it's not a reason not to push the TDP as far as you can. I wouldn't mind having a 500++ Watt card or CPU if it's still efficient. My liquid-cooled Q6600 produces more heat than a Fermi card; it's running at 3.6GHz @ 1.45V @ 50°C under load with 3x 120mm radiators flowing at around 400 L/hr.[/citation]

That's semi-fail; my Q6600 was doing 3.4GHz on a semi-cheesy air cooler. G0 FTW. 8 x 425MHz.
 
I don't understand why a company like IBM would choose Fermi to put in their blades. Why not just wait a bit for OpenCL and a better architecture?
 
What's with all the AMD homers? Isn't the next 7000-series card a power hog, going by the preview? But AMD fanboys will give it a pass, and say it is the fastest.
 
Personally, I'll remain a bit wary over the concept of GPGPU in an industrial/scientific environment. Sure, both nVidia and AMD *claim* massive theoretical FLOPS figures for their cards, but truth be told, in a real-world computing application, they fall much further below their theoretical maximums than more conventional CPUs do; that's because while a GPU was designed for raw theoretical power, a CPU has loads more branch predictors, registers, cache, and other things to make sure every clock cycle counts no matter what it's running.

[citation][nom]rhodesar[/nom]This is interesting, however I would like to know more about the performance expectations in relation to non-GPU based predecessors.[/citation]
We actually have a real-world example; earlier this year China took nVidia up on their offer, and built a massive supercomputer primarily out of nVidia Tesla cards.

The machine, the "Nebulae," consists of 9,280 Intel Xeon X5650 CPUs, coupled with 4,640 Tesla C2050 cards. Figures for them are:
- Xeon X5650: 2.66 GHz, 6 cores, 64 GFLOPS theoretical total (10.6 GFLOPS per core), 95 watts per CPU.
- Tesla C2050: 1.15 GHz, 448 SPs, 515.2 GFLOPS theoretical total (1.15 GFLOPS per SP), 238 watts per card.
- Total theoretical performance: 593.92 TFLOPS (9,280 x 64) for CPUs, 2,390.5 TFLOPS (4,640 x 515.2) for GPUs (~2.98 PetaFLOPS total)
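
For anyone who wants to check the totals, here is a minimal Python sketch of the same arithmetic. The per-part figures are the ones quoted above; the function and variable names are just illustrative.

```python
# Theoretical peak of the machine, using the per-part figures quoted in the post.
XEON_GFLOPS = 64.0     # per Xeon X5650, theoretical
TESLA_GFLOPS = 515.2   # per Tesla C2050, theoretical
N_XEON = 9280
N_TESLA = 4640


def peak_tflops(count, gflops_each):
    """Aggregate theoretical peak in TFLOPS for `count` identical parts."""
    return count * gflops_each / 1000.0


cpu_peak = peak_tflops(N_XEON, XEON_GFLOPS)
gpu_peak = peak_tflops(N_TESLA, TESLA_GFLOPS)
total_peak = cpu_peak + gpu_peak

print(f"CPUs:  {cpu_peak:.2f} TFLOPS")
print(f"GPUs:  {gpu_peak:.1f} TFLOPS")
print(f"Total: {total_peak / 1000.0:.2f} PFLOPS")
```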

Now, in real-world benchmarking (using LINPACK, the standard benchmark), the computer only gets 1,271 TFLOPS, or 42.6% of its theoretical peak. Other benchmarks of pure-Xeon supercomputers (with the same Westmere core) show that they manage to hit 80-90% of their theoretical performance in real-world testing, so 475-535 TFLOPS came from the Xeons, leaving 736-796 TFLOPS for the Tesla cards; that's 30.8-33.3% efficiency.
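
The split between CPUs and GPUs can be reproduced with a short sketch. It rests on the same assumptions as the paragraph above: the CPU and GPU contributions simply add up, and the Xeons reach the 80-90% efficiency seen on pure-Xeon machines. The names are illustrative.

```python
# Estimate how much of the LINPACK result is attributable to the Tesla cards.
CPU_PEAK = 593.92   # TFLOPS, theoretical, all Xeons
GPU_PEAK = 2390.5   # TFLOPS, theoretical, all Teslas
LINPACK = 1271.0    # TFLOPS, measured


def gpu_share(cpu_efficiency):
    """Return (GPU TFLOPS, GPU efficiency) if the CPUs run at `cpu_efficiency` of peak.

    Assumes the measured result is the plain sum of CPU and GPU contributions.
    """
    gpu_tflops = LINPACK - cpu_efficiency * CPU_PEAK
    return gpu_tflops, gpu_tflops / GPU_PEAK


overall = LINPACK / (CPU_PEAK + GPU_PEAK)
gpu_low = gpu_share(0.90)    # CPUs do well, so less is left for the GPUs
gpu_high = gpu_share(0.80)   # CPUs do worse, so more is left for the GPUs

print(f"Overall efficiency: {overall:.1%}")
print(f"GPU share: {gpu_low[0]:.0f}-{gpu_high[0]:.0f} TFLOPS "
      f"({gpu_low[1]:.1%}-{gpu_high[1]:.1%} of GPU peak)")
```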

Of course, to put this into perspective, we also must consider performance-per-watt; while the THEORETICAL number of a Tesla may look attractive, offering 2.16 GFLOPS/watt compared to only 0.67 for the Xeon, real-world results show numbers that are 0.72 and 0.6 for the Tesla and Xeon, respectively, a far smaller margin. Similarly, there isn't much savings in hardware cost; the Xeon costs $1,025, or 62.4 MFLOPS/dollar, vs. the Tesla, which costs $2,500, or 68.7 MFLOPS/dollar. Given that using the Tesla involves wrangling with CUDA to write your app, a 20% increase in performance-per-watt and 10% increase in FLOPS-per-dollar seems a dubious trade-off.
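
A rough sketch of those ratios in Python, for anyone who wants to plug in other prices or efficiencies. It assumes the best-case efficiencies from the estimate above (33.3% for the Tesla, 90% for the Xeon); the names are illustrative.

```python
# Performance-per-watt and performance-per-dollar from the quoted figures.
def gflops_per_watt(peak_gflops, efficiency, watts):
    return peak_gflops * efficiency / watts


def mflops_per_dollar(peak_gflops, efficiency, price):
    return peak_gflops * 1000.0 * efficiency / price


TESLA = {"peak": 515.2, "watts": 238.0, "price": 2500.0, "eff": 0.333}
XEON = {"peak": 64.0, "watts": 95.0, "price": 1025.0, "eff": 0.90}

tesla_w_theory = gflops_per_watt(TESLA["peak"], 1.0, TESLA["watts"])
xeon_w_theory = gflops_per_watt(XEON["peak"], 1.0, XEON["watts"])
tesla_w_real = gflops_per_watt(TESLA["peak"], TESLA["eff"], TESLA["watts"])
xeon_w_real = gflops_per_watt(XEON["peak"], XEON["eff"], XEON["watts"])

tesla_d_real = mflops_per_dollar(TESLA["peak"], TESLA["eff"], TESLA["price"])
xeon_d_theory = mflops_per_dollar(XEON["peak"], 1.0, XEON["price"])
xeon_d_real = mflops_per_dollar(XEON["peak"], XEON["eff"], XEON["price"])

print(f"GFLOPS/watt, theoretical: Tesla {tesla_w_theory:.2f}, Xeon {xeon_w_theory:.2f}")
print(f"GFLOPS/watt, real-world:  Tesla {tesla_w_real:.2f}, Xeon {xeon_w_real:.2f}")
print(f"MFLOPS/dollar: Tesla {tesla_d_real:.1f} (real), "
      f"Xeon {xeon_d_theory:.1f} (theoretical), {xeon_d_real:.1f} (real)")
```

One thing the sketch makes visible: the 62.4 MFLOPS/dollar quoted for the Xeon appears to correspond to its theoretical peak, while the Tesla figure is a real-world one. At 90% efficiency the Xeon comes out nearer 56 MFLOPS/dollar, so the per-dollar gap may be somewhat wider than 10%.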

Where this REALLY starts to fall apart is once you start comparing this to alternative designs. A comparable CPU from AMD, such as a 2.6 GHz 6-core Opteron, can be had for $231, or 216.1 MFLOPS/dollar, though it appears the performance/watt ratio sinks to 0.43 GFLOPS/watt (though if the "ACP" figure is more accurate for AMD, then it's 0.67 GFLOPS/watt).

Worse yet is if you compare this to the PowerXCell, which boasts a theoretical 102.4 GFLOPS at 3.2 GHz, and at 92 W per chip and 80% efficiency, gets you up to 0.89 GFLOPS/watt for the 65nm version; the 45nm version would likely push that well over 1 GFLOPS/watt. Assuming a CPU price of $1,000, that's 81.9 MFLOPS-per-dollar. Note that this doesn't apply to the PS3, which uses a version not designed for DP-FP; hence, it only gets a measly 0.10 GFLOPS/watt and 25.5 MFLOPS/dollar, even with the cheapest Slim model; the PS3 doesn't make as good a supercomputer node as it does a console.
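
The Opteron and PowerXCell numbers can be checked the same way. In this sketch the PowerXCell inputs are the ones quoted above. The Opteron inputs are assumptions, since its peak and power figures are not spelled out: a 62.4 GFLOPS peak (2.6 GHz x 6 cores x 4 FLOPs per cycle), 80% efficiency, a 115 W TDP and a 75 W ACP. Those values happen to reproduce the quoted ratios.

```python
# Real-world GFLOPS/watt and MFLOPS/dollar for the alternative CPUs.
def metrics(peak_gflops, efficiency, watts, price):
    """Return (GFLOPS per watt, MFLOPS per dollar) at the given efficiency."""
    real_gflops = peak_gflops * efficiency
    return real_gflops / watts, real_gflops * 1000.0 / price


# Opteron: peak, efficiency, TDP and ACP are ASSUMED values (not stated in the post).
OPTERON_PEAK = 2.6 * 6 * 4   # GHz x cores x FLOPs per cycle = 62.4 GFLOPS
opteron_tdp = metrics(OPTERON_PEAK, 0.80, watts=115.0, price=231.0)
opteron_acp = metrics(OPTERON_PEAK, 0.80, watts=75.0, price=231.0)

# PowerXCell: figures as quoted in the post ($1,000 is the post's own price assumption).
cell = metrics(102.4, 0.80, watts=92.0, price=1000.0)

print(f"Opteron (TDP): {opteron_tdp[0]:.2f} GFLOPS/W, {opteron_tdp[1]:.1f} MFLOPS/$")
print(f"Opteron (ACP): {opteron_acp[0]:.2f} GFLOPS/W")
print(f"PowerXCell:    {cell[0]:.2f} GFLOPS/W, {cell[1]:.1f} MFLOPS/$")
```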

Potentially there's some promise in this area, but for now, for true supercomputers, I think Tesla and FireStream don't belong. They could be great in the future, especially if they implemented true native double-precision math, but for now I see it largely as a marketing gimmick.
 
[citation][nom]alyoshka[/nom]hmmmm Massive heat servers..... They ought to start think of Kitchens with the Fermi....[/citation]
You are such a loser.
 
[citation][nom]pocketdrummer[/nom]...Besides, what about the other 80% of the time you're using your computer? Why does it need to be so loud running windows' GUI?[/citation]

It doesn't have to be. If it is, it is because either your computer case has bad airflow, you need to dust the video card, or the card is not underclocking and undervolting itself when not in 3D mode. It should be able to cut its idle power drastically when not rendering a 3D environment. If you do some research, you could probably find a program that would allow you to control the fan speed yourself.
 
[citation][nom]Zingam[/nom]They are doing that for money - not for fun. They obviously have clients for that right now. Tomorrow when a better tech is available they'll use it then.[/citation]
Well, part of this "for the money" depends on actually having a use for it. And as I demonstrated, using Tesla cards for servers/supercomputing is a dubious trade-off: compared to over-priced Xeon CPUs, Tesla offers a 20% gain in performance-per-watt and a 10% gain in performance-per-dollar, in exchange for having to deal with CUDA, a proprietary standard that (in all likelihood) will not have any use OUTSIDE of nVidia cards, and likely will be abandoned by nVidia in favor of OpenCL a few years down the road. (Contrast that with OpenCL or just plain C, which will be usable and migratable forever.)

And likewise, if you compare this to using Opterons or Cells instead... Tesla makes zero sense for servers/supercomputers; it's both more expensive AND less efficient. (And no, I'm not being a fanboy here; the same things I say for Tesla apply to FireStream, too.)

GPU-based computing cards don't have a place in servers or supercomputers, at least right now. They are best for workstations, as they allow for high performance using basic PC architecture, which will make them CHEAPER than, say, investing in a 4-CPU motherboard and the appropriate multi-way CPUs (which cost far more than normal ones). And likewise, the worse performance-per-watt isn't as much of an issue there, as workstations are like PCs, in that they only run at load at most a couple of hours a day, as opposed to 24/7 like a supercomputer.
 