HPC Council Offer Free Time on Nvidia Tesla GPUs

Status
Not open for further replies.

d_kuhn

Distinguished
Mar 26, 2002
704
0
18,990
This is a great idea. I purchased a Supermicro GPU server with four 2090s last year to do precisely this evaluation. Being able to run evaluation code online (assuming their implementation doesn't add overhead) would give you a great way to try out one of these GPU supercomputers without needing to invest the several tens of thousands of dollars required to buy a box.

The folks at Supermicro and other OEMs who've struggled through Nvidia validation for the 2090 (which has no onboard cooling, so it requires a very dedicated system design) may not be particularly happy about it, though.
 

wiyosaya

Distinguished
Apr 12, 2006
915
1
18,990
[citation][nom]D_Kuhn[/nom]This is a great idea. I purchased a Supermicro GPU server with four 2090s last year to do precisely this evaluation. Being able to run evaluation code online (assuming their implementation doesn't add overhead) would give you a great way to try out one of these GPU supercomputers without needing to invest the several tens of thousands of dollars required to buy a box. The folks at Supermicro and other OEMs who've struggled through Nvidia validation for the 2090 (which has no onboard cooling, so it requires a very dedicated system design) may not be particularly happy about it, though.[/citation]
Buzz at BOINC stats is that some consumer cards, i.e., in the 400 and 500 series, outperform some of the Teslas. Teslas sound like Nvidia's marketing baby that may or may not perform any better than "consumer" GPUs.

My bet is that this is another Nvidia marketing thrust. IMHO, testing code like this might just as easily be accomplished on a consumer GPU. Or, with this service, one could conceivably test on a consumer GPU in house and on a Tesla-based HPC system at the same time to determine whether there is an advantage to a Tesla setup.
 

trandoanhung1991

Distinguished
Nov 7, 2009
83
0
18,630
[citation][nom]wiyosaya[/nom]Buzz at BOINC stats is that some consumer cards, i.e., in the 400 and 500 series, outperform some of the Teslas. Teslas sound like NVidia's marketing baby that may or may not perform any better than "consumer" GPUs. My bet is that this is another NVidia marketing thrust. IMHO, testing code like this might just as easily be accomplished on a consumer GPU - or, with this service, one could conceivably test on a consumer GPU in house and a Tesla based HPC at the same time to determine whether there is an advantage to a Tesla setup.[/citation]

Teslas have a much higher double-precision (DP) rate than regular cards. In addition, they're much more stable, and they're heavily tested and certified for continuous running (24/7/365).

Otherwise, they're just consumer cards, really. They both use the same chips, after all.
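
To put rough numbers on the DP point, here is a minimal sketch of the theoretical peak arithmetic. The core counts, shader clocks, and DP ratios below are spec-sheet figures as I recall them for a Tesla M2090 and a GeForce GTX 580, so treat them as assumptions rather than measurements. The function name `peak_gflops` is just made up for illustration, and real code will land well below these peaks on either card.

```python
def peak_gflops(cores, shader_clock_ghz, dp_ratio):
    """Return (single, double) theoretical peak GFLOPS.

    Assumes one fused multiply-add (2 FLOPs) per core per clock,
    and that DP throughput is a fixed fraction of SP throughput.
    """
    sp = cores * 2 * shader_clock_ghz
    return sp, sp * dp_ratio


# Assumed spec-sheet values (cores, shader clock in GHz, DP:SP ratio).
# These are illustrative inputs, not benchmark results.
cards = {
    "Tesla M2090": (512, 1.300, 1 / 2),
    "GeForce GTX 580": (512, 1.544, 1 / 8),
}

for name, spec in cards.items():
    sp, dp = peak_gflops(*spec)
    print(f"{name}: {sp:.0f} GFLOPS SP, {dp:.0f} GFLOPS DP")
```

Under those assumptions the consumer card comes out ahead on single precision because of its higher clock, while the Tesla is several times faster on double precision. That would fit both observations in this thread, depending on which precision a given workload actually uses.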
 

d_kuhn

Distinguished
Mar 26, 2002
704
0
18,990
Also keep in mind that the 2090 was designed to run in a dense multi-card stack within a server's airflow management. I think if you could manage airflow on a stack of GTX 590s you'd get great bang for your buck out of them, but they have a quarter of the memory per GPU and 20% lower memory bandwidth, so there would be some challenges involved in getting the most out of them.
 

wiyosaya

Distinguished
Apr 12, 2006
915
1
18,990
FWIW, a 2070 runs BOINC WUs slower than a GTX 460. If you look at the poster's world BOINC position, he is the number 1 overall BOINC contributor in the world. Unfortunately, his computer stats do not give the details on which machines run what GPGPU. Since he has experience running various hardware in a virtual HPC environment, I tend to think that his opinion holds some weight.

Personally, I think the argument here is similar to that of using a Quadro in a CAD environment as opposed to a consumer card. While I am sure there are scenarios where the pro cards excel as they do in large CAD models, I would not be surprised if those HPC scenarios are presently a small number of all possible scenarios as they are in CAD.

I come from a "pro" imaging software background. The software the company I worked for delivered to all users was the same; various users paid a premium to enable certain features, which is gravy to the manufacturer. As stated above, the silicon is the same; whether the "special treatment" given to the pro market is worth the extra cost is, ultimately, up to the end-user and their requirements.

From the 2090 spec sheet, it has a passive cooler. Many GTX 590 boards offer active cooling, which lessens the need for airflow management for a 590 solution.

I am not sure how many of either one can run in a single box. I am also assuming that one would not run the 590s in SLI, since, AFAIK, SLI gives no advantage in an HPC scenario.

In my opinion, pro cards are not worth the extra expense. However, if you have the budget and your use case scenarios are such that Teslas give a proven advantage that justifies the extra expense, then Teslas would, of course, be the better choice.
 

d_kuhn

Distinguished
Mar 26, 2002
704
0
18,990
The passive cooling is actually a desired feature for server integration... active cooling on boards disrupts the server cooling architecture and at best is something that designers need to compensate for (a problem to deal with that places restrictions on their ability to structure airflow as they desire), at worst it can do more harm than good in a server implementation.

There's no doubt that a good chunk of the cost is the 'corporate premium', but a lot of that is actually not 'gravy' but rather the increased cost required to design and test these systems to enterprise standards (and the fact that they need to recoup those costs on a relatively small sales volume). Back in the CAD world, getting certified for the various packages costs money, so if a company wants certified hardware, they're going to have to pay for it. In the GPU acceleration world, boards like the 2090 are designed for high-end HPC use, which means (in a well-designed computing center) spending as much time as possible near 100% utilization. It's a different world from a consumer card dealing with a couple of hours of gaming a day (and some of them aren't particularly good at even that). Even graphically intense gaming generally sees pretty wide demand variation.
 

PreferLinux

Distinguished
Dec 7, 2010
1,023
0
19,460
[citation][nom]wiyosaya[/nom]FWIW - 2070 run BOINC WUs slower than a GTX 460. If you look at the poster's world BOINC position, he is the number 1 overall BOINC contributor in the world. Unfortunately, his computer stats do not give the details on which machines run what GPGPU. Since he has experience running various hardware in a virtual HPC environment, I tend to think that his opinion holds some weight. Personally, I think the argument here is similar to that of using a Quadro in a CAD environment as opposed to a consumer card. While I am sure there are scenarios where the pro cards excel as they do in large CAD models, I would not be surprised if those HPC scenarios are presently a small number of all possible scenarios as they are in CAD. Coming from a "pro" imaging software background, the software the company I worked for delivered to all users is the same - various users pay a premium to enable certain features - which is gravy to the manufacturer. As stated above, the silicon is the same; whether the "special treatment" given to the pro market is worth the extra cost is, ultimately, up to the end-user and their requirements. From the 2090 spec sheet, it has a passive cooler. Many GTX 590 boards offer active cooling which lessens the need for airflow management for a 590 solution. I am not sure how many of either one can run in a single box - and I am assuming that one would not run the 590s in SLI for a HPC scenario as SLI gives no advantage in an HPC scenario, AFAIK, for doing so. In my opinion, pro cards are not worth the extra expense. However, if you have the budget and your use case scenarios are such that Teslas give a proven advantage that justifies the extra expense, then Teslas would, of course, be the better choice.[/citation]
You're forgetting the already-mentioned double-precision performance – on consumer cards, it is far lower than on professional cards, and it is far more important than single-precision.

Also consider that the extra cost of professional cards is probably minimal compared to other expenses.
 