News Indiana University's AMD-Based Cray Supercomputer Postponed to Wait for Nvidia's Next-Gen GPUs

Wow, this might be the first anti-clickbait headline I've seen in years. No one cares about a supercomputer in Indiana. The only reason this made news is that the head of the project said they decided to build the system in stages: waiting for Ampere allowed them to use fewer GPUs and still end up with a faster system, because the new GPUs were up to 75% faster than the last generation. I'm sure this guy didn't pull that number from an Nvidia marketing slide, and he almost certainly had hands-on testing when deciding what to use.

Granted, these must be compute tasks, which don't translate directly to gaming, but with that much of an improvement, the rumored 50% gain in gaming looks very attainable.
 
Wow, this might be the first anti-clickbait headline I've seen in years. No one cares about a supercomputer in Indiana.

Granted, these must be compute tasks, which don't translate directly to gaming, but with that much of an improvement, the rumored 50% gain in gaming looks very attainable.
Ya know... not everything related to the word "computer" is gaming-related.
They are sometimes used for other things, even the parts of "computer things" that are ostensibly for "games".

A decade ago:
https://phys.org/news/2010-12-air-playstation-3s-supercomputer.html
 
Granted, these must be compute tasks, which don't translate directly to gaming, but with that much of an improvement, the rumored 50% gain in gaming looks very attainable.
Sure, but there's no telling whether they will have a 50% price hike to go along with it. : P

Going by product names alone, the RTX 20-series cards were around "50% faster" than their similarly-named GTX 10-series predecessors at launch. In reality, however, the card names were simply shifted up to the next-higher tier to disguise what were actually far smaller performance gains at any given price level.

It was similar with AMD's RX 5000 series cards, though at least there they added an extra digit to help differentiate the two naming conventions (but oddly stuck with the same first number). An RX 5700 may be twice as fast as an RX 570, but it also launched for double the price. Due to the move to the 7nm manufacturing process, though, the graphics chip of a 5700 or 5700 XT isn't actually much larger than the one used by an RX 570 or 580.

Comparing cards using the full processors, an RX 580 at 232mm² versus a 5700 XT at 251mm², AMD saw around an 85% performance uplift relative to the size of the graphics chip. The launch price of the card was around 75% higher than that of the 8GB RX 580 launched over two years prior, though, so not much was really gained in terms of price to performance.
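
For anyone who wants to sanity-check those numbers, here's a rough back-of-the-envelope sketch. It assumes the 5700 XT is about 2x an RX 580 (taken from this post, not a benchmark) and uses the commonly cited launch MSRPs of $229 for the 8GB RX 580 and $399 for the 5700 XT:

```python
# Rough sanity check of the figures above. The ~2x performance ratio is the
# assumption from this post, not a measured result; prices are the commonly
# cited launch MSRPs ($229 for the 8GB RX 580, $399 for the RX 5700 XT).

die_mm2   = {"RX 580": 232, "RX 5700 XT": 251}
price_usd = {"RX 580": 229, "RX 5700 XT": 399}
rel_perf  = {"RX 580": 1.0, "RX 5700 XT": 2.0}   # assumed ~2x

def uplift(metric):
    """Gain of the 5700 XT over the 580 in performance per unit of 'metric'."""
    old = rel_perf["RX 580"] / metric["RX 580"]
    new = rel_perf["RX 5700 XT"] / metric["RX 5700 XT"]
    return (new / old - 1) * 100

print(f"perf per mm^2:   +{uplift(die_mm2):.0f}%")    # ~ +85%
print(f"perf per dollar: +{uplift(price_usd):.0f}%")  # ~ +15%
```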

It sounds like Nvidia will be moving to a new process node soon as well, so there could very well be large performance gains for chips of a given size, though that doesn't necessarily mean there will be a substantial increase in performance at a given price level.
 
It was similar with AMD's RX 5000 series cards ... An RX 5700 may be twice as fast as an RX 570, but it also launched for double the price.
AMD more consistently resorts to price-cutting towards the end of a product's life cycle. Therefore, I think we haven't seen the full price/performance benefit of Navi that AMD is prepared to offer.

I'd wager their pricing was more dictated by 7 nm supply constraints than anything else.
 
AMD more consistently resorts to price-cutting towards the end of a product's life cycle. Therefore, I think we haven't seen the full price/performance benefit of Navi that AMD is prepared to offer.

I'd wager their pricing was more dictated by 7 nm supply constraints than anything else.
AMD would also like to sell as many 14nm GPUs as possible to meet its wafer supply agreements, so at present Polaris should 'look' cheaper than Navi.
 
AMD would also like to sell as many 14nm GPUs as possible to meet its wafer supply agreements, so at present Polaris should 'look' cheaper than Navi.
You think they're still manufacturing new Polaris GPUs? I don't know about that. We definitely know they got caught with an oversupply, after the crypto bust. I think that's the main reason for the aggressive price-cutting.

Anyway, I thought they had some escape clauses in the wafer-supply agreement, like for the situation where GloFo cancelled its 7 nm node.
 
Sure, but there's no telling whether they will have a 50% price hike to go along with it. : P

Going by product names alone, the RTX 20-series cards were around "50% faster" than their similarly-named GTX 10-series predecessors at launch. In reality, however, the card names were simply shifted up to the next-higher tier to disguise what were actually far smaller performance gains at any given price level.

It was similar with AMD's RX 5000 series cards, though at least there they added an extra digit to help differentiate the two naming conventions (but oddly stuck with the same first number). An RX 5700 may be twice as fast as an RX 570, but it also launched for double the price. Due to the move to the 7nm manufacturing process, though, the graphics chip of a 5700 or 5700 XT isn't actually much larger than the one used by an RX 570 or 580.

Comparing cards using the full processors, an RX 580 at 232mm² versus a 5700 XT at 251mm², AMD saw around an 85% performance uplift relative to the size of the graphics chip. The launch price of the card was around 75% higher than that of the 8GB RX 580 launched over two years prior, though, so not much was really gained in terms of price to performance.

It sounds like Nvidia will be moving to a new process node soon as well, so there could very well be large performance gains for chips of a given size, though that doesn't necessarily mean there will be a substantial increase in performance at a given price level.


For the 1080 and 1080 Ti, moving to Turing saw up to a 50% improvement. On average it was half that, about 25%, and a bit more at 4K, maybe 30-35%. The rumors indicate Ampere could be up to 75% faster, with an average around 50%, and that should be with much smaller dies. The 2080 Ti is a massive 754mm² die. It would be shocking to see Nvidia release a 7nm die of that size out of the gate in the consumer space. If they can release a 550-600mm² die that beats a 2080 Ti by 50% on average, with significantly improved ray tracing and lower power requirements, that would be very impressive.

I would not expect to see any increased MSRPs for Ampere. I wouldn't be surprised if the 3080 Ti saw a price drop to $999. Yes, I know that was the official MSRP of the 2080 Ti, but with the Founders Edition at $1,200, no card ever got close to $1,000. Nvidia pushed prices too high with Turing and hurt their bottom line, by their own admission; they aren't going to go any higher.

Nvidia sets the market. Recently, AMD seems content to price their cards slightly lower than Nvidia's when rasterized performance is equal and everything else is ignored. That's not going to work when Ampere hits the market, where all indications now are that ray tracing performance will be usable at least down to the mid-range and maybe lower. If the 3070 is 30-40+% faster than a 2070, with very good raytracing performance, at the $500 price point, no refresh of the 5700 XT, with no ray tracing at all, is going to get in the ballpark. The price is going to have to plummet.

Edit: why would adding the word "with" to my post require moderator approval?
 
AMD more consistently resorts to price-cutting towards the end of a product's life cycle. Therefore, I think we haven't seen the full price/performance benefit of Navi that AMD is prepared to offer.

I'd wager their pricing was more dictated by 7 nm supply constraints than anything else.
I suspect that too. They probably don't want to cut into their CPU production, hence the mediocre gains in performance relative to price for these cards at launch. I wouldn't be surprised to see them significantly discounted by the end of the year in response to Nvidia's next generation of hardware, though. They were mainly just brought up as an example of big performance gains in hardware not necessarily translating to big gains in value.

For the 1080 and 1080 Ti, moving to Turing saw up to a 50% improvement. On average it was half that, about 25%, and a bit more at 4K, maybe 30-35%.
Yep, though some of that is down to the graphics hardware being limited by CPU performance in many games, which is why 4K shows greater improvements. And some of the other parts saw somewhat larger relative performance gains compared to their similarly-named (but not similarly-priced) predecessors. The report that rumor comes from was also suggesting "up to 50% more performance", though, not "50% on average". So it sounds like a similar scenario, though perhaps without shifting model names around further to make it happen. Maybe performance will be around 50% better in some games, but the average will probably be lower than that.

And it's unknown how much or how little that 75% performance gain at certain compute tasks in a supercomputer might translate to gaming performance. We can't even be sure that the Tesla GPUs they will be using will have corresponding equivalents in the consumer market. The compute gains mentioned here may also come primarily from additional Tensor cores or something. With Tensor cores tied to RT performance in the 20-series cards, if they were doubled, for example, that might translate to a big improvement in raytracing performance. If the impact of enabling RT effects were cut in half, that could make them a lot more usable. It's very possible that "up to 50% more performance" is referring to performance in certain games where RT is enabled, in which case the RT performance gains might account for a good chunk of that uplift. If a game gets 100fps with RT disabled and only 50fps with RT enabled on a current card, but its successor can push 75fps with RT enabled, that's a 50% performance uplift right there, even if the gains in non-RT games are minimal. I do suspect RT effects will become the norm in the coming years, though, so that's likely to be very relevant down the line.
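
Just to make that last bit concrete, here's the same hypothetical spelled out in numbers; they are the made-up figures from this post (plus an assumed ~5% gain in the non-RT case), not benchmarks:

```python
# The hypothetical from above: a headline "up to 50% faster" figure can come
# almost entirely from the ray-traced case. All numbers are invented for
# illustration; the ~5% RT-off gain is an extra assumption of mine.

current_gen = {"RT off": 100, "RT on": 50}   # fps today
next_gen    = {"RT off": 105, "RT on": 75}   # hypothetical successor

for mode in current_gen:
    gain = (next_gen[mode] / current_gen[mode] - 1) * 100
    print(f"{mode}: {gain:.0f}% faster")   # RT off: 5% faster, RT on: 50% faster
```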

In any case, it sounds like Nvidia will be announcing their next GPU architecture soon, so more details should be available before long. I suspect a lot of marketing may cloud how the cards will actually perform until they eventually come out and can be tested though, perhaps half a year or more from now.
 
If the 3070 is 30-40+% faster than a 2070, with very good raytracing performance, at the $500 price point, no refresh of the 5700 XT, with no ray tracing at all, is going to get in the ballpark. The price is going to have to plummet.
I agree that the existing 5000-series pricing is likely to come down, though AMD's "Big Navi" cards will be coming as well, purportedly with up to double the graphics cores, which should fill in to cover those higher price points. And since it sounds like their RDNA2 architecture will likely have some form of raytracing support as well, they will probably have that $400+ range covered with competitive options. Nvidia will undoubtedly pull ahead on efficiency though, seeing as AMD's current 7nm cards are only just matching Nvidia's 12nm options. That may give Nvidia more room to offer additional performance at the extreme high-end.
 
it's unknown how much or how little that 75% performance gain at certain compute tasks in a supercomputer might translate to gaming performance. We can't even be sure that the Tesla GPUs they will be using will have corresponding equivalents in the consumer market.
Almost certainly not. If you don't count the Titan V as a consumer card (because with a $3k price tag, it's really an outlier), then you have to go all the way back, 6+ years, to the Kepler GK110, to find a chip they used for both HPC workloads and consumer graphics cards.

They have truly bifurcated their product lines, and I wouldn't be too surprised if, like AMD's rumored Arcturus, they even produced a compute/deep learning accelerator with no graphics blocks on die. So far, even though their Tesla cards have no display connectors, they've still used graphics-capable GPU dies.

With Tensor cores tied to RT performance in the 20-series cards, if they were doubled, for example, that might translate to a big improvement in raytracing performance.
Nope. Tensor cores aren't used for raytracing acceleration - they're too low precision. Nvidia's plan was to use them for DLSS, which it hoped would allow raytracing to render at lower resolutions without as big a quality hit as traditional scaling techniques.

We don't need to rehash that whole subject, but that's how they were meant to fit into the picture. They're too low-precision to accelerate any other part of the raytracing pipeline.
 
Nope. Tensor cores aren't used for raytracing acceleration - they're too low precision. Nvidia's plan was to use them for DLSS, which it hoped would allow raytracing to render at lower resolutions without as big a quality hit as traditional scaling techniques.
I can't say for sure how Nvidia's 20-series cards perform raytracing at a low level, since I don't think there's much detailed information about it, but I know they're all described as having 1 RT core for every 8 Tensor cores and 64 shader cores. Information on what the RT cores actually are seems pretty vague, but they clearly utilize some hardware not found in the 16-series cards, as enabling the effects on those results in disproportionately worse performance despite the same Turing architecture. It wouldn't surprise me if the Tensor cores were involved in accelerating some part of the raytracing process, perhaps something like noise reduction. Or maybe Nvidia just bundles them together in that ratio and the two are independent of one another. Either way, it appears RT performance could be increased substantially without increasing traditional rasterization performance much, which was the main point I was getting at.
 
Yep, though some of that is down to the graphics hardware being limited by CPU performance in many games, which is why 4K shows greater improvements. And some of the other parts saw somewhat larger relative performance gains compared to their similarly-named (but not similarly-priced) predecessors. The report that rumor comes from was also suggesting "up to 50% more performance", though, not "50% on average". So it sounds like a similar scenario, though perhaps without shifting model names around further to make it happen. Maybe performance will be around 50% better in some games, but the average will probably be lower than that.

The original quote from the Taipei Times did not say "up to 50%".

"A successor to units based on the Turing architecture, Nvidia’s next-generation GPU based on the Ampere architecture is to adopt 7-nanometer technology, which would lead to a 50 percent increase in graphics performance while halving power consumption, the note said."

The compute gains mentioned here may also come primarily from additional Tensor cores or something. With Tensor cores tied to RT performance in the 20-series cards, if they were doubled, for example, that might translate to a big improvement in raytracing performance. If the impact of enabling RT effects were cut in half, that could make them a lot more usable. It's very possible that "up to 50% more performance" is referring to performance in certain games where RT is enabled, in which case the RT performance gains might account for a good chunk of that uplift. If a game gets 100fps with RT disabled and only 50fps with RT enabled on a current card, but its successor can push 75fps with RT enabled, that's a 50% performance uplift right there, even if the gains in non-RT games are minimal. I do suspect RT effects will become the norm in the coming years, though, so that's likely to be very relevant down the line.

Tensor cores are AI units (matrix multiplication/addition) carried over from the Volta architecture, which was professional-only. With Turing being an evolution of Volta and also being used in professional cards, the Tensor cores remained and Nvidia tried to come up with gaming applications they could be used for. DLSS didn't really pan out as planned. The Tensor cores could be used for ray tracing denoising, but according to Anandtech, game developers have chosen not to go that route. Tensor cores are not currently used for ray tracing, or anything else useful on the gaming side.
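
For anyone curious what "AI units" means in practice: each Volta/Turing tensor core performs a small fused matrix multiply-accumulate (D = A x B + C) on FP16 inputs with FP32 accumulation, roughly a 4x4x4 tile per clock. Here's a minimal numpy sketch of just the math; it's obviously not how you'd actually program the hardware (that goes through CUDA libraries or WMMA intrinsics):

```python
# Sketch of the operation a single Volta/Turing tensor core performs:
# D = A @ B + C with FP16 inputs and FP32 accumulation, on a small 4x4 tile.
# This only emulates the arithmetic in numpy; it does not touch the GPU.
import numpy as np

A = np.random.rand(4, 4).astype(np.float16)   # FP16 input tile
B = np.random.rand(4, 4).astype(np.float16)   # FP16 input tile
C = np.zeros((4, 4), dtype=np.float32)        # FP32 accumulator

D = A.astype(np.float32) @ B.astype(np.float32) + C
print(D)
```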

RT cores are used for bounding volume hierarchy traversal (quad tree and spherical hierarchy processing), which calculates ray intersection points.
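
If it helps to picture what "bounding volume hierarchy traversal" means, here's a toy sketch of the idea: walk a tree of bounding boxes, only descending into boxes the ray actually hits, so most of the scene's triangles never need an intersection test. This is just the general concept the RT cores implement in fixed-function hardware; Nvidia hasn't publicly documented their actual implementation in this kind of detail:

```python
# Toy illustration of BVH traversal: skip whole subtrees whose bounding box
# the ray misses, and only hand triangles in the surviving leaves to the
# (expensive) ray-triangle intersection test.

class Node:
    def __init__(self, box_min, box_max, children=None, triangles=None):
        self.box_min, self.box_max = box_min, box_max   # axis-aligned box
        self.children = children or []                  # inner node: sub-boxes
        self.triangles = triangles or []                # leaf node: geometry

def ray_hits_box(origin, inv_dir, box_min, box_max):
    """Standard slab test: does the ray enter the box before it exits it?"""
    t_near, t_far = 0.0, float("inf")
    for axis in range(3):
        t1 = (box_min[axis] - origin[axis]) * inv_dir[axis]
        t2 = (box_max[axis] - origin[axis]) * inv_dir[axis]
        t_near = max(t_near, min(t1, t2))
        t_far = min(t_far, max(t1, t2))
    return t_near <= t_far

def traverse(node, origin, inv_dir, hits):
    if not ray_hits_box(origin, inv_dir, node.box_min, node.box_max):
        return                          # whole subtree culled
    hits.extend(node.triangles)         # leaf geometry goes to intersection tests
    for child in node.children:
        traverse(child, origin, inv_dir, hits)

# Example: a root box containing one leaf, and a ray travelling along +x.
leaf = Node((0, 0, 0), (1, 1, 1), triangles=["tri0", "tri1"])
root = Node((0, 0, 0), (4, 4, 4), children=[leaf])
hits = []
traverse(root, origin=(-1, 0.5, 0.5), inv_dir=(1.0, 1e9, 1e9), hits=hits)
print(hits)   # ['tri0', 'tri1']
```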

Either way, it appears RT performance could be increased substantially without increasing traditional rasterization performance much, which was the main point I was getting at.

Yes, you are right. From a gaming perspective, removing the tensor cores and replacing them with RT cores would result in significantly better ray tracing performance with no negative side effects, but the professional market requires the AI units and Nvidia will always prioritize them first. The new RT cores are reported to be much improved while the node shrink should allow for more cores.
 
The original quote from the Taipei Times did not say "up to 50%".

"A successor to units based on the Turing architecture, Nvidia’s next-generation GPU based on the Ampere architecture is to adopt 7-nanometer technology, which would lead to a 50 percent increase in graphics performance while halving power consumption, the note said."
It does look like the article worded it that way, and Tom's added "up to" when they reported on it. However, even that article was only paraphrasing the report from that consulting company, rather than quoting it directly, and something could have been lost in translation. It certainly wouldn't be the first time details about unannounced hardware get reported incorrectly, so one probably shouldn't count on 50% performance gains, at least not at a given price point. That would mean 2080 Ti-level performance at around $500 and 2080-level performance at around $350. It could happen, but I'm not sure I would expect Nvidia to cut prices so much.
 
...

It sounds like Nvidia will be moving to a new process node soon as well, so there could very well be large performance gains for chips of a given size, though that doesn't necessarily mean there will be a substantial increase in performance at a given price level.

Exactly ...

Basically, we'll get a new "2080ti", but with a "3", it'll cost $400 more, and perform "50% faster" (however you want to calculate that) than the old 2080ti. Everything else will get a rebranded naming convention and offer the same or worse performance per dollar. Yay, I'm so excited. [/s]
 
It does look like the article worded it that way, and Tom's added "up to" when they reported on it. However, even that article was only paraphrasing the report from that consulting company, rather than quoting it directly, and something could have been lost in translation. It certainly wouldn't be the first time details about unannounced hardware get reported incorrectly, so one probably shouldn't count on 50% performance gains, at least not at a given price point. That would mean 2080 Ti-level performance at around $500 and 2080-level performance at around $350. It could happen, but I'm not sure I would expect Nvidia to cut prices so much.


That article stated 50% faster at 50% less power consumption ... this could be just some wordsmithing.

One of the ways I read this is that "at a 50% lower power envelope, Ampere is 50% faster than a 2080ti". This might not be saying anything about the top-end performance of the product, but might just indicate that if you cut the 2080ti's max power consumption in half and tested Ampere and Turing at that power envelope, Ampere would be 50% faster. That would also be a fully valid interpretation of the headline while it remains "true".
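
To illustrate why the wording matters, here are the two readings with made-up numbers (a ~250W Turing card as the baseline; none of these figures come from Nvidia, and the 30% loss at half power is purely an assumption for the example):

```python
# Two readings of "50% more performance while halving power consumption",
# using invented numbers. Baseline: a Turing card normalized to 1.0 perf at 250W.

turing_perf, turing_watts = 1.00, 250.0

# Reading 1: 50% faster AND at half the power -> roughly a 3x perf/W jump.
r1_perf, r1_watts = 1.50, 125.0

# Reading 2: cap both cards at 125W; the new one is 50% faster there.
# Assume Turing keeps ~70% of its performance at half power (made up).
turing_at_125w = 0.70
r2_perf, r2_watts = turing_at_125w * 1.5, 125.0   # says nothing about the top end

for name, perf, watts in [("Turing @ 250W", turing_perf, turing_watts),
                          ("Reading 1", r1_perf, r1_watts),
                          ("Reading 2 @ 125W", r2_perf, r2_watts)]:
    print(f"{name}: {perf / watts * 1000:.1f} perf per kW (relative units)")
```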

Intel could say something very similar for 10nm Ice Lake versus their 14nm parts, and while true, the 14nm parts are still faster at the very top end. It just means that Ice Lake has better efficiency at low power, but we know it doesn't perform at the high end because it doesn't reach high clocks.

Basically, I don't read much into those headlines because of the way the words can be spun, and regularly are spun, to create eye-catching headlines for a potential investor's "benefit" -- not for an enthusiast's "reality".

I'm not saying Ampere won't be 50% faster ... I'm saying I wouldn't conclude that from that headline. I'll wait for the reviews.
 
It wouldn't surprise me if the Tensor cores were involved in accelerating some part of the raytracing process, perhaps something like noise reduction. Or maybe Nvidia just bundles them together in that ratio and the two are independent of one another.
Yes and yes. The tensor cores can be used for noise reduction with global illumination, but that's basically the most expensive of the RT effects and not utilized by a lot of games. However, I don't believe the ratio represents a magic formula; perhaps it was just motivated by an expectation that DLSS would be used as well.

it appears RT performance could be increased substantially without increasing traditional rasterization performance much, which was the main point I was getting at.
Agreed.
 
Many good points.

With Turing being an evolution of Volta and also being used in professional cards, the Tensor cores remained and Nvidia tried to come up with gaming applications they could be used for.
Yup. Either as a point of product differentiation or just to get the volumes up, so that the corresponding Quadro and Tesla cards with Turing GPUs would be that much cheaper. Probably both. If DLSS had worked a bit better, that decision could've been a real win.

DLSS didn't really pan out as planned.
And a shame about that. I wonder if they'll have another go at it. Perhaps they just needed a better loss function, but it would also help if it didn't have to be custom-trained for each game. In principle, I don't see why it can't look at least as good as AMD's new scaling feature.

The Tensor cores could be used for ray tracing denoising, but according to Anandtech, game developers have chosen not to go that route.
Only because global illumination is perhaps the most expensive and subtle of the ray tracing effects. If ray tracing performance were better & support were more ubiquitous, then GI would certainly see more use.

Tensor cores are not currently used for ray tracing, or anything else useful on the gaming side.
They could be used for game AI, although game developers would still be restricted to deep learning models fast enough to run on hardware without tensor cores.

RT cores are used for bounding volume hierarchy traversal (quad tree and spherical hierarchy processing), which calculates ray intersection points.
This.

From a gaming perspective, removing the tensor cores and replacing them with RT cores would result in significantly better ray tracing performance with no negative side effects,
Until people decide to flip on Global Illumination. Then, you'll be wishing you had them. But, I could see an approach where the lower-end chips leave out the tensor cores, and you only get them on 3080 and 3080 Ti. That still satisfies Nvidia's desire to have them available for the professional and cloud markets, with the volumes provided by a consumer product.

the professional market requires the AI units and Nvidia will always prioritize them first.
That's not true, or else their high end GPUs would have lots of fp64. As I mentioned above, we haven't seen that since 2013/2014, with the Kepler series.

The new RT cores are reported to be much improved while the node shrink should allow for more cores.
That's exactly what I'd hope and expect. Once the lackluster performance numbers started dripping out, I quickly decided ray tracing was one of those features it was going to be worth waiting for.
 
If DLSS had worked a bit better, that decision could've been a real win.
Coming back again to the possibility of increased Tensor core counts: if that were to happen, maybe DLSS could work better. If the card could perform double the inference calculations on each frame, for example, perhaps the end result might edge out alternative algorithms processed on standard hardware.

Or maybe Nvidia will just get rid of DLSS, at least the original method utilizing Tensor cores, and apply the name to a process using more traditional upscaling and sharpening techniques. The sharpening routines now employed by both AMD and Nvidia, coupled with existing AA and upscaling methods, seem to make DLSS more or less redundant in its current form. Again, though, that could change if the next generation of cards had more resources to put toward it.

Until people decide to flip on Global Illumination. Then, you'll be wishing you had them.
According to one of the lead developers at CD Projekt Red, Cyberpunk 2077 will supposedly be utilizing a more advanced implementation of global illumination that the studio hasn't shown off yet. If I had to guess, they might be targeting 30-series graphics cards for that, and it wouldn't surprise me if they had early access to that hardware, or at least inside information on how it should perform, seeing as they partnered with Nvidia to bring raytracing to the game. Maybe that even had something to do with why the game was recently delayed until September, seeing as that's around the time I would expect Nvidia's new high-end cards to be launching. And of course, there might be some high-end AMD cards with RT-acceleration by then as well, and I'm sure they will also want to show off raytracing on the new consoles.
 
Maybe that even had something to do with why the game was recently delayed until September, seeing as that's around the time I would expect Nvidia's new high-end cards to be launching. And of course, there might be some high-end AMD cards with RT-acceleration by then as well
Given how much $$$ is at stake with these AAA games, I seriously doubt they'd delay it just for that. Not when they can simply release a patch for it.
 
Given how much $$$ is at stake with these AAA games, I seriously doubt they'd delay it just for that. Not when they can simply release a patch for it.
Well, it is only a five-month delay, which isn't really all that long considering the game has been in development for at least seven years already. And I'm not saying it was delayed just for raytracing, but perhaps the pending release of new hardware in general played a role in pushing back the release, on both the console and PC side of things. A launch later in the year allows the game to still be new when the next consoles come out, and given the timing, it seems likely to be a launch title for at least one of them. It seems like a perfect game for showing what the new raytracing hardware is capable of, which could potentially make it a system-seller if it looks and runs substantially better on the new hardware compared to existing consoles. And of course, it also gives them time to polish the game a bit more in general.