Examining AMD Radeon Pro SSG: How NAND Changes The GPU Game


Xajel

Distinguished
Oct 22, 2006
167
8
18,685
The idea is impressive, although I don't see any use for it in our consumer applications.

I think the interesting part is having the PCIe lanes in the GPU itself (in addition to the x16 lanes that connect it to the CPU/chipset), though I don't know how much better this will be compared to using a bridge chip, especially with the complexity of putting a PCIe host in the GPU.
 

danlw

Distinguished
Feb 22, 2009
137
22
18,695
Genome sequencing? Geological mapping? Real-time heart imagery?

This reminds me of one of those faux 1960s motivational posters...

VIDEO GAMES! (Why waste good technology on science and medicine?)
 

BoredSysAdmin

Distinguished
Apr 28, 2008
33
0
18,540
Two whole paragraphs (in the bottleneck section) are dedicated to how RAID 0 is risky and how businesses use RAID 5/6. While that's not untrue, it's also largely irrelevant to the point: the storage on the GPU is meant to be temporary by definition, so in this context RAID 0 is a fine choice.
Not to mention the Samsung Pro drives are among the most reliable SSDs on the market today, and the chance of a sudden death is practically non-existent.

Also, the bottleneck point should be made more clearly: the very limited DMI implementation on consumer CPUs is the real problem, and even using a PCIe SSD doesn't solve the latency issue.
 

none12345

Distinguished
Apr 27, 2013
431
2
18,785
I also think the rant on RAID 0 is misplaced. With SSDs it's largely irrelevant.

Let's compare, for a minute, an SSD with 4 flash chips and 256GB of storage against an SSD with 8 flash chips and 512GB. There is very little difference between having two of those 256GB drives in one enclosure in RAID 0 and having the single 8-chip drive in that enclosure. Sure, there are two controller chips in that case, so there's one more controller that could die, but in practice there is no meaningful increase in the risk of failure.

Getting more bandwidth by using two controller chips is definitely the right choice in this instance. You could argue for more than two if you needed more bandwidth; if one controller chip could cut it, you wouldn't need two.
 

PaulAlcorn

Managing Editor: News and Emerging Technology
Editor
Feb 24, 2015
858
315
19,360


It is true that the data held in the GPU memory space is transitory in nature, but there are caveats to how it is used. The memory can be used as a storage volume, so data that needs to persist will live on the RAID 0 array, which means a RAID 0 failure does put the user in a position to lose data. Also, if one were working with the NAND as memory and the RAID failed, all of that data, and thus work, would be lost. Not to mention the disadvantages of downtime, RMA, etc. The time expended to right the situation, even if the data itself was temporary, equates directly to lost money.

RAID 0 is inherently risky, and it doesn't matter what you use as a storage medium. There is no storage device with a zero chance of failure; ALL storage devices fail, given enough time.

Yes, as noted, SSDs are arguably safer, but SSD failures usually involve the surface-mount components on the PCB rather than the controller or NAND, so with two drives you are exposed to more components, and thus to more failures. There are also firmware issues (freezing, locking) that are doubled when you have two separate SSDs. The M.2 sockets are another possible point of failure, and that is doubled as well. As with any design, when you multiply the number of components, you multiply the possible points of failure.
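
To put rough numbers on that last point, here is a quick back-of-the-envelope sketch in Python. The 2% annual failure rate per drive is an assumed figure purely for illustration, not a measured one:

```python
# Rough sketch: probability that a striped (RAID 0) volume loses data in a
# year, assuming independent drive failures. The 2% annual failure rate per
# drive is a made-up illustrative number, not a measured figure.

def raid0_failure_probability(per_drive_afr: float, num_drives: int) -> float:
    """A RAID 0 volume fails if *any* member fails: 1 - P(all survive)."""
    return 1.0 - (1.0 - per_drive_afr) ** num_drives

afr = 0.02  # assumed 2% annual failure rate per SSD
for n in (1, 2, 4):
    print(f"{n} drive(s) striped: {raid0_failure_probability(afr, n):.2%} annual risk")

# 1 drive(s) striped: 2.00% annual risk
# 2 drive(s) striped: 3.96% annual risk
# 4 drive(s) striped: 7.76% annual risk
```

At low failure rates, striping two drives roughly doubles the annual risk. It stays small in absolute terms, but it is never zero.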
 

none12345

Distinguished
Apr 27, 2013
431
2
18,785
I think the RAID argument is completely moot.

There is absolutely nothing stopping someone from making an M.2 SSD that has two drives in RAID 1 instead of two in RAID 0, or three in RAID 5, or a hundred in RAID 6, or whatever other configuration of controllers and flash packages they want. The card never has to see any of that, nor care about how the data is stored behind the connector. Let the customer decide what they want to plug into that M.2 socket; it's a trivial matter.
 

PaulAlcorn

Managing Editor: News and Emerging Technology
Editor
Feb 24, 2015
858
315
19,360


I agree, there are other options, as mentioned in the article.

 

fixxxer113

Distinguished
Aug 26, 2011
297
2
18,815


I think that's the end goal. Storage is slowly moving towards one big pool that's fast enough to be used for anything (data storage, RAM, VRAM, etc.). It would also simplify the architecture of a PC, since you wouldn't need all kinds of different slots, and it might make graphics cards smaller if their VRAM were no longer on the PCB but part of one big chunk of memory/storage connected to the motherboard.

If anything, the evolution of hardware so far has shown that it's only a matter of time before we have ridiculous amounts of extremely fast storage at a low cost.
 

bit_user

Polypheme
Ambassador
However, there are newer and more refined machine-learning-like algorithms that can detect trends and predictively place data (prefetch) in the cache based upon usage history. Machine learning algorithms would likely be the best approach, as the storage is adjacent to the GPU, which is one of the most commonly used platforms for the task.
This is going too far. Memory prefetchers are very simple compared with anything anyone would call machine learning. They also can't (reasonably) be implemented in software that would, itself, be using the very memory it's trying to prefetch.

In practice, I doubt prefetchers do much more than detect strided memory access. Once you get beyond that, access patterns become pseudo-random very quickly. For instance, when traversing heap-allocated trees and lists, there's no practical way to predict the cache misses.
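
As a rough illustration of how simple that kind of logic is, here's a toy stride detector in Python. It's purely a sketch of the idea; real prefetchers do this in hardware, per load instruction, at cache-line granularity:

```python
# Toy stride prefetcher: watch the addresses from one access stream and, if
# the deltas stay constant, predict (prefetch) the next few addresses.
# Purely illustrative of the concept, not how any real prefetcher is built.

class StridePrefetcher:
    def __init__(self, depth: int = 2):
        self.last_addr = None
        self.last_stride = None
        self.depth = depth  # how many addresses ahead to prefetch

    def access(self, addr: int) -> list[int]:
        prefetches = []
        if self.last_addr is not None:
            stride = addr - self.last_addr
            if stride != 0 and stride == self.last_stride:
                # Two identical strides in a row: predict the next addresses.
                prefetches = [addr + stride * i for i in range(1, self.depth + 1)]
            self.last_stride = stride
        self.last_addr = addr
        return prefetches

pf = StridePrefetcher()
for a in (100, 164, 228, 292):      # sequential array walk, stride 64
    print(a, "->", pf.access(a))    # predictions start on the third access
```

Feed it a pointer chase instead (effectively random addresses) and it never sees a stable stride, so nothing useful gets prefetched, which is exactly the problem with trees and lists.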

The main way that GPUs deal with access latency is SMT: when a thread stalls on a cache miss, the scheduler just executes another thread until the cache line is fetched. I'm sure they also have write buffers.
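
A crude model of that latency-hiding effect, with arbitrary numbers chosen only to show how issue-slot utilization scales with the number of resident threads:

```python
# Crude model of latency hiding: each thread issues one instruction, then
# stalls MISS_LATENCY cycles waiting for a load to return. Each cycle the
# scheduler issues from the first thread that isn't waiting. The numbers
# are arbitrary; the point is that utilization rises with thread count.

MISS_LATENCY = 100   # cycles until a missed cache line comes back
CYCLES = 10_000      # length of the simulation

def utilization(num_threads: int) -> float:
    ready_at = [0] * num_threads  # cycle at which each thread can issue again
    issued = 0
    for cycle in range(CYCLES):
        for t in range(num_threads):
            if ready_at[t] <= cycle:
                issued += 1                              # one instruction issued
                ready_at[t] = cycle + 1 + MISS_LATENCY   # then stall on a load
                break                                    # one issue slot per cycle
    return issued / CYCLES

for n in (1, 8, 32, 128):
    print(f"{n:4d} threads -> {utilization(n):.0%} issue-slot utilization")
```

With one thread the pipeline sits idle almost the whole time; with enough threads resident, every cycle finds something ready to run, which is why GPUs keep so many threads in flight.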
 

bit_user

Polypheme
Ambassador
You're still going to have RAM in your PC. It'll just be stacked inside the CPU package. The performance benefits to be gained by that are too great to pass up.

And if games are pausing between levels even when your data is on a fast SSD, then the game is doing more than simply reading the data. Also, since the CPU needs a significant amount of that data, the pauses won't magically disappear just by moving the storage onto the GPU.

Agreed. There are good reasons, besides cost, that AMD launched this for professional applications. The main one is that PC hardware and games are already pretty well adapted to each other, so there's not much potential here for anything concerning typical users. Remember, the original reason graphics cards even have programmable shaders is so that they can generate petabytes' worth of texture and geometry on the fly.

Besides, by the time this could trickle down to consumers, you'll have a fast APU with HBM2 on the inside, and NV DIMMs in the slots. It's still not quite the same thing, but much more relevant and interesting for day-to-day uses.

That said, non-volatile memory will probably be an enduring feature of professional GPUs, but I still don't foresee it appearing in consumer hardware.
 

dosmastr

Distinguished
Jan 28, 2013
191
7
18,695
I'm scratching my head at the line: "Even with copious amounts of on-board storage, there will always be workloads that require more, so deciding what to cache in the NAND-based storage on board the card becomes a challenge."

Most workstations don't have that much storage; in most cases you could "cache" the entire contents of the workstation on the card.
Beyond that, wouldn't you need more than one card's worth of GPUs to do your processing anyway (which would bring more onboard storage along with it)?
 

dosmastr

Distinguished
Jan 28, 2013
191
7
18,695
none12345 said:
I think the RAID argument is completely moot. There is absolutely nothing stopping someone from making an M.2 SSD that has two drives in RAID 1 instead of two in RAID 0, or three in RAID 5, or a hundred in RAID 6, or whatever other configuration of controllers and flash packages they want. The card never has to see any of that, nor care about how the data is stored behind the connector. Let the customer decide what they want to plug into that M.2 socket; it's a trivial matter.


Doesn't any good RAID controller these days READ from a RAID 1 as if it was a 0 anyway?
 