News SK Hynix Plans to Stack HBM4 Directly on Logic Processors

tbh anything that pushes advancement of submersion cooling (and hopefully trickles down to consumers) is all good in my book :|

it's literally the one tech that's been mentioned for a decade but hasn't really advanced to market.
 
I see this as the logical follow-on from prior HBM PIM (Processing In Memory) efforts.

SK Hynix announced a version, also. I'm too lazy to look it up.

AMD also seems headed in this direction, as it's in line with their Zettascale roadmap.

"Within 10 years, the 'rules of the game' for semiconductors may change, and the distinction between memory and logic semiconductors may become insignificant," an industry insider told Joongang.co.kr.
I'm not too sure about that. HBM isn't scaling to higher capacities like DIMMs can. I think there will always be a need for some plain, old DRAM that lacks any integrated computation. It could be a slower tier, connected via CXL, that sits in between the HBM tier and the SSD tier.
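
To put very rough numbers on that tiering idea, here's a quick sketch. All of these figures are assumed ballpark values for illustration, not vendor specs:

```python
# Rough, assumed ballpark figures to illustrate the tiering idea
# (illustrative only, not vendor specs).
tiers = {
    "HBM (on-package)":  {"capacity_gb": 100,   "bandwidth_gbs": 3000, "latency_ns": 200},
    "CXL-attached DRAM": {"capacity_gb": 1000,  "bandwidth_gbs": 60,   "latency_ns": 400},
    "NVMe SSD":          {"capacity_gb": 10000, "bandwidth_gbs": 10,   "latency_ns": 100_000},
}

for name, t in tiers.items():
    print(f"{name:18} ~{t['capacity_gb']:>6} GB  ~{t['bandwidth_gbs']:>5} GB/s  ~{t['latency_ns']:>7} ns")
```

The orders-of-magnitude gaps are the point: there's plenty of room for a plain-DRAM tier to sit between the other two.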
 
Hmm. We're already seeing a rise in power draw, enough to bring water-cooled servers into consideration. Perhaps this sort of thing will accelerate that? I'd hope there are some ongoing efforts to deal with this, rather than just continuing to increase the power draw of processors. I think it was around the Pentium MMX time that someone I worked with would respond to customers that their PC would use a similar amount of power to an incandescent light bulb.
 
tbh anything that pushes advancement of submersion cooling (and hopefully trickles down to consumers) is all good in my book :|
Yuck. Submersion cooling is messy and expensive. I'm not interested.

it's literally the one tech that's been mentioned for a decade but hasn't really advanced to market.
I think it's continuing to make inroads into HPC and some datacenters. That's where it should stay, IMO.
 
Hmm. We're already seeing a rise in power draw, enough to bring water-cooled servers into consideration. Perhaps this sort of thing will accelerate that?
That's not germane to the issues discussed in the article. They're concerned about how to get heat from the die to the exterior of the package (or, I guess the top of the stack, if you're doing direct-die cooling).

I think it was around the Pentium MMX time that someone I worked with would respond to customers that their PC would use a similar amount of power to an incandescent light bulb.
According to this, no Pentium MMX version used more than 17 W.

I remember seeing the inside of a Compaq desktop, with a Pentium MMX, and the thing still had a passive heatsink! It was big and aluminum, but definitely had no integrated fan.

Nothing else inside that PC should've used very much power. Graphics cards of the day had tiny fans, if any, and HDDs rarely burn more than 10 W. That would make even 60 W (which is pretty low for a conventional screw-in incandescent light bulb) an overestimate.
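
Just to put rough numbers on it, a minimal sketch with assumed ballpark figures (not measurements):

```python
# Ballpark sanity check of a Pentium MMX box's power draw (figures assumed).
pentium_mmx_w = 17   # upper bound cited above for any Pentium MMX SKU
hdd_w         = 10   # generous figure for a desktop HDD of that era
gpu_w         = 10   # 2D / early-3D cards with tiny fans, if any
board_misc_w  = 15   # chipset, RAM, PSU losses, etc. (assumed)

total_w = pentium_mmx_w + hdd_w + gpu_w + board_misc_w
print(f"Estimated system draw: ~{total_w} W, vs. a 60 W bulb")
```

Even with generous figures, the box alone lands comfortably under 60 W.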
 
This seems untenable for the consumer space, due to the clock speeds desktop silicon typically runs at to make up for smaller chips. No matter what, you'd probably need backside power delivery to make this work, period, as HBM stacks aren't exactly lacking in z-height. It would take an absolute revolution in manufacturing to make memory and logic on the same process, so this is extremely unlikely. It makes sense that they're talking about HBM4, since that's 2026+.
 
On the cooling front, I'm reminded of a recent news article about a breakthrough in integrating some kind of cooling directly into the dies, allowing heat to be transferred out to, say, a copper side plate that a cooler could contact alongside the die itself. That would allow for a wider cooling surface even as dies shrink.

I wonder if that will be implemented alongside this.
 
Good luck with cooling the bloody thing and not frying the memory.

The easiest approach that I can see is putting the HBM on the opposite side of the PCB, with a direct connection through the PCB to the HBM; then you cool the ASIC on one side and the memory on the other.

Similar to the old slot-based CPUs, but this time there's a point to having the slot.
 
I see this as the logical follow-on from prior HBM PIM (Processing In Memory) efforts.

SK Hynix announced a version, also. I'm too lazy to look it up.

AMD also seems headed in this direction, as it's in line with their Zettascale roadmap.


I'm not too sure about that. HBM isn't scaling to higher capacities like DIMMs can. I think there will always be a need for some plain, old DRAM that lacks any integrated computation. It could be a slower tier, connected via CXL, that sits in between the HBM tier and the SSD tier.
Exactly. I'm really excited to see where this tech goes. I could see memory, storage, and compute all in the same package. 3D XPoint was, in theory, trying to merge RAM and storage, right?
 
That's not germane to the issues discussed in the article. They're concerned about how to get heat from the die to the exterior of the package (or, I guess the top of the stack, if you're doing direct-die cooling).


According to this, no Pentium MMX version used more than 17 W.

I remember seeing the inside of a Compaq desktop, with a Pentium MMX, and the thing still had a passive heatsink! It was big and aluminum, but definitely had no integrated fan.

Nothing else inside that PC should've used very much power. Graphics cards of the day had tiny fans, if any, and HDDs rarely burn more than 10 W. That would make even 60 W (which is pretty low for a conventional screw-in incandescent light bulb) an overestimate.
60W incandescent bulbs were pretty common indoors. Lamps were often limited to using 60W bulbs. Besides, if you counted the power draw of a CRT monitor, which I'll bet the quoted person was, that could easily reach around 100W.
 
They are taking the Apple approach, where if your memory dies, you have to replace the entire processor.
Even today, if your GPU has bad memory, most people would just replace it - especially if it's still under warranty. Yes, you can potentially have a competent tech locate the bad chip and replace it, but I think people rarely take this route.

However, they are probably targeting this at high-end datacenter GPUs, at least initially. Those have used HBM ever since the P100 launched in 2016, and I'd guess very few people in the world have the equipment and skills to replace it. So, something like an H100 GPU is either toast when an HBM stack goes bad, or maybe there's a way to disable the failed address range (or an entire stack, if the problem exists at that level).
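
As a purely hypothetical illustration of the "disable the whole stack" fallback (assuming, roughly, an 80 GB part built from five 16 GB HBM stacks):

```python
# Hypothetical: map out one failed HBM stack and see what's left (figures assumed).
active_stacks = 5
gb_per_stack  = 16
failed_stacks = 1

usable_gb = (active_stacks - failed_stacks) * gb_per_stack
print(f"{usable_gb} GB usable, out of {active_stacks * gb_per_stack} GB")
# Bandwidth would drop roughly proportionally, since each stack has its own channels.
```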
 
The easiest approach that I can see is putting the HBM on the opposite side of the PCB, with a direct connection through the PCB to the HBM; then you cool the ASIC on one side and the memory on the other.
And how is that better than routing through the interposer, like we do today??? GPUs like the H100 have a 6144-bit memory interface. It's nuts to route that off package, if you even can!

If you read the article, HBM4 is rumored to double the width per stack. So, the equivalent GPU would have a 12288-bit data bus off-package. That's why they don't even want to route through the interposer anymore. You should do a better job of sanity-checking these ideas before you post them.
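
For reference, the arithmetic behind those widths (the stack count is assumed for an H100-class part):

```python
# Per-stack interface width times stack count (stack count assumed).
hbm3_bits_per_stack = 1024
hbm4_bits_per_stack = 2048   # rumored doubling, per the article
stacks = 6                   # e.g., an H100-class GPU

print(stacks * hbm3_bits_per_stack)   # 6144  - today's interposer routing
print(stacks * hbm4_bits_per_stack)   # 12288 - what you'd have to route off-package
```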
 
I could see memory, storage, and compute all in the same package.
Any GPUs with HBM already are that. They're just going a step further and putting it all in the same stack.

3D XPoint was, in theory, trying to merge RAM and storage, right?
No, totally different. 3D XPoint was a storage-class memory that was meant to be fast enough, and have sufficient endurance, that you could use it as either RAM or storage. It's the same memory chip either way; the only thing that changes is how you use it.

In this case, they're taking two different kinds of dies and combining them within the same stack.
 
60W incandescent bulbs were pretty common indoors. Lamps were often limited to using 60W bulbs.
I was saying 60 W was toward the low-end of normal-sized incandescent bulbs, not that they were rare.

Besides, if you counted the power draw of a CRT monitor, which I'll bet the quoted person was, that could easily reach around 100W.
They didn't specify that they were including the monitor. I know how much power CRT monitors used, and it could range into the hundreds of watts, depending on size and brightness. Monitors used so much more power than CPUs of that era that the CPU's power consumption hardly mattered. That's why I assumed they meant the computer without the monitor. Otherwise, a PC with a 21" monitor could easily surpass the power of a standard incandescent light bulb!
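
Again with assumed ballpark figures (not measurements), adding a CRT to the earlier box-only estimate:

```python
# PC-plus-CRT totals, using assumed ballpark figures (not measurements).
system_w    = 52    # rough box-only estimate from my earlier post
crt_17in_w  = 80    # typical mid-90s 17" CRT (assumed)
crt_21in_w  = 130   # larger 21" CRT (assumed)

print(f'PC + 17" CRT: ~{system_w + crt_17in_w} W')
print(f'PC + 21" CRT: ~{system_w + crt_21in_w} W')
```

Either combination blows well past a 60 W bulb, which is why I assumed they meant the computer alone.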
 
Source?

I'm pretty sure every DIMM I've ever bought had a limited warranty. At work, we see DIMMs start to develop errors after years of continuous use.
Corsair, G.Skill, Crucial, and Team Group all have limited lifetime warranties (I'd assume most consumer brands do), but the duration can vary by region, which I assume is why they use "limited". I wouldn't say this inherently speaks to longevity, though, due to the nature of DRAM and how often errors go undetected in consumer workloads.

Most of the time, server memory carries real warranties (though G.Skill's DDR5 RDIMMs carry their limited lifetime warranty).
Even today, if your GPU has bad memory, most people would just replace it - especially if it's still under warranty. Yes, you can potentially have a competent tech locate the bad chip and replace it, but I think people rarely take this route.
Not to mention the memory controller isn't isolated from the VRAM, so it can be taken out at the same time, depending on the type of failure.
 
I think this could be possible, as both AMD and Nvidia are headed for chiplet designs instead of monolithic ones. That would mean multiple, lower-temperature hotspots to spread the heat. But I don't know if the advantage gained with stacked HBM would be lost to the latency incurred by Infinity Fabric / NVLink.
 
And how is that better than routing through the interposer, like we do today??? GPUs like the H100 have a 6144-bit memory interface. It's nuts to route that off package, if you even can!

If you read the article, HBM4 is rumored to double the width per stack. So, the equivalent GPU would have a 12288-bit data bus off-package. That's why they don't even want to route through the interposer anymore. You should do a better job of sanity-checking these ideas before you post them.
How do you plan on cooling both packages? It doesn't matter whether the GPU is on the bottom and the HBM on top, or vice versa.

Either one is going to be a HUGE problem for the other.

Having it mounted on the other side of the PCB and having a direct route to the GPU is the best choice that I can think of, regardless of how many bits you have to route.

That's what TSVs and advanced packaging are for, or some other way of getting the data through the PCB to the GPU.

Engineers exist to figure out how to get the routing to work while keeping cooling viable.

It should be "Much Easier" to figure out how to cool a package on both sides than to have one directly stacked on top of the other.
 