News: Intel and Lenovo Develop Future of PCs in Shanghai

Status
Not open for further replies.

bit_user

Titan
Ambassador
Lenovo jumped on the AR and VR bandwagons pretty early. I appreciated the effort and was sad to see it not work out better for them.

I also applaud the work they're doing on the Thinkpad X13S, especially with regard to Linux support. If I needed an ARM laptop today, that's the one I'd buy. Sadly for them, I do not.

As for things like thermal dissipation, I do not want a laptop churning out 55 W, period. No matter how quiet it is, that's just a lot of heat sitting right in your space. So, unless I'm in a cold room, it's going to be unwelcome. This summer, I configured a max power limit of 45 W on my work laptop (based on an i7-12850HX CPU) and that's really helped a lot.
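(If anyone's curious, here's a minimal sketch of one way to apply that kind of cap on Linux, assuming the kernel exposes the intel_rapl powercap interface; a BIOS setting or vendor tool works just as well, and the 45 W value is just the figure I mentioned above.)

```python
#!/usr/bin/env python3
# Minimal sketch: cap the package long-term power limit (PL1) through the
# Linux powercap/intel_rapl sysfs interface. Needs root. Paths assume a
# single-package system; adjust intel-rapl:0 if yours differs.
from pathlib import Path

PKG = Path("/sys/class/powercap/intel-rapl/intel-rapl:0")
PL1 = PKG / "constraint_0_power_limit_uw"   # long-term limit, in microwatts

def set_pl1_watts(watts: int) -> None:
    PL1.write_text(str(watts * 1_000_000))

if __name__ == "__main__":
    print("package:", (PKG / "name").read_text().strip())
    print("old PL1:", int(PL1.read_text()) / 1e6, "W")
    set_pl1_watts(45)                        # the 45 W cap mentioned above
    print("new PL1:", int(PL1.read_text()) / 1e6, "W")
```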
 

Diogene7

Prominent
Jan 7, 2023
72
13
535
I wish Intel would allocate far more resources to developing low-latency, low-power Non-Volatile Memory (NVM) such as VG-SOT-MRAM (or VCMA MRAM) in a High Bandwidth Memory (HBM) form factor, with capacities of at least 64 GB/128 GB.

Ideally, they should find a way to do so while reusing as many 3D NAND flash manufacturing tools as possible, to lower manufacturing costs.

This would be REALLY disruptive for all computing devices, especially IoT devices, and would finally usher in the era of low-power "Normally-Off Computing".
 

bit_user

Titan
Ambassador
I wish Intel would allocate far more resources to developing low-latency, low-power Non-Volatile Memory (NVM) such as VG-SOT-MRAM (or VCMA MRAM) in a High Bandwidth Memory (HBM) form factor, with capacities of at least 64 GB/128 GB.

Ideally, they should find a way to do so while reusing as many 3D NAND flash manufacturing tools as possible, to lower manufacturing costs.

This would be REALLY disruptive for all computing devices, especially IoT devices, and would finally usher in the era of low-power "Normally-Off Computing".
Intel is out of the storage business! There's no way they would do a 180, this soon after completely divesting from it.

As for HBM, you don't need that for such power-constrained IoT devices.
 
  • Like
Reactions: TJ Hooker

Diogene7

Prominent
Jan 7, 2023
72
13
535
Intel is out of the storage business! There's no way they would do a 180, this soon after completely divesting from it.

As for HBM, you don't need that for such power-constrained IoT devices.

Yes, I know that Pat Gelsinger is not in favor of Intel being in the memory business: he doesn't like that business.

However, in the early 2000s, it was because Intel innovated by bundling a computing chip with a wireless chip (Centrino) that it really ushered in a new era of Wi-Fi-enabled computers for consumers: it was disruptive at the time.

Phase Change Memory (PCM) Optane as a Non-Volatile Memory has too many shortcomings (high power consumption and low endurance) and was a bad choice to begin with.

On the contrary, MRAM like VG-SOT-MRAM seems to have most (if not all) of the technical requirements, and would enable new opportunities in the design of mobile computing devices ("Normally-Off Computing"): I believe it would be disruptive as well.

But yes, it is more wishful thinking, because the problem is economics: it has to be (very) profitable, and as of 2023 there is more profit to be made in data center businesses than in consumer ones.

Regarding HBM in a mobile device, if the power consumption were low enough (using VCMA MRAM), then I am sure new use cases could emerge (maybe more on-device machine-learning training): so there is likely a need for it, only that as of 2023 it is not yet technically and/or economically feasible…

PS: Realistically, regarding MRAM, I think the main company with an incentive to manufacture it at scale is TSMC, because it could increase their revenue/profit (for Samsung, SK Hynix,… it would compete with their other memory products).
 

Diogene7

Prominent
Jan 7, 2023
72
13
535
Training big models requires not just a lot of fast memory, but also a lot of compute, and that takes power.

Yes, for big models. I don't know exactly what the new opportunities would be, but I am confident that some new (maybe as-yet unforeseen) ones would emerge with HBM available at scale on mobile devices…
 

bit_user

Titan
Ambassador
I don't know exactly what the new opportunities would be, but I am confident that some new (maybe as-yet unforeseen) ones would emerge with HBM available at scale on mobile devices…
Let's distinguish between the IoT cases where you want persistent memory for power-optimization purposes vs. mobile devices like phones. I can see why a phone would want HBM, since it offers the most bandwidth per Watt of any DRAM technology. Probably the main reason we don't already have it is cost. Maybe Apple will lead the way, here.
 

Diogene7

Prominent
Jan 7, 2023
72
13
535
Let's distinguish between the IoT cases where you want persistent memory for power-optimization purposes vs. mobile devices like phones. I can see why a phone would want HBM, since it offers the most bandwidth per Watt of any DRAM technology. Probably the main reason we don't already have it is cost. Maybe Apple will lead the way, here.

What I am wondering is, as of 2023, what is the cost difference between 16 GB of LPDDR5X memory and 16 GB of HBM2E or HBM3 memory?

How much more expensive is HBM memory than LPDDR5X memory at the same capacity? For example, is it 5x to 7x more? 8x to 10x?…

It would give some idea of how realistically (Apple) could consider using HBM memory instead of LPDDR memory…
 

bit_user

Titan
Ambassador
What I am wondering is, as of 2023, what is the cost difference between 16 GB of LPDDR5X memory and 16 GB of HBM2E or HBM3 memory?
The only source I have on it is this:

"Compared to an eight-channel DDR5 design, the NVIDIA Grace CPU LPDDR5X memory subsystem provides up to 53% more bandwidth at one-eighth the power per gigabyte per second while being similar in cost. An HBM2e memory subsystem would have provided substantial memory bandwidth and good energy efficiency but at more than 3x the cost-per-gigabyte and only one-eighth the maximum capacity available with LPDDR5X."

Source: https://developer.nvidia.com/blog/nvidia-grace-cpu-superchip-architecture-in-depth/

If you want better than that, you could try doing your own "web research". Let us know if you find any good info.
 

Diogene7

Prominent
Jan 7, 2023
72
13
535
The only source I have on it is this:
"Compared to an eight-channel DDR5 design, the NVIDIA Grace CPU LPDDR5X memory subsystem provides up to 53% more bandwidth at one-eighth the power per gigabyte per second while being similar in cost. An HBM2e memory subsystem would have provided substantial memory bandwidth and good energy efficiency but at more than 3x the cost-per-gigabyte and only one-eighth the maximum capacity available with LPDDR5X."​

If you want better than that, you could try doing your own "web research". Let us know if you find any good info.

Thanks very much for your insight!!! I had a lot of difficulty finding this kind of information!!!

It is good to know that HBM2E memory may be, let's say, 3x to 5x LPDDR5X in cost per gigabyte: it gives a rough idea of how much more it would currently cost Apple to integrate it into their devices.

I would think that for Apple Pro models (MacBook Pro, iPad Pro, Mac Pro,…), the extra cost of integrating HBM memory actually seems manageable…
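Just as a rough, purely illustrative calculation (the base price I use for 16 GB of LPDDR5X is a made-up assumption, not a quoted figure), the premium per device would be something like:

```python
# Purely illustrative: what a 3x-5x cost-per-GB premium would mean for a
# 16 GB configuration. The LPDDR5X base price is a made-up assumption.
LPDDR5X_16GB_USD = 40.0                      # hypothetical BOM cost
for multiplier in (3, 5):
    hbm = LPDDR5X_16GB_USD * multiplier
    print(f"{multiplier}x cost/GB: ~${hbm:.0f} for 16 GB of HBM "
          f"(~${hbm - LPDDR5X_16GB_USD:.0f} extra per device)")
```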
 
  • Like
Reactions: bit_user
I wish Intel would allocate far more resources to developing low-latency, low-power Non-Volatile Memory (NVM) such as VG-SOT-MRAM (or VCMA MRAM) in a High Bandwidth Memory (HBM) form factor, with capacities of at least 64 GB/128 GB.
Yeah, as you said yourself, this isn't going to fly in the consumer market, but they are doing what you're describing on datacenter CPUs.
In 50 years it might be cheap enough to end up in low-cost devices.... :(
[Image: Intel Max Series newsroom wallpaper]
 

bit_user

Titan
Ambassador
They own the optane IP, they have done optane, they have done this.
Not with an HBM-like stack. Their Optane DIMMs aren't even as fast as regular DDR4 DIMMs.

I'm not saying that it will happen but I'm saying that they have all the parts they would need to do it.
No Optane fabs, though. Most of the people who knew anything about the design and manufacturing of Optane are probably now gone, as well.

Doesn't matter, though. There's no use case for HBM Optane. @Diogene7 was starting with a Frankensteinian mashup of technologies, and then trying to find a problem it could solve.
 
Last edited:

Diogene7

Prominent
Jan 7, 2023
72
13
535
Doesn't matter, though. There's no use case for HBM Optane. @Diogene7 was starting with a Frankensteinian mashup of technologies, and then trying to find a problem it could solve.

I am a firm believer that the importance of "Persistence" provided by emerging Non-Volatile Memory (NVM) like MRAM (VG-SOT-MRAM) is greatly overlooked: I believe "Normally-Off Computing" is disruptive, as it could drastically lower the energy consumption of IT systems when idle, especially those that react when sensors are triggered.

It also happens that VG-SOT-MRAM (from the European research center IMEC) takes less die area than an equivalently sized SRAM cache, so I would think it would first be manufactured to replace part (or all?) of the SRAM cache in CPUs/GPUs/…

But I really believe the whole memory and storage stack (SRAM, DRAM, HBM, flash,…) should be Non-Volatile by default: it is just that MRAM was not technically possible 50 years ago when SRAM appeared…

Assuming MRAM's dynamic power is lower than that of volatile SRAM/DRAM, it is a transition that would bring great power-efficiency benefits to all IT devices.

Also, I am not sure, but I think that for AI inference (e.g., a Google TPU), the model parameters need to be loaded into (HBM) memory; from there, I would think that Non-Volatile (HBM) memory would make technical sense, in order to keep the parameters in memory continuously (without any power consumed to keep the memory alive, as with DRAM), instead of spending energy to shuffle the parameters from storage to memory…

I think MRAM obviously makes sense technically, but from an economics standpoint, MRAM being an emerging technology, it may be too expensive to make financial sense: that is most likely the main reason it is not (yet) widespread.
 

bit_user

Titan
Ambassador
it could drastically lower the energy consumption of IT systems when idle, especially those that react when sensors are triggered.
Depends a lot on how often they're triggered and how much work they do, when they are. Mobile devices, like a phone or watch, also spend most of their time asleep and processing asynchronous events. You could study how they optimize battery life. It does involve things like shutting down cores and cache slices, as well as throttling back clocks all the way out to DRAM.

It also happens that VG-SOT-MRAM (from the European research center IMEC) takes less die area than an equivalently sized SRAM cache, so I would think it would first be manufactured to replace part (or all?) of the SRAM cache in CPUs/GPUs/…
Cache is probably the worst thing to use it for, since cache is very latency-sensitive and has a very high turnover. You might find that the read/write power of your MRAM makes it less efficient to use for cache than SRAM that's simply flushed out and powered down when not needed.
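To make that concrete, here's a toy model of the trade-off; every number in it is a made-up assumption, just to show the shape of the comparison:

```python
# Toy model: MRAM cache (no leakage, pricier accesses) vs. SRAM cache
# (cheap accesses, leaks while powered, but can be flushed & power-gated
# when idle). All figures below are illustrative assumptions only.
SRAM_ACCESS_PJ = 10.0    # energy per SRAM access (assumed)
MRAM_ACCESS_PJ = 100.0   # energy per MRAM access (assumed, higher)
SRAM_LEAK_W    = 0.10    # leakage while the SRAM slice is powered (assumed)

def avg_power_w(accesses_per_s: float, powered_fraction: float) -> tuple[float, float]:
    """(SRAM, MRAM) average power, with the SRAM gated off when idle."""
    sram = powered_fraction * SRAM_LEAK_W + accesses_per_s * SRAM_ACCESS_PJ * 1e-12
    mram = accesses_per_s * MRAM_ACCESS_PJ * 1e-12          # zero static power
    return sram, mram

for label, rate, duty in (("busy core", 5e9, 1.0), ("mostly idle", 1e6, 0.01)):
    s, m = avg_power_w(rate, duty)
    print(f"{label:11s}: SRAM {s:.4f} W  vs  MRAM {m:.4f} W")
```

With assumptions like those, the persistent cache only pulls ahead when the core is almost always asleep, and even then the absolute savings are tiny.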

for AI inference (e.g., a Google TPU), the model parameters need to be loaded into (HBM) memory; from there, I would think that Non-Volatile (HBM) memory would make technical sense, in order to keep the parameters in memory continuously (without any power consumed to keep the memory alive, as with DRAM), instead of spending energy to shuffle the parameters from storage to memory…
People talk about "computing in memory", meaning they indeed want to mix memory and computational elements to avoid wasteful & expensive data shuffling. This is very relevant to AI. The first problem is how to get the computational energy of AI down low enough that loading weights from PMEM to DRAM even matters. Then, if the inferences are sufficiently infrequent that holding them in DRAM uses too much power, perhaps you'd have a case.
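Here's a back-of-envelope way to see it; every figure is an order-of-magnitude guess, not a measurement:

```python
# Back-of-envelope: when does the energy to load weights into DRAM, or to
# keep them resident there, actually matter? All numbers are rough guesses.
MODEL_BYTES      = 4e9      # 4 GB of weights (assumed model size)
LOAD_PJ_PER_BYTE = 100.0    # assumed cost to move one byte NAND -> DRAM
ENERGY_PER_INFER = 0.5      # assumed joules per inference (compute + DRAM traffic)
DRAM_STANDBY_W   = 0.3      # assumed power to keep the weights resident in DRAM

load_once_j = MODEL_BYTES * LOAD_PJ_PER_BYTE * 1e-12          # ~0.4 J, paid once
standby_j_per_hour = DRAM_STANDBY_W * 3600                    # ~1080 J every hour
for per_hour in (1, 60, 3600):
    print(f"{per_hour:>4} inferences/h: compute ~{ENERGY_PER_INFER * per_hour:7.1f} J/h, "
          f"DRAM standby ~{standby_j_per_hour:.0f} J/h, reload-from-flash ~{load_once_j:.1f} J")
```

With guesses like these, the one-time weight load is noise; it's the standby power of keeping rarely used weights in DRAM that a persistent memory could address, and only at very low inference rates.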

A negative point about Optane is that I think its active power (i.e. read/write) is pretty high, compared to NAND. According to this, an Optane DIMM uses 12 - 18 W:

 

Diogene7

Prominent
Jan 7, 2023
72
13
535
Depends a lot on how often they're triggered and how much work they do, when they are. Mobile devices, like a phone or watch, also spend most of their time asleep and processing asynchronous events. You could study how they optimize battery life. It does involve things like shutting down cores and cache slices, as well as throttling back clocks all the way out to DRAM.
Cache is probably the worst thing to use it for, since cache is very latency-sensitive and has a very high turnover. You might find that the read/write power of your MRAM makes it less efficient to use for cache than SRAM that's simply flushed out and powered down when not needed.

You seem to make a lot of comparisons with Phase Change Memory (PCM) Optane PMEM: Optane indeed has very high dynamic (i.e., read/write) power consumption, which made it a very bad choice to use as PMEM to begin with, along with its quite low endurance (~10^6 cycles, I think).

MRAM exists in many different flavors (Toggle, STT, SOT,…) with different trade-offs, and it seems that VG-SOT-MRAM may have low enough dynamic power (much, much lower than Optane, anyway), while also having zero static power.


From there, you could shut down the cores and the (now MRAM) cache, with the great advantage that all the information stays in the cache: just restore power and the core is ready to work, with no need to shuffle/retrieve data from memory back into the cache.

People talk about "computing in memory", meaning they indeed want to mix memory and computational elements to avoid wasteful & expensive data shuffling. This is very relevant to AI. The first problem is how to get the computational energy of AI down low enough that loading weights from PMEM to DRAM even matters. Then, if the inferences are sufficiently infrequent that holding them in DRAM uses too much power, perhaps you'd have a case.

Again, if you use MRAM (related to spintronics technologies) as PMEM, there are different flavors with different trade-offs.

If MRAM dynamic power (i.e., read/write) is comparable to or lower than SRAM/DRAM, and endurance is high enough (maybe at least 10^12 to 10^14 cycles), then technically it would be preferable to use Non-Volatile (Persistent) MRAM everywhere instead of volatile SRAM/DRAM, as power consumption would be lower at all times (dynamic and static). Then only prohibitive cost would hinder adoption of MRAM…

There would then no longer be any distinction between PMEM and DRAM: every part of your memory stack would be non-volatile PMEM, though you might tune/choose different spintronic properties for different memory levels (cache as the SRAM replacement, main memory as the DRAM replacement). Or, even better, re-architect your system in innovative new ways that were simply not possible before (e.g., using the stochastic properties of spintronics for AI) to greatly improve energy efficiency (100x / 1000x…)

A negative point about Optane is that I think its active power (i.e. read/write) is pretty high, compared to NAND. According to this, an Optane DIMM uses 12 - 18 W:
 

bit_user

Titan
Ambassador
From there, you could shut down the cores and the (now MRAM) cache, with the great advantage that all the information stays in the cache: just restore power and the core is ready to work, with no need to shuffle/retrieve data from memory back into the cache.
The point of cache is to be low-latency and high-bandwidth. If MRAM can't do that, then it's a nonstarter.

As I said, SRAM cache can be powered down when utilization is low. When utilization is high, it's actually more energy-efficient than going straight to DRAM.
 

Diogene7

Prominent
Jan 7, 2023
72
13
535
The point of cache is to be low-latency and high-bandwidth. If MRAM can't do that, then it's a nonstarter.

As I said, SRAM cache can be powered down when utilization is low. When utilization is high, it's actually more energy-efficient than going straight to DRAM.

Yes, I agree with you for the cache.

It seems that the European research center IMEC has demonstrated that VG-SOT-MRAM is fast enough to replace (at least some part of) the cache (L3), while having good endurance (~10^12 cycles). Please see the article below:



So it seems that it can replace at least some part of the cache. However, I don't recall whether the article discusses retention time (days? weeks? months? years?); for DRAM replacement, it would ultimately be desirable to have several decades (10 years, 20 years,…), so that it could serve as both memory and storage for IoT devices.



NXP Semiconductors has already announced that they plan to use MRAM with a 20-year retention time in an automotive MCU in 2025.

 

bit_user

Titan
Ambassador
It seems that the European research center IMEC has demonstrated that VG-SOT-MRAM is fast enough to replace (at least some part of) the cache (L3),
I wonder whether the limiting factor is latency or bandwidth.

However, I don't recall whether the article discusses retention time (days? weeks? months? years?)
If it has enough endurance, that's not a problem. It can simply be refreshed like DRAM.

NXP Semiconductors has already announced that they plan to use MRAM with a 20-year retention time in an automotive MCU in 2025.
That's good, but a hard requirement is probably something more like 5 years. It's basically whatever maximum time you want the thing to be able to sit, completely powered down with no battery or anything. In the case of automotive, we have to consider individual components sitting in the supply chain and parts warehouses + distribution.
 

Diogene7

Prominent
Jan 7, 2023
72
13
535
I wonder whether the limiting factor is latency or bandwidth.
I think the IMEC article states that VG-SOT-MRAM works fine with a latency as low as ~300 ps (so ~3.3 GHz).

I would therefore think it could be fast enough for every cache level in most mobile phone application processors (and surely most IoT devices), which run at around ~3 GHz, but may be suitable for only some cache levels in desktop/server processors that can reach 5 GHz…
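A quick sanity check on that conversion, taking the ~300 ps figure at face value and using a few representative clock speeds:

```python
# Sanity check: how many clock cycles a ~300 ps array access costs at
# different core clocks. 300 ps is the figure quoted above; the clocks
# are just representative examples.
ACCESS_PS = 300.0
for clock_ghz in (3.0, 3.3, 5.0):
    cycle_ps = 1000.0 / clock_ghz
    print(f"{clock_ghz:.1f} GHz: cycle = {cycle_ps:5.1f} ps -> "
          f"{ACCESS_PS / cycle_ps:.2f} cycles per access")
```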

If it has enough endurance, that's not a problem. It can simply be refreshed like DRAM.
I agree but the less often you have to refresh it, the less energy it consumes when idle.

Also, the goal would be to unify storage and memory, replacing both with a single unified MRAM.

That's good, but a hard requirement is probably something more like 5 years. It's basically whatever maximum time you want the thing to be able to sit, completely powered down with no battery or anything. In the case of automotive, we have to consider individual components sitting in the supply chain and parts warehouses + distribution.

Indeed. In industrial environments, some sensors may be triggered only once every 10 or 20 years, or more (e.g., airbags in a car): you would need to be sure it could still hold the execution code for that long (that it hasn't decayed into something unreadable/unusable) and be able to execute it when needed (e.g., run the emergency procedure (call emergency services with GPS coordinates) when the airbags are triggered…)


I firmly believe that MRAM/spintronics is a key enabler needed to greatly improve many electronic devices, just as NAND flash (instead of HDDs) enabled great improvements in mobile devices (iPods, iPhones, MacBook Air,…).

MRAM/spintronics is still an emerging technology, but with recent manufacturing R&D breakthroughs (from IMEC), it is beginning to meet all the technical requirements needed to be better than SRAM/DRAM…

The major issue is cost, and that's why I think it would be better to first address a market that is not too cost-sensitive but is high-revenue and high-growth, in order to generate a boatload of cash flow that could then be invested in improving MRAM/spintronics manufacturing tools and accelerating cost reduction.

As AI datacenter servers for at least 2025-2030 seem to fit that description, that is why I was suggesting the idea of Non-Volatile HBM MRAM, or something like a high-density crossbar MRAM. It just needs to be something that uses MRAM manufacturing tools and brings large efficiency improvements (100x / 1000x), so as to bring a boatload of cash flow to the MRAM/spintronics ecosystem and from there drive enough cost reduction to make Non-Volatile MRAM financially attractive and a no-brainer replacement for SRAM/DRAM in all IoT/IT devices…
 

bit_user

Titan
Ambassador
I think the IMEC article states that VG-SOT-MRAM works fine with a latency as low as ~300 ps (so ~3.3 GHz).

I would therefore think it could be fast enough for every cache level in most mobile phone application processors (and surely most IoT devices), which run at around ~3 GHz, but may be suitable for only some cache levels in desktop/server processors that can reach 5 GHz…
Beware of trivial analysis. The devil is usually in the details.

In this case, cache also needs to implement CAM, for tag lookups. That multiplies the amount of reads you have to do per-access, as well as stacking another set of read latencies. So, perhaps what you should be worried about is approximating the raw latency of SRAM.
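As a rough illustration of that read amplification, here's a toy set-associative model (not exactly the CAM case, but it shows the same multiplication of reads; the way count and energy numbers are arbitrary assumptions):

```python
# Toy model of a set-associative cache hit: every access reads all N tags
# in the set, either before the data read (serial) or alongside reading
# every data way (parallel). Way count and energies are arbitrary.
WAYS             = 8
TAG_READ_PJ      = 2.0     # assumed energy per tag read
DATA_READ_PJ     = 15.0    # assumed energy per data-way read
ARRAY_LATENCY_PS = 300.0   # raw array read latency (MRAM figure being discussed)

def hit_cost(serial: bool) -> tuple[float, float]:
    """Return (energy in pJ, latency in ps) for one cache hit."""
    if serial:   # read tags, find the match, then read only that data way
        return WAYS * TAG_READ_PJ + DATA_READ_PJ, 2 * ARRAY_LATENCY_PS
    else:        # read all tags and all data ways at once, select at the end
        return WAYS * (TAG_READ_PJ + DATA_READ_PJ), ARRAY_LATENCY_PS

print("serial tag-then-data:", hit_cost(True))    # (31.0 pJ, 600 ps)
print("parallel tag+data   :", hit_cost(False))   # (136.0 pJ, 300 ps)
```

Either way, the per-access cost you'd have to compare against SRAM is a lot more than a single raw array read.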

In the end, replacing SRAM is probably out of reach for MRAM. Maybe you'll have better luck with DRAM.

In industrial environments, some sensors may be triggered only once every 10 or 20 years, or more (e.g., airbags in a car,…)
It's not the time between triggers that matters, but the time between having enough power to complete a refresh cycle.

20 years would probably be long enough for something not to require self-refresh capability, but this would have to apply to the entire non-operating temperature range.

The major issue is cost, and that's why I think it would be better to first address a market that is not too cost-sensitive but is high-revenue and high-growth, in order to generate a boatload of cash flow that could then be invested in improving MRAM/spintronics manufacturing tools and accelerating cost reduction.

As AI datacenter servers for at least 2025-2030 seem to fit that description,
If your solution is a poor fit or uncompetitive with current approaches, it doesn't matter how big the market is. You need to solve the problems, before deciding you have a solution that truly fits a problem. To supplant DRAM, for AI applications, you've got to be certain the density, cost, and active power are all roughly the same or better.
 

Diogene7

Prominent
Jan 7, 2023
72
13
535
Beware of trivial analysis. The devil is usually in the details.

In this case, cache also needs to implement CAM, for tag lookups. That multiplies the amount of reads you have to do per-access, as well as stacking another set of read latencies. So, perhaps what you should be worried about is approximating the raw latency of SRAM.

In the end, replacing SRAM is probably out of reach for MRAM. Maybe you'll have better luck with DRAM.
The article from the European research center IMEC (which is likely the world's most important research center for developing semiconductor manufacturing tools and processes) seems to indicate that VG-SOT-MRAM has all the requirements to replace at least some cache levels (L3, and maybe further down).

It is likely that before the end of 2025 there could be a prototype chip demonstrating the viability of this (it seems they are already working on it…).

It's not the time between triggers that matters, but the time between having enough power to complete a refresh cycle.

20 years would probably be long enough for something not to require self-refresh capability, but this would have to apply to the entire non-operating temperature range.


If your solution is a poor fit or uncompetitive with current approaches, it doesn't matter how big the market is. You need to solve the problems, before deciding you have a solution that truly fits a problem. To supplant DRAM, for AI applications, you've got to be certain the density, cost, and active power are all roughly the same or better.

For AI applications, it seems VG-SOT-MRAM could be used differently: it looks like an interesting candidate for implementing multi-level deep-neural-network weights for analog in-memory computing, and is likely much, much better suited to that than highly energy-inefficient DRAM.

At SRAM-array sizes it should be cost-competitive (IMEC has developed VG-SOT-MRAM to be competitive as an SRAM-cache replacement), but for larger sizes (DRAM-sized or bigger) it is hard to say: it all depends on how much efficiency improvement (100x / 1000x) it could bring compared to an AI system without MRAM, and therefore on how much cost savings it could generate.
 
Status
Not open for further replies.