Intel confirms Adamantine L4 cache tile for Meteor Lake, other CPUs.
Intel's Patent Details Meteor Lake's 'Adamantine' L4 Cache
"A lot of people here might not have been around back in the day when boot times were measured in minutes, not seconds, but booting is already massively fast now with just about any SSD. Not really a need for faster booting."

Depends on the application. With the IoT thing still in close to full swing, there will be a growing number of toasters, ceiling fans, lights and other stuff running full-blown Linux, BSD and other OS derivatives, for which you may not want to wait out a 5-second boot every time you turn them on before you can set them to do whatever you want them to do.
"The days of SRAM-based L4 cache are here."

Almost every mention of L4 in the patent is specific to boot-time initialization, the management engine, secure firmware, secure engines and related features, while a couple more claims focus on IGP/UMA graphics. It doesn't look like it is intended for CPU use while running software, at least not in its first iteration. They even mention a flag to disable or "lock down" the "L4" before the BIOS passes control to the OS, in claim #93.
With Intel's implementation on the way, you can bet that AMD will have one of their own.
It's going to be a great day when everybody starts using L4$ SRAM to benefit their CPUs =D
"On low-power mobile, the benefit the iGPU would get is just not worth all of this silicon (for example, Xe > Broadwell iGPU; DDR5 has enough bandwidth for mobile stuff, as you can see with AMD's bigger iGPUs). Also, without a dGPU the extra cache doesn't help that much."

The Crystalwell eDRAM on Broadwell enabled massively improved integrated graphics performance (3-3.5X the performance for 2.4X the IGP size), and if Meteor Lake's IGP scales to 128 EUs just like the A380, it'll certainly benefit from having access to a large scratchpad to offset having half as much memory bandwidth to share with the CPU.
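As a rough sanity check on the "half as much memory bandwidth" point, here's a minimal back-of-the-envelope sketch. The memory configurations (dual-channel LPDDR5-6400 for a Meteor Lake mobile part, 96-bit 15.5 Gbps GDDR6 for the A380) are my assumptions for illustration, not figures from the patent:

```python
# Back-of-the-envelope peak memory bandwidth: bus_width_bits / 8 * transfer_rate (MT/s).
def peak_bandwidth_gbs(bus_width_bits: int, mts: int) -> float:
    """Theoretical peak bandwidth in GB/s for a given bus width and data rate."""
    return bus_width_bits / 8 * mts / 1000

# Assumed Meteor Lake mobile config: 128-bit LPDDR5-6400, shared between CPU and iGPU.
igpu_shared = peak_bandwidth_gbs(128, 6400)     # ~102 GB/s
# Assumed Arc A380 config: 96-bit GDDR6 at 15.5 Gbps (15500 MT/s), dedicated to the GPU.
a380_dedicated = peak_bandwidth_gbs(96, 15500)  # ~186 GB/s

print(f"Shared LPDDR5-6400 (128-bit): {igpu_shared:.0f} GB/s")
print(f"A380 GDDR6 (96-bit, 15.5 Gbps): {a380_dedicated:.0f} GB/s")
print(f"Ratio: {igpu_shared / a380_dedicated:.2f}x")
```

Under those assumptions it really is roughly half the raw bandwidth, and the iGPU still has to share it with the CPU, which is exactly the gap a large L4 scratchpad could help cover.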
Actually, in some of the recent patches, as discovered by Phoronix (can't find the proper link now), it was revealed that, unlike previous designs, the Intel Meteor Lake GPU cannot utilize the LLC on the chip, which was previously shared by both the CPU and GPU.
"With Intel's implementation on the way, you can bet that AMD will have one of their own. It's going to be a great day when everybody starts using L4$ SRAM to benefit their CPUs =D"

I think there's a good chance that AMD will keep the GPU on its own Infinity Cache slice.
"I hate to cast doubt on something that looks good, but there does seem to be a possible use of some hybrid aggressive sleep function to save power at idle and low use. Maybe even shutting down extensive parts of the system, if they could be woken quickly enough. That, and the security stuff to make a cold boot feel like waking from sleep. Could lead to some nice, yet boring, benefits in the power-savings area."

That's an interesting point - they could power down the CPU tile and just run the Crestmont cores on the SoC tile. Likewise, they could power down the L4 tile - I know some phone SoCs power down parts of their cache hierarchy to save energy.
"On low-power mobile, the benefit the iGPU would get is just not worth all of this silicon (for example, Xe > Broadwell iGPU; DDR5 has enough bandwidth for mobile stuff, as you can see with AMD's bigger iGPUs). Also, without a dGPU the extra cache doesn't help that much."

Disagree. A big part of Apple's power savings has been through more aggressive use of cache. When the system is running under high load, it's much more efficient to go to cache than to DRAM.
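To put very rough numbers on "cache is more efficient than DRAM", here's a minimal sketch. The per-access energy figures are assumed, order-of-magnitude values for illustration only, not measured numbers for any Intel or Apple part:

```python
# Rough energy cost of serving memory traffic from on-die SRAM cache vs. external DRAM.
# The per-access energies below are assumed, order-of-magnitude illustrative values only.
SRAM_CACHE_PJ_PER_64B = 50    # assumed: large on-die SRAM cache access, tens of pJ
DRAM_PJ_PER_64B = 1500        # assumed: external DRAM access incl. I/O, roughly nJ range

accesses_per_second = 1e9     # 1 billion 64-byte accesses/s (~64 GB/s of traffic)

def watts(pj_per_access: float, rate: float) -> float:
    """Convert picojoules per access at a given access rate into watts."""
    return pj_per_access * 1e-12 * rate

for hit_rate in (0.0, 0.5, 0.9):
    # Hits are served by the cache; misses go out to DRAM.
    power = (hit_rate * watts(SRAM_CACHE_PJ_PER_64B, accesses_per_second)
             + (1 - hit_rate) * watts(DRAM_PJ_PER_64B, accesses_per_second))
    print(f"L4 hit rate {hit_rate:.0%}: ~{power:.2f} W spent on memory accesses")
```

Even with hand-wavy numbers, turning sustained DRAM traffic into on-die SRAM hits saves real watts under load, which is the point being made about Apple's designs.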
"With the IoT thing still in close to full swing, there will be a growing number of toasters, ceiling fans, lights and other stuff running full-blown Linux, BSD and other OS derivatives, for which you may not want to wait out a 5-second boot every time you turn them on before you can set them to do whatever you want them to do."

It's funny to me just how many devices run a Linux kernel. Such overkill, but often the easiest path when you have the power budget and a capable core.
"This sentence strikes me as very odd: 'Value is added for high end silicon with higher pre-initialized memory at reset, potentially leading to increased revenue.'"

I'm not seeing how more "pre-boot memory" is supposed to translate into increased profit either. I can imagine it making it easier for motherboard manufacturers to throw together fancy UIs without worrying about DRAM controller initialization, and maybe running UEFI apps without requiring external DRAM, which could be significant for things like rear-view cameras and similar embedded applications, and is one of the things mentioned in there.
"It's funny to me just how many devices run a Linux kernel. Such overkill, but often the easiest path when you have the power budget and a capable core."

Once your toaster requires support for cameras so you can watch your toast, WiFi and BT networking so you can monitor it remotely from your phone, PC, tablet or whatever else, USB storage to record your toast, and audio and video output for videoconferencing over toaster or just putting your toaster cams on external displays, you may as well go with a full OS.
"The Crystalwell eDRAM on Broadwell enabled massively improved integrated graphics performance (3-3.5X the performance for 2.4X the IGP size), and if Meteor Lake's IGP scales to 128 EUs just like the A380, it'll certainly benefit from having access to a large scratchpad to offset having half as much memory bandwidth to share with the CPU."

I think you might be romanticizing. Broadwell's iGPU is better than Crystalwell's, but it wasn't proportionally (per TFLOP) that much faster than Haswell's, even though it had a significantly improved arch. Here's how it holds up vs other arches: https://www.anandtech.com/show/1665...ew-is-rocket-lake-core-11th-gen-competitive/2
"The L4 will help, but it isn't going to change a low-powered iGPU into some powerhouse and dethrone the small-cache 780M."

Apple has shown that an iGPU can be very competitive against all but the biggest mobile dGPUs, if supported with enough memory bandwidth. And, in their case, that bandwidth reaches up to 400 GB/s, which is a lot more than a conventional external DRAM setup provides.
"I think you might be romanticizing. Broadwell's iGPU is better than Crystalwell's, but it wasn't proportionally (per TFLOP) that much faster than Haswell's, even though it had a significantly improved arch. Here's how it holds up vs other arches: https://www.anandtech.com/show/1665...ew-is-rocket-lake-core-11th-gen-competitive/2"

Benchmarking an almost 10-year-old IGP in modern-day titles it likely has no support for, on a chart that doesn't show any Haswell or Skylake numbers to compare against, isn't particularly useful. Here are some more contemporary benchmarks that show how much of a leap forward Broadwell was from Haswell:
"Almost every mention of L4 in the patent is specific to boot-time initialization, the management engine, secure firmware, secure engines and related features, while a couple more claims focus on IGP/UMA graphics. It doesn't look like it is intended for CPU use while running software, at least not in its first iteration. They even mention a flag to disable or 'lock down' the 'L4' before the BIOS passes control to the OS, in claim #93."

Isn't the reason for a patent to patent the NEW stuff a thing can do?!
"I'm not seeing how more 'pre-boot memory' is supposed to translate into increased profit either."

More expensive CPU = more money for Intel ??? ¯\_(ツ)_/¯
"The Crystalwell eDRAM on Broadwell enabled massively improved integrated graphics performance (3-3.5X the performance for 2.4X the IGP size), and if Meteor Lake's IGP scales to 128 EUs just like the A380, it'll certainly benefit from having access to a large scratchpad to offset having half as much memory bandwidth to share with the CPU."

Actually, the eDRAM failed to produce significant gains as far as I could tell. I started with a Skylake i5-6267 that had 64MB of eDRAM and an Iris 550 48EU iGPU, not the maximum configuration.
"However, the 96EU Xe iGPU did scale linearly vs. the 24EU UHD iGPU, really putting the Iris Plus/550 48EU iGPU with eDRAM to shame."

No, it didn't. Those Xe EUs have a lot of other improvements. So, ideally, it should've scaled more than linearly with the EU increase from a Gen 9.x iGPU.
"The eDRAM was measured at 50GB/s somewhere on Anandtech. So its main advantage might have been latency: this ain't no HBM!"

It should've been additive with external DRAM bandwidth, in which case your best case would've been 90 GB/s.
How does the Xe achieve the incredible speed gains without additional DRAM bandwidth?
"Until boot time is reduced to sub-1s, where it should be, there is a lot of room for improvement."

Well, ideally, boot time would be one cycle after power-on.
"No, it didn't. Those Xe EUs have a lot of other improvements. So, ideally, it should've scaled more than linearly with the EU increase from a Gen 9.x iGPU."

I am reporting on what I measured using the various 3DMark benchmarks, PerformanceTest 10 and Unigine on hardware I still own (it runs a production Linux cluster, though, so I can't easily do live tests now).
https://www.anandtech.com/show/15993/hot-chips-2020-live-blog-intels-xe-gpu-architecture-530pm-pt
Intel made lots of changes between Gen9 and Gen12, all the way from the ISA and microarchitecture up to the macroarchitecture. I remember reading about Gen11 that they removed some scalability bottlenecks.
"It should've been additive with external DRAM bandwidth, in which case your best case would've been 90 GB/s."

I am pretty sure that Intel didn't double the memory channels for the eDRAM on those (mostly) mobile chips, because it would require a vastly increased pad area and 128 extra pins just for the data bus: my Haswell Xeons do 70GB/s with DDR4-2133 across four channels, so additive bandwidth is out of the question. I just had another look: the i5-6267 with DDR3-1600 manages 25.6GB/s, so the 50GB/s alternate bandwidth of the eDRAM would have doubled that, albeit with diminishing returns, with perhaps 64MB of eDRAM being too small. Neither the i5-6267 nor the i7-8559U comes close to the potential that a double-sized iGPU seems to promise.
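For anyone who wants to check these figures, the peak-bandwidth arithmetic is simple enough to sketch; the configurations below are just the ones mentioned in this thread:

```python
# Theoretical peak DRAM bandwidth = channels * per-channel bus width (bytes) * transfer rate (MT/s).
def dram_bandwidth_gbs(channels: int, bus_width_bits: int, mts: int) -> float:
    """Peak bandwidth in GB/s for a given channel count, per-channel bus width, and data rate."""
    return channels * (bus_width_bits / 8) * mts / 1000

print(dram_bandwidth_gbs(2, 64, 1600))   # dual-channel DDR3-1600  -> 25.6 GB/s (i5-6267)
print(dram_bandwidth_gbs(4, 64, 2133))   # quad-channel DDR4-2133  -> ~68.3 GB/s (Haswell Xeon)
print(2 * 64)                            # two extra 64-bit channels -> 128 more data pins
```

So the 25.6 GB/s and roughly 70 GB/s numbers line up with plain channels x width x data-rate math, and adding two more 64-bit channels really would mean 128 extra data pins.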
"Until boot time is reduced to sub-1s, where it should be, there is a lot of room for improvement."

I can't say that I care about boot time much.
"It's funny to me just how many devices run a Linux kernel. Such overkill, but often the easiest path when you have the power budget and a capable core."

I ran Linux (also Unix System V Rel 3 and FreeBSD) on an 80486 with 16MB of RAM. I ran Linux in a VM on NT 3.51 with 32MB of RAM. My Microport System V Rel 2 ran on an 80286 with 1.5MB of RAM, while QNX made do on a 128KB PC-XT.
Notably, the Raspberry Pi Pico does not run Linux. Nor does Arduino, of course. Microcontrollers are really past the limit of where Linux will fit.
"Benchmarking an almost 10-year-old IGP in modern-day titles it likely has no support for, on a chart that doesn't show any Haswell or Skylake numbers to compare against, isn't particularly useful. Here are some more contemporary benchmarks that show how much of a leap forward Broadwell was from Haswell:

Broadwell: Intel Core i7-5775C And i5-5675C Review (www.tomshardware.com)

On the productivity side, Broadwell's IGP beats Haswell's by as much as 6X in Cinebench OpenGL and Maya. On workloads that played well with the eDRAM, Broadwell's IGP was a screecher."

I have a better comparison between with and without eDRAM: same arch CPU and GPU, and the Iris Pro 5200 has twice the shaders of the HD 4600: