"I have personally seen many times how SATA ports are limited on a slow DMI bus."

Because it was originally PCIe 1.0 x4. Sandy Bridge upgraded it to PCIe 2.0. Skylake bumped it up to PCIe 3.0. Then, Rocket Lake widened it to x8. Finally, Alder Lake upgraded it to PCIe 4.0, where it has stayed since. So, we're talking about up to 16x the performance of whatever you saw.
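As a rough check of the "up to 16x" figure, here is a minimal sketch that treats DMI as an equivalent PCIe link, as the reply does. The per-lane transfer rates and line encodings are the standard PCIe values; packet and protocol overhead is ignored, so the results are upper bounds rather than measured throughput.

```python
# Per-lane transfer rate (GT/s) and line-code efficiency for each PCIe generation.
PCIE_GENERATIONS = {
    1: (2.5, 8 / 10),      # 8b/10b encoding
    2: (5.0, 8 / 10),      # 8b/10b encoding
    3: (8.0, 128 / 130),   # 128b/130b encoding
    4: (16.0, 128 / 130),  # 128b/130b encoding
}


def link_bandwidth_gbps(gen: int, lanes: int) -> float:
    """Approximate one-direction bandwidth in GB/s, ignoring protocol overhead."""
    rate_gt, efficiency = PCIE_GENERATIONS[gen]
    return rate_gt * efficiency / 8 * lanes  # divide by 8: bits -> bytes


original = link_bandwidth_gbps(1, 4)  # PCIe 1.0 x4 equivalent
current = link_bandwidth_gbps(4, 8)   # PCIe 4.0 x8 equivalent
print(f"{original:.2f} GB/s -> {current:.2f} GB/s ({current / original:.1f}x)")
```

The ratio comes out slightly under 16x because the newer generations use 128b/130b encoding, which changes the overhead relative to the 8b/10b of the first two generations.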
"It is quite obvious that even the latest version of DMI is several times slower than the combined lanes available directly from the processor."

Let's take a step back and see if you can find some data showing that DMI is currently a bottleneck in modern systems (e.g. LGA1700).
"Zen4 HX series has 28 PCIe 5.0 lanes - 24 are available. Compare with the thickness of the south bridge..."

Oops, but you're talking about DMI. That's Intel, not AMD.
"The funniest thing is that none of the laptop manufacturers have used all these Zen4 HX processor lanes; they are literally hanging in the air without doing anything."

It's easy to find them all in use. A good dGPU uses x16 lanes and two CPU-connected M.2 slots use the other 8 lanes. The last 4 go to the chipset.
ROG Zephyrus Duo 16 (2023) | Gaming Laptops|ROG - Republic of Gamers|ROG USA
"The Zen4 HX memory bus is extremely weak - only 60-65 GB/s, although all devices require at least 2 times more and, taking into account the reserve for the system and software, 3 times more, if not 4,"

Where do you get this information?
"and here we are smoothly approaching the 256-bit Zen5 Halo controller with probably 200 GB/s. Bingo! Eureka!"

We already established the CPU cores can't get more than 196 GB/s, at the absolute theoretical max, based on the speed of their IF links.
"There is no point in me citing these graphs - they are obvious and trivial. The point was that the memory bus is as fast as the L1 cache, being located next to the processor, like the soldered memory (remember the context of our conversation). Therefore, any cache is a crutch."

I'm not sure what you're saying here. I will say that you cannot build a fast CPU that doesn't have cache, no matter how much memory bandwidth it has. Period. If you don't understand why, then you need to learn more about computer architecture and software performance optimization before you try being an armchair computer architect.
"It is obvious that Intel will also be forced to switch to a 256-bit (or 512-bit) controller in the HX series for the HEDT market, a year late compared to AMD,"

Huh? Xeon W ships in two flavors: one with a 256-bit memory interface and one with a 512-bit memory interface. I don't even understand your reference to AMD, because Threadripper has had 4+ DIMM channels since its inception.
"If Halo provides a real 200+ GB/s, it will take an absolute lead over the HX series in intensive processing of heavy data arrays in memory. And naturally, this will affect the performance in games, which such series usually target."

I already showed you a memory scaling article that shows games are much more sensitive to memory latency than bandwidth.
In this case, DDR4-3600 outperformed DDR5-4800 by 1%, even though its raw bandwidth is 25% lower! However, we can start to see why, when we compare CAS latency. For the DDR4-3600, it's only 10 ns, while the DDR5-4800 is 16.7 ns.
As for DDR5-7200, it was a mere 3.4% faster than DDR5-4800, in spite of its raw bandwidth being 50% greater! The main reason is probably that the DDR5-7200 has 9.4 ns CAS latency, while the DDR5-4800 memory has 16.7 ns.
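The latency and bandwidth figures above follow from simple arithmetic, sketched below. The CL values (CL18, CL40, CL34) are assumptions chosen because they reproduce the quoted 10 ns, 16.7 ns and 9.4 ns; the actual kits in the cited article may have used different timings.

```python
def cas_latency_ns(cl_cycles: int, data_rate_mts: int) -> float:
    """CAS latency in nanoseconds.

    DDR transfers data twice per clock, so the clock period in ns
    is 2000 / (data rate in MT/s).
    """
    return cl_cycles * 2000 / data_rate_mts


# name: (data rate in MT/s, assumed CL); the CL values are illustrative only.
KITS = {
    "DDR4-3600": (3600, 18),
    "DDR5-4800": (4800, 40),
    "DDR5-7200": (7200, 34),
}
BASELINE_MTS = 4800

for name, (rate, cl) in KITS.items():
    latency = cas_latency_ns(cl, rate)
    bandwidth_delta = (rate / BASELINE_MTS - 1) * 100
    print(f"{name} CL{cl}: {latency:.1f} ns, raw bandwidth {bandwidth_delta:+.0f}% vs DDR5-4800")
```

The bandwidth comparison assumes the same total bus width for every kit, so it reduces to the ratio of data rates.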
"Purely empirical assessment based on my understanding of the problems of x86 architecture. Especially in terms of iGPU blocks and output to high-resolution screens with high frame rates."

I'm pretty sure you mean "intuitively". Empirical usually means it's based on data, which you need to start providing to back up your assertions, because none of them have been supported by the data I've cited.
"I can't add anything except to repeat - such a scheme deprives the architecture of the universality of the memory bus for all devices equally and leads to bottlenecks for certain classes of calculations"

Nobody is going to disagree that more memory bandwidth is better. The critical question is how much. That's why I've cited memory scaling data which refutes the claim that it's bottlenecking at the level you think it is.
"I hope that the 256-bit Zen5 Halo controller will give, as before, at least 80%+ efficiency for the CPU cores, but at the same time will dynamically and efficiently distribute the bandwidth between all devices according to their requirements, unlike the limitations of the Apple architecture."

Again, we've already established that each CCD can do a max of 64 GB/s read + 32 GB/s write.
That's it, unless AMD enables the second IF link per CCD, which I'm sure they won't because it's a laptop processor and that would use more power for little benefit.
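To put the per-CCD limit next to a wide memory bus, here is a minimal sketch. The per-CCD read and write figures are the ones stated above; the CCD count (2) and the 6400 MT/s data rate are purely hypothetical inputs picked for illustration, not known specifications of any product.

```python
CCD_READ_GBPS = 64   # per-CCD read limit quoted in the thread
CCD_WRITE_GBPS = 32  # per-CCD write limit quoted in the thread


def peak_dram_gbps(bus_width_bits: int, data_rate_mts: int) -> float:
    """Theoretical peak DRAM bandwidth in GB/s: bytes per transfer times transfers per second."""
    return bus_width_bits / 8 * data_rate_mts / 1000


def cpu_read_ceiling_gbps(ccd_count: int) -> float:
    """Most the CPU cores could read, limited by one IF link per CCD."""
    return ccd_count * CCD_READ_GBPS


bus_peak = peak_dram_gbps(256, 6400)  # hypothetical 256-bit bus at an assumed 6400 MT/s
cpu_reads = cpu_read_ceiling_gbps(2)  # assumed 2 CCDs
print(f"bus peak {bus_peak:.1f} GB/s, CPU read ceiling {cpu_reads:.0f} GB/s "
      f"({cpu_reads / bus_peak:.0%} of peak)")
```

Under these assumptions the cores alone cannot consume the whole bus, which is consistent with the extra bandwidth mainly benefiting the iGPU and other clients.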
"Intel directly recommends in its datasheets using only dual-channel memory for video decoding and 4k output. Why, if 22 GB/s+ (DDR4-3200+) is more than enough even for a pair of 8k@60 monitors? But in reality, their iGPUs already begin to freeze screens with 4k monitors on single-channel memory. These are proven facts, especially in the case of old DDR4-3200."

The display controller's FIFOs are only so big and display streaming is a hard-realtime problem. So, I guess in lieu of having a robust QoS mechanism, they just want you to supply enough excess bandwidth that the display controller can always meet its deadlines.
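For reference, the average scan-out bandwidth behind the quoted claim can be estimated as below. This is a sketch under simplifying assumptions: 4 bytes per pixel, a single display plane, no blanking intervals and no framebuffer compression.

```python
def scanout_gbps(width: int, height: int, refresh_hz: int, bytes_per_pixel: int = 4) -> float:
    """Average bandwidth in GB/s needed to read one framebuffer per refresh."""
    return width * height * bytes_per_pixel * refresh_hz / 1e9


uhd_4k = scanout_gbps(3840, 2160, 60)
uhd_8k = scanout_gbps(7680, 4320, 60)
single_channel_ddr4_3200 = 64 / 8 * 3200 / 1000  # theoretical peak of one 64-bit channel, GB/s

print(f"4k@60: {uhd_4k:.2f} GB/s")
print(f"two 8k@60 displays: {2 * uhd_8k:.2f} GB/s")
print(f"single-channel DDR4-3200 peak: {single_channel_ddr4_3200:.1f} GB/s")
```

The averages do fit within a single channel on paper. The reply's point is about worst-case latency: the display controller must be fed before its FIFO drains, so headroom matters more than the average figure.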
"It is much better if sys mem = vram and the processor cores access vram directly without restrictions of the pci-e bus."

In practice, they don't. Because PCIe has such high latency, graphics APIs are designed to avoid the CPU doing a lot of poking around in VRAM.
"The frame buffer (it is small in size even with triple buffering) can be implemented separately on the iGPU, so as not to interfere with the common memory bus and common data processing by processor cores and GPU cores."

It's a waste of die real estate to put the frame buffer there, because writes to it are infrequent compared to the other memory I/O that happens in a rendering pipeline. That's why nobody does it any more (Microsoft/Xbox used to have a thing for local memories, but even they stopped doing this). Instead, what Nvidia and AMD do is just use big caches, which speed up whatever memory I/O you're doing. AMD got a huge speedup from doing this in RDNA2 and Nvidia followed in the RTX 4000 generation.
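To give a sense of scale for the die-area argument, this sketch computes how much dedicated memory a frame buffer would need. It assumes 4 bytes per pixel and uncompressed buffers; real drivers may use other formats.

```python
def framebuffer_mib(width: int, height: int, buffers: int = 3, bytes_per_pixel: int = 4) -> float:
    """Total size in MiB of `buffers` uncompressed frame buffers."""
    return width * height * bytes_per_pixel * buffers / 2**20


for name, (w, h) in {"1440p": (2560, 1440), "4k": (3840, 2160)}.items():
    print(f"{name}, triple buffered: {framebuffer_mib(w, h):.1f} MiB")
```

Tens of MiB is small next to system RAM but large as on-die SRAM reserved for a single purpose, which is why a general cache of similar size is the more flexible use of that area.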
"To be convinced that I'm right, you just need to compare the text on the screen of your smartphone with the text on the screen of your monitor with less than 150 ppi."

I don't like reading on my smartphone.
"The main problem for the eyes is that when they look at a low-ppi screen, they constantly refocus from pixels to the objects themselves."

I don't know how close you sit to your monitor, but I don't really see pixels on my 27" 1440p. I like to sit with my eyes around 24" to 30" away.
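Whether pixels are visible depends on viewing distance as well as ppi, which a pixels-per-degree estimate captures. The sketch below uses the monitor size and distances from the reply. The often-quoted figure of about 60 pixels per degree for 20/20 vision (one arcminute per pixel) is a rule of thumb, not a hard limit.

```python
import math


def pixels_per_inch(width_px: int, height_px: int, diagonal_in: float) -> float:
    """Pixel density from the resolution and the diagonal size in inches."""
    return math.hypot(width_px, height_px) / diagonal_in


def pixels_per_degree(ppi: float, distance_in: float) -> float:
    """Pixels covered by one degree of visual angle at the given viewing distance."""
    inches_per_degree = 2 * distance_in * math.tan(math.radians(0.5))
    return ppi * inches_per_degree


ppi = pixels_per_inch(2560, 1440, 27)
for distance in (24, 30):
    print(f'27" 1440p ({ppi:.0f} ppi) at {distance}": {pixels_per_degree(ppi, distance):.0f} px/degree')
```

At the upper end of that distance range the monitor gets close to the rule-of-thumb threshold, while a phone held much nearer needs a far higher ppi to reach the same angular density.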
"You confirm what I said. Yes, 4k reduced by a bicubic algorithm to 2.5k will naturally look more or less acceptable, because 4k is excessive for 2.5k, as well as for FHD. But not ideally, because this is not a division by an integer."

There are better low-pass filters than that. Bicubic is just what people did when computers were too slow to handle large convolutions. If you use a proper low-pass filter, then you don't need an integer scaling ratio. This is something I'd only recommend for video content, however.
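As an illustration of resampling at a non-integer ratio, here is a minimal one-dimensional sketch using a Lanczos (windowed-sinc) kernel. Lanczos is one commonly used low-pass choice and is an assumption here; the reply does not name a specific filter. A real scaler would apply this along rows and then columns.

```python
import math


def lanczos(x: float, a: int = 3) -> float:
    """Lanczos windowed-sinc kernel with support (-a, a)."""
    if x == 0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)


def resample_row(row: list[float], out_len: int, a: int = 3) -> list[float]:
    """Resample one row of samples to out_len samples; any ratio is allowed."""
    scale = len(row) / out_len    # > 1 means downscaling
    stretch = max(scale, 1.0)     # widen the kernel when downscaling so it low-pass filters
    out = []
    for i in range(out_len):
        center = (i + 0.5) * scale - 0.5  # output sample center in input coordinates
        lo = math.floor(center - a * stretch) + 1
        hi = math.floor(center + a * stretch)
        total = weight_sum = 0.0
        for j in range(lo, hi + 1):
            w = lanczos((j - center) / stretch, a)
            sample = row[min(max(j, 0), len(row) - 1)]  # clamp indices at the edges
            total += w * sample
            weight_sum += w
        out.append(total / weight_sum)  # normalize so flat areas stay flat
    return out


# 12 samples -> 8 samples is the same 1.5:1 ratio as 3840 -> 2560.
print(resample_row([0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0], 8))
```

Stretching the kernel by the scale factor is what makes this a low-pass filter during downscaling, so the ratio does not need to be an integer.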
"Only 8k, 4k and FHD are universal - all three are obtained either by multiplying lower resolutions by an integer or by dividing higher ones into lower ones, again by an integer."

Heh, where do you think 2560x1440 came from? It's a 2x scaling of 1280x720, which is one of the ATSC resolutions and supported by Blu-ray and many video cameras. So, I wouldn't say there's no 1440p content out there.
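The integer-ratio relationships being argued about can be checked directly. This small sketch uses only the standard resolutions named in the thread; the labels are just dictionary keys.

```python
from fractions import Fraction

RESOLUTIONS = {
    "720p": (1280, 720),
    "1080p": (1920, 1080),
    "1440p": (2560, 1440),
    "4k": (3840, 2160),
    "8k": (7680, 4320),
}


def scale_factor(src: str, dst: str) -> Fraction:
    """Exact linear scale factor from src to dst (same aspect ratio required)."""
    sw, sh = RESOLUTIONS[src]
    dw, dh = RESOLUTIONS[dst]
    fx, fy = Fraction(dw, sw), Fraction(dh, sh)
    if fx != fy:
        raise ValueError("aspect ratios differ")
    return fx


def is_integer_scale(src: str, dst: str) -> bool:
    """True when every source pixel maps to a whole block of destination pixels."""
    return scale_factor(src, dst).denominator == 1


for src, dst in [("720p", "1440p"), ("1080p", "4k"), ("1080p", "1440p"), ("1440p", "4k")]:
    print(f"{src} -> {dst}: x{scale_factor(src, dst)} (integer: {is_integer_scale(src, dst)})")
```

So 1440p is an integer multiple of 720p, but not of 1080p, and 4k is not an integer multiple of 1440p.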
"In commercial video, only the 4:2:0 chroma subsampling scheme is used."

In professional video, they use 4:4:4. Support for this, in consumer products, is less common, but not unheard of.
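For readers unfamiliar with the J:a:b notation, this sketch shows how much data each scheme carries per pixel. It assumes 8-bit samples and uncompressed frames, purely for illustration.

```python
def samples_per_pixel(j: int, a: int, b: int) -> float:
    """Average samples per pixel for J:a:b chroma subsampling.

    In a block J pixels wide and 2 rows tall there are 2*J luma samples and
    (a + b) samples for each of the two chroma channels.
    """
    return 1 + (a + b) / j


def frame_bytes(width: int, height: int, scheme: tuple[int, int, int], bits: int = 8) -> float:
    """Uncompressed size of one frame in bytes."""
    return width * height * samples_per_pixel(*scheme) * bits / 8


for scheme in [(4, 4, 4), (4, 2, 2), (4, 2, 0)]:
    label = ":".join(map(str, scheme))
    size_mb = frame_bytes(3840, 2160, scheme) / 1e6
    print(f"{label}: {samples_per_pixel(*scheme)} samples/pixel, {size_mb:.1f} MB per 4k frame")
```

4:2:0 halves the raw data relative to 4:4:4, at the cost of chroma resolution, which is most noticeable on fine colored text.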
"That's why I don't understand why you keep a completely inferior 2.5k at home."

I mostly use it for text or web. Hardly ever streaming video.