News AMD deep-dives Zen 5 — Ryzen 9000 and AI 300 benchmarks, Zen 5, RDNA 3.5 GPU, and XDNA 2 microarchitectures

Could it possibly be a "both-and" situation, where they both moved the thermal sensor and made some TIM or IHS-related improvements, but the person claiming that was only aware of one and not the other?

I think it's not in their interest to have worse information about die temperatures. It seems likely to blow up in their faces, if you'll excuse the turn of phrase.
That is what I'd hope, yes.

Thing is, nothing I have read makes me think that's what they were trying to say, which I'd imagine would be explicitly stated? Then again, this IS AMD's marketing we're talking about.

Regards.
 
BTW, the font kerning on some of these slides is just atrocious. Some letters are overlapping, while others are spaced so far apart that it almost looks like there's a space where there's not!

Here's one such example (but I'm sure not even the worst):
[image: kZnYxZYHko84gTZ2h9r4jm.jpg]
At the top, the letters in "Gen" are all overlapping, while "Strix" looks more like "St rix". At the bottom, "heterogeneous" looks more like "het erogeneous".

I'm not normally one to complain about such things, but it's so bad that it really affects their readability. So, um, like what gives, AMD??
I noticed that as well and wondered if it was something with the PDF or Paul's PC. Maybe he sent the PDF to a converter to generate the JPG files that did a shoddy job. TBH, it's probably Paul's issue and not something AMD did. LOL
 
The internet is full of speculation and people who think they know. The latest word from Intel is they don't know, but are still looking into it!

Modern photolithography is a complex business, pushing the boundaries of physics, chemistry, and manufacturing. I don't trust anyone who's not a lithography expert who thinks they know what's wrong. However, you can find some worthwhile advice on how to mitigate the risk or frequency of the issue.
This, TBH.

I haven't looked into non-i9 CPUs having issues much, and there are always going to be a small percentage of parts that fail over time (probably far less than 0.1% is typical, I'd guess). The biggest factor in general seems to be time, but I can say that my i9-13900K actually had stability issues in certain games from day one. I figured out workarounds to get things running, which I used for most of a year. Then I decided to get serious about trying to find the root cause and a permanent fix.

I did find a fix earlier this year. I was so happy! Finally I could run all my games without constantly having to set affinity to only a few cores to get shader compilation to complete. But then after maybe three months, suddenly I started getting more instability than before! The biggest problem was with Nvidia's driver installer, which would just fail for no clear reason. First it was maybe 25% of the time... then it was 50%, then 80%, and at the end I literally tried to reinstall new drivers (after using DDU) like 25 times in a row and it failed every time. This was over the course of a month or two in March/April. I tried tweaking BIOS settings, I tried going back to the original settings that had worked in 2023, and nothing solved the problem.
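(For what it's worth, that affinity workaround can be scripted rather than set by hand in Task Manager every launch. This is just a minimal sketch of the idea, assuming Windows and the psutil package; the process name and core list are placeholders, not my actual setup.)

Code:
# Minimal affinity-pinning sketch (assumes the psutil package is installed).
# The process name and core list are illustrative placeholders.
import psutil

def pin_to_cores(process_name="game.exe", cores=(0, 1, 2, 3)):
    """Restrict every running process matching process_name to the given logical cores."""
    for proc in psutil.process_iter(["name"]):
        if (proc.info["name"] or "").lower() == process_name.lower():
            proc.cpu_affinity(list(cores))
            print(f"Pinned PID {proc.pid} to cores {cores}")

pin_to_cores()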

That's when I reached out to Intel about an RMA. The replacement chip immediately worked with the settings that had previously been failing. Ergo, the CPU itself had degraded to the point that it was no longer usable. I'm sending it back to Intel, though I don't know if they'll do anything with it. So basically, some subset of CPUs seem to have had issues with certain mobo 'stock' settings right from the start, in certain applications. But then there's also a more prevalent issue where a larger percentage of CPUs might be failing.

We still don't know what percentages we're talking about, though. Intel sells millions of CPUs every week. (I think? It's some big number like six million per week, or maybe it's per month.) So even if, as an example, 100,000 CPUs failed, that would only be something like 0.03–0.1 percent of all Intel CPUs that have shipped in the past two years. But if that figure starts trending upward and it ends up being 1% or more that are failing within three years? That would be catastrophically bad!
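(Quick back-of-the-envelope on that, treating both shipment figures as guesses rather than actual Intel numbers:)

Code:
# Rough failure-rate math; the shipment figures are assumptions, not Intel data.
failed = 100_000
for label, per_period, periods in [("~6M per month", 6_000_000, 24),
                                   ("~6M per week", 6_000_000, 104)]:
    shipped = per_period * periods  # CPUs shipped over roughly two years
    print(f"{label}: {failed / shipped:.3%} of ~{shipped / 1e6:.0f}M CPUs")
# ~6M per month: 0.069% of ~144M CPUs
# ~6M per week:  0.016% of ~624M CPUs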

It's also very telling that Intel hasn't come up with any specific root cause or solution. I think internally it may actually have a good idea what's going on, but it might also be so bad that it doesn't want to admit what the problem is. I've encountered this in the past, where I would email a manufacturer about a problem and they'd claim they couldn't reproduce it... and then email me a month or two later with a fix for the problem. LOL
 
So basically, some subset of CPUs seem to have had issues with certain mobo 'stock' settings right from the start, in certain applications. But then there's also a more prevalent issue where a larger percentage of CPUs might be failing.
Therein lies the basis for the questions I was asking in my previous posts. The general populace doesn't know how others have treated their CPUs re: overclocking, voltages, etc. The populace also doesn't know how badly (or not) the CPUs have been affected by mobo manufacturers pushing to get the last MHz out of any given CPU.
The only contrast presented has been L1Techs' reporting from the server customer, where "standard" voltages and clocks were applied with the memory running substantially slower than 5000 MT/s, and they still had problems.

Curiosity makes me ask: what is so different between the i9-12900 and the subsequent iterations (clock speed and voltages excepted)?
 

bit_user

Titan
Ambassador
That is what I'd hope, yes.

Thing is, nothing I have read makes me think that's what they were trying to say, which I'd imagine would be explicitly stated? Then again, this IS AMD's marketing we're talking about.
You saw the slide I quoted, right? Here it is, again:

[image: BcbcS2WeHB9ufN2jozkSKE.jpg]

They didn't just claim "7 degrees lower at same TDP", they even said that how they achieved it was through a "15% improvement in thermal resistance"! If they were trying to fudge this temperature reduction thing, why would they give that extra level of detail? It'd just be needlessly piling an outright lie atop a misrepresentation.
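FWIW, the two figures on that slide are at least self-consistent if you run them through a simple lumped thermal model. This is purely my own sanity check with an assumed package power, not anything AMD stated:

Code:
# Sanity check using deltaT = P * R_th; the power figure is an assumption.
P = 230.0                 # W, roughly a 170W-TDP part's full-load package power (assumed)
claimed_drop_c = 7.0      # degrees C lower at the same TDP, per the slide
claimed_reduction = 0.15  # 15% lower thermal resistance, per the slide
# If R_th falls 15% at constant power, the temperature delta across it falls 15%,
# so a 7 C drop implies the original delta across that path was about:
old_delta = claimed_drop_c / claimed_reduction  # ~46.7 C
old_r_th = old_delta / P                        # ~0.20 C/W
print(f"implied old delta: {old_delta:.1f} C, old R_th: {old_r_th:.2f} C/W")
# Plausible ballpark numbers for a Zen 4-style package, so the slide's two claims
# hang together, whatever the actual implementation turns out to be.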

Now, I don't want to get into the whole game of debating the reliability of this leaker or that source, but when AMD is that clear about something, I'd be inclined to just wait and see if the facts bear it out. Churning the rumor mill over this just seems unnecessary and potentially libelous.

But, since you brought it up, I'm curious what you would do if your source turns out to be wrong (and, to be clear, I consider claiming there's no actual improvement in thermal conductivity to be wrong, even if they're right about moving the sensor). Are you going to stop reading them or following them on YouTube? Or will you give them a pass, because "hey, it's just a rumor"? It really bugs me that some of these YouTubers seem to enjoy an accountability-free existence, where their main incentive is just to be sensational. Furthermore, will you apologize to us for propagating bad information that not only directly contradicted AMD, but essentially accused them of being duplicitous and liars?

I'm just saying this because, in this age of misinformation & disinformation, I think we need to be more careful than ever about media hygiene. I hope that's not putting too fine a point on it. I don't mean to pick on you, specifically.
: /
 

bit_user

Titan
Ambassador
Never buy new hardware until it's been out in the wild for a good half a year, or else you risk being a victim of good ol' fashioned human incompetence.
I like to be somewhere between the middle of the pack and trailing-edge. However, if I needed a new machine and some new hardware just launched, I'd definitely consider buying it. If we look back over the past decade or so, it's pretty clear you're grossly exaggerating the risk. Most CPUs and platforms didn't launch with any issues that couldn't be addressed with minimal impact through software updates.

Furthermore, I'd like to point out that I'm a daily consumer of tech news, from 3-5 different outlets, and I didn't become aware of these issues with Raptor Lake until about 15 months after launch! That's a wholly unreasonable amount of time to expect most consumers to wait, especially when Intel is on an annual product cycle!

4 grand for a gpu? um no...haha..of course, no...
OMG, what are you talking about? Where do you live, bro?
 

bit_user

Titan
Ambassador
I noticed that as well and wondered if it was something with the PDF or Paul's PC. Maybe he sent the PDF to a converter to generate the JPG files that did a shoddy job. TBH, it's probably Paul's issue and not something AMD did. LOL
Thanks for the clarification! I somehow assumed AMD had released JPGs. It didn't occur to me that you guys would've done your own PDF -> JPG conversion.
 

bit_user

Titan
Ambassador
The biggest factor in general seems to be time, but I can say that my i9-13900K actually had stability issues in certain games from day one. I figured out workarounds to get things running, which I used for most of a year. Then I decided to get serious about trying to find the root cause and a permanent fix.

I did find a fix earlier this year. I was so happy! Finally I could run all my games without constantly having to set affinity to only a few cores to get shader compilation to complete. But then after maybe three months, suddenly I started getting more instability than before! The biggest problem was with Nvidia's driver installer, which would just fail for no clear reason. First it was maybe 25% of the time... then it was 50%, then 80%, and at the end I literally tried to reinstall new drivers (after using DDU) like 25 times in a row and it failed every time. This was over the course of a month or two in March/April. I tried tweaking BIOS settings, I tried going back to the original settings that had worked in 2023, and nothing solved the problem.

That's when I reached out to Intel about an RMA. The replacement chip immediately worked with the settings that had previously been failing. Ergo, the CPU itself had degraded to the point that it was no longer usable.
@rluker5 should read this, as it directly contradicts all of his theories and claims.

We still don't know what percentages we're talking about, though.
Take this with a grain of salt, but L1Techs was saying they probed some sources at big PC OEMs who said it appeared to be in the range of 10% to 25%. I assume they mean the number of CPUs they expect to replace during the entire time those systems remain under warranty, but that wasn't specified. I find it stunning, if true. Also, not clear if they meant just i9's or both i7's and i9's.
 

bit_user

Titan
Ambassador
Curiosity makes me ask: what is so different between the i9-12900 and the subsequent iterations (clock speed and voltages excepted)?
According to Intel, one difference is indeed the manufacturing process.

[image: Raptor-Lake-Slides_28_crop.png]


I think I remember reading that the improvement in the Raptor Lake version of "Intel 7" had something to do with replacing the pure cobalt they introduced in the original "Intel 7" node. I think that might be discussed here, though I don't recall where I initially read it.


If the Raptor Lake version of Intel 7 started to tweak the formulation of that cobalt layer, then this graph would seem to be pretty ominous. However, I have no idea if it was.

[image: Intel-materials-EM-and-resistance.png]

It should go without saying that I don't know what the heck I'm talking about, but go ahead and check out the article if you're curious.

While continuing to poke around, I found this discussion of design changes in Raptor Lake, which goes into quite a bit more detail, and includes this gem of a slide:

[image: raptor-lake-dev-cycle-wc.png]

In the above article, Intel claimed that "utilizing the same underlying architecture and platform compatibility as Alder Lake resulted in a 6-month reduction in the development cycle, enabling a 1-year cadence of Core CPU generation." So, I gather that means cutting down that development cycle from 30 months to 24 months.
 
According to Intel, one difference is indeed the manufacturing process.
[image: Raptor-Lake-Slides_28_crop.png]

I think I remember reading that the improvement in the Raptor Lake version of "Intel 7" had something to do with replacing the pure cobalt they introduced in the original "Intel 7" node. I think that might be discussed here, though I don't recall where I initially read it.

If the Raptor Lake version of Intel 7 started to tweak the formulation of that cobalt layer, then this graph would seem to be pretty ominous. However, I have no idea if it was.
[image: Intel-materials-EM-and-resistance.png]
It should go without saying that I don't know what the heck I'm talking about, but go ahead and check out the article if you're curious.

While continuing to poke around, I found this discussion of design changes in Raptor Lake, which goes into quite a bit more detail, and includes this gem of a slide:
[image: raptor-lake-dev-cycle-wc.png]
In the above article, Intel claimed that "utilizing the same underlying architecture and platform compatibility as Alder Lake resulted in a 6-month reduction in the development cycle, enabling a 1-year cadence of Core CPU generation." So, I gather that means cutting down that development cycle from 30 months to 24 months.

I don’t know enough to pretend to speculate…. Anyone still got an abacus?
 
I'm waiting to see more from Level 1 Techs. In Wendell's research he has uncovered i9s failing in a server environment without overclocking and with memory dialled back to sub-5000 MT/s for stability.

If this proves to be valid then there is far more going on than simple overheating.
He didn't specify which configurations had been running what, but RC die are rated at no more than 4400 MT/s for 2DPC (3600 MT/s for 2R, which 32GB+ DIMMs likely are), so anything over that is overclocking the memory controller/DRAM.
Could it possibly be a "both-and" situation, where they both moved the thermal sensor and made some TIM or IHS-related improvements, but the person claiming that was only aware of one and not the other?

I think it's not in their interest to have worse information about die temperatures. It seems likely to blow up in their faces, if you'll excuse the turn of phrase.
At this point I'm waiting for the delid before passing judgment, since AMD doesn't seem keen to explain. The IHS on Zen 4 was already very good in terms of copper purity, which is usually where an IHS falters, and I believe they use an indium-based TIM (I don't recall and don't feel like looking to verify), so I'm not sure how much better they can get there alone (yes, I still want the vapor chamber dream, but if it were that, I'm certain they'd have said so).
 

NinoPino

Respectable
Yes, we are seeing it. All cooler reviews make it obvious how much easier Intel chips are to cool. Just cross-compare temperatures and you'll see it too. Well, never mind, you personally won't; you are too far gone.
Funny. If the chip produces heat, then this heat must be dissipated somewhere. How can somebody rationally think that it's easier to dissipate 370W (when lucky) vs. 280W?
 
You saw the slide I quoted, right? Here it is, again:
[image: BcbcS2WeHB9ufN2jozkSKE.jpg]
They didn't just claim "7 degrees lower at same TDP", they even said that how they achieved it was through a "15% improvement in thermal resistance"! If they were trying to fudge this temperature reduction thing, why would they give that extra level of detail? It'd just be needlessly piling an outright lie atop a misrepresentation.

Now, I don't want to get into the whole game of debating the reliability of this leaker or that source, but when AMD is that clear about something, I'd be inclined to just wait and see if the facts bear it out. Churning the rumor mill over this just seems unnecessary and potentially libelous.

But, since you brought it up, I'm curious what you would do if your source turns out to be wrong (and, to be clear, I consider claiming there's no actual improvement in thermal conductivity to be wrong, even if they're right about moving the sensor). Are you going to stop reading them or following them on YouTube? Or will you give them a pass, because "hey, it's just a rumor"? It really bugs me that some of these YouTubers seem to enjoy an accountability-free existence, where their main incentive is just to be sensational. Furthermore, will you apologize to us for propagating bad information that not only directly contradicted AMD, but essentially accused them of being duplicitous and liars?

I'm just saying this because, in this age of misinformation & disinformation, I think we need to be more careful than ever about media hygiene. I hope that's not putting too fine a point on it. I don't mean to pick on you, specifically.
: /
It's about breaking down the wording of that page/slide.

It's... Weird.

Look at the opening statement and they mention efficiency. Then they say "thermal resistance reduction" and talk about a hard 7° measured reduction, but none of those measurements are detailed in technical speak other than what we can assume they meant by them.

So, it's not information being leaked or anything like that, but different people's interpretations, and I side with this one: "this is marketing speak, more than likely incorrect on the technical side, but correct in the outcome". Basically, they state a "thermal resistance improvement" because they measured a lower temperature (remember, this is marketing digesting technical data). And the opening "efficiency" header is just dumb. We all know temperature is related, indirectly, to efficiency, but that's what throws the rest of the page's information off for me.

I think that's the most reasonable interpretation of that slide and that's why I concluded it's smoke and mirrors. I'll be happy to be wrong, but I doubt I'll be too far off the mark.

Regards.
 

bit_user

Titan
Ambassador
Look at the opening statement and they mention efficiency. Then they say "thermal resistance reduction"
Okay, so they miscategorized it (you'll note it's not the only slide with that heading - I think it's like the title of a chapter in a book). That doesn't invalidate the content of the slide, IMO. It's just sloppy organization.

Then they say "thermal resistance reduction" and talk about a hard 7° measured reduction, but none of those measurements are detailed in technical speak other than what we can assume they meant by them.
Thermal resistance is essentially the inverse of thermal conductivity. I don't know why they put it that way, but it seems clear to me what they meant.

Your point about not being "detailed in technical speak" had me looking up the reference at the bottom of that slide (GNR-11), but I didn't find it in the slides included in the article. However, I know Anandtech also tends to post up slides. I visited their coverage, and noticed the following statement:

"Unfortunately, when asked at the Tech Day in LA last week, AMD wouldn't divulge how they managed these improvements, but that's not a surprise."

So, it looks like we won't get the implementation details, although it sounds like Anandtech's interpretation agrees with mine.
 
Funny. If the chip produces heat, then this heat must be dissipated somewhere. How can somebody rationally think that it's easier to dissipate 370W (when lucky) vs. 280W?
Because it's a fact? AMD's atrocious IHS causes heat dissipation problems and there's less silicon surface area as well.

Here are a couple of TPU charts. I picked maximum for AMD and 250W for Intel, since Intel's maximum temp is 5°C higher than AMD's. It should be plenty obvious just the same that it's easier to dissipate the heat with Intel:
[image: maximum-heat-amd.png]

[image: temp-intel-250.png]

Note: I'm not saying this is good, bad, or indifferent; this is just the reality of Zen 4 and ADL/RPL.
 

bit_user

Titan
Ambassador
Because it's a fact? AMD's atrocious IHS causes heat dissipation problems and there's less silicon surface area as well.

Here are a couple of TPU charts. I picked maximum for AMD and 250W for Intel, since Intel's maximum temp is 5°C higher than AMD's. It should be plenty obvious just the same that it's easier to dissipate the heat with Intel:
[image: maximum-heat-amd.png]

[image: temp-intel-250.png]

Note: I'm not saying this is good, bad, or indifferent; this is just the reality of Zen 4 and ADL/RPL.
I pretty much said the same thing, at the same time, over in this thread:

Bonus: I linked a tweet that includes the thermal density for Alder Lake, Raptor Lake, and several Zen 4 models.
 

TheHerald

Respectable
BANNED
I'll put it simply for you:
more wattage = more problems
high wattage = huge problems
Again, heat dissipation is a thing. If you look at any cooler review you'll realize for yourself how much easier it is to cool Intel chips. It's not that hard; stop your Intel hate for a minute and look at the data.
https://www.techpowerup.com/review/id-cooling-frozn-a620-pro-se-cpu-air-cooler/7.html

The same cooler can cool 320W on an Intel chip and only 240W on an AMD chip. AMD chips are harder to cool, period.
 

NinoPino

Respectable
Because it's a fact? AMD's atrocious IHS causes heat dissipation problems and there's less silicon surface area as well.

Here are a couple of TPU charts. I picked maximum for AMD and 250W for Intel, since Intel's maximum temp is 5°C higher than AMD's. It should be plenty obvious just the same that it's easier to dissipate the heat with Intel:
[image: maximum-heat-amd.png]

[image: temp-intel-250.png]

Note: I'm not saying this is good, bad, or indifferent; this is just the reality of Zen 4 and ADL/RPL.
I know that AMD CPUs are hard to cool and therefore run at high temperatures, but that does not change the fact that it is more difficult to dissipate 1000W than 500W, and more difficult to dissipate 500W than 100W, and so on.
What is not clear about this?
So, in the end, Intel CPUs, which produce more heat, are also more difficult to cool.
In simple terms, you need more cooling.
 

TheHerald

Respectable
BANNED
So, in the end, Intel CPUs, which produce more heat, are also more difficult to cool.
In simple terms, you need more cooling.
Intel CPUs don't produce more heat. The 7950X produces more heat than the 12100 or the 14900T. Stop being obtuse.

At the end of the day, if you take a 7950X and a 14900K, lock them both to 250W, and slap the same cooler on them, the 14900K will be considerably cooler. Like 20°C cooler. That's a huge delta.
 

NinoPino

Respectable
Again, heat dissipation is a thing. If you look at any cooler review you'll realize for yourself how much easier it is to cool Intel chips. It's not that hard; stop your Intel hate for a minute and look at the data.
https://www.techpowerup.com/review/id-cooling-frozn-a620-pro-se-cpu-air-cooler/7.html

The same cooler can cool 320W on an Intel chip and only 240W on an AMD chip. AMD chips are harder to cool, period.
In the article, with the same cooler (FROZN A620 PRO SE, max speed), I see:
300W @ 100°C for Intel
229W @ 95°C for AMD
This confirms that it is more difficult to cool a CPU that draws more wattage.
That is what I keep saying, but somebody seems not to understand.
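To put those same two data points in °C-per-watt terms (assuming roughly 25°C ambient, which isn't quoted here, so treat this as an illustration only):

Code:
# Effective die-to-ambient thermal resistance from the two TPU data points,
# assuming ~25 C ambient (an assumption; the review's exact ambient isn't quoted here).
ambient = 25.0
r_intel = (100 - ambient) / 300  # ~0.25 C/W at 300 W
r_amd = (95 - ambient) / 229     # ~0.31 C/W at 229 W
print(f"Intel: ~{r_intel:.2f} C/W, AMD: ~{r_amd:.2f} C/W")
# Both statements can be true at once: the AMD chip needs more cooler per watt
# (higher C/W), while the Intel chip has more total watts to get rid of.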
 

NinoPino

Respectable
At the end of the day, if you take a 7950X and a 14900K, lock them both to 250W, and slap the same cooler on them, the 14900K will be considerably cooler. Like 20°C cooler. That's a huge delta.
At the same wattage, yes, it is obvious.
But the problem is that Intel uses a lot more power, and so it is more difficult to cool.
It is simple: more watts = more cooling problems.
I could concede your point if the difference in power consumption were marginal, but in the case of the 7950X and the 14900K the difference is huge (insane, imho), and a more efficient IHS or a larger die area will never compensate for the wattage difference.
 