Discussion AMD Radeon RX 6000 GPUs dying due to Driver issues or a defective batch ?

Hi guys/gals,

I wanted to ask gamers using AMD's RX 6000-series GPUs whether you have experienced a dead card after the latest driver release. I'm not sure whether this issue is related to a specific batch of chips or is a GPU driver issue.

Right now I don't think this issue is widespread, because only German users have reported it so far. But can it really be a coincidence that 48 GPUs died suddenly? Can GPU drivers really brick a card?

Anyways, let me keep this short.

German tech YouTuber and repair shop KrisFix-Germany posted a video describing how his shop has received several AMD Radeon RX 6000 series graphics cards for repair. The shop got 61 cards in total, a mix of Radeon RX 6900 and RX 6800 series models, and 48 of the 61 arrived with a cracked GPU. Kris states that this is the first time he has seen so many cards come to his shop with cracked GPUs.

But it's not just that the GPU has cracked: all 48 cards came with a shorted SOC rail, a shorted memory rail, or a shorted memory controller rail. When users were asked what they were doing with the cards, the answers varied. Some said they were gaming normally, while others were watching YouTube or even had their PC idling when the GPU died on them.

But one thing matched across all affected GPUs: the driver version. All of them were reportedly using the latest drivers, Adrenalin 22.11.2 (WHQL), which launched in December last year.

Also, KrisFix doesn't mention which AIBs made the dead cards, but the one shown in the video is an AMD MBA design, which is used by the reference Radeon RX 6900 / RX 6800 graphics cards. We can't blame this on either the drivers or this particular variant yet without investigating the problem further, but it should definitely be taken seriously.

For now, Buildzoid states that it wouldn't be the first time a GPU manufacturer accidentally disabled thermal protection in a new driver release. Some users have also reported hearing aggressive coil whine on the newer drivers that wasn't there before.


View: https://youtu.be/pQDnwpc_k4E

View: https://twitter.com/Buildzoid1/status/1612868725128019968
 
They all have issues, whether it's AMD, Nvidia or Intel. I would be very interested to compare the number of issues Nvidia drivers have had in the last 12 months with AMD's. We might be surprised. You mentioned yourself that the damage on those cards couldn't be caused by a driver issue, but people will still falsely remember that AMD drivers are toasting cards, the same way they are convinced that AMD drivers make games crash all the time.

I got a 7900 XTX a month ago and have had no crashes besides The Witcher 3 (DX12), but several Nvidia users are also experiencing similar problems with the new version of that game. And before I got my 7900 XTX, I was running an RTX 2080 that crashed for 3 weeks in COD MW2 (both Nvidia and Infinity Ward had to release several patches to fix the problem). So no drivers are perfect. They all have their issues from time to time, but I don't think AMD's have a lot more than Nvidia's (at least in the last couple of years).

Yes, I know that there are issues with all vendors, and I never take any camp's side, be it AMD, NVIDIA, Intel, and/or Apple. I have never favored any company.

And in fact, like you said, it would be good to compare the number of issues Nvidia drivers have had in the last 12 months versus AMD's. I'm pretty sure the result is going to be a mixed bag. Depending on the game being played, and whether it is an AMD-sponsored or Nvidia-sponsored title, the GPU drivers sometimes play a role here.

Some games favor AMD's hardware, while others are more optimized for Nvidia GPUs/drivers. But there is no hard and fast rule here either.
 
It turns out that it wasn't the driver. A great many of these GPUs came from the same seller and were bought used. They were used for mining in hot and humid conditions and it is now believed that the damage was caused by that.
Videocardz:
Radeon GPU cracking not caused by drivers, storing conditions and cryptomining to blame - VideoCardz.com
Notebookcheck:
Improper storage believed to be the number one suspect in the strange case involving 48 broken AMD Radeon RX 6800 / 6900 XT cards - NotebookCheck.net News
 

zx128k

Good to know these cards were used for MINING before.

BTW, do you guys think that GPU drivers can brick a card/hardware in some rare cases? I know flashing a wrong BIOS can brick any hardware, but I've never heard of drivers killing a component. lol

Drivers can update firmware and BIOS, so yes, they could brick a card. Drivers can overclock hardware, with the risk of making it unstable and leading to hardware failure (bricking the card). There was the issue of Nvidia RTX 3080 cards boosting too high and becoming unstable. This "cap issue" was widely reported in the media and fixed in a driver update. There were reports that an Nvidia driver bricked cards, and there are similar reports in AMD's forums. Note that these tend not to be problems that get proven as fact (cause and effect is implied, not found) but events that happen to coincide with a driver update. They happen, but the root cause isn't proven. Flashing/blinking display issues and games freezing are typical driver-like issues. There were also reports of AMD GPU drivers overclocking AMD CPUs in the same system.

The dies on those AMD RX 6000 series cards were totally destroyed, as if massive voltage and heat had been applied with zero thermal protection. The cards were also said to have likely been cleaned and then not correctly dried. We have to wait for the cause to be proven. The working theory is that cryptomining and the environment are the cause. 48 cards with the same issue is noteworthy enough to make the news.
 

Drivers can include firmware updates. These firmware updates can cause a system to operate outside normal bounds.

Case in point:
When AMD initially released Zen 2 (Ryzen 3000 series), the chips failed to reach their advertised clock speeds, so AMD issued an update to the AGESA. This changed the throttle settings and made them more aggressive. The CPU/GPU still controls the power delivery.

You can make something too aggressive leading to thermal runaway.
 

Thanks for providing these links. They are quite informative and helpful. I will keep a note of all these issues. And yes, these RX 6000 series cards were victims of crypto-mining usage.

I'm pretty sure desperate gamers might have bought these cards at a discounted price, so I blame them for not enquiring properly as to why the seller was selling them cheap. He got all the profit in the end though.



Yeah, I saw those links earlier when there was an update on this issue. In fact, zx128k posted that update video earlier in this discussion thread.
 

r_jackdaw

I can confirm this issue as well. I'm from the Philippines and experienced the same issue twice on an AMD RX 6600. I'll be getting my third replacement now, and hopefully this issue is being fixed.
 
This January, my less-than-a-month-old Gigabyte RX 6600 (non-XT) died suddenly while on the desktop, doing nothing. It just shut down. It ran cool and undervolted, so I gave it no cause to die. While it was at RMA I bought a Gigabyte RX 6600 XT, which works like a charm. The first one was replaced, then sold, and is still working fine.