[SOLVED] RX 5700 XT cache hierarchy error

CapsLuke

Honorable
Jan 7, 2017
163
0
10,710
Hello everyone,
Lately I've been dealing with this error that's driving me crazy:

While playing Call of Duty (Modern Warfare and Black Ops Cold War), my PC suddenly reboots, and the Event Viewer says "machine check exception cache hierarchy error reported by processor core".
I've done a lot of googling and tried a bunch of fixes, none of which actually worked.
Most of the posts I read were CPU-focused, with suggestions such as increasing Vcore, fixing Vcore at 1.35 V with a relatively low clock multiplier (I settled on 4.15 GHz), disabling PBO, disabling XMP because it's considered overclocking, increasing load-line calibration (I set it to the maximum level), and so on.
None of them worked: with every voltage setting I tried, and with everything disabled, my PC would hit the same reboot again and again.
Other things I tried:
  • fresh windows install
  • disable fast startup
  • change motherboard
  • change cpu
  • change psu
  • run memtest86 on every ram stick
  • DDU on display drivers
  • chipset driver reinstall
  • update windows
  • update to latest bios
  • stress testing to reproduce the issue.

Since I swapped out every component except the GPU, I am now inclined to think the problem is there.
Prime95, OCCT and Cinebench R23 were not able to reproduce the issue, and temps are fine.

As a side note, my PC dual-boots with Fedora Linux, but the two OSes are on separate drives and shouldn't interfere with each other.

My specs:
Asus rog strix x570-f (replaced with asrock x570 taichi)
Ryzen 7 3700X (replaced with Ryzen 7 5800X)
Psu corsair rm 750x 750w 80+ gold (tested also with a thermaltake toughpower grand rgb 650w 80+ gold)
G.Skill Trident Z RGB 32 GB DDR4 @ 3200 MHz C16 (4x8 GB, XMP enabled, memtest86 tested)
Gpu RX5700XT powercolor red devil
1x sabrent rocket 4 500 gb (os)
1x sabrent rocket 4 1tb (games)
1x kingston sata ssd 240gb (fedora linux)
1x wd blue 1tb 7.2k rpm (general files storage).

I don't know what to try next, any help is appreciated.
Thanks
 
You did all the things that we would have suggested, and even more.
A new Windows installation takes software issues out of the equation. You also tried DDU and many other software fixes, so it's definitely not that. You also swapped almost every piece of hardware except the GPU, so the GPU does seem to be the issue.
The only thing I can think of that you still need to try is putting the GPU in another system. If the issue persists there, you can be certain that the GPU is to blame.

After that, contact PowerColor for an RMA.
 

CapsLuke

Thanks, I fear that's the issue too. Unfortunately I cannot test the GPU elsewhere, so I have to go with "I'm almost sure it's the GPU". My only concern is that PowerColor may not have any stock to replace my GPU, and in case of a refund I wouldn't have a GPU for a long time...
 

CapsLuke

I have an update.
I still was not completely sure why the crashes occurred only in Call of Duty, because other games like RDR2 and AC: Valhalla didn't trigger any reboot, even though they are far more demanding and resource hungry and I had pretty long gaming sessions with them (2.5 to 3 hours).
I started questioning low-level graphics APIs and per-game optimization. Maybe in CoD (which uses DX12) there's some kind of memory leak, and the VRAM buffer gets filled in such a way that data is hard to retrieve and consistency errors occur in the cache hierarchy, while other games aren't affected because they use Vulkan/DX11.
To tackle that I did two things:
First, I wiped Linux from the SSD, removing it from the dual-boot configuration (I know that 99% it wasn't the one to blame, but I've tried so many things that I thought it was worth a shot). I have all my work on various clouds anyway, so I never worry about my data on full OS reinstalls.
Second, I limited the frame rate to 90 FPS in the Call of Duty games specifically, and managed to get a full day without crashes! I still don't consider it a victory, because my crashes happened once or twice per day (evening gaming sessions, usually from 21:00 to 01:00), so statistically it could just be that the crash didn't happen to occur yesterday. So unless I manage to get at least 2 weeks of stable gaming, I won't consider my problem solved. I'll keep you updated. Many thanks @dotas1 for your interest.
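
To put a rough number on the "one clean day proves little, two clean weeks would" reasoning, here is a minimal sketch in Python. It assumes crashes arrive independently at a steady average rate (a Poisson model). That model and the helper name `prob_no_crash` are illustrative assumptions, not something measured in this thread; the only input taken from the post is the "once or twice per day" rate.

```python
import math


def prob_no_crash(rate_per_day: float, days: float) -> float:
    """Chance of seeing zero crashes over `days` days.

    Assumes crashes follow a Poisson process with a constant average
    rate (an assumption made for illustration only).
    """
    return math.exp(-rate_per_day * days)


if __name__ == "__main__":
    # "Once or twice per day" from the post, taken as the two bounds.
    for rate in (1.0, 2.0):
        one_day = prob_no_crash(rate, 1)
        two_weeks = prob_no_crash(rate, 14)
        print(
            f"rate={rate:.0f}/day: "
            f"P(clean day)={one_day:.3f}, "
            f"P(clean 14 days)={two_weeks:.2e}"
        )
```

Under that assumption, a single crash-free day would still turn up roughly 14% to 37% of the time even if nothing had been fixed, while two crash-free weeks would happen by chance less than once in a million tries. So a two-week target is a sensible bar.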
 
Solution
May 26, 2022
1
0
10
I found this thread while searching for a similar problem, so I'll add my story to the pile. I put together a new workstation with older parts using an ASRock motherboard, a Ryzen 9 3900X, 32 GB of DDR4-3600 and a PowerColor RX 5700 XT GPU. I installed Windows 10 LTSC (21H2), experienced a lot of random reboots and crashes, and saw event log entries complaining of "CPU cache hierarchy" errors. I swapped a lot of parts without finding a resolution (motherboard, CPU, RAM, power supply). When I finally removed the PowerColor video card and swapped in a used GeForce GTX 980 from an old gaming machine, everything stabilized. No more reboots or errors in the event log.

tl;dr:
If you're using a Ryzen 3000-series CPU and a PowerColor RX 5700 XT or similar GPU, and seeing "cache hierarchy error" in the Windows event logs with random reboots, try swapping out the GPU for something else.