[SOLVED] Defective Ray Tracing Cores on my 3070?

Status
Not open for further replies.
Jul 3, 2021
I was lucky enough to nab a 3070 at my local Microcenter just over a month ago – an MSI Ventus 3x. I was floored by the performance, 60+ FPS at 1440p. Delicious. I was super excited to try out some ray-traced games. That's when the problems began.

I had played through Control on my previous RX 470. It ran well with tuned-down settings at 1080p. I maxed out everything on my new 3070. Within 5 minutes of gameplay, Control would freeze, forcing a hard close of the game through Task Manager. These sub-5-minute crashes happened every time, but only with RT enabled. "Weird," I thought, maybe just a buggy RT game. I set Control aside and didn't think much of it.

Then Metro Exodus Enhanced came along and I definitely wanted to play through that with all the RT bells and whistles. After installing it and booting it up, I got to the main menu. I eagerly maxed out all the settings and enabled DLSS, then started a new game. The new-game intro played. When it ended, the game attempted to load, instantly crashed back to desktop, and the game's error reporter popped up. Tried again. Same result. Over and over. I tried messing around with all the settings: DLSS on/off, lower resolution, etc. I contacted the Metro developer's customer support (very responsive, btw), followed all their suggestions, and troubleshot the game via Google searches. Nothing solved the crashes.

Among the fixes I attempted: adjusting the size of the page file, turning off all overclocks, turning G-Sync on/off, disabling any overlays, closing all other running applications, updating the VBIOS, setting PCIe to 3.0, performing a clean system start with no non-Windows services running, trying other cables (HDMI and DisplayPort) and ports, and more. Finally, I did a full clean reinstall of Windows. Despite all these attempted fixes, the problems with Control and Metro Enhanced persisted, and even got worse: Control would crash within a couple of minutes, and Metro crashed after the launch cinematic, right before the main menu would load.

All these issues only happened when RT features were enabled. Non-RT games, and games with RT features turned off, ran flawlessly. I began to suspect some kind of hardware issue with the RT features of the card. To try to confirm this, I got a few more RT-enabled games: Quake II RTX, Battlefield V, and Bright Memory: Infinite's RT benchmark. Sure enough, all had issues: Quake and Bright Memory would each crash in under two minutes. Battlefield V would crash instantly if RT was enabled. If it was disabled, the game ran perfectly.

To further corroborate my defective hardware theory, I removed the card and installed it into a completely different system. The issues with RT games were identical. That system normally runs a different RTX card, a 3060. When I put the 3060 back in, it had no problems whatsoever with RT.
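The swap test above is a simple fault-isolation exercise: find what every crashing run has in common, then check that no clean run shares all of it. Here is a minimal sketch of that reasoning in Python. The test results are the ones described in the post; the data layout and the function names (`shared_crash_factors`, `explains_all`) are hypothetical and only there for illustration.

```python
# Factors that varied between test runs (names are illustrative).
FACTORS = ("gpu", "system", "rt_enabled")

# Results as described in the post.
runs = [
    {"gpu": "3070", "system": "System 1", "rt_enabled": True, "crashed": True},
    {"gpu": "3070", "system": "System 1", "rt_enabled": False, "crashed": False},
    {"gpu": "3070", "system": "System 2", "rt_enabled": True, "crashed": True},
    {"gpu": "3060", "system": "System 2", "rt_enabled": True, "crashed": False},
]


def shared_crash_factors(runs):
    """Return the factor values that are identical across every crashing run."""
    crashing = [r for r in runs if r["crashed"]]
    if not crashing:
        return {}
    first = crashing[0]
    return {
        f: first[f]
        for f in FACTORS
        if all(r[f] == first[f] for r in crashing)
    }


def explains_all(shared, runs):
    """True if no clean run matches every shared factor at once."""
    if not shared:
        return False
    clean = [r for r in runs if not r["crashed"]]
    return not any(all(r[f] == v for f, v in shared.items()) for r in clean)


suspect = shared_crash_factors(runs)
print(suspect, explains_all(suspect, runs))
```

With these four runs, the only combination shared by every crash is the 3070 with RT enabled, and the system drops out because the crashes followed the card across both machines. That points at the card rather than proving a hardware defect, since a driver issue specific to that card could produce the same pattern.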

Have I missed any potential fixes? Is my next move to RMA the card? It would obviously suck to be without a GPU for a month or more, but I paid a premium for an RTX card and I want it to work.

The specs for the systems I've used with the card:

System 1: Ryzen 5 2600 (stock), 1TB NVMe SSD, 16 GB DDR4 RAM @ 2933 MHz, MSI X470 Gaming Plus MB, EVGA 650 Watt PSU

System 2: i5 10600K (stock), 1TB NVMe SSD, 16 GB DDR4 RAM @ 3600 MHz, Gigabyte Z590 AORUS Elite MB, EVGA 650 Watt PSU
 
Such a 'selective' GPU failure is certainly possible. I once had a card that ran every game fine as long as it did not use hardware 3D acceleration, but crashed instantly in any game that did. Your case looks very similar.
As for fixes, the only other possibility would be a driver problem. You haven't specifically mentioned (or I missed it) installing other versions of the NVIDIA drivers, which is worth trying even if the chance of success is low. I would also agree with rgd1101 that a 500W PSU might be a bit too weak (it also depends on exactly which model it is). While I don't think it's the cause of your problem, I do believe it might be unhealthy for your system in the long run.
 
I had not tried older drivers. Unfortunately, I just tried a legacy package (460.89) and the crashes persisted.

I actually just installed a new 650-watt PSU today as well. Though I was doubtful that the PSU was the issue, I was happy to use this GPU issue as an excuse to upgrade. Unfortunately, it did not affect the crashes.

Thanks for the suggestions. Any other ideas?
 
I was afraid you were going to say that. Haha.

Given how difficult it is to get a GPU in the first place, these days, I'm not looking forward to the time it takes to get an RMA replacement. Oof.
 
To close the loop on this:

I RMAed the card about a month ago. Initially, MSI shipped the card back to me about two weeks after I shipped it to them, saying they had "repaired an electrical issue" with the card. Unfortunately, regardless of whatever they did, the problem with ray tracing games persisted... almost instantaneous crashes whenever RTX was enabled.

Shortly after discovering that the problems still persisted, I called up MSI directly. Surprisingly, I got someone on the phone within a few minutes. I explained the issue to them and the very helpful support specialist issued a new RMA number and provided me with a pre-paid shipping label to send the card to them once again. He promised that the card would receive an expedited replacement, this time. Today, I received that replacement and I am happy to report that the card seems to be functioning perfectly, so far. I just tested all the RTX titles that had issues and they now have no apparent problems!

To sum up, the problem I encountered does seem to have been a hardware fault in the card's RT cores, and a replacement was the appropriate solution.
 