Question: Potential GPU issues?

SocialFish

Honorable
Dec 24, 2016
22
2
10,515
I recently purchased a used RTX 3070 from a miner on eBay (I asked him about the conditions the card had been run in and the info he gave seemed fine as far as I can tell; the card itself also looked fine). I've been using it for about 2 weeks with 0 issues. It does fine in all the benchmarks I ran, like FurMark and Unigine Heaven, though I only ran those for less than 30 minutes each, which is probably too short, and I will test for longer sometime soon.

Today, however, my monitor suddenly lost signal while I was playing Rainbow Six Siege, and I lost audio too. Holding the power button to force a shutdown did nothing, so I had to flick my power supply switch off and on and then turn the PC on multiple times (it took 4 attempts before it finally booted). After that the same thing happened after 5 minutes, but this time I was simply browsing forums. Again I restarted after multiple PSU switch flicks and the system booted, but the same issue happened after about 5 minutes of browsing forums (same as the previous time).

I then took out the card and put in my old GTX 1060 to see if any issues arose, and after using my system for 30 minutes there were no problems. I then reinstalled my 3070 and it has now been running Unigine Heaven for 30 minutes with no problems.

This is somewhat confusing and may have just been a seating problem (though I reseated the 3070 and its PCIe cables multiple times before testing with my 1060). However, my eBay buyer protection for the 3070 only covers me for another 2 weeks, so I'd like to identify whether the problem described above is an issue with the card itself or something else. How would I go about doing this, and what are the potential culprits here?

System specs:
CPU: Ryzen 5 5600 with stock cooler
GPU: MSI RTX 3070 Gaming X Trio
RAM: Corsair Vengeance LPX DDR4 2x8GB (16GB) 3200MHz
PSU: Corsair TX650M

Appreciate any help!
 

NaClKnight

Prominent
Dec 7, 2022
64
14
545
Was this the first crash you've suffered while using it? Any prior issues or distress?

Were you monitoring GPU/VRAM temps at the time?

Have you tried a different port on your GPU or monitor?

Keep your 3070 in for now as we want to replicate the problem as conclusively as possible.

The problem occurring twice sounds like an issue that shows up once the GPU/VRAM is hot, and synthetic loads like FurMark and Unigine don't push things in the same way as actual use.

However, having to flick the PSU switch off and on and needing 4 attempts before the PC finally booted, and then having the same thing happen again after 5 minutes of simply browsing forums, sounds like something else is going on other than the GPU. Even a GPU crash shouldn't require multiple MB/PSU cycles like that.

The TX650M comes with two separate PCIe power cables, right? Do you have both cables connected to the PSU, with one plug from each cable going to the card? Or just one cable with two daisy-chained plugs?
 

SocialFish

  • This is the first time that there have been issues of any kind with the card.
  • I tried a different port on the 3070 with the same issue.
  • I was not monitoring things at that time, but when I was testing multiple games before (R6S included) the GPU temps didn't go any higher than 65 °C in any game.
  • I'm using 2 separate PCIe cables to power the card.
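
Since nothing was being logged when the screen went black, one option is to write the GPU readings to a file so the last values before a crash are still there afterwards. Below is a minimal sketch, assuming `nvidia-smi` (which ships with the NVIDIA driver) is on the PATH. The function names, the log file name and the 5-second interval are arbitrary choices for illustration, and VRAM temperature is left out because it may not be reported for this card.

```python
import csv
import subprocess
import time

# Fields passed to `nvidia-smi --query-gpu`; this selection is only a suggestion.
FIELDS = ["timestamp", "temperature.gpu", "power.draw", "clocks.sm", "fan.speed"]


def parse_line(line):
    """Split one CSV line of nvidia-smi output into a {field: value} dict."""
    values = [v.strip() for v in line.split(",")]
    return dict(zip(FIELDS, values))


def read_gpu():
    """Query the first GPU once and return its readings as a dict of strings."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=" + ",".join(FIELDS),
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_line(out.splitlines()[0])


def log_forever(path="gpu_log.csv", interval_s=5):
    """Append one reading every interval_s seconds, flushing after each row."""
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        while True:
            writer.writerow(read_gpu())
            f.flush()  # hand the row to the OS right away
            time.sleep(interval_s)
```

Calling `log_forever()` in a terminal before starting a game and opening `gpu_log.csv` after the next black screen would show whether temperature or power draw was climbing just before it happened.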
 
To be safe, you might run DDU in safe mode to remove all of the NVIDIA drivers and then reinstall the latest driver to see if it still acts up. Also, if you have a friend with a gaming PC, you could ask them to test the card in their system to see whether it works for them.
 

SocialFish

The interesting thing is that I've now been running the card with no changes since the time of my post and it has been doing just fine.
 
I’m just saying let them know what happened with the card so they are aware that there is a potential issue. Then if it does it again you have a paper trail. But I still do wonder if you weren’t having a power supply issue since the power supply had to be reset so many times.
 

SocialFish

I've never had to reset the power supply more than once before so that is indeed slightly concerning. However, if it was a PSU fault then shouldn't it continue causing problems with the same load? Currently I'm not even sure how to replicate the issue because it just seems to have 'fixed itself', though that may well not be the case.
 

falcon291

Honorable
Jul 17, 2019
647
145
13,290
I faced the same problem with the new RTX 3070 I bought in November. The problem was only resolved after running DDU, installing the NVIDIA drivers, and updating some graphics-related files using Armoury Crate (yes, my GPU is an ASUS). If those hadn't worked, I would have reinstalled Windows 11.

The problem was the same, so maybe the solution is the same too.
 

SocialFish

Do you know specifically which files you updated with Armoury Crate? Thanks for the input.
 

SocialFish

I've tried DDU and reinstalling drivers, but the problem has returned. I'm also starting to think that the PSU may in fact be the culprit, so I've ordered a slightly higher wattage one to test with. I intend to return it if the issue persists, at which point I'd probably be looking at a Windows reinstall. I will update the thread if the issue arises again with the new PSU.
 

SocialFish

Update as promised. The new PSU did not in fact fix the issue, and the eBay money back guarantee was running out, so I've decided to just send the card back as defective since I don't have time to properly test it again. The issue seems to take some time to become apparent: when I installed the new PSU my system ran fine for 2 days. On the third day, I got a black screen while playing Wargame: Red Dragon (which is not a graphically demanding game) and the problems started again as before.
 

SocialFish

Update to the update. I put my old GPU back in, but after 5 minutes of my PC working fine the problem came back, which almost certainly means it was not a defective GPU. I shall do a Windows reinstall and see what happens, but any other ideas for what the issue could be are still appreciated.
 

SocialFish

Regarding the Windows reinstall: the issue has not been fixed and there are no changes in the way it happens. I suppose this means that a software issue can be ruled out as well.


It may not hurt to run a RAM diagnostic and check the health of your SSD to be safe.

Ran MemTest86 for 3 loops with all tests enabled, which returned no errors, so I guess the RAM should be OK. CrystalDiskInfo revealed that my HDD has some reallocated sectors, which is not a good sign, but since it's now empty (all data was backed up before reinstalling Windows) I don't think that should have any influence here. My SSD has 'Good' status at 94%; however, I'm not really sure how to interpret the output from the program and where exactly this 6% imperfection is. Here is the text output from CrystalDiskInfo for my SSD:

----------------------------------------------------------------------------
(02) SanDisk SDSSDH3 1T00
----------------------------------------------------------------------------
Model : SanDisk SDSSDH3 1T00
Firmware : 401100RL
Serial Number : 191619802412
Disk Size : 1000.2 GB (8.4/137.4/1000.2/1000.2)
Buffer Size : Unknown
Queue Depth : 32
# of Sectors : 1953525168
Rotation Rate : ---- (SSD)
Interface : Serial ATA
Major Version : ACS-4
Minor Version : ACS-4 Revision 5
Transfer Mode : SATA/600 | SATA/600
Power On Hours : 12188 hours
Power On Count : 1999 count
Host Reads : 70597 GB
Host Writes : 54944 GB
NAND Writes : 44658 GB
Temperature : 22 C (71 F)
Health Status : Good (94 %)
Features : S.M.A.R.T., APM, NCQ, TRIM, DevSleep, GPL
APM Level : 0080h [ON]
AAM Level : ----
Drive Letter : C:

-- S.M.A.R.T. --------------------------------------------------------------
ID Cur Wor Thr RawValues(6) Attribute Name
05 100 100 __0 000000000000 Reassigned Block Count
09 100 100 __0 000000002F9C Power On Hours
0C 100 100 __0 0000000007CF Power Cycle Count
A5 100 100 __0 00A511CC0D0D Block Erase Count (SLC)
A6 100 100 __0 000000000001 Minimum P/E Cycles
A7 100 100 __0 00000000003D Maximum Bad Blocks per die
A8 100 100 __0 000000000046 Maximum P/E Cycles
A9 100 100 __0 000000000255 Total Bad Block
AA 100 100 __0 000000000000 Grown Bad Blocks
AB 100 100 __0 000000000000 Program Fail Count
AC 100 100 __0 000000000000 Erase Fail Count
AD 100 100 __0 00000000002C Average P/E Cycles
AE 100 100 __0 000000000110 Unexpected Power Loss Count
B8 100 100 __0 000000000000 End-to-End Error Detection/Correction Count
BB 100 100 __0 000000000000 Reported Uncorrectable Errors
BC 100 100 __0 000000000002 Command Timeout Count
C2 _78 _57 __0 0039000B0016 Temperature
C7 100 100 __0 000000000000 CRC Error Count
E6 __6 __6 __0 064404280644 Media Wearout Indicator
E8 100 100 __4 000000000064 Available Reserve Space
E9 100 100 __0 00000000AE72 NAND GB Written
EA 100 100 __0 0000000135C2 NAND GB Written (SLC)
F1 253 253 __0 00000000D6A0 Total GB Written
F2 253 253 __0 0000000113C5 Total GB Read
F4 __0 100 __0 000000000000 Temperature Throttle Status
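
For anyone trying to read output like this: the RawValues column is hexadecimal, so most counters only need converting to decimal. The sketch below does that for a few rows copied from the table above. The part about the health figure is an assumption, not something the output states: it guesses that the 94 % is simply 100 minus the current value (6) of attribute E6, Media Wearout Indicator.

```python
# Raw values copied from the S.M.A.R.T. table above (hexadecimal strings).
RAW = {
    "Power On Hours": "000000002F9C",
    "Power Cycle Count": "0000000007CF",
    "Average P/E Cycles": "00000000002C",
    "Unexpected Power Loss Count": "000000000110",
    "Command Timeout Count": "000000000002",
    "NAND GB Written": "00000000AE72",
    "Total GB Written": "00000000D6A0",
    "Total GB Read": "0000000113C5",
}

# Convert each hexadecimal raw value to a plain decimal integer.
decoded = {name: int(raw, 16) for name, raw in RAW.items()}

# Assumption: health % = 100 - current value of E6 (Media Wearout Indicator).
MEDIA_WEAROUT_CURRENT = 6
health_percent = 100 - MEDIA_WEAROUT_CURRENT

for name, value in decoded.items():
    print(f"{name}: {value}")
print(f"Estimated health: {health_percent} %")
```

The decoded numbers line up with the summary block (12188 power-on hours, 1999 power cycles, 54944 GB host writes, 70597 GB host reads). If the assumption about E6 holds, the missing 6 % would reflect ordinary NAND wear rather than a fault, which fits the zero counts for reassigned blocks and uncorrectable errors.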
 

SocialFish

In the end, the thing that fixed it for me was clearing my CMOS. The issue has been gone for a while now but I just randomly remembered about the thread and thought I'd update for anyone reading this in the future.