Help! Graphics Lockup, unknown cause

Status
Not open for further replies.

Eredas

Distinguished
Sep 25, 2011
9
0
18,510
My wife and I are moderately PC literate and have built our own rigs from scratch. Even so, we're stumped.

Her comp locks up in multiple games -- League of Legends, Dead Island, Dragon Nest -- without warning. The lockup used to be a hard freeze, with the screen going solid black, grey, green, or some other random color, and the only way to recover was a manual reboot. The most recent crash was a bluescreen with nv4_disp.dll flagged as the culprit.

History:
Hard lock in LoL and Dragon Nest. Immediate lock in DN, lock after 30 minutes to four hours in LoL.
We replaced the graphics card, mobo, and HD (suspecting disk corruption).
New GPU is an Nvidia GTX 285 that was used in a previous PC with no issues. GPU stress test passed, max temp 85°C.
Memory error test passed with 1600% scanned.
Continued hard lock in LoL, though after a driver update there have been no crashes YET (update was a few days ago).
Installed Dead Island; hard lockup after 30 minutes of gameplay.
Opened the case, thinking heat might be the problem, and pointed a floor fan at the GPU. Dust is minimal and ventilation is unobstructed.
Approx. 2 hours of DI later, the bluescreen error described above.

I can post DXDiag if needed.

She has recent (May 2011) drivers. Dead Island was played with third-party bug fix software, with her graphics settings reduced.

Thank you for any help you can provide.
 

Eredas

Distinguished
Sep 25, 2011
9
0
18,510
We tried new drivers on the other card which we replaced. I think that upgrading the drivers is actually what caused this whole thing to start happening-- she was running perfectly fine before I updated them out of the blue. Immediately after updating, her system started with the hiccups. At that point, rolling back didn't help at all, which is what took us on our goose chase through so much hardware.

She's running WinXP with vastly newer hardware than that OS because we are shy on funds to pick up a new copy of Win7 (buying new hardware is partially to blame).

The version of the drivers we have now has worked for friends (who were helping us troubleshoot) for a while. We know these don't cause problems and are error free. Nvidia has always annoyed me with their buggy drivers. I'd rather wait a few months to make sure they're stable.
 

Eredas

Distinguished
Sep 25, 2011
9
0
18,510


Yes. When we swapped the HDD to another, we reformatted and installed a new copy of WinXP. Ran updates, installed old drivers (which have been replaced with those from May).

We have wiped the old drivers with CCleaner and DriverSweeper.
 

ltrazaklt

Distinguished
Mar 14, 2010
193
0
18,690
How old are the video cards? Do you know EXACTLY which make and model you have?
You say you replaced the graphics card, MB, and HD. What exactly did you put in?
What MB, make and model?
What video card, make and model? You said GTX 285, but is it Galaxy? XFX? EVGA?
Sounds like the cards are overheating OR it's a driver problem. Could also be a compatibility issue with the new mobo. Have you checked the BIOS settings on the MB?
Also the HD, what is it? Were the replacement parts new or used?
 

ltrazaklt

Distinguished
Mar 14, 2010
193
0
18,690
A quick search on this issue shows most people having it on older overclocked cards (some overclocked from the factory, even). They set them back to default clocks. But a LOT of different people report a LOT of different fixes, everything from BIOS settings to monitor refresh rates. I don't know if your card was new, or overclocked at all...
 

Eredas

Distinguished
Sep 25, 2011
9
0
18,510
Used card, but worked like a charm for the guy who had it-- he upgraded so he sent us this one hoping to help fix the problem. EVGA make.

Mobo is a new ASUS M5A88-v EVO
PSU: 730 watt
Monitor is a MAG flat screen at default settings.

If I remember right, the GTX 285 is a factory overclocked version of the GTX 275. I'll see about downclocking it to lower settings.

The stress test may have given a false negative for heat: the temp was still climbing at the end of the test, hitting 85°C moments before it ended, and didn't seem to plateau at all.
 

Do you know the guy personally? Did you see the card running when he had it?
 

ltrazaklt

Distinguished
Mar 14, 2010
193
0
18,690
Yeah, was this a local guy or eBay? Also, if the GPU stress temp was still rising, what test did you use? Have you tried FURmark for testing the GPU temp?
And yeah, did you see the card running for more than 20 mins? The guy that had it before you could have used the heck out of it (i.e. overheated it to death, then 'upgraded' when issues started coming in). EVGA has great warranties though (usually lifetime I think, though some are still standard 1-3 year stuff, and the warranty may not transfer to you, but if you know the guy, maybe he could send it back on warranty for ya).
Try FURmark and see if the temps steady out. Personally 85C is too high imo (though I know many cards will run just fine at that temp). If worse comes to worst, you might try opening up the card (remove the heatsink and re-apply thermal paste), though I would not do this unless the warranty has already expired.
First try FURmark and see what the temps say and what the temp does, i.e. plateau or just keep going higher and higher till lockup etc.
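The plateau-versus-climbing check can be sketched in a few lines of Python, assuming the temperatures are noted down at regular intervals during the run. The function name, the window of six readings, and the 1°C tolerance are illustrative choices only, and the sample readings are made up; none of this is something FURmark itself provides.

```python
def has_plateaued(temps_c, window=6, tolerance_c=1.0):
    """Return True if the last `window` readings vary by at most `tolerance_c`.

    temps_c: temperatures in Celsius, oldest first, sampled at a steady interval.
    """
    if len(temps_c) < window:
        # Not enough readings yet to judge either way.
        return False
    recent = temps_c[-window:]
    return max(recent) - min(recent) <= tolerance_c


# Made-up readings for illustration only.
still_rising = [70, 74, 78, 81, 83, 85]
leveled_off = [70, 80, 86, 89, 89, 89, 89, 89, 89]

print(has_plateaued(still_rising))  # False
print(has_plateaued(leveled_off))   # True
```

A run that ends while the check still returns False has not shown the card's real peak temperature, so a "pass" from such a run says little about heat.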
 

Eredas

Distinguished
Sep 25, 2011
9
0
18,510


I do know the guy personally. I never saw the card running, as he lives in another state, but I trust that he wasn't just trying to get rid of his card. He's the kind of guy who likes the newest, biggest thing -- the card was "lying around his office, taking up space" at that point.

I did use FURmark for the test, with the crazy fire-donut. I'll adjust the test settings and run it again.

Regarding seeing the card run for 20 minutes-- that's one of the problems. Sometimes it'll run for hours. Sometimes only minutes.
 

Is there any other machine that you could run it in?
 

Eredas

Distinguished
Sep 25, 2011
9
0
18,510
I'm trying FURmark with different resolution settings to try and stress the card more than the benchmark. After only 5 minutes at a smaller resolution (960x540) the card is hitting 89°C. Fan speed is only 53% though, and the temperature definitely seems to be leveling out (been ~89° for a solid minute now). I'm going to let it run for a while (as would happen in game) and see how the temperature behaves.

I don't know about the BIOS, I haven't modified BIOS on any of the new hardware.

Unfortunately we can't run the card in my PC because my PSU is significantly smaller than hers (550 I think?) and won't feed enough power to the card to allow boot up.

If the card's heat is the problem, what would be the proper course of action? Use software to modify fan speeds to keep the card cooler? Underclock the GPU (seems odd to do for a card built with overclocking specifically in mind)?
 

Eredas

Distinguished
Sep 25, 2011
9
0
18,510
With the FURmark test running continuously while I make these posts, the temperature has leveled off at 89°C (for the last 5 minutes), fan speed at 75%. Resolution: 1920x1080 @ 19 FPS
 

ltrazaklt

Distinguished
Mar 14, 2010
193
0
18,690
OK, 89C is too hot. It's listed as within the card's safe specs, but it's just too hot, especially for an extended period. EVGA has a Precision tuner that you can download to adjust the fan speeds. If possible, add an extra case fan and point it at the card. Most of the stuff I read said it was from the card being overclocked too much; apparently a lot of people are having that same issue with that make, model, and batch of cards. You want the temps below 80C if possible. Upping the fan speed might help (though noisier). Also, even though it's an OC card, try underclocking it a tad till it runs cooler and more stable. A LOT of people RMA'd those cards back to EVGA and got replacements that they said are working fine now. I don't know if that card was an FTW or OC or SSC or SC edition, but a lot of those are tweaked to max capacity and will start having issues like this after a long time or heavy use etc.

So: find the tuning utility from EVGA and try upping the fan speed to a noise level you can tolerate. See how much that drops the temps, and try to get them below 80C. You can also get a cheap case fan and mount/position it so it blows directly at the card too (every little bit helps, lol). You will have to look at your case for possibilities there.

But honestly it sounds like the card is overheating (and this would explain the random times: it might go 10 mins, might go 10 hrs before the lockup or the nv4_disp.dll stuff). Although if I were you, and you know the guy the card came from and he's a friend etc., I would ask him what the warranty was on the card, and if it's still in warranty ask him to RMA the card back to EVGA and get a new one (cheapest solution, though time consuming and you're without a card till it gets back). Also, EVGA has some kind of trade-up program. I don't know if that card qualifies, but it might be possible to trade that card back to EVGA and get a newer model/series for a discount. I don't know all the details of that program, but it might be worth looking into.

I think this is the page you need to find the software for your card:
http://www.evga.com/support/download/

hope this helps ...
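As a rough illustration of the 80C target, this Python sketch summarizes a list of temperature readings: the peak, and roughly how long the card sat above the target. The readings, the 5-second sampling interval, and the function name are all hypothetical, and nothing here comes from the EVGA tool; it only assumes you have a list of numbers copied from whatever monitoring utility you use.

```python
def summarize_temps(temps_c, target_c=80.0, interval_s=5):
    """Report the peak temperature and approximate seconds spent above target_c.

    temps_c: non-empty list of Celsius readings taken every `interval_s` seconds.
    """
    if not temps_c:
        raise ValueError("need at least one temperature reading")
    readings_over = sum(1 for t in temps_c if t > target_c)
    return {
        "peak_c": max(temps_c),
        "seconds_over_target": readings_over * interval_s,
    }


# Made-up readings for illustration only.
log = [72, 78, 81, 85, 89, 89, 88, 79]
print(summarize_temps(log))  # {'peak_c': 89, 'seconds_over_target': 25}
```

Comparing the summary before and after a fan-speed or clock change gives a simple way to see whether the change actually moved the card under the target.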
 
Solution

Eredas

Distinguished
Sep 25, 2011
9
0
18,510
Thanks for all the help, both of you.

We decided that using the EVGA software tool to downclock the card was the best route. We dropped the factory clock speed (we had an SSC version, clocked to 648 MHz) to 602 MHz. The temperature never exceeded 75°C while playing Dead Island for over an hour.
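For a sense of scale, the size of that downclock can be worked out with a short Python sketch. The function name is made up for illustration; the two clock values are the ones quoted above.

```python
def clock_reduction_percent(stock_mhz, new_mhz):
    """Return how much lower new_mhz is than stock_mhz, as a percentage."""
    return (stock_mhz - new_mhz) / stock_mhz * 100


# 648 MHz factory clock dropped to 602 MHz.
print(round(clock_reduction_percent(648, 602), 1))  # 7.1
```

So the fix amounts to giving up roughly 7% of the core clock.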

We could probably turn the clock up ~20 MHz and keep the card below 80°C, but my wife doesn't much care, so long as the graphics function. At this point, if it's not freezing up, I don't much care either.

The card was never warrantied, and the Trade-Up is only within 90 days of purchase. It was free to us, so this fix is good enough for us.

Thanks again!
 

ltrazaklt

Distinguished
Mar 14, 2010
193
0
18,690
Glad to hear it, hope it continues to function well. Yeah, a lot of those 'Special Edition' OC cards are just tweaked out to the max the card can handle, with very borderline stability and no safety net or margin for error (some a little too much). I never OC until my warranty is over, lol, I'm cheap. Less heat will also prolong the life of the card in general. If the issue still happens, let us know and we'll get it solved for ya. Enjoy.
 