Very strange graphics freeze problem: not hardware, not software?!!

karvala

Distinguished
Sep 19, 2009
11
0
18,510
Hi all,

I'm having a very strange problem with 3D graphics on a previously rock solid rig, after restoring my main XP partition from a True Image 8 disk image. This thread is essentially a follow-up to my previous thread (http://www.tomshardware.co.uk/forum/page-271462_13_0.html), which documents other problems with this restore in some detail. I've now got to the point where I can log in to Windows (a number of times now, always seems fine) and run all other apps etc. without any problems, though I honestly don't know how (I didn't do anything different from when it worked to when it didn't!). However, if I stress the 3D graphics at all, I get freezes and/or corruption. Things like Call of Duty 4 are a slideshow right from the first second until it crashes 30 seconds later. Things on lower settings, or less stressful things like Half-Life 2, seem to be fine, but if I push them too far then it always feels like they're about to fall apart too.

Now here are the weird bits, which make this a seemingly impossible problem to solve:-

(1) I've replaced almost all of the hardware in this machine, and extensively tested all of it, and it has not made any difference. This includes different graphics card, different memory, different motherboard, different power supply, different power outlet, different cables, all other cards removed, Prime95 torture test run (no problem), memtest and Windows memory diagnostic run (no problem).

(2) BIOS settings all checked/tested; no difference.

(3) Temperatures strictly monitored; under load the CPU and motherboard never get above 50C, and the graphics cards never above 75C with the fans on full. Previously, when the machine was running okay, I've run them over these temps without any problems, so I don't believe it's a heat issue at all, especially as the problem will occur right from the start of a 3D app, before temperatures have had the chance to reach a peak level. Conversely, turning the fans down and running something like Furmark to push the temperatures over these levels does not result in any problems.

(4) A clean install of XP also showed these problems after a while (not when first installed, but a couple of days later as more stuff was installed; it also then disappeared entirely with supposedly missing boot files, which were actually still present).

(5) Trying various driver versions makes no difference, and these problems are occurring on the same kit that had the same hardware and drivers installed with no problems before this.

(6) A couple of different disk images have been restored, and both show the same problems at the same time. The severity of the problem also seems to vary between different restoration efforts; in The Godfather II, for example, corruption/crashes would previously happen straight away, and the 3DMark06 Firefly Forest test would crash on highest resolution. Now, however, The Godfather II can run more or less okay except for regular freezes (see below for more details of the freeze) every 10-30 seconds, and the Firefly Forest test can run okay. There was no difference at all in the restoration procedure, the images or the disks used in these restorations, and yet the 3D problem severity has definitely changed between them.

(7) Initially while restoring the images, I also had a 64-bit Vista installation, that subsequently went down (in the process of trying to restore that at the moment). The same 3D apps running on that, and indeed all 3D apps, were absolutely fine. That also seems to suggest that it is not a hardware issue, or at least not a straightforward one.


For the freezes in The Godfather II, which is a good example, they occur seemingly randomly every 10-30 seconds, sometimes seemingly regardless of any significantly new textures being loaded (i.e. if I just stand in one place without moving). If there is significant movement during the freeze, it will cause graphics corruption and a slideshow effect; otherwise it is literally a freeze for a second or two, and then back to life again. It's also notable that during a slideshow/corruption freeze, if I alt-tab out and back again, the corruption is cleared and it seems fine until the next freeze.

I know that seems a very weird set of symptoms; I've never seen anything like it (and combined with the filesystem problems that are described in the other thread, it's just plain confusing/seemingly impossible). Does anyone have any insight, or suggestions as to what I could do that I haven't already tried, that might get to the bottom of it?


Finally, details of the current hardware setup, after changes during testing:-

Thermaltake Soprano Black case
OCZ ModXStream 700W power supply
Asrock P45XE Crossfire motherboard
Q6600 Quad Core (previously overclocked to 3.2GHz, but currently being tested at stock 2.4GHz)
4*2GB Corsair XMS2 DDR2 800MHz RAM (stock timings now)
2*Sapphire Radeon HD4850 512Mb PCI-E graphics cards (just one installed for now, though)
Creative X-Fi Elite Pro soundcard (currently not installed)
Texas Instruments 1394 network card (not actually used)
Samsung 22" 226BW monitor
 
Have you tried replacing the HDD? It seems to me like the symptoms go away when you reload images. Files may be getting corrupted over time, causing these errors.

Also open the side of the case and force air over the chipset. It could be failing, causing read/write errors.
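
If you want to test the idea that files are getting corrupted over time, one rough way is to hash a folder of files that should never change (game data, for example) right after a restore, then hash it again a few days later and compare. The Python below is only a sketch of that idea; the function names are made up for illustration and are not from any particular tool.

```python
import hashlib
import json
from pathlib import Path


def hash_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hash_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def save_manifest(manifest, out_path):
    """Write the manifest to disk as JSON so it can be compared later."""
    with open(out_path, "w", encoding="utf-8") as handle:
        json.dump(manifest, handle, indent=2, sort_keys=True)


def load_manifest(in_path):
    """Read back a manifest written by save_manifest."""
    with open(in_path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def compare_manifests(old, new):
    """Return (changed, missing, added) lists of relative paths."""
    changed = sorted(k for k in old if k in new and old[k] != new[k])
    missing = sorted(k for k in old if k not in new)
    added = sorted(k for k in new if k not in old)
    return changed, missing, added
```

Build and save a manifest straight after the restore, keep the JSON file somewhere safe (a USB stick, say), then build a fresh one later and pass both to compare_manifests. Files showing up as changed in a folder nothing should be writing to would point towards corruption. If the list of changed files differs between two runs done back to back, that would suggest the data is being damaged on the way off the disk rather than on the disk itself.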
 

jennyh

Splendid
You were using 2 graphics cards, so switch the power supply going into the one you have left to see if you get any issues. My friend had an issue with her second 4870 not being supplied enough power, and it caused a bunch of symptoms.
 

karvala

Distinguished
Hi guys, thanks for the replies. The case is actually open at the moment, and the temps are definitely okay. I've replaced the motherboard, and get the same symptoms on both (which have different chipsets), so I think that probably rules out a chipset problem. I've also tried three different HDDs, and get the same on each.

Files certainly seemed to be corrupted over time when the HDD is installed in the desktop; when attached to the laptop they seem to be okay. That, combined with the graphics corruption under stress, led me to suspect the power as well, so I changed the PSU to another one, but still had the same symptoms. Same applies to graphics cards (same symptoms with either), and PCI-E power connectors to the graphics cards (also same symptoms still).

I'm just in the process of switching out the CPU, and also testing a third graphics card not in the original system (only a rather bad one, though; it's the only one I can get hold of at present). The CPU and graphics card are literally the last components of the original system remaining (other than the case, I guess, but that really would be a stretch), so it'll effectively be a different machine after this. If the problem still shows up after that, then I really don't know what to make of it.