Unstable new build failing prime95 blend test

caesparktom

Apr 25, 2011
My new build is failing the Prime95 blend test. After 12 hours of blend w/ 8 threads it crashes. It does not report any errors and there is no BSOD; it just shuts down and reboots to the "Windows did not shut down properly" screen.

fyi, I posted problem on my old new build thread here. Flong has been helping me troubleshoot but suggested I start a new thread. (Troubleshooting starts near end of page two.)

History/Steps taken so far:

Assembled new build (my first build) without major incident. Actually, I should mention that initially the computer failed to boot with a chassis intrusion error. I reset RTC RAM and didn't have any problems after that. So, I booted up. Installed Win7 64-bit. Made no changes to BIOS and ran Prime95 blend test + Coretemp to monitor CPU temps. It crashed without errors after about 7 minutes--just shut down. I rebooted and confirmed that the p95 results log was empty, no errors. Temperatures at full load during the test were approx 53-57C.
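For anyone repeating the log check described above, here is a minimal sketch of how it could be scripted. The file name `results.txt`, its location next to the Prime95 executable, and the two marker strings are assumptions, not something confirmed in this thread; adjust them to whatever your Prime95 version actually writes.

```python
from pathlib import Path

# Assumed markers: strings Prime95 is believed to write on a failed torture test.
ERROR_MARKERS = ("FATAL ERROR", "Hardware failure detected")

def find_error_lines(log_text):
    """Return the log lines that contain any of the assumed error markers."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if any(marker.lower() in line.lower() for marker in ERROR_MARKERS)
    ]

def summarize(log_text):
    """Describe what the log says about the last run."""
    if not log_text.strip():
        return "log is empty: no errors were recorded before the crash"
    hits = find_error_lines(log_text)
    if hits:
        return f"{len(hits)} error line(s) found"
    return "log has entries but none match the assumed error markers"

if __name__ == "__main__":
    log_path = Path("results.txt")  # assumed location, next to prime95.exe
    text = log_path.read_text(errors="replace") if log_path.exists() else ""
    print(summarize(text))
```

An empty log after a sudden shutdown only says Prime95 did not record a calculation error before the machine went down; it does not by itself clear the RAM or CPU.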

Temperatures were a little higher than expected so I reseated the cooler and re-ran the blend test. This time it crashed after 5 minutes and temperatures were unchanged (53-57C).

I researched the difference between blend, sml fft, and lrg fft tests here. Of interest was the fact that the sml ffts test does not access memory as much as lrg ffts:

The "Small FFTs" test uses relatively small FFTs which can fit into the CPU cache. As a result, the small FFT test is the one which accesses your main memory the least but it still makes some memory accesses. Prime95 automatically creates a FFT size range which will fit into the L2 cache of your CPU.

The "In-place large FFTs" test uses relatively large FFTs which cannot fit into the CPU cache so this test accesses main memory a lot. It only accesses a relatively small amount of main memory because it runs the FFTs in-place so it accesses the same RAM over and over.

I realize that this is not the best way to test memory, but I decided to run the sml fft test because both times I ran the blend test, the first test it ran and crashed on was incidentally a large prime. So I ran it, and it ran for an hour without errors or crashing. I stopped it and concluded that it was a memory problem.

I checked the BIOS RAM settings (timing, frequency and voltage) and found that they were off by a bit. So I set them explicitly to manufacturer specs. Re-ran blend test. This time it ran 13 hours, produced no errors, but did still crash... again no BSOD, just a sudden crash. Here are some temp readings during the first hour or so.

time | c1 | c2 | c3 | c4
5:00 | 48c | 51c | 55c | 52c
5:13 | 52c | 54c | 58c | 55c
6:50 | 53c | 53c | 57c | 55c
.
.
.
Temps were pretty consistent for the duration of the test. Actually I turned the A/C off for a bit to see how temps reacted, and cores 1, 2 and 4 went up to ~60C and core 3 to 64C. But, it IS pretty hot here in Beijing without A/C. I should note that the crash occurred in the middle of the night when the A/C was off, so temps were probably around 60-64C before the crash. btw, should I be concerned that core 3 is always about 4C hotter than the other cores? btw, GPU temp also around 52C.
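As a rough sanity check on the "core 3 is about 4C hotter" observation, the three readings from the table can be averaged per core. This is only a sketch over the numbers posted above; the function names are made up for illustration.

```python
from statistics import mean

# Readings copied from the table above (degrees C, cores 1-4).
readings = {
    "5:00": [48, 51, 55, 52],
    "5:13": [52, 54, 58, 55],
    "6:50": [53, 53, 57, 55],
}

def core_averages(samples):
    """Return the mean temperature per core across all samples."""
    columns = zip(*samples.values())
    return [mean(col) for col in columns]

def hottest_core_delta(averages):
    """Return (1-based index of hottest core, its gap over the mean of the others)."""
    hottest = max(range(len(averages)), key=lambda i: averages[i])
    others = [t for i, t in enumerate(averages) if i != hottest]
    return hottest + 1, averages[hottest] - mean(others)

averages = core_averages(readings)
core, delta = hottest_core_delta(averages)
print(f"per-core averages: {[round(a, 1) for a in averages]}")
print(f"core {core} runs {delta:.1f}C above the average of the other cores")
```

With only three samples this says nothing about stability; it just confirms that the gap is steady across readings rather than a one-off spike.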

Anyways, a stock system w/ no overclock should have no trouble burning for 24+ hours, right? Not sure what to do now. Seems my system is just not quite as stable as it should be and not cooling quite as well as I would like. What do you suggest? I would like to overclock a bit, but if my stock system is already marginal in terms of stability I don't know that I have any room to push it.

Btw, this is a workstation build so stability is relatively important.

Specs:

MOBO: ASUS P8P67 WS REVOLUTION
CPU: i7 2600k
GPU: NVidia Quadro 4000
RAM: G.SKILL Ripjaws X Series 8GB (2 x 4GB) 240-Pin DDR3 SDRAM DDR3 1600 (PC3 12800) F3-12800CL9D-8GBXL --- x2(16GB)
SSD: OCZ vertex 3 (120GB)
HDD: SAMSUNG Spinpoint F3 HD103SJ 1TB ---x2(2TB)
DVD: Sony Optiarc AD-7240S
LCD: Dell UltraSharp U2410
Case: Corsair 650D
PSU: Corsair HX750
COOLER: Corsair H60

Steps I am considering:

1. replace ram w/ same ram or different ram(perhaps try different freq/timing)
2. upgrade cooler to H80 or H100.
3. jump out the window and take my new build with me

thanks for any help
 
Before you do #1, burn a memtest boot disk, boot from it and let it make a few passes testing the memory. You will probably have no issues, but it will be a better test of the memory. Also, try taking the memory down to DDR3-1333 speeds and ensure you are at 1.5 volts.
http://www.memtest.org/

Let us know the results... You chose quality parts and I wouldn't expect those types of problems at stock settings.

Skip the water cooling. You will do just as well with a solid air cooler. The Hyper 212+ linked below will do a great job as-is, and even better if a second 120mm fan is added in a push / pull configuration.
http://www.newegg.com/Product/Product.aspx?Item=N82E16835103065
 
Before you do #1, burn a memtest boot disk, boot from it and let it make a few passes testing the memory. You will probably have no issues, but it will be a better test of the memory. Also, try taking the memory down to DDR3-1333 speeds and ensure you are at 1.5 volts.

running memtest now... but what do you mean by "take the memory down to DDR3-1333 speeds and ensure you are at 1.5 volts"? --to check that psu is supplying correct voltage? but why take freq down to 1333 to check that?

Skip the water cooling. You will do just as well with a solid air cooler. The Hyper 212+ linked below will do a great job as-is, and even better if a second 120mm fan is added in a push / pull configuration.

ok, but are you just saying i paid too much for adequate cooling or do you think hyper 212+ will cool better and improve stability?

thanks
 


The P8P67 motherboard SHOULD be fine with 1.6 or 1.65 volt memory modules (or so some users report), but 1.5 is what is officially supported by Sandy Bridge. In your BIOS make sure the default voltage is being set correctly at 1.5v to match the memory modules. Taking the frequency down to 1333 is to further test and possibly point at a motherboard fault. Just another troubleshooting step...

Sorry... I overlooked that you already had the H60. My point was for performance per dollar, there is still no substitute for an air cooler (..with the Hyper 212+ being a good choice.).
 


Ran memtest for 2 passes with no errors. Not sure if that means the memory is good or just that the memory has not yet been proven to be bad. Should I run a few more passes? I still need to try turning the RAM freq down to 1333 and re-running Prime95 like you suggested.

Also, Proximon suggested I run FurMark to stress the GPU. I ran the burn test twice (xtreme burn-in + Post FX), 10 minutes each time. GPU temps leveled at 91C both times and the computer didn't crash, so I don't think the GPU is causing instability--at least not as a result of high temperatures, b/c GPU temps during Prime95 are in the 50-60C range.

I'll probably try swapping out memory next and just see if that solves problem. Any recommendations for this build? perhaps i will have better luck w/ corsair Vengeance™ — 16GB Dual Channel DDR3 Memory Kit (CMZ16GX3M4A1600C9)


thanks
 



The hyper 212 will NOT improve cooling over the H60. Every professional review lists the H60 as superior to the Hyper 212. Also Proximon does not think it is a temp issue with your mobo and CPU - your GPU reaching 91C seems too high though - check with others on this thread though.
 
Well, I found ONE post from a PNY tech that said 90C at full load with a Quadro 4000 was acceptable. The card itself just has poor cooling.

I doubt very much that the Quadro plays any part in a Prime95 crash after 12 hours though.

Let me recruit some more help here....
 



especially since prime95 does not fully load the gpu. gpu temps during p95 blend are about 60C not 90C. and if GPU was going to crash system, it probably would have done so during the furmark burn test.

btw, thanks for recruiting help. I really appreciate it.
 


Proximon, I had a GPU (ATI 5450) that did not cool well in a ZT Systems I-7 920 computer and it would function very well for several hours and then it would produce numerous BSODs. The symptoms of this failure look very similar to what I experienced with my GPU (in that computer I am not sure whether it is the GPU or poor case cooling). In this situation, the 650D is a top tier cooling case (I own it and my CPU temps stay under 40C most of the time, my GPU temp hovers between 50C - 60C under full load - I don't game; the mobo stays under 40C most of the time). The case is very good at cooling on high fans and yet the GPU is reaching 91C under load - this could be a defective GPU.

If the GPU is the problem, then it won't manifest until after several hours of heavy use. This is what is happening with this system. What do you think?
 



That's an excellent idea that would not cost anything to try. It could be that as the Quadro is pushed to its limits and nears the temperature ceiling, it becomes unstable and fails sometimes but stays working at other times. 91C is pushing the 110C ceiling.
 
Here we have a VERY good point from one of our great helpers:

Assuming the 'Option 1' Tri-Channel set, then do NOT use XMP, and instead manually set the Frequency, DRAM Voltage, and CAS Timings; sometimes QPI/VTT/VCCIO -> 1.20v helps.

Then I noticed 'F3-12800CL9D-8GBXL'. I've posted here a zillion times that 9 times out of 10 those will not run Rated in 4x4GB, and to use F3-12800CL8D-8GBXM IF you want DDR3 1600 MHz speed.

Solution 1 - Run the F3-12800CL9D-8GBXL at DDR3 1333 MHz and look at CPU-z 'SPD' for 667MHz CAS Timings and add Command Rate -> 2.

Solution 2 - RMA RAM and get 2 sets of F3-12800CL8D-8GBXM

Solution 3 - Get a 4x4GB Matched Set -> F3-12800CL9Q-16GBXL


I had failed to note that you bought two sets of that RAM. You should try each set separately, or just do as he suggests.
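For readers wondering why DDR3 1333 MHz shows up as 667MHz in CPU-Z: DDR memory transfers data twice per clock, so the clock is half the rated data rate. The sketch below works through that arithmetic and converts a CAS latency from cycles to nanoseconds. The CL9 value used for the 1333 case is only a placeholder, since the real 667MHz timings have to be read from the SPD tab as suggested above.

```python
def io_clock_mhz(data_rate_mts):
    """DDR transfers twice per clock, so the I/O clock is half the data rate."""
    return data_rate_mts / 2

def cas_latency_ns(data_rate_mts, cas_cycles):
    """Convert a CAS latency in clock cycles to nanoseconds."""
    return cas_cycles / io_clock_mhz(data_rate_mts) * 1000

# (data rate, CAS cycles); the 1333 entry uses a placeholder CL, not an SPD value.
for rate, cl in [(1600, 9), (1600, 8), (1333, 9)]:
    print(f"DDR3-{rate} CL{cl}: {io_clock_mhz(rate):.0f} MHz clock, "
          f"{cas_latency_ns(rate, cl):.2f} ns CAS latency")
```

The point of dropping to 1333 is that each cycle gets longer, which gives four modules more timing margin at the same CAS number.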
 
I have not read all posts, my blinders are on to the RAM. FEW people know about the 4x4GB issues so @Proximon isn't to blame - at all; 4x4GB is very rare. Personally, I'd get a 4x4GB Matched Set because they are guaranteed to work together.

You can try setting them manually: Frequency -> DDR3-1600MHz, CAS 9-9-9-24 + Command Rate -> 2, raise the DRAM Voltage -> 1.55v~1.60v, plus VCCIO -> 1.20v. Then 'see' what happens... 50/50.

Good Luck! :)
 
Solution
caesparktom, remove one of your 2 x 4 GB RAM kits and just run 8GB. Make sure that you have them in the correct slots for this setup. Run your computer under Furmark/Prime 95 and it should tell you if this is the problem. If it is the problem, you can always get a 4 x 4 GB matched set if you want to run 16 GB.

You might try 8GB to see if it meets your needs because reviews have shown that there is only a 1-3 % performance increase by going to 16 GB.

Good luck
 
Excitedly pulling two DIMMs as I type. If this works, I think I will just get a 16GB matched set rather than reducing RAM speed to 1333. Will corsair Vengeance™ — 16GB Dual Channel DDR3 Memory Kit (CMZ16GX3M4A1600C9) work for me? Strictly speaking, it's not listed on the QVL.

@flong I understand that performance increase is negligible, more concerned with not running out. I run very ram-intensive programs at same time. Typically have rhino, photoshop, illustrator, indesign, and 50 chrome tabs open. Have always had to tip toe around my available ram and switch to single tasking when files get too heavy.

@jaquith, great observation/solutions! thank you. setting ram timings manually almost stabilized my system so I am eager to try setting the DRAM Voltage to 1.55v~1.60v and VCCIO to 1.20v. Though will probably still get a 4x4 matched set in the end.:)
 
ran blend test with single 2x4 kit of ram installed. Ran 12 hours then froze.>< i'm at a loss once again. Can try the other 2x4 kit or just swap out for 4x4 corsair kit. These 12+ hour tests are killing me...
 


Changes:
DRAM freq 1600
DRAM Timing 9 9 9 24 2
DRAM Voltage 1.5V

everything else is default or auto

H60 pump is connected to pwr fan header (~4300 rpm)
H60 fan is connected to cpu_fan header (~1000 rpm)

That's all I have done.

I do notice turbo-mode is auto enabled and that "Ai overclock Tuner" is set to auto (vs manual or xmp). Does that mean it is adjusting my manual DRAM settings in real time? At the top of the BIOS/Ai Tweaker tab it reads

Target CPU Turbo-mode Speed: 3800MHz
Target DRAM Speed: 1600MHz


 
Reverse:
H60 pump is connected to pwr fan header (~4300 rpm)
H60 fan connect to cpu_fan header (~1000 rpm)


H60 pump -> CPU Header
H60 fan -> PWR Fan Header

BIOS:
AI OC Tuner -> Manual
DRAM freq 1600
DRAM Timing 9 9 9 24 2
DRAM Voltage -> 1.55V
VCCIO -> 1.20v
Q-Fan -> Disabled
Phase Control -> Optimized {~4.0GHz-4.3GHz; or high loads}, Extreme {>4.3GHz}
 
ok, i'll try and retest

Q, set phase control to optimize or extreme?

fyi, h60 manual reads:
"connect fan power to CPU_FAN header"

(btw, also have 3 unused CHA_FAN headers because chassis fans are connected to case's fan controller.)

 
The reason for the H60 manual's instruction is that most BIOSes regulate the pump's speed along with the CPU temp, and the end user doesn't know how to set it properly. I was looking for a BIOS setting for CPU Fan failure, which is why I like the pump on the CPU Fan header -- Beeps = Pump failure. I'm hoping the BIOS considers the CPU Fan failure as a H/W failure and you still get the Beep tones.

If NO OC but high load e.g. Prime95; Phase Control -> Optimized
If OC see above.
 
Perhaps the voltage bump was premature, b/c Ai OC Tuner has been set to auto this whole time. I haven't yet run a test with the following BIOS settings:

AI OC Tuner -> Manual
DRAM freq 1600
DRAM Timing 9 9 9 24 2
DRAM Voltage -> 1.5V

In fact, even with both 2x4 kits installed (unmatched 16GB), the system was almost stable (ran 13 hours blend) with these settings:

AI OC Tuner -> Auto
DRAM freq 1600
DRAM Timing 9 9 9 24 2
DRAM Voltage -> 1.5V

what do you think?
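With 12+ hour test runs, it helps to change as few things as possible between runs. A small sketch of how the setting lists in this thread could be compared is below; the dictionary keys are just the informal labels used in the posts, not names from any BIOS tool, and `changed_settings` is a made-up helper.

```python
# Settings as listed in the thread; keys are informal labels from the posts.
tested_almost_stable = {
    "AI OC Tuner": "Auto",
    "DRAM freq": "1600",
    "DRAM Timing": "9 9 9 24 2",
    "DRAM Voltage": "1.5V",
}
# Same as above, but with only AI OC Tuner switched to Manual (not yet tested).
untested_manual = {**tested_almost_stable, "AI OC Tuner": "Manual"}
# The settings suggested earlier in the thread.
suggested = {
    "AI OC Tuner": "Manual",
    "DRAM freq": "1600",
    "DRAM Timing": "9 9 9 24 2",
    "DRAM Voltage": "1.55V",
    "VCCIO": "1.20v",
    "Q-Fan": "Disabled",
}

def changed_settings(before, after):
    """Return {setting: (old, new)} for every setting that differs or is new."""
    keys = sorted(set(before) | set(after))
    return {
        key: (before.get(key), after.get(key))
        for key in keys
        if before.get(key) != after.get(key)
    }

print(changed_settings(tested_almost_stable, untested_manual))
print(changed_settings(tested_almost_stable, suggested))
```

The first comparison changes one setting and the second changes four, so a crash after the second run would be harder to pin on any single change.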



 
It tells me your BIOS is unstable, not from my settings, and needs to be updated.

The latest version is 1302. Prior to BIOS Flashing please use the 'Jumper' method as outlined in your manual on page 2-15. This fully resets the BIOS to Defaults. In addition, you cannot use any USB 3.0 ports, and only use the USB 2.0 ports on the back of the MOBO {I/O}.

See for additional help -> http://www.tomshardware.com/forum/289507-30-what-flash