CPU is failing, time for a new build.


nerwin

Aug 5, 2016
Hello! I'm new here and just came from another tech forum that has pretty much turned into a ghost town. No one there is willing to help out with my new PC config, so hopefully someone here can give me some advice, as I've been out of the PC building race for a little while and have just been focusing on my photography.

Anyways,

I've been getting a LOT of crashes from my current PC, which has Windows 10 Pro 64-bit installed. I managed to get someone to analyze the mini dumps, and they said the crashes are related to the CPU.

WHEA_UNCORRECTABLE_ERROR (124)
A fatal hardware error has occurred.
FAILURE_BUCKET_ID: 0x124_GenuineIntel_PROCESSOR_CACHE

Which tells me that there is something wrong with my processor. It still works right now, but the crashes are occurring more frequently than before. Sometimes it will just lock up with the sound on a loop and I have to manually restart it. It just isn't that reliable anymore, so I'm looking into options.
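
For anyone reading the dump output above, here is a minimal sketch of pulling the pieces out of the FAILURE_BUCKET_ID string. The three-field layout (bugcheck code, CPU vendor, component) is an assumption inferred only from the one string quoted in this post, and parse_bucket_id is a made-up helper name, not part of any debugger.

```python
def parse_bucket_id(bucket_id: str) -> dict:
    """Split a bucket ID such as '0x124_GenuineIntel_PROCESSOR_CACHE'.

    Assumption: the layout is <bugcheck>_<vendor>_<component>, which is
    inferred from the single example in this thread.
    """
    # Split on the first two underscores only, so the component name
    # (which itself contains an underscore) stays in one piece.
    code, vendor, component = bucket_id.split("_", 2)
    return {
        "bugcheck": int(code, 16),  # hex string -> integer
        "vendor": vendor,
        "component": component,
    }


info = parse_bucket_id("0x124_GenuineIntel_PROCESSOR_CACHE")
print(hex(info["bugcheck"]), info["vendor"], info["component"])
```

The component field is the part that points at the processor cache rather than, say, a driver.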

My current setup is:

CPU: Intel Core i5-3570K
MOBO: ASUS P8Z77-V LK
RAM: G. Skill Ripjaws 16GB DDR3 2133 CL9 (4x4gb)
VIDEO: EVGA GeForce GTX-750 Ti SC 2GB
PSU: Corsair TX650M
CPU COOLING: Cooler Master Hyper 212 EVO/w Noctua Fan
SSD: Samsung 128GB 840 Pro

Picture of the setup

I had the CPU overclocked to 4GHz for a couple of years and had no issues, but changed it back to stock settings to prolong its life. Crashes are not as frequent as they were, or so it seems anyway.

I don't know for an absolute FACT that my processor is failing, but every mini dump that is created is related to the CPU. I ran Intel's CPU diagnostic tool, which passed, and I ran a stress test on the processor, which also passed.

The New Build

Keep in mind that I don't do any gaming (or not much of it anyway). I mainly do a lot of photo and video editing.

I think I can reuse my PSU, video card and SSD in the new build. Those are things I can always upgrade after the fact. The video card is less than a year old and works fine for what I need.

So I'm looking at these parts:

CPU: Intel Core i5-6600K 6M Skylake - http://www.newegg.com/Product/Product.aspx?Item=N82E16819117561

MOBO: ASUS Z170-A LGA 1151 - http://www.newegg.com/Product/Product.aspx?Item=N82E16813132566

RAM: CORSAIR Vengeance LPX 16GB (2x8gb) DDR4 3000 CL15 - http://www.newegg.com/Product/Product.aspx?Item=N82E16820233863 (need that lower profile ram for that massive cooler, lol).

One question about the RAM though. Since the DDR3 2133 CL9 RAM I currently have is pretty good and it's fast, wouldn't the DDR4 3000 with CL15 be slower than what I'm using now?
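
A rough way to compare the two kits is to convert CAS latency from clock cycles into nanoseconds and look at peak bandwidth alongside it. The sketch below assumes the usual rule of thumb that DDR memory makes two transfers per clock (so one clock lasts 2000 / data rate in ns) and a dual-channel setup with a 64-bit (8-byte) bus per channel. The function names are made up for illustration, and real-world performance depends on more than these two numbers.

```python
def cas_latency_ns(data_rate_mts: float, cas_cycles: int) -> float:
    """CAS latency in nanoseconds.

    DDR makes two transfers per clock, so one clock period is
    2000 / data_rate nanoseconds when the rate is in MT/s.
    """
    return cas_cycles * 2000.0 / data_rate_mts


def peak_bandwidth_gbs(data_rate_mts: float, channels: int = 2,
                       bus_bytes: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s (assumes 64-bit channels)."""
    return data_rate_mts * bus_bytes * channels / 1000.0


kits = [("DDR3-2133 CL9", 2133, 9), ("DDR4-3000 CL15", 3000, 15)]
for name, rate, cl in kits:
    print(f"{name}: {cas_latency_ns(rate, cl):.2f} ns CAS latency, "
          f"{peak_bandwidth_gbs(rate):.1f} GB/s peak (dual channel)")
```

By this estimate the DDR4 kit has slightly higher CAS latency in absolute terms (about 10.0 ns against about 8.4 ns) but noticeably more peak bandwidth (48.0 GB/s against roughly 34.1 GB/s).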

Will there be any increase in performance by doing this upgrade? Because if there isn't much then I might as well just go with Haswell/Canyon setup and save myself $150.

If my CPU wasn't failing, I wouldn't be making this post because I'd still continue to use the 3570K, but apparently it's failing, so I might as well build a new system.

I'm pretty sure I can resell the old parts and get some money back. The CPU would have to be sold as "for parts, not working", but the RAM is good and the mobo is still working well. I'd also be upgrading to the Fractal Design R5, or perhaps the Define S, but since I have a couple of internal HDDs, I might as well get the R5 for not much more.

Any thoughts, recommendations or advice anyone is willing to share? It would REALLY be appreciated.

Thanks in advance!

EDIT: If anyone is bored and wants to take a look at my mini-dumps to see if my friend is wrong, go for it. I zipped the dumps up to my OneDrive. Download Here.

If it's something that could actually be fixed with software, you just saved me a boatload of money! But I think it's hardware because this issue has occurred on Windows 7, Windows 8.1 and now Windows 10. No matter how many times I've reinstalled the OS, it still happens.
 
Solution
That's not entirely bad, I've seen worse, and Corsair units are very good about dying quietly, with no big explosions or trashing entire PCs. But with the age of the PSU and its line, then yeah, it's probably coming towards the end of its useful life. I don't see any real reason for a rush job, but I'd start looking for a good replacement. Unless you see one on sale, which is a bonus many don't get when they need a replacement right now.


So I probably should just get a new cpu, mobo, ram and psu?

I'm really sick and tired of these problems, I need something reliable so I don't have to worry about my system crashing during video and photo editing. Don't make me switch to Mac. Lol
 
Then I would advise:
--> set manual CPU voltage to 1.10V or 1.15V (you're definitely safe up to 1.20V).
if it still crashes, revert to auto and
--> test the RAM sticks in pairs. Remove the 2 secondary sticks and run with only 2 sticks installed (8GB should still be plenty to work with for a while).
if it still crashes, try the other pair of RAM sticks
if it still crashes, swap the PSU or buy a new PSU
if it still crashes, throw it out the window and buy a new PC...
 
So I'm not sure what to do about the CPU voltage. I set it to Offset Mode and the CPU voltage dropped to 0.960v.

I have no clue how to get it to display 1.15v. Right now it's at 1.016v. Should it read 1.150?
 
Ok, I popped it into Manual Mode and entered 1.150...not sure if that's right or not. But it's reading 1.12-1.136v now.

160805211209.bmp


 


If there was a fatal issue with my CPU, it should not let the stress test pass, right?

So maybe this WHOLE time my crashes were related to a low vcore?
 
Well I did a 15 minute stress test using Real Bench and it passed.

Temp only reached 58c..but it stayed mainly around 53-56c.

I can try a 30 minute one next.
 
I'll see how it goes. But I kind of like...want to build a new computer anyways. But if I can limp mine along longer, I might just go for an i7.

So after bumping my voltage back up to regular levels, the computer seems to be performing a little faster. Start-up times are faster than before and just logging into Windows itself is faster.

I tried doing research about low voltage on CPUs and performance issues but didn't really find much.

So can low voltage such as mine at like 1.008v or below cause performance issues?

EDIT:

It seems faster probably because I did this, lol.

oced.png


Here's what the AIDA64 Sensor Report says, if it's helpful.

--------[ Sensor ]------------------------------------------------------------------------------------------------------

Sensor Properties:
Sensor Type Nuvoton NCT6779D/5535D (ISA 290h)
GPU Sensor Type Diode (NV-Diode)
Motherboard Name Asus P8Z77-V LK
Chassis Intrusion Detected No

Temperatures:
Motherboard 27 °C (81 °F)
CPU 25 °C (77 °F)
CPU Package 35 °C (95 °F)
CPU IA Cores 35 °C (95 °F)
CPU GT Cores 33 °C (91 °F)
CPU #1 / Core #1 35 °C (95 °F)
CPU #1 / Core #2 34 °C (93 °F)
CPU #1 / Core #3 34 °C (93 °F)
CPU #1 / Core #4 35 °C (95 °F)
PCH Diode 47 °C (117 °F)
GPU Diode 27 °C (81 °F)
Samsung SSD 840 PRO Series [ TRIAL VERSION ]
WDC WD2003FZEX-00Z4SA0 [ TRIAL VERSION ]
ST2000DM001-1E6164 [ TRIAL VERSION ]

Cooling Fans:
CPU 408 RPM
Chassis #1 678 RPM
Chassis #2 760 RPM
Chassis #3 692 RPM
GPU 40%

Voltage Values:
CPU Core 1.152 V
+3.3 V 3.312 V
+5 V 5.000 V
+12 V [ TRIAL VERSION ]
+5 V Standby 5.164 V
VBAT Battery 3.344 V
GPU Core [ TRIAL VERSION ]

Power Values:
CPU Package 11.76 W
CPU IA Cores 5.89 W
GPU TDP% [ TRIAL VERSION ]
Debug Info F 678 408 760 692 0
Debug Info T 27 33 255 / 255 25 / 27 33 83 225 94 03
Debug Info V 90 7D CF CF 7D FF 30 (03) / 90 7D CF CF 7D FF 30 D5 / D1 83 EA 20 7D 7D 1C 00
Debug Info I C1 C562
 
Please tell me I did something right.

I just ran RealBench stress test for 1 hour and it passed.

The room temp was 75f and core 1 reached 64c as the highest temp, core 3 reached 61c and other 2 cores stayed around 57c. So average was about 58c.

I monitored the vcore and it dropped a little during the stress test but if I remember right, that is normal and called vdroop I think.

Not bad for being overclocked at 4.2ghz right?

All the fans worked as they should, ramped up when it got hotter. The PWM Noctua was running around 950RPM the whole time and then after the test it went back down to its normal running speed around 450-500RPM.

If there was something fatally wrong with my 3570K, you would think this stress test would cause the system to crash right?

Should I run the stress test even longer? Or is 1 hour plenty?
 
Normally I'd think that 1 hr would be good, for middling stability tests between OC bumps for sure. But you've a history of unexplained bluescreens and cpu errors. I say throw the works at it, full 8 hours. If your pc can pass that, not crash, it would be very reasonable to think she'll be able to game without crashing. Do it overnight, or during work/school etc. It may seem excessive to some, but the aim of stability testing is to drive the pc harder than you ever possibly could in games, and have it succeed.

Results so far are very encouraging though, and temps look good. Not exactly thrilled about 11.76v on the 12v rail, if you can get ahold of a digital multimeter I'd test the wires on one of the molex plugs when the pc is heavily loaded. Orange wires should read @3.3v, red wires should read @5v, yellow wires should read @12v and black, of course, is ground. It is entirely possible that the software has misread (often does) the results.
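
Once multimeter readings are in hand, they can be checked against an allowed window. This is a minimal sketch, assuming the commonly cited ATX tolerance of ±5% on the +3.3V, +5V and +12V rails. The sample readings are just the figures quoted in this thread (3.312 V and 5.000 V from the sensor report, 11.76 V as discussed above), not real multimeter measurements, and the function names are made up for illustration.

```python
# Nominal rail voltages; the ±5% tolerance is an assumption based on the
# commonly cited ATX figure for the positive rails.
NOMINAL_VOLTS = {"+3.3V": 3.3, "+5V": 5.0, "+12V": 12.0}


def rail_limits(rail: str, tolerance: float = 0.05) -> tuple:
    """Return the (low, high) acceptable voltage for a rail."""
    nominal = NOMINAL_VOLTS[rail]
    return nominal * (1 - tolerance), nominal * (1 + tolerance)


def rail_in_spec(rail: str, measured_volts: float,
                 tolerance: float = 0.05) -> bool:
    """True if the measured voltage falls inside the tolerance window."""
    low, high = rail_limits(rail, tolerance)
    return low <= measured_volts <= high


# Figures quoted in this thread, used only as sample input.
readings = {"+3.3V": 3.312, "+5V": 5.000, "+12V": 11.76}
for rail, volts in readings.items():
    low, high = rail_limits(rail)
    status = "within" if rail_in_spec(rail, volts) else "outside"
    print(f"{rail}: {volts:.3f} V is {status} {low:.2f}-{high:.2f} V")
```

Under that assumption the +12V window is roughly 11.40-12.60 V, so a reading measured under heavy load is the one worth comparing against it.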
 



Ok...I'll give 8 hours a shot. If that doesn't fail, then it can't be the hardware.

I can live without my PC for 8 hours I guess. I'll just use my Samsung Tab S2 in the meantime, but I'll keep an eye on the temps and if they get to a concerning level I can stop it. But it should be smart enough to shut down by itself anyways if it's too extreme.
 


Hmm. I did notice the 12v jump between 11.9 and 12v during the stress test, but I'd figured that was normal.

It's a PITA to get the side panel off this Fractal Design Arc Midi due to poor manufacturing. Should have returned it when I got it, but I didn't want to delay my PC build any further. I'm tempted to go buy the Define R5 right now and be done with it lol.

Maybe Monday I can do the multimeter test, that is if it passes this stress test. I hope I'm not computerless by the end of today.




 
11.7 is a little on the low side for a 12v rail. It's just in the way things work. Ohm's law. Resistances in, say, a GPU won't change, it'll be the same resistance at any given point at any given time. If the voltage is low, however, the current will rise to match the necessary power. And there's very little tolerance for current changes. I just have a slight worry with the 12v rail simply because it is an older TX and pretty much 3-4 years is good. After that it starts to be pot luck. Some ppl seem to get great units that last forever, but caps do degrade, and you will see drops in performance. That's part of the reason to get an outsized unit, so performance after a few years is still well beyond what's necessary. A good PSU also helps a lot with stability. Having decent ripple control is vital in that respect, and as caps degrade, so does ripple control. This is mostly an FYI, but you might want to start giving some thought to a replacement, although it's not a vital hurry like having some no-name, cheaply made Taiwan junk.
And, as I said, the software could be wrong. It often is, you might actually be getting 12.1v, which would be great. That's the reason for the multimeter. It won't lie.
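
To put a number on the "current rises when voltage sags" point, here is a small sketch using I = P / V. It assumes the component behaves as a constant-power load, which is a simplification, and the 150 W figure is a hypothetical load chosen only for illustration.

```python
def current_amps(power_watts: float, volts: float) -> float:
    """Current drawn by a constant-power load: I = P / V."""
    return power_watts / volts


load_w = 150.0  # hypothetical 12 V load, not a measured value
baseline = current_amps(load_w, 12.0)

for volts in (12.1, 12.0, 11.7):
    amps = current_amps(load_w, volts)
    change = (amps / baseline - 1) * 100
    print(f"{volts:.1f} V -> {amps:.2f} A ({change:+.1f}% vs 12.0 V)")
```

For this hypothetical load, dropping from 12.0 V to 11.7 V raises the current from 12.50 A to about 12.82 A, an increase of roughly 2.6%.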
 


Well I'm tempted to just get a new PSU anyways and rule that out. I was thinking of getting the Corsair RM750x just to be on the safe side. It's a little more than what I need, but it has a good rating and a good warranty. So I should be set when I decide to upgrade to the next generation.
 
The RMx are extremely good units. A 650 would still be more than you could need, is probably cheaper, and would put you closer to the 50-70% load range where efficiency is best, so it's cheaper on the electric bill over the years. Unless you plan on SLI, then you'd need the 750 or maybe the 850.
 


Will 650 be enough to add a single powerful GPU and power my two mechanical drives?

I should have mentioned that I have dual 2TB drives too. My bad.
 
Easily. HDDs at max only soak up @6-10w, so even both HDDs in RAID running flat out is @20w or less. Fans run about 3-5w on average. Pretty much 100w for the entire system, 200w for a high OC CPU, 375w for a 2x8pin + PCIe GPU. Those are absolute maximums, which are almost physically impossible to reach: running a GPU at 100% load with a CPU at 100% load and RAM, HDDs, fans etc. all at 100% never happens. In reality even heavy game usage on my 3770K at 4.6GHz has never passed 55%, my 4 fans are never beyond @50%, the GTX 970 GPU has reached 99%, 16GB of RAM might reach 60%, and the HDD and SSD are never both flat out thanks to temp caches etc., so my whole system under heavy gaming load might reach 350w or so on a 550w PSU.

650w will power a Skylake i7-6700K at a high OC with a GTX 1080, 32GB of RAM, several HDDs and SSDs, multiple fans, and extremely heavy gaming loads, and still have @100w+ of leftover headroom.
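
The sizing argument can be sketched as a simple load-percentage calculation. The 350 W draw is the figure quoted above for that poster's own system under heavy gaming load, the 50-70% band is the range mentioned earlier in the thread, and the helper names are made up for illustration. Actual efficiency curves vary by PSU model, so treat this as a rough guide only.

```python
def load_percent(draw_watts: float, psu_watts: float) -> float:
    """System draw as a percentage of the PSU's rated capacity."""
    return draw_watts / psu_watts * 100.0


def in_target_band(draw_watts: float, psu_watts: float,
                   low: float = 50.0, high: float = 70.0) -> bool:
    """True if the load falls inside the 50-70% band cited in the thread."""
    return low <= load_percent(draw_watts, psu_watts) <= high


draw_w = 350.0  # heavy-gaming figure quoted above, not the OP's system
for psu_w in (550, 650, 750):
    pct = load_percent(draw_w, psu_w)
    verdict = "inside" if in_target_band(draw_w, psu_w) else "outside"
    print(f"{psu_w} W PSU: {pct:.1f}% load, {verdict} the 50-70% band")
```

With that draw, the 550 W and 650 W units land inside the band (about 63.6% and 53.8%), while the 750 W unit sits just below it at about 46.7%.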
 


Oh. I guess I'll be safe then haha. Amazon is selling the RM650x for $99, not bad. Though it's only 20 bucks more for the 750w. I'm still going to check my current one using the multimeter and see what readings I'm getting from the 12v rail.

Is there any special way to do it? My cables are really tidy, so can I just use one of the spare modular cables to test with? I thought about letting RealBench do a benchmark or a 15 minute stress test while I test the PSU.

If my readings are good, there's not really any point in replacing the PSU, but it would give me peace of mind anyways.

I'm 3 hours in on the stress test and everything is good so far. Temps won't even go over 65c. It's looking promising.
 
Well, bummer. This just happened.

wp-1470510693022.png


It's still going though. I don't want to hit close program because I'm afraid it will end the stress test.

This has got to be a sign that something is wrong.
 


Should I ignore it and close it or just let it continue as is?
 


It will need to wait for another time then. Maybe I'll let it run while I sleep tonight.

I might as well check my psu now.