Random blue screens about 1 to 2 times a day! Help!

John Labrier

Reputable
Mar 5, 2014
28
0
4,530
So I need help figuring out what's wrong with my PC.

[Attached image: post-72909-0-52179300-1426506847.jpg]


My system works perfectly fine, except that I get this blue screen of death at random times, anywhere from 1 to 3 times a day. It just happens out of nowhere!
I haven't added any new hardware, and the only new software I added was SpeedFan and Minecraft.

I did, however, water cool my system for the first time. I'm guessing I must have damaged something during the process, but it is hard to tell what, because apart from the blue screens my system works perfectly fine.

Can somebody please help me out? I will forever be in your debt!
 

olafgarten

Distinguished
Dec 16, 2011
301
0
18,860
Here is a troubleshooting guide from a Microsoft MVP on the matter...



Stop 0x124 is a hardware error
If you are overclocking try resetting your processor to standard settings and see if that helps.
If you continue to get BSOD here are some more things you may want to consider.
This is usually caused by heat, defective hardware, memory, or even the processor, though it is "possible" (but rare) that it is driver related.

Stop 0x124 - what it means and what to try
Synopsis:

A "stop 0x124" is fundamentally different to many other types of bluescreens because it stems from a hardware complaint.
Stop 0x124 minidumps contain very little practical information, and it is therefore necessary to approach the problem as a case of hardware in an unknown state of distress.


Generic "Stop 0x124" Troubleshooting Strategy:

1) Ensure that none of the hardware components are overclocked. Hardware that is driven beyond its design specifications - by overclocking - can malfunction in unpredictable ways.

2) Ensure that the machine is adequately cooled.
If there is any doubt, open up the side of the PC case (be mindful of any relevant warranty conditions!) and point a mains fan squarely at the motherboard. That will rule out most (lack of) cooling issues.

3) Update all hardware-related drivers: video, sound, RAID (if any), NIC... anything that interacts with a piece of hardware.
It is good practice to run the latest drivers anyway.

4) Update the motherboard BIOS according to the manufacturer's instructions.
Their website should provide detailed instructions as to the brand and model-specific procedure.

5) Rarely, bugs in the OS may cause "false positive" 0x124 events where the hardware wasn't complaining but Windows thought otherwise (because of the bug).
At the time of writing, Windows 7 is not known to suffer from any such defects, but it is nevertheless important to always keep Windows itself updated.

6) Attempt to (stress) test those hardware components which can be put through their paces artificially.
The most obvious examples are the RAM and HDD(s).
For the RAM, use the built-in memory diagnostics (run MDSCHED) or the 3rd-party memtest86 utility to run many hours' worth of testing.
For hard drives, check whether CHKDSK /R finds any problems on the drive(s), notably "bad sectors".
Unreliable RAM, in particular, is deadly as far as software is concerned, and anything other than a 100% clear memory test result is cause for concern. Unfortunately, even a 100% clear result from the diagnostics utilities does not guarantee that the RAM is free from defects - only that none were encountered during the test passes.

7) As the last of the non-invasive troubleshooting steps, perform a "vanilla" reinstallation of Windows: just the OS itself without any additional applications, games, utilities, updates, or new drivers - NOTHING AT ALL that is not sourced from the Windows 7 disc.
Should that fail to mitigate the 0x124 problem, jump to the next steps.
If you run the "vanilla" installation long enough to convince yourself that not a single 0x124 crash has occurred, start installing updates and applications slowly, always pausing between successive additions long enough to get a feel for whether the machine is still free from 0x124 crashes.
Should the crashing resume, obviously the very last software addition(s) may be somehow linked to the root cause.
If stop 0x124 errors persist despite the steps above, and the hardware is under warranty, consider returning it and requesting a replacement which does not suffer periodic MCE events.
Be aware that attempting the subsequent hardware troubleshooting steps may, in some cases, void your warranty:

8) Clean and carefully remove any dust from the inside of the machine.
Reseat all connectors and memory modules.
Use a can of compressed air to clean out the RAM DIMM sockets as much as possible.

9) If all else fails, start removing items of hardware one-by-one in the hope that the culprit is something non-essential which can be removed.
Obviously, this type of testing is a lot easier if you've got access to equivalent components in order to perform swaps.

Should you find yourself in the situation of having performed all of the steps above without resolving the symptom, unfortunately the most likely reason is that the error message is literally correct: something is fundamentally wrong with the machine's hardware.
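The "add things back slowly" idea from step 7 can be written down as a small Python sketch. This is only an illustration of the procedure, not a tool: the function name, the item names, and the `crashes_with` callback are all hypothetical, and in practice the "callback" is you running the machine for a while after each addition and watching for a 0x124.

```python
from typing import Callable, List, Optional, Sequence


def find_first_suspect(
    additions: Sequence[str],
    crashes_with: Callable[[List[str]], bool],
) -> Optional[str]:
    """Add items one at a time and return the first addition after which
    crashes resume, or None if the system stays stable throughout.

    crashes_with is a hypothetical stand-in for "run the machine long
    enough with this set installed and see whether a 0x124 occurs".
    """
    installed: List[str] = []
    for item in additions:
        installed.append(item)
        if crashes_with(list(installed)):
            return item  # the very last addition is the prime suspect
    return None


# Illustrative run with made-up item names and a made-up observation.
observed = lambda installed: "gpu-tuning-tool" in installed
print(find_first_suspect(
    ["windows-updates", "chipset-driver", "gpu-tuning-tool", "game"],
    observed,
))  # gpu-tuning-tool
```

If the function returns None for everything you normally run, that points back at the hardware steps (8 and 9) rather than at software.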
 

John Labrier



Sigh... I hate it when I get problems that don't have clear-cut solutions.
I've done most of this already: I removed all overclocks and still get blue screens, ran memcheck with no errors, and ran an error check on my SSD with the auto-fix option, and still get blue screens. My machine is water cooled and I always monitor the temps; none of them are anywhere near too high. I always keep my drivers updated, and the BIOS is up to date.

One thing I failed to mention is that some of the SATA 3 ports stopped working for some reason. Could that be connected to the BSODs?
 
You can put your memory dump on a server and post a link. I can dump what the CPU is complaining about.
Most of the time, it is an error in the cache RAM inside the CPU; the CPU then shuts down the system with a bugcheck.

Most often the cause of the cache RAM error is a voltage fluctuation to the CPU memory controller. Sometimes it comes from an overclocked GPU drawing too much power from the motherboard PCIe slot (always check the GPU's external power connections in this case).

The memory dump will show the system uptime. If the uptime is short (5 to 10 seconds), then the power to the CPU dropped too low and the CPU reset, but the power supply was not stable before the CPU restarted. (This is a bug in the power supply's logic circuit; cheap power supplies fake the circuit.)

If the uptime is longer, look at your temps and make sure you do not have GPU or CPU overclocking software running.
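The uptime rule of thumb above, written as a rough Python sketch. The 10-second cutoff is taken from the "5 to 10 seconds" figure; the function name and the returned labels are made up for illustration, and this is a triage hint, not a diagnosis.

```python
def classify_uptime(uptime_seconds: float, short_threshold: float = 10.0) -> str:
    """Rough triage based on the system uptime shown in the memory dump.

    short_threshold is an assumption based on the "5 to 10 seconds" figure.
    """
    if uptime_seconds < 0:
        raise ValueError("uptime cannot be negative")
    if uptime_seconds <= short_threshold:
        # Crashed almost immediately after a reset: suspect power delivery.
        return "suspect power supply"
    # Crashed after running for a while: suspect heat or overclocking.
    return "check temps and overclocking software"


print(classify_uptime(7))     # suspect power supply
print(classify_uptime(5400))  # check temps and overclocking software
```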
 
Most of your bugchecks are due to memory corruption.
I would stop any overclocking, update the BIOS, and run memtest86 to confirm your memory system is OK.
Check your temps also; an overheated CPU gives the same symptoms as power problems and overclocking problems.
 

John Labrier



BIOS is already up to date, CPU temps are well below average thanks to watercooling. I ran the memtest already with no problems, but will run it again just to be sure. Will keep you posted, thanks!
 
You should also put your actual memory .dmp files on a server and post a link so they can be looked at with a Windows debugger.



 

John Labrier

So I think I figured out the problem. The DRAM voltage was set to 1.65 V for some reason when it should have been 1.5 V. I set it down to 1.5 V and so far no blue screens. I hope it stays that way, lol. Anyway, thanks for the replies and help, guys! Much appreciated!
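For anyone checking their own settings, here is the arithmetic on those two numbers as a minimal Python sketch. The function name is hypothetical, and the 1.5 V nominal is just the value from this post; check what your own RAM kit is rated for.

```python
def percent_over_nominal(actual_v: float, nominal_v: float) -> float:
    """Return how far actual_v is above nominal_v, as a percentage."""
    if nominal_v <= 0:
        raise ValueError("nominal voltage must be positive")
    return (actual_v - nominal_v) / nominal_v * 100.0


# Values from the post: 1.65 V set in the BIOS versus 1.5 V expected.
print(round(percent_over_nominal(1.65, 1.5), 1))  # 10.0
```

So the DRAM was running about 10% above the intended voltage.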