System instability, won't power on with GPU, blue screens, dual-channel issues, screen-tears/freeze during test

cacti

Honorable
Sep 20, 2013
10
0
10,510
I've been having quite a few issues with my computer recently.

SPECS:

i7 8700k
z370 aorus gaming 5
2x8gb DDR4-3200 corsair
Asus dual GTX1070 08g
win10-64 installed on samsung ssd 850 evo 500gb
corsair cx750m psu
evo 212 cooler

I built this from scratch in March 2018 with all new parts, except for some old HDs that are used for storage.


Issues:
-Ever since I noticed in June or July that it was running in single-channel mode, my computer would not boot in dual-channel mode (XMP enabled, RAM in slots 4+3 or 2+1); it would only boot in single-channel mode (3+1). Inserting the chips in different channels would cause the BIOS to not load. HOWEVER, today, after removing my GPU and checking the CPU for bent pins (see below), it now boots in dual-channel mode in slots 2+1, and HWiNFO64 displays 'dual-channel' under 'mode'.

-For the past several months, I have been getting frequent blue screens on startup, with fairly random error messages (kmode exception not handled, irql not less or equal, kernel security check failure, etc.). There does not seem to be any pattern to this. It occurs on approximately 25% of startups, but a reset usually gets it to boot fine afterwards. (I was not overclocking.) Blue screens after startup are extremely rare.

-I ran intel processor diagnostic tool 3 days ago, it passed all tests.

-yesterday my computer did a hard power off (no blue screen) while playing the witcher 3 on max settings. I have been playing this game for the past month on max with no issues previously. I turned it back on, ran speedfan, and ran witcher 3 again. Speedfan showed no issues with temps, (cores mid 30s, GPU at 50), but I had another hard power off after a few minutes of playing. I left the computer off for 30 min, started it again, played witcher 3 for a couple hours on ultra settings without issues. I can think of nothing that would have precipitated this-- no new software, no new hardware, no power outages etc.

-Today my computer would not turn on at all. When I pressed the power button, it would putter a bit and some of the LEDs would come on, but nothing else happened. The LED code panel on the board showed nothing. I tried five or six times, same results. Worried that it might be a PSU issue, I removed one RAM chip, removed all HDs except for the SSD with the OS, removed all USB devices, and tried again, same result. I removed the CPU and checked for bent pins; all seemed fine (though I don't have a magnifying glass, and my camera or my photo-taking skills aren't really up to snuff, so who can be sure?). So I reinstalled the CPU and removed my GPU, and it powered up.

-I turned off turbo, turned off XMP, and set memory to enhanced stability modes, then ran UserBenchmark. Results as expected. I turned on XMP, kept enhanced stability mode, installed the second RAM chip so that we're in dual-channel mode (2+1)-- RAM now performing as it should in dual channel mode, CPU at low end (because turbo is off). (I'm typing this now with no GPU, both ram chips in dual channel, usb devices and all HDs plugged in).

-I restarted, turned turbo on, overclocked to 4.7 GHz via 'cpu upgrade' in the bios, ran a benchmark (again, with no GPU), and got screen tearing and a freeze. Powered off, powered on, turned off the OC, kept turbo on. Ran the Intel processor diagnostic tool, got screen tearing and a hard freeze on the prime number test. Had to power off. I am pretty sure if I run the tool again I'll get a freeze, but I'll post this first, then update.

-Running chkdsk and sfc /scannow does not fix anything, though they occasionally find corrupt files.

UPDATE: a second run of the intel processor diagnostic tool caused a hard freeze. The following text was on the screen at the time:
module math_primenum.exe completed - pass
module viscollisions.exe completed - pass
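
For anyone comparing freezes across runs, here is a minimal Python sketch that pulls the completed modules out of text like the two lines above, so you can see which test last finished before the hang. The line format is taken only from those two lines; whether the tool writes a log file in this format is an assumption, and `completed_modules` is a made-up helper name.

```python
import re

# Pattern inferred from the two on-screen lines quoted in this post (an assumption
# about the general format, not documented behaviour of the tool).
LINE = re.compile(r"module\s+(\S+)\s+completed\s+-\s+(\w+)", re.IGNORECASE)

def completed_modules(text):
    """Return (module_name, result) pairs in the order they appear."""
    return [(m.group(1), m.group(2).lower()) for m in LINE.finditer(text)]

screen = """module math_primenum.exe completed - pass
module viscollisions.exe completed - pass"""

done = completed_modules(screen)
# The freeze happened somewhere after the last module listed here.
print(done[-1] if done else None)
```

If the last completed module changes from run to run, that points away from one specific test and towards a general stability problem.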

UPDATE 2: now works with GPU plugged in, still getting freezes with the intel processor diag tool and UserBenchmark. ARGH WHAT IS GOING ON!!


UPDATE 3:

I replaced the PSU to no effect. I realized that the problem was two-fold:

1) The majority of the problems were caused by the evo 212 CPU cooler. If you stand the case upright, the cooler sits at a 90-degree angle from the board and puts uneven pressure on the CPU, causing issues with dual channel mode and random crashes. No amount of tinkering with the tightness of the cooler screws both enables dual channel mode AND eliminates crashes under load. I have the case sitting flat on the ground now, rather than upright, so that the cooler sits upright on the chip, and this has eliminated the crashes and the failure to boot in dual channel mode. I do not recommend this cooler.

2) there was a faulty Type4 connector from the PSU to one of the secondary HDs that was preventing power on.
 

boju

Titan
Ambassador
Does sound like a psu issue. Return it for a Corsair RM or HXi, or one of EVGA's SuperNOVA G2 or G3 units. Corsair's CX line isn't very good.

Your memory configuration is a bit confusing.

A2 and B2 - 2nd and 4th slots from cpu are the slots used for dual channel.

Edit. Sorry, the way you had memory for dual channel is correct in slots 2 and 1. Just checked the manual.

I'm used to this, which is the same config Gigabyte uses.

C5KUt.jpg
 

cacti

re: the memory, the board labels them in a funny way. Starting closest to the chip and going further, you have: DDR4_4, DDR4_2, DDR4_3, DDR4_1. They are currently, as you say, in the 2nd and 4th slots from the CPU.
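
To make the slot labelling concrete, here is a minimal Python sketch of that layout. The physical order comes from the board labels listed above. The channel assignment is an assumption (not checked against the manual here): the two slots nearest the CPU share one channel and the two furthest share the other. The function names are made up for illustration.

```python
# Physical slot order on the board, nearest the CPU first, as described in this thread.
PHYSICAL_ORDER = ["DDR4_4", "DDR4_2", "DDR4_3", "DDR4_1"]

def channel_of(label):
    """Assumed mapping: the two slots nearest the CPU are channel A, the other two channel B."""
    position = PHYSICAL_ORDER.index(label)  # 0 = closest to the CPU
    return "A" if position < 2 else "B"

def memory_mode(populated):
    """Dual-channel only if the populated slots cover both channels."""
    channels = {channel_of(label) for label in populated}
    return "dual-channel" if len(channels) == 2 else "single-channel"

for pair in (("DDR4_2", "DDR4_1"), ("DDR4_4", "DDR4_3"), ("DDR4_3", "DDR4_1")):
    print(pair, memory_mode(pair))
```

Under that assumption, 2+1 and 4+3 each span both channels, while 3+1 puts both sticks on the same channel, which matches the single-channel behaviour reported in the first post.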

re: PSU, that's the prime candidate in my mind. As per the update, it now boots fine with the GPU in, admittedly under low loads, but running a benchmark or processor test causes a freeze. I can't think of anything but a PSU issue that would cause a hard power off like what happened yesterday playing witcher 3. I can't trigger a hard power off on demand, unfortunately.

I wonder if there are any tests that will stress the GPU but not the CPU, since I was getting freezes running the processor diag tool with no GPU installed, and am still getting them with the GPU installed.
 

boju

Yeah you're right about memory, I updated my previous post.

There's not really much in the way of software testing that can tell you if a gpu is going to cause a shutdown. Borrowing another gpu could help diagnose if it is the graphics card.

Hard shutdowns are usually psu related, or it could be the gpu tripping the psu and causing the psu to shut down. Memory errors usually result in only blue screens, not hard shutdowns.
 

cacti

I'm kind of worried that it might be a board issue, or a bent pin that I can't see. It weirds me out that I ran the processor diag tool 3 days ago with no issues but get a reliable freeze now. I also don't know what has been causing all the blue screens on startup for the past several months, or why I couldn't boot in dual channel until today, when I started pulling stuff out.
 

boju

I read that question again. There are gpu related tests available like Unigine Valley, but that probably won't do you any good because the cpu still needs to pre-render each frame, so it's not entirely just gpu.

https://www.tomshardware.com/reviews/how-to-stress-test-graphics-cards,5449-4.html
 

boju

Could be a bent pin. Just one bent pin under the cpu can cause all sorts of problems. Might be best to check it out.

 
Solution

cacti

UPDATE:

I replaced the PSU to no effect. I realized that the problem was two-fold:

1) The evo 212 CPU cooler. If you stand the case upright, the cooler sits at a 90-degree angle from the board and puts uneven pressure on the CPU, causing issues with dual channel mode and random crashes. No amount of tinkering with the tightness of the cooler screws both enables dual channel mode AND eliminates crashes under load. I have the case sitting flat on the ground now, rather than upright, so that the cooler sits upright on the chip, and this has eliminated the crashes and the failure to boot in dual channel mode. (I selected the post above as the solution, as 'bent pin' is pretty close in nature to this.)

2) there was a faulty Type4 connector from the PSU to one of the secondary HDs that was preventing power on.
 

boju

Oh great, you got it sorted, well done.

You used the back plate for the cooler, didn't you? Like this video from 3:50. The screws can't be on too tight, otherwise you get problems like that, as you have found. Not sure if the Hyper has high tension stoppers, but generally I go finger tight all round, or gently with a screwdriver until they stop turning without force, and then a half turn with force.

[video="https://www.youtube.com/watch?v=UW_DVONAWks"][/video]

This guy here used just nuts and probably non-conductive washers. http://www.tomshardware.com/answers/id-2607377/hyper-212-evo-backplate.html