Serious Intel X299 computer issues, please help!!

nobiffbetter

Prominent
Aug 29, 2017
23
0
510
0
Hey guys, so 2 months ago I bought a brand new X299 Aorus Gaming 3 motherboard (socket 2066) along with an Intel i7-7740X CPU. I also bought 2x8GB of DDR4 Corsair Vengeance 3000MHz RAM. I purchased it all brand new off newegg.com to replace my old worn-out i7-875K, Intel DP55KG motherboard and DDR3 Corsair RAM.

Everything worked like a charm. My first overclock with the new setup was about a week later, and I was able to hit 5GHz, stable as a rock, at 1.28v with my LLC set to turbo. That is right around the ballpark of the worse-binned 7740X chips, but it's still pretty darn good IMO.

I left it this way for a few weeks and then noticed my computer had become unstable. I thought it was odd because I had done an hour of Intel burn testing and knew it was stable, but I finally had to turn the vcore up from 1.28v to 1.33v to keep it stable. Another 3 weeks, give or take, went by and my computer became unstable again. After trying everything I could to stabilize it at 5GHz, nothing worked, so I had to resort to turning up the vcore even more. THIS time I had to turn it up to 1.4v just to keep it stable! I thought I was losing my mind. How can a computer that was stable a few weeks to a month ago on the same settings all of a sudden become unstable and need this much more voltage just to stay stable? I didn't change any hardware, and I am not new to overclocking. I keep my PC well ventilated and always watch my temps and voltages.

Another few weeks went by and I could no longer keep it stable at 5GHz without overvolting my brand new i7, so I finally chose to lower my overclock from 5GHz to 4.9GHz while keeping the voltage at 1.4v just to maintain stability. After another week it was unstable AGAIN!

Eventually I had to go from 1.4v vcore with LLC set to turbo at 4.9GHz to 1.42v vcore with LLC set to extreme just to keep it stable at 4.8GHz!
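For anyone following along, the escalation described so far can be laid out as data. This is only a rough sketch in Python using the settings reported in this post; the function names are made up for illustration and it diagnoses nothing on its own. It simply checks whether each step needed the same or more voltage for the same or less frequency.

```python
# Settings reported in the post, oldest first: (core frequency in GHz, vcore in volts).
steps = [(5.0, 1.28), (5.0, 1.33), (5.0, 1.40), (4.9, 1.40), (4.8, 1.42)]


def is_degrading(history):
    """True if every step needs >= voltage for <= frequency than the step before it."""
    return all(
        f2 <= f1 and v2 >= v1
        for (f1, v1), (f2, v2) in zip(history, history[1:])
    )


def volts_per_ghz(history):
    """Volts needed per GHz at each step, rounded to 4 decimals."""
    return [round(v / f, 4) for f, v in history]


if __name__ == "__main__":
    print("Steadily getting worse:", is_degrading(steps))
    for (freq, vcore), ratio in zip(steps, volts_per_ghz(steps)):
        print(f"{freq:.1f} GHz @ {vcore:.2f} V -> {ratio} V/GHz")
```

A ratio that keeps climbing while the clock speed drops is the pattern being described here; it does not say which part is at fault.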

You guys see where I am going with this. It basically got worse and worse until last night, when I was playing GTA5 online at the exact overclock and settings listed above and it gave me the all too familiar blue screen. THIS TIME I was so mad I just gave up and said OK, I'll just default my whole BIOS to stock so I can at least game for tonight. I defaulted my BIOS back to stock settings only to find I couldn't even finish loading Windows before it blue screened! The fact that it blue screened on stock settings really scared me. So this time I went into the BIOS and left everything at default again EXCEPT my core frequency, which I turned down from the stock 4.3GHz (4.5GHz turbo) to just a straight 4.2GHz, UNDERCLOCKING my CPU from stock. This finally got me stable enough to boot up and continue playing.

Something is SERIOUSLY WRONG here, and that's why I'm asking you guys for help. I don't know what to do or where to start. This is just ridiculous! I have heard of CPUs deteriorating like this over many years, where a chip that has run a certain overclock for 3-5 years eventually becomes unstable, but how can a CPU deteriorate this fast in just a matter of months??

My guess is it's either my power supply, my motherboard, or my CPU that is failing. Maybe my PSU is no longer effectively delivering the proper amperage and wattage to my CPU, because my voltage is always dead on right where I set it when watching my "coretemp" app, so I know my voltage isn't lacking. My CPU has never gone over 70c while gaming and has never really gone over 90c during stress testing, and when it has, it hasn't happened often. Maybe it's the voltage regulator (VRM) on my motherboard? Or maybe my CPU is just flat out dying. This is why I came to the experts... you guys!

I am not sure where to even start. I did update my BIOS to the latest version this morning and nothing changed. It's still just as bad (I have to UNDERCLOCK it to keep the system stable and not blue screen), and my speculation is it's only gonna keep getting worse, to the point where no matter how low I turn the CPU frequency it's just gonna keep blue screening till I'm dead in the water.

MY SYSTEM COMPONENTS:
Rosewill large tower - tons of airflow/fans - 5 months old
Aorus Gaming 3 X299 motherboard, socket 2066, with RGB - 2-3 months old
Intel i7-7740X CPU - 2-3 months old
Corsair H100i liquid cooler - 5 months old
Corsair Vengeance DDR4 3000MHz RAM - 2-3 months old
MSI GTX 970 Gaming edition graphics card - 2 years old
Xion 800 watt power supply/PSU - 8 years old
SP 120GB SSD - OS and main programs - 5 months old
Seagate 1TB HDD - games and movies - 2 years old
Barracuda 1.2TB HDD - empty, just a backup drive - 6 years old

My background:
I am not new to overclocking and I'm not new to building computers, as I have built many over the years, though I am no expert. I am a mechanic and have been my whole life. I'm good with electronics, I take good care of my computer, I don't let it get dusty and I maintain it regularly.

I'm hoping it's just the PSU, but how can I test it to be sure? I really don't wanna rip it out and try another one if there is an easier way to test it. Is there a program that can tell me how many watts/amps it's putting out? As I said before, I know it's putting out the correct voltages.

PLEASE SOMEONE HELP, I WILL BE VERY APPRECIATIVE!
 

KirbysHammer

Respectable
Jun 21, 2016
396
0
1,860
32


Sounds like you got a bad CPU or motherboard. RMA both. Power supply could also be the culprit.

CPUs shouldn't degrade that fast, especially not at 1.28V, which is completely safe.

 

nobiffbetter

I have seen those PSU testers before, but I don't think they do anything more than show what voltage each rail puts out. As far as voltage goes I think it's fine. Like I said, if I underclock it then my computer runs just fine; that's actually what I'm typing this on right now. My voltage is set to 1.28 in the BIOS right now, and on my cputemp app I can see my voltage is steady at 1.28v and never fluctuates more than .2, which is normal. I think if anything I need to find a way to test the amperage and wattage output of my PSU, because I can't see those through the app and that would tell me if my PSU is bad or not. The fact that my computer starts and runs fine underclocked tells me that all my rails are putting out the correct voltage, but amps and watts I have no clue.


Do you know the warranty policies? I am almost positive the Newegg one is only 30 days, which I am past, but what is Intel's policy on my CPU?


 

KirbysHammer



Well the manufacturer should have a 1 year warranty.

Chances are it's your PSU not being able to put out stable voltage so when it dips the CPU crashes due to low voltage.

Xion is not known for making good quality parts so it may have failed.
 

John Philips

Reputable
May 3, 2014
132
0
4,710
21



Or the processor is getting too damn hot...
 

nobiffbetter

It's a good possibility it is my PSU. I just called the motherboard company and they told me to pull the battery to clear the CMOS. I did that and nothing changed; still the same issue. My mobo company, Aorus, seems to think it is the CPU. They say they have heard countless claims about Intel's chips, even new chips, slowly dying once overclocked, so that in less than a year they are no longer even able to hold their stock clocks without blue screening.

My CPU is under a 3 year warranty, I believe, and he said my mobo is a 3 year too, so I may just RMA my CPU first and try the new one to see what happens.

This has nothing to do with CPU temps. My computer crashes at stock speeds right after Windows starts up, and the temps on startup are no higher than 35-40c at the very most, so that has nothing to do with it.

I think my next step is to call Intel and see what they say about it all. I checked my VRM temps and they seem normal, even though some ppl say VRM temps are a huge issue with the X299 boards. My VRM at stock clock is only hitting 41c with a temp gun. Since I'm not using a thermistor directly on the board I could add 10-20c, and it's still well within range at 51-61c, which is very normal.

I'm gonna call Intel next. If anyone has any other suggestions on things to try, please lmk. And thanks to the guys who have been trying to help.
 

ubiquityman

Prominent
Aug 30, 2017
18
0
520
1
Summary:
My guess is that the data on your storage device (SSD/HD) is corrupt. i.e. the errors are baked into your system now.
I recommend you reset the system to default clock and reinstall windows to test this possibility.

I had a similar experience many years ago. I overclocked a system, did a stress test, and it passed without problems. A few weeks later I noticed my system started to bluescreen. After countless hours of troubleshooting, I figured out what had happened. The overclocked system would occasionally corrupt bits. Not frequently, just once in a blue moon. Probably a combination of ambient temperature, workload, weather, and maybe what I ate for breakfast.
The point is that there was no recovery from this, because the data being written back to the hard drive was ever so slightly corrupted. Mind you, this was on a SCSI system and the PCI bus was also overclocked. Since then, I've become a lot more conservative about overclocking.

 

nobiffbetter

Thank you very much for your useful post, ubiquityman. That sounds like a great possibility. Last night I checked the voltage output of all the rails in my power supply while the system was powered on and fully booted. I backprobed each connector of the power supply and checked power and ground on every plug, checking my 3.3v rail, my 5v rail, and my 12v rail, all of which were putting out proper voltage even under load. I also tested the voltage output mid-boot as well as during a CPU and memory stress test and a few benchmarks, and even with all that load the voltage never once dropped below what it's supposed to be on each rail and connector. So I would have to say this proves my power supply is good.

I actually ordered a Samsung 850 evo pro 256 gig SSD right before all this started happening and planned to do a fresh install of Windows on it anyhow. So I will 100% try this on the new SSD when it comes in the mail today or tomorrow to see if it fixes the problem.
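The rail check described above can be written down as a simple tolerance test. This is a minimal sketch, assuming the commonly cited ATX tolerance of plus or minus 5% on the 3.3v, 5v and 12v rails. The readings are hypothetical placeholders, since the post does not list the measured values, and the function names are made up for illustration.

```python
# Nominal voltages for the three positive ATX rails mentioned in the post.
NOMINAL = {"3.3V": 3.3, "5V": 5.0, "12V": 12.0}

# Assumption: +/-5% is the commonly cited allowed deviation for these rails.
TOLERANCE = 0.05


def rail_in_spec(rail, measured):
    """True if a measured voltage is within tolerance of the rail's nominal value."""
    nominal = NOMINAL[rail]
    return abs(measured - nominal) <= nominal * TOLERANCE


def check_rails(readings):
    """Map each rail name to True (in spec) or False (out of spec)."""
    return {rail: rail_in_spec(rail, volts) for rail, volts in readings.items()}


# Hypothetical multimeter readings taken under load (NOT values from the post).
readings = {"3.3V": 3.31, "5V": 5.04, "12V": 11.92}

if __name__ == "__main__":
    for rail, ok in check_rails(readings).items():
        print(rail, "OK" if ok else "OUT OF SPEC")
```

Keep in mind that a multimeter shows an averaged DC reading, so a check like this would not catch brief dips or ripple; seeing those would take an oscilloscope.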

Fingers crossed, I'll get right back to you guys once I try this.

Thanks again for all your help.
 

nobiffbetter

Hey man, I was just thinking more about what you said about the HDD or SSD maybe being corrupt and that being why it could be blue screening. But do you think it would still run normally when underclocked? Because when I lower the clock speed below stock the computer does just fine. Thinking about this kinda makes me think it's not a corrupt HDD or SSD, idk.

When I get my new 256GB SSD in the mail and do a fresh install, is there anything you recommend doing beforehand? I'll obviously have to use this current PC to install Windows on it, so are there any steps or precautions I should take before installing the new SSD and installing Windows so that the new SSD doesn't get corrupted right out of the gate? Or are you saying the file corruption will only happen when overclocking?
 

ubiquityman

If your system runs fine at normal clock or under clocked, then it's not a corrupt hard drive. (Sorry, I glossed over that part previously).

So what could have changed....?

Take all these suggestions in stride. The probability is low, but then again, you might have an odd duck.

These are all just wild guesses, but heat changes things.
The compound between the silicon and the IHS should be robust, but how about the compound you used between the IHS and your HSF?

It's possible something on your VRM gave way (deteriorated) as you suggested.
(I personally have never seen this.)

Ideally, if you knew what your CPU temp was originally with the initial overclock, you could compare it with your current CPU temps to see whether heat might be causing the problem.

Are you running the exact same software now as you were initially?
For example, did you test the original overclock with stress tools and have now switched to games?

What is the error when it BSODs? Could it be related to the video card? I've seen many more video cards go bad than CPUs.

The root cause may be hard to nail down if you don't have a duplicate set of parts to swap for trial and error.
 

nobiffbetter

Thanks again for your response. I know for a fact it has nothing to do with CPU temps, as the temps during startup before the blue screen happens are exactly the same as they were the first day I used the processor, when I had no issues with anything. The temps were around 30-45 on startup and that's exactly where they are now with the BIOS set back to default settings. I am using a very good thermal paste that many other ppl recommended on another forum site, so I know it has nothing to do with the behavior I'm seeing. I also checked my VRM temps both with a laser temp gun and with an app on my desktop. The VRM temps are normal at only 40-60c, so I don't believe it has anything to do with the VRM.

My blue screen codes are always different. If I had to guess, I would say it bounces back and forth between maybe 10 different blue screen codes. I thought about looking up the definitions of the codes at one point, but the fact that the codes are always changing would make it way too hard to decipher the cause.

I agree that 99% of the time it's NOT the CPU and is usually some other hardware component that fails, but in my case I truly think it's my CPU. I must have gotten a rare bad CPU. I'll explain why...


So last night, as I mentioned I would, I took out my old SSD along with my two other HDDs and replaced them with my brand new Samsung evo 850 SSD. Beforehand I cleared my CMOS AGAIN, just to make sure I had a fresh start and that all my BIOS settings were defaulted before the fresh install of Windows. So I did that and, what do you know, before I could even get through the first step of the Windows 10 installation I got a blue screen. So that eliminates a corrupted HDD or program.

All of my overclocking before and after my issues started had been done directly through my BIOS and NOT through any overclock apps.

So after the new SSD and fresh install yielded nothing new, I decided to start trying different hardware. First I swapped out my power supply for a brand new 750 watt EVGA gold edition power supply, which did not change a thing. Still the same blue screen at the same exact time, which is right before or right after Windows startup. Once that didn't work I plugged my old power supply back in (now that I knew it had nothing to do with the power supply) and tried my RAM sticks next. I have 2x 8GB of Corsair Vengeance RAM, so I tried pulling one stick at a time to see if I had faulty RAM, and again I had the same blue screen at the same time. So now I know it's not my RAM and not my power supply. Lastly I tried a different graphics card. It's actually an old Nvidia GTS 250. I took out my GTX 970 and swapped in my GTS 250 and AGAIN I had the same results: blue screen right before or after Windows startup.

So this I do know: it's not my RAM, it's not my power supply, it's not my graphics card, it's not a corrupt or bad SSD/HDD, it's not a corrupt program or file, and from what I can tell while monitoring my motherboard and its temperatures (VRM, etc.), it's also not caused by my motherboard.
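The process of elimination described above can be sketched as a small bookkeeping exercise. This is a hedged illustration in Python, not a diagnostic tool: the component names and swap results are taken from this post, while the function and variable names are made up. A part is only crossed off when it was actually swapped and the blue screen persisted anyway.

```python
# Everything that could plausibly cause the blue screens.
suspects = {"CPU", "motherboard", "PSU", "RAM", "GPU", "storage/OS"}

# (part that was swapped or replaced, did the blue screen still happen afterwards?)
swap_results = [
    ("storage/OS", True),  # new SSD with a fresh Windows install
    ("PSU", True),         # brand new 750 watt EVGA unit
    ("RAM", True),         # one stick at a time
    ("GPU", True),         # GTX 970 swapped for a GTS 250
]


def remaining_suspects(all_suspects, results):
    """Drop every part whose swap did NOT stop the failure; keep the rest."""
    cleared = {part for part, still_fails in results if still_fails}
    return all_suspects - cleared


if __name__ == "__main__":
    print(sorted(remaining_suspects(suspects, swap_results)))
```

Note that by this bookkeeping the motherboard stays on the list next to the CPU, because in the post it was ruled out by temperature readings rather than by swapping it.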

In my opinion this only leaves one other thing it can be, and that's my 2 month old i7-7740X :/

Please READ this to see everything I did and tried, and let me know if you think I missed anything, or if you agree with me now that it's my CPU.

I am prolly going to RMA my CPU sometime today. I have never RMA'd a computer part before in my life, so I hope it's not too hard.

If anyone has any info on the best way to RMA my Intel CPU through Intel themselves and can give me some tips and pointers, please do!

Thanks again for your help.


One last thing I am going to try when I get home is to take my water cooler off and see if any of the thermal paste has somehow leaked its way onto the board or into the CPU pins. If that's OK (which I'm sure it is, because I do not overuse thermal paste and know how to apply it), then I'm gonna reseat the CPU by taking it out and putting it back in. If none of that works then I'm giving up and RMAing it, because at that point I will have tried EVERYTHING I can think of.
 

KirbysHammer



VRMs could still be faulty even if they're not overheating.
 

nobiffbetter

Yes. But how would I further test it to find out? Because there's not a whole lot to test mobo-wise. Last night I popped off my cooler to see if any thermal paste had leaked, and none had. I cleaned it off and took out the CPU anyhow, and it still looked brand new. No scars, no marks, no hot spots on the heat spreader. I used the magnifying glass on my iPhone 7 to look at the socket and the pins and didn't find anything out of the norm there. All the pins appeared to be perfectly straight in nice clean rows and columns. This was the first time my CPU has been out since I bought the system over two months ago, so I didn't expect any damage, but I figured I would check it while I had the cooler off.

I put the CPU back in gently and added some different thermal paste, Arctic Silver. I put it all back together and booted it up to, again, the same exact issue.

I did however take pictures of some of the more common blue screens that I've been getting. If I leave the CPU at stock clock it will just blue screen over and over again until I go back into the BIOS and lower the clock speed to 4GHz, then everything is just fine. I really don't know what else to try at this point, because I have basically tried everything possible.

I also tried memtest last night, both the plus version and the regular version. When I had my CPU frequency at 4GHz I had no errors, but when I turned the frequency back up to stock speeds memtest errored over 16 times in 45 seconds and the error confidence level was at 90%. I also left memtest+ running at stock speeds overnight and woke up in the morning to see 1,026 errors found and the computer and screen frozen. I had to do a hard shut off to get it turned off before I left for work this morning.

I will post all the pics of my blue screens, error codes and memtest results later today so you guys can see them and try to help me decipher what's going on.

Tonight I think I will try one RAM stick at a time to see if it changes the results and go from there.
 

nobiffbetter

I think I'm just gonna call it, guys. It's gonna be the CPU, and honestly I think I've done way more testing than most ppl would have lol.

I called Intel today. They never asked me if it was ever overclocked. I told them what had happened: that it was great the first few weeks, but a few weeks later it would not stay stable at stock clock speeds anymore. I told them that if I underclocked it the CPU would work just fine and I would have zero blue screens. They told me all I gotta do is send it back and I'll get my new one within 7-8 business days. OR I could pay $375 upfront and get a new CPU in 2 days. The new CPU will come with a return box, and all I gotta do is box up the old one and send it back. A day or two after they receive it, they said, I will get basically a full refund of $350, which is everything except the $25 it took to ship it to me, and I will prolly have to pay that anyhow if I ship my CPU myself and go with the first option I mentioned. The only bad part is I'll have to spend $375 upfront, but honestly I feel like the chip is bad, and as long as I get my money back then who cares, at least I can get my new CPU faster.

So what would you guys do? Would you ship the CPU out yourself and wait 7-10 days to get the new one with no other upfront money? Or would you spend $375 upfront, get the new CPU in 2 days, and get reimbursed the entire cost of the chip ($350) 1-2 days after they receive the old one?

Lmk, thanks
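The two options can be compared with some quick arithmetic. A rough sketch in Python using the dollar figures quoted in this post; the $25 shipping cost for the standard option is the poster's own guess, and the 8 day wait is just the upper end of the "7-8 business days" quoted, so treat both as assumptions. The names are made up for illustration.

```python
def net_cost(upfront, refund):
    """Money that is actually gone once any refund has arrived."""
    return upfront - refund


# Advance replacement: pay $375 now, get $350 back after the old chip arrives.
advance = {"upfront": 375, "refund": 350, "wait_days": 2}

# Standard RMA: nothing upfront to Intel, but roughly $25 shipping (poster's guess).
standard = {"upfront": 25, "refund": 0, "wait_days": 8}

if __name__ == "__main__":
    for name, option in (("advance", advance), ("standard", standard)):
        cost = net_cost(option["upfront"], option["refund"])
        print(f"{name}: net ${cost}, about {option['wait_days']} days of waiting")
```

Under those assumptions both routes end up costing about the same, so the real trade-off is tying up $375 for a few days versus getting the machine running sooner.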
 

ubiquityman

That is entirely based on your schedule, but I would just have them ship it right away.

I'm interested in your on-going saga, and what your root cause turns out to be. Let me know how it turns out.
 


nobiffbetter

Hey guys, sorry I forgot to report back to let you know if it fixed it!

The answer is YES, the replacement i7 7740X given to me by Intel at no charge fixed the issue!

I can't tell you how relieved I was to finally have this mess over with. I wanted to make sure I reported back because I hope that anyone else with similar issues who googles for a solution can come to this thread, and possibly it will help them decide whether or not this is what's wrong with their computer.

As you can see from all the posts, I went through a rigorous testing process, testing just about every component there is to test, and when I had nothing left to test I knew it had to be the CPU. I kind of assumed right off the bat that's what it was, just from the symptoms alone, but I did not want Intel to deny me my money back or not accept the CPU if I was wrong. So I had to make sure I covered all my bases.

Hopefully the process I went through can help somebody diagnose their own issues. There are lots of great ideas and testing suggestions in this thread, both from me and from the other ppl who chimed in to help. Thanks for everything, guys! Just glad to have my computer back.


On a side note: Intel's warranty process was very simple, very smooth, and very fast. They never asked if it was overclocked; they tested it, found it was faulty and reimbursed me my money.

One more thing: on the box of the new i7 7740X that I received from Intel, I noticed a special sticker, not one you would typically see when purchasing a new CPU. I looked at the tag and it had a "BIN#". The BIN# for my chip started with 01- and then some other numbers after it.

I saw that and instantly thought, "maybe this number means my CPU was one of the first dies made out of the silicon." I started to get really excited to try my chip out, because I have read that when making CPU dies, the ones that come from closest to the center of the silicon are always the first to be made and also always the best chips, hence the "silicon lottery". So I thought, if this is true, maybe I have a super good chip, because it would make sense of why I had a BIN# starting with 01.

I tested the chip out and got it to 5GHz easily on stock voltage! I was so impressed with how far I got on stock voltage that I decided to start turning my voltage below stock just to see how low I could go, and I managed to boot up at 5GHz with just 1.65v vcore! That's just insane. My first chip couldn't do 5GHz at 1.38v most of the time. Granted, I didn't do any serious stress testing at that time, but the fact that it booted at that low of a voltage told me I had a really amazing chip. I read a lot online, and most ppl with the 7740X hit 5GHz stable at 1.35-1.4v. I was able to make mine totally stable at only 1.18v.

Now, I don't know if any of what I said about the BIN# has anything to do with me getting a good chip or not, but it's something to take note of if any of you ever get a warranty chip with this "extra white sticker" on the front of the CPU box.

It may have nothing at all to do with that. I could have just gotten extremely lucky my second time around. OR maybe Intel felt bad about me getting a bad chip the first time, and instead of losing another person to AMD they just decided to throw me a good chip to make me happy.

Whatever the reason, I find it very coincidental that once I sent my chip in under warranty I got not only a good chip back, but an EXTREMELY good chip.

For what it's worth I just thought I'd share that.

Happy computing!!