PC won't boot past mobo screen

Status
Not open for further replies.

AJZ
Dec 23, 2010
Homebuilt
EVGA p55V
Intel i5
z3 4GB Ram (single stick)
nVidia geforce 8900
Ultra 750W ATX power supply
Memorex lightscribe cd drive

The first time I booted up the PC, I was unable to install an operating system for over a week. I would get a blue screen of death during the install of both x86 and x64 Windows 7. Eventually, I persisted and successfully installed an OS. This was in January of this year. Since the successful install I've had numerous OS issues including theme failures, system error messages, freeze-ups, etc. Recently, the computer has been freezing every hour. When I try to restart, the PC locks on the Windows shutting down screen.

At first, I thought it could be the processor because I started with the x64 OS, and figured the processor pins weren't set right (checked the same concept with the RAM). But I checked the processor and RAM and they were perfectly set. I tried to flash the mobo early on, but realized the updates for the p55v were lacking (mostly nonexistent actually).

Last week, the problem seemed to turbocharge. The OS (which was Win7) seems to have become corrupted. I went to EVGA for support. I stupidly forgot to write down my mobo serial number when I installed it, so I had to tear apart the machine for the serial to get the EVGA support. The first thing EVGA suggested was that I had a memory problem, so they directed me to memtest. I have downloaded and burned memtest from my laptop, but now the PC won't boot at all. At first, Windows was going into "save the OS" mode, running a test of some sort. I tried to reinstall Windows, but when I try to boot from CD, the entire PC turns off. It was only happening when I tried to boot from disc, so I thought I messed up my reinstall. But I've checked the disc drive and it seems to be connected properly. Where I stand now, the PC isn't only restarting when I try to boot from disc. I removed all discs from the drive, and the PC boots to the mobo screen (where it shows the graphic and name of my mobo), then it shuts off. Over and over - 6 times.

So here I am. Basically, my pc won't turn on at all. Thanks for any help you can provide.
 
Make very sure that the heatsink is clipped on properly. Also, you can try to undervolt and underclock it in your bios if it lives long enough to change the settings.

Stock VCore on i5s is around 1.1V - 1.25V, so try going down to maybe 1.15V and drop the multiplier a couple of steps. This should reduce the heat produced and give you a bit longer to adjust things.
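
As a rough illustration of why lowering voltage and multiplier cuts heat: dynamic CPU power scales approximately with voltage squared times frequency. The sketch below assumes that approximation and ignores leakage power. The multiplier values 20 and 18 are hypothetical, since the thread does not name the exact i5 model.

```python
def relative_dynamic_power(v_old, v_new, mult_old, mult_new):
    """Ratio of new to old dynamic power, assuming P is proportional to V^2 * f.

    Frequency is taken as proportional to the multiplier (base clock unchanged).
    Leakage power is ignored, so this is only a rough estimate.
    """
    return (v_new / v_old) ** 2 * (mult_new / mult_old)


# 1.25V -> 1.15V comes from the post; multipliers 20 -> 18 are hypothetical.
ratio = relative_dynamic_power(1.25, 1.15, 20, 18)
print(f"Estimated dynamic power: {ratio:.0%} of stock")
```

Under these assumptions the estimate comes out at roughly three quarters of stock power, which is why a small undervolt plus a lower multiplier can buy extra time in the BIOS.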

It could be the chip itself (maybe). If the factory has set the VID too high it could in theory cook itself to death.
 
I will preface this post with "my bad." And also that the push-button secures on the Intel heatsink/fan compel me to slit my wrists. I did finally get it secure (and I realize that I said it was secure before -- but I was wrong). The CPU runs at 73C consistently now. I let it run for an hour to make sure.

Onto the next... I ran memtest86 (as directed by EVGA support) for an hour. In the first 5 minutes I had no memory errors. In the second 5 minutes I had 14 errors. Through 30 minutes of the test it held steady at 14. Between 30 and 45 minutes of testing, I had an additional 18 (for 32 total). The memtest forums suggest that I switch my RAM stick into a different DIMM slot, so I'm about to do that. The forums also say that even 1 error means there's a problem (so I guess 32 could mean 32x the problem).

Anyone an expert on memtest86? Can anyone provide an analysis of the 32 errors? The forums also said that analyzing the errors is impossible, so if not, that's cool. If I interpreted the readme correctly, I need to first try another slot, then try another stick of RAM (or sticks).

Thanks again, guys. Sorry for the overheating obstacle course that I embarrassingly stumbled through.
 
I switched my RAM to dimm 3 (out of dimm 1, where the mobo manual told me to put it) and I've run memtest for an hour with 0 errors. I'm gonna let it run overnight, but so far, the 0 error report seems to imply that memory is not the issue.
 
There is no need to be embarrassed about asking for help on a problem you haven't experienced before.

I have had my own issues with the damn stupid heatsink clips myself, and the problem just sounded like a manifestation of this particular problem I had before.

73C is still pretty hot as Proximon says. MX-3 thermal compound is really good stuff. I dropped 10C just by using this stuff. The cost was worth it to me as I now have a rock solid stable machine.

Anyway, I am happy that you now have your machine up and running. Next time you will know what the CPU overheat problem looks like and how to fix it.

There is no shame in ignorance if you actively seek to eliminate it. Learning is a good thing.
 
Prox - yes, I saw the first post, and by smear I mean dab and spread very thin and evenly across the CPU, taking care not to get any on the mobo. I can send off for some MX-3 and see if that improves the heat issue. Yes, dimm 1 is paired with dimm 3 and I'm using single channel RAM (gah!).

Brian -- thanks for the support. Happy the heat issue is somewhat resolved. I'll try the MX-3.

So... by running memtest off a disc, it seems that the CD-ROM drive and at least 1 SATA port are operating properly. It seems that, based on the memtest, the RAM is working perfectly. Is this an unfair assessment?

At this point, is my inability to load an os (or install an os) pointing to the hard drive?
 


This thread is getting a bit confused I think. Let me recap, you tell me if I'm wrong.

1. You had a lot of OS troubles and your RAM failed memtest in dimm1.
2. You switched your RAM to dimm3, and you can now pass memtest. You are having no more errors.

Allow me to quote myself:
On the RAM, people often think of RAM as working in one direction, but it's not like that. Whatever exists on your HDD passed through RAM on the way there 😉
So if your RAM is undervolted, OR if your RAM is being fed very dirty power from a bad PSU, it might corrupt your OS installation.

This would apply to a bad dimm slot as well, or unclear MB specifications. Or a BIOS not properly configured for your specific RAM module.

So if your memory errors are gone the OS should install cleanly and you should have no more issues on THAT score. A bad dimm slot is serious though and needs to be addressed. It could be anything from the physical slot to the CPU, as the memory controller resides there.

I would first go into the BIOS and configure the settings manually so that you have the correct voltage, frequency, and timings for that module. Then I would try dimm1 again... but only from the memtest CD. Keep it AWAY FROM YOUR OS until you know it won't corrupt it further.

If dimm1 continues to be a problem I would start an RMA with EVGA.... because it's going to be a process of elimination and that is the prime suspect.


-------------------

CPU -

The normal expected temp in BIOS for your CPU would be in the 30-40C range. The normal core temps as reported by HWMonitor, Real Temp, or Speedfan while running Prime95 would be around 60-65C. This is on a non-overclocked CPU with a stock Intel heatsink. If you have cleaned off the thermal wax and replaced it with higher quality paste it would be a few degrees less, possibly as much as 4 or 5C.

You have said that your CPU is now at 73C. You have not said according to what program or under what conditions. If you were in the middle of the Sahara and the ambient temperature was 45C, 73C might be acceptable at idle. However putting any stress on the CPU at that point would be a very bad idea, as it would cause dangerous temperatures.
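
To make the point about conditions concrete, here is a minimal sketch that compares a reading against the two ranges quoted above. The function and dictionary names are hypothetical, and the ranges are only the rough figures from this post for a stock-clocked CPU with the stock Intel cooler, not an Intel specification.

```python
# Expected ranges quoted in the post (stock clocks, stock Intel heatsink).
EXPECTED_RANGES_C = {
    "bios_idle": (30, 40),     # temperature shown in the BIOS
    "prime95_load": (60, 65),  # core temps while running Prime95
}


def assess(reading_c, condition):
    """Compare a temperature reading with the expected range for a condition."""
    low, high = EXPECTED_RANGES_C[condition]
    if reading_c > high:
        return f"{reading_c}C is {reading_c - high}C above the expected {low}-{high}C ({condition})"
    if reading_c < low:
        return f"{reading_c}C is below the expected {low}-{high}C ({condition})"
    return f"{reading_c}C is within the expected {low}-{high}C ({condition})"


# The same 73C reading means very different things depending on where it was taken.
print(assess(73, "bios_idle"))
print(assess(73, "prime95_load"))
```

The same number is 33C over the BIOS range but only 8C over the Prime95 range, which is why the program and the load matter when reporting a temperature.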

One last thing. Your memory controller is built in to your CPU. If it was overheating it might throw out memory errors, in theory anyway.



 
Solution
Thank you Proximon.

Your assessment is accurate regarding the issues (OS and memtest failures).

The 73C reading was straight from BIOS. According to your post, it should read in the 30-40C range, meaning I'm still running unbelievably hot. The machine is in an air-conditioned room in my house (no sun, no desert).

My plan for this evening was to pick up some top-notch thermal paste and regrease the CPU. Then, I was planning to attempt an OS install (and fully expected it to fail). My thought was that this would show the HDD is at fault. But now, I hear you saying that the memory controller in the CPU or the mobo itself might be at fault even though they've completed these simple tests, and the temperature on the CPU may be misreading because it's too hot. The machine isn't shutting itself off anymore tho - so that seems good. And about the memory controller in the CPU throwing errors in memtest: would one dimm be totally clear and the other be dirty if it were the memory controller? I suppose anything is possible.

My impression is that it is dangerous to attempt an OS install when 1) my CPU is way too hot and 2) I don't know which part of the system is failing. I have already started an RMA with EVGA but their process is slow and they don't think it's the mobo. They're the ones who sent me to memtest. When I showed them the result, they said they aren't experts and can't interpret the test.

I can't configure my memory on my board in BIOS. The p55V is incredibly basic and so is the bios.


 


Yes. The only reason this wasn't jumped on harder by some of the others is it wasn't clear. It's a big thing. Really horrible temps pointing to a larger problem than just bad paste. Still, it's worth a try to fix it.

Other much rarer causes of bad temps:

1. A bad MB that is seriously overvolting the CPU.
2. Some major flaw in the heatsink, a big imperfection in the metal for instance.
3. A corrupted or poorly configured BIOS. You could try updating the BIOS if you haven't, but really that's a big risk when the CPU is overheating to the point of instability.


would one dimm be totally clear and the other be dirty if it were the memory controller?

I honestly don't know. It does seem more likely that it's the physical slot or the capacitors controlling its voltage.


My impression is that it is dangerous to attempt an OS install when 1) my CPU is way too hot and 2) I don't know which part of the system is failing.

Well.... yeah. Dangerous as in stressing the overheating CPU is not good. Any work given to the CPU is going to drive up the temps.

They're the ones who sent me to memtest. When I showed them the result, they said they aren't experts and can't interpret the test.

Typical support staff runaround.

Maybe you should ask in the overclocking section. The hardcore guys will tell you if an overheating CPU can impact one dimm slot or not.
 
I spread on Arctic Cooling MX-4 and the temperature on the CPU in BIOS dropped to 45C. I let it run for a while and it didn't go past 45C. Then, I pulled my RAM out of dimm3 and put it back into dimm1. Dimm1 was where I got the plethora of memory errors on the first go-around. After 2 hours running memtest on dimm1, zero errors. I rebooted and went back into BIOS to check the CPU temperature. CPU temperature was at 50C. It seems from Prox's earlier post that my CPU should be running at about 30-40C. Is my 50C still considered overheating? I can't imagine that memtest would strain the CPU much, and yet it pushes it to 50C. I'm thinking that heavy strain might still push me into the 60s.

I don't have any reason to believe that my heatsink is defective except for the ever relevant temperature of the cpu. I am still timid about trying to reinstall an OS if 50C is too hot.
 
Your CPU fan may have a control in the BIOS. See if you can set it to a more aggressive setting.
If your case is not well ventilated that might account for the somewhat high temps.

50C under a light load might not be too bad.

I worry that you have some irregularity in the surface of the CPU or heatsink. I also worry that you are still putting too much TIM on the contact surface, or that there is some foreign material in there.
The paste, or Thermal Interface Material (TIM) as it's called, is not there because it conducts heat better than metal. It does not. It's there to eliminate MICROSCOPIC irregularities in the surfaces and air pockets.
The best conductor of heat is the metal-on-metal contact. On a microscopic level though, that does not happen well, so we apply a VERY thin layer of TIM. It's not there to prevent contact, but to improve it.
I really wish you didn't spread it. It's not needed. A small drop in the center of the CPU will spread out quite well under pressure, given surfaces that are properly even. It's why so many advise to do it that way... you can precisely regulate the amount. If you spread it on, it's very hard to say how much is on there.

See if you can drop the temps another 5C before installing the OS. Take the cover off and point a fan in there or something. Once you have the OS you will have better calibrated programs that can give a much clearer picture of your temps than the BIOS can. We'll have a better idea of CPU voltage too.
 
In general I agree with Proximon, however I would say 50C is absolutely safe to install an OS and use.

It is still slightly on the higher side, but well within the thermal envelope that should run stable. Even at full load I cannot see it getting up past 65C which is sufficiently below the maximum of 70-75C that I personally would consider touching the edge of stability.

It is entirely your choice, I am merely speaking from my own personal experience.

 
You guys have been very very helpful and I really appreciate it. I must have been misinformed about spreading out the TIM. It is very thin, but I take your point. I have more TIM, so I'll clear the existing paste and try without the spreading. If it gets me down to 45-50C, I'll try the OS again.

I also have some bad news. Curiosity led me to run memtest overnight again. I put the RAM back into dimm1. Dimm1 originally returned over 30 errors in memtest. I switched the RAM to dimm3 and it returned no errors. Then I returned the RAM to dimm1 for 2 hours with no errors. Then I restarted and went into BIOS to check the temperature. When I went to bed, I started memtest again on dimm1. I awoke to 11 errors. So apparently there is still an issue with either dimm1 or the memory controller. I am going to switch back to dimm3 and try memtest again when I get home. If I don't get any errors in dimm3 for several hours, it seems that dimm1 is dysfunctional.
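
The memtest runs reported in the thread so far can be tallied per slot to keep the process of elimination straight. This is only a bookkeeping sketch: the error counts come from the posts, while the variable names and the layout are assumptions made for illustration.

```python
from collections import defaultdict

# (slot, errors) for each memtest run described in the thread, in order.
runs = [
    ("dimm1", 32),  # first one-hour run
    ("dimm3", 0),   # one-hour run after moving the stick
    ("dimm1", 0),   # two-hour run after the new thermal paste
    ("dimm1", 11),  # overnight run
]

errors_by_slot = defaultdict(list)
for slot, errors in runs:
    errors_by_slot[slot].append(errors)

for slot, counts in sorted(errors_by_slot.items()):
    failed = sum(1 for c in counts if c > 0)
    print(f"{slot}: {failed} of {len(counts)} runs had errors {counts}")
```

Laid out this way, dimm1 failed two of three runs while dimm3 has only a single clean run to its name. That points at an intermittent fault that a short pass can miss, and it is why a longer run on dimm3 is the sensible next step.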

Thanks again fellas
 
EVGA has elevated me to an RMA (finally). Gonna get a new mobo hopefully and start the process over. In the interest of efficiency, it's prolly best to close this thread. Thanks to all who stuck with me and for all the advice. Much appreciated.

best,

AJZ
 