[SOLVED] PC reboot while running prime95 (Z390 Aorus Master/9900k)

Assaf Patishi

Commendable
Jan 11, 2017
33
0
1,540
1
Hi everybody,
In short, I overclocked my 9900K to 5 GHz with vCore at 1.32 V. While running Prime95 (Blend), within the first 5 minutes the PC suddenly shut down and rebooted back into Windows: no blue screen, simply a reboot.
No surprise that it happened when the Blend test was at its "hot stage" (max 100°C). Do you think something's wrong with my mobo or PSU? Or is it some kind of built-in protection?
Later I tried running the same test with default BIOS settings and it did get to 100°C but with NO shutdowns; it throttled a little bit but kept running. And the interesting thing is that in "stock" mode the vCore was actually higher because it was on the "Auto" setting in the BIOS.
(FWIW, before I tried Prime95 I ran AIDA64 and OCCT large data set without any issues.) It may be important to say that I haven't updated the BIOS, so it's on version F4 (the first version).

When I overclocked I didn't look at the VRM temperature at the moment it shut down, so I can't say if this was due to high VRM temps, but the VRM temps on this board are pretty good so I doubt this was the cause of the shutdown. The LLC was at the Turbo setting (seems to be equivalent to level 6 on other brands' mobos).
Now I am running at stock settings until I figure out what caused the shutdown. I don't want to damage my PC. This PC is brand new; I built it from all new parts just 2 months ago. My PSU is a Corsair RMx 1000, by the way.

If some of you have any experience with the Z390 master, can you please let me know what can cause this shutdown?

EDIT: I am starting to believe more and more that it's not a vCore issue, because I already went up to 1.37 V vCore and I'm still getting those reboots in Prime95 when it reaches the AVX part of the test (a few minutes in).
It seems unlikely to me that I passed an hour in OCCT and AIDA64 with only 1.3 V but get shutdowns in Prime95 with freaking 1.37 V! My LLC is at Turbo.
Is there any current protection/temp protection or something like that in the BIOS that I'm maybe overlooking?
 

Darkbreeze

Titan
Moderator
Anything over 80°C is undesirable. Anything over 85°C is starting to get into "bad" territory. At 100°C you are way past bad into really bad, and more than likely you ARE going to see VRM and power delivery problems.

What are your full hardware specifications, including the exact power supply model number and CPU cooler model?

Also, you should NOT be running AVX versions of Prime to do thermal testing. AVX based stress tests should ONLY be used, ONLY, if you KNOW you are primarily going to be running applications that make heavy use of AVX instruction sets.

Quick and dirty overview of overclocking/stability validation procedure.

Set CPU multiplier and voltage at desired settings in BIOS. Do not use presets or automatic utilities. These will overcompensate on core and other voltages. It is much better to configure most core settings manually, and leave anything left over on auto until a later point in time if you wish to come back and tweak settings such as cache (Uncore) frequency, System agent voltage, VCCIO (Internal memory controller) and memory speeds or timings (RAM) AFTER the CPU overclock is fully stable.

Save bios settings (As a new BIOS profile if your bios supports multiple profiles) and exit bios.

Boot into the Windows desktop environment. Download and install Prime95 version 26.6.

Download and install either HWinfo or CoreTemp.

Open HWinfo and run "Sensors only" or open CoreTemp.

Run Prime95 (ONLY version 26.6 which is not an AVX version, or another later version with a modified local.txt file to eliminate the use of AVX while thermal testing) and choose the "Small FFT test option". Run this for 15 minutes while monitoring your core/package temperatures to verify that you do not exceed the thermal specifications of your CPU.

(This should be considered to be 80°C for most generations of Intel processor and for current Ryzen CPUs. For older AMD FX and Phenom series, you should use a thermal monitor that has options for "Distance to TJmax" and you want to NOT see distance to TJmax drop below 10°C distance to TJmax. Anything that is MORE than 10°C distance to TJmax is within the allowed thermal envelope.)
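
The thermal check described above can be sketched in a few lines of Python. This is only an illustration: the function name, the sample readings and the `tjmax_c` default of 100 are assumptions and not something exported by HWinfo or CoreTemp. The 80°C ceiling and the 10°C distance-to-TJmax margin are the figures from this post.

```python
def thermal_check(samples_c, ceiling_c=80.0, tjmax_c=100.0, min_distance_c=10.0):
    """Summarise logged core/package temperatures (in °C) from a Small FFT run.

    ceiling_c and min_distance_c follow the post above; tjmax_c is an
    assumed value and should be replaced with the TJmax of your own CPU.
    """
    if not samples_c:
        raise ValueError("need at least one temperature sample")
    peak = max(samples_c)
    distance = tjmax_c - peak
    return {
        "peak_c": peak,
        "distance_to_tjmax_c": distance,
        "within_ceiling": peak <= ceiling_c,
        "within_tjmax_margin": distance >= min_distance_c,
    }


# Hypothetical readings noted down during a 15 minute run.
readings = [62.0, 71.5, 78.0, 79.5]
print(thermal_check(readings))
```

Only the peak matters here, because Small FFT is a steady load and the highest reading is the one that has to stay under the ceiling.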

If your CPU passes the thermal compliance test, move on to stability.

Download and install Realbench. Run Realbench and choose the Stress test option. Choose a value from the available memory (RAM) options that is equal to approximately half of your installed memory capacity. If you have 16GB, choose 8GB. If you have 8GB, choose 4GB, etc. Click start and allow the stability test to run for 8 hours. Do not plan to use the system for ANYTHING else while it is running. It will run realistic AVX and handbrake workloads and if it passes 8 hours of testing it is probably about as stable as you can reasonably expect.

If you wish to check stability further you can run 12-24 hours of Prime95 Blend mode or Small FFT.

You do not need to simultaneously run HWinfo or CoreTemp while running Realbench as you should have already performed the thermal compliance test PLUS Realbench will show current CPU temperatures while it is running.

If you run the additional stability test using Prime95 Blend/Small FFT modes for 12-24 hours, you will WANT to also run HWinfo alongside it. Monitor HWinfo periodically to verify that no cores/threads are showing less than 100% usage. If any are, then that worker has errored out and the test should be stopped.
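
As a rough sketch of that check in Python: given one usage reading per thread (typed in by hand or taken from whatever monitor you use), flag any thread sitting below full load. The function name and the sample numbers are made up for illustration. The 100% threshold is the figure from the post; in practice you may want to loosen it slightly, since monitors do not always report exactly 100.

```python
def stalled_threads(per_thread_usage, threshold=100.0):
    """Return the indexes of threads whose usage (in %) is below the threshold."""
    return [i for i, usage in enumerate(per_thread_usage) if usage < threshold]


# Hypothetical snapshot for a 16-thread CPU where one worker has stopped.
usage = [100.0] * 16
usage[5] = 12.0
print(stalled_threads(usage))
```

An empty list means every thread is still loaded. Anything else means a worker has probably errored out and the run should be stopped.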

If you find there are errors on ANY of the stability tests including Realbench or Prime95, or any other stress testing utility, you need to make a change in the bios. This could be either dropping the multiplier to a lower factor or increasing the voltage while leaving the multiplier the same. If you change voltage or multiplier at ANY time, you need to start over again at the beginning and verify thermal compliance again.

A more in depth but general guide that is still intended for beginners or those who have had a small amount of experience overclocking can be found here:


*CPU overclocking guide for beginners
 

Assaf Patishi

Thanks for the replies guys.
Darkbreeze, I am only willing to tolerate those temps to pass the test, and it's only with Prime95 that I am getting this kind of temps. Actually, at 1.3 V - 1.33 V the temps are very reasonable with all other software; even AIDA64 doesn't exceed 85°C. It's only in the AVX part of the Prime95 Blend test that the temps go that high. Besides, this is what thermal throttling is for, isn't it? And like I said, when I test with stock settings I still get 100°C max but at least the PC keeps running and doesn't restart suddenly. It is not temperature related IMO; maybe it's related to power somehow.

My Full specs:
9900k
Z390 Aorus Master (F4 bios version)
Noctua NH-D15 (SgtScream this answers your question)
16GB (2x8) 3333mhz corsair memory
1000W RMx corsair PSU

I also have a pretty good case: Fractal Meshify C with two 140mm Noctua Industrial fans (2000 rpm) and a rear Corsair ML140 Pro (2400 rpm). Everything is running at 100% full speed when testing and my room temp is about 23-24°C with the AC on.

Everything is shiny new.. bought this PC two months ago and built it myself.
 

Assaf Patishi

f4 is the first bios version. I'd start there and get up to F8.
Thought about it.. Do you think that could be it? I swore that I wouldn't update the BIOS unless something was unbearable :)
I just want to get some advice on what can cause this before I go rushing to update the BIOS.
I've read posts from people who encountered this kind of issue with Prime95, but the conclusions are somewhat unclear. Some say it is not enough voltage..
 
Thought about it.. Do you think that could be it? I swore that I wouldn't update the BIOS unless something was unbearable :)
I just want to get some advice on what can cause this before I go rushing to update the BIOS.
People must have encountered this before...
The first thing you always do is update the BIOS on a newly built system. Take advantage of the warranty before it expires. You only update the BIOS as a last resort on older systems without warranty because it's not worth the risk. Those updates are there for a reason. They aren't pushing them out to brick your system.
 

CompuTronix

Judicious
Moderator
The issue with Prime95 is strictly version specific, which is a known problem concerning AVX.

Which Prime95 version did you run?

Use only version 26.6 (Small FFT's) - http://www.mersenneforum.org/showthread.php?t=15504

Please read Sections 11 & 12: Intel Temperature Guide - https://forums.tomshardware.com/threads/intel-temperature-guide.1488337/

As per Intel’s Datasheets, TDP and Thermal Specifications are validated without AVX.

See page 87 - Section 5.1.1 - Thermal Considerations, first paragraph, second sentence:

8th and 9th Generation Intel® Core™ Processor Families Datasheet, Volume 1 - https://www.intel.com/content/www/us/en/products/docs/processors/core/8th-gen-core-family-datasheet-vol-1.html

CT 😎
 

Darkbreeze

Titan
Moderator
Those updates are there for a reason. They aren't pushing them out to brick your system.
I agree, for the most part. There are good reasons not to update to a new BIOS version when it is FIRST released, unless you already have an issue. But if that BIOS version has been out for a few weeks and there are no major complaints about it across the relevant forums, then there is generally no reason not to update, especially if you already have a BIOS version that incorporates the performance-reducing mitigations and microcode changes for the various vulnerabilities. If you have a system that is pre-Spectre and Meltdown, and haven't updated to a BIOS version newer than January 2018, I can see reasons for not wanting to update to a version that incorporates them. I actually wish I had not. For systems released after that, it doesn't really matter, since they will already have them.
 

Darkbreeze

Titan
Moderator
Any time your system is just "shutting down" and you are getting 90-100°C or higher temps, it is ALWAYS highly possible that it is temperature related. Even if it's not the core or package temps causing the protections to kick in, it could easily be OTHER temps such as the VRMs, etc.

Throttle protections only do so much and they are not always successful in preventing damage. Best practices are the "way" to prevent that. Allowing your system to ride high temps like that is not "best practice" for the average person, as opposed to a competitive overclocker or a highly skilled, experienced enthusiast who knows exactly what they are doing and why.

Listen to and understand what Computronix is telling you. He has done, and does, more testing with Intel processor architectures in his sleep than 95% of the enthusiast community will ever do at all. He is THE person to listen to when it comes to Intel CPUs and test methodology.
 

Assaf Patishi

Thank you very much for the info. But just tell me one thing: what's the point of testing the CPU with non-AVX workloads? Even some games today, and especially chess software (which I often use), incorporate AVX instructions, and ideally I want to know that my overclock can handle those instructions.
I used the latest Prime95 version, 29.8. And why does Gigabyte recommend version 27.9 in their overclocking guide if you say to use only version 26.6? Actually I care more about stability and error detection than just thermal performance. Like I said, in AIDA64 (all tests combined) and OCCT (large data set) I was able to pass 1 hour+ without any issues, with temps well below 85°C. But Prime95 (29.8) Small FFTs is just overkill. I actually think that my overclock was stable under that load, but because of the high temperature it is rebooting and preventing me from completing even 30 minutes.

I guess I will try version 26.6 just to say I passed Prime95 😁 and if I pass the rest of the tests (OCCT large, AIDA64, Realbench..) then I will just keep that overclock at 1.3 V-1.31 V.

Thank you all very much for all the help.
 

Assaf Patishi

truegenius, I don't want to do the Prime95 Blend test again.. I am already worried that I damaged the CPU. I decided to leave Prime95 alone and stick to the other stress testing software.

Computronix , I just saw an old post of yours:


I actually just finished one hour of the Realbench stress test (it's actually stressing the whole system, including the GPU at 100%!) and I successfully passed with temps around 85°C and below (89°C was the max temp, but hey, it's a 9900K on air here 😁). Including the benchmark; I did 6 runs.
And like I said, I already passed OCCT large data set and AIDA64. Prime95 was a bad idea.. I am sorry I went through this, I just damaged the CPU :( At least I learned something.

EDIT: well.. I tested again with Prime95 (version 29.8) but this time I disabled AVX completely (AVX+AVX2) and so far passed 40 minutes of the Blend test. Yeah, the temp reached 90°C max but I am OK with that for stress testing purposes; it's also similar to the results of other 9900K YouTubers (including der8auer with an NH-D15).
Is this the right way of testing with Prime95? I mean, simply disabling AVX to imitate version 26.6?
 

Darkbreeze

Titan
Moderator
What extreme overclockers do, on systems they don't CARE whether or not they will last six months, or a year, or three, should not be considered as good examples of the best practices that should be used by those who are planning to use their systems for continuous daily driver usage and expect to get years of use out of them.

Der8auer is usually looking for short term, high yield results, not long term stable overclocks. If you want to be replacing your hardware every year or reinstalling your OS every four months, that is up to you. If not, I'd recommend that you DO make every effort to limit your overclock to one that complies with the recommended maximum thermal ceiling of 80°C. If you are running applications that use AVX instructions heavily, you are going to want a little extra thermal headroom, because they will probably exceed that ceiling at times. Ignoring thermal compliance because "you are OK with it" is a bad idea IMO.

Yes, disabling AVX on newer versions is OK, because it still presents a steady state load, which is what you want for thermal testing. You don't want temps that bounce all over the place like a bull rider in a rodeo. But you DO want to limit your OC to one that doesn't exceed 80°C with AVX disabled, because when you run AVX applications (unless you use an AVX offset) you are likely going to exceed that. If you are already at 90°C WITHOUT AVX, you are going to get yourself into trouble with thermal damage. Guaranteed.
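
For anyone who wants to disable AVX through local.txt instead of the menus, here is a minimal Python sketch of the edit. The three setting names are, as far as I know, the CPU flag overrides described in Prime95's undoc.txt; please verify them against the undoc.txt shipped with your version. The helper name and the sample file content are made up for illustration.

```python
# Overrides believed to be documented in Prime95's undoc.txt (verify for your version).
AVX_OVERRIDES = {
    "CpuSupportsAVX": "0",
    "CpuSupportsAVX2": "0",
    "CpuSupportsFMA3": "0",
}


def disable_avx(local_txt):
    """Return local.txt content with the AVX-related overrides set to 0.

    Existing lines for the same keys are dropped, and the overrides are
    placed at the top so they come before any per-worker sections.
    """
    kept = []
    for line in local_txt.splitlines():
        key = line.split("=", 1)[0].strip()
        if key not in AVX_OVERRIDES:
            kept.append(line)
    overrides = [f"{key}={value}" for key, value in AVX_OVERRIDES.items()]
    return "\n".join(overrides + kept) + "\n"


# Hypothetical existing content; SomeOtherSetting is a placeholder, not a real option.
print(disable_avx("SomeOtherSetting=1\nCpuSupportsAVX=1\n"))
```

Close Prime95 before changing the file and keep a backup copy of the original local.txt.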
 

Assaf Patishi

Thank you for your thoughts. But like I said, I feel the temps are OK for the applications I am using daily, and with all due respect I think that 80°C as a max temp under load is a little low, at least for the 9900K; 85°C is more reasonable.
If I get to the point where the CPU goes above 85°C I will definitely consider reverting to stock clocks. So far in gaming (which is what I mostly do) it stays under 80°C.
 

CompuTronix

Judicious
Moderator
CompuTronix , I just saw an old post of yours:

... I tested again with Prime95 (version 29.8) but this time I disabled AVX completely (AVX+AVX2) ... is this the right way of testing with prime? ... simply disable avx to imitate version 26.6?
Yes.

Assaf Patishi,

If you want to find your valid baseline Core temperatures, then you need to test your processor just as Intel does. This means following a procedure that complies with Intel's Datasheets.

The correct approach is to replicate Intel's test conditions and methodology as closely as possible. This means normalizing, minimizing, or eliminating the 3 major variables; environment, hardware & software.

In post #8, I said:

Please read Sections 11 & 12: Intel Temperature Guide - https://forums.tomshardware.com/threads/intel-temperature-guide.1488337/

Respectfully, the nature of your questions suggests that you haven't done so. I was hoping that you would've read it by now, as the Guide explains how to properly conduct a thermal test.

Keep in mind that we're trying to help you. Darkbreeze and SgtScream have given you some very good advice. To that end, we have very good reasons for taking the time to provide links for our members to read.

PLEASE read it ... carefully ... after which I'll be happy to answer your questions.

CT 😎
 

Karadjgne

Titan
Herald
1. You didn't damage the cpu. They have built in measures that will automatically drop clocks and corresponding voltages in order to lower temps. This happens above 80°C ish. If that fails, and the cpu perceives it is going into burn-out territory, it'll shut down immediately. No bluescreen, no warnings, just off. This'll happen somewhere beyond 90°C depending on the cpu. So you should be fine.

2. Prime95 has but 1 real value (beyond its intended number crunching) and that's temp limits. It so happens that the Small FFT torture test uses the same CPU instructions as the vast majority of games. So P95 Small FFT becomes the virtual limit for gaming results. Small FFT also runs a 100% continuous load. No bursts, no ups and downs like AIDA64 or OCCT. Straight load. Its very continuity is what makes Small FFT perfect for temp limits. When it says 70°C, it's 70°C, not a one-time peak at 70°C for a few seconds and then bouncing between 60 and 65°C. You'll know exactly what the threshold is.

3. But that's temps. Only. Got nothing to do with stability. Asus RealBench is far better suited there as it tests the whole system, gpu, cpu, ram, storage, pcie, USB, everything. From several different directions. It'll try and trip the pc up, create instability. If you can pass 4Hrs of AR, consider it successful OC. Whether or not you like or dislike the temps.

So with OC, you will bounce back and forth between the 2 programs and bios, until you can dial in a stable OC at acceptable temps and acceptable OC. It can and sometimes will get extremely frustrating and time consuming, depending a lot on your knowledge of the bios and settings, expectations, silicon lottery, hardware choices etc.
 
Clearly assorted AVX workloads are pushing you over the edge, but, there are AVX offsets that can be applied. (Intel's XTU allows AVX offset and peak multipliers, and core voltage for that matter to be easily manipulated; in the event of any crash with XTU running, everything will reset to defaults/stock)
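
To make the AVX offset idea concrete: an offset of N lowers the core multiplier by N bins while AVX code is running. A minimal Python sketch, assuming the usual 100 MHz base clock; the function name and the offset value of 2 are only illustrations, not settings recommended anywhere in this thread.

```python
def avx_clock_mhz(multiplier, avx_offset, bclk_mhz=100.0):
    """Core clock in MHz while AVX code runs, for a given multiplier and AVX offset."""
    if avx_offset < 0 or avx_offset >= multiplier:
        raise ValueError("offset must be non-negative and smaller than the multiplier")
    return (multiplier - avx_offset) * bclk_mhz


print(avx_clock_mhz(50, 0))  # no offset: AVX code runs at the full 50x
print(avx_clock_mhz(50, 2))  # offset of 2: AVX code runs two bins lower
```

Non-AVX loads still run at the full multiplier; only AVX workloads are slowed down, which is where the extra heat comes from.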

As others have already alluded to, v26.6 is the standard, others are akin to 110-120% overloads...(in my testing, the later versions of P95 post v26.6 do indeed induce 4-7C warmer temps than v26.6)

As many mainboards likely have something akin to 95-110 watt TDP/power limits in their respective BIOS settings, that is likely what hard reset your system, as normally it should just throttle at near max temps...(to run at 5 GHz all core, realistically, you will likely have 160+ watt TDP peaks at 1.37V...)

I'm sorry to hear the NH-D15 was not handling it....you can always use sub-5 GHz max multipliers, as most folks are GPU-limited anyway. I doubt your apps/games will suffer at 'only' 4.8 GHz, still 100 MHz over all-core turbo spec, etc... That last 100-200 MHz is what needs the high voltage, and likely spiking your temps by 5-8C...
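
A rough way to see why the last 100-200 MHz costs so much: dynamic CPU power scales approximately with frequency times voltage squared. The Python sketch below only uses that textbook approximation; it ignores leakage and temperature effects, and the 4.8 GHz / 1.25 V reference point is a hypothetical example, not a measurement from this system.

```python
def relative_dynamic_power(freq_ghz, vcore, ref_freq_ghz, ref_vcore):
    """Estimated dynamic power relative to a reference point, using P ~ f * V^2."""
    return (freq_ghz / ref_freq_ghz) * (vcore / ref_vcore) ** 2


# 5.0 GHz at 1.37 V (from this thread) against a hypothetical 4.8 GHz at 1.25 V.
ratio = relative_dynamic_power(5.0, 1.37, 4.8, 1.25)
print(f"about {ratio:.2f}x the dynamic power of the reference point")
```

Because voltage enters squared, the extra vCore accounts for most of the increase, far more than the frequency bump itself.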

Most gaming CPU loads are a full 5C below what's encountered even on Prime 95 blended mode, and 8-10C less than temps seen with P95/small FFTs...
 
