[SOLVED] BSOD: WHEA_UNCORRECTABLE_ERROR

Status
Not open for further replies.
Mar 14, 2022
I can keep Chrome open for hours, or use Word and other light software forever, but for some reason gaming causes this memory error. I am unable to play any games. I have analyzed the crash dumps in WinDbg and uploaded them here: https://1drv.ms/u/s!At_Uera_B1jU7lrk-GRQ9_gyCCPH?e=YCxWUr

Specifically, I get a WHEA_MEMORY_ERROR_SECTION, and then when I scroll further down, the FAILURE_BUCKET_ID: 0x124_0_AuthenticAMD_MEMORY_UNKNOWN_FATAL_IMAGE_AuthenticAMD.sys.

I am unsure if this is a RAM error or a CPU error. Personally, I hope it's the latter as I still have warranty, and in-store product replacement on that component. I would rather get this resolved without having to deal with that though.

Any advice would be appreciated, thank you!
 
Have you run MemTest86? This looks like a RAM/cache issue, like the ones a lot of people have been having lately. The most common cause is unsupported RAM, because people buy a motherboard without checking its list of supported components, and MemTest86 may not even catch this kind of issue.
 
Hello,

Thanks for replying! I have both run and passed MemTest86, but I will do so again just to make sure. RAM compatibility is not an issue; I have been using this RAM with the current system for a while.
 
The wrong RAM for a motherboard may never show any sign of error over the PC's whole life, or it may start causing problems randomly six years in, or right off the bat. Wrong RAM is wrong RAM.
 
Well, I have a Gigabyte B550I AORUS PRO AX motherboard, and my RAM is G.Skill Ripjaws V DDR4-3600 CL16, 2 x 8 GB. I do not believe the two are incompatible: the person at Memory Express verified both products for me, and so did PCPartPicker.
 
The 0 (the first bugcheck parameter, as in 0x124_0) means the CPU reported the error, so it would be a CPU cache memory error, not an external RAM error.

I would also go into the BIOS and set everything to defaults, then check the CPU temperatures, set up the memory profiles, and set the fan speed to max.
I would also get all of the chipset drivers from the motherboard vendor rather than directly from AMD.
 
RAM and cache errors can show up under many different codes, and some of those codes can be misleading about the exact cause, so your issue may still reside in the RAM.
 
Hey everyone,

Thanks for your replies. I have completed several MemTest86 runs on both sticks of RAM together, as well as on each one individually. The test with both sticks produced one error, but the tests with each stick individually showed 0 errors. So that's been great.

@johnbl, new crashes have produced the following code:
FILE_IN_CAB: 031922-5125-01.dmp

BUGCHECK_CODE: 124

BUGCHECK_P1: 0

BUGCHECK_P2: ffffca02a2503028

BUGCHECK_P3: bc000800

BUGCHECK_P4: 1010135

CUSTOMER_CRASH_COUNT: 1

PROCESS_NAME: System

STACK_TEXT:
ffffa181d1f2c948 fffff80741b0178b : 0000000000000124 0000000000000000 ffffca02a2503028 00000000bc000800 : nt!KeBugCheckEx
ffffa181d1f2c950 fffff807448710c0 : 0000000000000000 ffffa181d1f2ca29 ffffca02a2503028 ffffca02a2503028 : nt!HalBugCheckSystem+0xeb
ffffa181d1f2c990 fffff80741c3faa3 : 0000000000000000 ffffa181d1f2ca29 ffffca02a2503028 ffffca02a12faa00 : PSHED!PshedBugCheckSystem+0x10
ffffa181d1f2c9c0 fffff80741b0317d : ffffca02a2dcbb00 ffffca02a2dcbb00 ffffca02a12faa50 178bfbff7ef8320b : nt!WheaReportHwError+0x393
ffffa181d1f2ca90 fffff80741b035c8 : 0000000000000008 ffffca0200000003 ffffc388b1177000 0000000000000008 : nt!HalpMcaReportError+0xb1
ffffa181d1f2cbf0 fffff80741b0345c : ffffca029fd47600 0000000000000000 ffffa181d1f2ce00 0000000000000000 : nt!HalpMceHandlerCore+0x138
ffffa181d1f2cc50 fffff80741b028fb : ffffca029fd47600 ffffa181d1f2cef0 0000000000000000 0000000000000000 : nt!HalpMceHandler+0xe0
ffffa181d1f2cc90 fffff80741b0527b : ffffca029fd47600 0000000000000000 0000000000000000 0000000000000000 : nt!HalpHandleMachineCheck+0x97
ffffa181d1f2ccc0 fffff80741b65869 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 : nt!HalHandleMcheck+0x3b
ffffa181d1f2ccf0 fffff80741a260fe : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 : nt!KiHandleMcheck+0x9
ffffa181d1f2cd20 fffff80741a25d28 : 0000000000000010 0000000000000000 fffff807459ef000 0000000000000000 : nt!KxMcheckAbort+0x7e
ffffa181d1f2ce60 fffff8074195eed6 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 : nt!KiMcheckAbort+0x2a8
ffffc388b117d420 0000000000000000 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 : nt!KxFlushSingleTb+0xca


MODULE_NAME: AuthenticAMD

IMAGE_NAME: AuthenticAMD.sys

STACK_COMMAND: .cxr; .ecxr ; kb

FAILURE_BUCKET_ID: 0x124_0_AuthenticAMD_MEMORY__UNKNOWN_FATAL_IMAGE_AuthenticAMD.sys


So I think it might be CPU/Motherboard related? CPU temperatures are absolutely fine so not sure what is causing this issue.
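
For reference, the four BUGCHECK_P values above can be decoded by hand. The following is a minimal sketch, not a replacement for the debugger. It assumes the documented layout for bugcheck 0x124 with P1 = 0 (machine check exception), where P3 and P4 are the high and low 32 bits of the MCi_STATUS register, and it only decodes the architectural flag bits and the standard "memory hierarchy" error code pattern. Vendor-specific bits are deliberately ignored, and the function and table names are made up for illustration.

```python
# Sketch: decode the MCi_STATUS value carried in bugcheck 0x124 parameters 3 and 4.
# Assumption: P1 == 0, so P3/P4 are the high/low halves of MCi_STATUS.

P3 = 0xBC000800  # BUGCHECK_P3 from the dump above
P4 = 0x01010135  # BUGCHECK_P4 from the dump above

# Architectural flag bits in the top of MCi_STATUS.
FLAGS = {63: "VAL", 62: "OVER", 61: "UC", 60: "EN", 59: "MISCV", 58: "ADDRV", 57: "PCC"}

# Fields of the "memory hierarchy" error code pattern 000F 0001 RRRR TTLL.
REQUEST = {0: "ERR", 1: "RD", 2: "WR", 3: "DRD", 4: "DWR",
           5: "IRD", 6: "PREFETCH", 7: "EVICT", 8: "SNOOP"}
TRANSACTION = {0: "instruction", 1: "data", 2: "generic"}
LEVEL = {0: "L0", 1: "L1", 2: "L2", 3: "generic"}


def decode_mci_status(high, low):
    """Return the set flags and, if applicable, the cache error fields."""
    status = (high << 32) | low
    flags = [name for bit, name in FLAGS.items() if (status >> bit) & 1]
    code = status & 0xFFFF  # MCA error code lives in the low 16 bits
    result = {"status": status, "flags": flags, "error_code": code}
    # Mask out the F bit (bit 12) and the RRRR/TTLL fields, then compare.
    if (code & 0xEF00) == 0x0100:
        result["cache_error"] = {
            "request": REQUEST.get((code >> 4) & 0xF, "reserved"),
            "transaction": TRANSACTION.get((code >> 2) & 0x3, "reserved"),
            "level": LEVEL[code & 0x3],
        }
    return result


decoded = decode_mci_status(P3, P4)
print(hex(decoded["status"]), decoded["flags"], decoded.get("cache_error"))
```

Under those assumptions, the low 16 bits (0x0135) match the cache error pattern: a data read (DRD) on the data cache at level L1, flagged as uncorrected (UC).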
 
You would need to run some commands in the debugger:
!errrec ffffca02a2503028
This will report why the CPU called the bugcheck.
!sysinfo cpuinfo
Here you would look at the speed the CPU was running at.

For internal CPU errors: the CPU is very sensitive to the voltages applied to certain of its pins. CPUs have primary and secondary cache memory banks, and they can run at various clock rates. The voltage on those pins tells the connection between the cache banks how fast to run. If the voltage is incorrect for the CPU's current frequency, the data in a transfer is latched (sampled) at the wrong time, before the electronics have stabilized. This is a violation of the electronics' setup and hold time requirements.

Basically, a binary 0 corresponds to the bottom 1/3 of the voltage range,
the middle 1/3 of the voltage range is undefined (often the bit gets latched at whatever its last value was),
and the top 1/3 of the voltage range is defined as a binary 1.
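
The thirds rule above can be written down as a tiny classifier. This is only an illustration of the idea. Real logic families specify their own low and high thresholds, and the 1.2 V rail used in the example is a made-up number, not a value from this system.

```python
def logic_level(voltage, v_max):
    """Classify a sampled voltage: 0, 1, or None when it is in the undefined band."""
    if not 0 <= voltage <= v_max:
        raise ValueError("voltage outside the 0..v_max range")
    if voltage < v_max / 3:
        return 0      # bottom third reads as binary 0
    if voltage > 2 * v_max / 3:
        return 1      # top third reads as binary 1
    return None       # middle third: the sampled value cannot be trusted


# Hypothetical 1.2 V signal sampled at three different moments.
for sample in (0.2, 0.6, 1.1):
    print(sample, logic_level(sample, v_max=1.2))
```

A sample that lands in the middle band is the case where the latch may keep its old value or pick either a 0 or a 1.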

The voltage on certain CPU pins sets the time at which a snapshot of the values is taken. If this voltage is wrong, the snapshot is taken at the wrong time, and you can get either a 1 or a 0 while the voltage is still in the undefined range.
The CPU runs a checksum on the values. If the checksum is correct, or if you get two errors that cancel each other out, the CPU continues on. If it detects an error, it calls a bugcheck because it cannot trust the data inside the CPU cache.
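
To show how two errors can cancel each other out, here is a minimal sketch using a single even-parity bit. This is the simplest possible check and is only an assumption for illustration. Actual CPU caches use parity or stronger ECC schemes depending on the cache level, and the 8-bit word below is arbitrary.

```python
def parity(bits):
    """Even-parity check value: 0 if the number of 1 bits is even, else 1."""
    return sum(bits) % 2


def flip(bits, *positions):
    """Return a copy of bits with the given positions inverted."""
    out = list(bits)
    for pos in positions:
        out[pos] ^= 1
    return out


word = [1, 0, 1, 1, 0, 0, 1, 0]   # arbitrary example data
stored_parity = parity(word)       # computed when the data was written

one_error = flip(word, 3)          # a single bit sampled wrong
two_errors = flip(word, 3, 5)      # two bits sampled wrong

single_detected = parity(one_error) != stored_parity
double_detected = parity(two_errors) != stored_parity
print(single_detected, double_detected)
```

The single flip changes the parity and is caught. The double flip leaves the parity unchanged, so the corrupted word passes the check.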

Problems:
- As the temperature of the CPU changes, the timing window moves and more errors are detected.
- The voltage used on the pins depends on the electronics of the motherboard.
The starting voltage is looked up in a table in the BIOS, something like: at this CPU core frequency, the voltage should be X. These tables are tuned over time for each motherboard version, which is why you have to update the BIOS to get updated tables. It is also why you get problems when Intel releases new CPUs that use lower voltages or run at higher frequencies: until the BIOS is updated, its tables will tell the motherboard to apply too high a voltage to the CPU, which corrupts data transferred between the levels of cache inside the CPU.

Overclocking tools throw in more problems, since they let you tweak the voltages on the CPU pins. The same goes for newer BIOS versions that automatically tweak voltages for you. For debugging, you never want to see overclocking tools, and you want people to set the BIOS to defaults so you have the best chance of working hardware. The debugger cannot detect these slight changes; all it can really look at is the CPU clock rate and maybe the thermal zones.
Most of the time the thermal zones in the debugger are useless. The CPU clock rate is only notable when it shows a rate that is not a multiple of 100 MHz,
or when it is outside the CPU's normal range, e.g. a CPU that should run at 3 GHz but is running at 2.9 GHz. It could be overheating, with the CPU trying not to burn up, or maybe the BIOS was not updated and does not know about your CPU version, which runs at a higher clock rate.
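
The two clock rate checks described above can be sketched as a small helper. The function name, the rated range arguments, and the tolerance are all assumptions for illustration. The tolerance exists because the debugger reports an approximate (~MHz) value, so a few MHz off a round number is not treated as notable here.

```python
def clock_rate_notes(mhz, rated_min_mhz, rated_max_mhz, step_mhz=100, tolerance_mhz=10):
    """Return a list of reasons the reported clock rate looks notable (empty if none)."""
    notes = []
    nearest = round(mhz / step_mhz) * step_mhz
    if abs(mhz - nearest) > tolerance_mhz:
        notes.append(f"{mhz} MHz is not close to a multiple of {step_mhz} MHz")
    if not rated_min_mhz <= mhz <= rated_max_mhz:
        notes.append(f"{mhz} MHz is outside the rated {rated_min_mhz}-{rated_max_mhz} MHz range")
    return notes


# Hypothetical readings for a CPU rated at 3000 MHz.
print(clock_rate_notes(3000, 3000, 3000))  # nothing notable
print(clock_rate_notes(2900, 3000, 3000))  # below the rated speed
print(clock_rate_notes(3033, 3000, 3400))  # odd multiplier or base clock
```

An empty list only means the clock rate gives no hint. It does not rule out a voltage problem, since the debugger cannot see those.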

On older systems, dust in the CPU fans causes overheating and CPU cache memory errors, and power supply voltages drift over time and can cause these errors. The thermal paste on the CPU cooler gets hard and no longer makes a good thermal connection.
I have seen older machines where the CPU cooler became disconnected on one side when the machine was moved. Bad connection = bad cooling.
I have seen water-cooled systems develop vapor bubbles, fail to cool the CPU correctly, and produce this type of error.

Capacitors in the power supply or on the motherboard can start to fail, and the voltages change over time.

Anyway, I hope some of this helps.

Just as a side note: the most common error I used to see with RAM (external memory) was an incorrect command rate setting. RAM often requires a 2T (or 2N) command rate for commands to be latched, but the BIOS defaults are set to 1T (or 1N). The number is in ticks of the main memory clock. It is often hard to find out what the command rate should be for your actual memory chips.
People often reject memory as bad when this parameter is simply not set correctly in the BIOS. There are about 12 parameters that can be set for RAM, and there is no set order for them, so most people only check the first few.
---------
When I googled your RAM, the timings looked like they should be
17-18-18-36-2T

The last one is the command rate (2 clock cycles before the electronics' setup and hold times are valid for this RAM to set up its commands to access a memory location).
You should confirm that your BIOS is set correctly.

Also, you might find that it works correctly at 1N with a single stick of RAM, but that you have to slow it down to 2N when the motherboard's RAM slots are fully populated. The timing is affected by the distance of the RAM stick from the CPU. More RAM and greater distance from the CPU mean you have to use slower timings. Some motherboard vendors put this in the fine print of the manual's RAM setup section.
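
To make the timing string above easier to compare against what the BIOS shows, here is a minimal sketch that splits it into named fields and converts cycles to nanoseconds. It assumes the common CL-tRCD-tRP-tRAS-command rate ordering and a DDR4-3600 data rate. The function names are made up for illustration.

```python
import re


def parse_timings(text):
    """Split a string like '17-18-18-36-2T' into named timing fields (in clock cycles)."""
    match = re.fullmatch(r"(\d+)-(\d+)-(\d+)-(\d+)-(\d)[TN]", text.strip(), re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognized timing string: {text!r}")
    cl, trcd, trp, tras, command_rate = map(int, match.groups())
    return {"tCL": cl, "tRCD": trcd, "tRP": trp, "tRAS": tras, "command_rate": command_rate}


def cycles_to_ns(cycles, data_rate_mts):
    """DDR moves data twice per clock, so one clock cycle lasts 2000 / data_rate ns."""
    return cycles * 2000 / data_rate_mts


timings = parse_timings("17-18-18-36-2T")
cas_ns = cycles_to_ns(timings["tCL"], 3600)
print(timings, round(cas_ns, 2))
```

With these inputs the CAS latency works out to roughly 9.44 ns. The command rate field is the value to check against the BIOS setting (1T vs 2T).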
 
What are your system specs again? Including the full motherboard and RAM models.
 
Thanks for your comprehensive response. I am still trying to digest what you said, but in the meantime, I ran the two tests you suggested, and the results are as follows:
===============================================================================
Common Platform Error Record @ ffffca02a2503028
-------------------------------------------------------------------------------
Signature : * INVALID *
Revision : 0.0
Record Id : 0000000000000000
Severity : Recoverable (0)
Length : 0
Creator : {18efe4e4-00b1-0000-8202-000000000000}
Notify Type : {18efe4e4-00b1-0000-8202-000000000000}
Flags : 0x00000000

9: kd> !sysinfo cpuinfo
[CPU Information]
~MHz = REG_DWORD 3693
Component Information = REG_BINARY 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
Configuration Data = REG_FULL_RESOURCE_DESCRIPTOR ff,ff,ff,ff,ff,ff,ff,ff,0,0,0,0,0,0,0,0
Identifier = REG_SZ AMD64 Family 25 Model 33 Stepping 0
ProcessorNameString = REG_SZ AMD Ryzen 5 5600X 6-Core Processor
Update Status = REG_DWORD 1
VendorIdentifier = REG_SZ AuthenticAMD
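
If you want to pull the interesting fields out of the !sysinfo cpuinfo text above without reading it by eye, something like this would do. It is a minimal sketch that assumes the "name = TYPE value" line layout shown in the output, with REG_DWORD values printed in decimal. The sample is abbreviated from the post and the function name is hypothetical.

```python
import re

# Abbreviated copy of the debugger output quoted above.
SAMPLE = """[CPU Information]
~MHz = REG_DWORD 3693
Identifier = REG_SZ AMD64 Family 25 Model 33 Stepping 0
ProcessorNameString = REG_SZ AMD Ryzen 5 5600X 6-Core Processor
VendorIdentifier = REG_SZ AuthenticAMD
"""


def parse_cpuinfo(text):
    """Map each 'name = REG_TYPE value' line to a Python value."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"^(.+?) = (REG_\w+) (.*)$", line)
        if match is None:
            continue  # skip headers such as [CPU Information]
        name, reg_type, value = match.groups()
        # Assumption: REG_DWORD values are printed as decimal integers.
        fields[name.strip()] = int(value) if reg_type == "REG_DWORD" else value.strip()
    return fields


info = parse_cpuinfo(SAMPLE)
print(info["ProcessorNameString"], info["~MHz"])
```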
 
You need to confirm that you have updated the BIOS to the current version, and then updated the motherboard's chipset drivers.
The CPU came out in November 2020, so check the date of your BIOS version to confirm that it supports this CPU ID.

If you have the current BIOS version, reset it to defaults, turn off any automatic overclocking, and disable any BIOS feature that might change CPU frequencies. Then test to see if you still bugcheck.
If you have installed the AMD Ryzen Master driver, you should disable it.
You can download Microsoft Autoruns to disable drivers that tweak the CPU.
The goal is to get the CPU stable at a fixed frequency.

It looks like the error record is all zeros, so it is not valid.
 
Hey John,

Thanks for your help. I tried everything you suggested and it didn't work, so I took it to Memory Express. They replaced the CPU cooler because the CPU was running very hot in their tests, and after that everything seemed fine. The new issue is that the computer freezes (no BSOD) until I manually restart it. Whatever sound is playing freezes as well. Here is a video of the event: https://photos.app.goo.gl/FNMbviYXWNRVMK5g6
 
I looked at the video. If the display driver does not respond within a timeout (I think it is 2 seconds), Windows sends a reset signal to the hardware device to get it to recover. This could be what is going on. The most common cause of that is overheating or bad overclocking software. (Overheating changes the timing in the electronics and has the same effect as a bad overclock.)

For the new issue, go to the Windows Control Panel, open Device Manager, and look at all of the sound devices. Try disabling the ones that do not have speakers attached. Also, go to your motherboard vendor's website and update the sound driver for the motherboard. Old versions of the Realtek sound driver responded to signals that belonged to other sound drivers, which caused buffer overruns in the second sound device and strange crashes.
 