[SOLVED] AMD Threadripper 1950X system freezing while gaming.

LightningsalesUK

My system keeps freezing while playing games such as Assetto Corsa, Overwatch and Minion Masters. Games such as CS:GO and BeamNG.drive have never had this freezing issue. At a random point while playing (sooner or later depending on the game) it will freeze, with the screen stuck on the frame it froze at. The PC then needs to be turned off and on again to get it working.
I put this thread with the CPU posts as I think it's the CPU causing the issue. I have noticed the CPU temperature fluctuating randomly, and my CPU idle temps seem to be a lot higher than when I first built my system (6 months ago). My CPU now sits at around 50-55c at idle compared to around 40-42c when the system was first built. Could this just be from the summer heat? I have also noticed my CPU spiking randomly from its idle temperature to 65-68c, causing the fans to spin up, before it quickly drops back to the idle temperature. Once this starts happening it will continue till the machine is turned off.
Since I fitted the Kraken G12 to the GPU, its temps have dropped significantly and it doesn't go over 50c under full load.

My specs are as follows.
AMD Threadripper 1950x (watercooled via Enermax Liquidtech 320mm)
32GB DDR4 3200MHz Corsair Memory
GTX 1080ti 11GB (watercooled via Kraken G12 + 140MM AIO, RAM heatsinks)
256GB Samsung M.2
120GB SSD
3TB HDD
1TB HDD
500GB SSD
EVGA SuperNova 850W (with both CPU power plugs connected)
 
Update the BIOS, then uninstall the GPU drivers using DDU and reinstall the latest version. The higher idle temp might be due to the summer heat, but the random spikes cannot be, and they do indicate a problem. Try re-applying the thermal paste on the CPU and see if it helps.

Edit: While you're at it, also make sure that there are no bent pins on the socket.
 
I admit I don't know much about AMD, but I'd guess there's a way to disable some of the cpu cores. Personally I'd consider testing with some of the cores disabled to see if that has any impact on the issue. If you have faulty cores or thermal issues the issue should be less apparent with cores turned off.
 
Mostly I think your CPU has been running too hot causing the issues. You might not have noticed the higher temperatures when the crashes occur. If you have, then what temps have you gotten?

The summer heat should not raise it by that much.

Regarding the temperature differences, start with the easiest things, such as checking for BIOS updates (I did not see what motherboard you have) and making sure all the fans are working properly. You might need to adjust the fan settings, or at least check them to make sure they have not been changed by an update. When you say the fans speed up quickly, it makes me think that the settings could have changed, or simply that the fan curve was set to be quieter, so the fans run slowly and then jump suddenly at higher temps rather than ramping up gradually.
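To make the fan curve point concrete, here is a minimal sketch of how a stepped curve produces sudden jumps while an interpolated one ramps gradually. The curve points and function names are made up for illustration; they are not taken from any BIOS or fan utility, and real boards may behave differently.

```python
from bisect import bisect_right

# Hypothetical curve points: (CPU temperature in C, fan duty in %).
# These values are assumptions for illustration only.
CURVE = [(40, 30), (60, 40), (65, 80), (80, 100)]


def stepped_duty(temp_c, curve=CURVE):
    """Fan duty when each point's value is held until the next point is reached."""
    temps = [t for t, _ in curve]
    i = bisect_right(temps, temp_c) - 1
    return curve[max(i, 0)][1]


def interpolated_duty(temp_c, curve=CURVE):
    """Fan duty when the controller interpolates linearly between points."""
    if temp_c <= curve[0][0]:
        return curve[0][1]
    if temp_c >= curve[-1][0]:
        return curve[-1][1]
    for (t0, d0), (t1, d1) in zip(curve, curve[1:]):
        if t0 <= temp_c <= t1:
            return d0 + (d1 - d0) * (temp_c - t0) / (t1 - t0)


if __name__ == "__main__":
    for temp in (55, 62.5, 64, 65, 68):
        print(temp, stepped_duty(temp), interpolated_duty(temp))
```

With a curve like this, a spike from idle to the mid 60s doubles the fan duty in one step on the stepped version, which would match fans that suddenly roar and then settle.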

Another thing to check is that the CPU cooler doesn't have any leaks and that the pump is working normally. Even though it is only six months old and you might not have any leaks, the pump itself may be faulty. I am leaning towards this as the most likely cause. Or maybe the thermal paste needs to be reapplied. (That would be the last step, because once you get to that point I would suggest checking for bent pins at the same time.)

Even with higher temperatures, those should still be within the operating range, unless you note higher temps just before the freezing. Triple-checking all the power connectors and running a CPU stress test while monitoring temps and voltages would also be good. Part of me worries that you have some sort of power issue from your power supply, motherboard, or bent motherboard pins. But go through these in order and report back so we can get a better idea.
 
Pins are all good. I will see if any BIOS updates are needed. I have tried the GPU driver reinstall recently.
 
I asked because Kingston has had problems with their SSDs causing issues. I think the Crucial is OK, not sure about the SanDisk.
Can you disconnect it and still reproduce the same problems?
Hard Disk Sentinel will check all your drives for errors (in English) and tell you if one of them is causing problems.
 


I had HWMonitor open and did not see any huge temperature spikes before or during the crash. It seems 50c-70c are okay operating temperatures, so I am not too sure either.

The cooler's pump was rather hot when I opened the case recently, so I put an unused RAM heatsink on the pump to dissipate some of the heat. The cooler seems to have no leaks and is in a good position, so the pipes are not bent too sharply.

The cooler has only 3 fans pulling air into the case, so that may be why it is not at its full potential.
 

Drives are all okay. The games causing the issues are spread over different drives, so it seems to be something else causing the issues.
 
Be sure that you have heatsinks on the GPU VRM. When switching to the G12 you might have removed by mistake the heatsink that cools the VRM. A fan alone on the GPU VRM is not enough, which is why you must have a heatsink too.

Don't mix up the GPU temp and the GPU VRM temp; they are two different things.
 
I have had a look and all of the GPU RAM heatsinks are in place and are all good. I only got the Kraken G12 recently; the reason I upgraded was the current issue. I thought at first it could be the GPU overheating that was causing the freezing, so I installed the Kraken G12, but the issue was not fixed.
 
So, let me just get it straight.

System was built six months ago and you did not have any issues.

More recently you started getting freezes and higher CPU temps.

You decided to switch out the GPU cooler and that lowered your GPU temps.

The freezing and higher CPU temps have persisted.


----------------

You have checked the usage and temp of CPU and it has not gotten crazy high just before the freezes.

You checked the CPU cooler and it is working right (and everything else seems fine).


I did some research into Threadripper issues, and some say that there was an issue with some of the chips. But a lot of people actually said that the main culprit for freezing on Threadripper was the RAM needing to be RMA'd. Have you tried reseating your RAM (and your GPU, since you recently worked on it)? Can you run memtest86 to make sure your RAM is fine?
 
I had this issue a few weeks ago when running HandBrake. I ran HWMonitor and checked the "cputin" temps as I ran HandBrake again. I was shocked to see my temperatures jump to 82c right before crashing. I was scratching my head wondering what could be causing this all of a sudden. It turned out that, since it was getting warm out and I hadn't cleaned my filters, the radiator on the AIO was choking. I cleaned the filters and temps dropped to a max of 70c with HandBrake, and everything else is a cool 60c or less. This is with a 4GHz OC.

Usually if your computer freezes it's associated with the CPU. It's either choking from lack of voltage or too much heat. If you bluescreen, it's memory, and you have to adjust your memory settings to eliminate memory errors.

So since you are hard locking, slowly increase the CPU voltage until it's stable and not overheating. From what I read, the Threadripper 1950X will thermal throttle at 80c. The first thing I'd do: run Prime95 and watch your temps rise. If it freezes up with the temps low, you should increase your CPU voltage. You could also do this with HandBrake, with the setting on the video tab set to "very slow". I like using HandBrake because it's a real program that I know mimics a real-world scenario and isn't just a test.

https://www.mersenne.org/download/

What CPU settings do you currently have? From previous statements, you aren't overclocked, right? My CPU core voltage is set to 1.295v. You can use that as a starting point if you like. It should be more than plenty for a stock speed multiplier (auto).
I know the Enermax AIO coolers had a lot of issues with corrosion and just dying. They supposedly fixed them, but I've heard others still had issues. I wonder if your pump is stalling and the temps jump so fast that they're not even showing up in HWMonitor. When mine was overheating, I watched in real time when I put a load on the CPU: it jumped from 60 to 75 instantly in one refresh cycle. That should never happen.

I'd run HandBrake or Prime95. Prime95 will slowly ramp up and get more difficult as it goes through the tests, so keep an eye on it while it runs. You should abort if it goes above 78c. Remember, "cputin" is the one you want to watch. You might have to stop it manually in Task Manager.
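If you write the readings down (or export them from your monitoring tool), a rough sketch like the one below can flag the two things described above: a jump between consecutive refreshes and a reading past the abort point. The 78c limit comes from the advice above; the 10c jump threshold and the function name are assumptions for illustration, and the samples are just a plain list of numbers rather than any particular tool's log format.

```python
def find_spikes(samples, max_jump_c=10.0, abort_c=78.0):
    """Return (index, kind, value) tuples for suspicious readings.

    samples: CPU temperatures in C, one per monitor refresh.
    max_jump_c: assumed threshold for a rise between two refreshes.
    abort_c: temperature at which the stress test should be stopped.
    """
    events = []
    for i, temp in enumerate(samples):
        if temp >= abort_c:
            events.append((i, "over_limit", temp))
        if i > 0 and temp - samples[i - 1] >= max_jump_c:
            events.append((i, "jump", temp - samples[i - 1]))
    return events


if __name__ == "__main__":
    # Example readings typed in by hand, not real measurements.
    print(find_spikes([55, 60, 75, 79]))
```

A "jump" event with low absolute temps would point more towards a stalling pump or bad contact than towards plain summer heat.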

FYI, I'm running an MSI board and a Cooler Master TR4-specific AIO. I'm really happy with the Cooler Master. If your board has the metal back bracket built in, then it's a very good cooler. The back brace they provide is plastic and, from what I read in reviews, breaks when installing, which is ridiculous. The fans are loud, so I replaced them with really quiet "arctic cooling" fans.

So it could be your cooler, or possibly your PSU choking the CPU. What PSU do you have? I've got two 1950Xs running. You don't need the second CPU cable for stability with anything at 4GHz or less when overclocking, at least with MSI boards. My second 1950X is running on a PSU without a second CPU cable. I was concerned about this being a stability issue with my first build and it wasn't the case. As long as your PSU is good quality and can provide the proper amperage, you should be fine.
 

I think you are possibly on to something. I haven't looked fully into this yet, but just a simple test is showing up red flags.

I checked CPU-Z and it showed my RAM at 2133MHz (its normal speed is 3000MHz, not 3200MHz as per my first post).
So I booted into the BIOS, as I thought the lower speed might be a side effect of the BIOS update I had done an hour or two earlier.
I saw that profile one had been disabled, so I enabled it and tried to boot into Windows.
I only tried to boot into Windows twice, but both times it froze as it reached the spinning balls part of the Windows loading screen.
It again needed a hard reset to get it working.
 

The cooler seems all well and good. I am running the ASUS X399-A board along with an EVGA SuperNova 850W. I have been told it should be sufficient, but I have been looking to upgrade to the SuperNova 1200W.
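For a rough sanity check on whether 850W is enough, here is a back-of-the-envelope sketch. It uses the rated TDP figures for the two big parts (180W for the 1950X, 250W for the GTX 1080 Ti); the 100W allowance for everything else is an assumption, and real draw can exceed TDP under boost or overclocking, so treat the result as an estimate only.

```python
# Nominal figures; actual draw varies with load, boost and overclocking.
CPU_TDP_W = 180   # rated TDP of the Threadripper 1950X
GPU_TDP_W = 250   # rated board power of the GTX 1080 Ti
OTHER_W = 100     # assumed allowance for board, RAM, drives, pumps and fans


def psu_headroom(psu_w, loads_w):
    """Return (total load, spare watts, load as a fraction of PSU rating)."""
    total = sum(loads_w)
    return total, psu_w - total, total / psu_w


if __name__ == "__main__":
    total, spare, fraction = psu_headroom(850, [CPU_TDP_W, GPU_TDP_W, OTHER_W])
    print(f"Estimated load {total}W, spare {spare}W, {fraction:.0%} of rating")
```

On these numbers the 850W unit has a fair amount of spare capacity at stock settings, which would suggest looking at RAM and cooling before buying a bigger PSU.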