Question PCH overheating on H170N, what can I do?

Mar 31, 2024
I'm having issues with my computer: my portable HDs stop responding, throw a variety of I/O, paging and cache errors, and disconnect. It took me a while to understand what was going on, as the HDs are fine, with good SMART data, perfect connections through their OEM cables and no bad blocks. It's the mobo temperature that triggers it.

I'm tracking temperatures in HWiNFO. I'm not entirely sure which component of the motherboard is causing the issue. Two temperatures rise at once when the failures happen: the PCH and an unidentified one, "Temperature #5", which I suspect is the VRM. They don't get really hot. 58ºC for the PCH and 56ºC for the maybe-VRM is enough to trigger the failures, and that's consistently when issues start happening. The mobo isn't new, after all.

I have an M.2 SSD with my OS install. It occasionally gets hot and is on the underside of the motherboard, directly under the PCH. I'm sure it's worsening the PCH temps.

mobo-components.jpg


What can I do to mitigate or hopefully solve the issue?

My build is a watercooled mini ITX. The other temperatures are fine. I've set up Fan Control to respond to the PCH/maybe-VRM temperatures in addition to the CPU. The PCH has a heatsink and I'm thinking of repasting it, with thermal paste if the gap allows. Other candidates for a do-over are the VRM and the M.2. I've seen people just stick thermal pads over chipsets, reasoning that the pad would take the heat away and distribute it over a larger surface, but without a heatsink wouldn't it also trap heat there for longer?
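For anyone wanting to reason about a curve that follows several sensors at once, here is a minimal sketch of the idea: interpolate a fan duty from a temperature and feed it the hottest reading. The curve points and sensor names are made-up examples, and this is not a description of how the Fan Control app works internally.

```python
from bisect import bisect_right

# Hypothetical curve: (temperature in °C, fan duty in %). Illustrative values only.
CURVE = [(40, 30), (50, 50), (57, 80), (65, 100)]

def fan_duty(temp_c, curve=CURVE):
    """Linearly interpolate fan duty (%) for a temperature on a piecewise curve."""
    temps = [t for t, _ in curve]
    if temp_c <= temps[0]:
        return float(curve[0][1])   # clamp below the first point
    if temp_c >= temps[-1]:
        return float(curve[-1][1])  # clamp above the last point
    i = bisect_right(temps, temp_c)
    (t0, d0), (t1, d1) = curve[i - 1], curve[i]
    return d0 + (d1 - d0) * (temp_c - t0) / (t1 - t0)

def mixed_duty(readings, curve=CURVE):
    """Drive the fan from the hottest of several sensors (a 'max' mix)."""
    return fan_duty(max(readings.values()), curve)
```

Calling `mixed_duty({"CPU": 42, "PCH": 57, "Temperature #5": 55})` uses the PCH reading, since it is the highest of the three.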

All of these are band-aids, I know, but I have no idea how to begin solving the root problem, and right now I can't simply bin that mobo and get a new one.
 
Update your post to include full system hardware specs and OS information.

PSU: make, model, wattage, age, condition (original to build, new, refurbished, used)?

Disk drive(s): make, model, capacity, how full?

Case: make & model.

Also, if possible post a couple of photographs showing both sides of your motherboard.

Post the photographs here via imgur (www.imgur.com).
 
Thanks for getting back to me. Photos: https://imgur.com/a/aPSe4BZ


Sorry for leaving the WC block seated. I'm currently out of paste to reseat it if I take it off, and this is also my work computer, so I can't stop for long. I was hoping to figure out whether I needed to order anything else so I could combine it in the same shipment, but if you really want to see the obscured parts I'll order the paste now and update here ~next week when it arrives.

Before anything else, some extra odd behavior that may or may not be further symptoms, plus new info:

1) My computer seems unable to keep time in spite of a CMOS battery change 8 months ago. Panasonic brand if I remember right, not a knock-off AFAIK.

2) A couple of days ago I caught the front fan ignoring the curve set in the mobo utility and sitting stuck at min RPM. In hindsight it had been suspiciously quiet for a few weeks. The mobo might have gotten hot (70ºC), but the other components didn't, since there were no warnings, so no harm done to them. After some fumbling back and forth with settings in the UEFI, it went back to responding to changes made from OS-based apps.

3) Since I had to take my system apart for the photos, I left it partially dismantled, with the components in but the PSU out, so I could access them and do further temperature testing. Turns out I was wrong about which one is triggering it. I managed to bring the PCH and then Temperature #5 individually to the levels they were at when I reproduced the issue multiple times, and this time it didn't happen. The PSU is surely cooler sitting outside, but I won't blame it yet; I'm tired of jumping to wrong conclusions. 😂

Computer Specs

| Component | Model |
| --- | --- |
| Motherboard | Gigabyte H170N-WIFI |
| CPU | Intel Core i5 6600 |
| Watercooler | Deepcool Maelstrom 120T |
| GPU | NVIDIA Galax GeForce GTX 1060 6GB |
| RAM #1 | 16GB Corsair Vengeance DDR4-2400 1200.5MHz |
| RAM #2 | 8GB Corsair Vengeance DDR4-2400 1200.5MHz |
| PSU | Corsair CX430 |
| Case | Silverstone SG05* (modded) |

*About the case
I removed all drive bays to be able to fit a regular PSU with enough airflow for the rest. I replaced the front fan with the watercooler. It's also the main intake for the case, and the airflow is fantastic when closed, even during summers, which are very hot here. The fan is on CPU_SYS. I'm using the one from the WC, not the case, ~1800RPM max, and judging by the occasional noise it's starting to wear, so it's on my list to be replaced soon.

The following peripherals are connected to the remaining mobo USB ports:
Wacom Intuos 4 graphic tablet and wireless receiver for mouse/keyboard combo.

My monitor is connected to the DP port of the GPU and daisy-chains the signal to a second monitor.

Drives

| Type | Model | Free Space | Notes |
| --- | --- | --- | --- |
| SSD/M.2 SATA | NVM Express 1.3 Adata Falcon 512GB | 6.1% | Main OS, secondary OS install |
| SSD/SATA | SanDisk SDSSDA240G 240GB | 6.9% | |
| SSD/SATA | OCZ Agility4 128GB | 1.9% | |
| Portable/USB 3.0 (mobo) | WD WD40NDZW 4TB | 96.7% | 4 months old, w/ issues |
| Portable/USB 3.0 (mobo) | Seagate ST1000LM024 1TB | 0.6% | 5+ years old, w/ issues |

Operating System

Microsoft Windows 10 Pro (x64) Build 19044.2130. I update the system and drivers manually, only once in a while, to keep track of any bugs and instability they may introduce.

I also still have a previous Windows 8.1 install, but it is slightly broken, which makes it less than ideal for testing things.
 
Given that you reported not having the issues with the PSU unmounted, your PSU could potentially be failing and you're seeing some sort of EMI, though I'm not sure how plausible that is. The temperatures you're seeing aren't high enough that they should have any impact on operation. Even poor VRM implementations should be fine into the 80s.
 
I know! To be able to seemingly pinpoint it and reproduce it 15 times then nothing at all, it feels like I'm losing my mind.

I agree the temperatures weren't that high when triggering the issue. However, I'm not ready yet to dismiss the possibility that it's some sort of accumulated wear due to high temperatures. The board likely got into the 70s (ºC) and remained there for an extended time. The local weather is brutal and this year in particular summer has been really bad, with heatwave after heatwave. Ambient temperatures are at 35ºC+ by day and 32ºC at night, and my home office traps heat.

The PSU is rated at 30ºC. It sits right above the mobo, basking in its heat. I'm sure it operated above the rated temp most of its lifetime. 🥲

Besides the PSU being outside, the other (unintentional) change I made is that I unplugged nearly all connectors, including power, to access the mobo and take the photos. I'm skeptical that something badly seated (a real possibility, it's a cozy case) would wait 5 months to create issues, but you never know. I'll keep an eye on everything for a few more days and then put the PSU back to see what happens. There's more hot weather incoming.
 
There's been a (not so) new development.

I've been using the computer the entire week without any issues. Ambient temperature mild to warm (24ºC-31ºC) but nothing close to what it was when the issue happened, fan curve slightly more aggressive so mobo won't go over 57ºC just in case.

All well... then my SSD with the OS install (Adata Falcon) briefly lost it.

Bj7URGo.jpg


There weren't any performance signs that this was happening: no stutter, no issues with the portable HDs. I only knew because I had set up a scheduled task to alert me to relevant Windows disk events.

I'm kicking myself because upon a close inspection of past event logs it turned out the Adata flipped out like that a couple of times before at the same time I experienced issues with the portable HDs, but because the other drives were generating a flood of the same error codes I failed to spot Adata's buried in the middle of them.
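To avoid one drive's errors getting buried again, the events could be grouped per disk before reading them. A minimal sketch, assuming the events have already been exported into records with `disk` and `event_id` fields; those field names, the disk labels and the IDs in the sample are illustrative stand-ins, not taken from my actual logs.

```python
from collections import defaultdict

def events_by_disk(events):
    """Group event records by disk so a quiet drive is not buried under a noisy one."""
    grouped = defaultdict(list)
    for ev in events:
        grouped[ev["disk"]].append(ev)
    return dict(grouped)

def summarize(events):
    """Return {disk: {event_id: count}} for a quick per-drive overview."""
    summary = {}
    for disk, evs in events_by_disk(events).items():
        counts = defaultdict(int)
        for ev in evs:
            counts[ev["event_id"]] += 1
        summary[disk] = dict(counts)
    return summary

# Hypothetical sample: one event from the OS drive among several from a portable HD.
SAMPLE = [
    {"disk": "Harddisk2", "event_id": 153},
    {"disk": "Harddisk2", "event_id": 153},
    {"disk": "Harddisk0", "event_id": 7},
    {"disk": "Harddisk3", "event_id": 51},
    {"disk": "Harddisk2", "event_id": 51},
]
```

With the sample above, `summarize(SAMPLE)` lists `Harddisk0` on its own line of the summary even though it only logged a single event.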

Reliability history is clear, so are sfc and dism.

When it happened this time, CrystalDiskInfo flagged the composite temperature of the drive:

zO4vQcq.jpeg


Not very hot, but hmm.

Another datapoint: this drive is not as healthy as I'd like it to be. It's my newest SSD and according to its SMART data its health is at 82%. That's the same as another SSD I used for the OS for some years before this one, and worse than my oldest SSD, which also carried the OS once: the OCZ from the Toshiba arm that folded 8 years ago. 😂

I haven't run chkdsk on the Adata yet. I have to look into how it works on SSDs and whether my system already has the update that fixed a rare Windows 10 issue causing a BSOD after running chkdsk on an SSD containing the OS. I don't expect it to find "bad blocks" anyway; so far none of the disks reported as having them actually had any.

I have HWiNFO64 logs this time around! Didn't see anything very noteworthy at the time the Adata had its issue, but some interesting findings:

- AVCC3 fluctuates constantly, from 0 V to 6.096 V. It wasn't at the min/max at the time.
- RING: Max VR Voltage, ICCmax, PL4 keeps flipping from No to Yes and back. It was "Yes" in the records logged closest to when these bad blocks were reported (just a few seconds off).
- Core #3 C7 Residency [%] keeps changing as usual, but reached its lowest values for a moment around the time some of the events happened.
- The system wasn't under heavy load and not hot with no parts over 51ºC, but it was under more load than in the preceding moments.
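A rough way to line sensor log rows up with the event times, instead of eyeballing them, might look like the sketch below. It assumes the log has already been loaded into dicts with a parsed `time` field; parsing the real CSV export is left out because I'm not certain of its exact column layout, and any column name passed in is a stand-in rather than HWiNFO's real header.

```python
from datetime import datetime, timedelta

def rows_near(rows, event_time, window_s=10):
    """Return log rows whose timestamp is within window_s seconds of event_time."""
    window = timedelta(seconds=window_s)
    return [r for r in rows if abs(r["time"] - event_time) <= window]

def flag_changes(rows, column):
    """Return the timestamps where the value in `column` differs from the previous row."""
    changes = []
    for prev, cur in zip(rows, rows[1:]):
        if cur[column] != prev[column]:
            changes.append(cur["time"])
    return changes
```

Running `rows_near` once per disk event and then `flag_changes` on the Yes/No limit column would show whether the flips really cluster around the errors or happen all the time anyway.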

So, about the PSU: I have a multimeter if I need to check anything. I also have an oscilloscope, but I don't know where the cable or probes are and I don't have a clue how to operate it. I can try to find them though.
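If I do take multimeter readings, they could be checked against the commonly cited ATX tolerance of ±5% on the main positive rails. A small sketch of that check; the limits below are that ±5% band worked out by hand, and the readings in the usage note are hypothetical. A multimeter only shows a steady DC value, so a pass here would not rule out ripple or brief dips.

```python
# Rail name -> (minimum, maximum) in volts, i.e. nominal ±5%.
RAIL_LIMITS = {
    "+12V": (11.40, 12.60),
    "+5V": (4.75, 5.25),
    "+3.3V": (3.135, 3.465),
}

def check_reading(rail, measured_v):
    """Return True if a measured voltage is inside the tolerance band for that rail."""
    low, high = RAIL_LIMITS[rail]
    return low <= measured_v <= high

def out_of_spec(readings):
    """Return the rails whose readings fall outside their band, with the measured value."""
    return {rail: v for rail, v in readings.items() if not check_reading(rail, v)}
```

For example, `out_of_spec({"+12V": 11.9, "+5V": 4.6, "+3.3V": 3.3})` would single out the +5V rail.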