[SOLVED] 3 of 4 AMD GPUs No Longer Detected (Post-Kraken x42 Liquid Cool Mod) - PCIe Detection Issue? Help Desperately Needed!

Status
Not open for further replies.
Jul 6, 2019
Hello All,

First time poster, long time lurker, with an unusual problem. Would be extremely grateful to have your input on the following.

I've been running the following setup as a combo crypto miner and data analysis PC (my setup is sufficient for my analysis area) for several months without issue now (specs in Table 1).

To address temperature issues (the stock fans were insufficient), I purchased 4x NZXT Kraken x42 liquid coolers (plus NZXT G12 mounting kits). All instructions were followed to a tee, with good anti-static practice throughout. The coolers are working just fine (wonderfully, in fact; the detected GPU barely exceeded 70°F during a benchmark).

For unknown reasons, post-mod, my MB is refusing to recognize three of the four GPUs (1x MB-mounted, 2x on PCIe risers). Only one of the MB-mounted GPUs is consistently identified; infrequently, the second MB-mounted one shows up as a "hidden" adapter in Device Manager. Please note that I made absolutely no software or BIOS changes during or after the cooler upgrade, with the exception of installing CAM.

Upon first boot, the build turned on for several seconds, before turning itself off, and then turning back on again. I do not have the expertise to discern what the significance of this is, and it hasn't happened again since.

I've updated the BIOS, reset the CMOS, quadruple-checked all the wiring, ensured there is only one power line per riser board and per GPU, re-seated the risers, replaced said risers, fiddled with the PCIe Gen settings in my MB's BIOS, run DDU, and uninstalled and reinstalled AMD Crimson... Absolutely no change. Usually only one GPU (occasionally two) is detected and listed in Device Manager or AMD's WattMan.

I've noticed something unusual, however. In Run > msinfo32, my amateur reading of the entries under Hardware Resources > Conflicts/Sharing suggests that the GPUs are in fact being seen by the MB (Table 2).

Any guidance with this issue please? Can provide more info if required (but I think this is sufficient for a first post).

Thanks a bunch in advance!



TABLE 1
Version:                      10.0.17763 Build 17763
Other OS Description:         Not Available
OS Manufacturer:              Microsoft Corporation
System Name:                  DESKTOP-9T45SGB
System Manufacturer:          Gigabyte Technology Co., Ltd.
System Model:                 AB350-Gaming
System Type:                  x64-based PC
System SKU:                   Default string
Processor:                    AMD Ryzen 3 1200 Quad-Core Processor, 3100 Mhz, 4 Core(s), 4 Logical Processor(s)
BIOS Version/Date:            American Megatrends Inc. F40, 14/06/2019
SMBIOS Version:               3.2
Embedded Controller Version:  255.255
BIOS Mode:                    UEFI
BaseBoard Manufacturer:       Gigabyte Technology Co., Ltd.
BaseBoard Product:            AB350-Gaming-CF
BaseBoard Version:            x.x
GPUs:                         AMD Radeon Sapphire+ 8GB RX480 x3, AMD MSI 8GB RX480 (4 in total)

TABLE 2
Memory Address 0xFCC00000-0xFCCFFFFF PCI Express Downstream Switch Port
Memory Address 0xFCC00000-0xFCCFFFFF Realtek PCIe GbE Family Controller

I/O Port 0x00000000-0x0000000F Direct memory access controller
I/O Port 0x00000000-0x0000000F PCI Express Root Complex

Memory Address 0xF0300000-0xF03FFFFF PCI Express Upstream Switch Port
Memory Address 0xF0300000-0xF03FFFFF PCI Express Root Port
Memory Address 0xF0300000-0xF03FFFFF PCI Express Downstream Switch Port
Memory Address 0xF0300000-0xF03FFFFF Realtek PCIe GbE Family Controller

I/O Port 0x000003C0-0x000003DF Radeon (TM) RX 480 Graphics
I/O Port 0x000003C0-0x000003DF PCI Express Root Port


I/O Port 0x0000F000-0x0000FFFF PCI Express Upstream Switch Port
I/O Port 0x0000F000-0x0000FFFF PCI Express Root Port
I/O Port 0x0000F000-0x0000FFFF PCI Express Downstream Switch Port
I/O Port 0x0000F000-0x0000FFFF Realtek PCIe GbE Family Controller

Memory Address 0xFC900000-0xFCBFFFFF PCI Express Root Port
Memory Address 0xFC900000-0xFCBFFFFF AMD USB 3.0 eXtensible Host Controller - 1.0 (Microsoft)

IRQ 55 High Definition Audio Bus
IRQ 55 Microsoft ACPI-Compliant System

I/O Port 0x0000E000-0x0000E0FF Radeon (TM) RX 480 Graphics
I/O Port 0x0000E000-0x0000E0FF PCI Express Root Port


Memory Address 0xFEE00000-0xFFFFFFFF PCI Express Root Complex
Memory Address 0xFEE00000-0xFFFFFFFF Motherboard resources

Memory Address 0xE0000000-0xEFFFFFFF Radeon (TM) RX 480 Graphics
Memory Address 0xE0000000-0xEFFFFFFF PCI Express Root Complex
Memory Address 0xE0000000-0xEFFFFFFF PCI Express Root Port

Memory Address 0xA0000-0xBFFFF Radeon (TM) RX 480 Graphics
Memory Address 0xA0000-0xBFFFF PCI Express Root Complex
Memory Address 0xA0000-0xBFFFF PCI Express Root Port

I/O Port 0x000003B0-0x000003BB Radeon (TM) RX 480 Graphics
I/O Port 0x000003B0-0x000003BB PCI Express Root Complex
I/O Port 0x000003B0-0x000003BB PCI Express Root Port

Memory Address 0xFCF00000-0xFCF3FFFF Radeon (TM) RX 480 Graphics
Memory Address 0xFCF00000-0xFCF3FFFF PCI Express Root Port

IRQ 0 High precision event timer
IRQ 0 System timer
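
For anyone who wants to read a Conflicts/Sharing listing like Table 2 more systematically, here is a minimal Python sketch that groups the lines by shared address range and lists the ranges a given device appears in. The function names (`group_shared_ranges`, `ranges_for`) and the line format assumed by the regex are my own illustration based on the rows above, not anything msinfo32 provides. Note that every GPU row carries the same name, "Radeon (TM) RX 480 Graphics", so the name alone cannot tell individual cards apart.

```python
import re
from collections import defaultdict

# Assumed line shape, taken from Table 2:
# "<Memory Address|I/O Port> 0xSTART-0xEND <device name>"
RANGE_LINE = re.compile(
    r"^(Memory Address|I/O Port)\s+(0x[0-9A-Fa-f]+)-(0x[0-9A-Fa-f]+)\s+(.+)$"
)


def group_shared_ranges(text):
    """Map (kind, start, end) -> list of device names sharing that range."""
    groups = defaultdict(list)
    for line in text.splitlines():
        match = RANGE_LINE.match(line.strip())
        if match:  # IRQ rows and blank lines are skipped
            kind, start, end, device = match.groups()
            groups[(kind, int(start, 16), int(end, 16))].append(device)
    return dict(groups)


def ranges_for(groups, needle):
    """Sorted list of ranges in which any device name contains `needle`."""
    return sorted(
        key for key, devices in groups.items()
        if any(needle in device for device in devices)
    )


# A few rows copied from Table 2
SAMPLE = """\
I/O Port 0x000003C0-0x000003DF Radeon (TM) RX 480 Graphics
I/O Port 0x000003C0-0x000003DF PCI Express Root Port
Memory Address 0xE0000000-0xEFFFFFFF Radeon (TM) RX 480 Graphics
Memory Address 0xE0000000-0xEFFFFFFF PCI Express Root Complex
Memory Address 0xE0000000-0xEFFFFFFF PCI Express Root Port
IRQ 55 High Definition Audio Bus
"""

groups = group_shared_ranges(SAMPLE)
for kind, start, end in ranges_for(groups, "Radeon"):
    print(f"{kind} {start:#x}-{end:#x} shared by: {groups[(kind, start, end)]}")
```

Pasting the full Table 2 text into `SAMPLE` gives the complete list of ranges the Radeon entry shares with the root port and root complex.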
 
Jul 6, 2019
I've just run the "DriverEasy" tool and the first three drivers that "require updating" are ones I hadn't seen prior to the cooler upgrade ("AMD PCI" x2, "AMD PSP 3.0 Device").

Could it be that, for whatever reason, the installation of CAM (the fan control software bundled with the Kraken) is causing an unexpected driver misconfiguration?

If so, does anyone know how I can specifically rectify this issue?
 
Jul 6, 2019
Update.

I've cycled each of the other GPUs through the top PCIe bracket to make sure none of them were bricked.

All of them work. All register right away in Device Manager and work just fine (they just don't have the AMD drivers installed yet).

So, my problem is definitely related to the MB's detection of the PCIe risers. These were all working just fine prior to the liquid cool mod.

Any help, gals and guys?
 
Jul 6, 2019
I have discovered the root of the issue, albeit at some financial cost (and in retrospect, this wasn't something I should've overlooked).

My PSU was overloading due to a surplus in initial demand.

While I was fiddling away with the drivers and rotating the various GPUs, my rig shut down unexpectedly two days ago (if you'll recall, this happened within a day of purchasing the Kraken coolers). Upon reboot, I smelled burning plastic. I quickly shut everything off and inspected the various components up close. Sure enough, the smell was strongest at the PSU.

I did a quick repeat calculation of the new power demands for my rig. In short, 4x Kraken x42s would increase my power consumption by about 60W. I made the assumption that my existing PSU (1kW) could safely take this.
In my pre-purchase estimates, I made the error of assuming that my undervolted+overclocked GPU power consumption range (90-120W) would also apply at initial boot (it doesn't).
So, although a 1kW PSU was fine for my rig during mining pre-mod (estimated draw is ~810-820W), it turns out that I was always just over 1kW on boot. As such, adding those Krakens pushed the boot demand to at least 1.1kW.
Inadequate power also explains why all the watercool-equipped GPUs worked just fine when they were mounted to the motherboard individually, but adding

As it's the second time this has happened, I don't think the PSU is fit for purpose any longer. I'll be purchasing a 1.5kW Gold+ PSU to be on the safe side.
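
For anyone wanting to repeat the sums, here is a rough Python sketch of the headroom check using the figures from this post. The helper name `headroom_watts` and the constant names are made up for illustration, and the wattages are my own estimates quoted above (not measurements), so treat the output as a sanity check only.

```python
def headroom_watts(psu_capacity_w, load_w):
    """Spare capacity in watts; a negative value means the load exceeds the PSU rating."""
    return psu_capacity_w - load_w


# Figures quoted in this post (estimates, not measured values)
PSU_OLD_W = 1000         # existing 1kW PSU
PSU_NEW_W = 1500         # planned 1.5kW replacement
COOLERS_W = 60           # approximate extra draw of 4x Kraken x42
MINING_PRE_MOD_W = 820   # upper end of the ~810-820W mining estimate
BOOT_POST_MOD_W = 1100   # "at least 1.1kW" at boot after the mod

scenarios = {
    "mining, post-mod": MINING_PRE_MOD_W + COOLERS_W,
    "boot, post-mod": BOOT_POST_MOD_W,
}

for name, load in scenarios.items():
    for label, psu in (("1kW", PSU_OLD_W), ("1.5kW", PSU_NEW_W)):
        spare = headroom_watts(psu, load)
        status = "OK" if spare >= 0 else "OVERLOADED"
        print(f"{name:18s} on {label:5s} PSU: {load}W load, {spare:+d}W headroom ({status})")
```

The point of checking two scenarios is that the steady-state mining load fits the old PSU while the boot load does not, which is exactly the case I missed by only budgeting for the undervolted figures.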

Just posting this so that others who run into the same thing can troubleshoot as I've done (and I'd advise very carefully ruling out a power supply issue from the start).
 