[SOLVED] Identify a GPU which is not working well among 12 identical GPUs in a system.

Jan 22, 2021
Hi, I have 12 identical GPUs in a system, and one of them goes to 85°C while all the others stay around 65°C. I believe the thermal paste or the memory thermal pads have degraded and need to be replaced. I don't want to test the cards one by one to identify the GPU. Is there any way I could identify which slot it is in so I can fix it? HWiNFO64 shows all the cards as identical, and I couldn't find which slot it is.
 
Speccy should be able to help, though I'm not 100% certain it will in your case.
 
I have a different approach to solving this issue. Photoshop shows the serial numbers of all the GPUs, but it doesn't show temperatures, so it can't identify the problematic GPU.
I can see the serial number on the top of every GPU, but I have yet to identify which serial number has the problem.
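If the cards are NVIDIA (an assumption, the thread doesn't say), `nvidia-smi` can print index, PCI bus ID, serial and temperature side by side, which is the serial-to-temperature mapping being looked for here. A minimal sketch; the function names are made up for illustration, and many consumer cards report `[N/A]` for the serial, in which case the PCI bus ID is the useful part:

```python
import csv
import io
import subprocess

# Field names accepted by `nvidia-smi --query-gpu=...`
QUERY = "index,pci.bus_id,serial,temperature.gpu"


def parse_readings(csv_text):
    """Turn nvidia-smi CSV rows into dicts, skipping rows without a numeric temperature."""
    rows = []
    for fields in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(fields) != 4:
            continue  # blank or malformed line
        index, bus_id, serial, temp = (f.strip() for f in fields)
        if not (index.isdigit() and temp.isdigit()):
            continue  # e.g. temperature reported as [N/A]
        rows.append({"index": int(index), "bus_id": bus_id,
                     "serial": serial, "temp_c": int(temp)})
    return rows


def hottest(rows):
    """Return the reading with the highest temperature."""
    return max(rows, key=lambda r: r["temp_c"])


def read_gpus():
    """Query every GPU once; assumes nvidia-smi is installed and on PATH."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_readings(result.stdout)
```

Running `hottest(read_gpus())` while the rig is under load would give the bus ID of the hot card, which can then be matched to a physical slot or riser.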
 
Okay, I solved this problem. Here is how, in case you have the same problem.

I disabled the GPUs one by one until I found the right one.

a) Disable a GPU.
b) Start mining.
c) Check MSI Afterburner to see whether the disabled GPU was the hottest one.
d) Stop mining.
e) Repeat.

When you find the right one, another question will arise: which of the 12 GPUs is it physically? That one is easy.

Disable all GPUs except the hottest one, open the case, and find the hot card; all the disabled ones will be cold. That's my identification solution until I find something more classy.
 
To find out the temperature of things, an infrared thermometer will work, like this one https://www.amazon.com/Etekcity-Las...ca-403a-923c-8152c45485fe&tag=bargainsbaby-20

Run the system, point it at the video cards, see which one gets the hottest.

The procedure you used was a bit backwards: you don't want to disable the cards one by one, you want to enable them one at a time and measure the temperature of that single card. Since you know one is in the 80s, as soon as you hit the one that gets that hot, you are done.
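The enable-one-at-a-time search with an early stop can be sketched roughly like this. Everything here is hypothetical: `measure_alone` stands for whatever you do by hand or by script to enable a single card, load it, and read its temperature, and the 80°C threshold is just a guess based on the 85°C vs 65°C figures in this thread.

```python
def find_hot_gpu(gpu_ids, measure_alone, threshold_c=80):
    """Test cards one at a time and stop at the first one that runs hot.

    gpu_ids       -- iterable of card identifiers (slot numbers, indices, ...)
    measure_alone -- hypothetical callback: enables only that card, puts it
                     under load, and returns its temperature in degrees C
    threshold_c   -- assumed cutoff separating the hot card from the rest
    """
    for gpu_id in gpu_ids:
        temp = measure_alone(gpu_id)
        if temp >= threshold_c:
            # Found it, so the remaining cards never need to be tested.
            return gpu_id, temp
    return None  # no card reached the threshold
```

Because the loop returns on the first hot reading, on average only about half of the 12 cards need to be tested.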
 