[SOLVED] Ethernet devices getting disconnected from network

CubsWin

Distinguished
Apr 20, 2012
88
3
18,535
Recently, I have been experiencing an occasional problem where all devices connected to my switches by ethernet are becoming disconnected from my network. I haven't been able to reproduce or find any patterns in when this occurs. I could go several days without problems and then have it happen twice within a few hours. I included a picture below of my approximate network setup. The switches on my network are all ZyXEL GS108Bv2 unmanaged switches.

When this problem occurs, every device connected to and downstream of Switch #1 is losing its connection to my network. That includes everything connected to Switch #1, Switch #2, and Switch #3. The devices connected directly to my router still function properly and are not impacted. I have also noticed that, along with losing network connections, my Blu-ray player on Switch #2 locks up when this happens and has to be rebooted. I am including that information in case it is somehow relevant to what is happening here. It may just be a side effect of losing its network connection.

The one new component in my network is the Synology NAS, connected to Switch #2. Prior to adding that NAS a few weeks ago, I had never experienced this issue. The last time my devices lost connection (overnight last night), I started troubleshooting by pulling the ethernet plug on the Synology NAS, but none of my devices regained their connection. I then disconnected my Blu-ray player, but that didn't work either. Then, I disconnected the cable that connects Switch #1 and Switch #2. Upon doing that, all devices on Switch #1 and Switch #3 regained their network connection. I then re-connected Switch #2 and those devices came back up as well.

What could cause this issue and what should I do to troubleshoot it? Would adding a managed switch be beneficial to pinpoint and/or resolve any problems? I would prefer not to disconnect the NAS, as I am using it and it could take several days or longer before I know whether disconnecting it has made any difference. I would appreciate any advice though and possible next steps. Thanks!

[Image: approximate network setup diagram (oNz6bJc.jpg)]
 
kanewolf

Titan
Moderator
Since switch 2 and 3 are both connected to switch 1, then a problem with switch 1 will cause them all to have a problem.

I think you will have to do a lot more troubleshooting, starting with pings. You generally don't have devices that can do ping testing to each other, so you will probably have to move a laptop around from switch to switch to help.
Getting managed switches is another recommendation.

Do your devices have static IPs or DHCP?
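
One way to do the ping testing described above without watching a laptop all night is a small script that pings a few devices behind each switch and logs when each one goes up or down. This is only a rough sketch in Python. The function names (`build_ping_command`, `is_reachable`, `state_changes`, `monitor`) are made up for illustration, and any addresses you pass in would need to be replaced with the real ones on your network.

```python
import platform
import subprocess
import time
from datetime import datetime


def build_ping_command(host, system=None):
    """Return a command that sends a single ping on the given OS."""
    system = system or platform.system()
    if system == "Windows":
        # -n 1: one echo request; -w 1000: wait up to 1000 ms for a reply
        return ["ping", "-n", "1", "-w", "1000", host]
    # Linux/macOS: -c 1 sends one echo request; the overall time limit
    # is enforced by the subprocess timeout in is_reachable().
    return ["ping", "-c", "1", host]


def is_reachable(host, timeout_s=3.0):
    """Ping once; treat exit code 0 as 'a reply came back'."""
    try:
        result = subprocess.run(
            build_ping_command(host),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def state_changes(previous, current):
    """Return the hosts whose up/down state differs from the last sweep."""
    return {h: up for h, up in current.items() if previous.get(h) != up}


def monitor(hosts, interval_s=10, sweeps=None, probe=is_reachable):
    """Ping every host each sweep and print a line on every state change.

    hosts maps a label (e.g. "nas") to an address. sweeps=None runs forever.
    """
    previous = {}
    count = 0
    while sweeps is None or count < sweeps:
        current = {name: probe(addr) for name, addr in hosts.items()}
        for name, up in state_changes(previous, current).items():
            stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
            print(f"{stamp} {name} is {'UP' if up else 'DOWN'}")
        previous = current
        count += 1
        if sweeps is None or count < sweeps:
            time.sleep(interval_s)
    return previous
```

For example, calling `monitor({"router": "192.168.1.1", "nas": "192.168.1.50"})` (placeholder addresses) from a laptop plugged into one of the switches would print a timestamped line each time a device drops or comes back, which gives you a record to compare against when the outage happens.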
 
Solution

Ralston18

Titan
Moderator
I see two routers: which router is assigning DHCP IP addresses?

Do you have any devices (e.g., the NAS) with a static IP address?

If so, did you assign that static IP address such that the IP address is outside of the available DHCP IP address range allowed to the router and that static IP address is reserved on the router via the device's MAC?
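
The questions above boil down to three checks: do the two routers' DHCP pools overlap, does each static or reserved address sit inside or outside the pool, and is any address assigned to more than one device. A minimal sketch of those checks in Python follows. Every address and device name in it is a hypothetical placeholder, not taken from this network, and the helper names are invented for illustration.

```python
import ipaddress


def in_pool(ip, pool_start, pool_end):
    """True if ip lies inside the inclusive DHCP pool [pool_start, pool_end]."""
    addr = ipaddress.ip_address(ip)
    return ipaddress.ip_address(pool_start) <= addr <= ipaddress.ip_address(pool_end)


def pools_overlap(pool_a, pool_b):
    """True if two (start, end) DHCP pools share at least one address."""
    a_start, a_end = (ipaddress.ip_address(x) for x in pool_a)
    b_start, b_end = (ipaddress.ip_address(x) for x in pool_b)
    return a_start <= b_end and b_start <= a_end


def find_duplicate_ips(assignments):
    """Given device -> IP, return {ip: [devices]} for IPs used more than once."""
    by_ip = {}
    for device, ip in assignments.items():
        by_ip.setdefault(str(ipaddress.ip_address(ip)), []).append(device)
    return {ip: sorted(devs) for ip, devs in by_ip.items() if len(devs) > 1}


# Hypothetical values: replace with the pools and reservations from your routers.
primary_pool = ("192.168.1.100", "192.168.1.200")
vpn_pool = ("192.168.2.100", "192.168.2.200")
reserved = {"nas": "192.168.1.150", "bluray": "192.168.1.151"}

print("Pools overlap:", pools_overlap(primary_pool, vpn_pool))
for device, ip in reserved.items():
    print(device, ip, "inside primary pool:", in_pool(ip, *primary_pool))
print("Duplicate IPs:", find_duplicate_ips(reserved))
```

An address inside the pool is not a problem by itself if the router is the one handing it out as a reservation. The case to look for is an address typed into the device by hand that the router might also lease to something else.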
 

kanewolf

Titan
Moderator
I see two routers: which router is assigning DHCP IP addresses?

Do you have any devices (e.g., the NAS) with a static IP address?

If so, did you assign that static IP address such that the IP address is outside of the available DHCP IP address range allowed to the router and that static IP address is reserved on the router via the device's MAC?
I assumed the "VPN" router was connected via the WAN port to the switch, so a DHCP server on that wouldn't matter, since most routers won't support a VPN except through the WAN port.
 

CubsWin

I see two routers: which router is assigning DHCP IP addresses?

Do you have any devices (e.g., the NAS) with a static IP address?

If so, did you assign that static IP address such that the IP address is outside of the available DHCP IP address range allowed to the router and that static IP address is reserved on the router via the device's MAC?

Both routers are assigning IPs with DHCP, but they are each using different, non-overlapping ranges. I do have some devices, including the NAS, that are using static/reserved IPs (all configured on my primary router). Those IPs are within the DHCP address range, because my router enforces that reserved IPs still have to be within the range.

I assumed the "VPN" router was connected via the WAN port to the switch, so a DHCP server on that wouldn't matter, since most routers won't support a VPN except through the WAN port.

You are correct, the switch is connected to the WAN port on the VPN router.
 

CubsWin

Since switch 2 and 3 are both connected to switch 1, then a problem with switch 1 will cause them all to have a problem.
That makes sense, but where I am lost is in figuring out "what" the problem is with Switch 1. Obviously whatever the problem is, it is impacting Switch 1 and everything connected to it, but I don't know whether the problem is Switch 1 itself, or something downstream that is somehow causing Switch 1 to drop all connections. Everything works fine most of the time, until it doesn't... and once the devices lose their connection they don't seem to re-establish it until I do something (disconnect the cable between Switch 1 and 2).
 
There's a really easy way to check this--swap switches since they're all the same and all unmanaged dumb ones. I would take #1 and put it in the position of #3 and see what happens. If #3 ends up having the same issue as #1 and #2, that conclusively clears #2. Then I would move #2 to where #1 is and move #3 to #2. If you no longer have any problems, then all may be good.

The root cause behind something like this is very, very, very, very (notice all the verys) difficult to find since you're dealing with dumb devices that should work without issues for years and also not interact in any way with the packets they are handling. However, bugs do exist that in certain very, very, very rare configuration situations may rear their head. And debugging these is just about narrowing it down and trying something different that causes it not to happen, if possible. Luckily, they're just dumb switches and these are cheap. Worst case, replace #1 with another brand and the problem will probably go away completely.

But this is all assuming there are no configuration errors. If there is a config error, no matter how you swap the switches, you'll always have that issue between #1 and #2. And this is still one method to narrow it down. :)
 
Ralston18

Titan
Moderator
I see one PC in the diagram.

What might prove helpful is to run "arp -a" via the command prompt on that PC and take a look at the IP addresses and corresponding MACs.

Determine if all are as expected with no duplicates and/or no missing devices.

Run "arp -a" in between switch swaps etc. and compare the results. May help narrow down the culprit(s).
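
Comparing "arp -a" output by eye gets tedious, so here is a rough Python sketch of the comparison described above. It assumes you have saved the command's output to text files before and after a change. The parsing is a best-effort regular expression for the usual Windows and Linux/macOS layouts, and the function names are invented for illustration.

```python
import re

# An IPv4 address followed, later on the same line, by a MAC address written
# with "-" (Windows) or ":" (Linux/macOS) separators.
ENTRY = re.compile(
    r"(\d{1,3}(?:\.\d{1,3}){3})\b.*?"
    r"\b((?:[0-9A-Fa-f]{1,2}[:-]){5}[0-9A-Fa-f]{1,2})\b"
)


def parse_arp(text):
    """Return {ip: mac} from 'arp -a' output, skipping broadcast/multicast."""
    table = {}
    for line in text.splitlines():
        match = ENTRY.search(line)
        if not match:
            continue  # header lines, interface lines, incomplete entries
        ip, raw_mac = match.groups()
        octets = [o.zfill(2).lower() for o in re.split(r"[:-]", raw_mac)]
        if int(octets[0], 16) & 1:
            continue  # group bit set: broadcast or multicast, not a device
        table[ip] = ":".join(octets)
    return table


def shared_macs(table):
    """Return {mac: [ips]} for any MAC that answers for more than one IP."""
    by_mac = {}
    for ip, mac in table.items():
        by_mac.setdefault(mac, []).append(ip)
    return {mac: sorted(ips) for mac, ips in by_mac.items() if len(ips) > 1}


def diff_tables(before, after):
    """Summarize what changed between two parsed ARP tables."""
    common = set(before) & set(after)
    return {
        "missing": sorted(set(before) - set(after)),
        "new": sorted(set(after) - set(before)),
        "changed": sorted(ip for ip in common if before[ip] != after[ip]),
    }
```

To use it, save snapshots with something like `arp -a > before.txt` and `arp -a > after.txt`, read each file, and pass the parsed tables to `diff_tables`. A device that shows up under "missing" right after the network drops, or an IP whose MAC has "changed", would be worth a closer look.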
 

CubsWin

There's a really easy way to check this--swap switches since they're all the same and all unmanaged dumb ones. I would take #1 and put it in the position of #3 and see what happens. If #3 ends up having the same issue as #1 and #2, that conclusively clears #2. Then I would move #2 to where #1 is and move #3 to #2. If you no longer have any problems, then all may be good.
Thank you for the suggestion. I think what I'm going to do is replace Switch #1 with a managed switch. One of the cheap NETGEAR GS308E managed switches, not fully managed but good enough for my purposes. If I still have problems after that, then I can swap the old Switch #1 in for Switch #2, and likewise for Switch #3 if needed. Hopefully the problems are resolved by that point, but if not, then the problem is somewhere else (configuration, wiring, etc.). Let's hope it doesn't get that far.
 
Thank you for the suggestion. I think what I'm going to do is replace Switch #1 with a managed switch. One of the cheap NETGEAR GS308E managed switches, not fully managed but good enough for my purposes. If I still have problems after that, then I can swap the old Switch #1 in for Switch #2, and likewise for Switch #3 if needed. Hopefully the problems are resolved by that point, but if not, then the problem is somewhere else (configuration, wiring, etc.). Let's hope it doesn't get that far.
Managed switches will add complexity and another point of failure. I would avoid them unless you need them.
 

kanewolf

Titan
Moderator
Managed switches will add complexity and another point of failure. I would avoid them unless you need them.
I disagree. Managed switches will work just like unmanaged out of the box. It only gets complex if you choose to configure. But without managed, you get no insight. No SNMP, no syslog, no port statistics. What is the uptime on an unmanaged switch? Who knows...
 

Ralston18

Titan
Moderator
Will crawl out on the proverbial limb a bit....

I noted that many network devices including Switch #1 and Switch #2 are in the basement.

Not sure about how many electrical circuits are involved with respect to both house and basement but what about a ground loop...?

If the problem(s) are occasional could they be related to which network devices are on at any given time?

Because the problems appear to have started with the NAS, why not try temporarily disconnecting it?

Or move it to the Office or Family Room.
 

CubsWin

In an interesting turn of events...

I have not yet swapped any of the switches, but the one thing I did was disconnect my Blu-ray player from the network. As I mentioned before, there seemed to be an odd correlation between the Blu-ray player locking up and my other devices losing their network connection.

Since that time, my Blu-ray player has locked up on two occasions, but there have been no issues whatsoever with my network. I need to continue monitoring for a while, but is it crazy to think that a malfunctioning device could have somehow been impacting other connected devices on the network? Is there any precedent for that sort of issue?

I still don't understand why it would affect the devices on Switch #1, since the Blu-ray player was connected further downstream, but maybe there are other factors involved. It all may just be coincidence, but it is a possible theory to consider.
 

CubsWin

I'm back with an update. Since disconnecting the Blu-ray player (an Oppo BDP-103D by the way) from my network, this problem has not occurred again. I have been stable for 3 weeks without any disconnects. One other thing I noticed, while investigating why I was losing signal, was that I think I had a bit of a loose connection on the HDMI cable coming out of the player. Since reseating the cable, I have not lost signal from the player at all. So, my working theory, which I cannot prove in any way, is that the loose HDMI connection was causing the signal to drop out on occasion, and "somehow" that occurrence was also propagating through the Ethernet cable to disrupt my switches. I don't know whether that is really technically possible, but that is my conclusion based on circumstantial evidence. Now that everything seems stable, I will reconnect the Ethernet cable to my player to test whether the problem has been completely resolved.