[SOLVED] Weird LAN/Wifi issue - must read

Nov 2, 2019
49
1
35
Hello,
In my house I am using a LAN as shown on this simplified drawing.
The primary router 192.168.0.1 is a Netgear D7000 VDSL router with the latest original firmware and the secondary 192.168.0.250 is a TP-Link TL-WR0146ND with DD-WRT v24SP2 (DHCP disabled) – connected via cable and used to improve WiFi coverage upstairs in my house. The switch is an unmanaged Netgear ProSafe 24 port gigabit switch.

When connecting using different wireless devices with both iOS and Android I have the following issue:
If the devices connect via WiFi A on my primary router, they can access the SAT receiver and the NAS without any issues.
When I connect them using WiFi B on my secondary router, both devices can access the SAT receiver and the NAS without any issues.

However, when I now switch back to WiFi A, neither device can access the SAT receiver and the NAS.
They don't see the devices on the network and cannot even ping them.
This situation lasts for some 8 minutes and 30 seconds after which the SAT and NAS become pingable and visible from my iOS and Android devices (this applies to all wired devices connected to my switch – all become visible after the said 8 minutes 30 seconds).

If I repeat this process – WiFi A – WiFi B – WiFi A – the result is always the same, and it takes 8.5 minutes until the SAT receiver and the NAS are visible on WiFi A again.
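For anyone who wants to reproduce the timing, the recovery delay described above can be measured with a small script rather than a stopwatch. This is only a sketch: the `ping` flags assume Linux iputils (`-c` count, `-W` timeout in seconds), and the host address in the usage comment is hypothetical, not taken from the diagram.

```python
import subprocess
import time
from typing import Callable, Optional


def ping_once(host: str) -> bool:
    """Send one ICMP echo request and report whether a reply arrived.

    Assumes Linux iputils ping flags; other systems use different options.
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "1", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def seconds_until_reachable(
    host: str,
    probe: Callable[[str], bool] = ping_once,
    interval: float = 1.0,
    timeout: float = 900.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Probe the host repeatedly; return elapsed seconds at the first success.

    Returns None if the host never answers within `timeout` seconds.
    """
    start = clock()
    while clock() - start < timeout:
        if probe(host):
            return clock() - start
        sleep(interval)
    return None


# Usage (hypothetical NAS address), started right after switching back to WiFi A:
#   print(seconds_until_reachable("192.168.0.10"))
```

The probe, clock and sleep functions are parameters only so the timing loop can be checked without touching the network.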

I have absolutely no idea what could be causing this. There are no other issues with the network: all wired and wireless devices are working properly, the speeds are good, and the VDSL connection works fine. The issue has now existed for over 2 years, and in that time the Netgear has received several firmware updates, but nothing has changed.
Any advice appreciated.
Marcel
 
Nov 2, 2019
49
1
35
Are all old cables replaced with new cables including those connected to both routers and the NAS?

^ I'll assume so unless told differently. So all cables and the switch were replaced. That's good, now you have a proper physical level infrastructure which a LAN should have.

Next, we look at the IP scheme. The diagram leaves many questions. Let's dig in. The following devices are outside of Router A's DHCP range but not marked as static on the diagram. So I'm going to ask:
  1. Is Router A on a static IP address?
  2. Is the NAS on a static IP address?
  3. Is PC1 on a static IP address?
  4. Is PC2 on a static IP address?
  5. Is Chromecast on a static IP address?
  6. Is the Printer on a static IP address?
Next, the only DHCP range (192.168.0.2 to 192.168.0.30) you have on your network diagram comes from Router A. You said Router B has DHCP disabled but would you recheck that? We must know for sure:

  1. Is Router B setup to issue IP addresses via DHCP?
  2. See HERE. Which version is Router B (TL-WR0146ND); is it V1, V2, or V3?
We are looking for possible IP address conflicts, for example a wireless device being issued an IP address by Router B which it then has to drop in favor of a new one when it moves within range of Router A.
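The questions above boil down to checking each address against the DHCP pool. A minimal sketch of that check follows; the /24 mask and the NAS address are assumptions for illustration, while the pool limits and the two router addresses come from the thread.

```python
import ipaddress

SUBNET = ipaddress.ip_network("192.168.0.0/24")   # /24 mask is an assumption
DHCP_FIRST = ipaddress.ip_address("192.168.0.2")  # pool start from the diagram
DHCP_LAST = ipaddress.ip_address("192.168.0.30")  # pool end from the diagram


def classify(addr: str) -> str:
    """Say whether an address is in the subnet and inside the DHCP pool."""
    ip = ipaddress.ip_address(addr)
    if ip not in SUBNET:
        return "outside subnet"
    if DHCP_FIRST <= ip <= DHCP_LAST:
        return "inside DHCP pool"
    return "outside DHCP pool"


devices = {
    "Router A": "192.168.0.1",
    "Router B": "192.168.0.250",
    "NAS (hypothetical address)": "192.168.0.100",
}

for name, addr in devices.items():
    print(f"{name:28} {addr:15} {classify(addr)}")
```

A device that reports "outside DHCP pool" needs either a static configuration or a reservation on Router A; two devices reporting the same address would be the conflict being looked for.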

Now watch the following video and then ensure Router B (TP-Link TL-WR0146ND) is correctly configured as a wireless access point (WAP) to Router A on a wired (not wireless) connection either in access point mode or manually configured.

You can reach out to the TP-Link community https://community.tp-link.com/us or contact TP-Link directly https://www.tp-link.com/us/support/contact-technical-support/ for further help configuring Router B (the TP-Link TL-WR0146ND) as a network access point.

View: https://youtu.be/Cg_gGECGLiY
Yes, all cables have been replaced, everything is now CAT6a

  1. Yes Router A has a static address 192.168.0.1
  2. NAS, PC1, PC2, Chromecast and Printer all have static addresses assigned by router A based on their MAC address - it says so on the schematic.
  3. DHCP is disabled on router B.
We have already discussed the IP address of my wireless devices - we came to the conclusion that it remains the same all the time, however, when it connects to wifi A it is listed among wireless devices on router A and when it connects to wifi B it is listed among wired devices on router B.
4. As for the router B setup, my configuration was the one shown as second on the video - i.e. router B is connected to router A via LAN (not WAN) port and DHCP is disabled on Router B.

The TP-Link community will not help, as the router does not run official TP-Link firmware. I flashed it years ago because the original firmware would not let me change the IP range to 192.168.0.x; it only allowed 192.168.1.x.

I also experimented with enabling DHCP on both routers (just within different ranges). The network works; however, the main issue remains unsolved. In this case the IP address of the wireless device changes when switching from wifi A to B and back to A, yet the wireless device still does not see the wired devices for some 8 minutes.

Would it make any difference if I turn off DHCP on router A completely and only use DHCP on router B? (until now it was the other way round)

I am unable to set router B as an access point using method A from the video - LAN port on router A to WAN port on router B - not sure if I'm doing something wrong?
 
Nov 2, 2019
49
1
35
This is the bottom of router B
[Image: photo of the bottom of router B]


 

Ralston18

Titan
Moderator
A network should only have one router assigning DHCP IP addresses.

Not two routers. No problems with experimenting per se but if you keep changing things without some deliberate, methodical approach resolving the problems will continue to be difficult.

What will happen is it all may seem to work and then some new device or additional connection will again cause the network's performance to collapse.

Overall, as I understand the current on-going situation, there are at least two major problems at hand.

1) Cabling/infrastructure:

There is really little need for shielded ethernet cables in a home network environment. And if shielded cables are used then the existing F/FTP cables must be properly terminated, installed, and grounded (drained). The NO-Grounding shown on the tester indicates that grounding has not been done.

One floor is wired "crossover", the other floor is straight through. It may or may not matter, as some devices will sense the signals and adjust. However, it only takes one device to fail at doing so and network havoc results, for example when moving a device between floors.

No patch panel - okay. However, I believe that a patch panel is important as a solid, fixed reference point for all outgoing network cable runs. Not a fan of ethernet cables (or any cables) just dangling out of walls waiting to be plugged in somewhere. That's just me.

2) IP addressing:

There needs to be an overhaul with respect to your IP addressing scheme. Only one router assigning DHCP IP addresses to start with plus what DHCP IP address range is to be used. Also what devices are to be assigned Static IP addresses (outside of the DHCP IP address range) and reserving those IP addresses via the devices' MAC.

My suggestions:

First, focus on getting the cable plant/wiring corrected. I am sure that there are various ideas about how to do so. And there are Forum members here that can do such things in their sleep. Shielded cables and connectors are not as forgiving as UTP.

Second, establish the IP addressing scheme (Best Practices, etc.). That, again, is something that many members are more than capable of being able to work out. I have my preferences as do others. No matter what policy is applied the key is that that policy is consistent throughout the network.

And my thought is that once the network wiring is fixed/standardized the IP addressing issues will diminish.

Wiring first.
 
Nov 2, 2019
49
1
35
Thank you for the suggestions. Of course, when I did the experiment with two DHCP routers, I immediately put everything back the way it was. I realize that changing things at this stage would only cause more problems. Today I will get in touch with the guy who did my cables and ask him why it is connected the way it is. Will update, when there is something new.
Btw.: I did a test by completely disconnecting the first floor from the switch - there was physically no connection to that section of the network. Using only the ground floor sockets I connected router B to test if the problem persists - it did. So I very much doubt changing the wiring will do anything.
 
Nov 2, 2019
49
1
35
Just got off the phone with the guy who installed my cabling - as it seems, he only did the ground floor. The rest was done by someone else - thus the different wiring and unconnected ground. He is coming on Saturday to change the socket wiring and is bringing a network expert who claims to know my issue and should be able to fix it. I'll keep you posted.
 

Ralston18

Titan
Moderator
Very good.

No intention nor desire to go off the present course with respect to the cabling issues, etc...

However, in the meantime I have been thinking about that 8 1/2 minutes....

Wondered about some sort of network loop...

Found the following:

"
In the unexpected event that a router receives an IPv6 packet with a Hop Limit of zero and that packet is not destined for the router itself, it is discarded as you’d expect. If a router (or any IPv6 host) receives an IPv6 packet with a Hop Limit of zero or one and the host is the destination of that packet (it won’t be routed onwards), the packet is accepted. The IPv4 RFC is far less explicit and simply states that if this field is zero the ‘datagram’ must be destroyed.
The IPv4 RFC states that TTL is measured in seconds but acknowledges this is an imperfect measure. There is no way of knowing how long any particular host will take to process a packet and most will do so in far less than a second. In any case, based on this logic, in theory the maximum time a packet can exist in the network is 4.25 minutes (255 seconds). Hence, the TTL is described as a “self destruct time limit”.
"

Full disclosure: my underline.

Reference:

https://packetpushers.net/ip-time-to-live-and-hop-limit-basics/

4.25 minutes x 2 = 8.5 minutes (noted time to start working again).

4.25 x 1.5 = approx. 6.4 minutes (approx. 383 seconds; roughly 390 seconds was also a noted time).
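The arithmetic above can be written out explicitly. This sketch only reproduces the numbers under the seconds-based reading of TTL quoted from the article; it does not show that TTL is actually involved.

```python
MAX_TTL = 255  # largest value of the 8-bit IPv4 TTL field


def ttl_window_seconds(factor: float) -> float:
    """Seconds covered by `factor` full TTL windows, reading TTL as seconds."""
    return MAX_TTL * factor


def ttl_window_minutes(factor: float) -> float:
    """Same window expressed in minutes."""
    return ttl_window_seconds(factor) / 60


for factor in (1.0, 1.5, 2.0):
    print(
        f"x{factor}: {ttl_window_seconds(factor):.1f} s "
        f"= {ttl_window_minutes(factor):.3f} min"
    )
```

The x2 case gives 510 s (8.5 min) and the x1.5 case gives 382.5 s, so the match with the observed 390 s is approximate rather than exact.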

See posts #27 & #29

Why the doubling (i.e., x 2)? Two loops perhaps: One loop self destructs itself and then the second does the same...

Somewhat out of my comfort zone and general purview of things.

Any comments or thoughts about some network loop or loops and TTL being involved?

Would the existing routers/firmware eventually kill a loop?
 
Don't want to get this thread off track, but TTL is a layer 3 field. If I remember correctly, it is not an actual time; it is a hop count that is decremented by each layer 3 device in the path.
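The hop-count behaviour described here can be sketched with a toy model. It is simplified: each router on the path decrements TTL by one and discards the packet if the result reaches zero, and nothing else about routing is modelled.

```python
def routers_that_forward(initial_ttl: int, routers_in_path: int) -> int:
    """Count how many routers forward a packet before it arrives or expires.

    Simplified model: every router decrements TTL by one and drops the
    packet when the decremented value is zero.
    """
    ttl = initial_ttl
    forwarded = 0
    for _ in range(routers_in_path):
        ttl -= 1
        if ttl <= 0:
            break  # this router discards the packet instead of forwarding it
        forwarded += 1
    return forwarded


print(routers_that_forward(64, 3))     # short path: every router forwards
print(routers_that_forward(64, 1000))  # routing loop: stops when TTL runs out
```

In this model elapsed time never enters the calculation, only the number of routers crossed. On a single-subnet LAN where frames are only switched, no router sits between a WiFi client and a wired device, so TTL would not be decremented on that path at all.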

Most loops related to switches or cables are layer 2. They involve broadcast packets that loop forever. The symptom is a total lock-up of the network. This is prevented by a function called spanning tree. I have seen the feature even on some unmanaged switches, but it is not universal.

I have no clue what can cause a delay like that, which is partly why I even follow this massive thread.
 

Ralston18

Titan
Moderator
Thanks.

STP, Layer 2.

OP is using DD-WRT on his router.

If I have not gone too far down the proverbial rabbit hole here I am wondering if STP should be enabled or disabled....

Reference:

https://wiki.dd-wrt.com/wiki/index.php/Spanning_Tree_Protocol

That approx. 50 second learning state/per connected port is intriguing. Not sure how to "count ports".
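One plausible reading of the roughly 50 seconds is the sum of the classic IEEE 802.1D default timers (max age 20 s, forward delay 15 s applied once for listening and once for learning). The sketch below assumes those defaults; the DD-WRT build on Router B may use different values.

```python
MAX_AGE = 20        # seconds, classic 802.1D default (assumed here)
FORWARD_DELAY = 15  # seconds, classic 802.1D default (assumed here)


def seconds_to_forwarding(after_lost_bpdus: bool) -> int:
    """Time for a port to reach the forwarding state under classic STP.

    A port that just came up goes through listening and learning.
    A blocked port that stops hearing BPDUs first waits for max age.
    """
    listening = FORWARD_DELAY
    learning = FORWARD_DELAY
    wait = MAX_AGE if after_lost_bpdus else 0
    return wait + listening + learning


print(seconds_to_forwarding(after_lost_bpdus=False))  # new link
print(seconds_to_forwarding(after_lost_bpdus=True))   # reconvergence
```

Under these assumptions the delay is per topology change rather than per port, and either figure is well short of the reported 8.5 minutes (510 seconds).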

Massive indeed.

But the times/numbers must be telling us something. What that something is - that is what I want to learn.
 
Nov 2, 2019
49
1
35
It was disabled. Tried both settings - makes no difference.
 
Nov 2, 2019
49
1
35
The times only match sometimes - they range anywhere from 300 to 600 seconds. Would that mean I have a loop in my network?
P.S.: I did a couple of measurements today - here are the times it took for the wifi device to see the wired devices (measured in seconds using one ping per second - first ping to not fail counts):
417
413
437
418
397
306
515
521
572
There is no discernible pattern here.
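A quick summary of the nine measurements above makes the spread easier to see. The values are copied from the list; nothing else is assumed.

```python
import statistics

# Seconds until the first successful ping, as listed in the post.
times = [417, 413, 437, 418, 397, 306, 515, 521, 572]

summary = {
    "count": len(times),
    "min": min(times),
    "max": max(times),
    "mean": statistics.mean(times),
    "median": statistics.median(times),
    "stdev": statistics.stdev(times),  # sample standard deviation
}

for name, value in summary.items():
    print(f"{name:>6}: {value:.1f}" if isinstance(value, float) else f"{name:>6}: {value}")
```

The mean works out to 444 seconds and the median to 418 seconds, with every value falling between 306 and 572 seconds.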
 

Ralston18

Titan
Moderator
In light of the bigger picture here I am not going to chase the statistical rabbit down the hole.

Albeit somewhat tempting to be honest about it....

Delays are delays and that is the issue. Individual time variations aside.

I think a loop is likely but other than the time delays I cannot fairly demonstrate that to be the case. There may be other thoughts about loops per se - open to discussion I would expect.

It will be far more productive to get the network wiring consistent and up to standards.

Especially with respect to shielding/grounding. Your installer really needs to take a good hard look at what has (or has not) been done. Also subject to more discussion I think.

Then work through a network "do-over" literally one device at a time while checking configuration settings, IP addresses (static and DHCP), subnet masks, reserved MACs, etc..

Once something goes astray, stop and figure out what is wrong and fix.

Get wired up and running without problems then introduce wireless devices. Again, one by one using some planned, deliberate process.

Keep going until the problem reappears if it does. And do give the network time between device additions so all of the ARP tables and so forth can bring themselves up-to-date with respect to who is who and where. Or flag some error in the process.
 

rcfant89

Distinguished
Oct 6, 2011
546
3
19,015
Seems like a whole lot of extra work when I believe you guys had the right answer from the start. Wifi B is the likely issue. As other comments have pointed out, you should only have WIFI A doing the routing and Wifi B should be an access point only.

Could you pick up something else for testing purposes, like a Linksys EA6400 (or something like that) for 20 bucks off Craigslist, set it to AP mode, and give that a shot? The different SSID/two devices trying to act like routers is your issue.

My two Linksys routers are acting like APs only set to "Bridge Mode" connection type. Actually, all three are (have a nighthawk as well and that one has a check box for "Enable AP Mode") and I have a separate PC running an Untangle router (no wireless for that box).

Anyway, just wanted to say it looks like the other commenters are right, and to give you a quick test. Swap B for a cheap Linksys AP, set it to bridge mode, use the same SSID and see if that fixes your problems. I bet it will. When you say it's a consistent 8.5 minutes every time, that sounds like some kind of timeout, not a wiring issue or anything like that, so I wouldn't waste your time following that rabbit hole. Good luck.
 
Nov 2, 2019
49
1
35
I think you may be right there.
Today I tried a different WiFi setup with just one SSID; however, as soon as the wireless device moves from router B to A, it loses connection to all wired devices as usual.
The network is not that complicated and I think there must be thousands of people who use the same setup as me - just with different hardware. The fact that two routers and one switch are unable to properly exchange routing information in such a tiny LAN is rather worrying. It is basically clear what the problem is, however, I don't know how to solve it.
When a wireless device moves from router B to router A, the network devices still attempt to send information using the old route, and only once the tables with MAC addresses get flushed (either some kind of time limit elapses or I toggle the power on the switch) does normal functionality resume.
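The behaviour described in that paragraph can be illustrated with a toy model of a learning switch. Everything here is a sketch under assumptions: the 300 second aging time is a common default but not confirmed for this switch, and the MAC address and port numbers are hypothetical.

```python
from typing import Dict, Optional, Tuple

AGING_SECONDS = 300  # assumption: a common default, not confirmed for this switch


class LearningSwitch:
    """Toy MAC address table: maps a MAC to (port, time last seen)."""

    def __init__(self, aging_seconds: float = AGING_SECONDS) -> None:
        self.aging_seconds = aging_seconds
        self.table: Dict[str, Tuple[int, float]] = {}

    def learn(self, src_mac: str, port: int, now: float) -> None:
        """Record or move a MAC when a frame from it arrives on a port."""
        self.table[src_mac] = (port, now)

    def lookup(self, dst_mac: str, now: float) -> Optional[int]:
        """Return the egress port, or None (flood) if unknown or aged out."""
        entry = self.table.get(dst_mac)
        if entry is None:
            return None
        port, last_seen = entry
        if now - last_seen > self.aging_seconds:
            del self.table[dst_mac]  # stale entry expires
            return None
        return port


PHONE = "aa:bb:cc:dd:ee:ff"          # hypothetical MAC address
PORT_ROUTER_A, PORT_ROUTER_B = 1, 2  # hypothetical switch ports

switch = LearningSwitch()
switch.learn(PHONE, PORT_ROUTER_B, now=0)  # phone is on WiFi B

# Phone roams to WiFi A, but the switch sees no frame from it on port 1.
print(switch.lookup(PHONE, now=60))    # 2: frames still go toward router B
print(switch.lookup(PHONE, now=301))   # None: entry aged out, frame is flooded

# A frame from the phone arriving via router A corrects the table at once.
switch.learn(PHONE, PORT_ROUTER_A, now=302)
print(switch.lookup(PHONE, now=303))   # 1
```

In this model a stale entry only survives if no frame from the roamed device reaches the switch on the new port, so the sketch shows where such a delay could come from rather than proving that this is the cause here.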
I would prefer if I could find a solution without replacing any more hardware (already replaced the switch to no avail), but if there's no way around it I will get a new access point.
Also - when it comes to the cabling of my first floor sockets there is no way to connect the ground as the idiot who installed them snipped the wire off. Really pissed about it.
Marcel
 

Ralston18

Titan
Moderator
In the meantime, pending the wiring matters etc., I have had some thoughts about the overall network configuration per the network diagram.

Probably a good time to doublecheck that all devices are correctly configured with respect to IP address assignment (DHCP IP or desired Static IP w/ MAC Reserved), correct subnet mask, correct MAC, and correct Default Gateway.

Sole reason being just a matter of elimination and leaving as little as possible to doubt.
 
Nov 2, 2019
49
1
35
In the past two weeks I had nothing new to report. Today my new TP-LINK access point arrived, I connected it, set it up, and ... same <Mod Edit> all over again. I'm utterly fed up with this. Since replacing the switch and access point did not solve anything the main culprit, I presume, is my main VDSL router. Not sure what to do now... I now have one spare router.
 

Ralston18

Titan
Moderator
Are you familiar with the utility "Advanced IP Scanner"? Free.

I think that a scan of your network may reveal something not being as expected or as it should be.

Use the results along with your diagrams to take another look at the network structure and connections.
 
Nov 2, 2019
49
1
35
Hi,
I did the scan, it did not reveal anything I would not know about. All IP addresses are as they should be, there are no conflicts. From what we know, the IP layer is not an issue - the issue will be one layer below - remember - cycling the power on my switch resolves the issue...
The main router should be able to reroute the traffic if the wireless device changes wifi, but for some reason it doesn't.
Marcel
 

Ralston18

Titan
Moderator
IP addresses (and inclusively their respective subnet masks) are as they should be - fair enough.

What about MACs? The only reason I am pressing on that is that a typo somewhere can wreak havoc....

Main router being the Netgear D7000 VDSL router - correct? And Router B (TP-Link) is simply a DD-WRT access point that is connected to the switch per the network diagram.

And the problem temporarily ends when switch(es)' power is cycled off and on: how long until the problem shows up again?

Can the problem's recurrence be associated with the addition or removal of any given device?

And I will add another suggestion. Use that spare router (sans DD-WRT) to swap out Router B.

Just trying to think outside of the proverbial box - likely I may also fall out but so be it....
 
Nov 2, 2019
49
1
35
OK, let me answer your questions:

IP addresses (and inclusively their respective subnet masks) are as they should be - fair enough.

Q: What about MACs? The only reason I am pressing on that is that a typo somewhere can wreak havoc....
A: MAC addresses are only used on router A to assign IPs to 5 different devices - I checked, they are all OK, those devices work fine.

Q: Main router being the Netgear D7000 VDSL router - correct? And Router B (TP-Link) is simply a DD-WRT access point that is connected to the switch per the network diagram.
A: Main router is correct. Router B was the DD-WRT running TP-Link, at the moment it is a TP-Link Access Point I recently bought. Replacing router B with the new access point did not change anything, the error remains the same.

Q: And the problem temporarily ends when switch(es)' power is cycled off and on: how long until the problem shows up again?
A: the problem shows up as soon as a wireless device switches from wifi B to wifi A

Q: Can the problem's recurrence be associated with the addition or removal of any given device?
A: I tried switching devices off, but the issue involves all wifi devices and all wired devices - a wifi device that just switched from wifi B to wifi A simply cannot see any wired devices for a couple of minutes.

Q: And I will add another suggestion. Use that spare router (sans DD-WRT) to swap out Router B.
A: Router B is what I used originally. Router B is now replaced by the TP-Link AP. I cannot replace router A, as I need it to provide VDSL internet access. I made an attempt to switch router A to modem mode and set up the old Router B as my main router with DHCP, wifi and everything, but I could not get it to work. The local network was fine, DHCP worked, but I was not getting internet access (neither wired nor wireless). Router B was connected to Router A (in modem mode) so that router A used its LAN port and Router B its WAN port (also tried LAN port, but to no avail). Router B WAN port was set to automatic DHCP configuration.

That is all for now.
 

Ralston18

Titan
Moderator
Re-categorized problem to "Roaming wireless delays".

Reason (full disclosure) could not think of anything else.

Some googling along those lines brought up a number of potential causes - Switches being one of them....

I found the following Cisco page:

https://documentation.meraki.com/MR/WiFi_Basics_and_Best_Practices/Roaming_Technologies

And a déjà vu link:

https://community.netgear.com/t5/Orbi/Random-ARP-Problems-w-WiFi-nodes/td-p/1799406


Then I came across Roaming aggressiveness:

https://www.versatek.com/blog/how-t...reception-by-boosting-roaming-aggressiveness/

A wireless device being at some threshold location and bouncing back and forth with every other device trying to keep up - what happens?
 
Solution
Nov 2, 2019
49
1
35
Thanks for your effort. Having read the linked articles, it seems those people are dealing with different issues than mine, so all in all, that information will not help in my situation :(
Roaming wireless delays usually refer to a situation where you need lightning-fast handover so that, e.g., a VoIP connection remains stable when a device switches between multiple APs. We are talking about shaving off milliseconds :) I need to "shave off" a couple of minutes :)