Problem with 3 TP-Link APs. Connection to the LAN is lost after some time.

acidrop

Apr 28, 2013
Hello all,

Recently I bought 3 TP-Link TL-WA801ND (http://www.tp-link.com/en/products/details/?model=tl-wa801nd) APs to cover a 3-floor building with Wi-Fi.

I've set up all of them with the following settings:

1. Same SSID on all of them.

2. Different channel on all of them (1,6,11)

3. DHCP Server OFF on all of them.

4. Static LAN IP on all of them (192.168.1.254, .253, .252)

There is UTP Cat6 cabling on all floors, so each AP runs in its default Access Point mode (not Repeater mode).
Each AP is plugged into a UTP outlet, and all of them are connected to a 16-port 3Com network switch.
There is also a Draytek Vigor ADSL router with IP 192.168.1.1, which acts as the internet gateway for the network.
Below is a simple diagram:

AP1 ---- UTP ---->|-------------|
AP2 ---- UTP ---->| 3com switch |<---- UTP ---->| Draytek Router |----> <Internet>
AP3 ---- UTP ---->|-------------|

Now, I have several clients (iPhones, laptops, iPads) which can successfully connect on each floor (AP1, AP2, AP3), obtain an IP address from the router (Vigor), and access the internet normally.
Each client can also move between the APs on each floor transparently, without manual intervention.

The problem:

I have the following strange situation. This setup works fine for some hours or days, and then suddenly, without any obvious reason, all clients, even desktop PCs connected by cable, can access neither the internet nor the local LAN (router, server, etc.).
When this happens, the LEDs on the network switch blink like crazy, and so do those on the router and all the APs.
I have replaced the switch, and the same thing happens with the other switch too. I have come to the conclusion that something is going wrong with the APs, but I do not know what.

Has anybody had the same experience with a similar setup?

Thank you in advance
 
The flashing lights indicate a lot of traffic, and since it is widespread, it is likely broadcast traffic. That by itself will cause major network issues, but it could also be a symptom of a different failure. For example, if the router stops responding, its MAC address will age out of the switch's table before the ARP entries on the clients time out. All traffic destined for that MAC will then be flooded out of every port. This is just one example; the cause can be anything from a software issue on an end device to a problem with the router.
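
To make the aging example concrete, here is a minimal Python sketch of a learning switch. It is only a toy model, not how any particular 3Com or Draytek device is implemented; the class name, the port numbers, and the 300-second aging time are assumptions chosen for illustration.

```python
class LearningSwitch:
    """Toy model of a learning switch with an aging MAC table (illustrative only)."""

    def __init__(self, ports, mac_age_seconds=300):
        self.ports = list(ports)
        self.mac_age_seconds = mac_age_seconds  # assumed aging time
        self.table = {}  # mac -> (port, time the mac was last seen as a source)

    def forward(self, src_mac, dst_mac, in_port, now):
        """Return the list of ports the frame is sent out of."""
        # Learn (or refresh) the source address on the ingress port.
        self.table[src_mac] = (in_port, now)
        entry = self.table.get(dst_mac)
        if entry is not None and now - entry[1] <= self.mac_age_seconds:
            return [entry[0]]  # known destination: one port only
        # Unknown or aged-out destination: flood to every other port.
        self.table.pop(dst_mac, None)
        return [p for p in self.ports if p != in_port]


sw = LearningSwitch(ports=[1, 2, 3, 4])
sw.forward("router", "pc", in_port=1, now=0)             # router is learned on port 1
print(sw.forward("pc", "router", in_port=2, now=10))     # [1]
# The router has been silent for more than 300 s, so its entry has aged out:
print(sw.forward("pc", "router", in_port=2, now=400))    # [1, 3, 4]
```

The last call shows the failure mode: the PC still believes it knows the router's MAC, but the switch no longer does, so every frame to the router goes out of all ports.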

Your best bet is a packet capture. Wireshark is a free tool for this. Start on one of the wired PCs and see if it is receiving lots of traffic. It may give you a clue as to what kind of broadcast is being sent. If you do not see anything, then it gets tough. You really need to look at the connection between the router and the rest of the LAN. If your 3Com switch supports it, you can have it copy all the data from the router port to another port and monitor that with Wireshark. Many commercial switches have this ability (often called port mirroring). Unfortunately, unmanaged switches do not.
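
Once you have a capture, one rough way to see whether broadcast traffic dominates is to tally the destination MAC addresses. The sketch below assumes you have already exported the destination MACs from the capture into a plain Python list; the function names are made up for this example and are not part of Wireshark.

```python
from collections import Counter


def classify_destination(mac):
    """Classify an Ethernet destination MAC as broadcast, multicast or unicast."""
    octets = mac.lower().replace("-", ":").split(":")
    if len(octets) != 6:
        raise ValueError(f"not a MAC address: {mac!r}")
    if all(o == "ff" for o in octets):
        return "broadcast"
    # The lowest bit of the first octet is the group (multicast) bit.
    if int(octets[0], 16) & 0x01:
        return "multicast"
    return "unicast"


def traffic_breakdown(dst_macs):
    """Return the share of each traffic class among the given destination MACs."""
    counts = Counter(classify_destination(m) for m in dst_macs)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kind: counts[kind] / total
            for kind in ("broadcast", "multicast", "unicast")}


# Hypothetical sample: two broadcasts, one multicast, one unicast frame.
sample = ["ff:ff:ff:ff:ff:ff", "FF-FF-FF-FF-FF-FF",
          "01:00:5e:00:00:fb", "00:1a:2b:3c:4d:5e"]
print(traffic_breakdown(sample))
```

On a healthy small LAN the broadcast share seen by one PC is normally small, so a capture that is almost entirely broadcast during the outage would support the broadcast-storm theory.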

If you can get into the router, or it has a logging feature, maybe you can get a clue as to what it thinks the traffic issue is.
 
I also believe it is broadcast traffic that produces the problem.
I have replaced both the router and the switch, so I don't believe the problem lies with them.
The network already existed, and the problem started when I installed the 3 APs, so I believe something is going wrong with them.
As I said, I have configured them with the same SSID on different channels.
So when somebody with an iPhone, for example, connects to AP1, it receives its settings from the router correctly and has internet access.
Then, as he moves around, he connects to AP2, AP3, and so on. This is referred to as roaming. The client disconnects from AP1 when it gets a stronger signal from AP2.
Is it possible that when a client moves from AP to AP, its ARP entry is not being "released" by the APs, and that this causes the problem?


 
The AP only holds MAC addresses, and it really won't matter if an AP keeps an old MAC; at worst, some other wireless user on that AP would not be able to talk to it. The switch behind it should learn the new MAC on the port going to the new AP as soon as the very first packet arrives. "Roaming" with consumer-grade routers is so clunky that it almost doesn't happen. With the default settings, the client will not change APs until the signal gets extremely bad, even though it knows there is a better one. Then it must redo all the authentication and key exchange, which takes many seconds. So it's not like it can swap back and forth quickly enough to degrade your network.

Now, this "feature" of assigning a MAC to the port on which a packet was received is also what causes problems if it happens too often. If you had the same MAC on 2 different devices and both were sending traffic, the switch would spend all its time updating its MAC-to-port mapping table. This is also what happens if you plug 2 ports together, since the packets themselves actually loop and the switch gets confused. Generally, a duplicate MAC address causes few issues unless both devices are sending lots of traffic. I have only seen this on a dual-NIC machine that the user thought they had put on 2 different VLANs.
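
If you ever get a capture or log that shows which switch port each source MAC arrived on, the table churn described above can be spotted by counting how often a MAC changes port. This is a small sketch under that assumption; the `(mac, port)` input format and the function name are hypothetical, and an unmanaged switch will not give you this data directly.

```python
from collections import defaultdict


def count_mac_moves(observations):
    """Count how many times each source MAC is seen on a different port than before.

    observations: iterable of (mac, port) pairs in arrival order.
    Returns a dict with only the MACs that moved at least once.
    """
    last_port = {}
    moves = defaultdict(int)
    for mac, port in observations:
        if mac in last_port and last_port[mac] != port:
            moves[mac] += 1
        last_port[mac] = port
    return dict(moves)


# Hypothetical log: "aa" bounces between ports 1 and 2, "bb" stays on port 3.
log = [("aa", 1), ("aa", 2), ("aa", 1), ("bb", 3), ("bb", 3)]
print(count_mac_moves(log))  # {'aa': 2}
```

A roaming phone would show a handful of moves over a day. A duplicate MAC or a cabling loop would show the same address moving many times per second.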

If you can catch it when it is happening, you could unplug cables one at a time and see if it stops. You would have to get lucky.

Then again, it could be an issue with a single user machine. Old viruses like Code Red used to completely compromise networks. Or you could have a user running things they should not. A machine that floods out MAC addresses would overflow the MAC forwarding table in the switch, which then sends all traffic out of every port, as if it were broadcast.

It is very, very hard to find these problems unless you can see the traffic. This is part of the reason why commercial installations run managed switches: they have many tools that let you look at the status and, in some cases, automatically defend against common problems.

 
Thank you for sharing your thoughts.
I agree, it is difficult to find a solution to this without some kind of monitoring.
I don't believe this is a virus issue, since it started immediately after installing the APs and occurs at random times.