[SOLVED] Device leaving wireless range takes down entire network

Apr 20, 2022
4
0
10
This is maybe the strangest thing I've ever seen on a network and I have no idea what could be causing it.

Modem - Netgear Nighthawk CM1200-100NAS
Router - Netgear Nighthawk R7000
Impacted devices - iPhone 13 Pro Max (15.4.1), iPhone 13 Pro (15.4.1), iPad Pro 2021 (15.4.1), 2015 MacBook Air, 2018 MacBook Pro, eufy Security Solo IndoorCam C24 (x2), HomePod mini (x4), Nintendo Switch, PlayStation 4, Apple TV gen 3, Apple TV 4K, iRobot Roomba S9+, PetSafe Smartfeed, Suddenlink Stream, Canon MX470 Printer
Provider: Suddenlink, gigabit internet
Max connected devices: 20 (see above)

Any time a device is pulled off my wifi network, the entire network seemingly goes down. For example, my wife works at an office and I work from home. When she leaves the house and gets out of wifi range, every single device connected to my wifi network loses its internet connection. Devices that can scan for wifi networks do not even show my network in the list of available networks. If my phone dies, the exact same thing happens: every single device stops being able to access the network. This lasts anywhere from 30 seconds to roughly 2 minutes. However, if I shut a device off, do a hard reset, reset its network settings, or simply turn off its wifi radio, the network is unaffected. As best I can tell, the two computers with wired connections are not impacted, even though they run over ethernet from the same router.

I live in a condo building with multiple other wifi networks on various channels, if that makes any difference.

Does anyone have any idea what could be causing this?
 
That is indeed very strange.

Do you know if the wifi network is actually going away, or are the devices still connected but just unable to pass traffic?

Maybe ipconfig /all (on a Windows machine) will show whether it is in a disconnected state. You say other devices cannot even see the SSID when this happens. Do you lose both the 2.4 GHz and 5 GHz radios? I assume you have given them different names.

If you are losing the radios then I would suspect a bug of some kind in the router, but it would be a very strange bug because it would almost have to be in the firmware that the router loads into the wifi chips. The wifi function is fully contained in those chips, which may also be why it does not affect the ethernet.

If you can see the wifi but it just does not pass data, then you might suspect an IP address conflict, but that should also affect your ethernet-connected machines.
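
A rough way to tell those cases apart from a wifi-only laptop is sketched below in Python. It is only an illustration, not a definitive tool: the router address 192.168.1.1 and the admin page on port 80 are assumptions (check your own settings), and the function names are made up for this example.

```python
import socket
from typing import Optional

ROUTER_IP = "192.168.1.1"  # assumption: common Netgear default LAN address
OUTSIDE_IP = "1.1.1.1"     # public address already mentioned in this thread


def local_ip_towards(host: str) -> Optional[str]:
    """Return the local address the OS would use to reach host, or None if there is no route."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.connect((host, 9))  # a UDP connect sends nothing; it only selects a route
        return s.getsockname()[0]
    except OSError:
        return None
    finally:
        s.close()


def tcp_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def classify(has_route: bool, router_ok: bool, internet_ok: bool) -> str:
    """Map three observations onto the cases discussed above."""
    if not has_route:
        return "wifi association lost"
    if not router_ok:
        return "associated but router not answering"
    if not internet_ok:
        return "router answers but no internet"
    return "ok"


def probe() -> str:
    """Take one measurement; assumes the router admin page listens on port 80."""
    return classify(
        local_ip_towards(ROUTER_IP) is not None,
        tcp_reachable(ROUTER_IP, 80),
        tcp_reachable(OUTSIDE_IP, 443),
    )
```

Calling probe() every few seconds while someone walks a phone out of range should show which of the three failure cases the laptop actually sees.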
 
At first, on an iPhone for example, where I can see the connection status, it appears that the connection stays up but traffic can't get through. When it takes longer to resolve, the phone eventually drops the network, shows as disconnected, and falls back to cellular. It seems similar to unplugging a router: devices don't immediately show that the network is gone, it takes a few seconds, but they do eventually show as not connected. If I cycle the wifi off and on on one of the devices, the network doesn't show up as available until it is usable again.
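
To put numbers on how long each dropout lasts, something like the following Python sketch could turn a log of periodic reachability checks into outage windows. The sample format (timestamp in seconds, reachable flag) and the function name are assumptions made for this example.

```python
from typing import Iterable, List, Tuple


def outage_windows(samples: Iterable[Tuple[float, bool]]) -> List[Tuple[float, float]]:
    """Collapse time-ordered (timestamp, reachable) samples into (start, end) outage spans.

    An outage starts at the first failed sample and ends at the next successful one.
    An outage still open at the end of the log is closed at the last timestamp seen.
    """
    windows: List[Tuple[float, float]] = []
    start = None
    last_t = None
    for t, ok in samples:
        if not ok and start is None:
            start = t                      # first failed check of a new outage
        elif ok and start is not None:
            windows.append((start, t))     # first good check closes the outage
            start = None
        last_t = t
    if start is not None and last_t is not None:
        windows.append((start, last_t))
    return windows


def durations(windows: List[Tuple[float, float]]) -> List[float]:
    """Length of each outage in seconds."""
    return [end - begin for begin, end in windows]
```

Lining the outage start times up against when a phone left the house would confirm whether every dropout really follows a device going out of range.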

I lose both radios when doing this. I typically have the network set up for Smart Connect just for ease of use, but I just separated the bands and still the issue persists on both networks.

It sounds like it would likely be the firmware issue like you're saying. I have done a full reset of the router and set it up from scratch again two different times and that didn't fix it either.

I should also note that I have a Linux server running Pi-hole in a Docker container, with an IP address reservation for it. This is one of the wired connections that don't appear to have an issue. The router is set up to use the reserved IP address as the DNS server. I thought this could have been the issue, so I have changed both to the ISP-supplied DNS, as well as to 1.1.1.1/1.0.0.1, made sure devices updated to the new DNS, and tried this with the server both online and offline in case some weird interference was the cause. The issue still persists.
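
For anyone wanting to separate a DNS failure from a plain connectivity failure during a dropout, a minimal Python sketch follows. The test hostname and the function names are placeholders chosen for this example, and OS-level DNS caching can make a name appear to resolve even when the DNS server is unreachable.

```python
import socket


def can_resolve(name: str) -> bool:
    """True if the system resolver returns at least one address for name."""
    try:
        return len(socket.getaddrinfo(name, 443)) > 0
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def can_connect_by_ip(ip: str, port: int = 443, timeout: float = 2.0) -> bool:
    """True if a TCP connection to a literal IP succeeds, so DNS is not involved."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def verdict(resolves: bool, connects_by_ip: bool) -> str:
    """Interpret the two checks."""
    if not connects_by_ip:
        return "connectivity problem, not DNS"
    if not resolves:
        return "DNS problem"
    return "ok"
```

During an outage, verdict(can_resolve("example.com"), can_connect_by_ip("1.1.1.1")) returning "connectivity problem, not DNS" would match what was observed here: changing DNS servers made no difference.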
 
It would not be DNS; it would be a much more unusual issue. Pretty much each device only talks to the router. They cannot see each other's traffic unless they have a session between each other. If one IP address were to disappear it "should" make no difference because the devices are not communicating with each other. An unlikely condition that can cause this is one of the end devices taking over the IP address used by the router. All the traffic would then pass through that device, and when it disappears the other machines would lose access. This almost has to be done intentionally; it is a fairly well-known ARP spoofing attack used to capture traffic.

BUT because you can't even see the network names to attempt to connect, this sounds more like a wifi problem. I could see it if only 1 radio failed, but having both fail is strange since in most devices they are 2 separate chips.

Do you have an old router lying around? Before you replace your current router, try disabling the radios on it and then using the old router as an AP to take over the wifi function. If the problem does not go away then things become even more mysterious.
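
The ARP takeover scenario mentioned above can be checked for, roughly, by looking at a client's ARP table and seeing whether any other IP address shares the router's MAC address. The Python sketch below assumes the macOS/Linux style of "arp -a" output; the addresses in the usage note are invented for illustration.

```python
import re
from typing import Dict, List

# Matches "(192.168.1.1) at aa:bb:cc:0:11:22"; skips "(incomplete)" entries.
ARP_LINE = re.compile(
    r"\((\d{1,3}(?:\.\d{1,3}){3})\)\s+at\s+([0-9A-Fa-f]{1,2}(?::[0-9A-Fa-f]{1,2}){5})"
)


def normalize_mac(mac: str) -> str:
    """Lower-case and zero-pad each octet (macOS prints '0' instead of '00')."""
    return ":".join(part.zfill(2) for part in mac.lower().split(":"))


def parse_arp(text: str) -> Dict[str, str]:
    """Map each IP address in the ARP output to its normalized MAC address."""
    return {ip: normalize_mac(mac) for ip, mac in ARP_LINE.findall(text)}


def ips_sharing_mac_with(text: str, router_ip: str) -> List[str]:
    """Return other IPs whose MAC equals the router's MAC (empty list if none)."""
    table = parse_arp(text)
    router_mac = table.get(router_ip)
    if router_mac is None:
        return []
    return sorted(ip for ip, mac in table.items() if mac == router_mac and ip != router_ip)
```

For example, if the table lists both 192.168.1.1 and 192.168.1.23 at the same MAC, ips_sharing_mac_with(output, "192.168.1.1") returns ["192.168.1.23"], which would be worth investigating. An empty list does not prove anything by itself, and Windows prints its ARP table in a different format that this pattern will not match.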
 
Solution

eldridgep2

Commendable
Dec 24, 2020
31
4
1,545
That really makes no sense and sounds like a bug in the hardware/firmware, to be honest. One device leaving a wi-fi network really should not affect all the other devices on that network.

The issue seems to be just with the wi-fi, not the broadband, as your ethernet devices are OK. This also rules out your DNS server. If the firmware is up to date and you still see the issue, I'd suggest looking for any "advanced" or "experimental" options in your router settings that can be turned off. There are protocols like 802.11r that let devices switch quickly between available access points. It makes no sense that they would cause what you are seeing, but you could try switching anything like that off.

As has already been said, though, what you are seeing makes no sense, so it is most likely a bug, and firmware would be the place to start looking.
 
Thanks to the both of you. I'll keep messing around with the router and see if I can reload the firmware or find some settings that work.

I don't have another router to set up as an AP behind this one, but I did try a different router entirely a little while ago, when I was having other issues that were resolved by my ISP configuring my account correctly. Once they fixed that, I figured this problem would go away too, since it started right around when the new service came in, so I sent the other router back. But it does appear that a different router resolves the problem, at least when it replaces the problem router entirely.

I guess if I can't find a setting for this, I might have to move up the timeline for my switch to a Ubiquiti or other "small business capable" system. Not ideal, as I was hoping to do this all when I moved in a year or so, but I really can't keep having my indoor cameras going offline when I leave the house.
 
I have an R7000 and can tell you that all the recent factory firmware has drawn many complaints about the wifi crapping out (well, very fast speeds but many drops and other oddities). Stability and performance have also both been trending worse with each update while features are being removed, like iTunesServer, Kwilt Photosharing, Time Machine Backup, TiVo support for media sharing, DLNA, and the Downloader feature. I think this is intentional to get you to upgrade to a new router.

For stock-type firmware focused mainly on performance, Merlin from ASUS has been ported over by XWRT-Vortex, but they're based in Russia... The other minus is that it modifies the CFE (to one from the RT-AC68U), which can make going back to stock difficult. But you get full NAT hardware acceleration, enhanced NTFS performance, and the ASUS-exclusive features such as AiCloud or the TrendMicro-powered AiProtection.

I think FreshTomato-ARM is the most stable, and it features Cake AQM (which was actually developed/backported on an R7000!), so it's what I prefer. The user interface is somewhere between old Linksys and new ASUS (both of which used factory firmware based on Tomato) so it should feel familiar. The thing is, just about everything is done by the CPU in software so it cannot keep up with routing gigabit speed ISP and will limit performance. But its wifi is famous for handling 250+ clients simultaneously on weak hardware (who needs business class?), so it is what I'd use as an AP.

For the bleeding edge, the R7000 is one of the routers best supported by DD-WRT, so just about every firmware release won't brick it, and you will generally have a far less buggy experience than with less popular routers supported by the perpetually beta DD-WRT. For a gateway you would use its fq_codel + HFSC, which works great. DD-WRT focuses on having the most features, so it is best if you need one of those.

OpenWrt is out of the question, as only 2.4GHz wifi "works" on the R7000, and only at 54Mbit G speeds, because of the lack of reverse-engineered open-source Broadcom drivers for anything faster. Too bad, as the focus here is on security. It has Cake too, so it is what I'd use as a gateway, but only on Atheros hardware (like the R7800).

The latest factory firmwares prevent installing 3rd-party firmware from the GUI, so you may need to either roll back to older factory firmware first or just use the special factory-to-DD-WRT trailed builds (which identify as older factory firmware). You can go straight from DD-WRT to Tomato by simply renaming the .TRX extension on the Tomato firmware file to .BIN before flashing, but be sure to erase NVRAM inside DD-WRT before flashing Tomato, as the DD-WRT password is encrypted.
 
Thanks for all of this. I messed with it yesterday. I rolled back to an old version that was suggested as the most stable, did some speed tests, and got much higher speeds on wifi than I had been getting: 600/45. I then installed FreshTomato. I had a few issues getting it up, but it’s working now.

It’s been about 20 hours and I haven’t seen any issues yet, but I don’t always see them, and I was at the gym when my wife left for work this morning. I’m getting about 300/30 throughout my home. Ideally I’d be getting higher, but aside from very rare instances, the only devices that actually need very high speeds are the two hardwired devices, and those seem to be getting pretty close to 1 Gbps/60. So if I can just stop everything from going offline, I’ll take this as a reasonable stopgap until I’m ready to dive in head first and get a full-on business-class network up and running.

If I don’t respond again, it’s a safe assumption that this fixed the issue.
 
The thing is, just about everything is done by the CPU in software so it cannot keep up with routing gigabit speed ISP and will limit performance.

This is specifically why I built an x86 router using old computer parts. With an Athlon 3000G processor, it handles gigabit-speed traffic shaping using Cake without issue. The network is rock solid with OpenWrt, more so than with any off-the-shelf router I've used before, including the R7000 and ASUS AC86U. I eventually moved it into a 4U rack case alongside my network switch.