[SOLVED] Unexplained packet loss on cable modem hop

Sep 29, 2020
Been scratching my head over this one for a few days, and the results are inconsistent and frustrating. The ISP visited on Monday and fixed some bad cable splitters, but it did not seem to resolve the issue. I also replaced the modem as a last resort, but this did not seem to help.
Every 20-40 minutes, I experience a brief network interruption characterized by significant packet loss. This resolves very quickly, but is enough to interrupt high-fidelity activities like Zoom calls and games. The behavior is generally consistent, although it did stop for about 3 hours on Sunday evening, which is suspicious.

Two (censored) Pingplotter snapshots below. The first three hops are internal, rest is external:

Full display: https://imgur.com/a/5bItGA0


Zoomed in: https://imgur.com/a/SJ6y3wA


I'm having a hard time interpreting those results. The ISP reports no detected packet loss when examining the setup, but it does look like a good chunk of it is coming from the hop to the modem. The EdgerouterX is ignoring ICMP, but that doesn't seem to have any downstream impact.
Signal-to-noise and power on the modem all seem within reasonable parameters.


Network setup:
  • Arris DG3450 Modem (usually runs in bridge mode, temporarily re-enabled router mode to appease ISP when they were running tests)
  • Ubiquiti EdgerouterX (main router used)
  • Ubiquiti Wifi AP
  • One main PC wired to the EdgerouterX (this is where the tests are being run from)
  • 5 or so WiFi appliances (2 laptops, apple TV, Nest, etc)

Things I've tried:
  • Rebooting everything
  • Swapping the ISP's modem
  • Swapping every ethernet cable involved for new ones
  • Swapping the coax cable going to the modem
  • Running pings from a laptop instead of the wired PC (same results)
  • Kicking virtually everything off the network except the monitoring laptop and the PC
  • Tuning NIC settings for performance
  • Disabling EEE on every physical interface just in case
 
Solution
I am glad you can read the ping plot properly, unlike many people who don't understand odd results such as your router showing high but fake packet loss.

Your modem should have both a log and a status page that shows the power levels. See if there are messages in the log that correspond to the times where the ping plot shows large loss. You should also check your signal levels to be sure they are OK. Often it is upload power levels that are too high that cause problems. That page should also show the number of correctable and uncorrectable errors.
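One way to do that comparison is sketched below in Python. It assumes you have already copied the loss times out of the ping plot and the event times out of the modem log by hand. The function name, the one-minute window, and the sample times are placeholders for illustration, not anything the modem or PingPlotter provides.

```python
from datetime import datetime, timedelta
from typing import List


def loss_matches_log(
    loss_times: List[datetime],
    log_times: List[datetime],
    window: timedelta = timedelta(minutes=1),  # assumed tolerance, adjust to taste
) -> List[bool]:
    """For each loss time, report whether any modem log entry falls within `window` of it."""
    return [
        any(abs(loss - event) <= window for event in log_times)
        for loss in loss_times
    ]


if __name__ == "__main__":
    # Made-up example times, only to show the call.
    losses = [datetime(2020, 9, 29, 20, 0, 5), datetime(2020, 9, 29, 20, 25, 0)]
    log = [datetime(2020, 9, 29, 20, 0, 30)]
    for when, matched in zip(losses, loss_matches_log(losses, log)):
        print(when.isoformat(), "log entry nearby" if matched else "nothing in log")
```

If none of the loss windows line up with a log entry, that suggests the modem is not noticing a line event when the loss happens, which points away from a signal problem.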

Your ISP should have found these issues when they were out to test. The equipment they use to test the signal levels is actually more accurate than the modem, and you have already replaced the modem.

The only very strange thing I see in your ping plot is that hop 2, which should represent the connection between your house and the ISP, has a very low latency. That would normally be the case only if you had 2 routers in your house. When you place your router/modem into bridge mode it should be transparent to any trace, since all it is doing is converting the Ethernet packets to DOCSIS and putting them on the coax. It has no real concept of IP addresses, so it does not respond to ping or trace.

If the loss is between the 2 routers in your house, I can't really say how that could happen. A bad Ethernet cable could do it, but that is very unlikely. Maybe it is a software problem with one of the routers. Then again, maybe the modem is not fully in bridge mode.

Your latency is very good in general, with low jitter, so it is not a load or capacity type of issue.
 
Sep 29, 2020
Thanks!

I've checked the power levels and SNR; both seem within spec for this modem. There are no correctable or uncorrectable errors, and the logs don't have anything during the windows in which these issues occur.

The ping to hop 2 was higher due to the fact that I had disabled bridge mode. The ISP was pretty insistent on everything being to their specs when they were testing, but I've since reset that.

Interestingly, the issue disappeared for about 3 hours today, but returned at about 10pm. I'm honestly completely at a loss here, and the ISP techs are useless as usual.
 
Generally, what you should see if you have your router and then only a modem is no loss or delay to hop 1. In your case the router is being strange. Then you should see at least 8-10 ms on hop 2. This is the time it takes to get from your house to the ISP over the cable, and it is where you should see packet loss if the problem is on your line.
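The rule of thumb for reading per-hop loss can be written down as a small Python sketch: loss at a hop only counts if it carries through every later hop to the final destination, otherwise it is most likely that hop deprioritizing or ignoring ICMP. The function name, the 1% threshold, and the sample numbers are assumptions for illustration, not values taken from the plots in this thread.

```python
from typing import List, Optional


def first_persistent_loss_hop(
    loss_pct: List[float],
    threshold: float = 1.0,  # assumed cutoff for "real" loss, in percent
) -> Optional[int]:
    """Return the 1-based hop where loss begins and continues to the last hop.

    Returns None when the final hop shows no loss, because loss that does
    not reach the destination is not affecting real traffic.
    """
    if not loss_pct or loss_pct[-1] < threshold:
        return None
    hop = len(loss_pct)
    # Walk back from the destination while the previous hop also shows loss.
    while hop > 1 and loss_pct[hop - 2] >= threshold:
        hop -= 1
    return hop


if __name__ == "__main__":
    # Hop 1 ignores ICMP (100% "loss") but nothing downstream is affected.
    print(first_persistent_loss_hop([100.0, 0.0, 0.0, 0.0]))
    # Loss starts at hop 3 and persists to the destination.
    print(first_persistent_loss_hop([100.0, 0.0, 5.0, 5.0, 6.0]))
```

This is only a first pass. A hop that never answers at all will look like loss in the chain, so it is still worth eyeballing the plot.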

Since they can't see the problem, likely because it is intermittent, they will blame your router. I would try running the ISP router in modem-only mode and plugging your PC directly into the modem. You should get a public IP directly on your PC. Your testing should then show all the problems at hop 1.
 
Sep 29, 2020
Unfortunately, if I switch the ISP modem to bridge mode instead of router mode, my PC does not get assigned an IP. My router does, but I'm not sure what the difference is.
 
Sep 29, 2020
Thanks for all the help!

Bumping this thread because I found the answer, and hope I can help someone in a similar boat solve this problem if they're struggling.

I noticed that it was possible to force this issue to stop by engaging in very specific activities that had specific traffic patterns. High bandwidth, but low frequency, activities would result in far fewer issues. If I examined the graph closely, I'd still see a small spike every 20 minutes, but no packet loss.
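For anyone who wants to check for the same kind of regular spacing without squinting at a graph, here is a rough Python sketch. It assumes you have a list of (timestamp in seconds, lost or not) ping samples exported from whatever tool you use. The function names and the 30-second grouping gap are my own choices for illustration.

```python
from typing import List, Tuple


def burst_starts(samples: List[Tuple[float, bool]], max_gap: float = 30.0) -> List[float]:
    """Group lost pings into bursts and return the start time of each burst.

    Lost pings closer together than `max_gap` seconds count as one burst.
    Samples are expected in time order.
    """
    starts: List[float] = []
    last_loss = None
    for ts, lost in samples:
        if not lost:
            continue
        if last_loss is None or ts - last_loss > max_gap:
            starts.append(ts)
        last_loss = ts
    return starts


def intervals_minutes(starts: List[float]) -> List[float]:
    """Minutes between consecutive burst starts."""
    return [(b - a) / 60.0 for a, b in zip(starts, starts[1:])]


if __name__ == "__main__":
    # Synthetic data: one ping per second, with short loss bursts 20 minutes apart.
    lost_at = {1200, 1201, 1202, 2400, 2401, 3600}
    samples = [(float(t), t in lost_at) for t in range(4000)]
    print(intervals_minutes(burst_starts(samples)))
```

Intervals that cluster around the same value suggest something periodic inside the equipment rather than random line noise.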

This led me to reading about the Intel Puma 6 issue. There's a lot of FUD out there, probably lawyers trying to get people to join class actions, but there's a significant amount of truth to this problem as well.

It turns out the games I was playing generated enough traffic to overload the modem I was using. Eventually it would dump the buffer it was using to patch this issue, and when that buffer dumped I'd lose any pending packets.

I ran this test against my modem, and it confirmed my suspicions: http://www.dslreports.com/tools/puma6

After that, replacing the modem with a Netgear D1000 fixed everything.