Strange network issue!

rm69

Honorable
Jan 28, 2014
We have a Wiznet 5300 Ethernet serial-to-Modbus network device, and it has no DHCP support code in it. I can ping it over a direct cable connection from a PC, and I can ping it through my switch A. There can be multiple devices on switch A belonging to one network, say 192.168.1.x. My ARP table is fine up to this point.
If I connect switch A to another switch B that belongs to an existing large network with a lot of devices, all hell breaks loose. The Wiznet 5300 replies to the first ping and then times out. Sometimes it doesn't even reply to the first ping. If I look at the ARP table, I can see that the MAC of the Wiznet 5300 has been replaced with something else, and the gateway I entered in the Wiznet 5300 is also present with that same MAC, even though that gateway IP is not a valid IP or an existing device in the network! Changing the gateway in the Wiznet 5300 to 0.0.0.0 doesn't help.
If I replace my 5300 device with a similar third-party device that uses a Realtek network chip, with the same IP and MAC, that device works fine and pings well. If I replace the Wiznet 5300 device with a laptop, the laptop replies to the ping correctly. In other words, everything works fine except the Wiznet 5300 device. But if I remove the cable between switch A and the other network, the problem disappears!
That network has devices (PCs, printers...) with multiple network addresses, 192.168.1.x and 2.x (or even more), connected using 3 or 4 switches, to my knowledge. There is no IP clash or MAC clash when I insert the Wiznet 5300 into that network, since, as mentioned above, I can replace it with a third-party device with the same MAC/IP and that device pings fine. If I skip switch A and plug the 5300 device straight into the switch B network, the result is the same. Sometimes, if the 5300 initiates a connection to a specific PC on switch B, the 5300 can then be pinged from that machine. Broadcasts from the 5300 are also received by all PCs connected to switch B.
I have tried clearing the ARP table before pinging, especially when I see it being filled with a wrong MAC for the 5300, i.e. a MAC that does not belong to the 5300.
In short, the Wiznet 5300 doesn't ping correctly in some networks with many unknown devices! But my 5300 device works correctly with 3 or 4 devices on switch A, and can communicate in both directions with an external PC through a gateway (a router connected to switch A). What could be wrong? Can I trace this issue with Wireshark?
This device has been connected to other networks, and it works fine there. What could there be in the network mentioned above that makes this device behave this way?
 
It sounds like some form of ARP poisoning. The device could be getting a strange packet from an end-user machine and responding incorrectly. It is likely that a different machine is sending it some broadcast, and if that machine is not on the network or switch then it all works fine.

Wireshark is your best bet. If you can run a port mirror on the switch for the port this device is plugged into, you will get the best picture. Wireshark sometimes lies when you are capturing on the very device you suspect is not working properly.
 

rm69

Honorable
Jan 28, 2014
Thanks Bill. I will probe further from this angle and report back soon.

1. I did notice ARP table entries with strange IP addresses starting with 77., which the network admin says he has no clue about. It cannot be an internet address, since such addresses have no place in the ARP cache, right? This IP also seems to be active in the Wireshark logs.

2. I can also see my previous IPs building up (I changed the IP and MAC many times to check). The device had the IPs 192.168.0.207 and 1.201 before, but after I changed it to 10.10.10.10 for testing, the next day I could see the ARP cache filling up with these old entries, all with the same unknown MAC. The ARP entry for the device's gateway, which should appear as invalid since that gateway was not present in the system, was showing up with the same MAC as the device, and this MAC was different from the actual MAC of the device. I changed the MAC many times but this strange MAC kept reappearing.

3. My Wireshark ICMP replies are in hex. I can see the ARP request, 'who is 10.10.10.10', but somehow the replies and ICMP echoes are in hex (or rather, they looked like untranslated hex to me because of the tension!). Should I convert them and check what exactly is happening?

4. Why is only one device running into this problem? The device uses a hardware TCP/IP chip, not a software stack, so ICMP, ARP, etc. are managed directly by the chip hardware. Could this somehow be connected to the vulnerability? (The device works correctly in 1000-plus locations without issues. All those places had a single network only, i.e. one xxx.y range with a .0 mask.)
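
One way to make the pattern in point 2 visible is to group the IP addresses seen in the ARP cache by the MAC that claims them. The following is only a rough sketch in Python: the function names and the sample addresses are made up for illustration, and the (IP, MAC) pairs would have to be copied by hand from the ARP table or a capture.

```python
from collections import defaultdict


def ips_per_mac(observations):
    """Group observed IP addresses by the MAC address that claimed them."""
    claimed = defaultdict(set)
    for ip, mac in observations:
        claimed[mac.lower()].add(ip)
    return claimed


def suspicious_macs(observations, limit=1):
    """Return every MAC that claimed more than `limit` distinct IPs."""
    return {
        mac: sorted(ips)
        for mac, ips in ips_per_mac(observations).items()
        if len(ips) > limit
    }


# Hypothetical (IP, MAC) pairs; none of these MACs come from the real network.
observations = [
    ("10.10.10.10", "00:11:22:33:44:55"),
    ("192.168.0.207", "00:11:22:33:44:55"),
    ("10.10.10.1", "00:11:22:33:44:55"),
    ("192.168.1.20", "00:08:DC:01:02:03"),
]

result = suspicious_macs(observations)
for mac, ips in result.items():
    print(mac, "answers for", ", ".join(ips))
```

A single MAC answering for many unrelated IPs is not proof of an attack on its own; a router with proxy ARP enabled would look the same in this listing, so the MAC still has to be traced to a physical port.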


I am a network guy, and even though I have an electrical engineering degree, once you start getting this close to the hardware I am out of my area of expertise. I know the IEEE and RFC rules very well for the software side, but exactly how the hardware implements them tends to be my weak area.

If Wireshark cannot decode the packet, that is a very bad sign. Something is violating the rules in a major way. Still, the part that should be correct is the MAC address in the packet. What you should see is an ARP request that says "who has IP x.x.x.x, tell y.y.y.y". These are generally sent to the broadcast address. The response should then be a direct message back, with a proper source and destination MAC address even if the rest of the packet is damaged. With a managed switch you should be able to trace the source MAC back to the port that sent the packet. That will be the device responsible for sending the garbage. Even if the MAC is spoofed, it should still sit in the MAC table of the switch for a period of time. The other kind of ARP packets that cause major trouble, and the ones attackers use for poisoning, are ARP reply packets sent to the broadcast address without anyone asking. Many machines will just put these in their ARP table, which is what makes the exploit possible.
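
If the capture only shows raw hex, the Ethernet and ARP headers can still be read by hand, since their layout is fixed. Below is a rough Python sketch of that decoding. The helper names and the sample frame are made up for illustration, and it assumes an untagged Ethernet II frame carrying ordinary IPv4 ARP (a VLAN-tagged frame would need four more bytes skipped).

```python
import struct

BROADCAST = "ff:ff:ff:ff:ff:ff"


def fmt_mac(raw):
    return ":".join(f"{b:02x}" for b in raw)


def fmt_ip(raw):
    return ".".join(str(b) for b in raw)


def parse_arp_frame(frame):
    """Decode an untagged Ethernet II frame carrying IPv4 ARP, else None."""
    if len(frame) < 42:
        return None
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    if ethertype != 0x0806:  # 0x0806 is the EtherType for ARP
        return None
    htype, ptype, hlen, plen, oper, sha, spa, tha, tpa = struct.unpack(
        "!HHBBH6s4s6s4s", frame[14:42]
    )
    if (htype, ptype, hlen, plen) != (1, 0x0800, 6, 4):
        return None
    return {
        "eth_dst": fmt_mac(dst),
        "eth_src": fmt_mac(src),
        "op": {1: "request", 2: "reply"}.get(oper, str(oper)),
        "sender_mac": fmt_mac(sha),
        "sender_ip": fmt_ip(spa),
        "target_mac": fmt_mac(tha),
        "target_ip": fmt_ip(tpa),
    }


def arp_warnings(info):
    """List the oddities described above for one decoded ARP frame."""
    warnings = []
    if info["op"] == "reply" and info["eth_dst"] == BROADCAST:
        warnings.append("ARP reply sent to the broadcast address")
    if info["eth_src"] != info["sender_mac"]:
        warnings.append("Ethernet source differs from ARP sender MAC")
    return warnings


# Hypothetical frame: a reply for 10.10.10.10 sent to the broadcast address.
frame = bytes.fromhex(
    "ffffffffffff" "001122334455" "0806"
    "0001" "0800" "06" "04" "0002"
    "001122334455" "0a0a0a0a"
    "000000000000" "0a0a0a0a"
)
info = parse_arp_frame(frame)
print(info["eth_src"], "claims", info["sender_ip"])
print(arp_warnings(info))
```

The `eth_src` value is the one to look up in the switch MAC table to find the port the frame came in on.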

If you get really stuck and can't find the source, you could try using some of the switch security features to help locate it, if your switch supports them. Things like per-port MAC limiting prevent MAC spoofing, and things like DHCP snooping can prevent IP address spoofing.

And of course, if you just get sick of trying to find the problem, static ARP entries prevent any changes. That is what I always use to prevent man-in-the-middle attacks.
 
Solution