So for the last three weeks or thereabouts I’ve been getting weird networking issues on my Clevo P650RS laptop while plugged in to Ethernet - occasional dropped packets and whatnot. Specifically, there was a certain peculiar series of symptoms I was observing while making video calls over Ethernet from this computer: Every now and then downlink would fall to a trickle, while upload would persist for a few seconds before dropping off again.
For weeks I’d dismissed this as Comcast (the provider on the other end) being Comcast. On the other hand, people on the other end reported not having connection issues with other tasks like video streaming.
Earlier this weekend I was working on something completely unrelated - hosting a server application generating heavy UDP traffic, using a VM running on my laptop - and I noticed something strange. While communicating with this server from my desktop computer, I noticed that the connection would cut out occasionally, sometimes for a few seconds, sometimes longer.
This seems oddly familiar, I thought.
I quickly ruled out excessive load locking up the VM by keeping an eye on it
in virt-manager on the laptop during these outage events, and verified over
ssh that the problem wasn’t specific to the application but affected the
connection between VM and desktop. Watching ping from the laptop ruled out
the VM itself and pointed at the laptop. Swapping Ethernet ports and cables
between laptop and desktop ruled out the upstream switch.
At this point, signs were pointing to the laptop itself.
While I was already fully prepared to test again at a different physical
location, the whole situation was getting pretty unacceptable at this point
(ping from the laptop regularly reported >10% packet loss), so I dug deeper.
The Clevo P650RS uses some variant of the Realtek RTL8111/8168 Ethernet adapter
built in to the system mainboard. (The device reports it is revision 12.) In
the distant past, Linux support for this adapter has been spotty: the
driver has been implicated in adapter problems for years, and the standard
advice is to pull the
r8168 driver from Realtek and install that. (It’s
possible that this advice has also been outdated for years.)
So I dutifully installed the r8168 driver from universe and gave that a try.
Imagine my disappointment when this yielded no results whatsoever.
I combed the Internet for more evidence.
Most of what I found was stuff I’d already attempted. One post suggested that
use_dac=1 is the magic ticket (it’s not). Another post suggested that
connection renegotiation is to blame (it wasn’t).
Nearly out of ideas, I decided to gather more evidence. Since at this point
packet loss events were reliably happening once every few minutes, I set up
ping and opened up
wireshark, my Ethernet monitor of choice. Then, I
waited for a packet loss event.
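For anyone who would rather not stare at a terminal, the waiting can be partly automated. Below is a minimal sketch, assuming Linux iputils-style ping reply lines that contain `icmp_seq=N`; the function name `find_loss_bursts` and the sample address are made up for illustration, and this is not what I actually ran at the time.

```python
import re

# Matches the sequence number in an iputils-style reply line, e.g.
# "64 bytes from 192.0.2.1: icmp_seq=7 ttl=64 time=0.41 ms"
SEQ_RE = re.compile(r"icmp_seq=(\d+)")


def find_loss_bursts(ping_lines):
    """Return (loss_percent, bursts) computed from ping reply lines.

    bursts is a list of (first_missing_seq, last_missing_seq) tuples.
    Assumes sequence numbers start at 1 and never wrap. Replies lost
    after the last one seen cannot be detected from the replies alone.
    """
    seen = set()
    for line in ping_lines:
        match = SEQ_RE.search(line)
        if match:
            seen.add(int(match.group(1)))
    if not seen:
        return 100.0, []

    bursts = []
    expected = 1
    for seq in sorted(seen):
        if seq > expected:
            # Everything from expected up to seq - 1 never came back.
            bursts.append((expected, seq - 1))
        expected = seq + 1

    sent = max(seen)
    loss_percent = 100.0 * (sent - len(seen)) / sent
    return loss_percent, bursts


if __name__ == "__main__":
    # Hypothetical capture: replies 4 through 6 went missing.
    sample = [
        f"64 bytes from 192.0.2.1: icmp_seq={n} ttl=64 time=0.4 ms"
        for n in (1, 2, 3, 7, 8, 9, 10)
    ]
    loss, bursts = find_loss_bursts(sample)
    print(f"{loss:.0f}% loss, missing ranges: {bursts}")
```

Feeding it saved ping output gives the missing sequence ranges, which can then be lined up against timestamps in a packet capture.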
Then I spotted something that made everything turn into a red herring - a curious annotation on otherwise-normal ARP traffic sent from my laptop.
Duplicate IP address detected for [address A] ([redacted]) - also in use by [redacted]
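The idea behind a warning like this is simple enough to sketch: if ARP traffic shows the same IP address being claimed by more than one hardware address, something is wrong. The snippet below is only an illustration of that idea, not Wireshark's actual implementation; the function name `find_duplicate_ips` and all addresses are made up.

```python
from collections import defaultdict


def find_duplicate_ips(arp_observations):
    """Map each IP claimed by more than one MAC to a sorted list of MACs.

    arp_observations: iterable of (sender_ip, sender_mac) pairs, for
    example taken from the sender fields of captured ARP packets.
    """
    macs_by_ip = defaultdict(set)
    for ip, mac in arp_observations:
        # Normalize case so "AA:BB" and "aa:bb" count as the same MAC.
        macs_by_ip[ip].add(mac.lower())
    return {
        ip: sorted(macs)
        for ip, macs in macs_by_ip.items()
        if len(macs) > 1
    }


if __name__ == "__main__":
    # Hypothetical observations: two machines both claim 192.0.2.10.
    observed = [
        ("192.0.2.10", "02:00:00:00:00:01"),
        ("192.0.2.11", "02:00:00:00:00:02"),
        ("192.0.2.10", "02:00:00:00:00:03"),
    ]
    for ip, macs in find_duplicate_ips(observed).items():
        print(f"{ip} is claimed by {', '.join(macs)}")
```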
One of the earliest networking troubleshooting tricks I learned as a child is refreshing the computer’s DHCP lease. In a nutshell, DHCP is a protocol by which network addresses are assigned - a central server provides “leases” to computers that request them, entitling them to use of a certain address for a certain period of time. One way to clear up some networking issues is to have your computer “release” its current lease and then request a new one - the central server may or may not give you the same address you had before, but the address you receive should be “clean”.
This laptop had received a lease for
address A in the distant past. One
curious property of DHCP is that, in addition to requesting any available
address, computers can also ask for specific addresses, and then receive them
directly. So my laptop has been asking for
address A - and apparently
receiving it without controversy - for the last several weeks, and all the
while, some other machine on the network has also been using address A.
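To make the "ask for a specific address" behavior concrete, here is a toy model. It is a heavily simplified sketch, not a real DHCP implementation: the class `ToyDhcpServer`, its methods, and the addresses are all hypothetical. It also shows one way a conflict could arise (a machine using an address the server has no record of), which is an assumption for illustration and not a diagnosis of what happened on my network.

```python
class ToyDhcpServer:
    """Hypothetical lease table; real DHCP servers do far more than this."""

    def __init__(self, pool):
        self.pool = list(pool)
        self.leases = {}  # address -> client name

    def request(self, client, wanted=None):
        """Grant `wanted` if the server believes it is free, else any free one."""
        if wanted in self.pool and self.leases.get(wanted, client) == client:
            self.leases[wanted] = client
            return wanted
        for addr in self.pool:
            if self.leases.get(addr, client) == client:
                self.leases[addr] = client
                return addr
        return None  # pool exhausted


if __name__ == "__main__":
    server = ToyDhcpServer(["192.0.2.10", "192.0.2.11"])

    # The laptop asks for its old address and gets it back.
    laptop_addr = server.request("laptop", wanted="192.0.2.10")

    # Another client asking for the same address is given a different one.
    desktop_addr = server.request("desktop", wanted="192.0.2.10")

    # Assumption for illustration: a machine configured by hand uses an
    # address the server knows nothing about, so the lease table sees no
    # problem even though the network does.
    configured_by_hand = {"192.0.2.10"}

    print("laptop:", laptop_addr, "desktop:", desktop_addr)
    print("conflict:", laptop_addr in configured_by_hand)
```

The point of the toy is that the server can only avoid conflicts it knows about; anything outside its lease table is invisible to it.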
Why? One can speculate. There are many ways this situation can arise, not all of them accidental.
That aside, one DHCP renewal later, I had a nonconflicting IP address,
address B, and my packet loss problems promptly vanished.
Why didn’t I try this before? Because the possibility of there being a DHCP issue hadn’t occurred to me in literally years. These days, DHCP usually does The Right Thing.
The moral: sometimes the simplest solutions are really the most useful.