Yesterday morning I got frustrated by a really slow download speed of some files. What should have taken seconds with my 400mb/s connection actually takes more than 16 minutes. In addition, I’m able to reproduce the problem on my router.
Here curl statistics show a 26:52 minutes long download at 415kb/s:
$ curl -o /dev/null https://s3.us-east-2.amazonaws.com/zuul-images/fedora-open-vm-tools-livecd.iso
% Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed
0 627M 0 2684k 0 0 398k 0 0:26:52 0:00:06 0:26:46 415k^C
Just to be sure, I retry with a lower MTU. The issue remains. Two friends in Montréal also confirm everything works well for them.
My laptop uses a wire connection that I’ve been using for years now. A large part of the population works from home and the Internet is under constant pressure.So I suspected an ISP bandwidth limitation to preserve the Quality of Service. I contacted them and … complaint… a lot.
At some point, the technician manages to get my attention and asks me to connect the modem directly to my laptop.
This is an obviously pointless request since the downloads are also slow from the router. But well, if I want them to act, I also need to be cooperative. I give it a try and… I immediately felt embarrassed. It was damn fast! The download speed is actually even above the 400mb/s ceiling. WTF.
So I start to reconsider my whole life in a deep introspection. How can this be real? My router runs OpenBSD 6.7 with an absolutely basic pf configuration. Its hardware is decent (Intel(R) Core(TM) i3-2120 CPU @ 3.30GHz). The router is connected to the modem with a very common Realtek RTL8125 and the LAN is connected to an Intel I350. I can download large files between my laptop and the router at full speed.
All this takes out of the equation my laptop, the wire, the router’s I350. The last potential culprit is the Realtek NIC. I replace it with another Intel I350 and retried. Same problem. The downloads are still slow on my laptop and the router. The Realtek NIC is innocent.
So I start thinking. Why the rest of the family is not really affected by the poor performances of the Internet connection. And so, I try to download the same file from another computer. And it’s fast… Damn, what’s going on. I switch some ports of the switch just to be sure. The problem is now between my laptop, the wire.
My laptop uses a bonding of the Wifi and the Wire connection. The gives me the ability to move around with the laptop without losing all my open connections. For instance, I can remove my wire in the middle of a meeting and the laptop will use the Wifi seamlessly. But well, I disgress. I remove this special configuration. And, … the problem remains.
But during this last check, I also saw a large number of RX errors associated with the laptop NIC. Interesting, let’s try another NIC. I plug in a USB3 Realtek RTL8156 NIC and this time… it works.
The wire is a CAT6 cable and is not that long (20m), but it sounds like the NIC (Intel I219-LM) of my T580 is a bit picky with the quality of the signal. It can also be a problem with the e1000e drive of the new 5.10 kernel. The cable is good, I’ve just tested it. Anyway I’ve put a switch just before my laptop NIC and everything works great now.
I’m still not sure why by download are still slow on OpenBSD. But this is an adventure for another day. The slow downloads were all with HTTPS sites (S3 and a Caddy website), the DF flag was on (TCP Don’t Fragment) which exaserves the impact of the transmission errors.
I found the whole situation to be interesting. It’s a series of wrong assumptions and the solution is really far from what I would have imagined.
Also, thank you Teksavvy for your great support.
update: this would partially explain the OpenBSD S3 download problem.
update2: I’m now running Linux 5.10.11 and… I don’t see any RX errors anymore! The S3 download is just as fast as it should be. So this was indeed a problem with the e1000e driver.