Opened 5 years ago

Closed 5 years ago

Last modified 22 months ago

#8520 closed defect (fixed)

wr741nd: poor tcp throughput on wan interface for hosts with big latencies

Reported by: fercerpav@… Owned by: developers
Priority: response-needed Milestone: Barrier Breaker 14.07
Component: packages Version: Trunk
Keywords: Cc: nbd@…, zdenek.koprivik@…

Description

KOPRajs irc user reports

12:15 < KOPRajs> low download speeds
12:15 < KOPRajs> it seems to depend on what server I'm downloading from
12:17 < KOPRajs> the higher is the ping to the server the lower is the speed... reverting on original firmware or on dd-wrt 
                 there's no such problem
12:18 < KOPRajs> when downloading from debian.org I have about 10x slower speed than with original firmware or with dd-wrt
12:18 < KOPRajs> no QoS
12:18 < KOPRajs> wireshark shows there's a lot of "tcp last segment lost" and then duplicate acks
12:19 < KOPRajs> any iso from debian.org ati.com and many more
12:19 < KOPRajs> I'm at czech
12:20 < KOPRajs> when downloading from local .cz servers it is better, about half of the normal speed
12:22 < KOPRajs> if you connect without the router you'll get full speed or if you revert to original firmware
12:23 < KOPRajs> downloading from server connected directly to the WAN port I can get 100Mbit/s

I tried downloading an ISO from debian.org and the download speed is
indeed rather low. I'd like to test this with netem simulating high
latency, but I can't yet see how to do it without disrupting my
current setup.

Change History (17)

comment:1 Changed 5 years ago by zdenek.koprivik@…

Hi,
I'm the "IRC user KOPRajs" from the above post. I just want to be more specific on the described problem.

Observations:

  • exactly the same download with the original TP-Link firmware installed on the router gives the expected speed (about 10 times faster than with OpenWRT)
  • downloads from local servers, or tests with iperf and the like (latency < 2 ms), seem unaffected (or at least much less affected); only downloads from remote servers such as debian.org, ati.com etc. (latency > 20 ms) are very slow
  • using Wireshark to dump the slow download with OpenWRT shows many "tcp previous segment lost" errors as the cause of the low average download speed
  • downloading over wireless is still slower than expected but faster than over the RJ-45 LAN port
  • using 'wget -O /dev/null http://cdimage...' on the router gives the same speed as downloading over wireless (still very slow but faster than over the LAN port)
  • connecting the router as a wireless client and then running the same 'wget -O /dev/null ...' as above gives the expected full speed of the link
  • still in wireless client mode, connecting my notebook to a LAN port and downloading through the router again results in a slow download

Conclusions:

  • it seems to me the problem is connected with the ethernet drivers (ag71xx) since wireless seems to be unaffected
  • to confirm that I tried DD-WRT on the same router as it is using ag7240 ethernet driver and not ag71xx... with DD-WRT installed I get expected download speeds (about 10 times faster than with OpenWRT)
  • at least 3 people with different routers and from different countries confirmed they can reproduce the problem

ag71xx developers, please investigate this serious ag71xx performance problem.
Thank You

comment:2 Changed 5 years ago by fercerpav@…

I did some tests with

tc qdisc modify dev eth0 root netem delay 10ms

Both WAN and LAN interfaces are affected, and the problem looks quite severe: with 100 ms latency the throughput drops to a few megabits per second.

Here's the result with 10 ms latency; iperf was run from the router to a laptop connected directly to its LAN interface with a patch cord:

Client connecting to 192.168.1.100, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  3] local 192.168.1.1 port 49917 connected with 192.168.1.100 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-20.0 sec  88.5 MBytes  37.1 Mbits/sec
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 192.168.1.1 port 5001 connected with 192.168.1.100 port 42840
[  4]  0.0-20.0 sec   112 MBytes  46.9 Mbits/sec

As you can see, easily reproducible :)
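
For context on why added delay alone can limit TCP throughput: the amount of data that must be in flight to keep a link busy is the bandwidth multiplied by the round-trip time. The following is a minimal sketch of that arithmetic, assuming a 100 Mbit/s Ethernet link and treating the netem delay as the whole round-trip time. Both are assumptions, and the function names are illustrative only.

```python
def bdp_bytes(link_mbit_s, rtt_ms):
    """Bytes that must be in flight to keep the link full at this RTT."""
    return link_mbit_s * 1e6 / 8 * rtt_ms / 1e3


def window_limited_mbit_s(window_bytes, rtt_ms):
    """Upper bound on throughput when at most window_bytes are sent per RTT."""
    return window_bytes * 8 / (rtt_ms / 1e3) / 1e6


# Assumed link speed: 100 Mbit/s. RTT values mirror the latencies discussed in the ticket.
for rtt in (2, 10, 100):
    print(f"RTT {rtt:>3} ms: BDP = {bdp_bytes(100, rtt) / 1024:.0f} KiB")
```

At 100 ms the link needs about 1.25 MB in flight, so anything that shrinks the window (such as a lost segment) costs far more there than at 2 ms. This does not by itself explain the behaviour reported here, since other firmware reaches full speed on the same link.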

comment:3 Changed 5 years ago by nbd

Please try using one of the LAN ports as a WAN port (using a VLAN) to see if that makes a difference.

I'd like to know whether only the actual WAN port is problematic or whether it's a problem on LAN too.

comment:4 Changed 5 years ago by zdenek.koprivik@…

The problem affects all ethernet ports (both WAN and LAN).

Router in standard mode:

  • download without router -> good speed
  • download on the router (wget -O /dev/null) (traffic goes only through WAN) -> bad speed
  • download from the PC connected to the LAN (traffic goes through both WAN and LAN) -> even worse speed
  • download from the PC connected to the Wireless (traffic goes through WAN and WLAN) -> again 'just' bad speed

Router in wireless client mode:

  • download on the router (wget -O /dev/null) (traffic goes only through WLAN) -> good speed!
  • download from the PC connected to the LAN (traffic goes through WLAN and LAN) -> bad speed

So the effect stacks. The more Ethernet ports there are in the path, the worse the speed.

Wireshark shows many lost TCP segments, each of which makes the sender shrink its TCP window. For hosts with very low latency (<10 ms) the window reopens quickly, so the speed is not affected much, but the higher the host latency, the slower the window reopens, and that drags down the average speed.
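
The interaction between loss and latency described above matches a well-known approximation for steady-state TCP throughput (the Mathis et al. model): throughput is bounded by roughly (MSS / RTT) * C / sqrt(p), where p is the packet loss rate and C is about sqrt(3/2). The sketch below is illustrative only; the MSS of 1460 bytes and the loss rate of 0.1% are assumed values, not measurements from this ticket.

```python
import math


def mathis_throughput_mbit_s(mss_bytes, rtt_ms, loss_rate):
    """Approximate upper bound on TCP throughput under random loss (Mathis model)."""
    c = math.sqrt(3 / 2)  # constant for the simple periodic-loss model
    bytes_per_s = (mss_bytes / (rtt_ms / 1e3)) * c / math.sqrt(loss_rate)
    return bytes_per_s * 8 / 1e6


# Assumed for illustration: 1460-byte MSS, 0.1% packet loss.
for rtt_ms in (2, 20, 100):
    bound = mathis_throughput_mbit_s(1460, rtt_ms, 0.001)
    print(f"RTT {rtt_ms:>3} ms -> at most about {bound:.1f} Mbit/s")
```

For a fixed loss rate the bound scales inversely with RTT, which is consistent with local servers being barely affected while distant ones are slow.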

So it seems to me that the driver probably simply loses a packet from time to time, but I'm unable to prove that.

Again, neither original TP-Link firmware nor DD-WRT (both use ag7240) are affected so this must be a software problem specific to OpenWRT (ag71xx).

Thank you for any help on this.

comment:5 Changed 5 years ago by nbd

Please also try to figure out if packets get dropped in rx or in tx direction.
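
One way to start looking at the direction of drops is to compare the per-interface drop counters in /proc/net/dev before and after a test download. The sketch below parses that file's standard column layout; the sample text and its numbers are made up for illustration, and the function name is hypothetical. Losses inside the switch or MAC may not show up in these counters at all, so a zero here does not rule out the driver.

```python
def drop_counters(proc_net_dev_text):
    """Return {iface: (rx_drop, tx_drop)} parsed from /proc/net/dev content."""
    counters = {}
    for line in proc_net_dev_text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        iface, stats = line.split(":", 1)
        fields = stats.split()
        if len(fields) < 16:
            continue  # not a complete statistics row
        # Eight receive columns come first, then eight transmit columns;
        # "drop" is the fourth column of each group.
        counters[iface.strip()] = (int(fields[3]), int(fields[11]))
    return counters


# Made-up sample in the /proc/net/dev layout (values are NOT from the ticket).
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
  eth0: 5000000    4000    0   12    0     0          0         0  3000000    3500    0    3    0     0       0          0
"""

print(drop_counters(SAMPLE))
```

On a real system the input would be the content of /proc/net/dev, read once before and once after the download and then compared.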

comment:6 Changed 5 years ago by zdenek.koprivik@…

It seems to affect both rx and tx.
Also please note that it looks like it doesn't depend on the traffic load.

comment:7 Changed 5 years ago by nbd

Please try applying this patch onto your kernel tree and see if it helps with this issue: http://nbd.name/flowcontrol.patch

comment:8 Changed 5 years ago by zdenek.koprivik@…

Hi,
sorry, the above patch doesn't help. The average download speed is even a bit worse than before.

I've made a few IO graphs showing the problem when downloading an .iso from debian.org:

This is with the original TP-LINK firmware (DD-WRT behaves the same):
http://speedtest.mx-net.cz/tl-wr741nd_original.png
The average download speed is above 900 kB/s, which is the full speed of this link (8 Mbit/s) and matches what a PC connected directly, without the router, achieves. Note also that there are no "TCP previous segment lost" errors in the log.

This is with my OpenWRT build (official OpenWRT images behave the same):
http://speedtest.mx-net.cz/tl-wr741nd_mx-home_unpatched.png
The average speed is about 400 kB/s. Every drop corresponds to one "TCP previous segment lost" error in the log.

This is with the above patch applied:
http://speedtest.mx-net.cz/tl-wr741nd_mx-home_patched.png
The average speed is about 350kB/s.
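
To put the three averages on the same scale as the nominal 8 Mbit/s link, here is a small sketch of the unit conversion. It assumes 1 kB = 1000 bytes, which may not match the units used by the graphing tool, and takes the approximate figures quoted above as inputs.

```python
def kbytes_to_mbit(kbytes_per_s):
    """Convert kB/s to Mbit/s, assuming 1 kB = 1000 bytes."""
    return kbytes_per_s * 8 / 1000


# Approximate averages quoted in this comment, in kB/s.
measurements = {
    "original firmware": 900,
    "OpenWRT unpatched": 400,
    "OpenWRT patched": 350,
}

baseline = measurements["original firmware"]
for name, rate in measurements.items():
    share = rate / baseline
    print(f"{name}: {kbytes_to_mbit(rate):.1f} Mbit/s ({share:.0%} of the original firmware)")
```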

comment:9 Changed 5 years ago by anonymous

Hi,
anything new on this? I've run the same tests on a UBNT RouterStation Pro (which also uses the ag71xx driver but has a different chipset) and there is no such problem there. So the problem seems to be specific to the ag7240 chipset.
So far the problem is confirmed for WR741ND and WR941ND.
If it helps I'm willing to send one WR741ND to nbd for testing. If interested, please let me know on zdenek.koprivik*post.cz.

comment:10 Changed 5 years ago by joe

Hi, I'm having these issues as well. With OpenWRT, my TP-LINK WR941ND suffers a huge loss of connection speed to distant servers. DD-WRT did not have this issue, but OpenWRT suits me much better, so I would love to see this bug fixed. Is there anything I can do to help get this issue solved, even though I'm not a kernel developer?
Thanks!

comment:11 Changed 5 years ago by nbd

Please try latest trunk. Make sure you've run make target/linux/clean and make oldconfig after svn update if it's not a fresh build.

comment:12 Changed 5 years ago by nbd

Any news on testing latest trunk?

comment:13 Changed 5 years ago by nbd

  • Priority changed from normal to response-needed

comment:14 Changed 5 years ago by zdenek.koprivik@…

I tried trunk on Sunday. I just didn't want to be too optimistic, so I held off on reporting, and I plan to do more testing this weekend.
But so far so good.

Tested versions are WR741ND and WR941ND.

comment:15 Changed 5 years ago by joe

I can happily confirm that with trunk the TP-LINK WR941ND is working flawlessly again!
Thanks!

comment:16 Changed 5 years ago by loswillios

  • Resolution set to fixed
  • Status changed from new to closed

comment:17 Changed 22 months ago by jow

  • Cc changed from nbd@openwrt.org,zdenek.koprivik@post.cz to nbd@openwrt.org, zdenek.koprivik@post.cz
  • Milestone changed from Attitude Adjustment 12.09 to Barrier Breaker 14.07

Milestone Attitude Adjustment 12.09 deleted
