Opened 4 years ago

Closed 4 years ago

Last modified 21 months ago

#10486 closed defect (no_response)

Low throughput (20Mbps) in WR1043ND and failure messages in dmesg

Reported by: danielkza2@… Owned by: nbd
Priority: response-needed Milestone: Barrier Breaker 14.07
Component: packages Version: Trunk
Keywords: ath9k wr1043nd wifi wireless Cc:

Description

I've been trying to find out why I have such horrible bandwidth using a TP-Link WR1043ND router, and after trying 10.03.1 RC5 and RC6 I decided to give trunk (22-11-2011, r29289) a go, and I get these messages continually in dmesg at apparently random intervals:

ath: Failed to stop TX DMA, queues=0x004!
ath: Failed to stop TX DMA, queues=0x004!
ath: Failed to stop TX DMA, queues=0x004!
ath: Failed to stop TX DMA, queues=0x004!
ath: Failed to stop TX DMA, queues=0x004!
ath: Failed to stop TX DMA, queues=0x004!

Is it possible these are related to the horrible throughput? I tried pretty much everything: using multiple wireless adapters with different chips (2 Broadcom, 2 Realtek), configuring antennas manually, disabling encryption, all possible channel combinations, and enabling and disabling RTS/CTS. Absolutely nothing helped.

Here is the relevant wireless configuration:

wireless.radio0=wifi-device
wireless.radio0.type=mac80211
wireless.radio0.macaddr=f4:ec:38:e9:a9:a0
wireless.radio0.ht_capab=SHORT-GI-40 DSSS_CCK-40
wireless.radio0.disabled=0
wireless.radio0.distance=30
wireless.radio0.frag=2346
wireless.radio0.rts=2346
wireless.radio0.txpower=27
wireless.radio0.noscan=1
wireless.radio0.channel=3
wireless.radio0.country=US
wireless.radio0.diversity=0
wireless.radio0.txantenna=0x5
wireless.radio0.rxantenna=0x2
wireless.radio0.hwmode=11ng
wireless.radio0.htmode=HT40+

Attachments (0)

Change History (33)

comment:1 follow-up: Changed 4 years ago by nbd

Have you tried using HT20 instead of HT40+?

comment:2 Changed 4 years ago by nbd

Another thing: your txantenna/rxantenna settings look completely messed up. Please set them both to 0x7

comment:3 in reply to: ↑ 1 Changed 4 years ago by anonymous

Replying to nbd:

> Have you tried using HT20 instead of HT40+?

I had tried it multiple times before and it does not help at all. Would also not be acceptable since the whole point of 802.11n is to use the extra channels for more bandwidth, isn't it?

> Another thing: your txantenna/rxantenna settings look completely messed up. Please set them both to 0x7

They were previously set to 0x7, I changed it exactly because I was having trouble. I thought making the antennas exclusive for TX or RX might help, but it changed absolutely nothing, so I did not revert it.
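For readers following the antenna discussion: a minimal sketch, assuming the txantenna/rxantenna values are bitmasks in which each set bit selects one antenna chain. The helper names (`parse_uci`, `chains`) are made up for this illustration and are not part of OpenWrt. It decodes the values from the configuration posted in the description.

```python
def parse_uci(text):
    """Parse `key=value` lines (the format shown in the ticket) into a dict."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep:
            settings[key] = value
    return settings


def chains(mask):
    """Return the indices of the bits set in a chain bitmask (assumed meaning)."""
    return [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]


# Values copied from the configuration in the ticket description.
CONFIG = """\
wireless.radio0.txantenna=0x5
wireless.radio0.rxantenna=0x2
"""

settings = parse_uci(CONFIG)
for name in ("txantenna", "rxantenna"):
    mask = int(settings["wireless.radio0." + name], 16)
    print(name, hex(mask), "-> chains", chains(mask))

# The value suggested in comment 2 selects all three chains.
print("0x7 -> chains", chains(0x7))
```

Under that assumption, 0x5 and 0x2 split the three chains between TX and RX, while 0x7 enables every chain in both directions.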

comment:4 follow-up: Changed 4 years ago by danielkza2@…

Oops, forgot to use my username on the post right above this one. I tried reverting the antennas to 0x7 and using HT20; this is my iperf output now. Borderline unusable.

~ $ iperf -c 192.168.1.1 -i 5 -t 60
------------------------------------------------------------
Client connecting to 192.168.1.1, TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  3] local 192.168.1.100 port 54141 connected with 192.168.1.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 5.0 sec  1.70 MBytes  2.86 Mbits/sec
[ ID] Interval       Transfer     Bandwidth
[  3]  5.0-10.0 sec  24.0 KBytes  39.3 Kbits/sec
[ ID] Interval       Transfer     Bandwidth
[  3] 10.0-15.0 sec  16.0 KBytes  26.2 Kbits/sec
[ ID] Interval       Transfer     Bandwidth
[  3] 15.0-20.0 sec  16.0 KBytes  26.2 Kbits/sec
[ ID] Interval       Transfer     Bandwidth
[  3] 20.0-25.0 sec  2.35 MBytes  3.95 Mbits/sec
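To make stalls like the ones above easier to spot in longer runs, here is a rough sketch that parses the per-interval lines of an iperf report and converts the bandwidth column to Mbits/sec. The function name `parse_intervals` and the 1 Mbit/s stall threshold are assumptions chosen for illustration, not anything iperf provides.

```python
import re

# Matches lines such as "[  3]  5.0-10.0 sec  24.0 KBytes  39.3 Kbits/sec".
INTERVAL = re.compile(
    r"\[\s*\d+\]\s*([\d.]+)-\s*([\d.]+)\s+sec\s+"
    r"[\d.]+\s+[KMG]?Bytes\s+"
    r"([\d.]+)\s+([KMG]?)bits/sec"
)
# Scale factors from the reported unit prefix to Mbits/sec.
TO_MBIT = {"": 1e-6, "K": 1e-3, "M": 1.0, "G": 1e3}


def parse_intervals(report):
    """Return (start_s, end_s, mbit_per_s) for each interval line found."""
    rows = []
    for line in report.splitlines():
        m = INTERVAL.search(line)
        if m:
            start, end, rate, prefix = m.groups()
            rows.append((float(start), float(end), float(rate) * TO_MBIT[prefix]))
    return rows


# Interval lines copied from the iperf output above.
REPORT = """\
[  3] local 192.168.1.100 port 54141 connected with 192.168.1.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 5.0 sec  1.70 MBytes  2.86 Mbits/sec
[  3]  5.0-10.0 sec  24.0 KBytes  39.3 Kbits/sec
[  3] 10.0-15.0 sec  16.0 KBytes  26.2 Kbits/sec
[  3] 15.0-20.0 sec  16.0 KBytes  26.2 Kbits/sec
[  3] 20.0-25.0 sec  2.35 MBytes  3.95 Mbits/sec
"""

STALL_MBIT = 1.0  # hypothetical threshold for calling an interval "stalled"
for start, end, mbit in parse_intervals(REPORT):
    flag = "  <-- stalled" if mbit < STALL_MBIT else ""
    print(f"{start:5.1f}-{end:5.1f} s  {mbit:7.4f} Mbit/s{flag}")
```

The timestamps of stalled intervals could then be compared against the times of the ath messages in dmesg.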

comment:5 in reply to: ↑ 4 Changed 4 years ago by anonymous

Sorry if I sounded harsh in the previous replies, I'm just really annoyed by this issue since I'm really dependent on media streaming. If there are any other configurations I could try, or if it would help to post my current settings, I'd be happy to do so.

comment:6 Changed 4 years ago by nbd

  • Owner changed from developers to nbd
  • Status changed from new to accepted

I have some ideas to try. As soon as I have some more time, I will prepare some patches for you to test.

comment:7 Changed 4 years ago by nbd

please try r29339

comment:8 Changed 4 years ago by danielkza2@…

Thanks, compiling it right now. Hopefully it won't take too long (I've never tried compiling OpenWRT before!).

comment:9 Changed 4 years ago by danielkza2@…

Things didn't go very well. I compiled OpenWRT from the backfire branch (checkout from SVN, make defconfig, make menuconfig, select WR1043ND profile, make), and after flashing I got constant reboots, sometimes randomly and others right as I tried to login through SSH, after which I would need to unplug the router from power as it stopped responding completely (not assigning DHCP addresses, no response on telnet or SSH).

I'll try recompiling to be sure I didn't make any mistakes and I'll report the results.

comment:10 Changed 4 years ago by danielkza2@…

It was my mistake after all: I recompiled and the firmware is fine. Unfortunately it doesn't seem to have solved my problem. I still can't get anything over 20Mbps, with either HT20 or HT40+/-, 802.11g or 802.11n, using either a USB Atheros-based TP-Link WiFi adapter or the Broadcom-based internal adapter on my notebook. Windows 7 SP1 and Ubuntu 11.04 show the exact same behavior.

After some testing I found out something interesting: after the "Failed to stop TX DMA, queues=0x004!" message, the throughput from both the desktop and the notebook drops to about 3.5Mbps and is only restored to (the bad but not that horrible) 20Mbps when the clients reconnect.

comment:11 Changed 4 years ago by danielkza2@…

Apparently I spoke too soon: after recompiling and making sure to follow all the steps precisely (make distclean, update package feeds, install package feeds, add arch and router to config, make defconfig, make menuconfig, make), I still get reboots and non-responsiveness. After the first drop the WiFi never came back up, and I had to use failsafe mode to recover to RC6 because the router would randomly drop even the wired connections.

comment:12 Changed 4 years ago by Paul Geraedts <p.f.j.geraedts@…>

I have 2 Buffalo WZR-HP-G300NH devices here, both running trunk.

It seems that the one running r28057 shows normal behavior and the one running r29337 shows the behavior as described in this ticket #10486.

I'm planning to find out what is causing this. Currently, two questions:

1) Is the hardware of every WZR-HP-G300NH identical, or do several hardware revisions of the WZR-HP-G300NH exist? I.e. can I exclude differences between hardware revisions?

2) how can I trigger this specific behavior reliably?

Thanks,

Paul

comment:13 Changed 4 years ago by danielkza2@…

I found these messages in my logs yesterday, running trunk r29330. Never mind the wrong date:

Sep 10 03:59:29 OpenWrt kern.err kernel: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020 DMADBG_7=0x00026020
Sep 10 03:59:29 OpenWrt kern.err kernel: ath: Could not stop RX, we could be confusing the DMA engine when we start RX up
Sep 10 03:59:29 OpenWrt kern.err kernel: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020 DMADBG_7=0x00026020
Sep 10 03:59:29 OpenWrt kern.err kernel: ath: Could not stop RX, we could be confusing the DMA engine when we start RX up
Sep 10 03:59:29 OpenWrt kern.err kernel: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020 DMADBG_7=0x00026020
Sep 10 03:59:29 OpenWrt kern.err kernel: ath: Could not stop RX, we could be confusing the DMA engine when we start RX up
Sep 10 03:59:29 OpenWrt kern.err kernel: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020 DMADBG_7=0x00026020

comment:14 Changed 4 years ago by danielkza2@…

Decided to try trunk again today. r29423 does not cause reboots or complete failures, but is still limited to 20Mbps. I'm monitoring for error messages but have found none so far.

comment:15 Changed 4 years ago by danielkza2@…

Scratch that, I'm getting the DMA errors still:

[31459.070000] ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020 DMADBG_7=0x000286c0
[31459.080000] ath: Could not stop RX, we could be confusing the DMA engine when we start RX up
[65583.950000] ath: Failed to stop TX DMA, queues=0x004!
[97263.850000] ath: Failed to stop TX DMA, queues=0x004!
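Since the messages arrive at irregular intervals, a small sketch like the following could help tabulate them from saved dmesg output. The function name `classify` and the category labels are invented for this example; the matched substrings are taken from the log lines in this ticket.

```python
import re
from collections import Counter

# Matches dmesg lines such as "[65583.950000] ath: Failed to stop TX DMA, ...".
LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+ath: (.+)$")

# Hypothetical labels mapped to substrings seen in the ticket's logs.
KINDS = {
    "tx_dma_stop_failed": "Failed to stop TX DMA",
    "dma_stop_timeout": "DMA failed to stop in",
    "rx_stop_failed": "Could not stop RX",
}


def classify(log):
    """Return (timestamp_s, label) for every recognised ath message."""
    events = []
    for line in log.splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue
        for label, needle in KINDS.items():
            if needle in m.group(2):
                events.append((float(m.group(1)), label))
                break
    return events


# Log lines copied from comment 15.
LOG = """\
[31459.070000] ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020 DMADBG_7=0x000286c0
[31459.080000] ath: Could not stop RX, we could be confusing the DMA engine when we start RX up
[65583.950000] ath: Failed to stop TX DMA, queues=0x004!
[97263.850000] ath: Failed to stop TX DMA, queues=0x004!
"""

events = classify(LOG)
print(Counter(label for _, label in events))

# Seconds between consecutive TX DMA failures.
tx_times = [t for t, label in events if label == "tx_dma_stop_failed"]
gaps = [later - earlier for earlier, later in zip(tx_times, tx_times[1:])]
print("gaps between TX DMA failures (s):", gaps)
```

The dmesg timestamps are seconds since boot, so matching them against iperf intervals requires knowing when the test started relative to boot.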

comment:16 Changed 4 years ago by nbd

please try latest trunk and see if you still get these issues.

comment:17 Changed 4 years ago by alexander

r29936

[10772.560000] nf_conntrack: table full, dropping packet.
[11118.710000] ath: Failed to stop TX DMA, queues=0x004!

hardware: TP-Link MR3420
For the first time, my notebook connected at 150mbs over wireless.
Looks more stable than r29592; the wifi no longer disconnects every 10 minutes.
I don't know how to debug the error any deeper.

comment:18 Changed 4 years ago by nbd

  • Priority changed from normal to response-needed

any issues remaining aside from the kernel messages?

comment:19 Changed 4 years ago by alexander

I think everything works perfectly. No disconnects all day. Except one thing: speed between two .n notebooks is 300kb/s max, but speed between .n & .g is 2mb/s. LAN-WLAN speed is 7mb/s and it's stable! Any comments?

comment:20 Changed 4 years ago by alexander <malivinski@…>

Today I got a WR1043ND. With the latest OpenWrt trunk, speed between two .n clients is also 300kb/s max. I installed the original firmware and see 700kb/s max, so I think this is a chip problem, not a firmware problem?

comment:21 Changed 4 years ago by jow

Maybe a general interop issue with your client chipset? Did you update your client wifi drivers as well?

comment:22 Changed 4 years ago by danielkza2@…

I see no improvement with trunk r29981: I can't get anything over 20 Mb/s, and in addition I'm getting random latency spikes and packet drops over wireless, at roughly 5-minute intervals.

comment:23 Changed 4 years ago by alexander

I set up a direct computer-to-computer wifi network and got 2mb/s. All installed drivers are the latest available versions. I think it's not a problem for me, because the file server is attached to the LAN, and LAN-wifi speed is great with trunk r29936. I can't check r29981 now, maybe a bit later. But r29936 works perfectly for me.

comment:24 Changed 4 years ago by nbd

can anybody still reproduce throughput issues in latest trunk?

comment:25 Changed 4 years ago by Paul Geraedts <p.f.j.geraedts@…>

@nbd: yes, after updating one of my devices (see post above) from r29337 to r30791 I see similar behavior. Please let me know if/how I can help to debug this issue.

comment:26 Changed 4 years ago by nbd

What kind of throughput are you getting? And does it affect both directions or just one?

comment:27 Changed 4 years ago by Paul Geraedts <p.f.j.geraedts@…>

AP->station, inside the same room: typically ~100Mb/s, but when this low-throughput/failure mode gets triggered, <20Mb/s for several minutes, sometimes as low as 20Kb/s (continuously measuring with iperf and reporting over 15s intervals).

The low-throughput mode is much harder to trigger than in previous releases though (I was lucky yesterday). I did some quick testing today, but was unsuccessful. It is hard to trigger reliably and seems to be related to the specific radio environment.

Felix, what do you think may have caused the improved throughput behavior of current trunk? minstrel-related or something?

At the end of March I'll have 2 identical AP-station combinations running so I can do more proper symmetrical stress tests. I'll report back afterwards.

comment:28 Changed 4 years ago by nbd

in current trunk there are some more ANI improvements, please test again

comment:29 Changed 4 years ago by nbd

  • Resolution set to no_response
  • Status changed from accepted to closed

comment:30 Changed 4 years ago by Paul Geraedts <p.f.j.geraedts@…>

Sorry for not responding in a timely manner. The wifi hardware I was waiting for was never dispatched and I got distracted by something else in the meantime.

The low throughput I observed and still observe seems much more related to #10166 anyhow. So let's leave this ticket closed.

comment:31 Changed 4 years ago by nbd

please try r32159 to see if it makes the issues go away on your hardware

comment:32 Changed 4 years ago by Paul Geraedts <p.f.j.geraedts@…>

Was already doing so : )

I'm doing a 12 hour iperf run between my laptop and my 1Gb/s connected server with 2 WZR-HP-G300NH transmitting in the same 40MHz band (of which the test setup is producing the dominant traffic). Tomorrow morning I will have some real answers.

That said: up to now I am pretty amazed by the steady performance and the responsiveness of the wifi networks! I certainly hope this turns out to be the final conclusion. A great achievement! It has been an annoyance for a long time.

comment:33 Changed 21 months ago by jow

  • Milestone changed from Attitude Adjustment 12.09 to Barrier Breaker 14.07

Milestone Attitude Adjustment 12.09 deleted
