Opened 6 years ago

Closed 5 years ago

Last modified 7 months ago

#8343 closed defect (worksforme)

unstable wireless connection on WZR-HP-G300NH ar71xx

Reported by: Anton <anton.bugs@…>
Owned by: nbd
Priority: normal
Milestone: Backfire 10.03.1
Component: packages
Version: Backfire 10.03.1 RC4
Keywords:
Cc:

Description

Hi, I've flashed my Buffalo G300 with Backfire RC4 (openwrt-ar71xx-wzr-hp-g300nh-jffs2-sysupgrade.bin). Sadly, the wireless connection is still unstable and keeps disconnecting randomly. I didn't experience this problem with the original firmware.

I've restricted wireless to 11g via wireless.radio0.hwmode (as the wiki suggested), but it didn't help.
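For reference, a restriction like this is typically applied with uci. The following is only a minimal sketch, assuming the radio section is named radio0 (as in the option path above) and that the value set was 11g; the exact commands the reporter used are not shown in this ticket.

```shell
# Assumption: the wifi-device section is named radio0.
uci set wireless.radio0.hwmode=11g   # restrict the radio to 802.11g
uci commit wireless                  # write the change to /etc/config/wireless
wifi                                 # reload the wireless configuration
```

The resulting setting can be checked with `uci get wireless.radio0.hwmode`.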

The system log file is attached to this ticket. Please let me know if I can provide anything else useful for debugging and fixing the problem.

Attachments (5)

system.log (15.7 KB) - added by Anton <anton.bugs@…> 6 years ago.
syslog_r24475.log (2.9 KB) - added by Anton <anton.bugs@…> 6 years ago.
the trunk from 11 Dec is still affected
syslog2_r24475.log (2.7 KB) - added by Anton <anton.bugs@…> 6 years ago.
Some more weird logs
syslog_r26181.log (3.0 KB) - added by Anton <anton.bugs@…> 5 years ago.
see more details in the log
ipw2200_r26181.log (16.4 KB) - added by Anton <anton.bugs@…> 5 years ago.
both client and openwrt logs

Change History (48)

Changed 6 years ago by Anton <anton.bugs@…>

comment:1 Changed 6 years ago by Gorby

I recommend flashing the latest trunk. The problem is well known (I hope!) and there have been some tickets about it. The instability of the ath9k driver is slowly being fixed as development continues. Stay tuned.

comment:2 Changed 6 years ago by jow

The mac indicates that the client uses an intel radio. Which chipset exactly? Did you try to update the drivers on the client?

comment:3 Changed 6 years ago by jow

  • Owner changed from developers to nbd
  • Status changed from new to assigned

comment:4 Changed 6 years ago by Anton <anton.bugs@…>

It's not about the client. I have 3 wifi devices, and all of them are affected.

Just in case, it's PRO/Wireless 2200BG [Calexico2], net-wireless/ipw2200-firmware-3.1 with sys-kernel/gentoo-sources-2.6.36. The second client is wip300 linux-based sip phone.

Again, it was working fine with the original firmware and I haven't tried dd-wrt.

ps. How can I subscribe to a bug report so I would receive a mail with each update?

comment:5 Changed 6 years ago by nbd

Normally adding your email address to Cc should be enough to receive an update.
About your issue: please try latest trunk. There were some updates recently that should help with various compatibility issues.

Changed 6 years ago by Anton <anton.bugs@…>

the trunk from 11 Dec is still affected

Changed 6 years ago by Anton <anton.bugs@…>

Some more weird logs

comment:6 Changed 6 years ago by Anton <anton.bugs@…>

comment:7 Changed 6 years ago by Jeff Beard-Shouse <clarke.hackworth@…>

I am having the same issue; my logs look almost identical to the ones provided in this bug. I have the WZR-HP-G300NH router and am running Backfire 10.03.1-rc4. If needed, I may be able to test with trunk over the holiday.

comment:8 follow-up: Changed 6 years ago by nbd

Yes, please try that.

comment:9 in reply to: ↑ 8 Changed 6 years ago by clarke.hackworth@…

Replying to nbd:

Yes, please try that.

Ok

To be clear about the problems: I do get random wifi disconnects, but the actual disconnections happen only about every 1-2 days. This is not that big of a deal to me, as the operating systems I run just reconnect. I do receive the "WPA: group key handshake completed (RSN)" log messages about every 10-20 minutes.
However, what is a bigger deal to me is that the wifi stops working about every 1/2 day to 2 days. By "stops working" I mean all clients get kicked off and the SSID no longer shows up when I scan from a client. The router then needs to be rebooted or the networking subsystem restarted. I am unsure whether this behavior is related to this bug.
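The restart mentioned above can usually be done without a full reboot. A minimal sketch, assuming a stock OpenWrt image where the wifi helper script and the network init script are present (this is a workaround only, not a fix for the underlying hang):

```shell
# Take the wireless interfaces down and bring them back up.
wifi down
wifi up

# If that is not enough, the whole network subsystem can be restarted instead.
# Note: this briefly interrupts wired connections as well.
# /etc/init.d/network restart
```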

I have tried using trunk as of Thursday of last week. I have also tried using trunk with kernel 2.6.37. Both have had the same issues (actually, my impression was that the wifi seemed to stop more often with the newer kernel). It may be a coincidence, but the wifi seems to stop more often when I am using a Cisco VPN and/or a BitTorrent client than with regular web traffic (including streaming video). I don't have any hard numbers or facts about it, so it may just have seemed that way to me.

With the trunk builds I am also seeing new error messages in the log:
kernel: ath: Failed to stop TX DMA in 100 msec after killing last frame
kernel: ath: Could not stop RX, we could be confusing the DMA engine when we start RX up
and
kernel: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020

comment:10 follow-up: Changed 6 years ago by nbd

The error message is not new, it was just hidden before.
Stability should be much better as of r24962 (backfire), r24954 (trunk)
Please test again.

comment:11 in reply to: ↑ 10 Changed 6 years ago by clarke.hackworth@…

Replying to nbd:

The error message is not new, it was just hidden before.
Stability should be much better as of r24962 (backfire), r24954 (trunk)
Please test again.

I have been running trunk r24961 since yesterday. I have had the wireless network stop once so far.

comment:12 Changed 6 years ago by clarke.hackworth@…

Just an update: I have trunk r24966 and kernel 2.6.37. The wireless clients drop connections about 3-4 times a day, along with the occasional wifi network dropout (1-2 a day).

I did notice that when the wifi network drops out, I get about one log message a second relating to "kernel: ath: Failed to stop (TX/RX)".

comment:13 Changed 6 years ago by anonymous

Has this been fixed yet?

comment:14 Changed 6 years ago by clarke.hackworth@…

I have been running r25472 for about a day and a half. I have not had any connection issues like I had with other builds. This may mean that the issue is fixed, but I will need to run for a few more days without problems to be completely sure. I will report back then.

comment:15 Changed 6 years ago by clarke.hackworth@…

I have been running r25472 for 2 full days now and it just had connection problems again. I get log messages like:

Feb 14 10:00:00 OpenWrt user.err kernel: ath: Failed to stop TX DMA in 100 msec after killing last frame
Feb 14 10:00:00 OpenWrt user.err kernel: ath: Failed to stop TX DMA in 100 msec after killing last frame
Feb 14 10:00:00 OpenWrt user.err kernel: ath: Failed to stop TX DMA!
Feb 14 10:00:00 OpenWrt user.err kernel: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020
Feb 14 10:00:00 OpenWrt user.err kernel: ath: Could not stop RX, we could be confusing the DMA engine when we start RX up

every second and the wifi network is not operational (it does not show up as a network when clients scan). The problem corrected itself in about 20 minutes, but then started up again after 5-10 minutes. A reboot temporarily solves the problem.
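To put numbers on how often these messages occur, the log can be summarized per minute. The helper below is a hypothetical sketch (the function name count_dma_failures is made up for illustration); it assumes syslog-style lines that start with "Mon DD HH:MM:SS", as in the excerpt above, and matches only the three message texts quoted in this ticket. It will not work on plain dmesg output, which has no such timestamps.

```shell
# Hypothetical helper: count ath DMA stop failures per minute.
# Reads a syslog-format log on stdin and prints "<count> <Mon> <DD> <HH:MM>".
count_dma_failures() {
    # Keep only the three ath messages quoted in this ticket.
    grep -E 'ath: (Failed to stop TX DMA|DMA failed to stop|Could not stop RX)' |
        # Reduce each line to month, day and HH:MM (seconds dropped).
        awk '{ print $1, $2, substr($3, 1, 5) }' |
        # Group identical minutes and count them.
        sort | uniq -c
}

# Example usage on the router:        logread | count_dma_failures
# Example usage on a saved log file:  count_dma_failures < system.log
```

A minute with roughly 60 or more matches would correspond to the "every second" behavior described here.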

I would say this bug is not fixed. :-(

comment:16 follow-up: Changed 6 years ago by anonymous

Just wanted to point out that I am experiencing this same issue with both OpenWRT and the latest version of DD-WRT (rev. 16144) on this router. It is a commonly reported issue with DD-WRT, though stock firmware does not seem to exhibit the same problem.

comment:17 in reply to: ↑ 16 ; follow-up: Changed 6 years ago by anonymous

Replying to anonymous:

Yes, it has already been highlighted in comment #6 that the source code of the stock firmware is public. All we have to do now is to compare two different sources (which is I guess not an easy job) and come up with a proper patch which would fix *that* problem. Sorry, but I don't like the "please try the latest trac version" approach. It hasn't worked for the last 6 months.

comment:18 in reply to: ↑ 17 Changed 6 years ago by jow

All we have to do now is to compare two different sources (which is I guess not an easy job) and come up with a proper patch which would fix *that* problem.

I wonder how you intend to compare a binary driver blob to the ath9k sources.

comment:19 Changed 5 years ago by anonymous

Not to mention that the stock firmware does appear to exhibit similar behavior, albeit not as often as Buffalo's DD-WRT version.

comment:20 Changed 5 years ago by nbd

Please try latest trunk or backfire, stability should be better now.

comment:21 Changed 5 years ago by clarke.hackworth@…

I am now running r26018. It still seems to be happening. Clients get disconnected and "Failed to stop TX" messages appear every second in the log. The behavior stops after a while. In my latest case I think it lasted for 5 minutes or so, but I have seen it last up to 20 minutes in the past.

I am a programmer and know some C/C++. However I am not as familiar with coding for openwrt. If someone out there who is working on this wants help, I would be willing to try my hand at working on it. All I would need is to be brought up to speed on what current theories are on this bug, and where/how to get started working on it. Send me an email: clarke.hackworth@…

comment:22 Changed 5 years ago by clarke.hackworth@…

grr, it always seems to obscure email addresses. contact me here: clarke.hackworth (at) gmail.com

comment:23 Changed 5 years ago by nbd

Were you using HT20 or HT40?

comment:24 Changed 5 years ago by nbd

Committed some more fixes in r26167, r26172
Please test.

comment:25 Changed 5 years ago by frankbernier@…

Hi, I got this router last week and got the same issue on DD-WRT. I tried the very latest build I could and got the same results, so I decided to give OpenWRT a try. Setup was a breeze, with opkg making everything very easy (I got a working NTFS NAS, PPTP, and everything else set up very easily). I found the snapshots and installed the one from March 16th. So far so good, with a 24-hour uptime on HT40 (I was using "auto" on DD-WRT, but I think it was using HT20). I'm still configuring lots of stuff, so I may end up rebooting, but I'll report if it fails again or after about a week of uptime. Thank you for your work!

Changed 5 years ago by Anton <anton.bugs@…>

see more details in the log

comment:26 Changed 5 years ago by Anton <anton.bugs@…>

nbd,

I really appreciate your work, but I need to report that it is still not fixed for me. The connection still drops quite often. See the log file for details.
Also, my personal feeling is that the official dd-wrt is more stable than the current trunk.

Thanks.

comment:27 Changed 5 years ago by nbd

Anton: Did you have issues with DD-WRT as well, or did everything work there with the same client - have you made a direct comparison?
Also, which version of iwlagn/mac80211 are you using on the client?

Changed 5 years ago by Anton <anton.bugs@…>

both client and openwrt logs

comment:28 Changed 5 years ago by Anton <anton.bugs@…>

I've done more testing with 4 different clients. Here are the results:

iwlagn
02:00.0 Network controller: Intel Corporation Device 4239 (rev 35)
sys-kernel/gentoo-sources-2.6.38
net-wireless/iwl6000-ucode-9.221.4.1
dd-wrt: works fine
openwrt: works ok with some errors as in the attachment

ipw2200
02:02.0 Network controller: Intel Corporation PRO/Wireless 2200BG [Calexico2] Network Connection (rev 05)
net-wireless/ipw2200-firmware-3.1
sys-kernel/gentoo-sources-2.6.36-r5
dd-wrt: works fine
openwrt: works ok with some errors as in the attachment

samsung n150
05:00.0 Network controller: Atheros Communications Inc. AR9285 Wireless Network Adapter (PCI-Express) (rev 01)
ubuntu 10.10 up to date
dd-wrt: works fine
openwrt: connection drops after some time and never comes back

linksys wip300
00:14:bf:fe:6a:7c
dd-wrt: connection drops after some time but comes back after that
openwrt: connection drops after some time but comes back after that
used to work fine with the original firmware

comment:29 Changed 5 years ago by nbd

Anton: Are you using Legacy, HT20 or HT40? Does using legacy 11g mode prevent connection drops with the current version?

Thanks

comment:30 Changed 5 years ago by nbd

Another thing: Please make sure ath9k is compiled with CONFIG_PACKAGE_ATH_DEBUG=y set in the OpenWrt .config.
After booting your router, run this:

rmmod ath9k
insmod ath9k debug=0x8001

This will show messages in the log when the wireless hardware gets reset due to baseband hang issues or stuck beacons.

comment:31 Changed 5 years ago by Anton <anton.bugs@…>

nbd,

I've downloaded the firmware from
http://downloads.openwrt.org/snapshots/trunk/ar71xx/openwrt-ar71xx-generic-wzr-hp-g300nh-jffs2-sysupgrade.bin
wifi settings are almost default, I've just enabled WPA2 PSK encryption. So it was HT20 with 11g+n.

I've reloaded the driver with the debug option and switched it to 11g. The samsung n150 client is stable now; the rest seem the same.

The log file is still not detailed though. Could you compile it with debug config for me please?

ps. I've found a similar bug report in the ticket https://dev.openwrt.org/ticket/8330

comment:32 Changed 5 years ago by anonymous

Openwrt: Backfire (10.03, r26231)
Device: Tp-Link TL-WR1043ND v1.8
Client: Windows 7 64 bit, RTL8187B, drivers version: 62.1181.1105.2009

wifi connection drops after few hours...

# dmesg
ath: Could not stop RX, we could be confusing the DMA engine when we start RX up
ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020
ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020
ath: Could not stop RX, we could be confusing the DMA engine when we start RX up
ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020
ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020
ath: Could not stop RX, we could be confusing the DMA engine when we start RX up
ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020
ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020
ath: Could not stop RX, we could be confusing the DMA engine when we start RX up
ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020
ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020
ath: Could not stop RX, we could be confusing the DMA engine when we start RX up
ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020
ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020
ath: Could not stop RX, we could be confusing the DMA engine when we start RX up
ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020
ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020
ath: Could not stop RX, we could be confusing the DMA engine when we start RX up
ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020

# cat /etc/config/wireless
config 'wifi-device' 'radio0'
        option 'type' 'mac80211'
        option 'macaddr' '74:ea:xx:xx:xx:xx'
        option 'disabled' '0'
        option 'channel' '1'
        option 'hwmode' '11g'
        option 'country' 'PL'

config 'wifi-iface'
        option 'device' 'radio0'
        option 'network' 'lan'
        option 'mode' 'ap'
        option 'ssid' 'xxxxx'
        option 'encryption' 'psk2'
        option 'key' 'xxxxxxxxxxxxx'

comment:33 Changed 5 years ago by jeyjay

same here with recent trunk (r26232)
Device: Asus WL-500gP with Atheros miniPCI (ath9k)

Mar 20 23:45:40 Internetgemeinschaft daemon.notice hostapd: wlan0: STA 00:1d:fe:xx:xx:xx IEEE 802.11: did not acknowledge authentication response
Mar 20 23:45:46 Internetgemeinschaft daemon.notice hostapd: wlan0: STA 00:1d:fe::xx:xx:xx IEEE 802.11: did not acknowledge authentication response
Mar 20 23:45:49 Internetgemeinschaft daemon.notice hostapd: wlan0: STA 00:1d:fe::xx:xx:xxe IEEE 802.11: did not acknowledge authentication response
Mar 20 23:50:41 Internetgemeinschaft daemon.info hostapd: wlan0: STA 00:1d:fe::xx:xx:xx IEEE 802.11: disassociated due to inactivity
Mar 20 23:50:42 Internetgemeinschaft daemon.info hostapd: wlan0: STA 00:1d:fe::xx:xx:xx IEEE 802.11: deauthenticated due to inactivity
Mar 20 23:51:12 Internetgemeinschaft daemon.notice hostapd: wlan0: STA 00:1d:fe::xx:xx:xx IEEE 802.11: did not acknowledge authentication response
ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020
ath: Could not stop RX, we could be confusing the DMA engine when we start RX up
ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020

comment:34 follow-up: Changed 5 years ago by nbd

  • Resolution set to fixed
  • Status changed from assigned to closed

Please test with r26252 - the bogus inactivity timeouts should be gone now.

comment:35 in reply to: ↑ 34 ; follow-up: Changed 5 years ago by anonymous

Replying to nbd:

Please test with r26252 - the bogus inactivity timeouts should be gone now.

Latest snapshot is r2652. When you ask to test you want us to compile it ourself, or is it hosted elsewhere?

comment:36 in reply to: ↑ 35 Changed 5 years ago by anonymous

Replying to anonymous:

Replying to nbd:

Please test with r26252 - the bogus inactivity timeouts should be gone now.

Latest snapshot is r2652. When you ask to test you want us to compile it ourself, or is it hosted elsewhere?

My bad... r26232

comment:37 Changed 5 years ago by Anton <anton.bugs@…>

We just need to wait a couple more hours for it to get compiled. It should be at 8am on 21 Mar in some time zone ;-) I can't wait either.

comment:38 Changed 5 years ago by tuigje

  • Resolution fixed deleted
  • Status changed from closed to reopened

Hi,

some bad news, I'm afraid. With the following source:

Path: .
URL: svn://svn.openwrt.org/openwrt/branches/backfire
Repository Root: svn://svn.openwrt.org/openwrt
Repository UUID: 3c298f89-4303-0410-b956-a3cf2f4a3e73
Revision: 26267

I still get:

ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020
ath: Could not stop RX, we could be confusing the DMA engine when we start RX up
ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020

comment:39 Changed 5 years ago by nbd

  • Resolution set to worksforme
  • Status changed from reopened to closed

As mentioned in another ticket:

The message is a known issue and is not necessarily an indicator of a connection hang. Please only reopen the ticket if you have something that is relevant to this issue.

comment:40 Changed 5 years ago by jeyjay

I reopened #8446 for dying wireless with ath9k 11n.

comment:41 Changed 5 years ago by Anton <anton.bugs@…>

Thanks jeyjay. I have recompiled the trunk with the debug option and will try to provide more info there.
Meanwhile, I have opened a separate ticket #9122 for dying wifi on a wip300 sip phone.

comment:42 Changed 4 years ago by richard-laing@…

Hi, has this problem been fixed? I have OpenWrt Backfire 10.03.1-RC6 installed, but I am still getting wifi drops.

comment:43 Changed 7 months ago by anonymous

Hi,

I also have this problem. Wifi drops randomly, especially when we download a large file from the local network (using the maximum bandwidth of the wifi connection, I think).
