Opened 6 years ago

Last modified 3 years ago

#7790 reopened defect

backfire r22594 ath9k instability

Reported by: Andrew Lutomirski <luto@…>
Owned by: developers
Priority: normal
Milestone: Backfire 10.03.1
Component: packages
Version: Backfire 10.03.1 RC1
Keywords:
Cc:

Description

I have a Mikrotik R52n (IIRC), which is an ath9k mini-PCI device:

phy0: Atheros AR9280 Rev:2 mem=0xb0000000, irq=48

Backfire r22594 is much better than the original Backfire for me and seems comparable to the old pre-Backfire trunk build I used to use, but it still occasionally becomes unusable and needs a reboot before the network starts working again. This seems to be related to errors like these:

ath: Timeout while waiting for nf to load: AR_PHY_AGC_CONTROL=0x40d22
ath: Timeout while waiting for nf to load: AR_PHY_AGC_CONTROL=0x40d22
ath: Failed to stop TX DMA in 100 msec after killing last frame
ath: Timeout while waiting for nf to load: AR_PHY_AGC_CONTROL=0x40d20
ath: Timeout while waiting for nf to load: AR_PHY_AGC_CONTROL=0x40d22
ath: Timeout while waiting for nf to load: AR_PHY_AGC_CONTROL=0x40d22
ath: Timeout while waiting for nf to load: AR_PHY_AGC_CONTROL=0x40d22
ath: Timeout while waiting for nf to load: AR_PHY_AGC_CONTROL=0x40d22
ath: Timeout while waiting for nf to load: AR_PHY_AGC_CONTROL=0x40d22
ath: Timeout while waiting for nf to load: AR_PHY_AGC_CONTROL=0x40d22
ath: Timeout while waiting for nf to load: AR_PHY_AGC_CONTROL=0x40d22
ath: Timeout while waiting for nf to load: AR_PHY_AGC_CONTROL=0x40d22
ath: Timeout while waiting for nf to load: AR_PHY_AGC_CONTROL=0x40d22
ath: Failed to stop TX DMA in 100 msec after killing last frame
ath: Timeout while waiting for nf to load: AR_PHY_AGC_CONTROL=0x40d22
ath: Timeout while waiting for nf to load: AR_PHY_AGC_CONTROL=0x40d22
ath: Timeout while waiting for nf to load: AR_PHY_AGC_CONTROL=0x40d22
ath: Timeout while waiting for nf to load: AR_PHY_AGC_CONTROL=0x40d22
ath: Timeout while waiting for nf to load: AR_PHY_AGC_CONTROL=0x40d22

These started only in recent builds. Any ideas?

Change History (11)

comment:1 Changed 6 years ago by anonymous

I can confirm the instability on a Netgear WNDR3700. With 10.03.1 RC1 a reboot is required at least once per day, as wifi connections drop and the clients can't reconnect: new connections drop after WPA2 auth, during the DHCP transaction. However, I don't get any timeouts in the logs.

comment:2 Changed 6 years ago by nbd

Please try the latest version

comment:3 Changed 6 years ago by Stijn Tintel <stijn@…>

I just noticed this in dmesg once on my RSPro with Mikrotik R52n running in AP mode @ ch1/ht20:

ath: Timeout while waiting for nf to load: AR_PHY_AGC_CONTROL=0x40d22

Not sure if it caused any issues, though: there are currently still 6 clients connected, and they can still access the Internet. I also can't find it in logread, so I don't have any logs of what else happened at that time. I'll increase my log buffer size and see if it happens again.

Which value should I set in /sys/kernel/debug/ath9k/phy0/debug, to get more information?

Running Backfire branch, r23452.

comment:4 Changed 6 years ago by Stijn Tintel <stijn@…>

FYI, I've just seen this on my other RSPro too, running trunk r23540. Unfortunately I can't tell whether this was on the Mikrotik R52n or the Ubiquiti SR71-A, since the ath messages don't mention wlan0 or wlan1.

Oct 20 01:28:30 wrt0 user.debug kernel: ath: Timeout while waiting for nf to load: AR_PHY_AGC_CONTROL=0x40d22
Oct 20 01:28:31 wrt0 user.debug kernel: ath: Failed to stop TX DMA in 100 msec after killing last frame
Oct 20 01:28:31 wrt0 user.debug kernel: ath: Failed to stop TX DMA in 100 msec after killing last frame
Oct 20 01:28:31 wrt0 user.debug kernel: ath: Failed to stop TX DMA in 100 msec after killing last frame
Oct 20 01:28:31 wrt0 user.debug kernel: ath: Failed to stop TX DMA. Resetting hardware!
Oct 20 01:28:31 wrt0 daemon.info hostapd: wlan1: STA 00:27:10:b3:72:a8 IEEE 802.11: authenticated
Oct 20 01:28:31 wrt0 daemon.info hostapd: wlan1: STA 00:27:10:b3:72:a8 IEEE 802.11: associated (aid 1)
Oct 20 01:28:31 wrt0 daemon.info hostapd: wlan0: STA 00:27:10:b3:72:a8 WPA: received EAPOL-Key 2/4 Pairwise with unexpected replay counter
Oct 20 01:28:31 wrt0 daemon.info hostapd: wlan0: STA 00:27:10:b3:72:a8 WPA: received EAPOL-Key 4/4 Pairwise with unexpected replay counter
Oct 20 01:28:31 wrt0 daemon.info hostapd: wlan1: STA 00:27:10:b3:72:a8 RADIUS: starting accounting session 4CBE2112-00000001
Oct 20 01:28:31 wrt0 daemon.info hostapd: wlan1: STA 00:27:10:b3:72:a8 WPA: pairwise key handshake completed (RSN)

comment:5 Changed 6 years ago by Nikolay

I'm using a TP-Link 1043ND and I have similar problems: limited connectivity with Windows clients, drops, and many of these in my log:

user.debug kernel: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020

user.debug kernel: ath: Timeout while waiting for nf to load: AR_PHY_AGC_CONTROL=0x40d22

I've been using Backfire 10.03.1-rc3 (r23985, r24007, r24045) and 10.03.1-rc4 (r24132) so far, and the problems still persist.

comment:6 Changed 6 years ago by nbd

  • Resolution set to duplicate
  • Status changed from new to closed

Stability issues are tracked in #8343 and #8830; please try the latest version.

comment:7 Changed 3 years ago by malaakso@…

  • Resolution duplicate deleted
  • Status changed from closed to reopened

This error message has returned in the recent revisions. I'm using r39770 on a

[   24.560000] ieee80211 phy1: Atheros AR9550 Rev:0 mem=0xb8100000, irq=47:

and I'm getting these in dmesg:

[38444.830000] ath: phy1: Timeout while waiting for nf to load: AR_PHY_AGC_CONTROL=0xc055a
[88852.670000] ath: phy1: Timeout while waiting for nf to load: AR_PHY_AGC_CONTROL=0xc055a
[105059.580000] ath: phy1: Timeout while waiting for nf to load: AR_PHY_AGC_CONTROL=0xc055a

I haven't noticed any instability, though, so could they possibly be suppressed?

comment:8 Changed 3 years ago by nbd

What do you mean by 'returned in the recent revisions' - is there a version that's known to not produce these messages on the same device?

comment:9 Changed 3 years ago by anonymous

Sorry for being ambiguous. What I meant was that I personally have never seen these messages in earlier builds, meaning r39554 and lower.

comment:10 Changed 3 years ago by nbd

I can't find any change between those revisions that would cause this. How reproducible is this? Please test the older version again to see if there is a difference in behavior or if it was just random.

comment:11 Changed 3 years ago by malaakso@…

I forgot that while upgrading I also changed the channel from 13 to 1. After testing it seems that I get the errors only on channel 1, but also with the older revision. Sorry for the noise.
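
For anyone comparing channels or revisions the same way, the timeout messages can be tallied from saved dmesg or logread output instead of being counted by eye. The following is only a minimal sketch in Python, assuming the log text has been copied to a machine with Python 3. The function name `count_nf_timeouts` and the sample text are illustrative and not part of OpenWrt or ath9k; the regular expression is based solely on the message formats quoted in this ticket.

```python
import re
from collections import Counter

# Matches the noise-floor timeout message as quoted in this ticket,
# with or without the "phyN:" tag that newer builds print.
NF_TIMEOUT = re.compile(
    r"ath: (?:(?P<phy>phy\d+): )?Timeout while waiting for nf to load: "
    r"AR_PHY_AGC_CONTROL=(?P<reg>0x[0-9a-fA-F]+)"
)


def count_nf_timeouts(log_text):
    """Count nf-load timeouts per (phy, register value) pair."""
    counts = Counter()
    for line in log_text.splitlines():
        match = NF_TIMEOUT.search(line)
        if match:
            # Older builds do not name the phy, so group those as "unknown".
            phy = match.group("phy") or "unknown"
            counts[(phy, match.group("reg"))] += 1
    return counts


# Sample lines taken from the logs quoted in this ticket.
SAMPLE = """\
[38444.830000] ath: phy1: Timeout while waiting for nf to load: AR_PHY_AGC_CONTROL=0xc055a
[88852.670000] ath: phy1: Timeout while waiting for nf to load: AR_PHY_AGC_CONTROL=0xc055a
ath: Timeout while waiting for nf to load: AR_PHY_AGC_CONTROL=0x40d22
ath: Failed to stop TX DMA in 100 msec after killing last frame
"""

if __name__ == "__main__":
    for (phy, reg), n in sorted(count_nf_timeouts(SAMPLE).items()):
        print(f"{phy} {reg}: {n}")
```

Running it once per channel or revision on a log captured over a similar uptime gives counts that can be compared directly.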
