Opened 3 years ago

Closed 3 years ago

Last modified 2 years ago

#14020 closed defect (fixed)

WDR4900 Crashes randomly during network use on latest trunk

Reported by: stephan.oelze@…
Owned by: developers
Priority: high
Milestone: Chaos Calmer 15.05
Component: base system
Version: Trunk
Keywords:
Cc:

Description

I have been using a WDR4900 v1 for 4 months now. Bought in Europe.
The latest builds (since kernel 3.10) crash when network traffic on the switch gets heavy (about 700 Mbit/s).

Currently I am not able to provide any logs for this.
I checked out a clean new build tree and compiled it with debugging enabled. logread -f does not show any information when the network crashes. From that point on the device is no longer accessible and needs to be restarted.

So I bought a second WDR4900 v1 and tested that unit with this morning's checkout, with the same result. So the hardware isn't faulty here.

Can anyone confirm this issue?
I can't find any ticket about it at the moment.

Attachments (0)

Change History (25)

comment:1 Changed 3 years ago by florian@…

Yes: https://forum.openwrt.org/viewtopic.php?pid=209483
A lot of people have this problem (me too). If I generate traffic on the ethernet interface (for testing purposes with netio) the WDR4900 crashes after a few seconds.

comment:2 Changed 3 years ago by florian@…

One more thing:
I tested a few times with netio and one thing is interesting:
I can run UDP tests in a loop and nothing happens, no crash. As soon as I start netio with the TCP benchmark, it crashes after a few seconds. Maybe this is a hint...

The firewall is disabled in my setup.

comment:3 Changed 3 years ago by florian@…

I have done some tests:
With the older image and kernel (OpenWrt Barrier Breaker r37472 / LuCI Trunk (svn-r9881), kernel 3.8.13) everything works without any problems. NetIO with TCP and UDP cannot crash the router.
Back on the new image and kernel (r37708), the box crashed under the netio load.

comment:4 Changed 3 years ago by dalius@…

I can confirm that the same happens on mine.

comment:5 Changed 3 years ago by anonymous

3.10.9 (r37834) still freezes

comment:6 Changed 3 years ago by nlapplegate

Have the same problem with the BrainSlayer build 22118 (DD-WRT) on the WDR4900 v1

Ethernet (WAN) crashes on high load (torrents).

I decided to revert to the latest original firmware from TP-LINK.

comment:7 follow-up: Changed 3 years ago by Andrew

Facing the same issue here with recent snapshots: it crashes after a few days with no load, or quickly under heavy traffic.

Reverting to r37472 brings back rock solid stability.

comment:8 in reply to: ↑ 7 Changed 3 years ago by thomas <thomas@…>

Latest trunk BB v37937 does sustain high network load without crashing.

test setup via ethernet:
server --> wan@tplink _ routing _ lan@tplink --> laptop

iperf tcp gives about 75 MByte/s over 1 hour without any issue.
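For comparison with the Mbit/s figures quoted elsewhere in this ticket, the byte rate above can be converted as sketched below. The helper name mbyte_to_mbit is made up for illustration, and whether "MByte" here means 10^6 or 2^20 bytes is an assumption, so both are shown.

```shell
#!/bin/sh
# Convert a rate in MByte/s to Mbit/s (decimal megabits, 10^6 bits).
# $1 = rate in MByte/s
# $2 = bytes per MByte: 1000000 (default) or 1048576 for binary prefixes
mbyte_to_mbit() {
    awk -v rate="$1" -v unit="${2:-1000000}" \
        'BEGIN { printf "%.0f\n", rate * unit * 8 / 1000000 }'
}

mbyte_to_mbit 75            # prints 600 (decimal megabytes)
mbyte_to_mbit 75 1048576    # prints 629 (binary megabytes)
```

Either way, 75 MByte/s is roughly 600 Mbit/s, not 75 Mbit/s. The iperf log in comment 11 (27.8 GBytes in 600 s, reported as 398 Mbits/sec) is consistent with binary prefixes for bytes and decimal prefixes for bits.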

Greetings Thomas

comment:9 Changed 3 years ago by dalius@…

75 Mbps never caused any problems. Try running iperf on the router itself for 30 seconds; it will freeze.

comment:10 Changed 3 years ago by dalius@…

just tried the latest trunk, froze as before.

comment:11 Changed 3 years ago by thomas <thomas@…>

As you suggested, I just ran iperf on the router itself, and the router did not crash.

setup:
"iperf -s" on the tplink4900 ---ethernet---> MacBook "iperf client"

... no crash and about 50MByte/sec throughput over 10min

iperf client log:
iperf -c 10.10.20.2 -t 600


Client connecting to 10.10.20.2, TCP port 5001
TCP window size: 129 KByte (default)


[ 4] local 10.10.20.1 port 62025 connected with 10.10.20.2 port 5001
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-600.0 sec 27.8 GBytes 398 Mbits/sec

Openwrt trunk:

BARRIER BREAKER (Bleeding Edge, r37948)

Do you have a serial line connected to your router that could give a crash log?

Bye Thomas

comment:12 Changed 3 years ago by dalius@…

Unfortunately not.
From your log I see that I was getting the same speed (tl --ethernet--> iperf@windows), but it stopped after ~12 seconds. By the way, I was using the compiled trunk image from http://downloads.openwrt.org/snapshots/trunk/mpc85xx/openwrt-mpc85xx-generic-tl-wdr4900-v1-squashfs-sysupgrade.bin which has a timestamp of yesterday (10-Sep-2013 07:31), so it's probably not the latest.

comment:13 Changed 3 years ago by guzzard

It seems stable for me with latest version built from git. Perhaps I'm not able to put enough load on the router?

Firmware Version OpenWrt Barrier Breaker r37981 / LuCI Trunk (svn-r9902)
Kernel Version 3.10.10

Tested using iperf:
Computer (client) --> (lan) TL-WDR4900 --> (wan) TL-WDR4900 --> Netgear router --> Computer (server)

[ ID] Interval Transfer Bandwidth
[ 3] 0.0-3600.0 sec 232 GBytes 555 Mbits/sec

One hour test, transfer speed between 500-680Mbit/sec during test.

Did a second test, 1.5 hours. vnstat from the TL-WDR4900:

eth0 / traffic statistics

                     rx          |       tx
  bytes        371.10 GiB        |  367.93 GiB
  max          792.64 Mbit/s     |  785.89 Mbit/s
  average      576.70 Mbit/s     |  571.76 Mbit/s
  min          261.78 Mbit/s     |  259.50 Mbit/s
  packets      283421338         |  283420953
  max          71968 p/s         |  71969 p/s
  average      52504 p/s         |  52504 p/s
  min          24276 p/s         |  24276 p/s
  time         89.97 minutes

Kind Regards

Last edited 3 years ago by guzzard

comment:14 Changed 3 years ago by guzzard

Update: router crashed during no load after about 2 days. Will keep monitoring, and see if it happens again.

comment:15 follow-up: Changed 3 years ago by dalius@…

The best test is PC (iperf client) --> router (iperf server); with that setup it always crashes in under 20 seconds. With a pass-through test, e.g. PC --> router --> PC, it never crashes right away.

Just run iperf -s on the router and iperf -c router -t 30 -r on the PC (over ethernet). The only firmware that survives is r37472 (so far).
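The test above can be repeated in a loop with a reachability check after each run. This is only a sketch: the function name stress_router, the variables ROUTER, IPERF and PING, and the default address 192.168.1.1 are assumptions for illustration, and "iperf -s" must already be running on the router.

```shell
#!/bin/sh
# Sketch: repeat the router-as-server iperf test and check after each run
# whether the router still answers ping.
ROUTER="${ROUTER:-192.168.1.1}"   # assumed LAN address of the WDR4900
IPERF="${IPERF:-iperf}"           # iperf client binary on the PC
PING="${PING:-ping}"

stress_router() {
    runs="$1"
    i=1
    while [ "$i" -le "$runs" ]; do
        # Same client invocation as in the comment above.
        "$IPERF" -c "$ROUTER" -t 30 -r
        # If the router has frozen, it will no longer answer ping.
        if ! "$PING" -c 3 "$ROUTER" >/dev/null 2>&1; then
            echo "router unreachable after run $i"
            return 1
        fi
        i=$((i + 1))
    done
    echo "router survived $runs runs"
}
```

Calling "stress_router 10" on the PC would run up to ten 30-second tests. If the router freezes in the middle of a run, the iperf client itself may stall for a while before the ping check is reached.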

comment:16 in reply to: ↑ 15 Changed 3 years ago by swarley

Replying to dalius@…:

The best test to do is PC (iperf client) --> Router (iperf server)

I did a 6h passthrough test pc -> router -> pc and a 6h test with the router acting as the server using r38002. The router has survived both tests and one day of normal operation without crashing.

PC -> Router -> PC:


Client connecting to 192.168.1.115, TCP port 5001
TCP window size: 23.5 KByte (default)


[ 3] local 192.168.1.164 port 44121 connected with 192.168.1.115 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-3600.0 sec 395 GBytes 941 Mbits/sec
[ 3] 3600.0-7200.0 sec 395 GBytes 941 Mbits/sec
[ 3] 7200.0-10800.0 sec 395 GBytes 941 Mbits/sec
[ 3] 10800.0-14400.0 sec 395 GBytes 941 Mbits/sec
[ 3] 14400.0-18000.0 sec 395 GBytes 941 Mbits/sec
[ 3] 18000.0-21600.0 sec 395 GBytes 941 Mbits/sec
[ 3] 0.0-21600.0 sec (transfer total unreadable) 941 Mbits/sec

Router -> PC:


Client connecting to 192.168.1.1, TCP port 5001
TCP window size: 23.5 KByte (default)


[ 3] local 192.168.1.164 port 46201 connected with 192.168.1.1 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-3600.0 sec 256 GBytes 611 Mbits/sec
[ 3] 3600.0-7200.0 sec 256 GBytes 612 Mbits/sec
[ 3] 7200.0-10800.0 sec 256 GBytes 612 Mbits/sec
[ 3] 10800.0-14400.0 sec 256 GBytes 611 Mbits/sec
[ 3] 14400.0-18000.0 sec 256 GBytes 611 Mbits/sec
[ 3] 18000.0-21600.0 sec 256 GBytes 611 Mbits/sec
[ 3] 0.0-21600.0 sec (transfer total unreadable) 611 Mbits/sec

comment:17 Changed 3 years ago by dalius@…

The first test used only the switch, so it's not relevant.
The second one is odd; it always crashes on mine. Is it really a WDR4900?
Maybe there is something common in our configurations that makes it crash.

comment:18 Changed 3 years ago by swarley

I'm using a TP-Link TL-WDR4900 v1. It is stable with an unaltered out-of-the-box configuration and it is stable with a typical setup (LuCI, Samba, multiple SSIDs on 5 and 2.4 GHz, VLANs, several firewall rules). However, I do not use the IPv6 functionality, but I guess most people don't.

Model TP-Link TL-WDR4900 v1
Firmware Version OpenWrt Barrier Breaker r38002 / LuCI Trunk (svn-r9902)
Kernel Version 3.10.12

comment:19 Changed 3 years ago by dalius@…

Can you provide a link to a firmware you've used?

comment:20 Changed 3 years ago by swarley

Yes, but i suggest we continue discussing in the forum: https://forum.openwrt.org/viewtopic.php?pid=212800#p212800

comment:21 Changed 3 years ago by nbd

  • Resolution set to fixed
  • Status changed from new to closed

fixed in r38409

comment:22 Changed 3 years ago by anonymous

  • Resolution fixed deleted
  • Status changed from closed to reopened

as per the comment on the forum.

The gianfar maintainer, Claudiu, just sent me a patch that should fix the napi-poll bug in kernel versions > v3.9:
https://lists.ozlabs.org/pipermail/linuxppc-dev/2013-October/112496.html
I adapted his patch so that it applies to current OpenWrt trunk (with kernel v3.10.15).
If you are willing to help test his changes, here we go:
  • fetch my patch 900-fix_gianfar_napi_poll.patch
  • copy it into your build environment: ./target/linux/mpc85xx/patches-3.10/
  • remove the current workaround patch "200-gianfar_napi_poll_revert.patch" from the same directory, to avoid downgrading to 3.9 (and conflicts between the two patches)
  • do a "make clean" and a fresh "make"
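The file operations in these steps could be scripted roughly as follows. This is a sketch, not a tested procedure: the function name swap_gianfar_patch and its two arguments (the top of the build tree and the path of the downloaded patch) are assumptions; only the patch file names and the patches-3.10 directory come from the steps above.

```shell
#!/bin/sh
# Sketch: install the new gianfar patch and remove the earlier workaround.
# $1 = top of the OpenWrt build tree (assumed location)
# $2 = path to the downloaded 900-fix_gianfar_napi_poll.patch
swap_gianfar_patch() {
    buildroot="$1"
    new_patch="$2"
    patch_dir="$buildroot/target/linux/mpc85xx/patches-3.10"

    [ -d "$patch_dir" ] || { echo "missing $patch_dir" >&2; return 1; }
    cp "$new_patch" "$patch_dir/" || return 1
    # Drop the workaround so the two patches do not conflict.
    rm -f "$patch_dir/200-gianfar_napi_poll_revert.patch"
}

# Example use (paths are placeholders), followed by the rebuild:
#   swap_gianfar_patch "$HOME/openwrt" "$HOME/900-fix_gianfar_napi_poll.patch"
#   cd "$HOME/openwrt" && make clean && make
```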

seems to be stable.

comment:23 Changed 3 years ago by nbd

  • Resolution set to fixed
  • Status changed from reopened to closed

I agree with replacing my patch with that one, but why did you reopen the ticket? My patch did fix the issue as well.

comment:24 Changed 2 years ago by bittorf@…

This fix does not work for me. I have 15 of these devices working in a mesh. Only those with wireless-connections are crashing from time to time (sadly in a way that does not recover by itself, although 'oops_on_panic' and 'panic=10' are active).

comment:25 Changed 2 years ago by bittorf@…

typo in last message, must be: "only those with wired connection are crashing from time to time".

additional info: i will install a serial console on one of these and report.
