Opened 7 years ago

Closed 7 years ago

#7637 closed defect (fixed)

rtl8366s forwarding all packets to all ports (broadcasting)

Reported by: john.southworth@… Owned by: juhosg
Priority: normal Milestone: Backfire 10.03.1
Component: packages Version: Trunk
Keywords: Cc:

Description

I'm currently using a D-Link DIR-825 with the latest trunk. The switch is forwarding every packet to every port on the VLAN I have established; it's acting like a hub. When I run a packet capture on a machine connected to one port, I receive all the packets from every other port (HTTP requests, telnet, ssh, etc.). This is a home setup, so it wouldn't particularly matter, but this problem is causing the throughput to peak around 1 Mbit/s.

I'm not sure if it's related, but I had to assign every VLAN to 2 ports; otherwise, according to swconfig, they were untagged on port 5 and attached to another port on the switch as well. I.e. VLAN 2 was ports 2 and 5; VLAN 3 was ports 3 and 5; VLAN 4 was ports 4 and 5.

Attachments (7)

enable_learning.patch (2.0 KB) - added by Memphis 7 years ago.
Enabling learning, and thereby bringing back switching
enable_learning_8366x.patch (3.9 KB) - added by Memphis 7 years ago.
Apply this patch in target/linux/generic/files/drivers/net/phy and activate at runtime with "swconfig dev rtl8366rb enable_learning 1" - this patches 8366rb and 8366s driver
rtl8366xx_enable_learning.patch (5.0 KB) - added by Memphis 7 years ago.
the bugfix
rtl8366xx_enable_learning.2.patch (5.1 KB) - added by Memphis 7 years ago.
there was a return missing :o(
rtl8366xx_enable_learning.3.patch (5.1 KB) - added by Memphis 7 years ago.
I'm too dumb for patches. This time it applies …
rtl8366xx_enable_learning.4.patch (5.3 KB) - added by Memphis 7 years ago.
There were more than 2 places where a return statement had to be added. I only missed 2 - isn't that great? (Now all the needed returns are in there *hrhr*)
openwrt.log (13.2 KB) - added by anonymous 7 years ago.
Log of network debug

Change History (37)

comment:1 Changed 7 years ago by Memphis

I confirm this bug on TP-Link wr1043nd (rtl8366rb) with r22096 and 2.6.34.

Here is my thread on the situation - one can even see the "hubbing" when looking at the port LEDs ...

https://forum.openwrt.org/viewtopic.php?pid=113400#p113400

comment:2 Changed 7 years ago by KillaB

I'm not seeing this behavior on my WR1043ND running r21479.

comment:3 Changed 7 years ago by anonymous

There were some bigger changes in the rtl8366xx.c driver by "juhosg" in the past weeks. Maybe this bug was introduced there.

Since this bug wasn't there in r21479, it must be a change between r21906 and r22044. There were multiple commits between 26.6. and 2.7.2010. One of these must have introduced this behaviour...

juhosg, are you around to clarify this issue?

comment:4 Changed 7 years ago by Memphis

Ah, sorry - the last post was from me

Memphis

comment:5 Changed 7 years ago by anonymous

The same router, the same problem, and the webinterface shows me:

http://666kb.com/i/bl38pq2xauqjzutpa.gif

comment:6 Changed 7 years ago by jow

  • Owner changed from developers to juhosg
  • Status changed from new to assigned

The last issue (screenshot) is unrelated

comment:7 Changed 7 years ago by Jonathan Bennett <jbscience87@…>

I've observed this on a TL-WR841N.

An added note, when the lan side of the router is plugged into a switch, the switch ceases to function correctly. I believe that the router is repeating the packets received from the switch back to the switch. The fact that this includes broadcast packets means that the switch's MAC address tables are being corrupted and all the traffic is being sent to the port where the router is.

The effect is that plugging the router into a network on the lan side is crashing the entire network.
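
The hypothesis in this comment can be sketched as a toy model of the upstream switch's MAC table. This is only an illustration of the commenter's belief, not a trace from real hardware; the host names and port numbers are made up.

```python
# Toy model of the upstream switch's MAC table: MAC address -> port.
# Host names and port numbers are hypothetical.
mac_table = {}

ROUTER_PORT = 3


def learn(src_mac, in_port):
    """A learning switch records the port on which it last saw each source address."""
    mac_table[src_mac] = in_port


# Frames from the hosts arrive on their real ports first.
learn("host-a", 1)
learn("host-b", 2)

# Assumption from the comment: the router repeats those frames back to the
# switch, so the same source addresses now show up on the router's port.
learn("host-a", ROUTER_PORT)
learn("host-b", ROUTER_PORT)

print(mac_table)
```

In this model every learned address ends up pointing at the router's port, which matches the described symptom of all traffic being sent towards the router.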

comment:8 Changed 7 years ago by Jonathan Bennett <jbscience87@…>

After compiling several old revisions, I've narrowed it down a bit more. It was working in 21905, but broken in 21957. I don't know more specifically than that. Hope it helps.
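
Narrowing a regression between a known-good and a known-bad revision can be done with a binary search. The following is a minimal sketch; `is_broken` stands in for "build that revision, flash it, and check for flooding", and the cut-off used in `fake_test` is invented for the demo because the ticket does not name the offending commit.

```python
def first_bad_revision(good, bad, is_broken):
    """Binary-search the first broken revision in the range (good, bad].

    Assumes `good` works, `bad` is broken, and the bug stays once introduced.
    """
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_broken(mid):
            bad = mid    # the bug is already present at mid
        else:
            good = mid   # mid still works, so look later
    return bad


def fake_test(rev):
    """Hypothetical stand-in for a real build-and-test cycle (made-up cut-off)."""
    return rev >= 21930


print(first_bad_revision(21905, 21957, fake_test))
```

With the range from this comment, the search needs about six build-and-test cycles instead of one per revision.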

comment:9 Changed 7 years ago by anonymous

Seems like juhosg is on vacation or something like that - so we can only wait ...

comment:10 Changed 7 years ago by Memphis

Looking at the datasheet, chapter 8.4 "search and learning":

http://realtek.info/pdf/rtl8366_8369_datasheet_1-1.pdf

It says that packets are broadcast if the destination MAC is not found in the switch's lookup table. Inside the function "rtl8366rb_hw_init", learning, aging, and DA dropping are disabled (and aren't enabled elsewhere in the code), which IMHO prevents any entries from being added to the lookup table (maybe intended for implementing custom management of MAC address lookup with external CPUs?).

I'm new to switching technology and this is only a guess. But it doesn't look like the driver adds any entries to the MAC lookup table of the switch (only VLAN table entries). So could this be the issue?
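
The guess above can be illustrated with a toy model of a layer-2 switch. It is a sketch of generic learning-switch behaviour, not of the rtl8366 hardware or driver; the class, the port numbers and the two-letter MAC addresses are made up for the example.

```python
class Switch:
    """Toy layer-2 switch with an optional MAC learning step."""

    def __init__(self, ports, learning=True):
        self.ports = list(ports)
        self.learning = learning
        self.mac_table = {}  # MAC address -> port it was last seen on

    def forward(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out of."""
        if self.learning:
            # Learning: remember which port the source address lives on.
            self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Lookup miss: flood to every port except the ingress port.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            return []  # destination is on the ingress segment
        return [out_port]


def demo(learning):
    sw = Switch(ports=[1, 2, 3, 4], learning=learning)
    sw.forward(1, "aa", "bb")         # A -> B, B not known yet
    sw.forward(2, "bb", "aa")         # B answers A
    return sw.forward(1, "aa", "bb")  # A -> B again


print("learning on: ", demo(True))
print("learning off:", demo(False))
```

With learning enabled, the third frame leaves only on B's port. With learning disabled, the table stays empty and every frame is flooded, which is the hub-like behaviour reported in this ticket.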

comment:11 Changed 7 years ago by Memphis

Yesterday I proved my theory. I've attached a little patch to rtl8366rb.c (it should be applicable to the other rtl8366xx drivers as well), which adds an attribute to swconfig for enabling learning mode. With learning enabled, the port separation (switching) works again. I don't know if this has any influence on VLANs, bridging or other features. So NOW ( :o) ), I'm really waiting for juhosg's return *hehe*

Changed 7 years ago by Memphis

Enabling learning, and thereby bringing back switching

Changed 7 years ago by Memphis

Apply this patch in target/linux/generic/files/drivers/net/phy and activate at runtime with "swconfig dev rtl8366rb enable_learning 1" - this patches 8366rb and 8366s driver

comment:12 Changed 7 years ago by Memphis

The second patch is the same as the first, but applies the modification to both the 8366rb and 8366s drivers.

comment:13 Changed 7 years ago by Memphis

typo ... it has to be:

swconfig dev rtl8366rb set enable_learning 1

respectively

swconfig dev rtl8366s set enable_learning 1

comment:14 Changed 7 years ago by Memphis

Please include the following patch in trunk. In hw_init it enables learning and aging for all ports. Additionally, it adds a swconfig attribute for manually disabling/re-enabling learning and aging.

This patch should fix this ticket. Patch is for rtl8366rb and rtl8366s and is diffed against r22458.

Changed 7 years ago by Memphis

the bugfix

Changed 7 years ago by Memphis

there was a return missing :o(

comment:15 Changed 7 years ago by anonymous

FYI, the latest patch restores lan-to-lan performance on netgear wndr3700 running backfire r22391. iperf gave me ~94 Mbits/sec before the patch and 832 Mbits/sec after.

Changed 7 years ago by Memphis

I'm too dumb for patches. This time it applies ...

Changed 7 years ago by Memphis

There were more than 2 places where a return statement had to be added. I only missed 2 - isn't that great? (Now all the needed returns are in there *hrhr*)

comment:16 Changed 7 years ago by anonymous

I have this issue where my wireless clients on my DIR-825 will occasionally lose WAN access. The router page and LAN still works.

nbd thinks this switch issue might be the root cause. I have tried applying rtl8366xx_enable_learning.3.patch, but it did not solve the issue.

comment:17 Changed 7 years ago by Memphis

Run "swconfig dev rtl8366s get enable_learning" on your DIR-825. It has to return "1"; otherwise the patch wasn't applied correctly or didn't make it into your build.

comment:18 Changed 7 years ago by maveric@…

Is there a timeline for when this patch will be included in a snapshot or trunk?

comment:19 Changed 7 years ago by Memphis

nbd wants to commit it. I don't know when he will find the time, but he is well informed about this ticket :o)

Changed 7 years ago by anonymous

Log of network debug

comment:20 follow-up: Changed 7 years ago by anonymous

hi

this is broken somewhere see the attached
openwrt.log

Ghat

comment:21 in reply to: ↑ 20 ; follow-up: Changed 7 years ago by anonymous

I don't experience this problem on my DIR-825.
I've patched against r22462.

I'll try r22511 now ...

comment:22 in reply to: ↑ 21 Changed 7 years ago by anonymous

Hm, no problem at r22511 either.

comment:23 Changed 7 years ago by Memphis

Are the last 3 posts from one and the same person? (That would indicate there is no problem.)

If not (and Person 1 is not the same as Persons 2 and 3 *hrhr*), then I don't see the problem ... the log doesn't help here. I need the /etc/config/network file ... and it looks like 172.16.0.1 is just down, or firewalled ... without further information about your config, I can't help.

comment:24 Changed 7 years ago by anonymous

comment:25 Changed 7 years ago by anonymous

I am person1 (ghat)

I recompiled with the patch on trunk and I am able to ping the WAN gateway. If I apply the patch to the backfire branch then I cannot ping the WAN gateway (even though dhcp works and dns is set properly etc)

I still have performance issues... and cannot associate my notebook intel 5100agn card to the dlink (even though the AE1000 works on the desktop)

G

comment:26 Changed 7 years ago by Jonathan Bennett <jbscience87@…>

I applied the patch, and still no joy on the TP-link wr841n. However, it appears that the 841 doesn't actually use the rtl8366, but a "ar7240 integrated 100M" switch chipset. (https://forum.openwrt.org/viewtopic.php?pid=104080#p104080)

Is this correct? How can I determine the switch chipset? Should I open a new ticket for this problem?

Thanks for the help,
Jonathan

comment:27 Changed 7 years ago by anonymous

> cannot associate my notebook intel 5100agn card to the dlink

I figured that the above problem is not related to this ticket, it is because of
https://dev.openwrt.org/ticket/7590

I still have performance issues, though, even after applying this patch...
My LAN speed is limited to 150 Mbps for unrelated reasons; however,
LAN->Wireless is only in the 1 Mbps range...

comment:28 Changed 7 years ago by Memphis <memphis@…>

I just wanted to say that this patch only gets the switching working again. It does not claim to be a fix for any lan<->wan performance issues ...

comment:29 Changed 7 years ago by vx

I think disabling learning and auto aging is a very bad idea; rtl8366xx_enable_learning.4.patch looks good.

comment:30 Changed 7 years ago by nbd

  • Resolution set to fixed
  • Status changed from assigned to closed

fix added in r22545, r22546
