Opened 2 years ago

Closed 23 months ago

#15448 closed defect (no_response)

Multicast IPv6 ICMP packets are dropped by built-in AR934X switch, breaking neighbour discovery

Reported by: oliver.jowett@…
Owned by: developers
Priority: normal
Milestone: Barrier Breaker 14.07
Component: base system
Version: Attitude Adjustment 12.09
Keywords:
Cc:

Description

I have a TP-Link TL-WR841N v8 running stock AA 12.09. This has:

eth0: WAN interface
eth1: CPU port of the built-in AR934X switch connecting to 4 external wired LAN ports
wlan0: wireless LAN

I have configured IPv6 connectivity via a 6in4 tunnel over the WAN interface. The router runs radvd to assign addresses to LAN clients.

This works well over the wireless interface - clients autoconfigure the right IPv6 addresses and connectivity is OK.

On the wired interface I see intermittent connectivity problems, which appear to boil down to multicast packets from the wired clients towards the OpenWrt router getting lost somewhere within the TP-Link's internal switch. Specifically:

a) If a client happens to have the router IP in its neighbour cache, traffic goes out as unicast ethernet frames and connectivity is OK;

b) If a client has a MAC address cached and wants to reverify it, NDP uses unicast frames to the cached address and it works as expected;

c) If a client doesn't have a MAC address in cache, it multicasts to 33:33:ff:00:00:01. This fails:

20:31:48.758978 00:90:f5:cc:2f:6a > 33:33:ff:00:00:01, ethertype IPv6 (0x86dd), length 86: 2001:470:1f09:61b:d9a:127b:80a0:1a18 > ff02::1:ff00:1: ICMP6, neighbor solicitation, who has 2001:470:1f09:61b::1, length 32

These packets are not seen on the openwrt box, but are seen by other clients on the same (external to the router) ethernet segment.

d) If a new client connects and multicasts a router solicitation (to 33:33:00:00:00:02), this message is never seen by the openwrt router and therefore radvd does not respond with a router advertisement. Usually to get a new client to obtain an address, I need to restart radvd to provoke an unsolicited advertisement.

If a client can get past (c) and (d) and maintain a cached MAC address for the router, then it has IPv6 connectivity to the rest of the world OK. As soon as the cache goes away, it loses connectivity.
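
For reference, the multicast destinations in (c) and (d) follow from the standard IPv6 mappings: the solicited-node group is ff02::1:ff00:0 plus the low 24 bits of the target address, and the Ethernet destination is 33:33 followed by the low 32 bits of the group address. A minimal sketch of that derivation, using only Python's standard `ipaddress` module (the function names are mine, for illustration only):

```python
import ipaddress


def solicited_node_multicast(target: str) -> ipaddress.IPv6Address:
    """Return the solicited-node multicast group for a unicast IPv6 target."""
    low24 = int(ipaddress.IPv6Address(target)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)


def multicast_mac(group: str) -> str:
    """Map an IPv6 multicast address to its Ethernet MAC (33:33 + low 32 bits)."""
    low32 = int(ipaddress.IPv6Address(str(group))) & 0xFFFFFFFF
    octets = [0x33, 0x33] + list(low32.to_bytes(4, "big"))
    return ":".join(f"{o:02x}" for o in octets)


# Neighbour solicitation for the router address in case (c)
group = solicited_node_multicast("2001:470:1f09:61b::1")
print(group, multicast_mac(str(group)))

# Router solicitation to the all-routers group in case (d)
print(multicast_mac("ff02::2"))
```

For the router address 2001:470:1f09:61b::1 this gives ff02::1:ff00:1 and 33:33:ff:00:00:01, and for ff02::2 it gives 33:33:00:00:00:02, matching the captured frames.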

Successful connectivity via the wireless looks like this:

21:18:15.877994 b8:03:05:9f:37:ab > 33:33:ff:9f:37:ab, ethertype IPv6 (0x86dd), length 78: :: > ff02::1:ff9f:37ab: ICMP6, neighbor solicitation, who has fe80::ba03:5ff:fe9f:37ab, length 24
21:18:15.878133 b8:03:05:9f:37:ab > 33:33:ff:9f:37:ab, ethertype IPv6 (0x86dd), length 78: :: > ff02::1:ff9f:37ab: ICMP6, neighbor solicitation, who has fe80::ba03:5ff:fe9f:37ab, length 24
21:18:16.878303 b8:03:05:9f:37:ab > 33:33:00:00:00:02, ethertype IPv6 (0x86dd), length 70: fe80::ba03:5ff:fe9f:37ab > ff02::2: ICMP6, router solicitation, length 16
21:18:16.878486 b8:03:05:9f:37:ab > 33:33:00:00:00:02, ethertype IPv6 (0x86dd), length 70: fe80::ba03:5ff:fe9f:37ab > ff02::2: ICMP6, router solicitation, length 16
21:18:16.887101 c0:4a:00:cc:db:88 > 33:33:00:00:00:01, ethertype IPv6 (0x86dd), length 110: fe80::c24a:ff:fecc:db88 > ff02::1: ICMP6, router advertisement, length 56
21:18:17.245889 b8:03:05:9f:37:ab > 33:33:ff:7d:e2:de, ethertype IPv6 (0x86dd), length 78: :: > ff02::1:ff7d:e2de: ICMP6, neighbor solicitation, who has 2001:470:6b39:1000:141e:bc52:207d:e2de, length 24
21:18:17.246021 b8:03:05:9f:37:ab > 33:33:ff:7d:e2:de, ethertype IPv6 (0x86dd), length 78: :: > ff02::1:ff7d:e2de: ICMP6, neighbor solicitation, who has 2001:470:6b39:1000:141e:bc52:207d:e2de, length 24
21:18:17.401939 b8:03:05:9f:37:ab > 33:33:ff:9f:37:ab, ethertype IPv6 (0x86dd), length 78: :: > ff02::1:ff9f:37ab: ICMP6, neighbor solicitation, who has 2001:470:6b39:1000:ba03:5ff:fe9f:37ab, length 24
21:18:17.402016 b8:03:05:9f:37:ab > 33:33:ff:9f:37:ab, ethertype IPv6 (0x86dd), length 78: :: > ff02::1:ff9f:37ab: ICMP6, neighbor solicitation, who has 2001:470:6b39:1000:ba03:5ff:fe9f:37ab, length 24
21:18:49.283430 b8:03:05:9f:37:ab > 33:33:ff:00:00:01, ethertype IPv6 (0x86dd), length 86: 2001:470:6b39:1000:141e:bc52:207d:e2de > ff02::1:ff00:1: ICMP6, neighbor solicitation, who has 2001:470:6b39:1000::1, length 32
21:18:49.283725 c0:4a:00:cc:db:88 > b8:03:05:9f:37:ab, ethertype IPv6 (0x86dd), length 86: 2001:470:6b39:1000::1 > 2001:470:6b39:1000:141e:bc52:207d:e2de: ICMP6, neighbor advertisement, tgt is 2001:470:6b39:1000::1, length 32
21:18:49.283813 b8:03:05:9f:37:ab > 33:33:ff:00:00:01, ethertype IPv6 (0x86dd), length 86: 2001:470:6b39:1000:141e:bc52:207d:e2de > ff02::1:ff00:1: ICMP6, neighbor solicitation, who has 2001:470:6b39:1000::1, length 32
21:18:49.286185 b8:03:05:9f:37:ab > c0:4a:00:cc:db:88, ethertype IPv6 (0x86dd), length 118: 2001:470:6b39:1000:141e:bc52:207d:e2de > 2001:470:6b39:1000::1: ICMP6, echo request, seq 1, length 64
21:18:49.286433 c0:4a:00:cc:db:88 > b8:03:05:9f:37:ab, ethertype IPv6 (0x86dd), length 118: 2001:470:6b39:1000::1 > 2001:470:6b39:1000:141e:bc52:207d:e2de: ICMP6, echo reply, seq 1, length 64
21:18:50.282071 b8:03:05:9f:37:ab > c0:4a:00:cc:db:88, ethertype IPv6 (0x86dd), length 118: 2001:470:6b39:1000:141e:bc52:207d:e2de > 2001:470:6b39:1000::1: ICMP6, echo request, seq 2, length 64
21:18:50.282236 c0:4a:00:cc:db:88 > b8:03:05:9f:37:ab, ethertype IPv6 (0x86dd), length 118: 2001:470:6b39:1000::1 > 2001:470:6b39:1000:141e:bc52:207d:e2de: ICMP6, echo reply, seq 2, length 64
21:18:51.284610 b8:03:05:9f:37:ab > c0:4a:00:cc:db:88, ethertype IPv6 (0x86dd), length 118: 2001:470:6b39:1000:141e:bc52:207d:e2de > 2001:470:6b39:1000::1: ICMP6, echo request, seq 3, length 64
21:18:51.284769 c0:4a:00:cc:db:88 > b8:03:05:9f:37:ab, ethertype IPv6 (0x86dd), length 118: 2001:470:6b39:1000::1 > 2001:470:6b39:1000:141e:bc52:207d:e2de: ICMP6, echo reply, seq 3, length 64

The wired and wireless LANs are on separate subnets and are not bridged. I also see the same behaviour with firewalling disabled.

I also see the same behaviour from rdisc6/ndisc6 - they work fine via wireless, but time out via the wired connection.

My conclusion is that the built-in switch is dropping these multicast packets somewhere between the outside world and the router-side interface.

Attachments (5)

etc-config-network (1.0 KB) - added by oliver.jowett@… 2 years ago.
/etc/config/network
etc-config-radvd (566 bytes) - added by oliver.jowett@… 2 years ago.
/etc/config/radvd
etc-config-wireless (549 bytes) - added by oliver.jowett@… 2 years ago.
/etc/config/wireless
ifconfig-output (2.8 KB) - added by oliver.jowett@… 2 years ago.
ifconfig output
0001-AR934X-Enable-unicast-multicast-flooding-to-the-CPU-.patch (1.6 KB) - added by oliver.jowett@… 2 years ago.
patch against AA enabling multicast/unicast flooding


Change History (14)

Changed 2 years ago by oliver.jowett@…

/etc/config/network

Changed 2 years ago by oliver.jowett@…

/etc/config/radvd

Changed 2 years ago by oliver.jowett@…

/etc/config/wireless

Changed 2 years ago by oliver.jowett@…

ifconfig output

comment:1 Changed 2 years ago by anonymous

Also probably worth noting that IPv4 broadcast frames work fine on both the wired and wireless interfaces - clients can get addresses via DHCP and do ARP as normal without problems. Not sure about IPv4 multicast as it's less obvious if that breaks.
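
To make the broadcast/multicast distinction concrete: a switch can tell the two apart from the destination MAC alone. Broadcast is the all-ones address, while any other address with the group bit (least significant bit of the first octet) set is multicast. A rough sketch of that classification follows; the helper name is made up for illustration and is not taken from any driver:

```python
def classify_dest_mac(mac: str) -> str:
    """Classify a destination MAC as 'broadcast', 'multicast' or 'unicast'."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError(f"expected 6 octets, got {len(octets)}: {mac!r}")
    if octets == b"\xff" * 6:
        return "broadcast"          # ARP requests, DHCP discover
    if octets[0] & 0x01:
        return "multicast"          # group bit set, e.g. 33:33:xx:xx:xx:xx
    return "unicast"


for mac in ("ff:ff:ff:ff:ff:ff", "33:33:ff:00:00:01", "c0:4a:00:cc:db:88"):
    print(mac, classify_dest_mac(mac))
```

ARP and DHCP use the broadcast address, whereas IPv6 neighbour discovery uses 33:33:… multicast addresses, so a switch that floods only broadcast frames to the CPU port would produce exactly the split described here.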

comment:2 Changed 2 years ago by oliver.jowett@…

Looks like a bug in the AR934x support.

ar7240sw_setup() sets the BC_DP bit for port 0 in AR934X_REG_FLOOD_MASK so that broadcast frames are flooded to the CPU port.

However it does not do the same for the equivalent multicast/unicast flood bits. The reset value for that register disables those bits for port 0, which means that multicast and unicast frames (without a learned destination) will not be flooded to the CPU port.

Setting these bits appears to have fixed IPv6 neighbor discovery; patch to follow.
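
As a rough model of the change (not the actual driver code): the fix amounts to a read-modify-write on the flood mask register that sets the CPU port's multicast and unicast flood bits alongside the broadcast bit that is already set. The field offsets below are placeholders I made up for illustration; the real bit layout has to come from the AR934X register definitions in the driver.

```python
CPU_PORT = 0

# ASSUMPTION: these field offsets are placeholders for illustration only.
# They are NOT taken from the AR934X datasheet or the ar7240 driver source.
UC_DP_SHIFT = 0    # hypothetical start of the unicast flood field
MC_DP_SHIFT = 8    # hypothetical start of the multicast flood field
BC_DP_SHIFT = 16   # hypothetical start of the broadcast flood field


def flood_bit(shift: int, port: int) -> int:
    """Bit that enables flooding of one frame class to one port."""
    return 1 << (shift + port)


def enable_cpu_flooding(reg_value: int, unicast: bool = True,
                        multicast: bool = True) -> int:
    """Return the register value with the CPU port's flood bits set.

    Broadcast is always enabled, mirroring what the existing setup code
    does; multicast and unknown-unicast are the additional bits.
    """
    mask = flood_bit(BC_DP_SHIFT, CPU_PORT)
    if multicast:
        mask |= flood_bit(MC_DP_SHIFT, CPU_PORT)
    if unicast:
        mask |= flood_bit(UC_DP_SHIFT, CPU_PORT)
    return reg_value | mask  # OR keeps the bits for all other ports intact


# Example: start from a value where only another port's broadcast bit is set
before = flood_bit(BC_DP_SHIFT, 1)
after = enable_cpu_flooding(before)
print(f"{before:#010x} -> {after:#010x}")
```

Because the new bits are ORed into the existing value, the flood settings for the other ports are left untouched.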

Changed 2 years ago by oliver.jowett@…

patch against AA enabling multicast/unicast flooding

comment:3 Changed 2 years ago by zorun

A similar bug appears in a much simpler setup: eth0, eth1 and wlan0 all bridged together, without any routing or firewall rules.

This happened on a TP-Link TL-WR841N v8.4 running AA 12.09, with an external Linux box serving as an IPv6 uplink and sending RAs (plugged into one of the ports of the WR841N).

The symptoms were as follows:

  • IPv4 works fine
  • when the uplink is plugged into one of the LAN ports (eth1), wireless clients receive RAs from the uplink and can send IPv6 packets, but never receive responses. The router on the uplink never finds out the MAC of the wireless clients (Neighbour Discovery works in one direction only)
  • when the uplink is plugged into the WAN port (eth0), wireless clients work fine with IPv6. However, clients plugged into the LAN ports do not immediately receive an IPv6 prefix (the outgoing RS seems to be dropped somewhere). When they finally receive an RA, they are not IPv6-reachable from the rest of the network.

I believe these issues come from the bug described above. Compiling 12.09.1 with the attached patch solves all these issues.

comment:4 Changed 2 years ago by anonymous

Note that a similar patch is available on trunk:

http://git.openwrt.org/?p=openwrt.git;a=commit;h=0b6484dce5ace7f5e0cc3f54fd9b18252affcf42

Interestingly, it doesn't activate unicast flooding. Oliver, do you think it's necessary?

comment:5 Changed 2 years ago by jow

  • Milestone changed from Attitude Adjustment 12.09 to Barrier Breaker 14.07

Milestone Attitude Adjustment 12.09 deleted

comment:6 Changed 2 years ago by anonymous

So is this solved in 14.07 or not?

comment:7 Changed 2 years ago by anonymous

Any updates, please? Is the bug solved in the build of 29th September?

comment:8 Changed 2 years ago by nbd

it should work, please test.

comment:9 Changed 23 months ago by nbd

  • Resolution set to no_response
  • Status changed from new to closed
