ECMP nexthop issue - cannot configure a default gateway #18326

Open
2 tasks done
rozmanro1 opened this issue Mar 6, 2025 · 2 comments
Labels
triage Needs further investigation

@rozmanro1

Description

Hi Guys,

We have a Linux gateway device with a cellular connection that uses dhclient to obtain the interface address and the default gateway address.
Both addresses are then applied to Linux through FRR.

Every few days, configuring the default gateway address fails.
When it does, syslog shows the following:

"Feb 17 17:58:19 localhost NET: dhclient: Checking for existing default route to remove...
Feb 17 17:58:19 localhost set_ambient_netadm[95188]: [pid: 95188|app: 0|req: 741/1530] 127.0.0.1 () {40 vars in 539 bytes} [Mon Feb 17 17:58:19 2025] GET /routers/1/routes/notify/wwan0 => generated 51 bytes in 48 msecs (HTTP/1.1 200) 8 headers in 304 bytes (2 switches on core 0)
Feb 17 17:58:20 localhost NET: dhclient: Adding new default route via 10.146.169.4 on wwan0 with metric 254
Feb 17 17:58:20 localhost staticd[2886]: [MHYBZ-5A04C][EC 100663334] error processing configuration change: error [generic error] event [validate] operation [create] xpath [/frr-routing:routing/control-plane-protocols/control-plane-protocol[type='frr-staticd:staticd'][name='staticd'][vrf='default']/frr-staticd:staticd/route-list[prefix='0.0.0.0/0'][afi-safi='frr-routing:ipv4-unicast']/path-list[table-id='0'][distance='254']/frr-nexthops/nexthop[nh-type='ip4'][vrf='default'][gateway='10.146.169.4'][interface='(null)']] message: Route cannot have more than 1 ECMP nexthops
Feb 17 17:58:20 localhost staticd[2886]: [H68KZ-12QEF][EC 100663340] nb_candidate_commit_prepare: failed to validate candidate configuration
Feb 17 17:58:20 localhost staticd[2886]: [KFEJ3-7JXVF] BE-CLIENT: mgmt_be_txn_cfg_prepare: ERROR: Failed to validate configs txn-id: 722 1 batches, err: 'Route cannot have more than 1 ECMP nexthops'
Feb 17 17:58:20 localhost mgmtd[2838]: [KF39R-NRP86] mgmt_txn_notify_be_cfgdata_reply: ERROR: CFGDATA_CREATE_REQ sent to 'staticd' failed txn-id: 722 err: Route cannot have more than 1 ECMP nexthops
Feb 17 17:58:20 localhost mgmtd[2838]: [GGJTQ-VTT01] SET_CONFIG request for client 0x2bb failed, Error: 'Route cannot have more than 1 ECMP nexthops'
Feb 17 17:58:20 localhost NET: dhclient: Failed to create default route: 10.146.169.4 dev wwan0 254
Feb 17 17:58:20 localhost ntpd[4373]: Listen normally on 108 wwan0 10.146.169.3:123
Feb 17 17:58:20 localhost ntpd[4373]: new interface(s) found: waking up resolver"
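
The staticd error above packs the rejected change into one long xpath. A minimal Python sketch for pulling its predicates apart, assuming the log line format shown above; `parse_nexthop_error` is a hypothetical helper, not part of FRR:

```python
import re

# Hypothetical helper (not part of FRR): matches the [key='value'] predicates
# inside the xpath of a staticd validation error.
PREDICATE = re.compile(r"\[([\w-]+)='([^']*)'\]")

def parse_nexthop_error(line: str) -> dict:
    """Return the xpath predicates plus the trailing message from one log line."""
    start = line.find("xpath [")
    end = line.find("] message: ")
    if start == -1 or end == -1:
        return {}
    xpath = line[start + len("xpath ["):end]
    # Repeated keys (vrf appears twice) simply keep the last value seen.
    fields = dict(PREDICATE.findall(xpath))
    fields["message"] = line[end + len("] message: "):].strip()
    return fields

# The staticd line from the syslog capture above, split for readability.
sample = (
    "Feb 17 17:58:20 localhost staticd[2886]: [MHYBZ-5A04C][EC 100663334] "
    "error processing configuration change: error [generic error] event [validate] "
    "operation [create] xpath [/frr-routing:routing/control-plane-protocols/"
    "control-plane-protocol[type='frr-staticd:staticd'][name='staticd'][vrf='default']"
    "/frr-staticd:staticd/route-list[prefix='0.0.0.0/0'][afi-safi='frr-routing:ipv4-unicast']"
    "/path-list[table-id='0'][distance='254']/frr-nexthops/"
    "nexthop[nh-type='ip4'][vrf='default'][gateway='10.146.169.4'][interface='(null)']] "
    "message: Route cannot have more than 1 ECMP nexthops"
)

fields = parse_nexthop_error(sample)
for key in ("prefix", "distance", "gateway", "interface", "message"):
    print(f"{key}: {fields[key]}")
```

For this line, the rejected nexthop is gateway 10.146.169.4 at distance 254 for prefix 0.0.0.0/0, with the interface shown as (null).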

The error implies that a default gateway is already configured, although none is.
None of the following shows a default route:
"show running-config",
"show ip route",
"route -n" in Linux.

Version

localhost# show version
FRRouting 10.1 (localhost) on Linux(6.12.5).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
configured with:
    '--build=x86_64-linux-gnu' '--prefix=/usr' '--includedir=${prefix}/include' '--mandir=${prefix}/share/man' '--infodir=${prefix}/share/info' '--sysconfdir=/etc' '--localstatedir=/var' '--disable-option-checking' '--disable-silent-rules' '--libdir=${prefix}/lib/aarch64-linux-gnu' '--libexecdir=${prefix}/lib/aarch64-linux-gnu' '--disable-maintainer-mode' '--host=aarch64-linux-gnu' '--sbindir=/usr/lib/frr' '--with-vtysh-pager=/usr/bin/pager' '--libdir=/usr/lib/aarch64-linux-gnu/frr' '--with-moduledir=/usr/lib/aarch64-linux-gnu/frr/modules' '--disable-dependency-tracking' '--enable-rpki' '--disable-scripting' '--enable-pim6d' '--with-libpam' '--enable-doc' '--enable-doc-html' '--enable-snmp' '--enable-fpm' '--disable-protobuf' '--disable-zeromq' '--enable-ospfapi' '--enable-bgp-vnc' '--enable-multipath=256' '--enable-user=frr' '--enable-group=frr' '--enable-vty-group=frrvty' '--enable-configfile-mask=0640' '--enable-logfile-mask=0640' 'build_alias=x86_64-linux-gnu' 'host_alias=aarch64-linux-gnu' 'PYTHON=python3'
localhost#

How to reproduce

Repeated reconnection cycles on the cellular interface. The failure shows up once every few days.

Expected behavior

The default gateway should be set successfully.

Example syslog output from a successful run:
Nov 21 21:17:09 localhost NET: dhclient: Checking for existing cellular default route to remove...
Nov 21 21:17:09 localhost NET: dhclient: Adding new default route via 192.168.2.1 on eth2 with metric 254
Nov 21 21:17:10 localhost NET: dhclient: Successfully added new default route: 192.168.2.1 dev eth2 254

Actual behavior

The default gateway cannot be configured: FRR rejects it with "Route cannot have more than 1 ECMP nexthops", as if a default route with the same metric already existed.
However, no default route appears in "show running-config", "show ip route", or "route -n" in Linux.

Additional context

Example state captured while the problem is present:
The cellular wwan0 interface has address 10.149.2.65.
The cellular default gateway should be 10.149.2.66 (per the dhclient logs).
The default gateway address 10.149.2.66 answers pings.

root@localhost:/# ip route show table all
10.149.2.64/30 dev wwan0 proto kernel scope link src 10.149.2.65
169.254.1.0/24 dev lan3 proto kernel scope link src 169.254.1.1
172.31.0.0/24 dev docker0 proto kernel scope link src 172.31.0.1 linkdown
local 10.149.2.65 dev wwan0 table local proto kernel scope host src 10.149.2.65
broadcast 10.149.2.67 dev wwan0 table local proto kernel scope link src 10.149.2.65
local 127.0.0.0/8 dev lo table local proto kernel scope host src 127.0.0.1
local 127.0.0.1 dev lo table local proto kernel scope host src 127.0.0.1
broadcast 127.255.255.255 dev lo table local proto kernel scope link src 127.0.0.1
local 169.254.1.1 dev lan3 table local proto kernel scope host src 169.254.1.1
broadcast 169.254.1.255 dev lan3 table local proto kernel scope link src 169.254.1.1
local 172.31.0.1 dev docker0 table local proto kernel scope host src 172.31.0.1
broadcast 172.31.0.255 dev docker0 table local proto kernel scope link src 172.31.0.1 linkdown
local ::1 dev lo table local proto kernel metric 0 pref medium
root@localhost:/#
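
The check done by eye on the output above can be scripted. A small Python sketch, assuming the usual iproute2 text format in which the 0.0.0.0/0 and ::/0 prefixes are printed as the keyword `default`; `find_default_routes` is a hypothetical helper name:

```python
def find_default_routes(ip_route_output: str) -> list:
    """Return the lines of `ip route show table all` output that are default routes."""
    matches = []
    for line in ip_route_output.splitlines():
        tokens = line.split()
        # "default" is either the first token or follows a route type
        # such as "unreachable" or "blackhole".
        if tokens and "default" in tokens[:2]:
            matches.append(line.strip())
    return matches

# A few lines taken from the capture above.
captured = """\
10.149.2.64/30 dev wwan0 proto kernel scope link src 10.149.2.65
169.254.1.0/24 dev lan3 proto kernel scope link src 169.254.1.1
local 10.149.2.65 dev wwan0 table local proto kernel scope host src 10.149.2.65
broadcast 10.149.2.67 dev wwan0 table local proto kernel scope link src 10.149.2.65
"""
print(find_default_routes(captured))

# Hypothetical line, not from the report: added only to show what a match looks like.
hypothetical = captured + "default via 10.149.2.66 dev wwan0 metric 254\n"
print(find_default_routes(hypothetical))
```

On the captured lines the function returns an empty list, which agrees with the kernel holding no default route in any table.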

root@localhost:/# vtysh

Hello, this is FRRouting (version 10.1).
Copyright 1996-2005 Kunihiro Ishiguro, et al.

localhost# show running-config
Building configuration...

Current configuration:
!
frr version 10.1
frr defaults traditional
hostname localhost
log syslog informational
no ip nht resolve-via-default
service integrated-vtysh-config
!
interface wwan0
ip address 10.149.2.65/24
exit
!
interface lan0
shutdown
exit

The error is the one included in the description.
It looks as if FRR is mistakenly holding on to the default gateway address from a previous session somewhere.

Checklist

  • I have searched the open issues for this bug.
  • I have not included sensitive information in this report.
@rozmanro1 rozmanro1 added the triage Needs further investigation label Mar 6, 2025
@donaldsharp
Member

show zebra please

@rozmanro1
Author

rozmanro1 commented Mar 9, 2025

> show zebra please

Hi, here is the "show zebra" output:

root@localhost:/# vtysh

Hello, this is FRRouting (version 10.1).
Copyright 1996-2005 Kunihiro Ishiguro, et al.

localhost# show zebra
OS Linux(6.12.5)
ECMP Maximum 1
v4 Forwarding On
v6 Forwarding On
MPLS Off
EVPN Off
Kernel socket buffer size 90000000
v6 Route Replace Semantics Replace
VRF l3mdev Available
v6 with v4 nexthop Unavaliable
ASIC offload Unavailable
RA Compiled in
RFC 5549 BGP is not using
Kernel NHG Available
Allow Non FRR route deletion No
v4 All LinkDown Routes Off
v4 Default LinkDown Routes Off
v6 All LinkDown Routes Off
v6 Default LinkDown Routes Off
v4 All MC Forwarding Off
v4 Default MC Forwarding Off
v6 All MC Forwarding Off
v6 Default MC Forwarding Off

                        Route      Route      Neighbor   LSP        LSP
VRF                     Installs   Removals   Updates    Installs   Removals
default                 1227       1205       0          0          0

Thanks.
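
The `show zebra` output reports `ECMP Maximum 1`, while the configure flags in the Version section include `--enable-multipath=256`. A short Python sketch that extracts both numbers so they can be compared side by side. The helper names are hypothetical, and the thread does not say how the runtime limit was set on this system:

```python
import re

def ecmp_maximum(show_zebra_output: str):
    """Return the runtime ECMP limit from `show zebra` output, or None if absent."""
    m = re.search(r"^ECMP Maximum\s+(\d+)\s*$", show_zebra_output, re.MULTILINE)
    return int(m.group(1)) if m else None

def build_multipath(configure_flags: str):
    """Return the compile-time --enable-multipath value, or None if absent."""
    m = re.search(r"--enable-multipath=(\d+)", configure_flags)
    return int(m.group(1)) if m else None

# Excerpts from the outputs posted in this issue.
show_zebra = """\
OS Linux(6.12.5)
ECMP Maximum 1
v4 Forwarding On
"""
flags = "'--enable-bgp-vnc' '--enable-multipath=256' '--enable-user=frr'"

runtime = ecmp_maximum(show_zebra)
compiled = build_multipath(flags)
print(f"runtime ECMP maximum: {runtime}, compile-time multipath: {compiled}")
```

With a runtime limit of 1, a single leftover nexthop on the 0.0.0.0/0 path at distance 254 would be enough for staticd to refuse a second one, which matches the wording of the validation error.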
