Race condition between netpol and IPVS based ipset updates #1732

Closed
alexcriss opened this issue Sep 5, 2024 · 15 comments · Fixed by #1806

@alexcriss

What happened?

I am observing a race condition between the NetworkPolicyController and the NetworkServicesController when updating IPVS entries. The scenario is as follows:

  • There is a service that has an ExternalIP associated with it.
  • A new pod that the service targets starts on a host.
  • kube-router runs the periodic syncIpvsFirewall and adds the ExternalIP to the kube-router-svip-prt ipset. Traffic to the ExternalIP coming from other nodes then starts being ACCEPT-ed by iptables. At this stage, the NetworkServicesController also adds the ExternalIP to the ipSetHandlers map it maintains in memory.
  • Something triggers a network policy sync, and kube-router runs syncNetworkPolicyChains. This refreshes the ipsets to include the IPs contained in NetworkPolicies, starting from the in-memory values that the NetworkPolicyController holds in its ipSetHandlers.
  • The NetworkPolicyController ipSetHandlers map doesn't know anything about the ExternalIP that was added by the NetworkServicesController, so the ExternalIP is removed from kube-router-svip-prt. Traffic to the ExternalIP gets REJECT-ed by iptables until syncIpvsFirewall runs again (see the sketch after this list).
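
A minimal, self-contained sketch of the race (hypothetical names, not kube-router's actual types), just to illustrate how a restore from one controller's stale in-memory view can drop an entry the other controller added to the shared kernel set:

package main

import "fmt"

// ipSetView is a stand-in for a controller's in-memory ipset handler.
type ipSetView struct {
	entries map[string]bool
}

// restore mimics ipset "swap" semantics: the kernel set becomes exactly this view.
func (v *ipSetView) restore(kernel map[string]bool) {
	for k := range kernel {
		delete(kernel, k)
	}
	for e := range v.entries {
		kernel[e] = true
	}
}

func main() {
	kernel := map[string]bool{} // kube-router-svip-prt as the kernel sees it

	nsc := &ipSetView{entries: map[string]bool{}} // NetworkServicesController's view
	npc := &ipSetView{entries: map[string]bool{}} // NetworkPolicyController's (stale) view

	// syncIpvsFirewall learns about the ExternalIP and restores it to the kernel.
	nsc.entries["87.250.179.246,tcp:443"] = true
	nsc.restore(kernel)
	fmt.Println("after NSC sync:", kernel) // ExternalIP present, traffic ACCEPT-ed

	// fullPolicySync restores from the NPC's older view, which never saw the IP.
	npc.restore(kernel)
	fmt.Println("after NPC sync:", kernel) // ExternalIP gone until the next NSC sync
}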

What did you expect to happen?

The ExternalIPs of services should be added to the kube-router-svip-prt ipset and remain there, instead of getting removed and re-added.

How can we reproduce the behavior you experienced?

Steps to reproduce the behavior:

  1. Have a service with an ExternalIP added to it, say a.b.c.d.
  2. Spin up a new pod targeted by the service
  3. Observe the content of the kube-router-svip-prt ipset on the host where the pod started with ipset list kube-router-svip-prt | grep -P "a\.b\.c\.d"
  4. The IP will be there after kube-router runs syncIpvsFirewall and will disappear when kube-router runs fullPolicySync.

System Information (please complete the following information)

  • Kube-Router Version (kube-router --version):
    Running kube-router version v2.1.0-11-gac6b898c, built on 2024-03-18T20:39:38+0100, go1.22.0

  • Kube-Router Parameters:

/usr/bin/kube-router --advertise-cluster-ip=false --advertise-external-ip=true --advertise-loadbalancer-ip=false --advertise-pod-cidr=true --bgp-graceful-restart=true --bgp-port=179 --bgp-local-port=199 --cluster-asn=65532 --enable-cni=true --enable-ibgp=false --enable-overlay=false --enable-pod-egress=false --health-port=1420 --hostname-override=vip-k8s-general-c12-36.dfw.vipv2.net --iptables-sync-period=97s --ipvs-sync-period=181s --kubeconfig=/etc/kubernetes/dfw-vip-exploud/kube-proxy.kubeconfig.yaml --masquerade-all=false --metrics-port=1421 --nodes-full-mesh=false --override-nexthop=true --peer-router-asns=65532 --peer-router-ips=127.0.0.1 --router-id=127.0.0.1 --routes-sync-period=67s --run-firewall=true --run-router=true --run-service-proxy=true --service-cluster-ip-range=172.30.0.0/16 --service-external-ip-range=87.250.180.0/23 --v=5
  • Kubernetes Version (kubectl version): 1.27.13
  • Cloud Type: on premise
  • Kubernetes Deployment Type: custom scripts
  • Kube-Router Deployment Type: System service
  • Cluster Size: The test cluster is 10 nodes; prod clusters are around 100 nodes

Logs, other output, metrics

This is what I see in the logs (I extracted the relevant parts; the full run is attached):

Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: I0905 08:29:55.259444  248635 service_endpoints_sync.go:87] Syncing IPVS Firewall
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: I0905 08:29:55.259452  248635 network_services_controller.go:612] Attempting to attain ipset mutex lock
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: I0905 08:29:55.259460  248635 network_services_controller.go:614] Attained ipset mutex lock, continuing...
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: I0905 08:29:55.265608  248635 ipset.go:564] 
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: create TMP-TF3INM4IEYGA443O hash:ip,port timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: flush TMP-TF3INM4IEYGA443O
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.28.239,tcp:443 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 87.250.179.244,tcp:443 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 87.250.179.246,tcp:443 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.216.84,tcp:25 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 87.250.179.242,tcp:443 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.36.204,tcp:443 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.232.251,tcp:8080 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.0.50,tcp:80 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.28.239,tcp:8080 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.204.62,tcp:80 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.248.172,tcp:443 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.79.217,tcp:11233 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.232.251,tcp:9666 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.0.69,tcp:2379 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.216.84,tcp:80 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.140.191,tcp:443 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.0.64,tcp:2379 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.0.65,tcp:2379 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.16.186,tcp:443 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.0.66,tcp:2379 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.0.67,tcp:2379 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.48.115,tcp:2379 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.0.10,tcp:53 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.16.131,tcp:443 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.118.158,tcp:443 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.183.152,tcp:443 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.0.10,udp:53 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.16.131,tcp:80 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.200.95,tcp:8080 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.127.69,tcp:9402 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.118.158,tcp:80 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.183.152,tcp:80 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.189.55,tcp:9443 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.200.95,tcp:443 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.140.191,tcp:80 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.16.186,tcp:80 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.0.10,tcp:9153 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.204.62,tcp:443 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.2.10,tcp:443 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.0.1,tcp:443 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.10.1,tcp:443 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.189.55,tcp:8080 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 87.250.179.246,tcp:80 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.108.132,tcp:9222 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 87.250.179.244,tcp:80 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 87.250.179.242,tcp:80 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-TF3INM4IEYGA443O 172.30.148.17,tcp:443 timeout 0
Sep 05 08:29:55 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: create kube-router-svip-prt hash:ip,port timeout 0
...
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: I0905 08:30:00.159752  248635 network_policy_controller.go:195] Received request for a full sync, processing
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: I0905 08:30:00.159764  248635 network_policy_controller.go:243] Starting sync of iptables with version: 1725525000159758032
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.216.84,tcp:80 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.248.172,tcp:443 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.16.186,tcp:80 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.0.67,tcp:2379 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.0.64,tcp:2379 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.118.158,tcp:80 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.183.152,tcp:443 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.127.69,tcp:9402 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.28.239,tcp:8080 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.0.10,tcp:53 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.0.10,tcp:9153 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.200.95,tcp:8080 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.2.10,tcp:443 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.140.191,tcp:443 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 87.250.179.244,tcp:443 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.0.65,tcp:2379 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.232.251,tcp:9666 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.189.55,tcp:9443 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.16.131,tcp:80 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.16.186,tcp:443 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.0.66,tcp:2379 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.232.251,tcp:8080 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.118.158,tcp:443 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.0.1,tcp:443 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.140.191,tcp:80 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.10.1,tcp:443 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.204.62,tcp:443 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.0.50,tcp:80 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.183.152,tcp:80 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.0.10,udp:53 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.48.115,tcp:2379 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.16.131,tcp:443 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.0.69,tcp:2379 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.204.62,tcp:80 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 87.250.179.244,tcp:80 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 87.250.179.242,tcp:80 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.200.95,tcp:443 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.189.55,tcp:8080 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.148.17,tcp:443 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.79.217,tcp:11233 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.216.84,tcp:25 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.108.132,tcp:9222 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.36.204,tcp:443 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 87.250.179.242,tcp:443 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: add TMP-DEZFJSJBULNQ6H3V 172.30.28.239,tcp:443 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: create kube-router-svip-prt hash:ip,port family inet hashsize 1024 maxelem 65536 timeout 0
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: swap TMP-DEZFJSJBULNQ6H3V kube-router-svip-prt
Sep 05 08:30:00 vip-k8s-general-C12-36.dfw.vipv2.net kube-router[248635]: flush TMP-DEZFJSJBULNQ6H3V

When the ipsets are restored by the NetworkServicesController, kube-router-svip-prt contains 87.250.179.246; when they are restored by the NetworkPolicyController, 87.250.179.246 is missing.

I am patching the issue for now by running ipset.Save() in each controller before it builds its updated version, to make sure the base layer is the current config instead of the previous in-memory content, which might be outdated.
kube-router-ipset-race.log

@alexcriss alexcriss added the bug label Sep 5, 2024

github-actions bot commented Oct 6, 2024

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions bot added the Stale label Oct 6, 2024
@walthowd

walthowd commented Oct 6, 2024

Not stale.

@github-actions github-actions bot removed the Stale label Oct 7, 2024
@rbrtbnfgl
Contributor

Hi @alexcriss, I'm trying to replicate your issue but I haven't been able to. In your setup, do you have other services created and only one of them hitting the issue? Do you have network policies configured?

@alexcriss
Author

Hi @rbrtbnfgl,

We have multiple services, and all of them are impacted. These services all have ExternalIPs announced by BGP, and we see traffic failing on those IPs. They are also all targets of Network Policies, which allow traffic to said ExternalIPs only from specific IPs.

So yes, we send traffic to the ExternalIP, and the IPs that the Network Policies allow to send traffic there are not in the ipset that allows the traffic at the iptables layer.

Hopefully this helps; I am here for any other questions!

@rbrtbnfgl
Contributor

Could it possibly be related to the network policy you defined? How did you define it? I tried using from and to, and I'm still not getting the missing ipset reference.

@alexcriss
Author

alexcriss commented Oct 30, 2024

The netpol we use has multiple entries; we match on pod selectors and on raw IPs. I am not really sure it matters, though, since the IP that is not getting set in kube-router-svip-prt has nothing to do with the netpol itself.

A stripped-down version of the netpol looks like this:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: ingress-nginx-api
  namespace: ingress-nginx
spec:
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: monitoring
      podSelector:
        matchLabels:
          app: prometheus
    ports:
    - port: 12345
      protocol: TCP
  - from:
    - ipBlock:
        cidr: a.b.c.d/32
    - ipBlock:
        cidr: x.y.0.0/16
    ports:
    - port: 80
      protocol: TCP
    - port: 443
      protocol: TCP
  podSelector:
    matchLabels:
      nginx-ingress: api
  policyTypes:
  - Ingress


This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions bot added the Stale label Nov 30, 2024
@alexcriss
Author

not stale :)

@rbrtbnfgl
Contributor

Sorry, I didn't have time to look at it lately.

@github-actions github-actions bot removed the Stale label Dec 3, 2024

github-actions bot commented Jan 3, 2025

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions bot added the Stale label Jan 3, 2025
@alexcriss
Author

not stale

@aauren
Collaborator

aauren commented Feb 2, 2025

@alexcriss - After spending some time on this, @rbrtbnfgl and I were never able to reproduce it. However, race conditions like this can be tricky. It likely has to do with the size of your cluster, how many network policies you have, specific timing, and the like.

After reviewing the logic, we do think we can see how kube-router's state could become unaligned with the ipset state loaded in by the NetworkServicesController.

Essentially, ipsets only get saved in the NPC when cleanupStaleIPSets() is called, which is at the tail end of fullPolicySync(). This means that when the ipsets are restored in syncNetworkPolicyChains(), which happens before cleanupStaleIPSets() is called, the restore is effectively running with saved state from a previous run of fullPolicySync().
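
A rough, self-contained sketch of that ordering (stand-in types and function bodies, not kube-router's actual code): the save happens after the restore, so each restore works from the snapshot taken at the end of the previous sync.

package main

import "fmt"

// npcState stands in for the NetworkPolicyController's in-memory ipset snapshot.
type npcState struct{ savedEntries []string }

func syncNetworkPolicyChains(npc *npcState, kernel *[]string) {
	// Restore the kernel set from the in-memory snapshot, which dates from the
	// END of the previous fullPolicySync; anything the NetworkServicesController
	// added to the kernel since then is dropped here.
	*kernel = append([]string(nil), npc.savedEntries...)
}

func cleanupStaleIPSets(npc *npcState, kernel []string) {
	// Only now is the current kernel state read back into memory (the ipset
	// save), too late for the restore that already ran above.
	npc.savedEntries = append([]string(nil), kernel...)
}

func fullPolicySync(npc *npcState, kernel *[]string) {
	syncNetworkPolicyChains(npc, kernel) // restore first...
	// ...iptables chain programming elided...
	cleanupStaleIPSets(npc, *kernel) // ...save last
}

func main() {
	// The kernel set currently includes an entry added by syncIpvsFirewall.
	kernel := []string{"87.250.179.246,tcp:443"}
	npc := &npcState{} // snapshot from the previous run, without that entry

	fullPolicySync(npc, &kernel)
	fmt.Println(kernel) // [] - the NSC's entry was wiped by the stale restore
}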

@rbrtbnfgl added what we believe to be a fix for this issue in #1806. Would you be able to test it and see if that resolves the issue you are experiencing? If it helps, I built a container containing this change and pushed it to Docker Hub: https://hub.docker.com/layers/cloudnativelabs/kube-router-git/amd64-prs1806/images/sha256-a3b126d49890b40408e5cd010806af1d28dab785fa7585d61f4e6708036e1e3a

cloudnativelabs/kube-router-git:amd64-prs1806

@aauren
Collaborator

aauren commented Feb 10, 2025

@alexcriss ping - It would be really good to know if the patch in #1806 fixes the issue for you before we merge it, since @rbrtbnfgl and I are unable to reproduce this issue in our cluster. Do you have time to test this in the next couple of days?

@alexcriss
Author

alexcriss commented Feb 10, 2025

Sorry, I saw the comment and did not have the time to look at it properly.

We have been running basically the same patch and it solves the issue.

We also call ipset.Save() in the other controllers in case the race happens "on the other side" for whatever reason; we never observed that, it was just done out of caution.

Will post what we use later, but this is resolving it for us.

@alexcriss
Author

This is what we use, which looks exactly like your commit for the pkg/controllers/netpol/policy.go part:

diff --git a/pkg/controllers/netpol/policy.go b/pkg/controllers/netpol/policy.go
index 7c0ff7d2..ed0a9a7c 100644
--- a/pkg/controllers/netpol/policy.go
+++ b/pkg/controllers/netpol/policy.go
@@ -98,6 +98,13 @@ func (npc *NetworkPolicyController) syncNetworkPolicyChains(networkPoliciesInfo
 		}
 	}()
 
+	for _, ipset := range npc.ipSetHandlers {
+		err := ipset.Save()
+		if err != nil {
+			klog.Warningf("Error saving ipset rules before building network policy based sets: %+v", err)
+		}
+	}
+
 	// run through all network policies
 	for _, policy := range networkPoliciesInfo {
 		currentPodIPs := make(map[api.IPFamily][]string)
diff --git a/pkg/controllers/proxy/network_services_controller.go b/pkg/controllers/proxy/network_services_controller.go
index 3173c21d..3a74fdc5 100644
--- a/pkg/controllers/proxy/network_services_controller.go
+++ b/pkg/controllers/proxy/network_services_controller.go
@@ -617,6 +617,13 @@ func (nsc *NetworkServicesController) syncIpvsFirewall() error {
 		klog.V(1).Infof("Returned ipset mutex lock")
 	}()
 
+	for _, ipset := range nsc.ipSetHandlers {
+		err := ipset.Save()
+		if err != nil {
+			klog.Warningf("Error saving ipset rules before building network services based sets: %+v", err)
+		}
+	}
+
 	// Populate local addresses ipset.
 	addrsMap, err := getAllLocalIPs()
 	if err != nil {
diff --git a/pkg/controllers/routing/network_routes_controller.go b/pkg/controllers/routing/network_routes_controller.go
index cf97a554..bcf0c7ae 100644
--- a/pkg/controllers/routing/network_routes_controller.go
+++ b/pkg/controllers/routing/network_routes_controller.go
@@ -958,6 +958,13 @@ func (nrc *NetworkRoutingController) syncNodeIPSets() error {
 		klog.V(1).Infof("Returned ipset mutex lock")
 	}()
 
+	for _, ipset := range nrc.ipSetHandlers {
+		err := ipset.Save()
+		if err != nil {
+			klog.Warningf("Error saving ipset rules before building node based sets: %+v", err)
+		}
+	}
+
 	nodes := nrc.nodeLister.List()
 
 	// Collect active PodCIDR(s) and NodeIPs from nodes

As I mentioned before, we added an ipset.Save() call to the other controllers too. I am not sure that was needed; I did it because code inspection suggested possible races there, but we never had a real case of stale rules in the other controllers.
