Using kube-router to run BGP

This guide explains how to configure Cilium and kube-router to work together: kube-router handles BGP peering and route propagation, while Cilium handles policy enforcement and load-balancing.

Note

This is a beta feature. Please provide feedback and file a GitHub issue if you experience any problems.

Deploy kube-router

Download the kube-router DaemonSet template:

curl -LO https://raw.githubusercontent.com/cloudnativelabs/kube-router/v0.4.0/daemonset/generic-kuberouter-only-advertise-routes.yaml

Open the file generic-kuberouter-only-advertise-routes.yaml and edit the args: section. The following arguments must be set to exactly these values:

- "--run-router=true"
- "--run-firewall=false"
- "--run-service-proxy=false"
- "--enable-cni=false"
- "--enable-pod-egress=false"
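Before applying the manifest, it can help to confirm that all five required arguments are present. The following is a minimal sketch, not part of kube-router or Cilium: the function name check_required_args is hypothetical, and it performs a plain text search, so it would also match a commented-out line.

```shell
#!/bin/sh
# check_required_args FILE
# Hypothetical helper: prints every required kube-router argument that FILE
# does not contain and returns 1 if any is missing, 0 otherwise.
check_required_args() {
    manifest=$1
    missing=0
    for arg in \
        "--run-router=true" \
        "--run-firewall=false" \
        "--run-service-proxy=false" \
        "--enable-cni=false" \
        "--enable-pod-egress=false"
    do
        # -F: match a fixed string, -q: no output,
        # -e: needed because the pattern itself starts with a dash.
        if ! grep -Fq -e "$arg" "$manifest"; then
            echo "missing: $arg"
            missing=1
        fi
    done
    return "$missing"
}
```

It could be called as check_required_args generic-kuberouter-only-advertise-routes.yaml after editing the file.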

The following arguments are optional and may be set according to your needs. To keep this guide simple, the values below are the ones that require the least preparation in your cluster. See the kube-router user guide for more information.

- "--enable-ibgp=true"
- "--enable-overlay=true"
- "--advertise-cluster-ip=true"
- "--advertise-external-ip=true"
- "--advertise-loadbalancer-ip=true"

The following arguments are optional and should be set if you want BGP peering with an external router. This is useful if you want externally routable Kubernetes Pod and Service IPs. The values shown here are examples; replace them with the IPs and ASNs configured on your external router.

- "--cluster-asn=65001"
- "--peer-router-ips=10.0.0.1,10.0.0.2"
- "--peer-router-asns=65000,65000"
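The example above lists two peer IPs and two peer ASNs. Assuming each peer IP is meant to pair with exactly one ASN, as that example suggests, a quick check that both lists have the same length can catch a dropped entry. This is a sketch only; count_items and peer_lists_match are hypothetical helper names, and the values are placeholders.

```shell
#!/bin/sh
# count_items LIST: print the number of non-empty comma-separated items.
count_items() {
    printf '%s\n' "$1" | tr ',' '\n' | grep -c .
}

# peer_lists_match IPS ASNS: succeed if both lists have the same length.
peer_lists_match() {
    [ "$(count_items "$1")" -eq "$(count_items "$2")" ]
}

# Placeholder values; use the ones configured on your external router.
ips="10.0.0.1,10.0.0.2"
asns="65000,65000"

if peer_lists_match "$ips" "$asns"; then
    echo "peer IP and ASN lists have the same length"
else
    echo "peer IP and ASN lists differ in length"
fi
```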

Apply the DaemonSet file to deploy kube-router and verify it has come up correctly:

$ kubectl apply -f generic-kuberouter-only-advertise-routes.yaml
$ kubectl -n kube-system get pods -l k8s-app=kube-router
NAME                READY     STATUS    RESTARTS   AGE
kube-router-n6fv8   1/1       Running   0          10m
kube-router-nj4vs   1/1       Running   0          10m
kube-router-xqqwc   1/1       Running   0          10m
kube-router-xsmd4   1/1       Running   0          10m
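Reading the pod table by eye works for a few nodes. For larger clusters, a small filter over the same output may be more convenient. The sketch below assumes the default column layout shown above (NAME, READY, STATUS, with a header row); all_pods_ready is a hypothetical helper name.

```shell
#!/bin/sh
# all_pods_ready: reads `kubectl get pods` output (with header) on stdin.
# Succeeds only if at least one pod is listed and every pod is Running
# with all of its containers ready (READY column such as 1/1).
all_pods_ready() {
    awk '
        NR == 1 { next }                  # skip the header row
        {
            split($2, ready, "/")         # READY column, e.g. "1/1"
            if ($3 != "Running" || ready[1] != ready[2]) {
                print "not ready: " $1
                bad = 1
            }
            seen++
        }
        END { exit (bad || seen == 0) }   # 1 on failure or empty input
    '
}

# Usage against a cluster (not run here):
#   kubectl -n kube-system get pods -l k8s-app=kube-router | all_pods_ready
```

The same filter can be applied to any other pod listing in this guide that uses the default columns.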

Deploy Cilium

For routing to be delegated to kube-router, tunneling/encapsulation must be disabled. This is done by setting tunnel: "disabled" in the cilium-config ConfigMap or by adjusting the DaemonSet to run cilium-agent with the argument --tunnel=disabled:

# Encapsulation mode for communication between nodes
# Possible values:
#   - disabled
#   - vxlan (default)
#   - geneve
tunnel: "disabled"
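If you keep the Cilium manifest in a local file, the tunnel value can be checked before installing. The sketch below is an illustration only: tunnel_setting is a hypothetical helper, it assumes the key appears on its own line as in the fragment above, and it strips only double quotes.

```shell
#!/bin/sh
# tunnel_setting FILE: print the value of every `tunnel:` key found in FILE.
# Comment lines are ignored because their first field is "#", not "tunnel:".
tunnel_setting() {
    awk '$1 == "tunnel:" { gsub(/"/, "", $2); print $2 }' "$1"
}

# Usage (the file name is an assumption):
#   [ "$(tunnel_setting cilium.yaml)" = "disabled" ] || echo "tunneling is still enabled"
```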

You can then install Cilium according to the instructions in section Requirements.

Ensure that Cilium is up and running:

$ kubectl -n kube-system get pods -l k8s-app=cilium
NAME           READY     STATUS    RESTARTS   AGE
cilium-fhpk2   1/1       Running   0          45m
cilium-jh6kc   1/1       Running   0          44m
cilium-rlx6n   1/1       Running   0          44m
cilium-x5x9z   1/1       Running   0          45m

Verify Installation

Verify that kube-router has installed routes:

$ kubectl -n kube-system exec -ti cilium-fhpk2 -- ip route list scope global
default via 172.0.32.1 dev eth0 proto dhcp src 172.0.50.227 metric 1024
10.2.0.0/24 via 10.2.0.172 dev cilium_host src 10.2.0.172
10.2.1.0/24 via 172.0.51.175 dev eth0 proto 17
10.2.2.0/24 dev tun-172011760 proto 17 src 172.0.50.227
10.2.3.0/24 dev tun-1720186231 proto 17 src 172.0.50.227

In the above example, we see three categories of routes that have been installed:

  • Local PodCIDR: This route points to all pods running on the host and makes them reachable through the cilium_host device.
      * 10.2.0.0/24 via 10.2.0.172 dev cilium_host src 10.2.0.172
  • BGP route: This type of route is installed if kube-router determines that the remote PodCIDR can be reached via a router known to the local host. Pod-to-pod traffic is then forwarded directly to that router without any encapsulation.
      * 10.2.1.0/24 via 172.0.51.175 dev eth0 proto 17
  • IPIP tunnel route: If no direct routing path exists, kube-router falls back to using an overlay and establishes an IPIP tunnel between the nodes.
      * 10.2.2.0/24 dev tun-172011760 proto 17 src 172.0.50.227
      * 10.2.3.0/24 dev tun-1720186231 proto 17 src 172.0.50.227
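The three categories can be told apart mechanically from the route text. The sketch below labels each line of ip route output using only the patterns visible in the sample above (the cilium_host device, proto 17 with a via next hop, proto 17 on a tun- device). These patterns are assumptions drawn from that sample, and classify_routes is a hypothetical helper name.

```shell
#!/bin/sh
# classify_routes: reads `ip route list` output on stdin and prints
# "<category> <destination>" for each route.
classify_routes() {
    awk '
        / dev cilium_host( |$)/              { print "local-podcidr " $1; next }
        / proto 17( |$)/ && / dev tun-/      { print "ipip-tunnel " $1; next }
        / proto 17( |$)/ && $2 == "via"      { print "bgp " $1; next }
                                             { print "other " $1 }
    '
}

# Sample input copied from the output shown earlier in this section.
classify_routes <<'EOF'
default via 172.0.32.1 dev eth0 proto dhcp src 172.0.50.227 metric 1024
10.2.0.0/24 via 10.2.0.172 dev cilium_host src 10.2.0.172
10.2.1.0/24 via 172.0.51.175 dev eth0 proto 17
10.2.2.0/24 dev tun-172011760 proto 17 src 172.0.50.227
10.2.3.0/24 dev tun-1720186231 proto 17 src 172.0.50.227
EOF
```

For the sample input this labels the default route as other, 10.2.0.0/24 as local-podcidr, 10.2.1.0/24 as bgp, and the two tun- routes as ipip-tunnel.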

Deploy the connectivity test

You can deploy the “connectivity-check” to test connectivity between pods.

kubectl apply -f https://raw.githubusercontent.com/cilium/cilium/1.8.4/examples/kubernetes/connectivity-check/connectivity-check.yaml

This creates a series of deployments that connect to each other over various connectivity paths, with and without service load-balancing and with various network policy combinations. The pod name indicates the connectivity variant, and the readiness and liveness gates indicate success or failure of the test:

$ kubectl get pods -n cilium-test
NAME                                                    READY   STATUS    RESTARTS   AGE
echo-a-6788c799fd-42qxx                                 1/1     Running   0          69s
echo-b-59757679d4-pjtdl                                 1/1     Running   0          69s
echo-b-host-f86bd784d-wnh4v                             1/1     Running   0          68s
host-to-b-multi-node-clusterip-585db65b4d-x74nz         1/1     Running   0          68s
host-to-b-multi-node-headless-77c64bc7d8-kgf8p          1/1     Running   0          67s
pod-to-a-allowed-cnp-87b5895c8-bfw4x                    1/1     Running   0          68s
pod-to-a-b76ddb6b4-2v4kb                                1/1     Running   0          68s
pod-to-a-denied-cnp-677d9f567b-kkjp4                    1/1     Running   0          68s
pod-to-b-intra-node-nodeport-8484fb6d89-bwj8q           1/1     Running   0          68s
pod-to-b-multi-node-clusterip-f7655dbc8-h5bwk           1/1     Running   0          68s
pod-to-b-multi-node-headless-5fd98b9648-5bjj8           1/1     Running   0          68s
pod-to-b-multi-node-nodeport-74bd8d7bd5-kmfmm           1/1     Running   0          68s
pod-to-external-1111-7489c7c46d-jhtkr                   1/1     Running   0          68s
pod-to-external-fqdn-allow-google-cnp-b7b6bcdcb-97p75   1/1     Running   0          68s

Note

If you deploy the connectivity check to a single-node cluster, pods that check multi-node functionality will remain in the Pending state. This is expected, since these pods need at least two nodes to be scheduled successfully.