Installation Using Kind

This guide uses kind to demonstrate deployment and operation of Cilium in a multi-node Kubernetes cluster running locally on Docker.

Install Dependencies

  1. Install docker stable as described in Install Docker Engine

  2. Install kubectl version >= v1.14.0 as described in the Kubernetes Docs

  3. Install helm >= v3.13.0 per Helm documentation: Installing Helm

  4. Install kind >= v0.7.0 per kind documentation: Installation and Usage

Configure kind

Creation of a kind cluster is configured through a YAML file. This step is necessary to disable the default CNI and replace it with Cilium.

Create a kind-config.yaml file based on the following template. It will create a cluster with 3 worker nodes and 1 control-plane node.

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
- role: worker
networking:
  disableDefaultCNI: true

By default, kind uses the latest Kubernetes version that was available when that kind release was created.

To run a different Kubernetes version, the image field has to be defined for each node, as in the sketch below. See the Node Configuration documentation for more information.
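
For illustration, each node can be pinned to a specific node image (the kindest/node tag below is only an example; pick a tag that matches your kind release, ideally together with the digest from the kind release notes):

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  image: kindest/node:v1.29.2
- role: worker
  image: kindest/node:v1.29.2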

Tip

By default, kind uses the following pod and service subnets:

Networking.PodSubnet     = "10.244.0.0/16"
Networking.ServiceSubnet = "10.96.0.0/12"

If either of these subnets conflicts with your local network address range, update the networking section of the kind configuration file to specify subnets that do not conflict; otherwise you risk connectivity issues when deploying Cilium. For example:

networking:
  disableDefaultCNI: true
  podSubnet: "10.10.0.0/16"
  serviceSubnet: "10.11.0.0/16"

Create a cluster

To create a cluster with the configuration defined above, pass the kind-config.yaml you created with the --config flag of kind.

kind create cluster --config=kind-config.yaml

After a few seconds or minutes, a 4-node cluster should be created.

A new kubectl context (kind-kind) should be added to KUBECONFIG or, if unset, to ${HOME}/.kube/config:

kubectl cluster-info --context kind-kind

Note

The cluster nodes will remain in state NotReady until Cilium is deployed. This behavior is expected.
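
You can check this with kubectl; the exact output depends on your kind node image, but the STATUS column should read NotReady for every node until Cilium is running:

$ kubectl get nodes
NAME                 STATUS     ROLES           AGE   VERSION
kind-control-plane   NotReady   control-plane   2m    v1.29.2
kind-worker          NotReady   <none>          2m    v1.29.2
kind-worker2         NotReady   <none>          2m    v1.29.2
kind-worker3         NotReady   <none>          2m    v1.29.2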

Install Cilium

Setup Helm repository:

helm repo add cilium https://helm.cilium.io/

Preload the cilium image into each worker node in the kind cluster:

docker pull quay.io/cilium/cilium:v1.16.3
kind load docker-image quay.io/cilium/cilium:v1.16.3
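
Optionally, you can preload the operator image in the same way (the operator-generic image shown here is what the default Helm chart deploys; adjust it if you use a different operator variant):

docker pull quay.io/cilium/operator-generic:v1.16.3
kind load docker-image quay.io/cilium/operator-generic:v1.16.3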

Then install the Cilium release via Helm:

helm install cilium cilium/cilium --version 1.16.3 \
   --namespace kube-system \
   --set image.pullPolicy=IfNotPresent \
   --set ipam.mode=kubernetes

Note

To enable Cilium’s Socket LB (Kubernetes Without kube-proxy), cgroup v2 needs to be enabled, and the Kind nodes need to run in cgroup namespaces that are separate both from each other and from the cgroup namespace of the underlying host, so that Cilium can attach BPF programs at the right cgroup hierarchy. To verify this, run the following commands and ensure that the cgroup values are different:

$ docker exec kind-control-plane ls -al /proc/self/ns/cgroup
lrwxrwxrwx 1 root root 0 Jul 20 19:20 /proc/self/ns/cgroup -> 'cgroup:[4026532461]'

$ docker exec kind-worker ls -al /proc/self/ns/cgroup
lrwxrwxrwx 1 root root 0 Jul 20 19:20 /proc/self/ns/cgroup -> 'cgroup:[4026532543]'

$ ls -al /proc/self/ns/cgroup
lrwxrwxrwx 1 root root 0 Jul 19 09:38 /proc/self/ns/cgroup -> 'cgroup:[4026531835]'

One way to enable cgroup v2 is to set the kernel parameter systemd.unified_cgroup_hierarchy=1. To enable cgroup namespaces, the container runtime needs to be configured accordingly. For example, in Docker, dockerd’s --default-cgroupns-mode has to be set to private, as sketched below.
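
A minimal sketch of that Docker configuration, assuming a systemd-based host: set the option in /etc/docker/daemon.json and restart the daemon.

$ cat /etc/docker/daemon.json
{
  "default-cgroupns-mode": "private"
}
$ sudo systemctl restart docker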

Another requirement for the Socket LB on Kind to function properly is that either the cgroup v1 controllers net_cls and net_prio are disabled (or cgroup v1 is disabled altogether, e.g. by setting the kernel parameter cgroup_no_v1="all"), or the host kernel is 5.14 or more recent, which includes this fix.
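
For example, on a GRUB-based host both kernel parameters can be set roughly as follows (a sketch only; the exact file and regeneration command depend on your distribution, and a reboot is required):

$ grep GRUB_CMDLINE_LINUX /etc/default/grub
GRUB_CMDLINE_LINUX="systemd.unified_cgroup_hierarchy=1 cgroup_no_v1=all"
$ sudo update-grub
$ sudo reboot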

See the Pull Request for more details.

Validate the Installation

Warning

Make sure you install cilium-cli v0.15.0 or later. The rest of these instructions do not work with older versions of cilium-cli. To confirm the cilium-cli version installed on your system, run:

cilium version --client

See Cilium CLI upgrade notes for more details.

Install the latest version of the Cilium CLI. The Cilium CLI can be used to install Cilium, inspect the state of a Cilium installation, and enable/disable various features (e.g. clustermesh, Hubble).

CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt)
CLI_ARCH=amd64
if [ "$(uname -m)" = "aarch64" ]; then CLI_ARCH=arm64; fi
curl -L --fail --remote-name-all https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}
sha256sum --check cilium-linux-${CLI_ARCH}.tar.gz.sha256sum
sudo tar xzvfC cilium-linux-${CLI_ARCH}.tar.gz /usr/local/bin
rm cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}

To validate that Cilium has been properly installed, you can run:

$ cilium status --wait
   /¯¯\
/¯¯\__/¯¯\    Cilium:         OK
\__/¯¯\__/    Operator:       OK
/¯¯\__/¯¯\    Hubble:         disabled
\__/¯¯\__/    ClusterMesh:    disabled
   \__/

DaemonSet         cilium             Desired: 4, Ready: 4/4, Available: 4/4
Deployment        cilium-operator    Desired: 2, Ready: 2/2, Available: 2/2
Containers:       cilium-operator    Running: 2
                  cilium             Running: 4
Image versions    cilium             quay.io/cilium/cilium:v1.16.3: 4
                  cilium-operator    quay.io/cilium/operator-generic:v1.16.3: 2

Run the following command to validate that your cluster has proper network connectivity:

$ cilium connectivity test
ℹ️  Monitor aggregation detected, will skip some flow validation steps
✨ [k8s-cluster] Creating namespace for connectivity check...
(...)
---------------------------------------------------------------------------------------------------------------------
📋 Test Report
---------------------------------------------------------------------------------------------------------------------
✅ 69/69 tests successful (0 warnings)

Note

The connectivity test may fail to deploy due to too many open files in one or more of the pods. If you notice this error, you can increase the inotify resource limits on your host machine (see Pod errors due to “too many open files”).
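
For example, on a Linux host the inotify limits can be raised with sysctl (the values below are common suggestions, not strict requirements):

$ sudo sysctl fs.inotify.max_user_watches=524288
$ sudo sysctl fs.inotify.max_user_instances=512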

Congratulations! You have a fully functional Kubernetes cluster with Cilium. 🎉

Next Steps

Attaching a Debugger

Cilium’s Kind configuration enables access to Delve debug server instances running in the agent and operator Pods by default. See Debugging to learn how to use it.

Troubleshooting

Unable to contact k8s api-server

In the Cilium agent logs you will see:

level=info msg="Establishing connection to apiserver" host="https://10.96.0.1:443" subsys=k8s
level=error msg="Unable to contact k8s api-server" error="Get https://10.96.0.1:443/api/v1/namespaces/kube-system: dial tcp 10.96.0.1:443: connect: no route to host" ipAddr="https://10.96.0.1:443" subsys=k8s
level=fatal msg="Unable to initialize Kubernetes subsystem" error="unable to create k8s client: unable to create k8s client: Get https://10.96.0.1:443/api/v1/namespaces/kube-system: dial tcp 10.96.0.1:443: connect: no route to host" subsys=daemon

As Kind runs its nodes as containers in Docker, they share your host machine’s kernel. If the socket LB was not disabled, eBPF programs previously attached by Cilium may be stale and no longer route api-server requests to the current kind-control-plane container.

Recreating the kind cluster and re-running the Helm command from Install Cilium will detach the stale eBPF programs, as shown below.
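
In practice this amounts to something like the following, reusing the kind-config.yaml and Helm values from this guide:

$ kind delete cluster
$ kind create cluster --config=kind-config.yaml
$ helm install cilium cilium/cilium --version 1.16.3 \
   --namespace kube-system \
   --set image.pullPolicy=IfNotPresent \
   --set ipam.mode=kubernetes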

Crashing Cilium agent pods

Check whether Cilium agent pods are crashing with the following logs. This may indicate that you are deploying a kind cluster in an environment where Cilium is already running (for example, in the Cilium development VM). This can also happen if you have other overlapping BPF cgroup type programs attached to the parent cgroup hierarchy of the kind container nodes. In such cases, either tear down Cilium, or manually detach the overlapping BPF cgroup programs running in the parent cgroup hierarchy by following the bpftool documentation; a sketch follows the log excerpt below. For more information, see the Pull Request.

level=warning msg="+ bpftool cgroup attach /var/run/cilium/cgroupv2 connect6 pinned /sys/fs/bpf/tc/globals/cilium_cgroups_connect6" subsys=datapath-loader
level=warning msg="Error: failed to attach program" subsys=datapath-loader
level=warning msg="+ RETCODE=255" subsys=datapath-loader

Cluster Mesh

With Kind we can simulate Cluster Mesh in a sandbox too.

Kind Configuration

This time we need to create two configuration files, one for each Kubernetes cluster. We will explicitly configure their pod and service CIDRs so that they do not overlap.

Example kind-cluster1.yaml:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
- role: worker
networking:
  disableDefaultCNI: true
  podSubnet: "10.0.0.0/16"
  serviceSubnet: "10.1.0.0/16"

Example kind-cluster2.yaml:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
- role: worker
networking:
  disableDefaultCNI: true
  podSubnet: "10.2.0.0/16"
  serviceSubnet: "10.3.0.0/16"

Create Kind Clusters

We can now create the respective clusters:

kind create cluster --name=cluster1 --config=kind-cluster1.yaml
kind create cluster --name=cluster2 --config=kind-cluster2.yaml
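
If you preload images as in the single-cluster setup above, note that kind load targets the cluster named kind by default; for these named clusters, pass --name explicitly:

kind load docker-image quay.io/cilium/cilium:v1.16.3 --name cluster1
kind load docker-image quay.io/cilium/cilium:v1.16.3 --name cluster2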

Setting up Cluster Mesh

We can now deploy Cilium and complete the setup by following the Cluster Mesh guide: Setting up Cluster Mesh. For Kind, we’ll want to deploy the NodePort service into the kube-system namespace. A rough sketch of those steps is shown below.
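
The cluster names, IDs, and kubectl contexts in this sketch match the clusters created above; treat the Cluster Mesh guide as authoritative for the full procedure:

helm install cilium cilium/cilium --version 1.16.3 \
   --namespace kube-system \
   --kube-context kind-cluster1 \
   --set ipam.mode=kubernetes \
   --set cluster.name=cluster1 \
   --set cluster.id=1

helm install cilium cilium/cilium --version 1.16.3 \
   --namespace kube-system \
   --kube-context kind-cluster2 \
   --set ipam.mode=kubernetes \
   --set cluster.name=cluster2 \
   --set cluster.id=2

cilium clustermesh enable --context kind-cluster1 --service-type NodePort
cilium clustermesh enable --context kind-cluster2 --service-type NodePort
cilium clustermesh connect --context kind-cluster1 --destination-context kind-cluster2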