Installation Using Kind

This guide uses kind to demonstrate deployment and operation of Cilium in a multi-node Kubernetes cluster running locally on Docker.

Install Dependencies

Install docker stable as described in Install Docker Engine
Install kubectl version >= v1.14.0 as described in the Kubernetes Docs
Install helm >= v3.13.0 per Helm documentation: Installing Helm
Install kind >= v0.7.0 per kind documentation: Installation and Usage

Configure kind

Configuring kind cluster creation is done using a YAML configuration file. This step is necessary in order to disable the default CNI and replace it with Cilium.

Create a kind-config.yaml file based on the following template. It will create a cluster with 3 worker nodes and 1 control-plane node.

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
- role: worker
networking:
  disableDefaultCNI: true

By default, the latest version of Kubernetes from when the kind release was created is used.

To change the version of Kubernetes being run, image has to be defined for each node. See the Node Configuration documentation for more information.

Tip

By default, kind uses the following pod and service subnets:

Networking.PodSubnet     = "10.244.0.0/16"
Networking.ServiceSubnet = "10.96.0.0/12"

If any of these subnets conflicts with your local network address range, update the networking section of the kind configuration file to specify different subnets that do not conflict or you risk having connectivity issues when deploying Cilium. For example:

networking:
  disableDefaultCNI: true
  podSubnet: "10.10.0.0/16"
  serviceSubnet: "10.11.0.0/16"

Create a cluster

To create a cluster with the configuration defined above, pass the kind-config.yaml you created with the --config flag of kind.

kind create cluster --config=kind-config.yaml

After a couple of seconds or minutes, a 4 nodes cluster should be created.

A new kubectl context (kind-kind) should be added to KUBECONFIG or, if unset, to ${HOME}/.kube/config:

kubectl cluster-info --context kind-kind

Note

The cluster nodes will remain in state NotReady until Cilium is deployed. This behavior is expected.

Install Cilium

Download the Cilium release tarball and change to the kubernetes install directory:

curl -LO https://github.com/cilium/cilium/archive/main.tar.gz
tar xzf main.tar.gz
cd cilium-main/install/kubernetes

Preload the cilium image into each worker node in the kind cluster:

docker pull quay.io/cilium/cilium:v1.16.0-dev
kind load docker-image quay.io/cilium/cilium:v1.16.0-dev

Then, install Cilium release via Helm:

helm install cilium ./cilium \
   --namespace kube-system \
   --set image.pullPolicy=IfNotPresent \
   --set ipam.mode=kubernetes

Note

To enable Cilium’s Socket LB (Kubernetes Without kube-proxy), cgroup v2 needs to be enabled, and Kind nodes need to run in separate cgroup namespaces, and these namespaces need to be different from the cgroup namespace of the underlying host so that Cilium can attach BPF programs at the right cgroup hierarchy. To verify this, run the following commands, and ensure that the cgroup values are different:

$ docker exec kind-control-plane ls -al /proc/self/ns/cgroup
lrwxrwxrwx 1 root root 0 Jul 20 19:20 /proc/self/ns/cgroup -> 'cgroup:[4026532461]'

$ docker exec kind-worker ls -al /proc/self/ns/cgroup
lrwxrwxrwx 1 root root 0 Jul 20 19:20 /proc/self/ns/cgroup -> 'cgroup:[4026532543]'

$ ls -al /proc/self/ns/cgroup
lrwxrwxrwx 1 root root 0 Jul 19 09:38 /proc/self/ns/cgroup -> 'cgroup:[4026531835]'

One way to enable cgroup v2 is to set the kernel parameter systemd.unified_cgroup_hierarchy=1. To enable cgroup namespaces, a container runtime needs to configured accordingly. For example in Docker, dockerd’s --default-cgroupns-mode has to be set to private.

Another requirement for the Socket LB on Kind to properly function is that either cgroup v1 controllers net_cls and net_prio are disabled (or cgroup v1 altogether is disabled e.g., by setting the kernel parameter cgroup_no_v1="all"), or the host kernel should be 5.14 or more recent to include this fix.

See the Pull Request for more details.

Validate the Installation

Warning

Make sure you install cilium-cli v0.15.0 or later. The rest of instructions do not work with older versions of cilium-cli. To confirm the cilium-cli version that’s installed in your system, run:

cilium version --client

See Cilium CLI upgrade notes for more details.

Install the latest version of the Cilium CLI. The Cilium CLI can be used to install Cilium, inspect the state of a Cilium installation, and enable/disable various features (e.g. clustermesh, Hubble).

CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt)
CLI_ARCH=amd64
if [ "$(uname -m)" = "aarch64" ]; then CLI_ARCH=arm64; fi
curl -L --fail --remote-name-all https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}
sha256sum --check cilium-linux-${CLI_ARCH}.tar.gz.sha256sum
sudo tar xzvfC cilium-linux-${CLI_ARCH}.tar.gz /usr/local/bin
rm cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}

CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt)
CLI_ARCH=amd64
if [ "$(uname -m)" = "arm64" ]; then CLI_ARCH=arm64; fi
curl -L --fail --remote-name-all https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-darwin-${CLI_ARCH}.tar.gz{,.sha256sum}
shasum -a 256 -c cilium-darwin-${CLI_ARCH}.tar.gz.sha256sum
sudo tar xzvfC cilium-darwin-${CLI_ARCH}.tar.gz /usr/local/bin
rm cilium-darwin-${CLI_ARCH}.tar.gz{,.sha256sum}

Clone the Cilium GitHub repository so that the Cilium CLI can access the latest unreleased Helm chart from the main branch:

git clone git@github.com:cilium/cilium.git
cd cilium

To validate that Cilium has been properly installed, you can run

$ cilium status --wait
   /¯¯\
/¯¯\__/¯¯\    Cilium:         OK
\__/¯¯\__/    Operator:       OK
/¯¯\__/¯¯\    Hubble:         disabled
\__/¯¯\__/    ClusterMesh:    disabled
   \__/

DaemonSet         cilium             Desired: 2, Ready: 2/2, Available: 2/2
Deployment        cilium-operator    Desired: 2, Ready: 2/2, Available: 2/2
Containers:       cilium-operator    Running: 2
                  cilium             Running: 2
Image versions    cilium             quay.io/cilium/cilium:v1.9.5: 2
                  cilium-operator    quay.io/cilium/operator-generic:v1.9.5: 2

Run the following command to validate that your cluster has proper network connectivity:

$ cilium connectivity test
ℹ️  Monitor aggregation detected, will skip some flow validation steps
✨ [k8s-cluster] Creating namespace for connectivity check...
(...)
---------------------------------------------------------------------------------------------------------------------
📋 Test Report
---------------------------------------------------------------------------------------------------------------------
✅ 69/69 tests successful (0 warnings)

Note

The connectivity test may fail to deploy due to too many open files in one or more of the pods. If you notice this error, you can increase the inotify resource limits on your host machine (see Pod errors due to “too many open files”).

Congratulations! You have a fully functional Kubernetes cluster with Cilium. 🎉

You can monitor as Cilium and all required components are being installed:

$ kubectl -n kube-system get pods --watch
NAME                                    READY   STATUS              RESTARTS   AGE
cilium-operator-cb4578bc5-q52qk         0/1     Pending             0          8s
cilium-s8w5m                            0/1     PodInitializing     0          7s
coredns-86c58d9df4-4g7dd                0/1     ContainerCreating   0          8m57s
coredns-86c58d9df4-4l6b2                0/1     ContainerCreating   0          8m57s

It may take a couple of minutes for all components to come up:

cilium-operator-cb4578bc5-q52qk         1/1     Running   0          4m13s
cilium-s8w5m                            1/1     Running   0          4m12s
coredns-86c58d9df4-4g7dd                1/1     Running   0          13m
coredns-86c58d9df4-4l6b2                1/1     Running   0          13m

You can deploy the “connectivity-check” to test connectivity between pods. It is recommended to create a separate namespace for this.

kubectl create ns cilium-test

Deploy the check with:

kubectl apply -n cilium-test -f https://raw.githubusercontent.com/cilium/cilium/HEAD/examples/kubernetes/connectivity-check/connectivity-check.yaml

It will deploy a series of deployments which will use various connectivity paths to connect to each other. Connectivity paths include with and without service load-balancing and various network policy combinations. The pod name indicates the connectivity variant and the readiness and liveness gate indicates success or failure of the test:

$ kubectl get pods -n cilium-test
NAME                                                     READY   STATUS    RESTARTS   AGE
echo-a-76c5d9bd76-q8d99                                  1/1     Running   0          66s
echo-b-795c4b4f76-9wrrx                                  1/1     Running   0          66s
echo-b-host-6b7fc94b7c-xtsff                             1/1     Running   0          66s
host-to-b-multi-node-clusterip-85476cd779-bpg4b          1/1     Running   0          66s
host-to-b-multi-node-headless-dc6c44cb5-8jdz8            1/1     Running   0          65s
pod-to-a-79546bc469-rl2qq                                1/1     Running   0          66s
pod-to-a-allowed-cnp-58b7f7fb8f-lkq7p                    1/1     Running   0          66s
pod-to-a-denied-cnp-6967cb6f7f-7h9fn                     1/1     Running   0          66s
pod-to-b-intra-node-nodeport-9b487cf89-6ptrt             1/1     Running   0          65s
pod-to-b-multi-node-clusterip-7db5dfdcf7-jkjpw           1/1     Running   0          66s
pod-to-b-multi-node-headless-7d44b85d69-mtscc            1/1     Running   0          66s
pod-to-b-multi-node-nodeport-7ffc76db7c-rrw82            1/1     Running   0          65s
pod-to-external-1111-d56f47579-d79dz                     1/1     Running   0          66s
pod-to-external-fqdn-allow-google-cnp-78986f4bcf-btjn7   1/1     Running   0          66s

Note

If you deploy the connectivity check to a single node cluster, pods that check multi-node functionalities will remain in the Pending state. This is expected since these pods need at least 2 nodes to be scheduled successfully.

Once done with the test, remove the cilium-test namespace:

kubectl delete ns cilium-test

Next Steps

Setting up Hubble Observability

Inspecting Network Flows with the CLI

Service Map & Hubble UI

Identity-Aware and HTTP-Aware Policy Enforcement

Setting up Cluster Mesh

Attaching a Debugger

Cilium’s Kind configuration enables access to Delve debug server instances running in the agent and operator Pods by default. See Debugging to learn how to use it.

Troubleshooting

Unable to contact k8s api-server

In the Cilium agent logs you will see:

level=info msg="Establishing connection to apiserver" host="https://10.96.0.1:443" subsys=k8s
level=error msg="Unable to contact k8s api-server" error="Get https://10.96.0.1:443/api/v1/namespaces/kube-system: dial tcp 10.96.0.1:443: connect: no route to host" ipAddr="https://10.96.0.1:443" subsys=k8s
level=fatal msg="Unable to initialize Kubernetes subsystem" error="unable to create k8s client: unable to create k8s client: Get https://10.96.0.1:443/api/v1/namespaces/kube-system: dial tcp 10.96.0.1:443: connect: no route to host" subsys=daemon

As Kind is running nodes as containers in Docker, they’re sharing your host machines’ kernel. If the socket LB wasn’t disabled, the eBPF programs attached by Cilium may be out of date and no longer routing api-server requests to the current kind-control-plane container.

Recreating the kind cluster and using the helm command Install Cilium will detach the inaccurate eBPF programs.

Crashing Cilium agent pods

Check if Cilium agent pods are crashing with following logs. This may indicate that you are deploying a kind cluster in an environment where Cilium is already running (for example, in the Cilium development VM). This can also happen if you have other overlapping BPF cgroup type programs attached to the parent cgroup hierarchy of the kind container nodes. In such cases, either tear down Cilium, or manually detach the overlapping BPF cgroup programs running in the parent cgroup hierarchy by following the bpftool documentation. For more information, see the Pull Request.

level=warning msg="+ bpftool cgroup attach /var/run/cilium/cgroupv2 connect6 pinned /sys/fs/bpf/tc/globals/cilium_cgroups_connect6" subsys=datapath-loader
level=warning msg="Error: failed to attach program" subsys=datapath-loader
level=warning msg="+ RETCODE=255" subsys=datapath-loader

Cluster Mesh

With Kind we can simulate Cluster Mesh in a sandbox too.

Kind Configuration

This time we need to create (2) config.yaml, one for each kubernetes cluster. We will explicitly configure their pod-network-cidr and service-cidr to not overlap.

Example kind-cluster1.yaml:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
- role: worker
networking:
  disableDefaultCNI: true
  podSubnet: "10.0.0.0/16"
  serviceSubnet: "10.1.0.0/16"

Example kind-cluster2.yaml:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
- role: worker
networking:
  disableDefaultCNI: true
  podSubnet: "10.2.0.0/16"
  serviceSubnet: "10.3.0.0/16"

Create Kind Clusters

We can now create the respective clusters:

kind create cluster --name=cluster1 --config=kind-cluster1.yaml
kind create cluster --name=cluster2 --config=kind-cluster2.yaml

Setting up Cluster Mesh

We can deploy Cilium, and complete setup by following the Cluster Mesh guide with Setting up Cluster Mesh. For Kind, we’ll want to deploy the NodePort service into the kube-system namespace.