Installation Using Kind
This guide uses kind to demonstrate deployment and operation of Cilium in a multi-node Kubernetes cluster running locally on Docker.
Install Dependencies
Install the following dependencies:
- docker stable, as described in Install Docker Engine
- kubectl version >= v1.14.0, as described in the Kubernetes Docs
- helm >= v3.13.0, per the Helm documentation: Installing Helm
- kind >= v0.7.0, per the kind documentation: Installation and Usage
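As a quick sanity check, you can confirm the installed versions (generic version-check commands, shown here as a convenience):
docker version --format '{{.Server.Version}}'
kubectl version --client
helm version --short
kind version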
Configure kind
kind cluster creation is configured using a YAML configuration file. This step is necessary to disable the default CNI so that it can be replaced with Cilium.
Create a kind-config.yaml file based on the following template. It will create a cluster with 3 worker nodes and 1 control-plane node.
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
- role: worker
networking:
disableDefaultCNI: true
By default, the latest version of Kubernetes from when the kind release was created is used. To change the version of Kubernetes being run, image has to be defined for each node. See the Node Configuration documentation for more information.
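For example, a node image can be pinned like this (the tag below is only illustrative; pick an image published for your kind release):
nodes:
- role: control-plane
  image: kindest/node:v1.29.2
- role: worker
  image: kindest/node:v1.29.2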
Tip
By default, kind uses the following pod and service subnets:
Networking.PodSubnet = "10.244.0.0/16"
Networking.ServiceSubnet = "10.96.0.0/12"
If any of these subnets conflicts with your local network address range, update the networking section of the kind configuration file to specify different, non-conflicting subnets; otherwise you risk connectivity issues when deploying Cilium. For example:
networking:
disableDefaultCNI: true
podSubnet: "10.10.0.0/16"
serviceSubnet: "10.11.0.0/16"
Create a cluster
To create a cluster with the configuration defined above, pass the kind-config.yaml you created with the --config flag of kind:
kind create cluster --config=kind-config.yaml
After a couple of seconds or minutes, a 4-node cluster should be created. A new kubectl context (kind-kind) should be added to KUBECONFIG or, if unset, to ${HOME}/.kube/config:
kubectl cluster-info --context kind-kind
Note
The cluster nodes will remain in state NotReady until Cilium is deployed. This behavior is expected.
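You can confirm this with:
kubectl get nodes
All nodes will report a STATUS of NotReady until the Cilium agent pods are up and running.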
Install Cilium
Setup Helm repository:
helm repo add cilium https://helm.cilium.io/
Preload the cilium image into each worker node in the kind cluster:
docker pull quay.io/cilium/cilium:v1.16.3
kind load docker-image quay.io/cilium/cilium:v1.16.3
Then, install Cilium release via Helm:
helm install cilium cilium/cilium --version 1.16.3 \
   --namespace kube-system \
   --set image.pullPolicy=IfNotPresent \
   --set ipam.mode=kubernetes
Note
To enable Cilium’s Socket LB (Kubernetes Without kube-proxy), cgroup v2 needs to be enabled, and Kind nodes need to run in separate cgroup namespaces that differ from the cgroup namespace of the underlying host, so that Cilium can attach BPF programs at the right cgroup hierarchy. To verify this, run the following commands and ensure that the cgroup values are different:
$ docker exec kind-control-plane ls -al /proc/self/ns/cgroup
lrwxrwxrwx 1 root root 0 Jul 20 19:20 /proc/self/ns/cgroup -> 'cgroup:[4026532461]'
$ docker exec kind-worker ls -al /proc/self/ns/cgroup
lrwxrwxrwx 1 root root 0 Jul 20 19:20 /proc/self/ns/cgroup -> 'cgroup:[4026532543]'
$ ls -al /proc/self/ns/cgroup
lrwxrwxrwx 1 root root 0 Jul 19 09:38 /proc/self/ns/cgroup -> 'cgroup:[4026531835]'
One way to enable cgroup v2 is to set the kernel parameter systemd.unified_cgroup_hierarchy=1. To enable cgroup namespaces, the container runtime needs to be configured accordingly. For example, in Docker, dockerd’s --default-cgroupns-mode has to be set to private.
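As a sketch, assuming dockerd is managed through /etc/docker/daemon.json, the equivalent setting could look like this (restart the Docker daemon afterwards):
{
  "default-cgroupns-mode": "private"
}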
Another requirement for the Socket LB on Kind to function properly is that either the cgroup v1 controllers net_cls and net_prio are disabled (or cgroup v1 is disabled altogether, e.g., by setting the kernel parameter cgroup_no_v1="all"), or the host kernel is 5.14 or more recent, which includes this fix. See the Pull Request for more details.
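To check whether the net_cls and net_prio cgroup v1 controllers are active on your host, you can inspect /proc/cgroups; a hierarchy value of 0 indicates the controller is not mounted in a cgroup v1 hierarchy:
grep -E 'net_cls|net_prio' /proc/cgroups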
Validate the Installation
Warning
Make sure you install cilium-cli v0.15.0 or later. The rest of these instructions do not work with older versions of cilium-cli. To confirm the cilium-cli version installed on your system, run:
cilium version --client
See Cilium CLI upgrade notes for more details.
Install the latest version of the Cilium CLI. The Cilium CLI can be used to install Cilium, inspect the state of a Cilium installation, and enable/disable various features (e.g. clustermesh, Hubble).
On Linux:
CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt)
CLI_ARCH=amd64
if [ "$(uname -m)" = "aarch64" ]; then CLI_ARCH=arm64; fi
curl -L --fail --remote-name-all https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}
sha256sum --check cilium-linux-${CLI_ARCH}.tar.gz.sha256sum
sudo tar xzvfC cilium-linux-${CLI_ARCH}.tar.gz /usr/local/bin
rm cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}
On macOS:
CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt)
CLI_ARCH=amd64
if [ "$(uname -m)" = "arm64" ]; then CLI_ARCH=arm64; fi
curl -L --fail --remote-name-all https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-darwin-${CLI_ARCH}.tar.gz{,.sha256sum}
shasum -a 256 -c cilium-darwin-${CLI_ARCH}.tar.gz.sha256sum
sudo tar xzvfC cilium-darwin-${CLI_ARCH}.tar.gz /usr/local/bin
rm cilium-darwin-${CLI_ARCH}.tar.gz{,.sha256sum}
See the full page of releases.
To validate that Cilium has been properly installed, you can run:
$ cilium status --wait
/¯¯\
/¯¯\__/¯¯\ Cilium: OK
\__/¯¯\__/ Operator: OK
/¯¯\__/¯¯\ Hubble: disabled
\__/¯¯\__/ ClusterMesh: disabled
\__/
DaemonSet cilium Desired: 2, Ready: 2/2, Available: 2/2
Deployment cilium-operator Desired: 2, Ready: 2/2, Available: 2/2
Containers: cilium-operator Running: 2
cilium Running: 2
Image versions cilium quay.io/cilium/cilium:v1.9.5: 2
cilium-operator quay.io/cilium/operator-generic:v1.9.5: 2
Run the following command to validate that your cluster has proper network connectivity:
$ cilium connectivity test
ℹ️ Monitor aggregation detected, will skip some flow validation steps
✨ [k8s-cluster] Creating namespace for connectivity check...
(...)
---------------------------------------------------------------------------------------------------------------------
📋 Test Report
---------------------------------------------------------------------------------------------------------------------
✅ 69/69 tests successful (0 warnings)
Note
The connectivity test may fail to deploy due to too many open files in one or more of the pods. If you notice this error, you can increase the inotify resource limits on your host machine (see Pod errors due to “too many open files”).
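One common way to raise these limits is via sysctl on the host; the values below are illustrative and can be adjusted to your environment:
sudo sysctl fs.inotify.max_user_watches=524288
sudo sysctl fs.inotify.max_user_instances=512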
Congratulations! You have a fully functional Kubernetes cluster with Cilium. 🎉
You can monitor Cilium and all required components as they are being installed:
$ kubectl -n kube-system get pods --watch
NAME READY STATUS RESTARTS AGE
cilium-operator-cb4578bc5-q52qk 0/1 Pending 0 8s
cilium-s8w5m 0/1 PodInitializing 0 7s
coredns-86c58d9df4-4g7dd 0/1 ContainerCreating 0 8m57s
coredns-86c58d9df4-4l6b2 0/1 ContainerCreating 0 8m57s
It may take a couple of minutes for all components to come up:
cilium-operator-cb4578bc5-q52qk 1/1 Running 0 4m13s
cilium-s8w5m 1/1 Running 0 4m12s
coredns-86c58d9df4-4g7dd 1/1 Running 0 13m
coredns-86c58d9df4-4l6b2 1/1 Running 0 13m
You can deploy the “connectivity-check” to test connectivity between pods. It is recommended to create a separate namespace for this.
kubectl create ns cilium-test
Deploy the check with:
kubectl apply -n cilium-test -f https://raw.githubusercontent.com/cilium/cilium/1.16.3/examples/kubernetes/connectivity-check/connectivity-check.yaml
It will deploy a series of deployments that use various connectivity paths to connect to each other. The connectivity paths include with and without service load-balancing, and various network policy combinations. The pod name indicates the connectivity variant, and the readiness and liveness gates indicate success or failure of the test:
$ kubectl get pods -n cilium-test
NAME READY STATUS RESTARTS AGE
echo-a-76c5d9bd76-q8d99 1/1 Running 0 66s
echo-b-795c4b4f76-9wrrx 1/1 Running 0 66s
echo-b-host-6b7fc94b7c-xtsff 1/1 Running 0 66s
host-to-b-multi-node-clusterip-85476cd779-bpg4b 1/1 Running 0 66s
host-to-b-multi-node-headless-dc6c44cb5-8jdz8 1/1 Running 0 65s
pod-to-a-79546bc469-rl2qq 1/1 Running 0 66s
pod-to-a-allowed-cnp-58b7f7fb8f-lkq7p 1/1 Running 0 66s
pod-to-a-denied-cnp-6967cb6f7f-7h9fn 1/1 Running 0 66s
pod-to-b-intra-node-nodeport-9b487cf89-6ptrt 1/1 Running 0 65s
pod-to-b-multi-node-clusterip-7db5dfdcf7-jkjpw 1/1 Running 0 66s
pod-to-b-multi-node-headless-7d44b85d69-mtscc 1/1 Running 0 66s
pod-to-b-multi-node-nodeport-7ffc76db7c-rrw82 1/1 Running 0 65s
pod-to-external-1111-d56f47579-d79dz 1/1 Running 0 66s
pod-to-external-fqdn-allow-google-cnp-78986f4bcf-btjn7 1/1 Running 0 66s
Note
If you deploy the connectivity check to a single-node cluster, pods that check multi-node functionality will remain in the Pending state. This is expected since these pods need at least 2 nodes to be scheduled successfully.
Once done with the test, remove the cilium-test namespace:
kubectl delete ns cilium-test
Next Steps
Attaching a Debugger
Cilium’s Kind configuration enables access to Delve debug server instances running in the agent and operator Pods by default. See Debugging to learn how to use it.
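For example, one way to reach the Delve server in an agent Pod is a port-forward; the port below assumes Delve’s default of 2345, which may differ from the port actually exposed by the Kind configuration:
kubectl -n kube-system port-forward ds/cilium 2345:2345
You can then point a Go debugger at localhost:2345.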
Troubleshooting
Unable to contact k8s api-server
In the Cilium agent logs you will see:
level=info msg="Establishing connection to apiserver" host="https://10.96.0.1:443" subsys=k8s
level=error msg="Unable to contact k8s api-server" error="Get https://10.96.0.1:443/api/v1/namespaces/kube-system: dial tcp 10.96.0.1:443: connect: no route to host" ipAddr="https://10.96.0.1:443" subsys=k8s
level=fatal msg="Unable to initialize Kubernetes subsystem" error="unable to create k8s client: unable to create k8s client: Get https://10.96.0.1:443/api/v1/namespaces/kube-system: dial tcp 10.96.0.1:443: connect: no route to host" subsys=daemon
As Kind runs nodes as containers in Docker, they share your host machine’s kernel. If the socket LB wasn’t disabled, the eBPF programs attached by Cilium may be out of date and no longer routing api-server requests to the current kind-control-plane container. Recreating the kind cluster and reinstalling Cilium with the helm command from Install Cilium will detach the inaccurate eBPF programs.
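A minimal way to do this, assuming the kind-config.yaml from earlier in this guide:
kind delete cluster
kind create cluster --config=kind-config.yaml
Then reinstall Cilium with the helm command shown in Install Cilium.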
Crashing Cilium agent pods
Check if Cilium agent pods are crashing with the following logs. This may indicate that you are deploying a kind cluster in an environment where Cilium is already running (for example, in the Cilium development VM). This can also happen if you have other overlapping BPF cgroup type programs attached to the parent cgroup hierarchy of the kind container nodes. In such cases, either tear down Cilium, or manually detach the overlapping BPF cgroup programs running in the parent cgroup hierarchy by following the bpftool documentation (see the example after the log output below). For more information, see the Pull Request.
level=warning msg="+ bpftool cgroup attach /var/run/cilium/cgroupv2 connect6 pinned /sys/fs/bpf/tc/globals/cilium_cgroups_connect6" subsys=datapath-loader
level=warning msg="Error: failed to attach program" subsys=datapath-loader
level=warning msg="+ RETCODE=255" subsys=datapath-loader
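To inspect and detach overlapping programs manually, bpftool can list cgroup attachments and detach a specific program; the cgroup path and program ID below are illustrative only:
bpftool cgroup tree /sys/fs/cgroup
bpftool cgroup detach /sys/fs/cgroup connect6 id 1234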
Cluster Mesh
With Kind, we can simulate Cluster Mesh in a sandbox too.
Kind Configuration
This time we need to create two config.yaml files, one for each Kubernetes cluster. We will explicitly configure their pod-network-cidr and service-cidr so that they do not overlap.
Example kind-cluster1.yaml:
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
- role: worker
networking:
disableDefaultCNI: true
podSubnet: "10.0.0.0/16"
serviceSubnet: "10.1.0.0/16"
Example kind-cluster2.yaml:
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
- role: worker
networking:
disableDefaultCNI: true
podSubnet: "10.2.0.0/16"
serviceSubnet: "10.3.0.0/16"
Create Kind Clusters
We can now create the respective clusters:
kind create cluster --name=cluster1 --config=kind-cluster1.yaml
kind create cluster --name=cluster2 --config=kind-cluster2.yaml
Setting up Cluster Mesh
We can deploy Cilium and complete the setup by following the Cluster Mesh guide: Setting up Cluster Mesh. For Kind, we’ll want to deploy the NodePort service into the kube-system namespace.
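As a rough sketch, assuming the kubectl contexts are kind-cluster1 and kind-cluster2, Cilium could be installed into each cluster with a distinct cluster name and ID (the values shown are illustrative; the full set of Cluster Mesh options is covered in the guide linked above):
helm install cilium cilium/cilium --version 1.16.3 \
   --kube-context kind-cluster1 \
   --namespace kube-system \
   --set ipam.mode=kubernetes \
   --set cluster.name=cluster1 \
   --set cluster.id=1
helm install cilium cilium/cilium --version 1.16.3 \
   --kube-context kind-cluster2 \
   --namespace kube-system \
   --set ipam.mode=kubernetes \
   --set cluster.name=cluster2 \
   --set cluster.id=2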