Complete Kubernetes Cluster Setup Guide with Cilium on Arch Linux

This comprehensive guide walks you through deploying a production-ready Kubernetes cluster on Arch Linux using Cilium as the Container Network Interface (CNI), ingress controller, and load balancer. We’ll also set up automated NFS storage provisioning, leaving the cluster ready for workloads such as GitLab with CI/CD pipelines.

Architecture Overview

Our cluster deployment includes:

  • 3 Kubernetes nodes: 1 control plane + 2 worker nodes
  • 1 NFS server: For persistent storage with automatic provisioning
  • Cilium networking: Replaces kube-proxy with advanced networking features
  • Integrated load balancing: L2 announcements for external IP management
  • TLS ingress capabilities: SSL/TLS termination with certificate management
  • Storage automation: Dynamic persistent volume provisioning

Key Features:

  • No kube-proxy dependency (Cilium handles everything)
  • Advanced networking with eBPF
  • Built-in ingress controller and load balancer
  • Automatic NFS storage provisioning
  • Production-ready security configurations

Prerequisites

Before starting, ensure you have:

  • Basic knowledge of Kubernetes concepts
  • Understanding of Linux networking and storage
  • 4 virtual or physical machines ready for Arch Linux installation
  • Network access between all machines
  • Administrative privileges on all systems

Resource Requirements:

  • Kubernetes nodes: 8GB RAM, 4 CPU cores, 20GB storage each
  • NFS server: 1GB RAM, 1 CPU core, 200GB storage
  • Network: Private subnet with sufficient IP addresses

Virtual Environment Configuration

We’ll start by setting up the infrastructure for our cluster deployment.

Network Setup

Create a virtual network with the IP range 22.22.22.0/24. This provides ample address space for our cluster and future expansion.
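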
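The guide is hypervisor-agnostic. As one possible sketch, assuming libvirt/KVM, the network could be defined like this (the network name, bridge name, and gateway address are illustrative):

cat > ak8s-net.xml <<'EOF'
<network>
  <name>ak8s-net</name>
  <forward mode='nat'/>
  <bridge name='virbr-ak8s' stp='on' delay='0'/>
  <ip address='22.22.22.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='22.22.22.100' end='22.22.22.200'/>
    </dhcp>
  </ip>
</network>
EOF
sudo virsh net-define ak8s-net.xml   # register the network with libvirt
sudo virsh net-start ak8s-net        # bring it up now
sudo virsh net-autostart ak8s-net    # start it on boot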

Arch Linux Template Preparation

We’ll create a base Arch Linux template with pre-configured SSH keys that can be cloned for all our machines. This approach ensures consistency and speeds up deployment.

Installation Process:

  1. Initial Setup Commands:
root@archiso ~ # pacman -Sy --noconfirm --needed git wget
root@archiso ~ # git clone https://github.com/hibrit/ais
root@archiso ~ # cd ais
root@archiso ~/ais (git)-[main] # ./1_update_mirrors.sh
  2. Disk Partitioning:
root@archiso ~/ais (git)-[main] # cfdisk /dev/vda
# Create a single partition covering the entire disk
root@archiso ~/ais (git)-[main] # mkfs.ext4 /dev/vda1
root@archiso ~/ais (git)-[main] # mount /dev/vda1 /mnt
  3. System Installation:
root@archiso ~/ais (git)-[main] # ./2_install_base_system.sh
[root@archiso /]# cd ais
[root@archiso ais]# vim ./3_install_system.sh
# Edit the script for minimal installation and correct GRUB configuration
  4. Post-Installation Configuration: After reboot, configure networking using nmtui and enable SSH access:
[flouda@ak8s ~]$ cd Documents/ais
[flouda@ak8s ais]$ ./5_install_yay.sh
[flouda@ak8s ais]$ vim 7_terminal_setup.sh
# Edit for minimal server installation (remove GUI components)
[flouda@ak8s ais]$ ./7_terminal_setup.sh
[flouda@ak8s ais]$ systemctl enable --now sshd.service
  5. SSH Key Distribution:
ssh-copy-id -i ~/.ssh/id_rsa flouda@template.ak8s.net

Once the template is ready, create snapshots and clone 4 instances for your cluster.
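
Again assuming libvirt, cloning the template could look like the following (the template and clone names are illustrative; give each clone its own hostname and static IP afterwards):

for vm in ak8s1 ak8s2 ak8s3 nfs; do
    sudo virt-clone --original arch-template --name "$vm" --auto-clone
done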

NFS Server Configuration

The NFS server provides centralized storage for persistent volumes with automatic provisioning capabilities.

Storage Setup

  1. Add Storage Disk: Attach a 200GB disk to your NFS server and partition it:
sudo cfdisk /dev/vdb
# Create a single partition
sudo mkfs.ext4 /dev/vdb1
  2. Persistent Mount Configuration: Get the partition UUID and configure automatic mounting:
lsblk -f
# Note the UUID of /dev/vdb1
sudo vim /etc/fstab

Add this line to /etc/fstab:

UUID=your-uuid-here /data ext4 defaults 0 2
  3. Mount Point Setup:
sudo mkdir /data
sudo mount -a
sudo chown nobody:nobody -R /data
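
A quick sanity check that the data volume is mounted as expected:

findmnt /data   # show the mounted filesystem and its source partition
df -h /data     # confirm the expected capacity is available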

NFS Service Configuration

  1. Install NFS Utilities:
sudo pacman -Sy --noconfirm --needed nfs-utils
  2. Configure Exports: Edit /etc/exports to share the storage:
sudo vim /etc/exports

Add the following configuration:

/data 22.22.22.0/24(rw,sync,no_subtree_check,no_root_squash)
  3. Start NFS Services:
sudo systemctl enable --now nfs-server.service
sudo exportfs -av

Security Note: The no_root_squash option is used for Kubernetes compatibility but should be carefully considered in production environments.

Kubernetes Node Preparation

Configure all three Kubernetes nodes with the required system modifications and package installations.

System Configuration

Apply these configurations to all Kubernetes nodes:

  1. Kernel Module Loading:
sudo modprobe overlay
sudo modprobe br_netfilter
  2. Persistent Module Configuration: Create /etc/modules-load.d/k8s.conf:
overlay
br_netfilter
  3. Network Parameter Configuration: Create /etc/sysctl.d/k8s.conf:
net.bridge.bridge-nf-call-iptables  = 1
net.ipv4.ip_forward                 = 1

Apply the settings:

sudo sysctl --system
  4. Verification:
lsmod | grep -E "(overlay|br_netfilter)"
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward

Package Installation

Control Plane Node:

sudo pacman -Sy containerd kubelet kubeadm kubectl cni-plugins helm cilium-cli

Worker Nodes:

sudo pacman -Sy containerd kubelet kubeadm kubectl cni-plugins

Container Runtime Configuration

Configure containerd on all nodes (for detailed information about containerd alternatives and rootless configurations, see our comprehensive containerd setup guide):

  1. Generate Default Configuration:
sudo mkdir /etc/containerd
sudo bash -c "containerd config default > /etc/containerd/config.toml"
  2. Critical Configuration Changes: Edit /etc/containerd/config.toml and modify the following (a scripted alternative is sketched after this list):
  • Set SystemdCgroup = true (so containerd uses the systemd cgroup driver expected by the kubelet)
  • Set sandbox_image = "registry.k8s.io/pause:3.9" (match the pause image reported by kubeadm config images list)
  3. Service Management:
sudo systemctl enable --now containerd
sudo systemctl enable --now kubelet.service
  4. Pre-pull Images:
sudo kubeadm config images pull
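
If you prefer to script the two edits from step 2 rather than editing the file by hand, here is a sed-based sketch; verify the result afterwards, since containerd's default configuration layout can change between versions:

sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sudo sed -i 's|sandbox_image = ".*"|sandbox_image = "registry.k8s.io/pause:3.9"|' /etc/containerd/config.toml
sudo systemctl restart containerd
grep -E 'SystemdCgroup|sandbox_image' /etc/containerd/config.toml   # confirm both values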

Cluster Initialization

Control Plane Setup

Initialize the Kubernetes cluster on your control plane node:

sudo kubeadm init --pod-network-cidr='10.85.0.0/16' --skip-phases=addon/kube-proxy

Important Parameters:

  • --pod-network-cidr: Defines the IP range for pod networking
  • --skip-phases=addon/kube-proxy: We’ll use Cilium instead of kube-proxy

After initialization, kubeadm will provide two critical commands:

  1. Configure kubectl access (run on control plane):
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
  2. Join worker nodes (run on each worker):
kubeadm join <control-plane-ip>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>
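
If you misplace the join command or the bootstrap token expires (tokens are valid for 24 hours by default), a fresh one can be generated on the control plane:

kubeadm token create --print-join-command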

Node Verification and Labeling

  1. Check Node Status:
kubectl get nodes -o wide

Nodes will show “NotReady” status until networking is configured.

  2. Label Worker Nodes:
kubectl label node ak8s2 node-role.kubernetes.io/worker=worker
kubectl label node ak8s3 node-role.kubernetes.io/worker=worker

Cilium Installation and Configuration

Cilium provides advanced networking capabilities, replacing kube-proxy with eBPF-based networking.

Install Cilium

Deploy Cilium with comprehensive features enabled:

cilium install \
    --set kubeProxyReplacement=true \
    --set ingressController.enabled=true \
    --set ingressController.loadbalancerMode=dedicated \
    --set ingressController.default=true \
    --set l2announcements.enabled=true \
    --set k8sClientRateLimit.qps=30 \
    --set k8sClientRateLimit.burst=50 \
    --set externalIPs.enabled=true \
    --set devices=enp1s0

Configuration Explanation:

  • kubeProxyReplacement=true: Replace kube-proxy with Cilium
  • ingressController.enabled=true: Enable built-in ingress controller
  • ingressController.loadbalancerMode=dedicated: Give each Ingress its own dedicated LoadBalancer service
  • ingressController.default=true: Use Cilium as the default IngressClass
  • l2announcements.enabled=true: Enable Layer 2 IP announcements
  • k8sClientRateLimit.qps/burst: Raise the Kubernetes API client rate limits, since L2 announcements add lease-renewal traffic
  • externalIPs.enabled=true: Support for external IP addresses
  • devices=enp1s0: Network interface for L2 announcements

Monitor Deployment

watch kubectl -n kube-system get pod

Wait for all Cilium pods to reach “Running” status.
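
The Cilium CLI can also report overall health and, optionally, run an end-to-end check (the connectivity test creates temporary workloads in the cluster):

cilium status --wait        # block until Cilium reports a healthy state
cilium connectivity test    # optional end-to-end validation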

Load Balancer IP Pool Configuration

Create an IP pool for load balancer services:

apiVersion: "cilium.io/v2alpha1"
kind: CiliumLoadBalancerIPPool
metadata:
    name: "ak8s-pool"
    namespace: kube-system
spec:
    cidrs:
        - cidr: "22.22.22.0/24"

L2 Announcement Policy

Configure which nodes announce external IPs:

apiVersion: "cilium.io/v2alpha1"
kind: CiliumL2AnnouncementPolicy
metadata:
    name: ak8s-policy
    namespace: kube-system
spec:
    nodeSelector:
        matchExpressions:
            - key: node-role.kubernetes.io/control-plane
              operator: DoesNotExist
    interfaces:
        - enp1s0
    externalIPs: true
    loadBalancerIPs: true

Apply both configurations:

kubectl apply -f cilium-ip-pool.yaml
kubectl apply -f cilium-l2-policy.yaml
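
To confirm both resources were accepted and that announcements are active, list them and look for the cilium-l2announce-* leases that Cilium creates per announced service:

kubectl get ciliumloadbalancerippools
kubectl get ciliuml2announcementpolicies
kubectl -n kube-system get lease | grep cilium-l2announce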

Cluster Validation

Test your cluster functionality with a complete ingress setup.

TLS Certificate Generation

Create a self-signed certificate for testing:

openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout cert.key -out cert.crt -subj "/CN=test.ak8s.net"
kubectl create secret tls web-tls --cert=./cert.crt --key=./cert.key

Test Application Deployment

  1. Deploy Test Pod:
apiVersion: v1
kind: Pod
metadata:
    labels:
        run: web
    name: web
spec:
    containers:
        - image: nginx:latest
          name: web
          ports:
          - containerPort: 80
    restartPolicy: Always
  2. Create Service:
apiVersion: v1
kind: Service
metadata:
    labels:
        run: web
    name: web
spec:
    ports:
        - port: 80
          protocol: TCP
          targetPort: 80
    selector:
        run: web
    type: ClusterIP
  3. Configure Ingress:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
    name: web-ingress
spec:
    tls:
        - hosts:
              - test.ak8s.net
          secretName: web-tls
    rules:
        - host: test.ak8s.net
          http:
              paths:
                  - path: /
                    pathType: Prefix
                    backend:
                        service:
                            name: web
                            port:
                                number: 80
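
Save the three manifests and apply them (the file names below are only examples):

kubectl apply -f web-pod.yaml
kubectl apply -f web-service.yaml
kubectl apply -f web-ingress.yaml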

Verification Steps

  1. Check Resources:
kubectl get pod,svc,ingress
  2. Note External IP: Check the EXTERNAL-IP assigned to your ingress and add it to your /etc/hosts file:
<external-ip> test.ak8s.net
  3. Test Access: Navigate to https://test.ak8s.net in your browser. You should see the default Nginx page with a self-signed certificate warning.

NFS Storage Provisioning

Implement automatic persistent volume provisioning using NFS.

Install NFS Provisioner

Use Helm to deploy the NFS subdir external provisioner:

helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
 
helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
    --namespace kube-system \
    --set nfs.server=nfs.ak8s.net \
    --set nfs.path=/data \
    --set storageClass.defaultClass=true \
    --set storageClass.onDelete=delete \
    --set storageClass.accessModes=ReadWriteMany

Configuration Parameters:

  • nfs.server: FQDN or IP of your NFS server
  • nfs.path: Export path on the NFS server
  • storageClass.defaultClass=true: Makes this the default storage class
  • storageClass.onDelete=delete: Removes data when PVCs are deleted
  • accessModes=ReadWriteMany: Allows multiple pods to mount the same volume

Verify Storage Class

kubectl get storageclass

You should see nfs-client marked as the default storage class.

Test Persistent Volume Claims

Create a test PVC to verify automatic provisioning:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
    name: test-pvc
spec:
    accessModes:
        - ReadWriteMany
    resources:
        requests:
            storage: 1Gi

After applying this configuration:

kubectl apply -f test-pvc.yaml
kubectl get pv,pvc

You should see a persistent volume automatically created and bound to your claim.
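
To see the claim in use, a throwaway pod can mount it and write a file; this is a minimal sketch, and the pod name, image, and paths are illustrative. The file should then appear in the subdirectory that the provisioner created under /data on the NFS server.

apiVersion: v1
kind: Pod
metadata:
    name: pvc-test
spec:
    containers:
        - name: writer
          image: busybox:latest
          command: ["sh", "-c", "echo hello > /mnt/data/hello.txt && sleep 3600"]
          volumeMounts:
              - name: data
                mountPath: /mnt/data
    volumes:
        - name: data
          persistentVolumeClaim:
              claimName: test-pvc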

Advanced Configuration and Optimization

Performance Tuning

Cilium Optimization:

  • Adjust k8sClientRateLimit values based on cluster size
  • Configure CPU and memory limits for Cilium pods
  • Enable Cilium Hubble for network observability

Storage Optimization:

  • Configure NFS mount options for better performance (a hedged example follows this list)
  • Consider using faster storage backend for NFS server
  • Implement backup strategies for persistent data
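
As a hedged example of tuning mount options, the provisioner chart exposes an nfs.mountOptions value (confirm against the chart's values.yaml for your version). It can be supplied through a values file, ideally at install time with -f; if the upgrade below is rejected, you may need to delete and recreate the storage class, since many StorageClass fields are immutable:

cat > nfs-values.yaml <<'EOF'
nfs:
    mountOptions:
        - nfsvers=4.1
        - noatime
EOF
helm upgrade nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
    --namespace kube-system --reuse-values -f nfs-values.yaml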

Security Hardening

Network Policies: Implement Cilium network policies to restrict pod-to-pod communication:

apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
    name: "default-deny"
spec:
    endpointSelector: {}
    egress:
    - toEndpoints:
      - matchLabels:
          k8s:io.kubernetes.pod.namespace: kube-system

RBAC Configuration: Implement proper Role-Based Access Control for cluster security.
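
As a small illustration (namespace, role, and user names are placeholders), a namespaced read-only role can be created imperatively:

kubectl create namespace dev
kubectl create role pod-reader --verb=get,list,watch --resource=pods -n dev
kubectl create rolebinding pod-reader-binding --role=pod-reader --user=developer1 -n dev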

Monitoring and Logging

Cilium Hubble: Enable network observability:

cilium hubble enable --ui
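
Once enabled, the UI can be opened through the Cilium CLI, and individual flows can be inspected with the separate hubble client, assuming it is installed:

cilium hubble ui                # port-forward to the Hubble UI and open it
cilium hubble port-forward &    # expose the Hubble Relay API locally
hubble observe --namespace default --follow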

Metrics Collection: Configure Prometheus and Grafana for cluster monitoring (see our Prometheus and Grafana deployment guide).

Troubleshooting Common Issues

Network Connectivity Problems

Symptom: Pods cannot communicate across nodes
Solution:

  1. Verify Cilium pods are running: kubectl -n kube-system get pods -l k8s-app=cilium
  2. Check Cilium status: cilium status
  3. Verify L2 announcement leases: kubectl -n kube-system get lease | grep cilium-l2announce

Symptom: External IPs not accessible
Solution:

  1. Check CiliumLoadBalancerIPPool configuration
  2. Verify L2AnnouncementPolicy targets correct nodes
  3. Ensure network interface is correctly specified

Storage Issues

Symptom: PVCs stuck in Pending state
Solution:

  1. Check NFS server connectivity: showmount -e nfs.ak8s.net
  2. Verify NFS provisioner pods: kubectl -n kube-system get pods -l app=nfs-subdir-external-provisioner
  3. Check storage class configuration: kubectl describe storageclass nfs-client

Symptom: NFS mount failures
Solution:

  1. Verify NFS exports: sudo exportfs -av
  2. Check firewall rules on NFS server
  3. Test manual mount: sudo mount -t nfs nfs.ak8s.net:/data /mnt/test

Certificate and TLS Issues

Symptom: Ingress TLS not working
Solution:

  1. Verify certificate secret exists: kubectl get secrets
  2. Check ingress controller logs: kubectl -n kube-system logs -l k8s-app=cilium
  3. Validate certificate: openssl x509 -in cert.crt -text -noout

Next Steps and Integration

With your cluster fully operational, you can proceed with:

  1. GitLab Deployment: Follow our GitLab deployment guide to set up version control and CI/CD
  2. Container Registry: Deploy a private Docker registry for custom images
  3. Monitoring Stack: Implement comprehensive monitoring and alerting
  4. Backup Solutions: Set up automated backup strategies for persistent data
  5. High Availability: Expand to multiple control plane nodes for production resilience


Questions Answered in This Document

Q: How do I set up a Kubernetes cluster without kube-proxy? A: Use Cilium as a complete replacement for kube-proxy by installing it with kubeProxyReplacement=true. Cilium provides all networking functionality using eBPF technology, offering better performance and advanced features.

Q: What are the hardware requirements for a Cilium-based Kubernetes cluster? A: For a basic setup, use 8GB RAM and 4 CPU cores per Kubernetes node, plus 1GB RAM and 1 CPU core for the NFS server. The NFS server needs substantial storage (200GB+) for persistent volumes.

Q: How does Cilium handle load balancing and ingress traffic? A: Cilium includes built-in ingress controller and load balancer capabilities using L2 announcements. Configure CiliumLoadBalancerIPPool for IP allocation and CiliumL2AnnouncementPolicy to control which nodes announce external IPs.

Q: Can I use NFS for Kubernetes persistent storage in production? A: Yes, with proper configuration. Use the NFS subdir external provisioner for automatic volume provisioning, ensure NFS server redundancy, implement proper backup strategies, and consider performance implications for high-throughput applications.

Q: How do I troubleshoot Cilium networking issues? A: Use cilium status to check overall health, cilium connectivity test for network validation, and kubectl -n kube-system logs -l k8s-app=cilium for detailed logs. Enable Hubble for network observability and troubleshooting.

Q: What makes Cilium better than traditional Kubernetes networking? A: Cilium uses eBPF technology for faster packet processing, provides built-in ingress and load balancing, offers advanced security policies, includes network observability tools, and eliminates the need for kube-proxy.

Q: How do I secure my Kubernetes cluster with Cilium? A: Implement Cilium network policies for micro-segmentation, use proper RBAC configurations, enable TLS for all communications, secure the NFS server with appropriate export permissions, and regularly update all cluster components.

Q: Can this setup handle production workloads? A: This setup provides a solid foundation for production use. For full production readiness, add multiple control plane nodes for high availability, implement comprehensive monitoring and alerting, set up automated backups, and establish proper disaster recovery procedures.