RKE2 with Cilium CNI - Production-Ready Configuration Guide

This comprehensive guide covers deploying a production-ready Kubernetes cluster using RKE2 (Rancher Kubernetes Engine 2) with Cilium as the Container Network Interface (CNI). This configuration provides advanced networking capabilities, built-in load balancing, ingress control, and observability features through Hubble.

Overview

RKE2 is Rancher’s next-generation Kubernetes distribution that focuses on security and compliance. When combined with Cilium CNI, it provides a powerful networking solution that can replace kube-proxy while offering advanced features like L2 announcements, ingress control, and comprehensive network observability.

This setup is ideal for production environments that need:

  • High-performance networking without kube-proxy overhead
  • Built-in ingress controller capabilities
  • Advanced load balancing and service mesh features
  • Comprehensive network observability and monitoring
  • Simplified cluster management

Prerequisites

Before proceeding with this configuration, ensure you have:

  • A Linux server with root access (minimum 4GB RAM, 2 CPU cores)
  • Network connectivity between nodes (if multi-node setup)
  • Helm 3.x installed on your system
  • Basic understanding of Kubernetes concepts
  • Container runtime properly configured (for alternatives to Docker, see our containerd setup guide)

For package management on Arch Linux systems, refer to our pacman cheatsheet.

Core Components

RKE2 Server Configuration

The RKE2 server requires specific configuration to work optimally with Cilium. The key aspects include disabling the default CNI, kube-proxy, and ingress controller to allow Cilium to handle these functions.

Create the RKE2 configuration file at /etc/rancher/rke2/config.yaml:

write-kubeconfig-mode: "0644"
tls-san:
  - "10.10.10.55"
  - "cluster.lab.net"
cni: "none"
disable-kube-proxy: "true"
disable-cloud-controller: "true"
cluster-domain: "cluster.lab.net"
node-ip: "10.10.10.55"
node-name: "master01"
cluster-cidr: "10.40.0.0/16"
service-cidr: "10.41.0.0/16"
cluster-dns: "10.41.0.10"
disable:
  - rke2-ingress-nginx

Configuration Breakdown:

  • write-kubeconfig-mode: "0644" - Sets appropriate permissions for kubeconfig file
  • tls-san - Subject Alternative Names for TLS certificates (include your cluster IP and domain)
  • cni: "none" - Disables default CNI to allow Cilium installation
  • disable-kube-proxy: "true" - Allows Cilium to replace kube-proxy functionality
  • disable-cloud-controller: "true" - Disables cloud controller for on-premises setups
  • cluster-domain - Sets the cluster’s internal domain name
  • node-ip - Specifies the IP address for this node
  • cluster-cidr - Pod IP range (adjust based on your network requirements)
  • service-cidr - Service IP range (must not overlap with cluster-cidr)
  • cluster-dns - DNS service IP; conventionally the tenth address of the service-cidr range (here 10.41.0.10)
  • disable: [rke2-ingress-nginx] - Disables default ingress controller
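Note that RKE2 reads this file only at service startup, so any change requires restarting the rke2-server service to take effect.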

Cilium Helm Installation

Once RKE2 is installed and running, deploy Cilium using Helm with the following configuration:

# Add Cilium Helm repository
helm repo add cilium https://helm.cilium.io
helm repo update
 
# Install Cilium with comprehensive feature set
helm upgrade --install cilium cilium/cilium --version 1.17.5 \
    --namespace kube-system \
    --reuse-values \
    --set ingressController.enabled=true \
    --set ingressController.default=true \
    --set ingressController.loadbalancerMode=shared \
    --set kubeProxyReplacement=true \
    --set k8sServiceHost=10.10.10.55 \
    --set k8sServicePort=6443 \
    --set l2announcements.enabled=true \
    --set k8sClientRateLimit.qps=30 \
    --set k8sClientRateLimit.burst=50 \
    --set externalIPs.enabled=true \
    --set operator.replicas=1 \
    --set hubble.relay.enabled=true \
    --set hubble.ui.enabled=true \
    --set hubble.enabled=true \
    --set gatewayAPI.enabled=true

Cilium Configuration Options Explained:

Core Networking:

  • kubeProxyReplacement=true - Enables Cilium to replace kube-proxy entirely
  • k8sServiceHost/Port - Kubernetes API server endpoint for direct communication
  • externalIPs.enabled=true - Allows services to use external IP addresses

Ingress and Load Balancing:

  • ingressController.enabled=true - Enables built-in ingress controller
  • ingressController.default=true - Makes Cilium the default ingress class
  • ingressController.loadbalancerMode=shared - Enables shared load balancer mode
  • l2announcements.enabled=true - Enables L2 network announcements for load balancing

Performance and Scaling:

  • k8sClientRateLimit.qps=30 - API server query rate limit
  • k8sClientRateLimit.burst=50 - API server burst rate limit
  • operator.replicas=1 - Number of Cilium operator replicas

Observability:

  • hubble.enabled=true - Enables Hubble observability platform
  • hubble.relay.enabled=true - Enables Hubble relay for cluster-wide visibility
  • hubble.ui.enabled=true - Enables web-based Hubble UI

Advanced Features:

  • gatewayAPI.enabled=true - Enables Kubernetes Gateway API support
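One caveat: Cilium only activates Gateway API support if the Gateway API CRDs are already present when the agent starts, so install them beforehand (or restart the Cilium agents afterwards). One common way to install them; the release version here is only an example:

kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.2.0/standard-install.yaml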

Deployment Process

Step 1: Install RKE2 Server and Agent

  1. Download and install RKE2:
# for server installations
curl -sfL https://get.rke2.io | sh -
 
# for agent installations
curl -sfL https://get.rke2.io | INSTALL_RKE2_TYPE="agent" sh -
  2. Create the configuration directory:
mkdir -p /etc/rancher/rke2
  3. Create the configuration file with the YAML content above


  4. Start and enable RKE2:

# for servers
systemctl enable --now rke2-server.service
# for agents
systemctl enable --now rke2-agent.service
  5. Set up kubectl access:
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
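On agent nodes, point RKE2 at the server before starting rke2-agent. A minimal /etc/rancher/rke2/config.yaml for an agent looks like the following; the join token is read from /var/lib/rancher/rke2/server/node-token on the server, 9345 is RKE2's supervisor port, and the node name is a placeholder:

server: "https://10.10.10.55:9345"
token: "<contents of /var/lib/rancher/rke2/server/node-token>"
node-name: "worker01"

RKE2 also bundles its own kubectl; if kubectl is not already installed, add it to your PATH:

export PATH=$PATH:/var/lib/rancher/rke2/bin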

Step 2: Install Helm

curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash

Step 3: Deploy Cilium

Execute the Helm command from the Core Components section above.
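Once the chart is applied, it can take a minute or two for the agents to settle. A quick way to wait for the rollout, assuming the chart's default resource names:

# Wait for the Cilium agent DaemonSet and the operator to become ready
kubectl -n kube-system rollout status ds/cilium --timeout=300s
kubectl -n kube-system rollout status deploy/cilium-operator --timeout=300s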


Step 4: Verify Installation

# Check Cilium status
kubectl get pods -n kube-system -l app.kubernetes.io/name=cilium
 
# Verify Cilium connectivity
kubectl exec -n kube-system -ti ds/cilium -- cilium-health status
 
# Check Hubble status
kubectl get pods -n kube-system -l k8s-app=hubble-relay

Advanced Configuration Options

Container Runtime Integration

This RKE2 setup works seamlessly with various container runtimes. While RKE2 includes containerd by default, you can also configure alternative runtimes. For detailed container runtime setup and management, including rootless configurations, see our containerd setup guide.

L2 Announcements Configuration

For environments requiring L2 load balancing, create a CiliumL2AnnouncementPolicy:

apiVersion: "cilium.io/v2alpha1"
kind: CiliumL2AnnouncementPolicy
metadata:
  name: policy1
spec:
  serviceSelector:
    matchLabels:
      color: blue
  nodeSelector:
    matchExpressions:
      - key: node-role.kubernetes.io/control-plane
        operator: DoesNotExist
  interfaces:
  - ^eth[0-9]+
  externalIPs: true
  loadBalancerIPs: true

IP Pool Configuration

The CiliumLoadBalancerIPPool CRD defines the pools of addresses from which Cilium assigns IPs to LoadBalancer services:

apiVersion: "cilium.io/v2alpha1"
kind: CiliumLoadBalancerIPPool
metadata:
  name: "blue-pool"
spec:
  blocks:
  - cidr: "10.0.10.0/24"
  - cidr: "2004::0/64"
  - start: "20.0.20.100"
    stop: "20.0.20.200"
  - start: "1.2.3.4"
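Tying the two together: a LoadBalancer Service labeled color: blue matches the serviceSelector of the policy above and is announced on the selected nodes, while blue-pool (which has no serviceSelector and therefore applies to all LoadBalancer services) supplies its external IP. A hypothetical example, with name, selector, and ports as placeholders:

apiVersion: v1
kind: Service
metadata:
  name: demo-lb
  labels:
    color: blue
spec:
  type: LoadBalancer
  selector:
    app: demo
  ports:
  - port: 80
    targetPort: 8080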

Hubble Observability Access

Access the Hubble UI for network observability:

# Port forward to Hubble UI
kubectl port-forward -n kube-system svc/hubble-ui 12000:80
 
# Access at http://localhost:12000
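If the standalone cilium CLI is installed locally, it can set up the port-forward for you:

# Alternative: let the cilium CLI open the Hubble UI
cilium hubble ui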

Troubleshooting

Common Issues and Solutions

Cilium pods not starting:

  • Check that CNI is disabled in RKE2 configuration
  • Verify cluster CIDR ranges don’t conflict
  • Ensure sufficient system resources

Ingress not working:

  • Verify ingress controller is enabled and running
  • Check that the ingress class is set to ‘cilium’ (a minimal example follows this list)
  • Confirm L2 announcements are properly configured
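A minimal Ingress using the Cilium ingress class; the hostname and backend service are placeholders:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
spec:
  ingressClassName: cilium
  rules:
  - host: app.cluster.lab.net
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: example-service
            port:
              number: 80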

Network connectivity issues:

  • Run the connectivity suite with the standalone Cilium CLI from a machine with cluster access: cilium connectivity test (the suite is part of the cilium CLI, not the agent pod’s binary)
  • Check Hubble for network policy violations (see the dropped-flows example below)
  • Verify firewall rules allow required ports
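A quick way to surface policy drops from the local agent, using the hubble client bundled in the agent pod:

# Show the most recent dropped flows seen by this agent
kubectl exec -n kube-system -ti ds/cilium -- hubble observe --verdict DROPPED --last 20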

Useful Commands

# Check Cilium agent status
kubectl exec -n kube-system -ti ds/cilium -- cilium status
 
# View Cilium configuration
kubectl exec -n kube-system -ti ds/cilium -- cilium config
 
# Monitor real-time network flows
kubectl exec -n kube-system -ti ds/cilium -- hubble observe
 
# Check load balancer services
kubectl get svc -o wide

Performance Optimization

Network Performance Tuning

For high-performance networking, consider switching to native routing with these additional Helm values (the tunnel=disabled flag from older guides was replaced by routingMode=native in Cilium 1.15):

--set routingMode=native \
--set autoDirectNodeRoutes=true \
--set ipv4NativeRoutingCIDR=10.40.0.0/16 \
--set endpointRoutes.enabled=true
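Native routing with autoDirectNodeRoutes assumes all nodes share an L2 segment (or otherwise have routes to each other’s pod CIDRs); if that is not the case, keep the default tunneling mode.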

Resource Limits

For production deployments, set appropriate resource limits:

--set resources.requests.cpu=100m \
--set resources.requests.memory=128Mi \
--set resources.limits.cpu=4000m \
--set resources.limits.memory=4Gi
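These values size the Cilium agent DaemonSet; the operator is sized separately through the chart’s operator.resources values.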

Security Considerations

  • Ensure proper network policies are in place
  • Use TLS for all communications
  • Regularly update Cilium and RKE2 versions
  • Monitor cluster activity through Hubble
  • Implement proper RBAC policies
  • Configure firewall rules appropriately (see our UFW firewall guide)
  • Consider container runtime security best practices (detailed in our containerd guide)

Integration with Existing Infrastructure

Container Runtime Considerations

RKE2 includes containerd as the default container runtime, but understanding container runtime concepts is crucial for advanced configurations. For comprehensive container runtime management, including rootless setups and Docker alternatives, refer to our containerd setup guide.

This RKE2 with Cilium setup provides an excellent foundation for various Kubernetes workloads. For specific deployment guides, check our other Kubernetes resources:

  • Container registry deployment
  • Monitoring and observability stack setup
  • GitLab deployment on Kubernetes

Network Integration

The network configuration in this guide integrates well with existing infrastructure. For additional networking tools and configurations, see our UFW firewall cheatsheet for securing your cluster.

References and Resources

Questions Answered in This Document

Q: What is RKE2 and why use it with Cilium? A: RKE2 is Rancher’s next-generation Kubernetes distribution focusing on security and compliance. Combined with Cilium CNI, it provides advanced networking capabilities including kube-proxy replacement, built-in ingress control, and comprehensive observability.

Q: How do I disable the default CNI in RKE2? A: Set cni: "none" in the RKE2 configuration file (/etc/rancher/rke2/config.yaml) to disable the default CNI plugin and allow Cilium installation.

Q: What does kubeProxyReplacement do in Cilium? A: When enabled, Cilium completely replaces kube-proxy functionality, providing better performance and additional features like direct server return and advanced load balancing.

Q: How do I enable Hubble for network observability? A: Enable Hubble by setting --set hubble.enabled=true, --set hubble.relay.enabled=true, and --set hubble.ui.enabled=true in the Helm installation command.

Q: What are L2 announcements in Cilium? A: L2 announcements enable Cilium to advertise service IPs at the L2 network level, providing load balancing capabilities without requiring external load balancers.

Q: How do I access the Hubble UI? A: Use port forwarding: kubectl port-forward -n kube-system svc/hubble-ui 12000:80 and access http://localhost:12000 in your browser.

Q: What IP ranges should I use for cluster-cidr and service-cidr? A: Use non-overlapping private IP ranges. Common examples: cluster-cidr: "10.40.0.0/16" for pods, service-cidr: "10.41.0.0/16" for services.

Q: How do I troubleshoot Cilium connectivity issues? A: Install the standalone cilium CLI on a machine with cluster access, run cilium connectivity test, and inspect live network flows with Hubble.

Q: Can Cilium replace the default ingress controller? A: Yes, Cilium includes a built-in ingress controller. Enable it with --set ingressController.enabled=true and disable the default RKE2 ingress controller.

Q: What are the minimum system requirements for this setup? A: Minimum 4GB RAM, 2 CPU cores, and sufficient disk space. For production environments, consider higher specifications based on workload requirements.