
K8s Deep Dive: Networking

This is a companion to the Home Kubernetes Cluster overview. That page covers the full stack at a high level. This one goes deep on networking — how traffic enters the cluster, how it moves between pods, and how it gets observed and controlled.

Networking on bare-metal Kubernetes is fundamentally different from managed cloud environments. There’s no cloud load balancer API, no VPC-native pod networking, no managed firewall rules. Every layer has to be explicitly built and configured. That’s the tradeoff for running on Orange Pi 5 SBCs: complete control, complete responsibility.

Traffic flows through four distinct layers before it reaches a pod. Each layer has a specific job, and understanding the handoffs between them is key to debugging anything network-related.

```mermaid
graph TB
ext["External Client"] -->|"HTTP/HTTPS"| mlb["MetalLB L2\n(ARP announcement)"]
mlb -->|"Routes to LoadBalancer IP\n192.168.86.18"| nginx["ingress-nginx\n(controller pod)"]
nginx -->|"Route by Host/Path"| svc["Kubernetes Services\n(ClusterIP)"]
svc -->|"eBPF load balancing"| pods["Application Pods"]
pods <-->|"Pod-to-Pod\n(eBPF datapath)"| cilium["Cilium Agent\n(per node)"]
cilium -->|"Flow telemetry"| hubble["Hubble\n(relay + UI)"]
cilium -->|"Policy enforcement"| cnp["CiliumNetworkPolicy\n(L3-L7 rules)"]
style ext fill:#f5f5f5,stroke:#333
style mlb fill:#e8f4fd,stroke:#2196F3
style nginx fill:#e8f4fd,stroke:#2196F3
style svc fill:#e8f4fd,stroke:#2196F3
style pods fill:#e8f5e9,stroke:#4CAF50
style cilium fill:#fff3e0,stroke:#FF9800
style hubble fill:#fff3e0,stroke:#FF9800
style cnp fill:#fce4ec,stroke:#E91E63
```

The short version: MetalLB gives us a LoadBalancer IP on the local network. ingress-nginx listens on that IP and routes HTTP(S) traffic by hostname and path. Cilium handles everything from the Service layer down — load balancing to pods, pod-to-pod communication, and policy enforcement. Hubble watches all of it.

Version: 1.16.18 | Kernel: 6.1.115-vendor-rk35xx | Mode: kube-proxy replacement

Cilium is the CNI (Container Network Interface) for the cluster, and it replaces kube-proxy entirely. There is no kube-proxy DaemonSet running on these nodes. Cilium handles service discovery, load balancing, and network policy enforcement through eBPF programs attached directly to the Linux kernel’s networking stack.
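
Concretely, kube-proxy replacement is a switch in the Cilium Helm chart. A minimal sketch of the relevant values (the API server address below is a placeholder, not this cluster's actual configuration):

```yaml
# Sketch of the Cilium Helm values involved in kube-proxy replacement.
kubeProxyReplacement: true     # eBPF handles Service load balancing; no kube-proxy DaemonSet
k8sServiceHost: 192.168.86.2   # placeholder: the agent needs a direct API server address
k8sServicePort: 6443           # placeholder API server port
hubble:
  relay:
    enabled: true              # aggregate flows cluster-wide (see the Hubble section below)
  ui:
    enabled: true
```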

The decision to use Cilium over alternatives like Calico or Flannel is documented in ADR-001: Cilium as CNI over Calico. The short version: eBPF performance on resource-constrained hardware, L7-aware network policies for IoT segmentation, and Hubble observability out of the box.

The traditional kube-proxy model uses iptables rules for service routing. Every Kubernetes Service gets its own set of chains (KUBE-SVC-*, KUBE-SEP-*) hooked into the kernel's built-in NAT chains, and each packet is matched against the service rules linearly. With 50 services, that's potentially hundreds of rules evaluated per packet. The lookup complexity is O(n) where n is the number of rules.

eBPF replaces this with hash-map lookups in kernel space. A packet arrives, Cilium’s eBPF program does a single hash-map lookup to find the target pod, and the packet is redirected. O(1) regardless of how many services exist.

On a cloud VM with 32 cores and 128 GB RAM, the iptables overhead is noise. On an Orange Pi 5 with 8 ARM cores (half of which are efficiency cores) and 16 GB RAM running a graph database, an AI agent platform, and Kubernetes system components simultaneously — it’s not noise. Every CPU cycle spent traversing iptables chains is a cycle not available for actual workloads.

The practical impact: Cilium’s eBPF kube-proxy replacement measurably reduced CPU overhead on these nodes. Not by a dramatic amount in absolute terms, but enough to matter when you’re running close to capacity.

Cilium runs as a DaemonSet — one agent pod per node. Each agent:

  1. Compiles eBPF programs for the node’s kernel at startup. The Rockchip vendor kernel (6.1.115) supports eBPF, which was verified before deploying Cilium. This is non-negotiable: if the kernel doesn’t support the required eBPF features, Cilium won’t start.

  2. Manages pod networking by assigning IPs from the cluster CIDR, setting up veth pairs, and attaching eBPF programs to each pod’s network interface.

  3. Replaces kube-proxy by watching the Kubernetes API for Service and Endpoint changes, then updating eBPF maps accordingly. Service traffic never touches iptables.

  4. Enforces CiliumNetworkPolicy rules by attaching eBPF filters that inspect packets at L3, L4, and L7 before allowing or dropping them.

Standard Kubernetes NetworkPolicy resources operate at L3/L4 — you can allow or deny traffic based on IP addresses, CIDR blocks, ports, and protocols. That’s useful but limited. You can say “allow TCP/443 from namespace X” but you can’t say “allow GET requests to /api/health but deny POST requests to /api/admin.”

CiliumNetworkPolicy adds L7 awareness. Three capabilities matter most in this cluster:

CiliumNetworkPolicy can inspect HTTP headers, methods, and paths. For a cluster running Home Assistant (which exposes an HTTP API that controls physical devices in my house), this is the difference between “allow HTTP traffic” and “allow GET requests to specific API endpoints only.” The threat model includes compromised IoT devices making unexpected API calls — L3/L4 policies can’t distinguish a legitimate sensor reading from a malicious command injection if they’re both HTTP on the same port.
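
As a shape reference (a minimal sketch with hypothetical names and ports, not one of the cluster's actual policies), an L7 rule that admits only a specific read-only endpoint looks roughly like this:

```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: api-l7-allowlist        # hypothetical name
  namespace: example            # hypothetical namespace
spec:
  endpointSelector:
    matchLabels:
      app: example-api          # hypothetical label: the pods this policy protects
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: example-client # only these peers may connect at all
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: GET
                path: "/api/health"   # anything else (e.g. POST /api/admin) is denied
```

Any request on that port which doesn't match the http rules is rejected at Cilium's proxy, which is exactly the granularity an L3/L4 policy can't express.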

Policies can reference external destinations by DNS name rather than IP address. This matters for egress control — you can allow a pod to reach api.openai.com without hardcoding IP ranges that change whenever the provider updates their infrastructure. Cilium’s DNS proxy intercepts DNS queries and dynamically updates the allowed IP set.
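
A sketch of that pattern (hypothetical names again, not the cluster's actual egress policies): the DNS rule lets Cilium's proxy observe the lookups, and the toFQDNs rule then limits egress to whatever the name resolved to.

```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-openai-egress     # hypothetical name
  namespace: example            # hypothetical namespace
spec:
  endpointSelector:
    matchLabels:
      app: example-agent        # hypothetical label
  egress:
    # Allow DNS via cluster DNS so the proxy can record which IPs the name maps to.
    - toEndpoints:
        - matchLabels:
            k8s:io.kubernetes.pod.namespace: kube-system
            k8s-app: kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: UDP
          rules:
            dns:
              - matchPattern: "*"
    # Allow HTTPS only to addresses that resolved from this FQDN.
    - toFQDNs:
        - matchName: api.openai.com
      toPorts:
        - ports:
            - port: "443"
              protocol: TCP
```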

Instead of referencing pods by IP (which is ephemeral in Kubernetes), Cilium assigns identities based on labels. Policies reference these identities, which means they survive pod restarts, rescheduling, and IP changes. A policy that says “allow traffic from pods with label app=openclaw-api” works regardless of which node the pod lands on or what IP it gets.

The cluster currently has targeted egress policies:

| Namespace | Policy Name | Purpose |
|---|---|---|
| openclaw | openclaw-ha-egress | Controls outbound traffic from OpenClaw to Home Assistant |
| openclaw-debra | openclaw-debra-ha-egress | Controls outbound traffic from the Debra agent to Home Assistant |

One operational nuance worth noting: Home Assistant runs with hostNetwork: true because it needs direct access to the host network for mDNS device discovery and multicast traffic that doesn’t work well through Kubernetes networking. The consequence is that pod-level CiliumNetworkPolicy enforcement is bypassed for HA’s inbound traffic — the pod shares the host’s network namespace, so Cilium’s per-pod eBPF filters aren’t in the packet path. The egress policies on the calling side (OpenClaw’s namespaces) are what actually enforce the boundary. This is a known tradeoff, documented and intentional.

Hubble is Cilium’s observability layer. It taps into the same eBPF programs that handle networking and policy enforcement, which means it sees every packet, every DNS query, every HTTP request, and every policy verdict — with zero additional instrumentation in the application code.

Three components make up the Hubble deployment:

| Component | ClusterIP | Port | Purpose |
|---|---|---|---|
| hubble-relay | 10.98.200.220 | 80 | Aggregates flow data from all Cilium agents into a single gRPC stream |
| hubble-ui | 10.105.175.16 | 80 | Web dashboard for visualizing service maps and traffic flows |
| hubble-peer | 10.105.63.222 | 443 | Node-to-node communication for distributed flow collection |

The practical value of Hubble in this cluster:

  • Traffic flows in real time. I can see exactly which pods are talking to which other pods, what protocols they’re using, and how much data is moving. When an AI agent starts making unexpected external calls, it shows up immediately.

  • DNS queries. Every DNS lookup from every pod is visible. This is how I verify that egress DNS policies are working — if a pod tries to resolve a domain it shouldn’t be reaching, Hubble captures the query and the policy verdict.

  • HTTP request/response metadata. For L7-inspected traffic, Hubble shows HTTP methods, paths, status codes, and latency. Useful for debugging service-to-service communication without deploying a separate service mesh.

  • Policy verdicts. Every packet that gets allowed or dropped by a CiliumNetworkPolicy is logged with the specific policy that made the decision. When something breaks after a policy change, Hubble tells you exactly which rule is dropping the traffic.

The Hubble UI is accessible within the cluster via its ClusterIP. For remote access, I use kubectl port-forward through the Tailscale mesh — no need to expose it externally.

Version: v0.13.12 | Mode: L2

In a cloud environment, when you create a Kubernetes Service of type LoadBalancer, the cloud provider’s controller allocates an external IP and configures a load balancer. On bare-metal, there’s no cloud controller. Without MetalLB (or something like it), LoadBalancer services stay in Pending state forever — Kubernetes is waiting for an external system that doesn’t exist.

MetalLB fills this gap. It watches for LoadBalancer services and assigns IPs from a configured pool, then announces those IPs on the local network so traffic can reach them.

MetalLB supports two modes: L2 (ARP/NDP) and BGP. I use L2 because the cluster sits on a flat home network with a single subnet. There’s no BGP router to peer with, and the simplicity of L2 mode is appropriate for this topology.

In L2 mode, one node becomes the “leader” for each allocated IP. That node responds to ARP requests for the IP, making all traffic for that service flow through a single node. This is a limitation — there’s no true load balancing across nodes at the network layer. But for a 4-node home cluster, this is fine. The ingress controller on the receiving node handles distribution to backend pods across all nodes via Cilium’s eBPF load balancing.

| Setting | Value |
|---|---|
| IP Pool Name | first-pool |
| Address Range | 192.168.86.5 - 192.168.86.19 |
| Available IPs | 15 |
| L2Advertisement | kubelab |
| Mode | Layer 2 (ARP) |

The pool is carved out of my home network’s 192.168.86.0/24 subnet, with addresses reserved in the router’s DHCP configuration so they’re never assigned to other devices. Fifteen IPs is more than I currently need — most services are ClusterIP behind the ingress controller — but having headroom avoids the situation where a new LoadBalancer service can’t get an IP.
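
For reference, the MetalLB objects behind that table are small. A sketch assuming the default metallb-system namespace:

```yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: first-pool
  namespace: metallb-system        # assumed namespace
spec:
  addresses:
    - 192.168.86.5-192.168.86.19   # 15 addresses, reserved in the router's DHCP config
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: kubelab
  namespace: metallb-system        # assumed namespace
spec:
  ipAddressPools:
    - first-pool                   # answer ARP for any IP allocated from this pool
```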

Here’s the full sequence for an external HTTP request reaching a pod:

```mermaid
sequenceDiagram
participant Client as External Client
participant Router as Home Router
participant MetalLB as MetalLB (Leader Node)
participant Nginx as ingress-nginx
participant Cilium as Cilium eBPF
participant Pod as Application Pod
Client->>Router: HTTP request to 192.168.86.18
Router->>Router: ARP: Who has 192.168.86.18?
MetalLB-->>Router: ARP reply (leader node MAC)
Router->>MetalLB: Forward packet to leader node
MetalLB->>Nginx: kube-proxy replacement routes to ingress controller pod
Nginx->>Nginx: Match Host header + path to Ingress rule
Nginx->>Cilium: Forward to backend Service ClusterIP
Cilium->>Pod: eBPF load balances to healthy endpoint
Pod-->>Client: Response traverses reverse path
```

The key insight: MetalLB only handles the “get traffic to a node” problem. Once the packet is on a node, Cilium’s eBPF takes over for all subsequent routing — from the ingress controller to the backend service, and from the service to the actual pod.

Helm Chart: 4.14.1 | Controller: 1.14.1

ingress-nginx is the cluster’s HTTP(S) ingress controller. It’s the single point of entry for all HTTP traffic from outside the cluster. Every web-facing service — Home Assistant’s dashboard, OpenClaw’s API, Hubble’s UI when port-forwarded — routes through it.

| Service | Type | IP | Ports |
|---|---|---|---|
| ingress-nginx-controller | LoadBalancer | 192.168.86.18 | 80 (HTTP), 443 (HTTPS) |
| ingress-nginx-controller-admission | ClusterIP | | 443 (webhook) |
| ingress-nginx-default-backend | ClusterIP | | 80 |
| ingress-nginx-controller-metrics | ClusterIP | | 10254 |

The controller service is the only LoadBalancer type — MetalLB assigns it 192.168.86.18. The admission webhook validates Ingress resources before they’re applied (catching syntax errors at apply-time rather than runtime). The default backend returns a 404 for any request that doesn’t match a configured Ingress rule. Metrics on port 10254 expose Prometheus-format data for monitoring.

TLS is handled by cert-manager (v1.13.3), which provisions and renews certificates automatically. The ingress controller terminates TLS at the edge — backend pods receive plain HTTP. This simplifies application configuration and centralizes certificate management.

For services accessed over Tailscale, TLS isn’t strictly necessary (the WireGuard tunnel is already encrypted), but I run it anyway. Defense in depth, and it means the same Ingress definitions work whether the request comes over Tailscale or the local network.
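
As a shape reference (hostname, issuer, and backend names are placeholders, not the cluster's actual manifests), an Ingress that terminates TLS at the controller with a cert-manager-issued certificate looks roughly like this:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-app                                  # placeholder
  namespace: example                                 # placeholder
  annotations:
    cert-manager.io/cluster-issuer: homelab-issuer   # placeholder issuer name
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - app.home.example                           # placeholder hostname
      secretName: example-app-tls                    # cert-manager writes the certificate here
  rules:
    - host: app.home.example
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-app                    # backend receives plain HTTP
                port:
                  number: 80
```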

ingress-nginx was chosen for two reasons: maturity and simplicity. It’s the most widely deployed ingress controller in the Kubernetes ecosystem, which means every problem I hit has been hit by someone else first. Traefik has a nicer dashboard and more features, but features mean complexity, and on a cluster where I’m also debugging Cilium policies, Longhorn storage, and ARM64 compatibility issues, I want the ingress layer to be boring.

I’m watching the Kubernetes Gateway API as the eventual replacement for Ingress resources. Cilium has its own Gateway API implementation, which would collapse the ingress controller and CNI into a single component. But as of now, the Gateway API ecosystem on ARM64 isn’t as battle-tested as ingress-nginx, so I’m staying with what works.

| Component | Version | Purpose |
|---|---|---|
| Cilium | 1.16.18 | CNI, eBPF dataplane, kube-proxy replacement, network policy enforcement |
| Hubble | (bundled with Cilium) | Network observability: flows, DNS, HTTP, policy verdicts |
| MetalLB | v0.13.12 | Bare-metal LoadBalancer implementation, L2 mode |
| ingress-nginx | Controller 1.14.1 (Chart 4.14.1) | HTTP(S) ingress routing, TLS termination |
| cert-manager | v1.13.3 | Automated TLS certificate provisioning and renewal |
| Tailscale | (node-level) | Encrypted remote access via WireGuard mesh, no exposed ports |
| Kernel | 6.1.115-vendor-rk35xx | Rockchip vendor kernel with eBPF support for Cilium |

Every layer in this stack was chosen to solve a specific bare-metal problem. MetalLB because cloud LoadBalancer doesn’t exist here. Cilium because iptables doesn’t scale well on 8-core SBCs. ingress-nginx because HTTP routing needs to happen somewhere. cert-manager because manual certificate management doesn’t survive 3 AM renewals. Tailscale because exposing ports to the public internet is not an option when Home Assistant controls your thermostat.

The architecture is intentionally layered so each component can be replaced independently. If Cilium’s Gateway API matures enough, ingress-nginx can be removed. If I move to a network with BGP support, MetalLB can switch from L2 to BGP without touching anything above it. If a better CNI emerges for ARM64, the ingress and LoadBalancer layers don’t care.

That modularity isn’t theoretical — it’s how the cluster has actually evolved. Components have been swapped, upgraded, and reconfigured without full-stack rebuilds. That’s the payoff for doing the hard work of understanding each layer independently.