Deploying RKE2 with VeilNet

Learn how to deploy RKE2 with VeilNet across multiple regions.

Prerequisites

  • Ubuntu/Debian-based Linux system
  • Root or sudo access
  • VeilNet Conflux binary (veilnet-conflux)
  • VeilNet registration token

Set Up the RKE2 Cluster

This guide walks you through setting up an RKE2 Kubernetes cluster using VeilNet for networking across multiple nodes.

Important: Certain configuration values must be identical across all server nodes in your cluster. These include: cluster-cidr, cluster-dns, cluster-domain, service-cidr, and cni. Make sure to use the same values on all nodes to prevent cluster join failures.
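
For example, if you customize the pod or service networks, the same keys must appear in /etc/rancher/rke2/config.yaml on every server node. The values below are RKE2's documented defaults and are shown only as an illustration:

sudo tee -a /etc/rancher/rke2/config.yaml > /dev/null <<EOF
cni: canal
cluster-cidr: 10.42.0.0/16
service-cidr: 10.43.0.0/16
cluster-dns: 10.43.0.10
cluster-domain: cluster.local
EOF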

Step 1: Install VeilNet

First, prepare the VeilNet Conflux binary and register the node:

chmod +x ./veilnet-conflux

Register the node with VeilNet:

sudo ./veilnet-conflux register \
    -t <YOUR_VEILNET_TOKEN> \
    --cidr <YOUR_CIDR> \
    --tag <YOUR_TAG> \
    -p

Replace the placeholders:

  • <YOUR_VEILNET_TOKEN>: Your VeilNet registration token
  • <YOUR_CIDR>: The CIDR block for this node (e.g., 10.128.0.1/16)
  • <YOUR_TAG>: A tag to identify this node (e.g., master-node-1)

Check the VeilNet service logs:

journalctl -u veilnet -f
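
You can also confirm that the VeilNet interface is up and note the address it was assigned. This assumes the interface is named veilnet, as referenced later in this guide:

ip addr show veilnet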

Step 2: Install RKE2 Control Node

Update the system:

sudo apt update
sudo apt upgrade -y

Create the RKE2 config directory:

sudo mkdir -p /etc/rancher/rke2

Create the RKE2 configuration file. Replace <YOUR_NODE_IP> with the VeilNet IP address assigned to this node:

sudo tee /etc/rancher/rke2/config.yaml > /dev/null <<EOF
node-ip: <YOUR_NODE_IP>
node-external-ip: <YOUR_NODE_IP>
tls-san:
  - <YOUR_NODE_IP>
cni: canal
EOF

Note: RKE2 uses canal as the default CNI (which includes Flannel). By setting node-ip to your VeilNet IP address, the CNI will automatically use the VeilNet interface for pod networking.

Replace the placeholders:

  • <YOUR_NODE_IP>: The VeilNet IP address of this node (e.g., 10.128.0.1)

Install RKE2:

curl -sfL https://get.rke2.io | sudo sh -

Enable and start the RKE2 service:

sudo systemctl enable rke2-server.service
sudo systemctl start rke2-server.service

Wait for the service to start and check its status:

sudo systemctl status rke2-server.service

Get the node token for joining additional nodes:

sudo cat /var/lib/rancher/rke2/server/node-token

Set up kubectl:

mkdir -p ~/.kube
sudo cp /etc/rancher/rke2/rke2.yaml ~/.kube/config
sudo chown $USER:$USER ~/.kube/config
export KUBECONFIG=~/.kube/config

Add the RKE2 binary directory (which contains kubectl) to your PATH:

export PATH=$PATH:/var/lib/rancher/rke2/bin
echo 'export PATH=$PATH:/var/lib/rancher/rke2/bin' >> ~/.bashrc
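
As a quick sanity check, confirm that kubectl can reach the new control plane:

kubectl get nodes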

Step 3: Verify CNI Configuration

After RKE2 starts, verify that the CNI (Canal/Flannel) is using the VeilNet interface. The CNI should automatically bind to the interface specified by node-ip. Check the Canal pods:

kubectl get pods -n kube-system -l app=flannel
kubectl get pods -n kube-system -l k8s-app=canal

If needed, you can patch the Flannel daemonset to explicitly bind to the VeilNet interface. Note that the daemonset name varies by distribution; in RKE2 the Canal daemonset is typically named rke2-canal rather than kube-flannel-ds, so adjust the daemonset name and container index accordingly:

kubectl patch daemonset kube-flannel-ds -n kube-system --type='json' -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--iface=veilnet"}]'

Note: RKE2 manages the CNI configuration automatically. By setting node-ip to your VeilNet IP, the CNI should use that interface. The manual patch above is only needed if the CNI doesn't automatically detect the correct interface.
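
To see which interface Flannel actually selected, you can inspect its logs. The label, container name, and exact log text can vary between RKE2 versions, so treat this as a sketch:

kubectl logs -n kube-system -l k8s-app=canal -c kube-flannel --tail=-1 | grep -i "using interface"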

Step 4: Join Additional Server Nodes

To join additional server nodes to form an HA cluster, first register each node with VeilNet (as in Step 1), then create the RKE2 config directory:

sudo mkdir -p /etc/rancher/rke2

Create the RKE2 configuration file. Replace the placeholders with your values:

sudo tee /etc/rancher/rke2/config.yaml > /dev/null <<EOF
server: https://<CONTROL_NODE_IP>:9345
token: <NODE_TOKEN>
node-ip: <NEW_NODE_IP>
node-external-ip: <NEW_NODE_IP>
tls-san:
  - <NEW_NODE_IP>
cni: canal
EOF

Important: The cni, cluster-cidr, service-cidr, cluster-dns, and cluster-domain values must match the control node configuration. If you customized these on the control node, use the same values here.

Replace the placeholders:

  • <NODE_TOKEN>: The token from the control node
  • <CONTROL_NODE_IP>: The VeilNet IP of the control node (e.g., 10.128.0.1)
  • <NEW_NODE_IP>: The VeilNet IP of the new server node (e.g., 10.128.0.2)

Install RKE2:

curl -sfL https://get.rke2.io | sudo sh -

Enable and start the RKE2 server service:

sudo systemctl enable rke2-server.service
sudo systemctl start rke2-server.service
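
Once the service is up, confirm from a node with kubectl access that the new server has joined and that its INTERNAL-IP is the VeilNet address you configured:

kubectl get nodes -o wide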

Step 5: Join Worker Nodes

To join worker nodes to the cluster, first register each node with VeilNet (as in Step 1), then create the RKE2 config directory:

sudo mkdir -p /etc/rancher/rke2

Create the RKE2 configuration file for the worker node:

sudo tee /etc/rancher/rke2/config.yaml > /dev/null <<EOF
server: https://<CONTROL_NODE_IP>:9345
token: <NODE_TOKEN>
node-ip: <WORKER_NODE_IP>
node-external-ip: <WORKER_NODE_IP>
cni: canal
EOF

Important: The cni, cluster-cidr, service-cidr, cluster-dns, and cluster-domain values must match the control node configuration. If you customized these on the control node, use the same values here.

Replace the placeholders:

  • <NODE_TOKEN>: The token from the control node
  • <CONTROL_NODE_IP>: The VeilNet IP of the control node (e.g., 10.128.0.1)
  • <WORKER_NODE_IP>: The VeilNet IP of the worker node (e.g., 10.128.0.3)

Install RKE2:

curl -sfL https://get.rke2.io | sudo sh -

Enable and start the RKE2 agent service:

sudo systemctl enable rke2-agent.service
sudo systemctl start rke2-agent.service
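
To confirm the agent started and registered, watch its logs on the worker and then list the nodes from a server node:

sudo journalctl -u rke2-agent -f
kubectl get nodes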

Verification

Verify your cluster is running correctly:

kubectl get nodes
kubectl get pods --all-namespaces

Check that Canal/Flannel is using the VeilNet interface:

kubectl get pods -n kube-system -l app=flannel
kubectl get pods -n kube-system -l k8s-app=canal
kubectl logs -n kube-system -l app=flannel | grep -i veilnet
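
For an end-to-end check, you can run a throwaway pod and ping a pod scheduled on a different node. This is a sketch using the busybox image; replace <POD_IP> with an address taken from the first command:

kubectl get pods -o wide
kubectl run net-test --image=busybox --restart=Never --rm -it -- ping -c 3 <POD_IP>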

Updating VeilNet

To update VeilNet on a node, follow these steps:

  1. Download the new VeilNet Conflux binary
  2. Make it executable:
chmod +x ./veilnet-conflux
  3. Remove the existing VeilNet installation:
sudo ./veilnet-conflux remove
  4. Install the new version:
sudo ./veilnet-conflux install
  5. Reboot the node:
sudo reboot

After rebooting, the node will reconnect to both the VeilNet network and the RKE2 cluster using the updated binary.
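
Once the node is back online, the same checks from earlier apply: watch the VeilNet service logs and confirm the node returns to Ready:

journalctl -u veilnet -f
kubectl get nodes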

Updating RKE2

To update RKE2 on a node:

  1. Stop the RKE2 service:
sudo systemctl stop rke2-server.service
# or for worker nodes:
sudo systemctl stop rke2-agent.service
  2. Install the new version:
curl -sfL https://get.rke2.io | INSTALL_RKE2_VERSION=<VERSION> sudo sh -
  3. Start the service:
sudo systemctl start rke2-server.service
# or for worker nodes:
sudo systemctl start rke2-agent.service
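
After the service starts, you can confirm the node is running the new release by checking the VERSION column:

kubectl get nodes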

Using VeilNet Conflux as Sidecar for Direct Service Mesh

To achieve direct service mesh connectivity (similar to Docker's network namespace sharing, where containers join the veilnet-conflux container's network namespace), you can deploy veilnet-conflux as a sidecar container in your Kubernetes pods. Because all containers in a pod share one network namespace, your application containers get direct access to the VeilNet TUN device and can communicate with other services using VeilNet IP addresses.

Example: Deployment with VeilNet Conflux Sidecar

Here's an example manifest that deploys an application with veilnet-conflux as a sidecar:

apiVersion: v1
kind: Secret
metadata:
  name: veilnet-conflux-secret
  namespace: default
type: Opaque
stringData:
  VEILNET_REGISTRATION_TOKEN: <YOUR_REGISTRATION_TOKEN>
  VEILNET_GUARDIAN: <YOUR_GUARDIAN_URL>
  VEILNET_PORTAL: "true"
  VEILNET_CONFLUX_TAG: <YOUR_CONFLUX_TAG>
  VEILNET_CONFLUX_CIDR: <VEILNET_CIDR>
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      # VeilNet Conflux sidecar - must be first container
      - name: veilnet-conflux
        image: veilnet/conflux:beta
        imagePullPolicy: Always
        securityContext:
          capabilities:
            add:
              - NET_ADMIN
        volumeMounts:
          - name: dev-net-tun
            mountPath: /dev/net/tun
        envFrom:
          - secretRef:
              name: veilnet-conflux-secret
        resources:
          requests:
            memory: "64Mi"
            cpu: "50m"
          limits:
            memory: "128Mi"
            cpu: "100m"
      # Your application container
      - name: app
        image: your-app:latest
        ports:
          - containerPort: 8080
            name: http
        # Application shares network namespace with veilnet-conflux
        # Access other services via VeilNet IP addresses
        resources:
          requests:
            memory: "256Mi"
            cpu: "100m"
          limits:
            memory: "512Mi"
            cpu: "500m"
      volumes:
        - name: dev-net-tun
          hostPath:
            path: /dev/net/tun
            type: CharDevice
      # All containers in the pod share the same network namespace
      # This is the default behavior in Kubernetes
---
apiVersion: v1
kind: Service
metadata:
  name: my-app
  namespace: default
spec:
  selector:
    app: my-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
  type: ClusterIP
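
To try this example, save the manifest to a file and apply it, then confirm that both containers in the pod are running. The filename below is only an illustration:

kubectl apply -f veilnet-sidecar.yaml   # illustrative filename
kubectl get pods -l app=my-app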

Key Points:

  1. Shared Network Namespace: All containers in a Kubernetes pod share the same network namespace by default, which achieves the same effect as Docker's network_mode: "container:veilnet-conflux".
  2. Sidecar Container: The veilnet-conflux container runs as a sidecar alongside your application container in the same pod.
  3. TUN Device Access: The sidecar needs access to /dev/net/tun and NET_ADMIN capability to create the VeilNet interface.
  4. Environment Variables: Store VeilNet configuration in a Secret and reference it using envFrom.
  5. Service Access: Your application can access other services using their VeilNet IP addresses, just like in the Docker setup.

Accessing Services

Once deployed, your application can:

  • Access services on other pods using their VeilNet IP addresses (see the example after this list)
  • Use the VeilNet TUN device directly through the shared network namespace
  • Communicate with services across different Kubernetes nodes via VeilNet
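
For instance, assuming another workload on the mesh has the hypothetical VeilNet address 10.128.0.5 and listens on port 8080, the application container could reach it directly (this requires curl to be present in your image):

# 10.128.0.5 is a hypothetical VeilNet address used only for illustration
kubectl exec deploy/my-app -c app -- curl -s http://10.128.0.5:8080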

Example: Multi-Container Pod with Database

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app-with-db
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: web-app-with-db
  template:
    metadata:
      labels:
        app: web-app-with-db
    spec:
      containers:
      # VeilNet Conflux sidecar
      - name: veilnet-conflux
        image: veilnet/conflux:beta
        imagePullPolicy: Always
        securityContext:
          capabilities:
            add:
              - NET_ADMIN
        volumeMounts:
          - name: dev-net-tun
            mountPath: /dev/net/tun
        envFrom:
          - secretRef:
              name: veilnet-conflux-secret
      # Web application
      - name: web-app
        image: nginx:latest
        ports:
          - containerPort: 80
      # Database (shares network namespace)
      - name: database
        image: postgres:15-alpine
        env:
          - name: POSTGRES_DB
            value: mydb
          - name: POSTGRES_USER
            value: user
          - name: POSTGRES_PASSWORD
            value: password
        ports:
          - containerPort: 5432
      volumes:
        - name: dev-net-tun
          hostPath:
            path: /dev/net/tun
            type: CharDevice

In this example, all three containers (veilnet-conflux, web-app, and database) share the same network namespace, allowing them to communicate via localhost while also having access to the VeilNet network.
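
Because the containers share one network namespace, the web application can reach PostgreSQL on localhost. A quick way to verify this from the web-app container, assuming the image includes bash (the Debian-based nginx image does):

# Uses bash's /dev/tcp redirection so no extra tools are needed in the image
kubectl exec deploy/web-app-with-db -c web-app -- bash -c '</dev/tcp/localhost/5432 && echo "postgres reachable"'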

FAQ

Do I need to configure a sub-router?

No, you do not need to configure a sub-router. VeilNet handles all the networking automatically, including routing between nodes across different regions.

Do I need to configure firewall rules or CNI VXLAN settings?

No, you do not need to configure firewall rules or CNI VXLAN settings. VeilNet manages the network layer, and the Canal CNI (which includes Flannel) will use the VeilNet interface automatically when node-ip is set to your VeilNet IP address.

Can I use Cilium or Calico instead of Canal?

Yes, RKE2 supports multiple CNI options: canal (default, includes Flannel), calico, cilium, or none. To use Cilium or Calico, set cni: cilium or cni: calico in your RKE2 config file. You'll need to configure the CNI to use the VeilNet interface by ensuring node-ip is set to your VeilNet IP address. Some CNIs may require additional configuration to bind to the veilnet interface.
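
For example, a control-node config selecting Cilium could look like the following sketch; the node-ip requirement is the same, and any Cilium-specific interface tuning is out of scope here:

sudo tee /etc/rancher/rke2/config.yaml > /dev/null <<EOF
node-ip: <YOUR_NODE_IP>
node-external-ip: <YOUR_NODE_IP>
tls-san:
  - <YOUR_NODE_IP>
cni: cilium
EOF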

Can I use Longhorn for distributed storage?

We do not recommend using Longhorn for distributed storage unless all nodes are in the same local network. Longhorn has strict latency requirements that may not be met when nodes are distributed across different regions or have higher network latency. For multi-region deployments, consider using other storage solutions that are designed for higher latency environments.

Should I use VeilNet even if all my nodes are local?

Yes, you can still use VeilNet for your cluster even if all nodes are on the same local network. VeilNet provides additional security by encrypting all traffic between nodes and can help isolate your cluster traffic from other network traffic on the same physical network.

What's the difference between RKE2 and K3s?

RKE2 is Rancher's enterprise-grade Kubernetes distribution that follows upstream Kubernetes more closely, while K3s is a lightweight distribution. RKE2 uses systemd for service management and has stricter security defaults. Both work well with VeilNet, but RKE2 may be preferred for production environments requiring full Kubernetes compliance.