
    Setting Up Kubernetes on Vagrant: What Broke, What I Fixed, and What I Learned

    You know when you think something’s going to be quick and easy, but it ends up being a rabbit hole of learning? That’s exactly what happened when I tried setting up a Kubernetes cluster on Vagrant using VirtualBox.

    This blog is about all the hiccups I hit, how I fixed them, and what I learned along the way. If you’re doing something similar or planning to, maybe this will save you some time and headaches.


    I Decided to Build My Own Cluster

    After attending KubeCon Delhi in December 2024, I came back inspired to level up my Kubernetes game. I’m doing the Kubestronaut certification, and rather than use something managed like EKS or GKE, I decided to roll up my sleeves and do it all manually.

    It sounded simple. Spin up a couple of Vagrant boxes with VirtualBox, install Kubernetes, and go. But as I found out, theory and practice are not the same thing.


    The Setup

    I used a Vagrantfile to spin up a control-plane node and a worker node.

    Vagrantfile (Snippet)

    Vagrant.configure("2") do |config|
      config.vm.box = "ubuntu/focal64"
    
      config.vm.define "k8s-controller" do |node|
        node.vm.hostname = "k8s-controller"
        node.vm.network "private_network", ip: "192.168.33.10"
      end
    
      config.vm.define "k8s-node1" do |node|
        node.vm.hostname = "k8s-node1"
        node.vm.network "private_network", ip: "192.168.33.11"
      end
    end

    Where It Started Going Sideways: Networking Woes

    VirtualBox gives every VM the same default NAT IP: 10.0.2.15. If you don't set up a second network (like a host-only or private adapter), the Kubernetes nodes simply can't talk to each other.

    I had also forgotten to set --apiserver-advertise-address, so kubeadm defaulted to the NAT IP and the worker node couldn't find the control plane. Always point the advertise address at the host-only adapter or your secondary NIC.
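    The kubelet has the same problem: it also advertises the NAT IP by default. A sketch of pinning it to the private-network address (the interface name enp0s8 is typical for the second VirtualBox NIC on ubuntu/focal64, and /etc/default/kubelet is the Ubuntu package's drop-in file; verify both, and use each node's own IP):

    ```shell
    # Find the IP bound to the private interface (commonly enp0s8)
    ip -4 addr show enp0s8

    # Pin the kubelet to that IP so it doesn't fall back to the NAT address (10.0.2.15)
    echo 'KUBELET_EXTRA_ARGS="--node-ip=192.168.33.10"' | sudo tee /etc/default/kubelet
    sudo systemctl restart kubelet
    ```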


    The Wall of Errors (and Fixes)

    container runtime is not running

    What it means: containerd wasn’t running.

    The fix is simple:

    sudo apt install -y containerd
    containerd config default | sudo tee /etc/containerd/config.toml
    sudo systemctl restart containerd kubelet
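    If the error persists after that, a common culprit is a cgroup-driver mismatch: kubeadm defaults the kubelet to systemd cgroups, while containerd's generated default config uses cgroupfs. A possible extra step:

    ```shell
    # Switch containerd to the systemd cgroup driver so it matches the kubelet
    sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
    sudo systemctl restart containerd kubelet
    ```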

    kubelet-start: timed out waiting for the condition

    Why it happens: Kubelet can’t talk to the API server, usually because the config is missing or wrong, or CNI isn’t ready.

    Once the underlying cause is addressed, it can usually be resolved by restarting the kubelet:

    sudo systemctl restart kubelet
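    Before restarting blindly, it's worth checking the kubelet logs to see which of those causes actually applies:

    ```shell
    # Tail recent kubelet logs for the real failure reason
    sudo journalctl -u kubelet --no-pager -n 50

    # Confirm the config files kubeadm writes are actually present
    ls -l /etc/kubernetes/kubelet.conf /var/lib/kubelet/config.yaml
    ```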

    NetworkPluginNotReady: cni plugin not initialized

    Root cause: Calico wasn't running, either because the CNI binaries were missing or because iptables was blocking traffic.

    The fix is to apply the calico.yaml manifest so pod networking can be set up:

    kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml

    Also ensure CNI plugins are installed:

    sudo mkdir -p /opt/cni/bin
    wget -qO- https://github.com/containernetworking/plugins/releases/download/v1.1.1/cni-plugins-linux-amd64-v1.1.1.tgz | sudo tar -xz -C /opt/cni/bin
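    After applying the manifest, you can confirm the Calico pods come up and the nodes flip to Ready:

    ```shell
    # calico-node runs as a DaemonSet in kube-system
    kubectl get pods -n kube-system -l k8s-app=calico-node

    # Nodes should move from NotReady to Ready once the CNI initializes
    kubectl get nodes
    ```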

    AppArmor and Firewall Madness

    • AppArmor was causing container launch failures.
    • The firewall (UFW) and iptables were messing with CNI networking.

    This one is difficult to identify and easy to overlook. I fixed it by running the following commands:

    sudo systemctl stop apparmor
    sudo systemctl disable apparmor
    sudo systemctl disable --now ufw
    sudo iptables -F
    sudo iptables -t nat -F

    Clean Reset and Rebuild

    When nothing worked and I was going in circles, I wiped everything out and started fresh.

    sudo kubeadm reset -f
    sudo systemctl restart containerd kubelet

    Updated Hosts File

    192.168.33.10 k8s-controller
    192.168.33.11 k8s-node1
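    A quick way to append these entries on each node (same IPs as in the Vagrantfile):

    ```shell
    # Run on both VMs so the nodes resolve each other by hostname
    cat <<'EOF' | sudo tee -a /etc/hosts
    192.168.33.10 k8s-controller
    192.168.33.11 k8s-node1
    EOF
    ```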

    Re-initialize Kubernetes

    sudo kubeadm init --apiserver-advertise-address=192.168.33.10 --pod-network-cidr=192.168.0.0/16

    Configure kubectl on the host

    mkdir -p $HOME/.kube
    sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config

    Worker Node Join

    I copied the kubeadm join command from the kubeadm init output and ran it on the worker node.
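    If the join command scrolls away, or the bootstrap token expires (they're valid for 24 hours by default), it can be regenerated on the control plane:

    ```shell
    # Prints a fresh 'kubeadm join ...' command with a new token
    sudo kubeadm token create --print-join-command
    ```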


    Accessing the Kubernetes Cluster from My Laptop

    scp vagrant@k8s-controller:/etc/kubernetes/admin.conf ~/.kube/config
    chmod 600 ~/.kube/config
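    Once the kubeconfig is in place, a quick sanity check confirms the laptop is talking to the private-network address rather than the NAT one:

    ```shell
    # The server URL should point at 192.168.33.10, not 10.0.2.15
    kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'; echo

    # Both nodes should report Ready
    kubectl get nodes
    ```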

    Final Thoughts: What I Learned

    • Most of the errors came down to networking: VirtualBox + Vagrant need that extra private_network for everything to work.
    • The --apiserver-advertise-address must be a reachable IP. NAT won’t work here.
    • AppArmor, UFW, and iptables can break Kubernetes in subtle ways.
    • If Calico fails to start, look at your CNI paths and system services.

    This whole process was a grind, but now I get why people say: if you want to learn Kubernetes, build your own cluster.

    Hope this helps you avoid the same potholes. If you’ve been through similar issues — or fixed them another way — I’d love to hear from you.

    Happy hacking!