Machine gets stuck in provisioning state #291
Comments
Please share the cluster manifest you're using so we can debug.
I have the same issue I think. Relevant capmox logs:
Possibly related to #290? That issue specifically mentions Talos, but it seems the CRD version with the fix has not been released yet. Any chance we could get a release with these fixes soon so we can deploy this with Talos?
Edit: I switched to a cloud-init compatible Talos image, but the cloud-init config seems to be crashing Talos. This looks like an issue in Talos itself: siderolabs/talos#9352
Can confirm that the issue I mentioned has been solved in Talos 1.9 alpha.3. The only remaining issue I see is that capmox is not updating the node IPs in the machine CR, which causes Talos to wait before bootstrapping. This can be worked around with skipQemuCheck in the proxmoxmachine CR, but that has not been released yet; it is on main only.
@rouke-broersma Can you share working manifests?
https://github.com/broersma-forslund/homelab/tree/main/apps%2Finfrastructure
@rouke-broersma Thanks! I used one of the forked releases that supports skipQemuCheck in the proxmoxmachine, yet the node IP is still not being updated, even though the IP from the IPAM provider shows up as a label on the Proxmox VM. Could the issue be the template used for machine creation? I am using a template created from a running Talos instance, because the Proxmox image builder has no option for building a Talos image.
I also don't use image builder; I'm pretty sure that's only for kubeadm. Did you use a nocloud type Talos image? Did you also disable the cloud-init check? Disabling only the qemu check is not enough. You should check the controller logs (Proxmox and Talos) to see which controller is waiting on which status.
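To narrow down which side is waiting, it can help to look at the readiness flags on the Machine objects before reading through the controller logs. The following is a minimal sketch, assuming the standard Cluster API v1beta1 Machine status fields (`phase`, `bootstrapReady`, `infrastructureReady`, `addresses`). The function name `stuck_machines` is hypothetical, and its input is the parsed JSON from `kubectl get machines -A -o json`.

```python
import json
import sys
from typing import Any


def stuck_machines(machine_list: dict[str, Any]) -> list[dict[str, Any]]:
    """Summarise Machines that are still in the Provisioning phase.

    `machine_list` is the parsed output of `kubectl get machines -A -o json`.
    Field names assume the Cluster API v1beta1 Machine status.
    """
    report = []
    for item in machine_list.get("items", []):
        status = item.get("status", {})
        if status.get("phase") != "Provisioning":
            continue
        report.append({
            "name": item.get("metadata", {}).get("name", "<unknown>"),
            # False here means the bootstrap provider has not produced config yet.
            "bootstrapReady": status.get("bootstrapReady", False),
            # False here means the infrastructure provider has not marked the VM ready.
            "infrastructureReady": status.get("infrastructureReady", False),
            # An empty list means no node addresses were reported on the Machine.
            "addresses": [a.get("address") for a in status.get("addresses", [])],
        })
    return report


if __name__ == "__main__":
    # Usage: kubectl get machines -A -o json | python stuck_machines.py
    for entry in stuck_machines(json.load(sys.stdin)):
        print(entry)
```

If `infrastructureReady` is false and `addresses` is empty, the infrastructure provider is the one still waiting, which would match the symptom described in this thread.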
Full configuration:
Machines:
Taloscontrolplanes status:
Note that I've set the control plane IP to 10.0.15.241, but the taloscontrolplane is trying to reach it on 10.0.15.242, which is the IP from the ipv4Configs.
The Proxmox provider does not yet support DHCP, so you need to disable DHCP. Also, Talos is trying to configure the node, so of course it tries to reach the node IP and not the control plane IP. Your node needs to be able to ARP its assigned IP, and that IP should then be routable from your Cluster API provider.
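A quick way to test the routing part is to try a TCP connection to the node IP from wherever the Cluster API providers run. This is only a sketch: it assumes the Talos API listens on its default port 50000, and the helper name `can_reach` is made up for illustration. A successful connect shows that the address is routable and the port is open, nothing more.

```python
import socket

# Assumption: the Talos API (apid) is listening on its default port.
TALOS_API_PORT = 50000


def can_reach(host: str, port: int = TALOS_API_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unroutable addresses.
        return False


if __name__ == "__main__":
    # 10.0.15.242 is the node IP mentioned in this thread.
    print(can_reach("10.0.15.242"))
```

If this returns False from the management cluster's network, the Talos controllers will not be able to reach the node either, regardless of what the Proxmox side reports.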
What steps did you take and what happened:
Now the machine is created successfully in Proxmox with an IP assigned to it, but the machine phase in the management cluster is stuck in the Provisioning state. As a result, Talos takes no further bootstrapping action, because it keeps waiting for the infrastructure to be ready. Upon checking the logs of capmox-controller-manager, this is what I found:
Note that I have already enabled qemu-agent in the VM template as well.
Status of machine in management cluster:
What did you expect to happen:
Machine is provisioned successfully and control plane is initialized.
Environment:
kubectl version: v1.30.1
/etc/os-release: talos:v1.7.4