rcu_preempt detected stalls on cpus/tasks #2992

Open
dankocrama opened this issue Dec 14, 2023 · 22 comments
Labels
board/ova (Open Virtual Appliance (Virtual Machine)), bug, hypervisor/vmware (VMware related issues)

Comments

@dankocrama

dankocrama commented Dec 14, 2023

Describe the issue you are experiencing

Hello,

We have Home Assistant OS running in a VM on TrueNAS, and we sometimes get this error, which means we have to reboot the VM every 1 or 2 days.

I hope someone has a solution, because this is not really practical: our home automations are not running 24/7.

Thanks in advance

Screenshot 2023-12-13 at 09 35 41

What operating system image do you use?

generic-aarch64 (Generic UEFI capable aarch64 systems)

What version of Home Assistant Operating System is installed?

6.1.63

Did you upgrade the Operating System?

Yes

Steps to reproduce the issue

We don't know where the problem is coming from. We changed the number of VM cores, but it made no difference.

Anything in the Supervisor logs that might be useful for us?

No problems in the log

Anything in the Host logs that might be useful for us?

No problems in the log

System information

A TrueNAS system with HA running in a VM

Additional information

@dankocrama dankocrama added the bug label Dec 14, 2023
@baturinivan

I have a similar problem. Home Assistant OS is installed in a Proxmox virtual environment on a mini PC.

@siw1973

siw1973 commented Feb 4, 2024

I'm having similar issues with Home Assistant OS on VMware Workstation Pro.

@dankocrama
Author

Any update on this?

@mwb9aa

mwb9aa commented Feb 21, 2024

Maybe this link will help? I'm trying it now. Basically, in the virtual machine, I changed the chipset type to ICH9. On my old machine, I used the other chipset, but on my new machine, I may need the other chipset.

@dankocrama
Author

I will check it out, thanks for the feedback 🚀🚀

@MomosX

MomosX commented Apr 5, 2024

@dankocrama did you solve the problem? I started to have the same issue on my HA VM in TrueNAS Core. Every 2-3 days it crashes with the same error messages. Thank you

@baturinivan

I fixed it by updating the Proxmox VM to the latest kernel version.

@dankocrama
Author

dankocrama commented May 1, 2024

@MomosX No, I still have the problem.

@MomosX

MomosX commented May 1, 2024

same :(

@siw1973

siw1973 commented May 1, 2024

Still got the issue on VMware Workstation 17.0.0

@dankocrama
Author

dankocrama commented May 23, 2024

Did someone find a solution?

@CristianGonzalezFernandez

Same issue here :(

@dankocrama
Author

Moved to TrueNAS SCALE, and it has already been running for 30 days without any bug.

@MomosX

MomosX commented Jul 25, 2024

I was thinking of upgrading to SCALE as well, but I'm too scared that something will break and that it will take me a long time to figure it out.


There hasn't been any activity on this issue recently. To keep our backlog manageable we have to clean old issues, as many of them have already been resolved with the latest updates.
Please make sure to update to the latest Home Assistant OS version and check if that solves the issue. Let us know if that works for you by adding a comment 👍
This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label Oct 24, 2024
@siw1973

siw1973 commented Oct 24, 2024

Staying alive.....

Still broken on VMware Workstation, and I need to reboot everything every 4-5 days.

@github-actions github-actions bot removed the stale label Oct 25, 2024
@mischief

Seen after upgrading to HA OS 14.1. The VM was unreachable over the network; the previous version was OK. After a VM reset via virt-manager, it recovered. Dumping what was on the console and my libvirt VM config here.

[277488.020311] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[277488.021572] rcu: 	0-...0: (0 ticks this GP) idle=955c/1/0x4000000000000000 softirq=1241785/1241785 fqs=1284672
[277488.023179] rcu: 	(detected by 3, t=6069646 jiffies, g=3192761, q=15615 ncpus=4)
[277551.026498] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[277551.027695] rcu: 	0-...0: (0 ticks this GP) idle=955c/1/0x4000000000000000 softirq=1241785/1241785 fqs=1297795
[277551.029152] rcu: 	(detected by 1, t=6132653 jiffies, g=3192761, q=15798 ncpus=4)

ha-libvirt.xml
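
For anyone trying to read the console output above, here is a rough sketch of how the two "detected by" lines can be compared. It assumes that `t=` is the number of jiffies since the start of the grace period (as the kernel's RCU stall-warning documentation describes) and that the printk timestamps and the jiffies counter advanced at the same rate between the two reports, which may not hold inside a stalled VM. The names `parse_reports` and `estimate_hz` are made up for this illustration.

```python
import re

# The two "detected by" lines copied from the console output in this comment.
CONSOLE = [
    "[277488.023179] rcu: \t(detected by 3, t=6069646 jiffies, g=3192761, q=15615 ncpus=4)",
    "[277551.029152] rcu: \t(detected by 1, t=6132653 jiffies, g=3192761, q=15798 ncpus=4)",
]

# Matches the summary line of an RCU stall report.
DETECTED = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+rcu:\s+\(detected by (?P<cpu>\d+), "
    r"t=(?P<jiffies>\d+) jiffies, g=(?P<gp>-?\d+), q=(?P<queued>\d+)"
)


def parse_reports(lines):
    """Extract timestamp, detecting CPU, jiffies and grace-period number."""
    reports = []
    for line in lines:
        m = DETECTED.search(line)
        if m:
            reports.append({
                "ts": float(m["ts"]),
                "cpu": int(m["cpu"]),
                "jiffies": int(m["jiffies"]),
                "gp": int(m["gp"]),
                "queued": int(m["queued"]),
            })
    return reports


def estimate_hz(reports):
    """Estimate ticks per second from the first and last report (an assumption,
    not a value read from the kernel config)."""
    first, last = reports[0], reports[-1]
    return (last["jiffies"] - first["jiffies"]) / (last["ts"] - first["ts"])


reports = parse_reports(CONSOLE)
hz = estimate_hz(reports)
stall_seconds = reports[-1]["jiffies"] / hz
same_grace_period = len({r["gp"] for r in reports}) == 1

print(f"estimated tick rate: {hz:.0f} per second")
print(f"approximate stall length: {stall_seconds / 60:.0f} minutes")
print(f"same grace period in all reports: {same_grace_period}")
```

If those assumptions hold, both reports refer to the same grace period (`g=3192761`) and the `t=` value corresponds to a stall of well over an hour, so CPU 0 was not making progress for a long time before the VM was reset.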

@johnwalk61

I solved this issue on VMware Workstation by turning off the automatic snapshots. I saw someone post that and didn't expect it to work. No CPU errors in 2 weeks.

@siw1973

siw1973 commented Jan 29, 2025

Quoting @johnwalk61: "I solved this issue on VMware Workstation by turning off the automatic snapshots. I saw someone post that and didn't expect it to work. No CPU errors in 2 weeks."

Thank you for this. However, it is both a good thing and a bad thing!

Good - My VMware Workstation HA instance has run for over 24 hours with no CPU errors
Bad - My lazy backup strategy is now trashed and I actually have to put some effort into a proper backup plan....

@johnwalk61

johnwalk61 commented Jan 29, 2025 via email

@sairon sairon added board/ova Open Virtual Appliance (Virtual Machine) hypervisor/vmware VMware related issues labels Jan 29, 2025
@sairon
Member

sairon commented Jan 29, 2025

Nice find, but I'd say it's really just a workaround. @johnwalk61 do you remember where you saw the post suggesting it?

@johnwalk61

johnwalk61 commented Jan 29, 2025 via email


9 participants