
HAOS in Proxmox VM - Memory Cache never being released #2999

Closed
gfn256 opened this issue Dec 19, 2023 · 5 comments

gfn256 commented Dec 19, 2023

Describe the issue you are experiencing

I have noticed (maybe since forever - I have been running HAOS in a VM on Proxmox for a few years) that memory usage - as reported by Proxmox for the HAOS VM - steadily rises in increments over time. Note that memory usage as reported from inside HAOS does not rise!
I finally decided to analyze this problem - and discovered that the memory increments occur whenever a backup is made inside HAOS. What happens is that the buff/cache inside the HAOS VM increases with every backup - BUT IS NEVER RELEASED!

Here is example from inside HAOS VM (ssh output):

[core-ssh ~]$ free -h
              total        used        free      shared  buff/cache   available
Mem:           7.8G      844.6M        2.2G        5.1M        4.7G        6.8G
Swap:          2.6G           0        2.6G

As you can see, I have a total of 8GB of RAM allocated to the HAOS VM - with less than 1GB actually being used and 6.8GB available, BUT the buff/cache has reached 4.7GB, so HAOS reports that only 2.2GB is "free"! This is the number Proxmox sees and reports! This "free" number steadily decreases with every backup as the buff/cache rises - NEVER BEING RELEASED!
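The arithmetic behind those numbers can be sketched as follows (a rough illustration, not the kernel's exact MemAvailable formula; it assumes essentially all of the buff/cache is reclaimable, which matches this case):

```python
def approx_available_gib(free_gib: float, reclaimable_cache_gib: float) -> float:
    """Rough approximation of 'available' memory: free memory plus the
    portion of the page cache the kernel can drop under pressure."""
    return free_gib + reclaimable_cache_gib

# Numbers taken from the `free -h` output above:
print(approx_available_gib(2.2, 4.7))  # close to the reported 6.8G available
```

This is why "free" shrinking is not the same as memory being lost: the cache counted in buff/cache is still reclaimable.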

Maybe someone can enlighten me on this situation.

Upon googling around - I found this exact issue discussed in a HA community forum thread:

https://community.home-assistant.io/t/memory-leak-home-assistant-2022/457565/61

In that thread it is suggested to clear the memory cache with:

sync; echo 3 > /proc/sys/vm/drop_caches

However, on my HAOS system I am unable to do this, as I get:

-bash: /proc/sys/vm/drop_caches: Read-only file system

My only workaround (definitely NOT A SOLUTION!) is to reboot the VM - and then memory returns to "normal"!

What operating system image do you use?

ova (for Virtual Machines)

What version of Home Assistant Operating System is installed?

Home Assistant OS 11.2

Did you upgrade the Operating System?

Yes

Steps to reproduce the issue

  1. Keep running HAOS for a week
  2. Check Mem usage in Proxmox periodically
  3. Analyze when jumps occur

Anything in the Supervisor logs that might be useful for us?

Nothing interesting

Anything in the Host logs that might be useful for us?

Nothing interesting

System information

System Information

version core-2023.12.3
installation_type Home Assistant OS
dev false
hassio true
docker true
user root
virtualenv false
python_version 3.11.6
os_name Linux
os_version 6.1.63-haos
arch x86_64
timezone XXXX/XXXX
config_dir /config
Home Assistant Cloud
logged_in true
subscription_expiration August 11, 2024 at 3:00 AM
relayer_connected true
relayer_region XX-XXXXXX-XX
remote_enabled true
remote_connected true
alexa_enabled false
google_enabled true
remote_server XX-XXXXXX-XX
certificate_status ready
instance_id XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
can_reach_cert_server ok
can_reach_cloud_auth ok
can_reach_cloud ok
Home Assistant Supervisor
host_os Home Assistant OS 11.2
update_channel stable
supervisor_version supervisor-2023.11.6
agent_version 1.6.0
docker_version 24.0.7
disk_total 30.8 GB
disk_used 10.3 GB
healthy true
supported true
board ova
supervisor_api ok
version_api ok
installed_addons Mosquitto broker (6.4.0), File editor (5.7.0), Home Assistant Google Drive Backup (0.112.1), RPC Shutdown (2.4), Samba share (12.2.0), Terminal & SSH (9.8.1), eWeLink Smart Home (1.4.3)
Dashboards
dashboards 3
resources 0
views 15
mode storage
Recorder
oldest_recorder_run December 9, 2023 at 4:18 PM
current_recorder_run December 15, 2023 at 11:21 AM
estimated_db_size 87.14 MiB
database_engine sqlite
database_version 3.41.2

Additional information

No response

gfn256 added the bug label Dec 19, 2023
sairon (Member) commented Dec 19, 2023

It is expected that Linux might - and eventually will - use all the RAM it has available. See here to understand what the numbers in free mean: https://www.linuxatemyram.com/

With QEMU/KVM virtualization, the virtual machine behaves like a real computer, and it sees the memory you allocated to it the same way it would on a bare-metal system. That means the amount of memory you set for the VM in the Proxmox configuration is dedicated to it alone, and it is entirely up to the guest OS how it uses it. This is not a bug, and doing things like dropping caches may only have a detrimental effect on system performance.
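If the concern is how much memory the guest really has to work with, the relevant metric inside the guest is MemAvailable from /proc/meminfo, not the "free" column. A minimal sketch (the meminfo snapshot below is a shortened, hypothetical sample for illustration):

```python
# Shortened, hypothetical /proc/meminfo snapshot (values in kB):
SAMPLE_MEMINFO = """\
MemTotal:        8147640 kB
MemFree:         2306867 kB
MemAvailable:    7130316 kB
Buffers:          112233 kB
Cached:          4815162 kB
"""

def mem_available_kib(meminfo_text: str) -> int:
    """Return the MemAvailable value (in kB) from /proc/meminfo text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1])
    raise ValueError("MemAvailable not found")

# On a real Linux system: mem_available_kib(open("/proc/meminfo").read())
print(mem_available_kib(SAMPLE_MEMINFO) // 1024)  # → 6963 (MiB)
```

A monitoring setup that alerts on low MemAvailable inside the guest, rather than on the hypervisor's "used" graph, avoids the confusion described in this issue.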

sairon added the wontfix label Dec 19, 2023
sairon closed this as not planned Dec 19, 2023
gfn256 (Author) commented Dec 19, 2023

@sairon
Thanks for your prompt reply.

In the link you referenced https://www.linuxatemyram.com/ , I quote "If, however, you find yourself needing to clear some RAM quickly to workaround another issue, like a VM misbehaving, you can force linux to nondestructively drop caches using echo 3 | sudo tee /proc/sys/vm/drop_caches." So it appears to be "nondestructive".

Yes, I agree this doesn't mean it has no impact on general system performance, but why not allow folks who are virtualizing to perform this manually or periodically - without having to completely reboot?

On a second note - why does this additional cache-grabbing have to be done again for every backup? Maybe it's beyond our control?

sairon (Member) commented Dec 19, 2023

It is nondestructive in the sense that it does not cause system instability. However, it hurts performance - you trade free memory (which means nothing in the guest OS, given there's enough available memory) for more disk I/O operations, which must be done when the OS wants to access any files on the disk again. That also answers your second question - it happens after a backup because HA accesses a large amount of data at that time. It's up to the Linux kernel which data it keeps in caches and which it drops.

For the other question: it's wrong to treat the part of the memory used by caches as somehow available to the host OS while the guest is running. Dropping caches regularly just to make the graphs look nicer would be wrong, and if the guest needed the memory later, you would run into an OOM situation anyway. There are methods for allocating memory dynamically (search for "memory ballooning", for example), but those have drawbacks too and are beyond this discussion.

gfn256 (Author) commented Dec 19, 2023

@sairon
Thanks again for your prompt reply.
I regularly use memory "ballooning" devices in Proxmox with my VMs.
However, in this case I have to agree with you: for the HAOS VM, which I've "only" given 8GB of RAM anyway, I don't see much benefit in enabling ballooning.
I also agree that there is no point in clearing the cache purely for cosmetic (graph-enhancing) reasons.
However, I was concerned that Proxmox might become "upset" if the HAOS VM memory (as Proxmox sees it) sits at 100% used for a long period. Please note this has never yet happened to me.
Just sharing my thoughts...

Impact123 commented
The PVE memory usage graph often causes confusion (especially with ZFS), and as such it's frequently discussed on their forums.
It also depends a bit on the guest OS (Windows behaves and reports differently), the QEMU guest agent, ballooning, and so on.
This happens with other Linux distros too, but see the link to research more.
