AAP-37500 Excessive number of open file descriptors #39

Open
AlanCoding opened this issue Jan 3, 2025 · 2 comments · May be fixed by #78
Labels
awx needed for AWX MVP minimum set of issues to fix

Comments

@AlanCoding
Member

https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods

The multiprocessing library's default start method on Linux (which is implicitly used here) is fork. The docs for spawn say this:

In particular, unnecessary file descriptors and handles from the parent process will not be inherited.

...implying that file descriptors are inherited by the child processes when using fork. The consequence is predictable, and it is the same for both the old dispatcher and the current state of this library. Every worker inherits O(N) file descriptors, because the parent process holds an open connection to each existing worker at the time of the fork. This leads to a total of O(N^2) open file descriptors as the number of workers grows. A total of O(N) is expected, so this issue tracks the objective of bringing it down from quadratic to linear behavior.
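The growth described above can be observed directly. The following is a minimal sketch, not code from this library: the names `count_open_fds`, `_report`, and `fds_seen_by_workers` are made up for illustration, and it assumes Linux (it reads `/proc/self/fd`). Each worker reports how many descriptors it sees right after being forked, while the parent keeps its end of every earlier connection open.

```python
import multiprocessing
import os


def count_open_fds() -> int:
    # Linux only: one entry per open descriptor of the calling process.
    return len(os.listdir("/proc/self/fd"))


def _report(conn) -> None:
    # Runs in the worker: send back the number of descriptors it holds.
    conn.send(count_open_fds())
    conn.close()


def fds_seen_by_workers(n_workers: int, method: str = "fork") -> list:
    """Start n_workers one by one and return the fd count each one reports."""
    ctx = multiprocessing.get_context(method)
    parent_ends = []
    procs = []
    for _ in range(n_workers):
        parent_conn, child_conn = ctx.Pipe()
        proc = ctx.Process(target=_report, args=(child_conn,))
        proc.start()
        child_conn.close()  # the parent only needs its own end
        parent_ends.append(parent_conn)  # kept open, like a real worker pool
        procs.append(proc)
    counts = [conn.recv() for conn in parent_ends]
    for proc in procs:
        proc.join()
    for conn in parent_ends:
        conn.close()
    return counts
```

With fork, each later worker should report a higher count than the one before it, since it inherits the parent's connections to all earlier workers. Passing `method="forkserver"` from a script guarded by `if __name__ == "__main__":` gives a point of comparison.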

@AlanCoding AlanCoding added awx needed for AWX MVP minimum set of issues to fix labels Jan 8, 2025
@AlanCoding
Member Author

I did some more research on this, and I'll capture a few things from that here.

Measuring file descriptors

This is fairly simple. We should add this to some debugging methods as the structure of the control-and-reply methods matures.

import os

def open_fds_ct() -> int:
    return len(os.listdir("/proc/self/fd/"))

Measuring memory use

This is the standard off-the-shelf approach, using psutil.

import os

import psutil


def memory_use() -> float:
    # Resident set size (RSS) of the current process.
    process = psutil.Process(os.getpid())
    return process.memory_info().rss / (1024 * 1024)  # Convert bytes to MiB

Important - method for abusing forkserver

As we get a config system in place, #44, we know very well how callbacks will be specified. There is still some question about how we can pass data from the management command to the base forkserver process, but worst-case, we can re-load a config file.

This proposal is that we directly execute a pre-fork callback from a new special-purpose module, say, dispatcher.hazmat.pre_fork. This module would do nothing other than reference the config callback for pre-fork and execute it.
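As a rough sketch of what such a module could contain: the helper names `resolve_callback` and `run_pre_fork` are hypothetical, and specifying the callback as a dotted import path is an assumption, since the config system from #44 and the way data reaches the forkserver process are still open questions.

```python
import importlib
from typing import Any, Callable, Optional


def resolve_callback(dotted_path: str) -> Callable[..., Any]:
    """Turn a path such as 'package.module.func' into the callable it names."""
    module_name, _, attr = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.attribute', got {dotted_path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


def run_pre_fork(dotted_path: Optional[str]) -> Any:
    """Run the configured pre-fork callback, if one is configured."""
    if dotted_path is None:
        return None
    return resolve_callback(dotted_path)()


# In the real module, run_pre_fork(...) would be called at import time,
# because the forkserver preload mechanism only imports the listed modules.
# Where the dotted path comes from (for example a re-loaded config file)
# is deliberately left open here.
```

Keeping the module this small means that whatever the callback imports or initializes ends up in the forkserver process before any worker is forked from it.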

forkserver imports

At some point, any point, in the code before we run the service, we would do this.

import multiprocessing

multiprocessing.set_forkserver_preload(["dispatcher.hazmat.pre_fork"])

This is coupled with using forkserver generally.

import multiprocessing

multiprocessing.set_start_method("forkserver")

And the rest of the code would proceed as it is now. Doing this, we need to carefully measure and compare memory use.
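For comparing memory use across a whole pool of workers, one option is to sum the resident set size over all worker PIDs. The sketch below is a psutil-free alternative that reads `/proc/<pid>/status`, so it assumes Linux; the names `rss_mb` and `total_rss_mb` are made up for illustration.

```python
import os


def rss_mb(pid: int) -> float:
    """Resident set size of one process in MiB, read from /proc (Linux only)."""
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            if line.startswith("VmRSS:"):
                # The line looks like "VmRSS:     12345 kB".
                return int(line.split()[1]) / 1024
    raise RuntimeError(f"no VmRSS entry found for pid {pid}")


def total_rss_mb(pids: list) -> float:
    """Sum of RSS over several processes, e.g. all workers of one service."""
    return sum(rss_mb(pid) for pid in pids)
```

One caveat for the comparison: RSS counts pages shared between processes once per process, so a summed RSS can overstate the real footprint of workers that share copy-on-write memory with their parent. The numbers are best read as relative, taken the same way under each start method.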

@AlanCoding
Member Author

Link to a branch that demos this on the AWX side:

ansible/awx@devel...AlanCoding:awx:forkserver
