move_mount: EINVAL on bazzite #38

Open
ishcabittle opened this issue Dec 17, 2024 · 27 comments

Comments

@ishcabittle

bug-report.zip

Launching any application from the macOS GUI app results in a timeout message. I've tried three applications: Steam, Heroic Games Launcher, and Lutris. Each time, I get the spinner for about 10 seconds and then a timeout message:

[Screenshot of the timeout message, 2024-12-17 10:04:51 AM]

I am starting mmserver in an SSH session to the host, with the simple toml provided in the setup docs:

[apps.lutris]
command = ["lutris"]
xwayland = true
force_1x_scale = false
isolate_home = false

The mmserver binary reports a good startup:

./mmserver -C lutris.toml --bind 10.0.15.59:9599
2024-12-17T15:06:42.603820Z INFO mmserver::vulkan: selected gpu: "NVIDIA GeForce GTX 1060 3GB" (0)
2024-12-17T15:06:42.694870Z INFO mmserver::vulkan: hardware encoding support: (h264: true, h265: true, av1: false)
2024-12-17T15:06:42.699841Z INFO mmserver: listening on 10.0.15.59:9599

A simple telnet to the port will report connection refused, however:

telnet 10.0.15.59 9599
Trying 10.0.15.59...
telnet: connect to address 10.0.15.59: Connection refused
telnet: Unable to connect to remote host

The macOS GUI app does automatically see the host and the presented application, and I can shut down mmserver, switch to a new application toml, and see the application name change in the GUI app. I have tried starting mmserver both logged in to and logged out of the desktop session; that doesn't seem to affect the behavior.

Is there a test application that is quick to launch that might circumvent this timeout, something like VLC? What would the toml for that look like? Is serving an application from mmserver using flatpak supported (so something like: command = ["flatpak", "run", "com.mydomain.myapp"])?

@colinmarc
Owner

colinmarc commented Dec 17, 2024

Do you have Xwayland on your path? It seems to be hanging there - which is a bug, but maybe that's the root issue.

Edit: I forgot to say - thank you for the detailed bug report!

@colinmarc
Owner

A simple telnet to the port will report connection refused, however:

That's because the protocol is based on QUIC, which is based on UDP. Telnet uses TCP.
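To illustrate why a TCP probe says nothing about a UDP service, here is a self-contained sketch on loopback. It binds a plain UDP socket as a stand-in for the server (it is not QUIC and not mmserver), then shows that a TCP connect to the same port number fails while a UDP datagram arrives. The function names are invented for the example.

```python
import socket


def tcp_connects(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def demo() -> tuple[bool, bytes]:
    # Stand-in for a UDP-only server; port 0 lets the OS pick a free port.
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    server.settimeout(1.0)
    host, port = server.getsockname()

    # TCP probe (what telnet does). TCP and UDP ports are separate
    # namespaces, so this fails unless an unrelated TCP service
    # happens to use the same port number.
    tcp_ok = tcp_connects(host, port)

    # A UDP datagram to the same port number is delivered.
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.sendto(b"ping", (host, port))
    data, _ = server.recvfrom(16)

    client.close()
    server.close()
    return tcp_ok, data


if __name__ == "__main__":
    print(demo())
```

So "connection refused" from telnet is expected here and does not mean the server is unreachable.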

Is there a test application that is quick to launch that might circumvent this timeout, something like VLC? What would the toml for that look like? Is serving an application from mmserver using flatpak supported (so something like: command = ["flatpak", "run", "com.mydomain.myapp"])?

I don't think the app is the issue. I haven't tried flatpak, but please let me know if it works!

@colinmarc
Owner

I was able to reproduce this by moving /usr/bin/Xwayland to Xwayland-bak. It should now print an error message instead of hanging forever (the fix will be in the next release).

The actual fix is to make sure Xwayland is installed and on your PATH. You need it to run X11 applications (mmserver is based on Wayland).
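A quick way to check this from the same environment that starts mmserver is sketched below. It uses Python's `shutil.which`, which applies the same kind of PATH lookup a shell does. The wrapper name `find_binary` is made up, and the result only reflects the PATH of the process running the script, which may differ from a service or SSH session.

```python
import shutil
from typing import Optional


def find_binary(name: str, path: Optional[str] = None) -> Optional[str]:
    """Return the full path of an executable called `name`, or None.

    `path` is a PATH-style string; None means "use the current PATH".
    """
    return shutil.which(name, path=path)


if __name__ == "__main__":
    location = find_binary("Xwayland")
    if location is None:
        print("Xwayland not found on PATH")
    else:
        print(f"Xwayland found at {location}")
```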

@ishcabittle
Author

That's interesting, Xwayland seems to be in my path, right where you would expect:

ishcabittle@bazzite:~$ echo $PATH
/var/home/ishcabittle/.local/bin:/var/home/ishcabittle/bin:/home/linuxbrew/.linuxbrew/bin:/home/linuxbrew/.linuxbrew/sbin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin
ishcabittle@bazzite:~$ which Xwayland
/usr/bin/Xwayland

Is there a command line flag I can include that will increase the timeout on the server side? Or is that client side... if so, I'll try the command line client on the mac.

@colinmarc
Owner

You're welcome to try the command-line client on your mac, but the bug is definitely server side. Whatever error is happening is never being printed (it's actually the code to print the error that had an infinite loop 😓 ).

Let me push a server release and then we can see if we get more information.

@colinmarc reopened this on Dec 17, 2024
@ishcabittle
Author

Just tried 0.6.1, the GUI error message changes to:

[Screenshot of the new GUI error message, 2024-12-17 2:23:50 PM]

with an error in the server CLI:

2024-12-17T19:17:52.280190Z ERROR mmserver::server::handlers: failed to launch session: spawning Xwayland: failed to spawn XWayland: timed out waiting for forked child (phase 1): Connection timed out (os error 110)

@colinmarc
Owner

colinmarc commented Dec 17, 2024

Hm, weird. Can you upload the full bug report dir please?

@colinmarc
Owner

colinmarc commented Dec 17, 2024

The other thing you can try is:

strace -ff -o mmserver.trace ./mmserver ...

And then upload a zip of the resulting mmserver.trace.* files (strace writes one per process). Please try just a single time and then upload the results, so that there aren't as many logs to wade through.
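With `-ff -o mmserver.trace`, strace writes one file per process, named `mmserver.trace.<pid>`. If you want to skim them locally before uploading, the sketch below lists every syscall that returned -1 along with its errno name. The regular expression assumes strace's default line format (no timestamps or other prefixes), and the function names are invented for this example.

```python
import re
from pathlib import Path

# Matches a failed syscall line such as:
#   mkdirat(AT_FDCWD, "/home", 000) = -1 EEXIST (File exists)
FAILED = re.compile(r"^(?P<call>\w+)\(.*\)\s+= -1 (?P<errno>E[A-Z0-9]+)")


def failed_syscalls(lines):
    """Yield (syscall, errno_name) for every line that returned -1."""
    for line in lines:
        match = FAILED.match(line.strip())
        if match:
            yield match.group("call"), match.group("errno")


def scan_trace_files(prefix: str):
    """Scan every PREFIX.<pid> file and yield (file, syscall, errno_name)."""
    base = Path(prefix)
    for path in sorted(base.parent.glob(base.name + ".*")):
        with path.open(errors="replace") as handle:
            for call, err in failed_syscalls(handle):
                yield path.name, call, err


if __name__ == "__main__":
    for name, call, err in scan_trace_files("mmserver.trace"):
        print(f"{name}: {call} -> {err}")
```

Many failures in a trace are harmless (for example ENOENT while searching library paths), so the output is a starting point rather than a diagnosis.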

@ishcabittle
Author

Sorry, I forgot my tools and procedures:
bug-report-2.zip

strace output:
mmserver-trace.zip
There are more than a few files there, but I promise I only tried once for that group.

@colinmarc
Owner

These are the failing syscalls (in particular the last one):

mmserver.trace.78583-89-fsopen("tmpfs", FSOPEN_CLOEXEC)         = 90
mmserver.trace.78583-90-fsconfig(90, FSCONFIG_SET_STRING, "mode", "0777", 0) = 0
mmserver.trace.78583-91-fsconfig(90, FSCONFIG_SET_STRING, "size", "1g", 0) = 0
mmserver.trace.78583-92-fsconfig(90, FSCONFIG_CMD_CREATE, NULL, NULL, 0) = 0
mmserver.trace.78583:93:fsmount(90, FSMOUNT_CLOEXEC, MOUNT_ATTR_NOSUID|MOUNT_ATTR_NOEXEC|MOUNT_ATTR_NOATIME) = 91
mmserver.trace.78583-94-mkdirat(AT_FDCWD, "/home", 000)         = -1 EEXIST (File exists)
mmserver.trace.78583:95:move_mount(91, "", AT_FDCWD, "/home", MOVE_MOUNT_F_EMPTY_PATH|MOVE_MOUNT_T_EMPTY_PATH) = -1 EINVAL (Invalid argument)
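Since both the `mkdirat` and the `move_mount` calls target `/home`, one low-effort way to gather more facts is to look at what that path actually is on the affected machine. The sketch below only reports basic properties of a path. It does not reproduce the syscalls, and nothing in this thread establishes the cause of the EINVAL. The one hint it follows up on is an assumption: the PATH output earlier shows the home directory under `/var/home`, so `/home` may not be a plain directory on this system. `describe_target` is a made-up helper name.

```python
import os


def describe_target(path: str) -> dict:
    """Collect basic facts about a prospective mount target."""
    return {
        "path": path,
        "exists": os.path.lexists(path),         # True even for a dangling symlink
        "is_symlink": os.path.islink(path),
        "resolved": os.path.realpath(path),      # where the path really points
        "is_dir": os.path.isdir(path),           # follows symlinks
        "is_mountpoint": os.path.ismount(path),  # always False for a symlink
    }


if __name__ == "__main__":
    for key, value in describe_target("/home").items():
        print(f"{key}: {value}")
```

Running it on both the bazzite and the Ubuntu box and comparing the output would show whether the two systems differ at that path.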

@colinmarc
Owner

Just so I have it, what's your kernel version?

As a heads up, I'm pretty stumped by this and it may take me a few days to figure out a repro/fix.

@colinmarc changed the title from "Launching application from GUI client results in timeout message" to "move_mount: EINVAL" on Dec 17, 2024
@colinmarc
Owner

Here's a possible fix: af519ee

I'm not going to push a release with it until I do some more digging. If you're down to build mmserver, then you could see if it fixes the problem for you :)

@ishcabittle
Author

Just so I have it, what's your kernel version?

As a heads up, I'm pretty stumped by this and it may take me a few days to figure out a repro/fix.

My kernel is 6.11.10-304.bazzite.fc41.x86_64

@ishcabittle
Author

Here's a possible fix: af519ee

I'm not going to push a release with it until I do some more digging. If you're down to build mmserver, then you could see if it fixes for you :)

Ef it, I'll give it a go!

@ishcabittle
Author

Hey, I've built af519ee from source and given it a try - sorry to report still no joy.

This is run from the bazzite box:
mmserver-2-trace.zip
bug-report-3.zip

I'm going to give it a shot from the ubuntu box I built mmserver on; that might help us determine whether this is an issue with bazzite itself.

@ishcabittle
Author

Another run, this time commit af519ee from an Ubuntu 24.04 box:

mmserver-3-trace.zip
bug-report-4.zip

@colinmarc
Owner

The ubuntu issue is #29 - please see the fix in that thread. (I'll add a startup warning so it's more obvious.)

@colinmarc
Owner

Hey, I've built af519ee from source and given it a try - sorry to report still no joy.

I missed a callsite 😅 Can you try 2e60536?

@ishcabittle
Author

Sure thing, I'll report back.

@ishcabittle
Author

here's 2e60536's bug reports:

mmserver-4-trace.zip
bug-report-5.zip

This is before the

sudo sysctl -w kernel.apparmor_restrict_unprivileged_unconfined=0
sudo sysctl -w kernel.apparmor_restrict_unprivileged_userns=0

fix. After applying it, a new error appeared in the console:

2024-12-20T16:06:23.783906Z INFO mmserver::session: launching session session_id=153534 application="steam-gamepadui"
2024-12-20T16:06:23.891981Z INFO mmserver::compositor::child: child process exited exit_status=1
2024-12-20T16:06:23.893640Z ERROR mmserver::server::handlers: failed to launch session: child process exited with status: 1

and here are the bug reports after running the apparmor commands:

bug-report-6.zip
mmserver-5-trace.zip
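For reference, the two AppArmor settings changed above can be read back without root, which makes it easy to record their state alongside a bug report. The sketch assumes the usual Linux mapping of a sysctl key to a file under `/proc/sys` (dots become path separators). The `root` parameter and the helper name `read_sysctl` exist only for this example. Values set with `sysctl -w` do not persist across a reboot.

```python
from pathlib import Path
from typing import Optional

# The two settings named in the sysctl commands above.
KEYS = (
    "kernel.apparmor_restrict_unprivileged_unconfined",
    "kernel.apparmor_restrict_unprivileged_userns",
)


def read_sysctl(key: str, root: str = "/proc/sys") -> Optional[str]:
    """Return the current value of a sysctl key, or None if it is absent."""
    path = Path(root, *key.split("."))
    try:
        return path.read_text().strip()
    except OSError:
        # Missing key (e.g. a kernel without these AppArmor knobs) or no access.
        return None


if __name__ == "__main__":
    for key in KEYS:
        print(f"{key} = {read_sysctl(key)}")
```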

@colinmarc
Owner

here's 2e60536's bug reports:

mmserver-4-trace.zip
bug-report-5.zip

This is before the

sudo sysctl -w kernel.apparmor_restrict_unprivileged_unconfined=0
sudo sysctl -w kernel.apparmor_restrict_unprivileged_userns=0

fix. After applying it, a new error appeared in the console:

2024-12-20T16:06:23.783906Z INFO mmserver::session: launching session session_id=153534 application="steam-gamepadui"
2024-12-20T16:06:23.891981Z INFO mmserver::compositor::child: child process exited exit_status=1
2024-12-20T16:06:23.893640Z ERROR mmserver::server::handlers: failed to launch session: child process exited with status: 1

and here are the bug reports after running the apparmor commands:

bug-report-6.zip
mmserver-5-trace.zip

Sorry, I'm a bit mixed up now. This is all on ubuntu? Or are we back on bazzite?

bug-report-6.zip

This is just steam exiting with an error - I don't know why, but you can try with some other program to see if that is happier.

@ishcabittle
Author

Sticking with Ubuntu for the time being, easier to build and run on one box than to switch between the two.

I'll try something else, one sec.

@colinmarc
Owner

Ok, looks like something related to snap:

newfstatat(AT_FDCWD, "/proc/1/root/snap", {st_mode=S_IFDIR|0755, st_size=4096, ...}, AT_SYMLINK_NOFOLLOW) = 0
getrandom("\xef\xf1\x81\xc3\xdb\x6a\xd3\x7e", 8, GRND_NONBLOCK) = 8
brk(NULL)                               = 0x5fd4d43c4000
brk(0x5fd4d43e5000)                     = 0x5fd4d43e5000
umask(000)                              = 002
openat(AT_FDCWD, ".", O_RDONLY|O_NOFOLLOW|O_CLOEXEC|O_PATH|O_DIRECTORY) = 3
fstat(3, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
chdir("/")                              = 0
getresuid([1000], [1000], [1000])       = 0
getresgid([1000], [1000], [1000])       = 0
write(2, "need to run as root or suid", 27) = 27

I'm happy to help figure out how to make snap programs work (maybe in a separate issue), but that's out of scope here.

For this issue thread, the thing I'm most interested in is the move_mount call failing on bazzite.

@colinmarc changed the title from "move_mount: EINVAL" to "move_mount: EINVAL on bazzite" on Dec 20, 2024
@ishcabittle
Author

Ah, okay, I will switch back to bazzite right after this next attempt on ubuntu.

@ishcabittle
Author

Latest bug report of 2e60536 on bazzite:

bug-report-7.zip

@colinmarc
Owner

I can't tell exactly without the strace output, but it looks like my shot-in-the-dark fix didn't do anything. 🙁

@ishcabittle
Author

Trace:

mmserver-trace-5.zip
