Add check endpoint which can be used with nginx' auth_request function #694

Open
wants to merge 3 commits into
base: master

SuperSandro2000

This makes it possible to use Anubis with nginx's auth_request directive.

This is a simplified example extracted from my config:

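# Anubis itself (challenge page, assets, API) must be reachable without a check.
location /.within.website/ {
        proxy_pass http://127.0.0.1:8923;
        auth_request off;
}
# Redirect unauthenticated clients to Anubis, passing the original URL in redir.
location @redirectToAuth2ProxyLogin {
        return 307 /.within.website/?redir=$scheme://$host$request_uri;
        auth_request off;
}
# Ask Anubis whether the request may pass; a 401 triggers the redirect above.
auth_request /.within.website/x/cmd/anubis/api/check;
error_page 401 = @redirectToAuth2ProxyLogin;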
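
For readers adapting the snippet above, here is a slightly fuller sketch of how the pieces might sit inside a server block. It is not taken from the PR: the server_name, the backend address 127.0.0.1:8080, the dedicated exact-match location for the check URL, and the location name @anubisChallenge are all assumptions made for illustration. The Anubis address and the check path are the ones from the snippet above. Dropping the request body on the check subrequest follows the usual nginx auth_request pattern and assumes the check endpoint does not need the body.

```nginx
server {
        listen 80;
        server_name example.com;            # hypothetical host

        # Ask Anubis about every request handled by this server block.
        auth_request /.within.website/x/cmd/anubis/api/check;
        # A 401 from the check is turned into a redirect to the challenge.
        error_page 401 = @anubisChallenge;  # hypothetical location name

        # Exact match, so it wins over the prefix location below.
        location = /.within.website/x/cmd/anubis/api/check {
                auth_request off;
                proxy_pass http://127.0.0.1:8923;
                proxy_pass_request_body off;          # assumption: the check ignores the body
                proxy_set_header Content-Length "";
        }

        # Challenge page and assets are served by Anubis without a check.
        location /.within.website/ {
                auth_request off;
                proxy_pass http://127.0.0.1:8923;
        }

        location @anubisChallenge {
                auth_request off;
                return 307 /.within.website/?redir=$scheme://$host$request_uri;
        }

        # Everything else goes to the real application (hypothetical address).
        location / {
                proxy_pass http://127.0.0.1:8080;
        }
}
```

Giving the check URL its own location keeps the body-stripping settings from affecting other requests under /.within.website/.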

SuperSandro2000 marked this pull request as ready for review March 15, 2025 06:09
Xe self-requested a review March 15, 2025 19:23
Xe self-assigned this Mar 15, 2025
@baconwaifu

If this had a way to return both an unauthorized status and the response body, it would also work rather seamlessly with Traefik's ForwardAuth middleware. (I see no reason why the original path has to return a 200; Cloudflare returns 403 when it intercepts with the challenge.)

@Xe
Owner

Xe commented Mar 15, 2025

> (I see no reason why the original path has to return a 200; Cloudflare returns 403 when it intercepts with the challenge)

I hate to inform you that there is a reason to it, and the rationale kinda incredibly sucks. If AI scrapers see 4xx statuses, they just re-queue the page for trying again later. The only way to get them to actually go away and stay away is to lie and return HTTP 200 when the status is not semantically 200. This is surely a violation of some RFC somewhere, but sadly this is what works.

@baconwaifu

> I hate to inform you that there is a reason to it, and the rationale kinda incredibly sucks. If AI scrapers see 4xx statuses, they just re-queue the page for trying again later. The only way to get them to actually go away and stay away is to lie and return HTTP 200 when the status is not semantically 200. This is surely a violation of some RFC somewhere, but sadly this is what works.

That is utterly horrifying. And also just plain stupid on the part of the bots, considering that most non-ratelimit responses probably aren't going to change from time alone...

So that rules out changing the default proxy-mode response code. The alternative is to make the webhook mode configurable so that it can also return the challenge body: Traefik expects the webhook to return the error page directly and has no easy equivalent of nginx's error_page. (The webhook could instead return a redirect to the magic prefix, since a 3xx response also works to hijack the response.)

@Xe
Owner

Xe commented Mar 17, 2025

> That is utterly horrifying. And also just plain stupid on the part of the bots, considering that most non-ratelimit responses probably aren't going to change from time alone.

Yep! This is the reality we got. The really annoying part is that this makes the user experience super bad, but this was not a decision I made out of choice. It is a decision made out of desperation. Amazonbot in particular speeds up its requests if you return a 5xx response.

@Xe
Owner

Xe commented Mar 19, 2025

@SuperSandro2000 I'm going to manually recreate this PR in TecharoHQ/anubis as I am moving the project there (see #698 and TecharoHQ/anubis#1).

Seriously though, thank you for the contribution! It's what actually pushed me over the line to want to move this to a dedicated org/repo.

@SuperSandro2000
Author

😅 If that was all it took. But seriously: I debugged this until 5 am or so, because matching nginx absolute and relative paths with auth_request and adding the right return codes to Anubis was not as easy as I would have liked.

My time is currently a bit spotty, but feel free to refactor/clean up/whatever this. I'll try to keep an eye on it, but no promises.

Projects
Status: In Progress

3 participants