Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decouple code for server startup, lifecycle hooks, FastAPI app, and routes #962

Merged
merged 4 commits into from
Feb 13, 2025

Conversation

stbaione
Copy link
Contributor

From an application standpoint, the code that we have to startup our server, add routes, add lifecycle hooks, etc. is very tightly-coupled and doesn't provide room for scale. It also makes it a bit difficult to debug and write unit tests for the individual components.

This separates the logic for starting the server, creating the FastAPI application, registering lifecycle hooks, and adding new routes.

It should make it pretty quick to expand upon each component individually, and will allow us to add more features to the server cleanly.

As a manual validation step (along with our CI), I ran my script that sends 30 requests concurrently, results for that here

@stbaione stbaione requested a review from renxida February 13, 2025 01:42
@renxida
Copy link
Contributor

renxida commented Feb 13, 2025

I:

  • agree we need to break up server.py
  • disagree that it has to be so many small chunks. People have difficulty reading large files and remembering everything, but navigating through multiple small files have a nonnegligible overhead also. Splitting by relevance feels better.

Suggestion:

  • split server.py in 2 pieces. server_app, and serving_manager.
    • app responsibilities:
      • parse args and split between shortfin args and fastapi server args
      • map routes
      • configure http related arguments
      • start the app
    • serving_manager responsibilities
      • take args from args and initialize a GenerateService
      • allow server_app to import a set of functions / a single context manager to initialize, use, and deinitialize this.

@renxida
Copy link
Contributor

renxida commented Feb 13, 2025

(maybe in a different PR)

@stbaione stbaione merged commit f2b91eb into nod-ai:main Feb 13, 2025
36 of 38 checks passed
monorimet pushed a commit that referenced this pull request Feb 13, 2025
…outes (#962)

From an application standpoint, the code that we have to startup our
server, add routes, add lifecycle hooks, etc. is very tightly-coupled
and doesn't provide room for scale. It also makes it a bit difficult to
debug and write unit tests for the individual components.

This separates the logic for starting the server, creating the FastAPI
application, registering lifecycle hooks, and adding new routes.

It should make it pretty quick to expand upon each component
individually, and will allow us to add more features to the server
cleanly.

As a manual validation step (along with our CI), I ran my script that
sends 30 requests concurrently, results for that
[here](https://gist.github.com/stbaione/23673f7a8945b9b18eb7ffcfcebed8bc)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants