-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: grpc health service #4020
base: master
Are you sure you want to change the base?
Conversation
cmd/dex/serve.go
Outdated
go func() { | ||
var status healthgrpc.HealthCheckResponse_ServingStatus | ||
for { | ||
switch healthChecker.IsHealthy() { | ||
case true: | ||
status = healthgrpc.HealthCheckResponse_SERVING | ||
default: | ||
status = healthgrpc.HealthCheckResponse_NOT_SERVING | ||
} | ||
healthcheck.SetServingStatus("", status) | ||
time.Sleep(healthCheckPeriod) | ||
} | ||
}() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could ae avoid using goroutines in goroutines here and use a separate group for the health endpoint?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code updated.
api.RegisterDexServer(grpcSrv, server.NewAPI(serverConfig.Storage, logger, version, serv)) | ||
|
||
grpcMetrics.InitializeMetrics(grpcSrv) | ||
if c.GRPC.Reflection { | ||
logger.Info("enabling reflection in grpc service") | ||
reflection.Register(grpcSrv) | ||
} | ||
ctx, cancelHealthcheck := context.WithCancel(context.Background()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you need a ctx here?
I think group
does the trick with canceling the function.
Can you check it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need context for early breaking from the for { }
loop below when shutdown is requested.
Indeed, group
cancels the function, but after the gracefulStop
of grpc server times out (see my comment below). Meanwhile, the healhcheck loop keeps updating status if not properly stopped from the outside (which is not a big deal, but I prefer stopping stuff when it is not needed anymore).
I can use bare channel for receiving shutdown event, but context is more convenient.
|
||
group.Add(func() error { | ||
return grpcSrv.Serve(grpcListener) | ||
}, func(err error) { | ||
logger.Debug("starting graceful shutdown", "server", "grpc") | ||
grpcSrv.GracefulStop() | ||
ctx, cancel := context.WithTimeout(context.Background(), shutdownTimeout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There you also probably do not need a context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This context with timeout is essential for correct shutdown of grpc server. which now provides streaming method grpc.health.v1.Health/Watch
. When stream is kept open by one or more client, gracefulStop
waits indefinitely until all streams are closed. That's why I added 1-minute timeout on gracefulStop, followed by grpcSrv.Stop()
when server doesn't stop on time.
Again, I can implement it using bare channels, but I believe that using context
lib is a more idiomatic approach.
Implements gRPC Health check using standard grpc library. Reuses the health as reported by gosundheit health checker. Grpc health server implements also streaming Watch method, so it was necessary to add Stop() failback to stop server even if some clients ignore NOT_SERVING status. Timeout for graceful shutdown was chosen to match timeouts for http and telemetry servers. Signed-off-by: Robert Moucha <robert.moucha@gooddata.com>
Overview
Implements gRPC Health check using standard grpc library.
Reuses the health as reported by gosundheit health checker. Grpc health server
implements also streaming Watch method, so it was necessary to add
Stop() failback to stop server even if some clients ignore NOT_SERVING
status. Timeout for graceful shutdown was chosen to match timeouts for
http and telemetry servers.
What this PR does / why we need it
Dex already provides health check endpoint via HTTP. For integrations based on
gRPC, notably if we want to use client-side health checks for higher reliability,
as described here, we also need
to report service health using
grpc.health.v1.Health
service.Special notes for your reviewer
Golang is not my mother tongue, so if there is something utterly wrong, please let me
know.