[PROD] Performance issue - extremely slow loading of riskwatch data on GO - please investigate #2184
Comments
@thenav56, @szabozoltan69, I'm not sure whether the recent disk storage issue could be the reason for this. @nanometrenat, do you still experience such long response times?
It could be the reason. Now that more space is available, the two queries mentioned (like this one) run fast.
Hi there, it seems fine today - loading quickly like I would expect. Does the timing of the ticket correlate with when there were storage space issues? If so then that is presumably a valid explanation! Thanks
Yes, I think so, though no ticket was created for that; it was only discussed with @thenav56.
Great that the incident earlier this week was resolved swiftly! @szabozoltan69 @thenav56 is the root cause also resolved, i.e. is monitoring in place so that we get alerted and can fix it in advance next time? If so then I will happily close this ticket - thanks again
Hey @nanometrenat @szabozoltan69 @tovari, We had some issues with the background tasks running on the same server as the API server. A memory leak in the background tasks affected the API server. We've added memory usage limits to the workers, which should fix the issue.
We've also been working on fixing the memory leak and are currently testing this in nightly. We've integrated Sentry profiling and cron monitoring, and we'll be pushing these changes to staging and production soon. Let's keep this ticket open for now. Once we've pushed the changes to production, we can revisit and close it 😄
Update: We can now use Sentry to track and fix performance issues. We have also added a health check to track the state of running instances.
Amazing...
Thanks @thenav56 - brilliant news! Closing this ticket on the basis that the underlying issue has been resolved and also monitoring has been added. Thanks once again to all!
Issue
Risk watch API calls are taking much too long - like, 2 whole minutes to load the "Countries by Risk" data.
For example, for Africa, I went to https://go.ifrc.org/regions/0/risk-watch/seasonal and the page itself loaded quickly, however the Countries by Risk was just showing as loading. Looking at devtools I can see that
https://go-risk.northeurope.cloudapp.azure.com/api/v1/seasonal/?region=0 and
https://go-risk.northeurope.cloudapp.azure.com/api/v1/risk-score/?region=0&limit=9999
each took two mins
See screenshots from Devtools below
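To reproduce the timings outside the browser, a small script along these lines could be used. It is only a sketch: the two URLs are the ones quoted above, while the helper names (`fetch`, `time_call`, `main`) and the 300-second timeout are illustrative assumptions, not part of GO.

```python
import time
import urllib.request
from typing import Callable, Tuple

# The two slow endpoints quoted in this issue.
URLS = [
    "https://go-risk.northeurope.cloudapp.azure.com/api/v1/seasonal/?region=0",
    "https://go-risk.northeurope.cloudapp.azure.com/api/v1/risk-score/?region=0&limit=9999",
]


def fetch(url: str, timeout: float = 300.0) -> int:
    """Download the full response body and return its size in bytes.

    The timeout value is an assumption, chosen to exceed the ~2 minutes reported.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return len(resp.read())


def time_call(func: Callable[[str], int], url: str) -> Tuple[float, int]:
    """Run func(url) and return (elapsed seconds, value returned by func)."""
    start = time.perf_counter()
    size = func(url)
    return time.perf_counter() - start, size


def main() -> None:
    """Time each endpoint once and print one line per URL."""
    for url in URLS:
        elapsed, size = time_call(fetch, url)
        print(f"{elapsed:8.2f} s  {size:>10} bytes  {url}")
```

Calling `main()` from a different network or machine helps separate a slow server from a slow local connection.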
Similarly, if I am on the Imminent events page and select one of the countries' events then it takes > 8 seconds to load that one event (though in this case I can see it's queuing for a while before it goes, not sure what that means)
https://go.ifrc.org/regions/0/risk-watch/imminent page - calls https://go-risk.northeurope.cloudapp.azure.com/api/v1/pdc/99638/exposure/ - took
I have been doing Teams calls etc. on this same internet connection, and also using other parts of GO fine, so not sure why this bit of GO is so slow.
Thanks for your help investigating!
cc @justinginnetti
Screenshots etc.
I have attached my .har file in Teams if useful for investigating.
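For anyone picking up the .har file, a rough way to list the slowest requests is sketched below. It assumes the file follows the standard HAR 1.2 layout (`log.entries[]` with `time`, `request.url` and `timings` in milliseconds); the function names and the `session.har` path are hypothetical.

```python
import json
from typing import Any, Dict, List


def _ms(value: Any) -> float:
    """HAR uses -1 (or omits the field) when a timing does not apply; treat that as 0."""
    return float(value) if isinstance(value, (int, float)) and value > 0 else 0.0


def slowest_entries(har: Dict[str, Any], top: int = 5) -> List[Dict[str, Any]]:
    """Return the `top` slowest requests with total, blocked and wait times."""
    rows = []
    for entry in har["log"]["entries"]:
        timings = entry.get("timings", {})
        rows.append({
            "url": entry["request"]["url"],
            "total_ms": _ms(entry.get("time")),
            # "blocked" covers time spent queued before the request was sent.
            "blocked_ms": _ms(timings.get("blocked")),
            # "wait" is the time spent waiting for the server's first byte.
            "wait_ms": _ms(timings.get("wait")),
        })
    rows.sort(key=lambda r: r["total_ms"], reverse=True)
    return rows[:top]


def load_har(path: str) -> Dict[str, Any]:
    """Read a HAR file (plain JSON) from disk."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

For example, `slowest_entries(load_har("session.har"))` would show whether the time is dominated by `wait_ms` (server side) or `blocked_ms` (queuing in the browser).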
Expected behaviour
Not sure of our SLA for API responses these days but I think this is too long in any case!
Thanks loads