Commit
switch overhead alert panels to filtered metric
Tracking of our Grafana alert panels uncovered a form of "unfair inflation" of our overhead percentages, given that:
- the Tekton timestamps are rounded to seconds
- certain user-defined pipelines can succeed very quickly, in a matter of seconds, which is much faster than RHTAP pipelines

So we built a single metric for each of our two overheads. Each one combines the algorithms from the multiple metrics we were using before, but ignores pipelines that ended very quickly.

After monitoring for a few weeks, the results appear favorable, with these specifics:
- if the super fast pipelines had 0 overhead (i.e. 0 to 459 milliseconds), then when our overheads over time were small with the original metrics that included every pipelinerun, our filtered metrics were slightly higher, because they did not benefit from the averages being lowered by samples of 0
- if the super fast pipelines had any overhead (i.e. 500 milliseconds or greater), then when we saw larger than typical overheads with the original metrics that included every pipelinerun, our filtered metrics produced lower overhead results
- generally speaking, the improvements with the new metrics outweighed any degradations
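The effect described above can be illustrated with a minimal Python sketch. It is not the metric code from this commit. It assumes that the overhead percentage is the rounded overhead divided by the rounded pipelinerun duration, that rounding is half-up to whole seconds, and that "very quick" means shorter than a hypothetical 30 second cutoff. The names `Run`, `to_seconds`, `overhead_pct` and `min_duration_s` are invented for the illustration.

```python
from dataclasses import dataclass


@dataclass
class Run:
    duration_ms: int  # true wall-clock duration of the pipelinerun
    overhead_ms: int  # true overhead attributed to the pipelinerun


def to_seconds(ms: int) -> int:
    """Round milliseconds half-up to whole seconds (assumed timestamp granularity)."""
    return (ms + 500) // 1000


def overhead_pct(runs, min_duration_s=0):
    """Mean overhead percentage over runs lasting at least min_duration_s seconds.

    min_duration_s=0 includes every pipelinerun (the original behaviour);
    a positive value drops the very quick ones (the filtered behaviour).
    """
    pcts = []
    for r in runs:
        dur_s = to_seconds(r.duration_ms)
        if dur_s == 0 or dur_s < min_duration_s:
            continue  # skip zero-length and filtered-out runs
        pcts.append(100.0 * to_seconds(r.overhead_ms) / dur_s)
    return sum(pcts) / len(pcts) if pcts else 0.0


# Two long runs with 1% overhead each (6 s of overhead in 600 s).
long_runs = [Run(600_000, 6_000), Run(600_000, 6_000)]

# A 2 s run whose 400 ms overhead rounds down to 0 s.
case_zero = long_runs + [Run(2_000, 400)]
# A 2 s run whose 600 ms overhead rounds up to 1 s, i.e. 50% overhead.
case_inflated = long_runs + [Run(2_000, 600)]

unfiltered_zero = overhead_pct(case_zero)
filtered_zero = overhead_pct(case_zero, min_duration_s=30)
unfiltered_inflated = overhead_pct(case_inflated)
filtered_inflated = overhead_pct(case_inflated, min_duration_s=30)
```

In this toy data, the fast run that rounds to 0 pulls the unfiltered average below the filtered one, while the fast run that rounds to 1 second pushes the unfiltered average far above it, which mirrors the two cases listed in the commit message.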