Files
grafana/pkg/services/ngalert/api/tooling
Steve Simpson 73873f5a8a Alerting: Optimize rule status gathering APIs when a limit is applied. (#86568)
* Alerting: Optimize rule status gathering APIs when a limit is applied.

The frontend very commonly calls the `/rules` API with `limit_alerts=16`. When
there are a very large number of alert instances present, this API is quite
slow to respond, and profiling suggests that a big part of the problem is
sorting the alerts by importance, in order to select the first 16.

This changes the application of the limit to use a more efficient heap-based
top-k algorithm. This maintains a slice of only the highest ranked items whilst
iterating the full set of alert instances, which substantially reduces the
number of comparisons needed. This is particularly effective, as the
`AlertsByImportance` comparison is quite complex.

I've included a benchmark to compare the new TopK function to the existing
Sort/limit strategy. It shows that for small limits, the new approach is
much faster, especially at high numbers of alerts, e.g.

100K alerts / limit 16: 1.91s vs 0.02s (-99%)

For situations where there is no effective limit, sorting is marginally faster,
therefore in the API implementation, if there is either a) no limit or b) no
effective limit, then we just sort the alerts as before. There is also a space
overhead using a heap which would matter for large limits.

* Remove commented test cases

* Make linter happy
2024-04-19 11:51:22 +02:00
..
2024-02-26 10:52:44 -07:00

What

This aims to define the unified alerting API as code. It generates OpenAPI definitions from go structs. It also generates server/route stubs based on our documentation.

Running

make - regenerate everything - documentation and server stubs. make serve - regenerate the Swagger document, and host rendered docs on port 80. view api

Requires

Why

The current state of Swagger extraction from golang is relatively limited. It's easier to generate server stubs from an existing Swagger doc, as there are limitations with producing a Swagger doc from a hand-written API stub. The current extractor instead relies on comments describing the routes, but the comments and actual implementation may drift, which we don't want to allow.

Instead, we use a hybrid approach - we define the types in Golang, with comments describing the routes, in a standalone package with minimal dependencies. From this, we produce a Swagger doc, and then turn the Swagger doc back into a full-blown server stub.

Stability

We have some endpoints that we document publicly as being stable, and others that we consider unstable. The stable endpoints are documented in api.json, where all endpoints are available in post.json.

To stabilize an endpoint, add the stable tag to its route comment:

// swagger:route GET /provisioning/contact-points provisioning stable RouteGetContactpoints