Commit Graph

12 Commits

Author SHA1 Message Date
Joe Blubaugh
22c937340e Revert "Alerting: Write and Delete multiple alert instances. (#54072)" (#54885)
This reverts commit 5e4fd94413.
2022-09-09 17:44:06 +02:00
Joe Blubaugh
5e4fd94413 Alerting: Write and Delete multiple alert instances. (#54072)
Prior to this change, all alert instance writes and deletes happened
individually, in their own database transaction. This change batches up
writes or deletes for a given rule's evaluation loop into a single
transaction before applying it.

Before:
```
goos: darwin
goarch: arm64
pkg: github.com/grafana/grafana/pkg/services/ngalert/store
BenchmarkAlertInstanceOperations-8           398           2991381 ns/op         1133537 B/op      27703 allocs/op
--- BENCH: BenchmarkAlertInstanceOperations-8
    util.go:127: alert definition: {orgID: 1, UID: FovKXiRVzm} with title: "an alert definition FTvFXmRVkz" interval: 60 created
    util.go:127: alert definition: {orgID: 1, UID: foDFXmRVkm} with title: "an alert definition fovFXmRVkz" interval: 60 created
    util.go:127: alert definition: {orgID: 1, UID: VQvFuigVkm} with title: "an alert definition VwDKXmR4kz" interval: 60 created
PASS
ok      github.com/grafana/grafana/pkg/services/ngalert/store   1.619s
```

After:
```
goos: darwin
goarch: arm64
pkg: github.com/grafana/grafana/pkg/services/ngalert/store
BenchmarkAlertInstanceOperations-8          1440            816484 ns/op          352297 B/op       6529 allocs/op
--- BENCH: BenchmarkAlertInstanceOperations-8
    util.go:127: alert definition: {orgID: 1, UID: 302r_igVzm} with title: "an alert definition q0h9lmR4zz" interval: 60 created
    util.go:127: alert definition: {orgID: 1, UID: 71hrlmR4km} with title: "an alert definition nJ29_mR4zz" interval: 60 created
    util.go:127: alert definition: {orgID: 1, UID: Cahr_mR4zm} with title: "an alert definition ja2rlmg4zz" interval: 60 created
PASS
ok      github.com/grafana/grafana/pkg/services/ngalert/store   1.383s
```

So we cut time by about 75% and memory allocations by about 60% when
storing and deleting 100 instances.

This change also updates some of our tests so that they run successfully against postgreSQL - we were using random Int64s, but postgres integers, which our tables use, max out at 2^31-1
2022-09-02 11:17:20 +08:00
Joe Blubaugh
1cc034d960 Alerting: Add a "Reason" to Alert Instances to show underlying cause of state. (#49259)
This change adds a field to state.State and models.AlertInstance
that indicate the "Reason" that an instance has its current state. This
helps us account for cases where the state is "Normal" but the
underlying evaluation returned "NoData" or "Error", for example.

Fixes #42606

Signed-off-by: Joe Blubaugh <joe.blubaugh@grafana.com>
2022-05-23 16:49:49 +08:00
Kyle Brandt
a735c51202 Alerting/Chore: Backend remove def_ columns from instance (#33875)
rename def_uid and def_org_id to rule_uid and rule_org_id on the alert_instance table and drops the definition table.
2021-05-12 07:17:43 -04:00
David Parrott
5072fefc22 allow saving pending alerts (#33667) 2021-05-04 09:24:20 -07:00
Kyle Brandt
c1034f3118 Alerting: Create instanceStore (#33587)
for https://github.com/grafana/alerting-squad/issues/129
2021-05-03 07:19:15 -04:00
Kyle Brandt
7823842c5d Alerting: Load annotations from rule into State cache (#33542)
for https://github.com/grafana/alerting-squad/issues/127
2021-04-30 20:23:12 +02:00
Kyle Brandt
b8f01fe034 Alerting: backend "ng" code cleanup (#33578) 2021-04-30 13:21:57 -04:00
David Parrott
567a6a09bd Alerting: Return RuleResponse for api/prometheus/grafana/api/v1/rules (#32919)
* Return RuleResponse for api/prometheus/grafana/api/v1/rules

* change TODO to note

Co-authored-by: gotjosh <josue@grafana.com>

* pr feedback

* test fixup

Co-authored-by: gotjosh <josue@grafana.com>
2021-04-13 17:38:09 -04:00
Sofia Papagiannaki
daabf64aa1 [Alerting]: Update scheduler to evaluate rules created by the unified API (#32589)
* Update scheduler

* Fix tests

* Fixes after code review feedback

* lint - add uncommitted modifications

Co-authored-by: kyle <kyle@grafana.com>
2021-04-03 20:13:29 +03:00
David Parrott
2a8446e435 Alerting: Persist alerts on evaluation and shutdown. Warm cache from DB on startup (#32576)
* Initial commit for state tracking

* basic state transition logic and tests

* constructor. test and interface fixup

* use new sig for sch.definitionRoutine()

* test fixup

* make the linter happy

* more minor linting cleanup

* Alerting: Send alerts from state tracker to notifier

* Add evaluation time and test

Add evaluation time and test

* Add cleanup routine and logging

* Pull in compact.go and reconcile differences

* Save alert transitions and save all state on shutdown

* pr feedback

* WIP

* WIP

* Persist alerts on evaluation and shutdown. Warm cache on startup

* Filter non-firing alerts before sending to notifier

Co-authored-by: Josue Abreu <josue@grafana.com>
2021-04-02 08:11:33 -07:00
Sofia Papagiannaki
4ce0a49eac AlertingNG: Split into several packages (#31719)
* AlertingNG: Split into several packages

* Move AlertQuery to models
2021-03-08 22:19:21 +02:00