grafana

Author	SHA1	Message	Date
Alexander Akhmetov	149f02aebe	Alerting: Add rule_group label to grafana_alerting_rule_group_rules metric (#88289 ) * Alerting: Add rule_group label to grafana_alerting_rule_group_rules metric (#62361) * Alerting: Delete rule group metrics when the rule group is deleted This commit addresses the issue where the GroupRules metric (a GaugeVec) keeps its value and is not deleted when an alert rule is removed from the rule registry. Previously, when an alert rule with orgID=1 was active, the metric was: grafana_alerting_rule_group_rules{org="1",state="active"} 1 However, after deleting this rule, subsequent calls to updateRulesMetrics did not update the gauge value, causing the metric to incorrectly remain at 1. The fix ensures that when updateRulesMetrics is called it also deletes the group rule metrics with the corresponding label values if needed.	2024-08-13 13:27:23 +02:00
Matthew Jacobson	3228b64fe6	Alerting: Resend resolved notifications for ResolvedRetention duration (#88938 ) * Simple replace of State.Resolved with State.ResolvedAt * Retain ResolvedAt time between Normal->Normal transition * Introduce ResolvedRetention to keep sending recently resolved alerts * Make ResolvedRetention configurable with resolved_alert_retention * Tick-based LastSentAt for testing of ResendDelay and ResolvedRetention * Do not reset ResolvedAt during Normal->Pending transition Initially this was done to be inline with Prom ruler. However, Prom ruler doesn't keep track of Inactive->Pending/Alerting using the same alert instance, so it's more understandable that they choose not to retain ResolvedAt. In our case, since we use the same cached instance to represent the transition, it makes more sense to retain it. This should help alleviate some odd situations where temporarily entering Pending will stop future resolved notifications that would have happened because of ResolvedRetention. * Pointers for ResolvedAt & LastSentAt To avoid awkward time.Time{}.Unix() defaults on persist	2024-06-20 16:33:03 -04:00
Diego Augusto Molina	9c29e1a783	Alerting: Fix data races and improve testing (#81994 ) * Alerting: fix race condition in (ngalert/sender.ExternalAlertmanager).Run Chore: Fix data races when accessing members of ngalert/state.FakeInstanceStore Chore: Fix data races in tests in ngalert/schedule and enable some parallel tests * Chore: fix linters * Chore: add TODO comment to remove loopvar once we move to Go 1.22	2024-02-14 12:45:39 -03:00
Yuri Tseretyan	131c72d655	Alerting: Fix scheduler to group folders by the unique key (orgID and UID) (#81303 )	2024-01-30 17:14:11 -05:00
Yuri Tseretyan	bad4f28d0d	Alerting: update test TestAlertingTicker to not rely on clock (#58544 ) * extract method processTick * make processTick return scheduled rules * move state manager tests to state manager * update test * move all tests into one file * remove unused fields	2022-11-09 15:08:57 -05:00
Alexander Weaver	e6f99fc418	Alerting: Decouple schedule package from store (#55858 ) * Separate out fake for scheduler tests * Delete extracted methods from older fake	2022-09-27 13:48:12 -05:00
Yuriy Tseretyan	02f8e99ca1	Alerting: move fake stores to store package (#45428 ) * make fake storage public * move fake storages to store package	2022-02-15 17:24:39 -05:00
George Robinson	67a3e1d6fd	Add context.Context to InstanceStore (#45049 )	2022-02-08 13:49:04 +00:00
George Robinson	a9399ab3cd	Alerting: Add context.Context to RuleStore (#45004 ) Alerting: Add context.Context to RuleStore	2022-02-08 08:52:03 +00:00
Yuriy Tseretyan	ed5c664e4a	Alerting: Stop firing of alert when it is updated (#39975 ) * Update API to call the scheduler to remove\update an alert rule. When a rule is updated by a user, the scheduler will remove the currently firing alert instances and clean up the state cache. * Update evaluation loop in the scheduler to support one more channel that is used to communicate updates to it. * Improved rule deletion from the internal registry. * Move alert rule version from the internal registry (structure alertRuleInfo) closer rule evaluation loop (to evaluation task structure), which will make the registry values immutable. * Extract notification code to a separate function to reuse in update flow.	2022-01-11 11:39:34 -05:00
gotjosh	357e9ed1ea	Alerting: Fix Annotation Creation when the alerting state changes (#42479 ) * Fix Annotation creation - Remove validation of panelID, now annotations are created irrespective of whether they're attached to a panel or not. - Always attach the annotation to an AlertID * Fix annotation creation * fix tests	2021-12-01 11:04:54 +00:00
Yuriy Tseretyan	1b5b747885	Alerting: Additional Tests for State Manager (#41291 ) * rename fakeInstanceStore to FakeInstanceStore * update test for state manager to initialize instance store with FakeInstanceStore	2021-11-04 15:15:56 -04:00
Yuriy Tseretyan	6709359148	Alerting: Tests for rule evaluation routine (#40646 ) * add fake stores to record queries	2021-10-26 13:22:07 -04:00
Marcus Efraimsson	fa9857499b	Chore: GetDashboardQuery should be dispatched using DispatchCtx (#36877 ) * Chore: GetDashboardQuery should be dispatched using DispatchCtx * Fix after merge * Changes after review * Various fixes * Use GetDashboardCtx function instead of GetDashboard	2021-09-14 16:08:04 +02:00
David Parrott	7fbeefc090	Alerting: create wrapper for Alertmanager to enable org level isolation (#37320 ) Introduces org-level isolation for the Alertmanager and its components. Silences, Alerts and Contact points are now separated by org and are not shared between them. Co-authored with @davidmparrott and @papagian	2021-08-24 11:28:09 +01:00
gotjosh	f3f3fcc727	Alerting: Introduces `/api/v1/ngalert/alertmanagers` to expose discovered and dropped Alertmanager(s) (#37632 ) * Alerting: Expose discovered and dropped Alertmanagers Exposes the API for discovered and dropped Alertmanagers. * make admin config poll interval configurable * update after rebase * wordsmith * More wordsmithing * change name of the config * settings package too	2021-08-13 13:14:36 +01:00
gotjosh	f83cd401e5	Alerting: Send alerts to external Alertmanager(s) (#37298 ) * Alerting: Send alerts to external Alertmanager(s) Within this PR we're adding support for registering or unregistering sending to a set of external alertmanagers. A few of the things that are going are: - Introduce a new table to hold "admin" (either org or global) configuration we can change at runtime. - A new periodic check that polls for this configuration and adjusts the "senders" accordingly. - Introduces a new concept of "senders" that are responsible for shipping the alerts to the external Alertmanager(s). In a nutshell, this is the Prometheus notifier (the one in charge of sending the alert) mapped to a multi-tenant map. There are a few code movements here and there but those are minor, I tried to keep things intact as much as possible so that we could have an easier diff.	2021-08-06 13:06:56 +01:00

17 Commits