282 Commits

Author SHA1 Message Date
Alexander Akhmetov 169bf2ce73 Alerting: Add feature toggle to use the old simplified routing hash generation (#111900)
* Revert "Alerting: Generate simplified routing routes with old fingerprint function (#111893)"

This reverts commit 0da9d49896.

* Add alertingUseNewSimplifiedRoutingHashAlgorithm flag

* Alerting: Add feature toggle to use the old simplified routing hash generation
2025-10-01 15:21:33 -04:00
Alexander Akhmetov 0da9d49896 Alerting: Generate simplified routing routes with old fingerprint function (#111893) 2025-10-01 18:45:36 +02:00
Yuri Tseretyan b8f23eacd4 Alerting: Migrate to integration schema (#111643)
* update tests to assert against snapshot
* remove channel_config package replaced by schemas from alerting module
* update references to use new schema
2025-09-26 09:31:50 -04:00
Yuri Tseretyan c36b2ae191 Alerting: v0 schema for integrations (mimir) (#110908)
* generate schema for mimir integrations from schema on front-end
* review and fix the settings
* Update GetAvailableNotifiersV2 to return mimir as v0
* add version argument to GetSecretKeysForContactPointType
* update TestGetSecretKeysForContactPointType to include v0
* add type alias field to contain alternate types that differ from Grafana's
* add support for msteamsv2
* update ConfigForIntegrationType to look for alternate type
* update IntegrationConfigFromType to use new result of ConfigForIntegrationType
* add reference to parent plugin to NotifierPluginVersion to allow getting plugin type by its alias
* add tests to ensure consistency
* make API response stable
* add tests against snapshot + omit optional fields
2025-09-17 09:25:56 -04:00
Yuri Tseretyan 356521c9b9 Alerting: Annotation CanUse for receiver resource (#110839)
* add origin to receiver
* populate origin of the receiver
* set CanUse to false if origin is not Grafana
* set provenance if origin is imported
* set Grafana origin by default in conversion API
* set canUse annotation
* reject update/delete operations on resources with origin other than Grafana
* fail to create with wrong origin
2025-09-16 09:32:04 -04:00
Ryan McKinley afc08dbbbc Chore: go.mod updates (#110957) 2025-09-15 09:01:45 +00:00
Alexander Akhmetov fc3636acf2 Alerting: Fix bug where rules with identical mute/active intervals produced conflicting routes (#110935)
Alerting: Fix hash collision in NotificationSettings fingerprint
2025-09-11 13:44:06 +02:00
Moustafa Baiou f65e219b21 Alerting: Update prometheus api to reuse list query logic
This lets the prometheus api respect NoGroup query logic and treat non-grouped rules consistently.

Co-authored-by: William Wernert <william.wernert@grafana.com>
2025-09-10 09:30:56 -04:00
Moustafa Baiou ca8324e62a Alerting: Add support for alpha rules apis in legacy storage
Rules created through the new API have no group in the database. For compatibility with the old API, the old group API returns such a rule under a sentinel group name formatted with the rule UID.
This lets the UI and the ruler continue to work with rules that have no group.

Rules are not allowed to be created in the provisioning api with a NoGroup sentinel mask, but NoGroup rules can be manipulated through both the new and old apis.
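
A minimal sketch of the sentinel-group idea described above. The prefix `no_group_for_rule_` and all function names are hypothetical; the commit message does not state the real format, only that the sentinel name is derived from the rule UID and that the provisioning API rejects it.

```go
package main

import (
	"fmt"
	"strings"
)

// noGroupPrefix is a hypothetical sentinel prefix, chosen for illustration only.
const noGroupPrefix = "no_group_for_rule_"

// sentinelGroupName builds the group name the old group API would report
// for a rule that has no group in the database.
func sentinelGroupName(ruleUID string) string {
	return noGroupPrefix + ruleUID
}

// isSentinelGroup reports whether a group name follows the sentinel pattern.
func isSentinelGroup(name string) bool {
	return strings.HasPrefix(name, noGroupPrefix)
}

// validateProvisionedGroup rejects sentinel names, mirroring the rule that the
// provisioning API must not create NoGroup rules.
func validateProvisionedGroup(name string) error {
	if isSentinelGroup(name) {
		return fmt.Errorf("group name %q uses the reserved NoGroup sentinel", name)
	}
	return nil
}

func main() {
	fmt.Println(sentinelGroupName("abc123"))
	fmt.Println(validateProvisionedGroup("cpu-alerts"))
	fmt.Println(validateProvisionedGroup(sentinelGroupName("abc123")))
}
```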

Co-authored-by: William Wernert <william.wernert@grafana.com>
2025-09-10 09:30:56 -04:00
William Wernert 61adae16f2 Alerting: Ensure failed query validation returns the proper error code (#110717)
Ensure presave error is a validation error
2025-09-08 13:51:22 -04:00
Peter Štibraný 7fd9ab9481 Replace check for integration tests. (#110707)
* Replace check for integration tests.
* Revert changes in pkg/tsdb/mysql packages.
* Fix formatting of few tests.
2025-09-08 15:49:49 +02:00
Yuri Tseretyan ce55d70fa5 Alerting: Refactor notification legacy storage (#110619)
* make legacy store expose only model.Receiver
* use integration as provenance type provider
* use revision RenameReceiverInRoutes
* introduce function GetReceiversNames in config revision

---------

Co-authored-by: Matthew Jacobson <matthew.jacobson@grafana.com>
2025-09-05 14:46:46 +00:00
Alexander Akhmetov 8a7c1f595a Alerting: Backend state filtering for history UI (#109647) 2025-09-03 17:47:03 +02:00
Yuri Tseretyan 15fab1cb99 Alerting: Update integration schema to support versions (#109969)
* add VersionedNotifierPlugin and method that converts NotifierPlugin to it

* return new schema if query parameter version=2

* add version to k8s model of integration

* fix open api snapshot

* add version to IntegrationConfig

* use current version on conversion

* create versioned integrations for test
2025-08-28 14:46:30 -04:00
Moustafa Baiou c73b3ccf6e Alerting: Fix copying of recording rule fields
Recording rule fields were not being copied correctly when duplicating an alert rule. This manifests as missing `TargetDataSourceUID` fields from the `Record` part of the rule when rules in a group are re-ordered.

Added some additional tests to ensure we cover the generation of recording rules in tests and fixed the copying logic to ensure all fields are copied correctly.
2025-08-28 14:07:00 -04:00
Moustafa Baiou 5724fae778 Alerting: Add Extended List Query for Alert Rules w/pagination (#109360)
* Alerting: Add Extended List Query for Alert Rules w/pagination

This adds an extended query which allows filtering by the kind of rule (Recording or Alerting) and supports pagination.

Pagination tokens will allow us to list all rules paginated, regardless of the rule group.
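
A rough sketch of token-based listing across groups, assuming the token is simply the UID of the last rule returned. The `Rule` type, `listPage`, and the token encoding are hypothetical and are not taken from the actual store implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// Rule is a simplified stand-in for an alert rule.
type Rule struct {
	UID   string
	Group string
	Kind  string // "alerting" or "recording"
}

// listPage returns up to limit rules of the requested kind whose UID sorts
// after token, plus the token for the next page ("" when nothing is left).
// An empty kind matches every rule. Groups play no role in the ordering.
func listPage(rules []Rule, kind, token string, limit int) ([]Rule, string) {
	if limit <= 0 {
		return nil, ""
	}
	sorted := append([]Rule(nil), rules...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].UID < sorted[j].UID })

	var page []Rule
	for _, r := range sorted {
		if r.UID <= token || (kind != "" && r.Kind != kind) {
			continue
		}
		if len(page) == limit {
			// One more matching rule exists, so hand back a continuation token.
			return page, page[len(page)-1].UID
		}
		page = append(page, r)
	}
	return page, ""
}

func main() {
	rules := []Rule{
		{UID: "c", Group: "g2", Kind: "alerting"},
		{UID: "a", Group: "g1", Kind: "alerting"},
		{UID: "b", Group: "g1", Kind: "recording"},
		{UID: "d", Group: "", Kind: "alerting"},
	}
	token := ""
	for {
		page, next := listPage(rules, "alerting", token, 2)
		fmt.Println(page, next)
		if next == "" {
			break
		}
		token = next
	}
}
```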

---------

Co-authored-by: William Wernert <william.wernert@grafana.com>
2025-08-26 08:20:47 -04:00
Yuri Tseretyan a2cae07ac7 Alerting: Remove method ReceiverService.ListReceivers (#109828) 2025-08-19 09:12:23 -04:00
Moustafa Baiou a4edc27044 Alerting: Add store level pagination of rules
This reintroduces store level pagination, without using it in the prometheus API yet.

Related to #108633

Co-authored-by: William Wernert <william.wernert@grafana.com>
2025-08-14 07:46:53 -07:00
Moustafa Baiou 1bb68a1151 Revert "Alerting: Add store level pagination of rules" (#109422)
Revert "Alerting: Add store level pagination of rules (#108633)"

This reverts commit 2f0190d775.
2025-08-08 18:39:04 +00:00
Alexander Akhmetov 89d6756c67 Alerting: Filter out private labels before writing recording rules (#109295) 2025-08-07 17:25:12 +02:00
Moustafa Baiou 16f8359d35 Alerting: Update Alert Rule to use int64 for MissingSeriesEvalsToResolve (#109306) 2025-08-06 21:45:48 -04:00
William Wernert 2f0190d775 Alerting: Add store level pagination of rules (#108633) 2025-08-01 12:54:13 -04:00
Serge Zaitsev a95fb3a37c Chore: Omit integration tests if short test flag is passed (#108777)
* omit integration tests if short test flag is passed

* Update pkg/services/ngalert/models/receivers_test.go

Co-authored-by: Matheus Macabu <macabu@users.noreply.github.com>

* Update pkg/tests/api/alerting/api_ruler_test.go

Co-authored-by: Matheus Macabu <macabu@users.noreply.github.com>

* Update pkg/tests/api/alerting/api_ruler_test.go

Co-authored-by: Matheus Macabu <macabu@users.noreply.github.com>

* Update pkg/tests/api/alerting/api_ruler_test.go

Co-authored-by: Matheus Macabu <macabu@users.noreply.github.com>

* Update pkg/tests/api/alerting/api_ruler_test.go

Co-authored-by: Matheus Macabu <macabu@users.noreply.github.com>

* Update pkg/tests/api/alerting/api_ruler_test.go

Co-authored-by: Matheus Macabu <macabu@users.noreply.github.com>

* Update pkg/services/ngalert/models/receivers_test.go

Co-authored-by: Matheus Macabu <macabu@users.noreply.github.com>

* Update pkg/cmd/grafana-cli/commands/datamigrations/to_unified_storage_test.go

Co-authored-by: Matheus Macabu <macabu@users.noreply.github.com>

* Update pkg/services/ngalert/models/receivers_test.go

Co-authored-by: Matheus Macabu <macabu@users.noreply.github.com>

* fix the rest

* false positive

---------

Co-authored-by: Matheus Macabu <macabu@users.noreply.github.com>
2025-07-28 13:38:54 +02:00
Matthew Jacobson 0016b57486 Alerting: Add OAuth2 Support for Webhook Receiver (#106302)
* Add to available channels

* Export

* Fix bug in deeply nested secrets

BE: Slice re-use bug when traversing deeply.

FE: Only at most one level of nesting was being taken into account
when determining secureFields keys. This change adds a new field on
NotificationChannelOption: secureFieldKey. This is populated on API GET via
transform. This change gives us the option to hardcode secureFieldKey in the
backend and no longer calculate the key via settings topology.
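
To illustrate the class of bug mentioned for the backend, here is a small sketch of collecting secure field keys at any nesting depth. It is not the actual fix: the `Option` type, `securePaths`, and the field names in `main` are hypothetical. The point is that each path is copied before a segment is added, so stored paths never share a backing array with sibling branches.

```go
package main

import (
	"fmt"
	"strings"
)

// Option is a simplified stand-in for a notifier option with nested options.
type Option struct {
	Name   string
	Secure bool
	Nested []Option
}

// securePaths walks the option tree and returns the path of every secure field.
func securePaths(opts []Option, parent []string) [][]string {
	var paths [][]string
	for _, o := range opts {
		// Allocate a fresh slice per option. Writing `append(parent, o.Name)`
		// instead could reuse parent's spare capacity, so a later sibling
		// would overwrite a path that was already stored.
		path := make([]string, len(parent)+1)
		copy(path, parent)
		path[len(parent)] = o.Name

		if o.Secure {
			paths = append(paths, path)
		}
		paths = append(paths, securePaths(o.Nested, path)...)
	}
	return paths
}

// secureKeys joins each path into a dotted key.
func secureKeys(opts []Option) []string {
	var keys []string
	for _, p := range securePaths(opts, nil) {
		keys = append(keys, strings.Join(p, "."))
	}
	return keys
}

func main() {
	opts := []Option{
		{Name: "http_config", Nested: []Option{
			{Name: "oauth2", Nested: []Option{
				{Name: "client_secret", Secure: true},
				{Name: "tls_config", Nested: []Option{{Name: "clientKey", Secure: true}}},
			}},
		}},
		{Name: "password", Secure: true},
	}
	fmt.Println(secureKeys(opts))
}
```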

* Update grafana/alerting to 3e20fda3b872

* Prettier

* Linting

* Fix IntegrationConfig test to catch secure field mismatch
2025-06-12 23:00:09 +02:00
Alexander Akhmetov e1ce9ceac1 Alerting: Simplify alert rule unique constraint violation errors (#106608)
Alerting: Simplify alert rule storage unique constraint violation errors
2025-06-12 13:27:08 +02:00
Alexander Akhmetov da88e5912f Alerting: Evaluate all imported from Prometheus rules sequentially (#106295)
What is this feature?

Makes all alert rules imported from a Prometheus YAML or Prometheus-compatible data source evaluate sequentially.

Why do we need this feature?

Currently, only alert rules [imported via the API](https://grafana.com/docs/grafana-cloud/alerting-and-irm/alerting/alerting-rules/alerting-migration/migration-api/) are evaluated sequentially, because only they have the original alert rule definition in YAML. But alert rules can also be imported [in the UI, and from a YAML file](https://grafana.com/docs/grafana-cloud/alerting-and-irm/alerting/alerting-rules/alerting-migration/), and those are not evaluated sequentially, which can lead to issues with recording rules.
2025-06-05 12:08:44 +02:00
Fayzal Ghantiwala 589046bcdc Alerting: Persist alert instance FiredAt field (#105927)
* Persist alert instance fired at

* Update protos and tests
2025-05-27 10:04:26 +01:00
Yuri Tseretyan 3e2296acd3 Alerting: Support for active time intervals in notification policies (#104252)
* add active_time_intervals to route model

* update k8s compat layer

* update notification policies service to validate active time intervals

* update integration tests

* update openapi

* add active time interval to model

* update route generator to include active time interval

* Update storage list and rename methods to handle active intervals

* update api model

* update provisioning and export models

* update ui to allow active timing config

* update i18n

* fix snapshots for ui tests

* run prettier

* Alerting: Active time intervals UI naming (#104402)

* update naming in UI

* update naming in the edit page title

* update translations

* update alerting module

---------

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
Co-authored-by: Sonia Aguilar <33540275+soniaAguilarPeiron@users.noreply.github.com>
Co-authored-by: Sonia Aguilar <soniaaguilarpeiron@gmail.com>
2025-05-07 19:19:33 -04:00
Tito Lins a7fe77cdbf grafana-ruler: add new alert query fields (#104933) 2025-05-07 09:30:34 +02:00
Alexander Akhmetov 040a82c815 Alerting: Add an internal label to rules converted from Prometheus (#104475) 2025-04-24 18:33:09 +02:00
William Wernert 820c338414 Alerting: Ensure field validators return the proper type (#104050)
* Ensure field validators return the proper type

This ensures correct error propagation through services up to
the API layer.

* Move error wrapping up to call site
2025-04-21 16:15:09 +01:00
Mariell Hoversholm 757be6365a CI: Bump golangci-lint to 2.0.2 (#103572) 2025-04-10 14:42:23 +02:00
Yuri Tseretyan dc0083d879 Alerting: Sequential evaluation of rules in group (#98829)
* introduce RulesGroupComparer

* extract runJob method

* implement sequential evaluation

* Make sequence building testable & add comments

* Also run callback in recording rules + add tests

* Improve tests

* Address PR comments

---------

Co-authored-by: William Wernert <william.wernert@grafana.com>
2025-04-02 23:10:32 +03:00
maicon d8c5c2d3b8 K8s: Folders: Modify GetChildren to return only Folder References (#103072)
* Return FolderReference instead of Folder on GetChildren

Signed-off-by: Maicon Costa <maiconscosta@gmail.com>

---------

Signed-off-by: Maicon Costa <maiconscosta@gmail.com>
2025-04-02 01:30:17 -03:00
Alexander Akhmetov f49a88ab72 Alerting: Add MissingSeriesEvalsToResolve to the APIs (#102150)
What is this feature?

A follow-up for #101184, adds AlertRule.MissingSeriesEvalsToResolve to the APIs.

missing_series_evals_to_resolve must be specified too and it must be > 0.

POST /api/ruler/grafana/api/v1/rules/{folderUID} works in the following way:

    If missing_series_evals_to_resolve is not sent or null, the rule keeps its existing value
    If missing_series_evals_to_resolve > 0: updates to that value
    If missing_series_evals_to_resolve = 0: resets to default (nil).
    AlertRule.MissingSeriesEvalsToResolve can't be 0, so I used it to reset

In the Provisioning API, the value is just set if present and > 0. Otherwise it's reset:

PUT to /api/v1/provisioning/alert-rules/{UID}:

    If missing_series_evals_to_resolve is nil, it's reset to the default value
    If missing_series_evals_to_resolve > 0, it's updated
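
The update semantics above can be sketched as two small functions. This is an illustration only: the function names and the `*int64` representation are assumptions, and values below zero are assumed to be rejected by validation before this point.

```go
package main

import "fmt"

// resolveRuler applies the ruler API semantics:
// nil keeps the existing value, 0 resets to the default (nil), > 0 updates.
func resolveRuler(existing, incoming *int64) *int64 {
	switch {
	case incoming == nil:
		return existing
	case *incoming == 0:
		return nil
	default:
		v := *incoming
		return &v
	}
}

// resolveProvisioning applies the provisioning API semantics:
// the value is set only if present and > 0, otherwise it is reset.
func resolveProvisioning(incoming *int64) *int64 {
	if incoming != nil && *incoming > 0 {
		v := *incoming
		return &v
	}
	return nil
}

// ptr is a small helper for building pointer values.
func ptr(v int64) *int64 { return &v }

func main() {
	existing := ptr(3)
	fmt.Println(*resolveRuler(existing, nil))           // keeps the existing value
	fmt.Println(resolveRuler(existing, ptr(0)) == nil)  // reset to default
	fmt.Println(*resolveRuler(existing, ptr(5)))        // updated
	fmt.Println(resolveProvisioning(nil) == nil)        // reset to default
}
```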
2025-03-26 13:34:53 +01:00
Alexander Akhmetov f7aa17f2e4 Alerting: Add default values to AlertRule.Data queries in Prometheus conversion (#102843)
What is this feature?

Prometheus conversion: ensures that AlertRule.Data queries always have default parameters set (intervalMs, maxDataPoints). Without this, updates of the same rule can cause version increments.

Why do we need this feature?

Currently, when converting Prometheus rules to Grafana alerts, some default parameters are not explicitly set in the query model. This creates a problem during rule updates:

When a user updates a rule that hasn't changed, we still detect differences in the AlertQuery.Model because the newly converted rules are missing the default fields, such as intervalMs and maxDataPoints. This causes unnecessary version increments of alert rules.
2025-03-26 11:46:49 +01:00
Yuri Tseretyan e39b17d701 Alerting: Remove constraints for uniqueness of rule title (#102067)
* fix having duplicated names in same group in the UI

---------

Co-authored-by: Sonia Aguilar <soniaaguilarpeiron@gmail.com>
2025-03-18 13:27:44 -04:00
Alexander Akhmetov 695ac91290 Alerting: Add backend support for keep_firing_for (#100750)
What is this feature?

This PR introduces a new alert rule configuration option, keep_firing_for (Prometheus documentation).

keep_firing_for prevents alerts from resolving immediately after the alert condition returns to normal. Instead, they transition into a "Recovering" state and are not considered resolved by the Alertmanager. Once the recovery period ends (or after the next evaluation, if the evaluation interval is longer than keep_firing_for), the alert transitions to "Normal" unless it starts alerting again:

Before                                          

+----------+     +----------+                    
| Alerting |---->|  Normal  |                    
+----------+     +----------+                    

-----
After

+----------+      +------------+     +----------+
| Alerting |----->| Recovering |---->|  Normal  |
+----------+      +------------+     +----------+                                                 

Why do we need this feature?

This feature prevents flapping alerts by adding a recovery period. This helps avoid false resolutions caused by brief alert
2025-03-18 11:24:48 +01:00
Alexander Akhmetov 7dd6f52630 Alerting: Add MissingSeriesEvalsToResolve option to the AlertRule (#101184) 2025-03-11 22:12:06 +01:00
Steve Simpson b7dcfcedcb Alerting: Extend recording rule definitions/interfaces with data source. (#101678)
Extend the recording rule definition to include the target data source, allowing
configuration of where the output of the recording rule is written to. Also
extends the relevant interfaces in preparation for the next set of changes.
2025-03-06 14:09:17 +01:00
Alexander Akhmetov d44728f4e5 Alerting: Metric to count imported from Prometheus rules (#100847) 2025-03-05 14:02:28 +01:00
Yuri Tseretyan 879b121136 Alerting: Add GUID to alert rule tables (#101321)
* add column guid to alert rule table and rule_guid to rule version table
+ populate the new field with UUID
* update storage and domain models
* patch GUID
* ignore GUID in fingerprint tests
2025-02-28 09:47:25 -05:00
Alexander Akhmetov ae2074ef55 Alerting: Fix updating Prometheus definition in the metadata (#101440)
Initially, Metadata had only the EditorSettings, and HasMetadata was used to understand whether the incoming update request had metadata in the body, because it could be omitted if it was empty (for example, when the rule is updated via the provisioning API or has only false values). If it was in the request, we used that; if not, we used the metadata from the existing rule from the database. If the rule was updated via the AlertRuleService, we didn't change Metadata at all if the rule already existed.

But now, Metadata also has the Prometheus rule definition, and we always need to update it with the new version of the AlertRuleService when the rule exists in the DB and has the same UID. HasMetadata is renamed to HasEditorSettings to keep the old behaviour only for EditorSettings.

Now, the provisioning API and the conversion API will overwrite everything except EditorSettings with the new data.
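
A rough sketch of the merge rule described above, assuming simplified types. The struct fields and the function `mergeMetadata` are hypothetical; only the names Metadata, EditorSettings, and HasEditorSettings come from the commit message.

```go
package main

import "fmt"

// EditorSettings is a simplified stand-in; the field is hypothetical.
type EditorSettings struct {
	SimplifiedQuerySection bool
}

// Metadata is a simplified stand-in holding editor settings and the
// original Prometheus rule definition.
type Metadata struct {
	EditorSettings           EditorSettings
	PrometheusRuleDefinition string
}

// mergeMetadata takes everything from the incoming update except
// EditorSettings, which is kept from the existing rule unless the request
// carried editor settings explicitly.
func mergeMetadata(existing, incoming Metadata, hasEditorSettings bool) Metadata {
	merged := incoming
	if !hasEditorSettings {
		merged.EditorSettings = existing.EditorSettings
	}
	return merged
}

func main() {
	existing := Metadata{
		EditorSettings:           EditorSettings{SimplifiedQuerySection: true},
		PrometheusRuleDefinition: "alert: Old",
	}
	incoming := Metadata{PrometheusRuleDefinition: "alert: New"}
	fmt.Printf("%+v\n", mergeMetadata(existing, incoming, false))
}
```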
2025-02-28 13:11:49 +02:00
Alexander Akhmetov 6eb335a8ce Alerting: API to read rule groups using mimirtool (#100674) 2025-02-25 15:49:08 +01:00
Alexander Akhmetov b641fd64f9 Alerting: API to create rule groups using mimirtool (#100558)
What is this feature?

Adds an API endpoint to create alert rules with mimirtool:

- POST /convert/prometheus/config/v1/rules/{NamespaceTitle} - Accepts a single rule group in a Prometheus YAML format and creates or updates a Grafana rule group from it.

The endpoint uses the conversion package from #100224.

Key parts

The API works similarly to the provisioning API. If the rule does not exist, it will be created, otherwise updated. Any rules not present in the new group will be deleted, ensuring the group is fully synchronized with the provided configuration.

Since the API works with namespace titles (folders), the handler automatically creates a folder in the root based on the provided title if it does not exist. It also requires a special header, X-Grafana-Alerting-Datasource-UID. This header specifies which datasource to use for the new rules.

If the rule group's evaluation interval is not specified, it uses the DefaultRuleEvaluationInterval from settings.
2025-02-25 11:26:36 +01:00
Matthew Jacobson b78a63b0ad Alerting: Use new image TokenProvider and send image url in annotation (#99989)
* Send new annotation containing image url

* Use new image TokenProvider with TokenStore

New abstraction GetImage no longer needs to support parsing both token and
url from annotations, as remote AM will use the new URLProvider. Instead, we
use the new generic TokenProvider and give it a TokenStore backed by the
grafana database.

That means we revert back to always using token simplifying code and security
considerations.

* Upgrade grafana/alerting to merged commit SHA
2025-02-20 12:47:40 -05:00
Alexander Akhmetov 3cc4320aa9 Alerting: Add rule conversion package (#100224) 2025-02-12 19:38:48 +02:00
Yuri Tseretyan 4cac3158c7 Alerting: Fix alert rule copy to include metadata (#100212)
* copy metadata

* add tests for copy and generator

* extract copy rule to a production method and update usages

* fix tests
2025-02-11 09:46:02 -05:00
Moustafa Baiou 7dee4d1808 Alerting: Allow specifying uid for new rules added to groups (#99858)
When modifying rule groups, the `uid` could be specified only if the rule already existed in the DB. If the rule was new, the update would be rejected.

This updates the RuleGroup provisioning apis to allow specifying the `uid` when creating/updating rule groups. 

Additionally, the RuleGroupIdx was not being updated when rules were reordered in the group.

Context: https://github.com/grafana/terraform-provider-grafana/pull/1971#issuecomment-2599223897
Relates to: https://github.com/grafana/terraform-provider-grafana/issues/1928

Fixes: #98283
2025-02-10 10:28:34 -05:00
Yuri Tseretyan 1b8db233a7 Alerting: Rule Version API to Ignore versions without diff (#100093) 2025-02-10 09:20:35 -05:00