1743 Commits

Author SHA1 Message Date
Alexander Akhmetov d44728f4e5 Alerting: Metric to count imported from Prometheus rules (#100847) 2025-03-05 14:02:28 +01:00
Yuri Tseretyan 67b44ad22a Alerting: Fix state reason (#101530)
---------

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
2025-03-04 17:05:41 +02:00
Alexander Akhmetov 60827fe499 Alerting: Return 403 if no datasource access or quota has been exceeded (#101522) 2025-03-04 10:04:47 +01:00
Alexander Akhmetov c7c68322b1 Alerting: Allow specifying a folder for Prometheus rule import (#101406)
What is this feature?

Allows the creation of alert rules with mimirtool in a specified folder.

Why do we need this feature?

Currently, the APIs for mimirtool create namespaces and rule groups in the root folder without the ability to set a custom folder. For example, it could be a special "Imported" folder, etc.

This PR makes it possible with a special header: `mimirtool ... --extra-headers="X-Grafana-Alerting-Folder-UID=123"`. If the header is not present, the root folder is used; otherwise, the specified folder is used.

mimirtool does not support nested folder structures, while Grafana allows folder nesting. To keep compatibility, we return only direct child folders of the working folder (as namespaces) with rule groups and rules that are directly in these child folders as if there are no nested folders.

For example, given this folder structure in Grafana:

```
	grafana/
	├── production/
	│   ├── service1/
	│   │   └── alerts/
	│   └── service2/
	└── testing/
	    └── service3/
```

If the working folder is "grafana":

- Only namespaces "production" and "testing" are returned
- Only rule groups directly within these folders are included

If the working folder is "production":

- Only namespaces "service1" and "service2" are returned
- Only rule groups directly within these folders are included
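
The flattening rule described above can be sketched roughly as follows. This is a minimal illustration in Python, not the code from the PR: the `Folder` type, the `namespaces_for` helper, and all rule group names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Folder:
    """Hypothetical folder node: a title, rule groups stored directly in it, and subfolders."""
    title: str
    rule_groups: list[str] = field(default_factory=list)
    children: list["Folder"] = field(default_factory=list)


def namespaces_for(working_folder: Folder) -> dict[str, list[str]]:
    """Map each direct child folder (a namespace) to the rule groups stored directly in it.

    Deeper descendants are ignored, as if there were no nested folders.
    """
    return {child.title: list(child.rule_groups) for child in working_folder.children}


# The folder structure from the example; the rule group names are made up.
alerts = Folder("alerts", rule_groups=["nested-group"])
service1 = Folder("service1", rule_groups=["s1-group"], children=[alerts])
service2 = Folder("service2", rule_groups=["s2-group"])
production = Folder("production", rule_groups=["prod-group"], children=[service1, service2])
testing = Folder("testing", children=[Folder("service3")])
grafana = Folder("grafana", children=[production, testing])

print(namespaces_for(grafana))     # namespaces "production" and "testing"
print(namespaces_for(production))  # namespaces "service1" and "service2"
```

In this sketch the "alerts" folder never shows up as a namespace for either working folder, because it is two levels below "production" and three below "grafana".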
2025-03-03 17:59:01 +01:00
Matthew Jacobson 2466685a41 Alerting: Improve template testing by trying non-root scopes (#101471)
Expand template testing to try additional scopes if the root scope fails.
This mitigates errors for definitions like pagerduty.default.instances,
which require the .Alerts scope. Added support for .Alerts and .Alert
scopes.
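
The fallback idea can be sketched as a loop over candidate scopes. This is an illustrative Python sketch only: `try_scopes`, the `SCOPES` table, the choice of the first alert for `.Alert`, and the `count_firing` stand-in template are all assumptions, not the actual implementation.

```python
from typing import Any, Callable

# Hypothetical scope selectors, tried in order: root first, then .Alerts, then .Alert.
# Using the first alert for the .Alert scope is an assumption made for this sketch.
SCOPES: list[tuple[str, Callable[[dict[str, Any]], Any]]] = [
    (".", lambda data: data),
    (".Alerts", lambda data: data["Alerts"]),
    (".Alert", lambda data: data["Alerts"][0]),
]


def try_scopes(render: Callable[[Any], str], data: dict[str, Any]) -> tuple[str, str]:
    """Render with each scope in turn and return (scope, output) for the first that succeeds."""
    last_error = None
    for name, select in SCOPES:
        try:
            return name, render(select(data))
        except Exception as err:  # a failing scope is expected here; try the next one
            last_error = err
    raise ValueError("template failed in every scope") from last_error


def count_firing(alerts: Any) -> str:
    """Stand-in for a template that only works when given the list of alerts."""
    return f"{sum(1 for a in alerts if a['Status'] == 'firing')} firing"


data = {"Alerts": [{"Status": "firing"}, {"Status": "resolved"}]}
scope, output = try_scopes(count_firing, data)
print(scope, output)
```

Here the root scope fails because the stand-in template expects a list of alerts, so the loop falls through to the `.Alerts` scope.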
2025-02-28 20:27:27 +02:00
Yuri Tseretyan 1d54850a68 Alerting: Get alert rule versions by GUID (#101469)
* get alert rule versions by GUID

* protect guid field from accidental update
2025-02-28 11:27:46 -05:00
Yuri Tseretyan 879b121136 Alerting: Add GUID to alert rule tables (#101321)
* add column guid to alert rule table and rule_guid to rule version table
+ populate the new field with UUID
* update storage and domain models
* patch GUID
* ignore GUID in fingerprint tests
2025-02-28 09:47:25 -05:00
Alexander Akhmetov ae2074ef55 Alerting: Fix updating Prometheus definition in the metadata (#101440)
Initially, Metadata had only the EditorSettings, and HasMetadata was used to understand if the incoming update request had metadata in the body because it could be omitted if it was empty. For example, when the rule is updated via the provisioning API or has only false values. If it was in the request, we used that; if not, we used the metadata from the existing rule from the database. If the rule was updated via the AlertRuleService, we didn't change Metadata at all if the rule already existed.

But now, Metadata also has the Prometheus rule definition, and we always need to update it with the new version of the AlertRuleService when the rule exists in the DB and has the same UID. HasMetadata is renamed to HasEditorSettings to keep the old behaviour only for EditorSettings.

Now, the provisioning API and the conversion API will overwrite everything except EditorSettings with the new data.
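
The resulting merge rule can be sketched as below. This is a simplified Python illustration under stated assumptions: the `Metadata` type and the `merge_metadata` helper are hypothetical, and only the `has_editor_settings` flag corresponds to the renamed HasEditorSettings described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Metadata:
    """Hypothetical stand-in for rule metadata with its two parts."""
    editor_settings: Optional[dict] = None
    prometheus_definition: Optional[str] = None


def merge_metadata(existing: Metadata, incoming: Metadata, has_editor_settings: bool) -> Metadata:
    """Overwrite everything with incoming data, except editor settings the request omitted."""
    editor = incoming.editor_settings if has_editor_settings else existing.editor_settings
    return Metadata(editor_settings=editor, prometheus_definition=incoming.prometheus_definition)


existing = Metadata(editor_settings={"simplified": True}, prometheus_definition="old definition")
incoming = Metadata(editor_settings=None, prometheus_definition="new definition")
merged = merge_metadata(existing, incoming, has_editor_settings=False)
print(merged)
```

In this example the editor settings survive from the stored rule, while the Prometheus definition is replaced by the incoming one.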
2025-02-28 13:11:49 +02:00
Ryan McKinley 806c043e45 UnifiedStorage: Rename Batch processing to Bulk (#101413) 2025-02-28 08:41:08 +03:00
Moustafa Baiou bc4be187af Alerting: Fix evaluation of rules with no-op math expressions
When you use a math expression without any operators, the dataFrame pointer is identical between the expression result and the input query/expression.

This was resulting in the values returned from an evaluation overshadowing each other, depending on the order of the processing of the result map.

For example:
```
A: some_metric
B: reduce of A
C: math expression -> "${B}"
D: Threshold evaluation of C -> "C > 0"
```
With a value of 1 for `some_metric`, this might produce one of the following evaluation results (somewhat at random):
1. { B: 1, D: 1 }
2. { C: 1, D: 1 }

While you would expect to see:
{ B: 1, C: 1, D: 1 }
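
One way shared identity can cause this kind of overshadowing is sketched below in Python. It is an analogy rather than the actual evaluation code: the `Frame` class and both `collect_*` functions are hypothetical, and object identity (`id`) stands in for the shared dataFrame pointer.

```python
class Frame:
    """Hypothetical stand-in for a data frame holding a single value."""

    def __init__(self, value):
        self.value = value


def collect_by_identity(results):
    """Buggy variant: keys on object identity, so ref IDs sharing a frame overwrite each other."""
    by_frame = {}
    for ref_id, frame in results.items():
        by_frame[id(frame)] = (ref_id, frame.value)
    return {ref_id: value for ref_id, value in by_frame.values()}


def collect_by_ref_id(results):
    """Fixed variant: keys on the ref ID, so a shared frame is harmless."""
    return {ref_id: frame.value for ref_id, frame in results.items()}


b = Frame(1)
# C is a no-op math expression, so in this sketch it reuses B's frame object.
results = {"B": b, "C": b, "D": Frame(1)}

print(collect_by_identity(results))
print(collect_by_ref_id(results))
```

Python dicts iterate in insertion order, so this sketch always drops B; with an unordered map, either B or C could be the one that disappears.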
2025-02-27 17:04:18 -05:00
Alexander Akhmetov ef86582dfc Alerting: API paths for cortextool to import Loki rules (#101409)
Alerting: Legacy rules paths for cortextool
2025-02-27 17:20:49 +01:00
Alexander Akhmetov d947433d19 Alerting: API to delete rule groups using mimirtool (#100687)
* Alerting: API to delete rule groups using mimirtool
2025-02-27 13:04:47 +01:00
Yuri Tseretyan 32fde6dba4 Alerting: Update scheduler to provide full specification to rule update channel (#101375)
update scheduler's alert rule to accept regular Evaluation in update channel

This makes it accept the full rule definition, which is required in reset state.
2025-02-26 14:39:39 -05:00
Alexander Akhmetov af7fafd03a Alerting: Add rule group name to the rule title when converting Prometheus rules (#101310)
Alerting: Add alert rule name to the title when converting Prometheus rules
2025-02-26 11:52:21 +01:00
Alexander Akhmetov 6eb335a8ce Alerting: API to read rule groups using mimirtool (#100674) 2025-02-25 15:49:08 +01:00
Pepe Cano 2585fec99e Alerting: Clarify that the AWS SNS subject field cannot be empty (#100780)
* Alerting: Clarify that the AWS SNS subject field cannot be empty

* minor copy change
2025-02-25 12:06:38 +01:00
Alexander Akhmetov 03e94e7a3e Alerting: Update grafana/alerting (#101215)
* Update grafana/alerting from 9d7e00921e44 to 2acbeef29642

* Change the package for the TLSClient

* Fix TestContactPointFromContactPointExports test
2025-02-25 11:32:28 +01:00
Alexander Akhmetov b641fd64f9 Alerting: API to create rule groups using mimirtool (#100558)
What is this feature?

Adds an API endpoint to create alert rules with mimirtool:

- POST /convert/prometheus/config/v1/rules/{NamespaceTitle} - Accepts a single rule group in a Prometheus YAML format and creates or updates a Grafana rule group from it.

The endpoint uses the conversion package from #100224.

Key parts

The API works similarly to the provisioning API. If the rule does not exist, it will be created, otherwise updated. Any rules not present in the new group will be deleted, ensuring the group is fully synchronized with the provided configuration.

Since the API works with namespace titles (folders), the handler automatically creates a folder in the root based on the provided title if it does not exist. It also requires a special header, X-Grafana-Alerting-Datasource-UID. This header specifies which datasource to use for the new rules.

If the rule group's evaluation interval is not specified, it uses the DefaultRuleEvaluationInterval from settings.
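
The create, update, and delete behaviour described under "Key parts" amounts to a set difference between the stored group and the submitted one. A minimal Python sketch, assuming rules are matched by some stable identifier (the `plan_group_sync` helper and the rule names are hypothetical):

```python
def plan_group_sync(existing: set[str], incoming: set[str]) -> dict[str, set[str]]:
    """Split rule identifiers into those to create, update, and delete for a full sync."""
    return {
        "create": incoming - existing,  # new in the submitted group
        "update": incoming & existing,  # present on both sides
        "delete": existing - incoming,  # missing from the submitted group
    }


plan = plan_group_sync(existing={"HighCPU", "DiskFull"}, incoming={"DiskFull", "HighLatency"})
print(plan)
```

After applying such a plan, the stored group contains exactly the rules from the submitted configuration.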
2025-02-25 11:26:36 +01:00
Santiago b58d616495 Alerting: Handle err-mimir-max-label-names-per-series as a user error in the prom writer (#101214) 2025-02-24 15:43:19 +01:00
Alexander Akhmetov 9dac0c9eeb Alerting: Add math node to the converted Prometheus rules (#101097) 2025-02-22 12:36:58 +01:00
Alexander Akhmetov 5a6d9a99f3 Alerting: Generate stable UIDs for alert rules in Prometheus conversion (#100973) 2025-02-22 11:06:42 +01:00
Yuri Tseretyan d1dfa0576b Alerting: Support Jira Integration (#100480) 2025-02-21 12:51:38 -05:00
Matthew Jacobson b78a63b0ad Alerting: Use new image TokenProvider and send image url in annotation (#99989)
* Send new annotation containing image url

* Use new image TokenProvider with TokenStore

New abstraction GetImage no longer needs to support parsing both token and
url from annotations, as remote AM will use the new URLProvider. Instead, we
use the new generic TokenProvider and give it a TokenStore backed by the
grafana database.

That means we revert to always using the token, which simplifies the code and
security considerations.

* Upgrade grafana/alerting to merged commit SHA
2025-02-20 12:47:40 -05:00
Matthew Jacobson 75c4c5ca0f Alerting: Upgrade grafana/alerting to 92d5f29 (#100982)
* Alerting: Upgrade grafana/alerting to 92d5f29

Includes:
- Add more context to log in PipelineAndStateTimestampCoordinationStage (#277)
- Update Alertmanager fork to latest commit (#279)
- Copy http client from Grafana (#281)

* Satisfy signature change from grafana/alerting #281 (http client)
2025-02-19 18:49:46 +02:00
Ryan McKinley 5a40c84568 DualWriter: Support managed DualWriter (#100881) 2025-02-19 17:50:39 +03:00
Stephanie Hingtgen 67be9aeed6 K8s: Search fallback: Support all sort by methods (#100776) 2025-02-18 12:30:11 -06:00
Alexander Akhmetov cbae35c28b Alerting: Delete protobuf alert rule state on alert rule deletion (#100736) 2025-02-14 16:56:14 +01:00
Peter Štibraný 1856d47e47 Remove GetResourceClient hack from unified package. (#100636)
* Remove GetResourceClient hack from unified package.
2025-02-14 12:34:52 +01:00
Yuri Tseretyan 9dd75aee32 Alerting: Refactor State Transition (part 2 of n) (#99985)
* split create to create and patch and move to state

patch will be refactored further

* move setNextState to state transition

* move tests

* split tests for patch function
2025-02-13 09:45:16 -05:00
Alexander Akhmetov 3cc4320aa9 Alerting: Add rule conversion package (#100224) 2025-02-12 19:38:48 +02:00
Alexander Akhmetov 9593e51da7 Alerting: conversion API structure (#100258) 2025-02-12 08:13:21 +01:00
Stephanie Hingtgen df84d928e2 K8s: Folders: Fix legacy search (#100393) 2025-02-11 13:14:25 -06:00
Yuri Tseretyan 28f21e0a0d Alerting: Do not record rule version if no difference (#100364) 2025-02-11 09:46:26 -05:00
Yuri Tseretyan 4cac3158c7 Alerting: Fix alert rule copy to include metadata (#100212)
* copy metadata

* add tests for copy and generator

* extract copy rule to a production method and update usages

* fix tests
2025-02-11 09:46:02 -05:00
Moustafa Baiou 7dee4d1808 Alerting: Allow specifying uid for new rules added to groups (#99858)
When modifying rule groups the `uid` can be specified but only if the rule already existed in the DB. If the rule is new the update would be rejected.

This updates the RuleGroup provisioning apis to allow specifying the `uid` when creating/updating rule groups. 

Additionally, the RuleGroupIdx was not being updated when rules were reordered in the group.

Context: https://github.com/grafana/terraform-provider-grafana/pull/1971#issuecomment-2599223897
Relates to: https://github.com/grafana/terraform-provider-grafana/issues/1928

Fixes: #98283
2025-02-10 10:28:34 -05:00
Yuri Tseretyan 1b8db233a7 Alerting: Rule Version API to Ignore versions without diff (#100093) 2025-02-10 09:20:35 -05:00
Fayzal Ghantiwala 7ae8058c8b Alerting: Return 404 when /api/ruler/grafana/api/v1/rules/{Namespace}/{Groupname} does not exist (#100264)
* Return a 404 when rule group doesn't exist

* Update tests

* Update swagger doc and tests
2025-02-07 16:24:28 +00:00
Matthew Jacobson ccb0e9222a Alerting: Upgrade grafana/alerting to use EmbeddedContents (#99983)
* Upgrade grafana/alerting to include EmbeddedContents for email images
2025-02-06 11:29:43 -05:00
Yuri Tseretyan f7d476e408 Alerting: Remove id and org_id from grafana alert rule API model (#100139) 2025-02-05 23:13:22 +02:00
Yuri Tseretyan 33b11d5c76 Alerting: Remove ID and OrgID from hash calculation (#100140) 2025-02-05 14:15:02 -05:00
Yuri Tseretyan 68f1730461 Alerting: set updated_by for system owned operations (#100068) 2025-02-04 14:23:15 -05:00
Yuri Tseretyan ac41c19350 Alerting: Rule version history API (#99041)
* implement store method to read rule versions

* implement request handler

* declare a new endpoint

* fix fake to return correct response

* add tests

* add integration tests

* rename history to versions

* apply diff from swagger CI step

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>

---------

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
2025-02-03 13:26:18 -05:00
Yuri Tseretyan 807f94b2c7 Alerting: Remove feature toggle alertingNoNormalState (#99905) 2025-02-03 17:32:50 +02:00
Alexander Akhmetov d6c1e3bb45 Alerting: Use org store to read organization IDs (#99938) 2025-02-03 15:38:16 +01:00
Alexander Akhmetov f45265b5f7 Alerting: Read from both proto and simple DB instance stores on startup (#99855) 2025-01-31 23:34:00 +01:00
Yuri Tseretyan 0be6e1bb86 Alerting: Extra dedup stage in Grafana Alertmanager (#99825)
* add feature flags

* update alerting module

* update grafana alertmanager to configure the extra dedup stage

---------

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>
2025-01-31 11:12:38 -05:00
Santiago 39f212a965 Alerting: Call RLock() before reading sendAlertsTo map (#99812)
* Alerting: Call RLock() before reading sendAlertsTo map

* defer unlocking

* drive-through fix for another lock

* less time holding the lock in SyncAndApplyConfigFromDatabase
2025-01-31 12:43:02 +01:00
Yuri Tseretyan 7007342704 Alerting: k8s receivers api encrypt existing unencrypted secureFields on update (#99784)
* apply security patch: v11.5.x/305-202501232115.patch

commit 874ce8d12caad3742857ca86d2da7d5f81f3f825
Author: Matt Jacobson <matthew.jacobson@grafana.com>
Date:   Thu Jan 23 16:14:28 2025 -0500

    linting

commit c4b6d9194cc8b79e252e562a27a2d09a42d7a5e8
Author: Matt Jacobson <matthew.jacobson@grafana.com>
Date:   Thu Jan 23 14:56:35 2025 -0500

    CVE-2024-11741 - victorops url
2025-01-30 00:48:26 +02:00
Garret Wyman cf177776bf Alerting: Adding color option for slack receiver (#99615) 2025-01-30 00:12:16 +02:00
Moustafa Baiou b820fd6bef Alerting: Fix Alertmanager configuration updates (#99610)
* Alerting: Fix Alertmanager configuration updates

Alertmanager configuration updates would behave inconsistently when performing no-op updates with `mysql` as the store.

In particular this bug manifested as a failure to reload the provisioned alertmanager configuration components with no changes to the configuration itself. This would result in a 500 error with mysql store only.

The core issue is that we were relying on the number of rows affected by the update query to determine if the configuration was found in the db or not.
While this behavior works for certain sql dialects, mysql does not return the number of rows matched by the update query but rather the number of rows actually updated.

Also discovered and fixed the mismatched `xorm` tag for the `CreatedAt` field to match the actual column name in the db.

References: https://dev.mysql.com/doc/refman/8.4/en/update.html
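
A dialect-independent way to avoid relying on the affected-row count is to check for the row explicitly. The Python sketch below uses SQLite only to stay self-contained; the table name, column names, and the `update_config` helper are hypothetical, and it is not necessarily how the fix was implemented.

```python
import sqlite3


def update_config(conn: sqlite3.Connection, org_id: int, config: str) -> bool:
    """Update the configuration for an org; return False only if no row exists for it.

    Existence is checked with a SELECT rather than the UPDATE's affected-row count,
    because a no-op update may report zero affected rows even though the row exists.
    """
    found = conn.execute(
        "SELECT 1 FROM alert_configuration WHERE org_id = ?", (org_id,)
    ).fetchone()
    if found is None:
        return False
    conn.execute(
        "UPDATE alert_configuration SET configuration = ? WHERE org_id = ?",
        (config, org_id),
    )
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE alert_configuration (org_id INTEGER PRIMARY KEY, configuration TEXT)")
conn.execute("INSERT INTO alert_configuration VALUES (1, 'cfg')")

print(update_config(conn, 1, "cfg"))  # no-op update of an existing row
print(update_config(conn, 2, "cfg"))  # no row for this org
```

The no-op update is still reported as found, and only a genuinely missing row is reported as not found.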
2025-01-29 23:00:45 +02:00