Commit Graph

152 Commits

Author SHA1 Message Date
Alexander Akhmetov
9f5b05f936 Alerting: Add metadata field with editor_settings to alert rule (#93245) 2024-09-19 16:43:41 +02:00
Matthew Jacobson
32f06c6d9c Alerting: Receiver API complete core implementation (#91738)
* Replace global authz abstraction with one compatible with uid scope

* Replace GettableApiReceiver with models.Receiver in receiver_svc

* GrafanaIntegrationConfig -> models.Integration

* Implement Create/Update methods

* Add optimistic concurrency to receiver API

* Add scope to ReceiversRead & ReceiversReadSecrets

migrates existing permissions to include implicit global scope

* Add receiver create, update, delete actions

* Check if receiver is used by rules before delete

* On receiver name change update in routes and notification settings

* Improve errors

* Linting

* Include read permissions are requirements for create/update/delete

* Alias ngalert/models to ngmodels to differentiate from v0alpha1 model

* Ensure integration UIDs are valid, unique, and generated if empty

* Validate integration settings on create/update

* Leverage UidToName to GetReceiver instead of GetReceivers

* Remove some unnecessary uses of simplejson

* alerting.notifications.receiver -> alerting.notifications.receivers

* validator -> provenanceValidator

* Only validate the modified receiver

stops existing invalid receivers from preventing modification of a valid
receiver.

* Improve error in Integration.Encrypt

* Remove scope from alert.notifications.receivers:create

* Add todos for receiver renaming

* Use receiverAC precondition checks in k8s api

* Linting

* Optional optimistic concurrency for delete

* make update-workspace

* More specific auth checks in k8s authorize.go

* Add debug log when delete optimistic concurrency is skipped

* Improve error message on authorizer.DecisionDeny

* Keep error for non-forbidden errutil errors
2024-08-26 10:47:53 -04:00
Matthew Jacobson
ba800692c6 Alerting: Persist AlertInstance ResolvedAt & LastSentAt (#89135)
* Alerting: Persist AlertInstance ResolvedAt & LastSentAt

* Fix test

* Modify existing tests

* Fix merge conflicts from nullable LastSentAt & ResolvedAt
2024-07-12 12:26:58 -04:00
Alexander Weaver
36ef611cf4 Alerting: Add database migration for recording rule fields (#87012)
* Create recording rule fields in model

* Add migration

* Write to database, support in version table

* extend fingerprint

* Force fields to be empty on validate

* Another storage spot, tests for fingerprint

* Explicitly set defaults in provisioning API

* Tests for main API validation

* Add diff tests even though fields are unpopulated for now

* Use struct tag approach instead of FromDB/ToDB hooks as it better handles nulls when deserializing

* test for deser

* Backout RecordTo for now since it's not decided in the doc

* back out of migration too

* Drop datasourceref for now

* address linter complaints

* Try a single outer struct with all fields embedded
2024-05-09 12:12:44 -05:00
Yuri Tseretyan
1eebd2a4de Alerting: Support for simplified notification settings in rule API (#81011)
* Add notification settings to storage\domain and API models. Settings are a slice to workaround XORM mapping
* Support validation of notification settings when rules are updated

* Implement route generator for Alertmanager configuration. That fetches all notification settings.
* Update multi-tenant Alertmanager to run the generator before applying the configuration.

* Add notification settings labels to state calculation
* update the Multi-tenant Alertmanager to provide validation for notification settings

* update GET API so only admins can see auto-gen
2024-02-15 09:45:10 -05:00
Yuri Tseretyan
f6a46744a6 Alerting: Support hysteresis command expression (#75189)
Backend: 

* Update the Grafana Alerting engine to provide feedback to HysteresisCommand. The feedback information is stored in state.Manager as a fingerprint of each state. The fingerprint is persisted to the database. Only fingerprints that belong to Pending and Alerting states are considered as "loaded" and provided back to the command.
   - add ResultFingerprint to state.State. It's different from other fingerprints we store in the state because it is calculated from the result labels.
  -  add rule_fingerprint column to alert_instance
   - update alerting evaluator to accept AlertingResultsReader via context, and update scheduler to provide it.
   - add AlertingResultsFromRuleState that implements the new interface in eval package
   - update getExprRequest to patch the hysteresis command.

* Only one "Recovery Threshold" query is allowed to be used in the alert rule and it must be the Condition.


Frontend:

* Add hysteresis option to Threshold in UI. It's called "Recovery Threshold"
* Add test for getUnloadEvaluatorTypeFromCondition
* Hide hysteresis in panel expressions

* Refactor isInvalid and add test for it
* Remove unnecesary React.memo
* Add tests for updateEvaluatorConditions

---------

Co-authored-by: Sonia Aguilar <soniaaguilarpeiron@gmail.com>
2024-01-04 11:47:13 -05:00
Matthew Jacobson
6644e5e676 Alerting: Fix migration that is brittle to version downgrades (#78976) 2023-12-01 15:18:41 -05:00
Matthew Jacobson
cdad712547 Alerting: Keep track of individual org migration status (#78369)
* Alerting: Keep track of individual org migration status

Save migration status per migrated org.

Change the meaning (and key/value) of the org_id=0 entry 
to store the current (previous) config value used by alerting. 
This is so we can know when to upgrade/downgrade by 
comparing with the new config value in 
UnifiedAlerting.IsEnabled.
2023-11-30 10:25:59 -05:00
Matthew Jacobson
82f3127e23 Alerting: Move legacy alert migration from sqlstore migration to service (#72702) 2023-10-12 13:43:10 +01:00
Alexander Weaver
f6649d7a97 Revert "Alerting: Remove vendored models in migration service" (#76387)
Revert "Alerting: Remove vendored models in migration service (#74503)"

This reverts commit 6a8649d544.
2023-10-11 14:21:21 -05:00
Matthew Jacobson
6a8649d544 Alerting: Remove vendored models in migration service (#74503)
This PR replaces the vendored models in the migration with their equivalent ngalert models. It also replaces the raw SQL selects and inserts with service calls.

It also fills in some gaps in the testing suite around:

    - Migration of alert rules: verifying that the actual data model (queries, conditions) are correct 9a7cfa9
    - Secure settings migration: verifying that secure fields remain encrypted for all available notifiers and certain fields migrate from plain text to encrypted secure settings correctly e7d3993

Replacing the checks for custom dashboard ACLs will be replaced in a separate targeted PR as it will be complex enough alone.
2023-10-11 17:22:09 +01:00
George Robinson
ed7d29f2b9 Alerting: Migrate old alerting templates to Go templates (#62911)
* Migrate old alerting templates to use $labels

* Fix imports

* Add test coverage and separate rewriting to Go templates

* Fix lint

* Check for additional closing braces

* Add logging of invalid message templates

* Fix tests

* Small fixes

* Update comments

* Panic on empty token

* Use logtest.Fake

* Fix lint

* Allow for spaces in variable names by not tokenizing spaces

* Add template function to deduplicate Labels in a Value map

* Fix behavior of mapLookupString

* Reference deduplicated labels in migrated message template

* Fix behavior of deduplicateLabelsFunc

* Don't create variable for parent logger

* Add more tests for deduplicateLabelsFunc

* Remove unused function

* Apply suggestions from code review

Co-authored by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>

* Give label val merge function better name

* Extract template migration and escape literal tokens

* Consolidate + simplify template migration

---------

Co-authored-by: William Wernert <william.wernert@grafana.com>
2023-10-02 11:25:33 -04:00
Serge Zaitsev
58f6648505 Chore: capitalise messages for alerting (#74335) 2023-09-04 18:46:34 +02:00
Ryan McKinley
025b2f3011 Chore: use any rather than interface{} (#74066) 2023-08-30 18:46:47 +03:00
Yuri Tseretyan
143683bd05 Alerting: Add more clear error to migration when rule cannot be parsed (#72374) 2023-07-27 09:47:04 -04:00
Matthew Jacobson
c7eb7fb58a Alerting: Keep legacy alert rule maxDataPoints and intervalMs during migration (#71989)
Keep legacy alert rule max-data-points and interval during migration
2023-07-24 13:36:34 -04:00
Matthew Jacobson
8c6cdf51fc Alerting: No longer silence paused alerts during legacy migration (#71596)
* Alerting: No longer silence paused alerts during legacy migration

Now that we migrate paused legacy alerts to paused UA alert rules, we no longer need to silence them.
2023-07-17 09:37:53 -04:00
Jo
d6c468c1c2 Auth: Add empty role definition (#64694)
* Allow setting role as None

Co-authored-by: gamab <gabi.mabs@gmail.com>

Seeking for places where role.None would be used

Co-authored-by: Jguer <joao.guerreiro@grafana.com>

Adding None role to the frontend

Co-authored-by: Jguer <joao.guerreiro@grafana.com>

unify org role declaration and remove from add permission

fix backend test

fix backend lint

* remove role none from frontend

* Simplify checks

Co-authored-by: Kalle Persson <kalle.persson@grafana.com>

* nits

---------

Co-authored-by: Kalle Persson <kalle.persson@grafana.com>
2023-07-06 15:40:06 +02:00
Matthew Jacobson
00d5f7fed7 Alerting: Convert 'Both' type Prometheus queries to 'Range' in migration (#70781)
* Alerting: Convert 'Both' type Prometheus queries to 'Range' in migration
2023-06-28 14:02:57 -04:00
Santiago
ff9eff49bd Alerting: Bump grafana/alerting and refactor the ImageStore/Provider to provide image URL/bytes (#70182)
* implement alerting.images.Provider interface in our ImageStore

* add URLExists() method to fakeConfigStore

* make linter happy

* update integration tests
2023-06-21 20:53:30 -03:00
Dan Cech
a6279b2d62 Database: Make dialects independent of xorm Engine (#69955)
* make dialects independent of xorm Engine

* goimports
2023-06-14 16:13:36 -04:00
Alexander Weaver
0ed5d3bdf2 Revert "Alerting: Refactor the ImageStore/Provider to provide image URL/bytes" (#69265)
Revert "Alerting: Refactor the ImageStore/Provider to provide image URL/bytes (#67693)"

This reverts commit 72a187b0be.
2023-05-30 11:33:33 -05:00
Santiago
72a187b0be Alerting: Refactor the ImageStore/Provider to provide image URL/bytes (#67693)
* (WIP) Refactor the ImageStore interface to work with our latest alerting repository

* update alerting package

* refactor, new URLExists method in ImageProvider

* tests for the new methods

* fix linter warnings

* use alertingImages as an alias for grafana/alerting/images

* logs about image uris and not found images

* nerf image not found logs

* extract duplicated code to getImageFromURI() method

* refactor getImageFromURI()

* add index on url

* add comment about migration log

* sync generated files
2023-05-30 11:25:55 -03:00
Yuri Tseretyan
3af95bebe1 Alerting: Migrate unknown NoData\Error settings to the default (#68403)
* use default execution if legacy is not known

* update docs

* Update docs/sources/alerting/migrating-alerts/migrating-legacy-alerts.md

Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>

* Update docs/sources/alerting/migrating-alerts/migrating-legacy-alerts.md

Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>

---------

Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>
2023-05-24 13:09:17 -04:00
Yuri Tseretyan
0ce7f7eaf4 Alerting: Migration to not fail if alert_configuration table is not empty (#67924) 2023-05-05 12:03:53 -04:00
Yuri Tseretyan
61484fa826 Alerting: Mention title of alert rule that caused migration to fail (#67451)
* add debug log for migration of alert rules
* add alert rule name and some information to conversion error
2023-05-01 10:32:16 -04:00
Yuri Tseretyan
a8b4a4bb45 Alerting: Update alerting module to 20230418161049-5f374e58cb32 + refactoring (#66622)
* update to alerting 20230418161049-5f374e58cb32
* rename renamed structs in https://github.com/grafana/alerting/pull/73
* update ValidateContactPoint to use BuildReceiverConfiguration
* update logger factory according to changes
* rewrite integration builder
Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>
2023-04-25 13:39:46 -04:00
Ieva
035bf29146 RBAC: Remove the option to disable RBAC and add automated permission migrations for instances that had RBAC disabled (#66652)
* RBAC: Stop reading enabeld from ini file and always set to true

* Migrations: Add a migration for rbac to reset data migrations if rbac
was disabled

* If rbac was disabled we reset the data and data migrations that rbac
  has to perform to get it to a correct state

* Migrator: Store migration logs on migrator and add function to clear it from the
in-memory stored logs

* update tests

---------

Co-authored-by: Karl Persson <kalle.persson@grafana.com>
2023-04-19 16:34:19 +01:00
Yuri Tseretyan
afd52d0866 Alerting: use alerting GrafanaReceiver and BuildReceiverConfiguration in Grafana (#65224)
* replace receiver errors with one from alerting
* add the converter to alerting models
* update buildReceiverIntegration to accept GrafanaReceiver
---------

Co-authored-by: George Robinson <george.robinson@grafana.com>
2023-04-13 12:25:32 -04:00
Yuri Tseretyan
7b2f44762e Alerting: Update migration to put alerts to the default folder if dashboard folder is missing (#65577)
* extract function

* use context logger

* put alert to general folder if folder is missing

* move folderHelper init

* add test

* Update pkg/services/sqlstore/migrations/ualert/ualert.go

Co-authored-by: Matthew Jacobson <matthew.jacobson@grafana.com>

---------

Co-authored-by: Matthew Jacobson <matthew.jacobson@grafana.com>
2023-03-31 18:40:21 +02:00
ismail simsek
3cfb7ac0dd Chore: Remove expr imports (#64543)
* Remove expr imports

* Add comment
2023-03-23 21:55:54 +01:00
George Robinson
030f6c948f Alerting: Fix migration pauses all alert rules on PostgreSQL (#63951)
This commit fixes a serious bug in Grafana 9.4.1 where on upgrade
a migration would pause all existing alert rules and change the
default value of the column to true.
2023-03-01 17:32:29 +00:00
Alex Moreno
a05bf41ff9 Alerting: Fix boolean default in migration from false to 0 (#63952)
Fix boolean default in migration from false to 0
2023-03-01 16:58:30 +00:00
Ryan McKinley
b1e58eb47e Chore: Replace short UID generation with more standard UUIDs (#62731) 2023-02-06 20:44:37 -05:00
Yuri Tseretyan
f066e8cdcd Alerting: Update to alerting 20230203015918-0e4e2675d7aa (after refactoring) (#62823)
* add alerting prefix to some packages from alerting that have similar names in prometheus alertmanager
2023-02-03 11:36:49 -05:00
idafurjes
00d954320f Chore: rename Id to ID in alert notification models (#62868) 2023-02-03 15:46:55 +01:00
George Robinson
f49efa6e27 Alerting: Pause dash alerts on migration (#62798)
* Alerting: Pause dash alerts on migration
2023-02-02 16:49:05 -05:00
Gilles De Mey
c3cd4a720f Alerting: Adds a default value to the last_applied column (#62818) 2023-02-02 19:35:28 +00:00
Santiago
ba731f7865 Alerting: Mark AM configuration as applied (#61330)
* Mark AM configuration as applied

* add missing checks, make linter happy

* fix deadlock, mark as valid on save and on load

* mark configurations only if needed

* check error after applyConfig()

* code review comments

* code review changes

* more code review changes

* clean HistoricConfigFromAlertConfig function
2023-02-02 14:45:17 -03:00
idafurjes
23c27cffb3 Chore: Rename Id to ID in alerting models (#62777)
* Chore: Rename Id to ID in alerting models

* Add xorm tags for datasource

* Add xorm tag for uid
2023-02-02 17:22:43 +01:00
ismail simsek
91221bc436 Expressions: Fixes the issue showing expressions editor (#62510)
* Use suggested value for uid

* update the snapshot

* use __expr__

* replace all -100 with __expr__

* update snapshot

* more changes

* revert redundant change

* Use expr.DatasourceUID where it's possible

* generate files
2023-01-31 18:50:10 +01:00
suntala
51bef166c2 Chore: Remove Result field from serviceaccounts, ualert (#62476)
* Chore: Remove Result field from serviceaccounts
* Chore: Remove Result field from ualert
2023-01-31 09:51:55 +01:00
Serge Zaitsev
d6d4097567 Chore: Fix goimports grouping in alerting (#62424)
* fix goimports

* fix goimports order
2023-01-30 09:55:35 +01:00
Matthew Jacobson
8379a29b53 Alerting: Improve comments on alert table migration immutability (#62161)
* Alerting: Improve comments around alert migration immutability

* Reunite alerting config history migrations
2023-01-26 16:13:08 -05:00
Alex Moreno
531b439cf1 Alerting: Add alert pausing feature (#60734)
* Add field in alert_rule model, add state to alert_instance model, and state to eval

* Remove paused state from eval package

* Skip paused alert rules in scheduler

* Add migration to add is_paused field to alert_rule table

* Convert to postable alerts only if not normal, pernding, or paused

* Handle paused eval results in state manager

* Add Paused state to eval package

* Add paused alerts logic in scheduler

* Skip alert on scheduler

* Remove paused status from eval package

* Apply suggestions from code review

Co-authored-by: George Robinson <george.robinson@grafana.com>

* Remove state

* Rethink schedule and manager for paused alerts

* Change return to continue

* Remove unused var

* Rethink alert pausing

* Paused alerts storing annotations

* Only add one state transition

* Revert boolean method renaming refactor

* Revert take image refactor

* Make registry errors public

* Revert method extraction for getting a folder title

* Revert variable renaming refactor

* Undo unnecessary changes

* Revert changes in test

* Remove IsPause check in PatchPartiLAlertRule function

* Use SetNormal to set state

* Fix text by returning to old behaviour on alert rule deletion

* Add test in schedule_unit_test.go to test ticks with paused alerts

* Add coment to clarify usage of context.Background()

* Add comment to clarify resetStateByRuleUID method usage

* Move rule get to a more limited scope

* Update pkg/services/ngalert/schedule/schedule.go

Co-authored-by: George Robinson <george.robinson@grafana.com>

* rum gofmt on pkg/services/ngalert/schedule/schedule.go

* Remove defer cancel for context

* Update pkg/services/ngalert/models/instance_test.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* Update pkg/services/ngalert/models/testing.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* Update pkg/services/ngalert/schedule/schedule_unit_test.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* Update pkg/services/ngalert/schedule/schedule_unit_test.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* Update pkg/services/ngalert/models/instance_test.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* skip scheduler rule state clean up on paused alert rule

* Update pkg/services/ngalert/schedule/schedule.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* Fix mock in test

* Add (hopefully) final suggestions

* Use error channel from recordAnnotationsSync to cancel context

* Run make gen-cue

* Place pause alert check in channel update after version check

* Reduce branching un update channel select

* Add if for error and move code inside if in state manager ResetStateByRuleUID

* Add reason to logs

* Update pkg/services/ngalert/schedule/schedule.go

Co-authored-by: George Robinson <george.robinson@grafana.com>

* Do not delete alert rule routine, just exit on eval if is paused

* Reduce branching and create-close a channel to avoid deadlocks

* Separate state deletion and state reset (includes history saving)

* Add current pause state in rule route in scheduler

* Split clearState and bring errCh closer to RecordStatesAsync call

* Change rule to ruleMeta in RecordStatesAsync

* copy state to be able to modify it

* Add timeout to context creation

* Shorten the timeout

* Use resetState is rule is paused and deleteState if rule is not paused

* Remove Empty state reason

* Save every rule change in historian

* Add tests for DeleteStateByRuleUID and ResetStateByRuleUID

* Remove useless line

* Remove outdated comment

Co-authored-by: George Robinson <george.robinson@grafana.com>
Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>
Co-authored-by: Armand Grillet <2117580+armandgrillet@users.noreply.github.com>
2023-01-26 18:29:10 +01:00
Kristin Laemmert
e8b8a9e276 chore: move dashboard_acl models into dashboard service (#62151) 2023-01-26 08:46:30 -05:00
Kristin Laemmert
40feee0d17 chore: move alert-related models (#61716)
* chore: move alert notification models into the alerting service (alerting/models)
2023-01-23 08:19:25 -05:00
idafurjes
b573b19ca3 Chore: Remove dashboards from models pkg (#61578)
* Copy dashboard models to dashboard pkg

* Use some models from current pkg instead of models

* Adjust api pkg

* Adjust pkg services

* Fix lint

* Chore: Remove dashboards models

* Remove dashboards from models pkg

* Fix lint in tests

* Fix lint in tests 2

* Fix for import in auth

* Remove newline

* Revert unused fix
2023-01-18 13:52:41 +01:00
Matthew Jacobson
63ba3ccb58 Alerting: Improve legacy migration to include send reminder & frequency (#60275)
* Alerting: Improve legacy migration to include send reminder & frequency

Legacy channel frequency is migrated to the channel's migrated route's
repeat interval if send reminder is true. If send reminder is false, we
pseudo-disable the repeat interval by setting it to a large value (1y).

If there were no default channels, the root notification policy is still
created with the default 4h repeat interval.
2023-01-10 23:01:43 -05:00
Alexander Weaver
0e7640475f Alerting: Store alertmanager configuration history in a separate table in the database (#60492)
* Update config store to split between active and history tables

* Migrations to fix up indexes

* Implement migration from old format to new

* Move add migrations call

* Delete duplicated rows

* Explicitly map fields

* Quote the column name because it's a reserved word

* Lift migrations to top

* Use XORM for nearly everything, avoid any non trivial raw SQL

* Touch up indexes and zero out IDs on move

* Drop TODO that's already completed

* Fix assignment of IDs
2023-01-04 10:43:26 -06:00