Commit Graph

93 Commits

Author SHA1 Message Date
Grot (@grafanabot)
5a35850514 [v9.3.x] Alerting: Update migration to put alerts to the default folder if dashboard folder is missing (#66592)
* Alerting: Update migration to put alerts to the default folder if dashboard folder is missing (#65577)

* extract function

* use context logger

* put alert to general folder if folder is missing

* move folderHelper init

* add test

* Update pkg/services/sqlstore/migrations/ualert/ualert.go

Co-authored-by: Matthew Jacobson <matthew.jacobson@grafana.com>

---------

Co-authored-by: Matthew Jacobson <matthew.jacobson@grafana.com>
(cherry picked from commit 7b2f44762e)

* rename ID to Id and dashboards.Dashboard to models.Dashboard

---------

Co-authored-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
2023-04-15 08:09:15 +02:00
ismail simsek
ead3a002df [v9.3.x] Expressions: Fixes the issue showing expressions editor (#62627)
Expressions: Fixes the issue showing expressions editor (#62510)

* Use suggested value for uid

* update the snapshot

* use __expr__

* replace all -100 with __expr__

* update snapshot

* more changes

* revert redundant change

* Use expr.DatasourceUID where it's possible

* generate files

(cherry picked from commit 91221bc436)
2023-01-31 14:31:33 -05:00
Grot (@grafanabot)
9f639403e6 [v9.3.x] Alerting: Prevent uid collision in migration when db is case-insensitive (#60835)
Alerting: Prevent uid collision in migration when db is case-insensitive (#60494)

* Alerting: Prevent short uid collision in legacy migration when db is case-insensitive

Two factors come into play that cause sporadic uid conflicts during legacy alert migration:
- MySQL and MySQL-compatible backends use case-insensitive collation.
- Our short uid generator is not a uniform RNG and generates uids in such a way that generations in quick succession have a higher probability of creating similar uids.

Normally we would be guaranteed unique short uid generation, however if the source alphabet contains
duplicate characters (for example, if we use case-insensitive comparison) this guarantee is void.

Generating even ~1000 uids in quick succession is nearly guaranteed to create a case-insensitive
duplicate.

(cherry picked from commit 570b62091c)

Co-authored-by: Matthew Jacobson <matthew.jacobson@grafana.com>
2022-12-29 16:15:12 -05:00
Alexander Weaver
0dfd78c88c Attempt to preserve UID from migrated channel (#57639) 2022-10-31 11:29:07 -05:00
Matthew Jacobson
0db339d82f Alerting: Improve notification policies created during migration (#52071)
* Alerting: Improve notification policies created during migration

Previously, migrated legacy alerts were connected to notification policies through
a `rule_uid` label in a 1:1 fashion. While this correctly mimicked pre-migration routing,
it didn't create a notification policy structure that is easy to view/modify. In addition,
having one policy per migrated alert is, in some ways, counter to the recommended approach of
Unified Alerting.

This change replaces `rule_uid`-based migrated notification policies with a private
label called `__contacts__`. This label stores a list of double quoted strings containing the names of
all contact points an AlertRule should route to (based on legacy notification channels). Finally,
one notification policy is created per contact point with each matching AlertRules via regex on this
`__contacts__` label.

The result is a simpler, clearer, and easier to modify notification policy structure, with the
added benefit that you can see which contact points an AlertRule is being routed to from the
AlertRule creation page.
2022-10-18 00:47:39 -04:00
Yuriy Tseretyan
3487e68d15 Alerting: Fix migration to create rules with group index 1 (#56511) 2022-10-07 17:20:01 -04:00
Yuriy Tseretyan
e2f1201382 Alerting: Fix migration to not add label "alertname" (#56509)
* do not add label alertname because it is overridden in state manager anyway
* update state manager to not consider labels with same value as dupe
2022-10-07 15:06:53 -04:00
Jean-Philippe Quéméner
f3a307778a Alerting: cache general folder in migration based on org id (#55620) 2022-09-23 18:22:45 +02:00
Emil Tullstedt
647997cc4c Alerting: Fix flaky test (#55551)
The length of the identifier from the underlying library is 9 or more characters depending on the rate at which the identifiers are generated. See https://pkg.go.dev/github.com/teris-io/shortid

The test previously made the assumption that the length will always be 10, which would intermittently fail.
2022-09-21 14:17:25 +02:00
Alexander Weaver
9f45e2e706 Alerting: Fix legacy migration crash when rule name is too long (#55053)
* Extract standardized UID field length to constant

* Extract default length to constant

* Truncate rule names that are too long

* Add tests for name normalization

* Fix whitespace lint error

* Another linter fix

* Empty commit to kick build
2022-09-13 13:53:09 -05:00
Emil Tullstedt
b287047052 Chore: Upgrade Go to 1.19.1 (#54902)
* WIP

* Set public_suffix to a pre Ruby 2.6 version

* we don't need to install python

* Stretch->Buster

* Bump versions in lib.star

* Manually update linter

Sort of messy, but the .mod-file need to contain all dependencies that
use 1.16+ features, otherwise they're assumed to be compiled with
-lang=go1.16 and cannot access generics et al.

Bingo doesn't seem to understand that, but it's possible to manually
update things to get Bingo happy.

* undo reformatting

* Various lint improvements

* More from the linter

* goimports -w ./pkg/

* Disable gocritic

* Add/modify linter exceptions

* lint + flatten nested list

Go 1.19 doesn't support nested lists, and there wasn't an obvious workaround.
https://go.dev/doc/comment#lists
2022-09-12 12:03:49 +02:00
Valério Valério
b5142832fa Alerting: Fix saving of screenshots uploaded with a signed url (#53933)
The URL of screenshots uploaded to external image storages can be optionally signed, resulting in a long string (800+ chars).
2022-08-24 12:40:50 +01:00
idafurjes
6afad51761 Move SignedInUser to user service and RoleType and Roles to org (#53445)
* Move SignedInUser to user service and RoleType and Roles to org

* Use go naming convention for roles

* Fix some imports and leftovers

* Fix ldap debug test

* Fix lint

* Fix lint 2

* Fix lint 3

* Fix type and not needed conversion

* Clean up messages in api tests

* Clean up api tests 2
2022-08-10 11:56:48 +02:00
Sofia Papagiannaki
ae101bf935 Alerting: Fix migration (#53253) 2022-08-03 11:41:18 -04:00
idafurjes
f5cace8bbd Rename Acl to ACL (#52342)
* Rename Acl to ACL

* Fix yaml files

* Add xorm tags and fix test
2022-07-18 15:14:58 +02:00
Matthew Jacobson
434e94ef2b Alerting: Update default route groupBy to [grafana_folder, alertname] (#50052)
* Alerting: Update default route groupBy to [grafana_folder, alertname]

Default group by for new routes and migrations is now [grafana_folder, alertname]
2022-07-11 12:24:43 -04:00
Kristin Laemmert
9de00c8eb2 chore/backend: move dashboard errors to dashboard service (#51593)
* chore/backend: move dashboard errors to dashboard service

Dashboard-related models are slowly moving out of the models package and into dashboard services. This commit moves dashboard-related errors; the rest will come in later commits.

There are no logical code changes, this is only a structural (package) move.

* lint lint lint
2022-06-30 09:31:54 -04:00
Emil Tullstedt
7d815a1db5 Alerting: Use google/uuid instead of gofrs/uuid (#51242) 2022-06-28 11:57:24 +02:00
Kristin Laemmert
945f015770 backend/datasources: move datasources models into the datasources service package (#51267)
* backend/datasources: move datasources models into the datasources service pkg
2022-06-27 12:23:15 -04:00
gotjosh
90646e7f41 Alerting: Don't stop the migration when alert rule tags are invalid (#51253)
* Alerting: Don't stop the migration when alert rule tags are invalid

As we migrate we expect the `alertRuleTags` on a dashboard alert to be a JSON object. However, it seems this is not really validated by Grafana and an user can change the format to something else that the JSON parser is not able to marshal into a `map[string]string`.

Let's do a bit better by "attempting" to parse the tags and if we can't we'll simple return an empty map. The data is still there so if the user wishes they can go back, fix the data and attemp the migration again.
2022-06-22 17:39:17 +01:00
Yuriy Tseretyan
4d02f73e5f Alerting: Persist rule position in the group (#50051)
Migrations:
* add a new column alert_group_idx to alert_rule table
* add a new column alert_group_idx to alert_rule_version table
* re-index existing rules during migration

API:
* set group index on update. Use the natural order of items in  the array as group index
* sort rules in the group on GET
* update the version of all rules of all affected groups. This will make optimistic lock work in the case of multiple concurrent request touching the same groups.

UI:
* update UI to keep the order of alerts in a group
2022-06-22 10:52:46 -04:00
George Robinson
0bd12ab531 Alerting: Fix force_migration when alerting is disabled (#50431)
* Alerting: Fix force_migration when alerting is disabled

This commit fixes a bug where force_migration must be set to true
when both unified and legacy alerting is disabled.

* Update comment

* Fix typo in comment

Co-authored-by: Armand Grillet <armand.grillet@outlook.com>
2022-06-09 10:10:11 +02:00
Joe Blubaugh
30f035ca34 Alerting: Improve Unified Alerting Rollback Warning (#50470)
After migrating to unified alerting, users must explicitly allow rolling
back to legacy alerting by setting force_migration = true in config.
This updates the panic message to clarify why that's required and what
the consequences of rolling back will be.

Fixes #50469
2022-06-09 13:31:49 +08:00
idafurjes
e9f8d582c8 Chore: Remove dashboard version from models (#50287)
* Remove dashbpard version from models

* Fix lint

* Fix api & sqlstore tests

* Remove integration tags

* Fix lint again

* Add integration test to correct namespace

* Lont fix 2

* Change Id to ID in dashVersionMeta
2022-06-08 12:22:55 +02:00
George Robinson
3b7f871bf4 Alerting: Enable Unified Alerting for open source and enterprise (#49834)
This commit enables Unified Alerting for open source and enterprise unless disabled in configuration.
2022-05-30 16:47:15 +01:00
Joe Blubaugh
1cc034d960 Alerting: Add a "Reason" to Alert Instances to show underlying cause of state. (#49259)
This change adds a field to state.State and models.AlertInstance
that indicate the "Reason" that an instance has its current state. This
helps us account for cases where the state is "Normal" but the
underlying evaluation returned "NoData" or "Error", for example.

Fixes #42606

Signed-off-by: Joe Blubaugh <joe.blubaugh@grafana.com>
2022-05-23 16:49:49 +08:00
Joe Blubaugh
12c25759da Alerting: Attach screenshot data to Slack notifications. (#49374)
This change extracts screenshot data from alert messages via a private annotation `__alertScreenshotToken__` and attaches a URL to a Slack message or uploads the data to an image upload endpoint if needed.

This change also implements a few foundational functions for use in other notifiers.
2022-05-23 14:24:20 +08:00
Joe Blubaugh
687e79538b Alerting: Add a general screenshot service and alerting-specific image service. (#49293)
This commit adds a pkg/services/screenshot package for taking and uploading screenshots of Grafana dashboards. It supports taking screenshots of both dashboards and individual panels within a dashboard, using the rendering service.

The screenshot package has the following services, most of which can be composed:

BrowserScreenshotService (Takes screenshots with headless Chrome)
CachableScreenshotService (Caches screenshots taken with another service such as BrowserScreenshotService)
NoopScreenshotService (A no-op screenshot service for tests)
SingleFlightScreenshotService (Prevents duplicate screenshots when taking screenshots of the same dashboard or panel in parallel)
ScreenshotUnavailableService (A screenshot service that returns ErrScreenshotsUnavailable)
UploadingScreenshotService (A screenshot service that uploads taken screenshots)

The screenshot package does not support wire dependency injection yet. ngalert constructs its own version of the service. See https://github.com/grafana/grafana/issues/49296

This PR also adds an ImageScreenshotService to ngAlert. This is used to take screenshots with a screenshotservice and then store their location reference for use by alert instances and notifiers.
2022-05-22 22:33:49 +08:00
Yuriy Tseretyan
d87fdc1037 Alerting: Update migration to migrate only alerts that belong to existing org\dashboard (#49192)
* Update migration to migrate only alerts that belong to existing org\dashboard
2022-05-18 16:00:08 -04:00
Matthew Jacobson
5c32a6b6f6 Alerting: Fix flaky migration test (#48595)
* Fix flaky migration test
2022-05-18 13:23:13 -04:00
Yuriy Tseretyan
00ef1acb93 Alerting: Create folder for alerting when start from the scratch (#48866)
* create folderHelper struct
2022-05-13 11:49:04 -04:00
Yuriy Tseretyan
f85e758972 unhide alert rule's data sources during migraiton (#48559) 2022-05-04 09:31:05 -04:00
Jean-Philippe Quéméner
0a87ef06af Alerting: add safeguard for migrations that might cause dataloss (#48526)
* Alerting: add safeguard for migrations that might cause dataloss

* add test for panic

* add documentation
2022-05-02 10:38:42 +02:00
Jean-Philippe Quéméner
9e3a01a1be Alerting: skip flaky test (#48500) 2022-04-29 12:32:30 +02:00
Matthew Jacobson
0301d956da Alerting: Create fewer contact points on migration (#47291)
* Alerting: Create fewer contact points on migration

Previously a new contact point was created for every unique combination
of channels attached to any legacy alert. This was very hard to maintain,
requiring modifications in every generated contact point.

This change deduplicates the generated contact points to a more
reasonable state. There should now only be one contact point per legacy
channel, and we attached multiple contact points to a route by nesting
them. The sole exception to this is if there were multiple default
legacy channels, in which case we create a redundant contact point
containing all of them used only in the root policy. This allows for a
much simpler notification policy structure.

Co-authored-by: gotjosh <josue.abreu@gmail.com>
2022-04-26 16:17:30 +02:00
Matthew Jacobson
8972418113 Alerting: Add integration test for AddDashAlertMigration (#47730)
* Alerting: Add integration test for AddDashAlertMigration

* Add more targeted test cases

* Apply suggestions from code review

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Reorganize file and improve comments

* Replace custom sort+trim with go-cmp

* Add test for AddDashAlertMigration

* Rename test cases to standard format

* Apply suggestions from code review

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Remove test-only snapshots of PostableUserConfig et al.

* Organize imports

* Fix linting

Co-authored-by: gotjosh <josue.abreu@gmail.com>
2022-04-21 04:02:23 +02:00
Jean-Philippe Quéméner
a80f04c949 Alerting: add collision safe update function for alertmanager configurations (#46692)
* Alerting: add collision safe update function for alertmanager configurations

* fix typo

* use bootstrap func for tests

* move hash calculation to store

* remove icons lol

* remove removed field
2022-03-23 09:31:46 +01:00
Jean-Philippe Quéméner
e135b8531a Alerting: refactor receiver validation to be reusable (#46103) 2022-03-15 00:27:10 +01:00
Yuriy Tseretyan
016d9e14ed Add missing option "OK" for Error state (#45262)
* Add missing OK option to models
* add ok to legacy legacy UI does not support it but it is possible to do so via provisioning.
* use enums in migration so linter would catch missing cases
2022-03-02 19:07:55 -05:00
Santiago
a9de33601c make send_alerts_to field nullable (#45572)
* make send_alerts_to field nullable

* set nullable to true since we have a default value
2022-02-18 01:45:34 -03:00
Yuriy Tseretyan
c59567a236 Alerting: support ok state in alert migration (#45264) 2022-02-10 21:57:43 +01:00
Alexander Weaver
935059a376 Alerting: Create basic storage layer for provisioning (#44679)
* Simplistic store API for provenance lookups on arbitrary types

* Add a few notes in comments

* Improved type safety for provisioned objects

* Clean-up TODOs for future PRs

* Clean up provisioning model

* Clean up tests

* Restrict allowable types in interface

* Fix linter error

* Move AlertRule domain methods to same file as AlertRule definition

* Update pkg/services/ngalert/models/provisioning.go

Co-authored-by: George Robinson <george.robinson@grafana.com>

* Complete interface rename

* Pass context through store API

* More idiomatic method names

* Better error description

* Improve code-docs

* Use ORM language instead of raw sql

* Add support for records in different orgs

* ResourceTypeID -> ResourceType since it's not an ID

Co-authored-by: George Robinson <george.robinson@grafana.com>
2022-02-04 13:23:19 -06:00
Santiago
04d93751b8 Alerting: send alerts to external, internal, or both alertmanagers (#40341)
* (WIP) send alerts to external, internal, or both alertmanagers

* Modify admin configuration endpoint, update swagger docs

* Integration test for admin config updated

* Code review changes

* Fix alertmanagers choice not changing bug, add unit test

* Add AlertmanagersChoice as enum in swagger, code review changes

* Fix API and tests errors

* Change enum from int to string, use 'SendAlertsTo' instead of 'AlertmanagerChoice' where necessary

* Fix tests to reflect last changes

* Keep senders running when alerts are handled just internally

* Check if any external AM has been discovered before sending alerts, update tests

* remove duplicate data from logs

* update comment

* represent alertmanagers choice as an int instead of a string

* default alertmanagers choice to all alertmanagers, test cases

* update definitions and generate spec
2022-02-01 20:36:55 -03:00
Serge Zaitsev
84a5910e56 Chore: Remove bus from ngalert (#44465)
* pass notification service down to the notifiers

* add ns to all notifiers

* remove bus from ngalert notifiers

* use smaller interfaces for notificationservice

* attempt to fix the tests

* remove unused struct field

* simplify notification service mock

* trying to resolve issues in the tests

* make linter happy

* make linter even happier

* linter, you are annoying
2022-01-26 16:42:40 +01:00
Yuriy Tseretyan
ce0ef0ef5e create only one folder per dashboard with acl (#44283) 2022-01-21 10:24:41 -05:00
Yuriy Tseretyan
9139f61105 Alerting: Update alert rule migration to use expanded queries (#42493)
* move targetData to target

* use constants instead of literals.

* Update comments and add tests

Co-authored-by: gotjosh <josue.abreu@gmail.com>
2021-12-01 11:45:27 +00:00
George Robinson
1b26d4d88e Alerting: Create DatasourceError alert if evaluation returns error (#41869)
* Alerting: Create DatasourceError alert if evaluation returns error

* Alerting: Add docs for DatasourceError alert

* Alerting: Fix DatasourceError alert does not have dashboard_uid label

* Alerting: Add break when datasource_uid found

* Alerting: Update TestProcessEvalResults
2021-11-25 11:46:47 +01:00
Armand Grillet
6523486122 Alerting: Make Unified Alerting enabled by default for those who do not use legacy alerting (#42200)
* update AlertingEnabled and UnifiedAlertingSettings.Enabled to be pointers
* add a pseudo migration to fix the AlertingEnabled and UnifiedAlertingSettings.Enabled if the latter is not defined
* update the default configuration file to make default value for both 'enabled' flags be undefined

Misc
* update Migrator to expose DB engine. This is needed for a ualert migration to access the database while the list of migrations is created.
* add more verbose failure when migrations do not match

Co-authored-by: gotjosh <josue@grafana.com>
Co-authored-by: Yuriy Tseretyan <yuriy.tseretyan@grafana.com>
Co-authored-by: gillesdemey <gilles.de.mey@gmail.com>
2021-11-24 14:56:07 -05:00
ying-jeanne
54de1078c8 remove the global log error/warn etc functions (#41404)
* remove the global log error/warn etc functions and use request context logger whenever possible
2021-11-08 17:56:56 +01:00
Yuriy Tseretyan
610643a668 Alerting: Special alert instance if rule is in state NoData (#40540)
* do not suppress NoData state
* extract conversion of state to postable alert + tests
* create a special alert instance if nodata 
* use NoData when converting from Keep Last State instead of Alerting
* add silence during migration if NoData is mapped to KeepLastState.
2021-11-04 16:42:34 -04:00