Commit Graph

115 Commits

Author SHA1 Message Date
ying-jeanne 54de1078c8 remove the global log error/warn etc functions (#41404)
* remove the global log error/warn etc functions and use request context logger whenever possible
2021-11-08 17:56:56 +01:00
Tania B 5652bde447 Encryption: Use secrets service (#40251)
* Use secrets service in pluginproxy

* Use secrets service in pluginxontext

* Use secrets service in pluginsettings

* Use secrets service in provisioning

* Use secrets service in authinfoservice

* Use secrets service in api

* Use secrets service in sqlstore

* Use secrets service in dashboardshapshots

* Use secrets service in tsdb

* Use secrets service in datasources

* Use secrets service in alerting

* Use secrets service in ngalert

* Break cyclic dependancy

* Refactor service

* Break cyclic dependancy

* Add FakeSecretsStore

* Setup Secrets Service in sqlstore

* Fix

* Continue secrets service refactoring

* Fix cyclic dependancy in sqlstore tests

* Fix secrets service references

* Fix linter errors

* Add fake secrets service for tests

* Refactor SetupTestSecretsService

* Update setting up secret service in tests

* Fix missing secrets service in multiorg_alertmanager_test

* Use fake db in tests and sort imports

* Use fake db in datasources tests

* Fix more tests

* Fix linter issues

* Attempt to fix plugin proxy tests

* Pass secrets service to getPluginProxiedRequest in pluginproxy tests

* Fix pluginproxy tests

* Revert using secrets service in alerting and provisioning

* Update decryptFn in alerting migration

* Rename defaultProvider to currentProvider

* Use fake secrets service in alert channels tests

* Refactor secrets service test helper

* Update setting up secrets service in tests

* Revert alerting changes in api

* Add comments

* Remove secrets service from background services

* Convert global encryption functions into vars

* Revert "Convert global encryption functions into vars"

This reverts commit 498eb19859.

* Add feature toggle for envelope encryption

* Rename toggle

Co-authored-by: Emil Tullstedt <emil.tullstedt@grafana.com>
Co-authored-by: Joan López de la Franca Beltran <joanjan14@gmail.com>
2021-11-04 18:47:21 +02:00
Yuriy Tseretyan a1e1a728ad Alerting: Update references to alertmanager (#40904)
* update module reference for alertmanager
* remove workaround
2021-10-29 10:03:51 -04:00
Santiago c9654c4bc0 Fix issues with Slack contact points (#40953)
* recipient validation regex modified, validation at creation/modification implemented

* Remove validation for recipient, fix tests

* Log level changed from Warn to Error
2021-10-27 13:58:37 -03:00
Skye bce1011361 Alerting: Option for Discord notifier to use webhook name (#40463)
* Added an option to discord notifier to use discord's webhook name (useful for customizing notifications).

* Support ngalert system with discord username toggle

* Added ngalert discord test

* Apply suggestions from code review

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Docs updated with discord username setting

* Fix api integration test

Co-authored-by: Marcus Efraimsson <marcus.efraimsson@gmail.com>
Co-authored-by: gotjosh <josue.abreu@gmail.com>
2021-10-26 14:55:10 -04:00
Santiago 1095f69740 Fix slack contact point panic (code review changes) (#40770)
* panic when unexpected Slack response fixed, tests added

* fix linting errors, add test

* Code review changes

* Apply PR comments

Co-authored-by: Armand Grillet <armand.grillet@outlook.com>
2021-10-22 17:23:22 +02:00
gotjosh 74fb491b6a Alerting: Validate contact point configuration during migration to Unified Alerting (#40717)
* Alerting: Validate contact point configuration during the migration

This minimises the chances of generating broken configuration as part of the migration. Originally, we wanted to generate it and not produce a hard stop in Grafana but this strategy has the chance to avoid delivering notifications for our users.

We now think it's better to hard stop the migration and let the user take care of resolving the configuration manually.
2021-10-22 10:11:06 +01:00
George Robinson 967721068e Alerting: Support custom annotations and labels when testing contact points
Support custom annotations and labels when testing contact points
2021-10-21 13:47:06 +01:00
Santiago e54fe220e5 Fix panic when Slack API sends unexpected response (#40721)
* panic when unexpected Slack response fixed, tests added

* fix linting errors, add test
2021-10-21 08:12:15 +02:00
Yuriy Tseretyan 36e12e2b26 Remove flakiness of google chat tests (#40592) 2021-10-19 16:18:44 -04:00
Jean-Philippe Quéméner 153c356993 Alerting: delete orphaned records from kvstore (#40337) 2021-10-14 12:04:00 +02:00
gotjosh 48d73cb148 Alerting: Fixes a bug when trying to sync broken alertmanager config (#40338)
* Alerting: Fixes a bug when trying to sync broken alertmanager config

Broken alertmanager configuration has the potential to be introduced as part of a migration e.g. due to incompatible data between what grafana accepts and what the Alertmanager expects. When this happens, we expect an eventually consistent behaviour where we'll keep trying to apply the configuration until it works.

As part of change in https://github.com/grafana/grafana/pull/39237 we introduced a regression that modified this behaviour and instead tried to create a new Alertmanager for that organization everytime, which eventually ended up in a panic due to a duplicate metrics being registered.

This PR fixes that and introduces a test to catch further regressions.

* Remove disable orgs
2021-10-12 18:10:08 +01:00
Jean-Philippe Quéméner e1dfec49f9 Alerting: cleanup alert resources on org removal (#39938) 2021-10-12 12:05:02 +02:00
George Robinson 8318e45452 Alerting: Fix error message in ngalert when notifications cannot be sent to alertmanager (#40158) 2021-10-11 14:50:50 +01:00
Jean-Philippe Quéméner d9c0220824 Alerting: add organziation ID to the ngAlert webhook payload (#40189)
* Alerting: add organziation ID to the ngAlert webhook payload
2021-10-08 14:52:44 +02:00
Yuriy Tseretyan 5836def6c2 Alerting: declare constants for __dashboardUid__ and __panelId__ literals (#39976) 2021-10-07 17:30:06 -04:00
Joan López de la Franca Beltran 722c414fef Encryption: Refactor securejsondata.SecureJsonData to stop relying on global functions (#38865)
* Encryption: Add support to encrypt/decrypt sjd

* Add datasources.Service as a proxy to datasources db operations

* Encrypt ds.SecureJsonData before calling SQLStore

* Move ds cache code into ds service

* Fix tlsmanager tests

* Fix pluginproxy tests

* Remove some securejsondata.GetEncryptedJsonData usages

* Add pluginsettings.Service as a proxy for plugin settings db operations

* Add AlertNotificationService as a proxy for alert notification db operations

* Remove some securejsondata.GetEncryptedJsonData usages

* Remove more securejsondata.GetEncryptedJsonData usages

* Fix lint errors

* Minor fixes

* Remove encryption global functions usages from ngalert

* Fix lint errors

* Minor fixes

* Minor fixes

* Remove securejsondata.DecryptedValue usage

* Refactor the refactor

* Remove securejsondata.DecryptedValue usage

* Move securejsondata to migrations package

* Move securejsondata to migrations package

* Minor fix

* Fix integration test

* Fix integration tests

* Undo undesired changes

* Fix tests

* Add context.Context into encryption methods

* Fix tests

* Fix tests

* Fix tests

* Trigger CI

* Fix test

* Add names to params of encryption service interface

* Remove bus from CacheServiceImpl

* Add logging

* Add keys to logger

Co-authored-by: Emil Tullstedt <emil.tullstedt@grafana.com>

* Add missing key to logger

Co-authored-by: Emil Tullstedt <emil.tullstedt@grafana.com>

* Undo changes in markdown files

* Fix formatting

* Add context to secrets service

* Rename decryptSecureJsonData to decryptSecureJsonDataFn

* Name args in GetDecryptedValueFn

* Add template back to NewAlertmanagerNotifier

* Copy GetDecryptedValueFn to ngalert

* Add logging to pluginsettings

* Fix pluginsettings test

Co-authored-by: Tania B <yalyna.ts@gmail.com>
Co-authored-by: Emil Tullstedt <emil.tullstedt@grafana.com>
2021-10-07 17:33:50 +03:00
gotjosh 6572017ec7 Alerting: Allow more characters in label names so notifications are sent (#38629)
Remove validation for labels to be accepted in the Alertmanager, This helps with datasources that produce non-compatible labels.

Adds an "object_matchers" to alert manager routers so we can support labels names with extended characters beyond prometheus/openmetrics. It only does this for the internal Grafana managed Alert Manager.

This requires a change to alert manager, so for now we use grafana/alertmanager which is a slight fork, with the intention of going back to upstream.

The frontend handles the migration of "matchers" -> "object_matchers" when the route is edited and saved. Once this is done, downgrades will not work old versions will not recognize the "object_matchers".

Co-authored-by: Kyle Brandt <kyle@grafana.com>
Co-authored-by: Nathan Rodman <nathanrodman@gmail.com>
2021-10-04 15:06:40 +02:00
Yuriy Tseretyan 4dadb8fc51 Alerting: Remove extra field orgId from notifier.Alertmanager (#39870) 2021-10-01 09:54:37 -04:00
Sofia Papagiannaki 012d4f0905 Alerting: Remove ngalert feature toggle and introduce two new settings for enabling Grafana 8 alerts and disabling them for specific organisations (#38746)
* Remove `ngalert` feature toggle

* Update frontend

Remove all references of ngalert feature toggle

* Update docs

* Disable unified alerting for specific orgs

* Add backend tests

* Apply suggestions from code review

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

* Disabled unified alerting by default

* Ensure backward compatibility with old ngalert feature toggle

* Apply suggestions from code review

Co-authored-by: gotjosh <josue@grafana.com>
2021-09-29 16:16:40 +02:00
Sofia Papagiannaki f6f3a54742 Alerting: tune rule evaluation via configuration (#35623)
* Alerting: Configure max evaluation retries

* Alerting: Enforce minimum rule evaluation interval

* Alerting: Disable rule evaluation from configuration

* Update docs

* Alerting: Configure rule evaluation timeout

* Move options on unified_alerting config section

* Apply suggestions from code review

Co-authored-by: gotjosh <josue@grafana.com>
2021-09-28 13:00:16 +03:00
Yuriy Tseretyan 05eb30e323 Alerting: Move alertmanager default config to UnifiedAlertingSettings (#39597) 2021-09-23 13:52:20 -04:00
Yuriy Tseretyan 1910d85ae0 Alerting: Optimization of fetching data in multiorg alertmanager (#39237)
* Add method GetAllLatestAlertmanagerConfiguration to DBStore
* add method ApplyConfig to AlertManager
* update multiorg alert manager to load all alertmanager configs at once
2021-09-21 11:01:23 -04:00
gotjosh 2ad82b9354 Alerting: Move the unified alerting settings to its own struct (#39350) 2021-09-20 10:12:21 +03:00
Yuriy Tseretyan e1aae0549e Provide reader to alertmanager silence instead of file path (#39305) 2021-09-17 14:12:27 -04:00
gotjosh 7db97097c9 Alerting: Support Unified Alerting with Grafana HA (#37920)
* Alerting: Support Unified Alerting in Grafana's HA mode.
2021-09-16 15:33:51 +01:00
Santiago 0d2e68537c Alerting: Cleanup template, silence and notification files created du… (#39007)
* Alerting: Cleanup template, silence and notification files created during tests

* Create tempdir for testing, delete afterwards and check for errors

* Refactoring error checks

* Update docs/sources/enterprise/access-control/fine-grained-access-control-references.md

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

* Update docs/sources/administration/configuration.md

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

* Update docs/sources/enterprise/access-control/fine-grained-access-control-references.md

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>
2021-09-15 18:48:52 -03:00
gotjosh 2b1d3d27e4 Alerting: Fix bug not creating filepath for silences/nflog if it does not exist (#39174)
We created this filepath just as we're about persist the templates - with the latest change, we now need to create it sooner.
2021-09-14 14:40:59 +01:00
gotjosh a2f4344bf2 Alerting: Refactor & fix unified alerting metrics structure (#39151)
* Alerting: Refactor & fix unified alerting metrics structure

Fixes and refactors the metrics structure we have for the ngalert service. Now, each component has its own metric struct that includes the JUST the metrics it uses. Additionally, I have fixed the configuration metrics and added new metrics to determine if we have discovered and started all the necessary configurations of an instance.

This allows us to alert on `grafana_alerting_discovered_configurations - grafana_alerting_active_configurations != 0` to know whether an alertmanager instance did not start successfully.
2021-09-14 12:55:01 +01:00
Marcus Efraimsson 2cc0788187 Chore: Disable backend test for now since it adds 10 minutes extra in CI (#39150)
Ref #38586
2021-09-13 19:37:26 +02:00
gotjosh 39a3bb8a1c Alerting: Persist notification log and silences to the database (#39005)
* Alerting: Persist notification log and silences to the database

This removes the dependency of having persistent disk to run grafana alerting. Instead of regularly flushing the notification log and silences to disk we now flush the binary content of those files to the database encoded as a base64 string.
2021-09-09 17:25:22 +01:00
Yuriy Tseretyan 6c2884ac37 Alerting: Fix notifier tests to close the temp file (#38992) 2021-09-09 09:56:42 -04:00
gotjosh 2f27a5240b Alerting: Fix flake on test receiver tests (#38511)
* Alerting: Fix flake on test receiver tests

* Make the actual result from the API be sorted

* Use the correct letters
2021-08-24 17:22:11 +01:00
David Parrott 7fbeefc090 Alerting: create wrapper for Alertmanager to enable org level isolation (#37320)
Introduces org-level isolation for the Alertmanager and its components.

Silences, Alerts and Contact points are not separated by org and are not shared between them.

Co-authored with @davidmparrott and @papagian
2021-08-24 11:28:09 +01:00
Domas cb9912ec0a Alerting: button to test contact point (#37475) 2021-08-18 10:16:35 +03:00
George Robinson 3ca00f90b5 Contact point testing (#37308)
This commit adds contact point testing to ngalerts via a new API
endpoint. This endpoint accepts JSON containing a list of
receiver configurations which are validated and then tested
with a notification for a test alert. The endpoint returns JSON
for each receiver with a status and error message. It accepts
a configurable timeout via the Request-Timeout header (in seconds)
up to a maximum of 30 seconds.
2021-08-17 13:49:05 +01:00
Sofia Papagiannaki 04d5dcb7c8 Alerting: modify DB table, accessors and migration to restrict org access (#37414)
* Alerting: modify table and accessors to limit org access appropriately

* Update migration to create multiple Alertmanager configs

* Apply suggestions from code review

Co-authored-by: gotjosh <josue@grafana.com>

* replace mg.ClearMigrationEntry()

mg.ClearMigrationEntry() would create a new session.
This commit introduces a new migration for clearing an entry from migration log for replacing  mg.ClearMigrationEntry() so that all dashboard alert migration operations will run inside the same transaction.
It adds also `SkipMigrationLog()` in Migrator interface for skipping adding an entry in the migration_log.

Co-authored-by: gotjosh <josue@grafana.com>
2021-08-12 16:04:09 +03:00
SLAMA 5986d99f51 Alerting frontend : fix line notifier (#37744)
- Fixes #37425
- change `line` type string to uppercase
2021-08-10 18:59:53 +01:00
gotjosh f83cd401e5 Alerting: Send alerts to external Alertmanager(s) (#37298)
* Alerting: Send alerts to external Alertmanager(s)

Within this PR we're adding support for registering or unregistering
sending to a set of external alertmanagers. A few of the things that are
going are:

- Introduce a new table to hold "admin" (either org or global)
  configuration we can change at runtime.
- A new periodic check that polls for this configuration and adjusts the
  "senders" accordingly.
- Introduces a new concept of "senders" that are responsible for
  shipping the alerts to the external Alertmanager(s). In a nutshell,
this is the Prometheus notifier (the one in charge of sending the alert)
mapped to a multi-tenant map.

There are a few code movements here and there but those are minor, I
tried to keep things intact as much as possible so that we could have an
easier diff.
2021-08-06 13:06:56 +01:00
Ganesh Vernekar e8ac802e4f Alerting: Remove unused fields in Pagerduty struct (#37198)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-07-27 10:41:48 +05:30
Ganesh Vernekar a65975cca0 Alerting: Remove the fixed wait for notification delivery (#37203)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-07-26 15:15:09 +02:00
Djairho Geuens 4cadbba686 Email: Allow configuration of content types for email notifications (#34530)
* Alerting: Allow configuration of content types for email notifications

* Fix lint error

* Improves email templates

* Improve configuration documentation

Co-authored-by: Diana Payton <52059945+oddlittlebird@users.noreply.github.com>

* Improve code comments

Co-authored-by: Diana Payton <52059945+oddlittlebird@users.noreply.github.com>

* Improve configuration documentation

Co-authored-by: Diana Payton <52059945+oddlittlebird@users.noreply.github.com>

* Improve email template

* Remove unnecessary predeclaration

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>

* Adds handling for unrecognized content type

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>

* Move utility function outside of util package

* Fixes syntax

* Remove unused package

* Fix lint error

* improve email templates

* Fix test

* Alerting: Allow configuration of content types for email notifications

* Fix lint error

* Improves email templates

* Improve configuration documentation

Co-authored-by: Diana Payton <52059945+oddlittlebird@users.noreply.github.com>

* Improve code comments

Co-authored-by: Diana Payton <52059945+oddlittlebird@users.noreply.github.com>

* Improve configuration documentation

Co-authored-by: Diana Payton <52059945+oddlittlebird@users.noreply.github.com>

* Improve email template

* Remove unnecessary predeclaration

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>

* Adds handling for unrecognized content type

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>

* Move utility function outside of util package

* Fixes syntax

* Remove unused package

* Fix lint error

* improve email templates

* Fix test

* Fix comment style

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

* Fix template formatting

* Add test and improve error handling

* Fix test

* Fix formatting

* Fix formatting

* Improve documentation and regenerates txt template

* Update docs/sources/administration/configuration.md

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

Co-authored-by: Djairho Geuens <djairho.geuens@ae.be>
Co-authored-by: Diana Payton <52059945+oddlittlebird@users.noreply.github.com>
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>
2021-07-19 13:31:51 +03:00
Sofia Papagiannaki f308ba91e3 Alerting: Improve receiver initialisation errors (#36814)
* Alerting: Improve receiver initialisation errors
2021-07-19 11:58:35 +03:00
Sofia Papagiannaki afe6e793ff Alerting: deactivate an Alertmanager configuration (#36794)
* Alerting: deactivate an Alertmanager configuration

Implement DELETE /api/alertmanager/grafana/config/api/v1/alerts
by storing the default configuration which stops existing cnfiguration
from being in use.

* Apply suggestions from code review
2021-07-16 20:07:31 +03:00
Ganesh Vernekar 8efe1856e2 Alerting: A better and cleaner way to know if Alertmanager is initialised (#36659)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-07-12 18:53:01 +05:30
Ganesh Vernekar e19c690426 Alerting: Fix potential panic in Alertmanager when starting up (#36562)
* Alerting: Fix potential panic in Alertmanager when starting up

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix reviews

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-07-12 11:15:16 +05:30
Ganesh Vernekar 94d2520a84 Alerting: Allow space in label and annotation names (#36549)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-07-08 18:26:09 +05:30
Ganesh Vernekar 8fe58fc2e3 Alerting: Add additional newlines to Microsoft Teams notification message where necessary (#36126)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-06-24 20:52:00 +05:30
Kyle Brandt 19f764739b Alerting: Change __value__ label to __value_string__ annotation and add ValueString variable in notifications (#36032)
* Alerting: Allow __value__ label in notifications
was being removed by removePrivateItems
discoverd in #36020, but issue is not about that specifically

* __value__ label to __value_string__ annotation
and .ValueString extended property for notifications
2021-06-24 12:45:49 +05:30
Ganesh Vernekar 9a5c1f06df Alerting: Template all possible variables in notification channels (#35749)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-06-22 15:12:54 +05:30