Commit Graph

376 Commits

Author SHA1 Message Date
Yuriy Tseretyan 288e8eeb15 Alerting: Do not update rule in database if it was not changed (#45980)
* do not include update if no diff
* refactor calculate changes to include diff (and log)

Co-authored-by: George Robinson <george.robinson@grafana.com>
2022-03-04 16:16:33 -05:00
Gilles De Mey a9b1a964b0 Alerting: adds support for federated rules (#46037) 2022-03-04 10:16:13 +01:00
Yuriy Tseretyan 016d9e14ed Add missing option "OK" for Error state (#45262)
* Add missing OK option to models
* add ok to legacy legacy UI does not support it but it is possible to do so via provisioning.
* use enums in migration so linter would catch missing cases
2022-03-02 19:07:55 -05:00
Konrad Lalik aeec087065 Alerting: Fix silence url in notifications (#46031)
* Update silence url generation

* Update tests

* Update test to the new silence params format

* Fix tests
2022-03-02 13:06:35 +01:00
George Robinson 789cfc31e3 Alerting: Fix use of > instead of >= when checking the For duration (#46011) 2022-03-01 17:06:42 +00:00
Alexander Weaver 703d7deeda Alerting: Add integration tests for dual-stage email templating (#43484)
* Resolve merge conflicts

* Remove cruft from local exploration

* Move integration tests to intercept using new abstraction layer instead of channel

* Fix linter error after rebase
2022-03-01 10:49:49 -06:00
Yuriy Tseretyan 4502e40ed8 Alerting: Revert Revert "Alerting: Calculate diff for two AlertRules" (#46034)
* Revert "Revert "Alerting: Calculate diff for two AlertRules (#45877)" (#46023)"

This reverts commit 82aa5acba6.

* remove flakiness
2022-03-01 11:10:29 -05:00
Jean-Philippe Quéméner 82aa5acba6 Revert "Alerting: Calculate diff for two AlertRules (#45877)" (#46023)
This reverts commit 4e19d7df63.
2022-03-01 13:40:47 +01:00
Yuriy Tseretyan 4e19d7df63 Alerting: Calculate diff for two AlertRules (#45877)
* add custom diff reporter DiffReporter that reports only paths that have a difference
* create Diff method for AlertRule that returns DiffReport, which is an alias for []Diff

Tests:
* create copy method for AlertRule in testing
* create GenerateAlertQuery method in testing
2022-02-28 17:13:53 +01:00
Selene 2c90dcf3c0 Dashboard Alert Extractor: Create service for dashboard extractor and remove bus (#45518)
* Create DashAlertService service

* Remove no used dashboard service from plugin's manager that generates dependency cycle in Enterprise

* Remove bus for dashboard permissions

* Remove bus from dashboard extractor service

* Add missing argument

* Fix wire

* Fix lint

* More goimports

* Use datasource service instead sql calls

* Fix integration test
2022-02-28 09:54:56 +01:00
Armand Grillet 1b2c4dca61 Capitalize Webhook contact point type (#45942) 2022-02-26 10:45:05 +01:00
George Robinson f87bfdf2ff Update comment for scheduler_behind_seconds metric (#45918) 2022-02-25 15:43:08 +01:00
Peter Holmberg 4ef58e595c Alerting: Add validation to slack contact point (#45618)
* add requiredifempty

* rename field, fix logic

* update mockdata

* remove logs

* update test

* fix json expected payload in e2e tests

* fix test

* fix test again

Co-authored-by: Jean-Philippe Quémémer <jeanphilippe.quemener@grafana.com>
2022-02-25 15:10:21 +01:00
George Robinson 6cccbb5a09 Fix incorrect metric values for scheduler_behind_seconds (#45830) 2022-02-25 11:40:30 +00:00
George Robinson 2ca79ca0c7 Rename evalCtx to avoid confusion with context.Context (#45144) 2022-02-25 11:09:20 +01:00
George Robinson feae959c9d Alerting: Create annotation if Firing alert is removed (#45703)
This commit changes staleResultsHandler to create an annotation if the current state is Alerting and the result is being removed from the state cache as it has not been updated since 2x the evaluation interval.
2022-02-24 16:25:28 +00:00
George Robinson 8d57318941 Alerting: Use expanded labels in dashboard annotations (#45726) 2022-02-24 10:58:54 +00:00
Nathan Rodman f9701d78b1 Alerting: add field for custom slack endpoint (#45751)
* add field for custom slack endpoint

* add test for using custom endpoint

* Update pkg/services/ngalert/notifier/channels/slack.go

Co-authored-by: Alexander Weaver <weaver.alex.d@gmail.com>

* specify description for endpoint

* remove brittle string constants

Co-authored-by: Alexander Weaver <weaver.alex.d@gmail.com>
2022-02-23 14:33:23 -08:00
Yuriy Tseretyan f75bea481d Alerting: validate rules and calculate changes in API controller (#45072)
* Update API controller
   - add validation of rules API model
   - add function to calculate changes between the submitted alerts and existing alerts
   - update RoutePostNameRulesConfig to validate input models, calculate changes and apply in a transaction

* Update DBStore
   - delete unused storage method. All the logic is moved upstream.
   - upsert to not modify fields of new by values from the existing alert
   - if rule has UID do not try to pull it from db. (it is done upstream)

* Add rule generator
2022-02-23 11:30:04 -05:00
Yuriy Tseretyan e44ea3d589 Alerting: Rename setting AlertForDuration to DefaultRuleEvaluationInterval (#45569)
* fix AlertForDuration to DefaultRuleEvaluationInterval

Co-authored-by: Alexander Weaver <weaver.alex.d@gmail.com>
2022-02-18 10:05:06 -05:00
Selene d5b98772ed Dashboards: Refactor service to make it injectable by wire (#44588)
* Add providers to folder and dashboard services

* Refactor folder and dashboard services

* Move store implementation to its own file due wire cannot allow us to cast to SQLStore

* Add store in some places and more missing dependencies

* Bad merge fix

* Remove old functions from tests and few fixes

* Fix provisioning

* Remove store from http server and some test fixes

* Test fixes

* Fix dashboard and folder tests

* Fix library tests

* Fix provisioning tests

* Fix plugins manager tests

* Fix alert and org users tests

* Refactor service package and more test fixes

* Fix dashboard_test tets

* Fix api tests

* Some lint fixes

* Fix lint

* More lint :/

* Move dashboard integration tests to dashboards service and fix dependencies

* Lint + tests

* More integration tests fixes

* Lint

* Lint again

* Fix tests again and again anda again

* Update searchstore_test

* Fix goimports

* More go imports

* More imports fixes

* Fix lint

* Move UnprovisionDashboard function into dashboard service and remove bus

* Use search service instead of bus

* Fix test

* Fix go imports

* Use nil in tests
2022-02-16 14:15:44 +01:00
Yuriy Tseretyan 02f8e99ca1 Alerting: move fake stores to store package (#45428)
* make fake storage public
* move fake storages to store package
2022-02-15 17:24:39 -05:00
Yuriy Tseretyan 095ea44e97 Alerting: Move BaseInterval and MinInterval to UnifiedAlerting config (#45107)
* use base interval if legacy value is less than the base interval
2022-02-11 16:13:49 -05:00
George Robinson 4e3a72fc2a Add context.Context to AlertingStore (#45069) 2022-02-09 09:22:09 +00:00
Yuriy Tseretyan ea236c276e add missing option to swagger spec (#45070) 2022-02-08 10:09:37 -05:00
George Robinson 67a3e1d6fd Add context.Context to InstanceStore (#45049) 2022-02-08 13:49:04 +00:00
Sofia Papagiannaki 35fe58de37 API: Extract OpenAPI specification from source code using go-swagger (#40528)
* API: Using go-swagger for extracting OpenAPI specification from source code

* Merge Grafana Alerting spec

* Include enterprise endpoints (if enabled)

* Serve SwaggerUI under feature flag

* Fix building dev docker images

* Configure swaggerUI

* Add missing json tags

Co-authored-by: Ying WANG <ying.wang@grafana.com>
Co-authored-by: Marcus Efraimsson <marcus.efraimsson@gmail.com>
2022-02-08 13:38:43 +01:00
George Robinson a9399ab3cd Alerting: Add context.Context to RuleStore (#45004)
Alerting: Add context.Context to RuleStore
2022-02-08 08:52:03 +00:00
Alexander Weaver 935059a376 Alerting: Create basic storage layer for provisioning (#44679)
* Simplistic store API for provenance lookups on arbitrary types

* Add a few notes in comments

* Improved type safety for provisioned objects

* Clean-up TODOs for future PRs

* Clean up provisioning model

* Clean up tests

* Restrict allowable types in interface

* Fix linter error

* Move AlertRule domain methods to same file as AlertRule definition

* Update pkg/services/ngalert/models/provisioning.go

Co-authored-by: George Robinson <george.robinson@grafana.com>

* Complete interface rename

* Pass context through store API

* More idiomatic method names

* Better error description

* Improve code-docs

* Use ORM language instead of raw sql

* Add support for records in different orgs

* ResourceTypeID -> ResourceType since it's not an ID

Co-authored-by: George Robinson <george.robinson@grafana.com>
2022-02-04 13:23:19 -06:00
Yuriy Tseretyan ddfe2dce74 Alerting: Split grafana and lotex routes (#44742)
* split Lotex and Grafana routes
* update template to use authorize function for every route
2022-02-04 12:42:04 -05:00
idafurjes 7a23700e1a Remove unused GetDashboard method (#44890)
* Remove unused GetDashboard method

* Uncomment test

* Fix dashboard service integration test

* Remove comment
2022-02-04 17:21:06 +01:00
George Robinson 9df43abbb5 Fix evaluation of alert rules for datasources with custom headers (#44862)
* Fix evaluation of alert rules for datasources with custom headers

* Fix unit tests

* Fix integration tests

* Evaluator fields should be package private
2022-02-04 14:56:37 +01:00
Yuriy Tseretyan 984c95de63 Do not store EvaluationString in Evaluation. (#44606)
* do not store evaluation string in Evaluation.
* reduce number of buckets to store for a single state
2022-02-02 19:18:20 +01:00
George Robinson 924deda589 Fix Discord Webhook URL for invalid template (#44763)
This commit fixes an issue where an invalid template for Discord would change the Webhook URL to "" and cause "unsupported protocol scheme" errors.
2022-02-02 14:28:41 +01:00
Santiago 04d93751b8 Alerting: send alerts to external, internal, or both alertmanagers (#40341)
* (WIP) send alerts to external, internal, or both alertmanagers

* Modify admin configuration endpoint, update swagger docs

* Integration test for admin config updated

* Code review changes

* Fix alertmanagers choice not changing bug, add unit test

* Add AlertmanagersChoice as enum in swagger, code review changes

* Fix API and tests errors

* Change enum from int to string, use 'SendAlertsTo' instead of 'AlertmanagerChoice' where necessary

* Fix tests to reflect last changes

* Keep senders running when alerts are handled just internally

* Check if any external AM has been discovered before sending alerts, update tests

* remove duplicate data from logs

* update comment

* represent alertmanagers choice as an int instead of a string

* default alertmanagers choice to all alertmanagers, test cases

* update definitions and generate spec
2022-02-01 20:36:55 -03:00
George Robinson 5e2280ceee Add metrics to ngalert scheduler (#44602)
This pull request adds metrics to the ngalert scheduler so we can see how long it takes to evaluate a tick.
2022-01-31 16:56:43 +00:00
idafurjes 12420260ef Remove bus from org invite api (#44530)
* Remove bus from org invite api

* Fix lint

* Remove comment
2022-01-31 17:24:52 +01:00
Serge Zaitsev 84a5910e56 Chore: Remove bus from ngalert (#44465)
* pass notification service down to the notifiers

* add ns to all notifiers

* remove bus from ngalert notifiers

* use smaller interfaces for notificationservice

* attempt to fix the tests

* remove unused struct field

* simplify notification service mock

* trying to resolve issues in the tests

* make linter happy

* make linter even happier

* linter, you are annoying
2022-01-26 16:42:40 +01:00
Jean-Philippe Quéméner 8ee3f59cd4 Alerting: recognize Cortex datasources correctly in the frontend (#44316)
* Alerting: always use msg field for user facing errors

* fix: revert front-end Cortex detection

Co-authored-by: gillesdemey <gilles.de.mey@gmail.com>
2022-01-21 15:44:11 +01:00
ying-jeanne 7422789ec7 Remove Macaron ParamsInt64 function from code base (#43810)
* draft commit

* change all calls

* Compilation errors
2022-01-15 00:55:57 +08:00
Yuriy Tseretyan ed5c664e4a Alerting: Stop firing of alert when it is updated (#39975)
* Update API to call the scheduler to remove\update an alert rule. When a rule is updated by a user, the scheduler will remove the currently firing alert instances and clean up the state cache. 
* Update evaluation loop in the scheduler to support one more channel that is used to communicate updates to it.
* Improved rule deletion from the internal registry. 
* Move alert rule version from the internal registry (structure alertRuleInfo) closer rule evaluation loop (to evaluation task structure), which will make the registry values immutable.
* Extract notification code to a separate function to reuse in update flow.
2022-01-11 11:39:34 -05:00
Yuriy Tseretyan ea478dec22 Alerting: Remove bridge between log15 and go-kit logger (#43769)
* remove bridge between log15 and go-kit logger.

* fix tests
2022-01-07 09:40:09 +01:00
ying-jeanne a8eef45a44 Logger migration from log15 to gokit/log (#41636)
* migrate log15 to gokit/log

* fix console log

* update some unittest

* fix all unittest

* fix the build

* Update pkg/infra/log/log.go

Co-authored-by: Yuriy Tseretyan <tceretian@gmail.com>

* general type vector

* correct the level key

Co-authored-by: Yuriy Tseretyan <tceretian@gmail.com>
2022-01-06 22:28:05 +08:00
Alexander Weaver fd583a0e3b Alerting: Allow customization of Google chat message (#43568)
* Allow customizable googlechat message via optional setting

* Add optional message field in googlechat contact point configurator

* Fix strange error message on send if template fails to fully evaluate

* Elevate template evaluation failure logs to Warn level

* Extract default.title template embed from all channels to shared constant
2022-01-05 09:47:08 -06:00
idafurjes 8e6d6af744 Rename DispatchCtx to Dispatch (#43563) 2021-12-28 17:36:22 +01:00
idafurjes 7936c4c522 Rename AddHandlerCtx to AddHandler (#43557) 2021-12-28 16:08:07 +01:00
idafurjes 56c3875bb9 Chore: Remove context.TODO (#43458)
* Remove context.TODO() from services

* Fix live test
2021-12-28 10:26:18 +01:00
Alexander Weaver 56b3dc5445 Alerting: Allow configuration of non-ready alertmanagers (#43063)
* Create API test for overwriting invalid alertmanager config

* Avoid requiring alertmanager readiness for config changes

* AlertmanagerSrv depends on functionality rather than concrete types

* Add test for non-ready alertmanagers

* Additional cleanup and polish

* Back out previous integration test changes

* Refactor of tests incorrectly caused a test to become redundant

* Use pre-existing fake secret service

* Drop unused interface

* Test against concrete MultiOrgAlertmanager re-using fake infra from other tests

* Fix linter error

* Empty commit to rerun checks
2021-12-27 17:01:17 -06:00
Alexander Weaver 9abdaf251f Alerting: Fix global state sensitivity in notifier channel tests (#43508) 2021-12-27 11:58:17 -06:00
idafurjes b8852ef6a3 Chore: Remove context.TODO() (#43409)
* Remove context.TODO() from services

* Fix live test

* Remove context.TODO
2021-12-22 11:02:42 +01:00