Commit Graph

27 Commits

Author SHA1 Message Date
Grot (@grafanabot) 52d6afbae7 [Alerting]: namespace fixes (#34470) (#34489)
* [Alerting]: forbid viewers for updating rules if viewers can edit

check for CanSave instead of CanEdit

* Clear ngalert tables when deleting the folder

* Apply suggestions from code review

* Log failure to check save permission

Co-authored-by: gotjosh <josue@grafana.com>
(cherry picked from commit 23939eab10)

Co-authored-by: Sofia Papagiannaki <papagian@users.noreply.github.com>
2021-05-24 10:42:56 +03:00
Owen Diehl baca873a84 extracts alertmanager from DI, including migrations (#34071)
* extracts alertmanager from DI, including migrations

* includes alertmanager Run method in ngalert

* removes 3s test shutdown timeout

* lint
2021-05-13 14:01:38 -04:00
Kyle Brandt babb17afd6 Alerting/Chore: Move tests from tests package (#34059)
Instead put in package folder but with package name suffixed with _test
This enables code coverage within the pkg while still allow the tests to operate from external to package perspective (only exported things).
2021-05-13 10:05:33 -04:00
Kyle Brandt a735c51202 Alerting/Chore: Backend remove def_ columns from instance (#33875)
rename def_uid and def_org_id to rule_uid and rule_org_id on the alert_instance table and drops the definition table.
2021-05-12 07:17:43 -04:00
David Parrott b1a8c67689 Alerting return evaluation errors to /rules (#33663)
* Set and return errors produced by evaluation results

* test fixup
2021-05-04 13:08:12 -04:00
David Parrott 39099bf3c0 Alerting nested state cache (#33666)
* nest cache by orgID, ruleUID, stateID

* update accessors to use new cache structure

* test and linter fixup

* fix panic

Co-authored-by: Kyle Brandt <kyle@grafana.com>

* add comment to identify what's going on with nested maps in cache

Co-authored-by: Kyle Brandt <kyle@grafana.com>
2021-05-04 09:57:50 -07:00
Kyle Brandt c1034f3118 Alerting: Create instanceStore (#33587)
for https://github.com/grafana/alerting-squad/issues/129
2021-05-03 07:19:15 -04:00
Kyle Brandt 759a0cd71b Build: Fix with cleanup call maybe? (#33590) 2021-04-30 13:02:37 -07:00
Kyle Brandt 7823842c5d Alerting: Load annotations from rule into State cache (#33542)
for https://github.com/grafana/alerting-squad/issues/127
2021-04-30 20:23:12 +02:00
Kyle Brandt b8f01fe034 Alerting: backend "ng" code cleanup (#33578) 2021-04-30 13:21:57 -04:00
Owen Diehl 5e48b54549 Alerting/metrics (#33547)
* moves alerting metrics to their own pkg

* adds grafana_alerting_alerts (by state) metric

* alerts_received_{total,invalid}

* embed alertmanager alerting struct in ng metrics & remove duplicated notification metrics (already embed alertmanager notifier metrics)

* use silence metrics from alertmanager lib

* fix - manager has metrics

* updates ngalert tests

* comment lint
Signed-off-by: Owen Diehl <ow.diehl@gmail.com>

* cleaner prom registry code

* removes ngalert global metrics

* new registry use in all tests

* ngalert metrics impl service, hack testinfra code to prevent duplicate metric registrations

* nilmetrics unexported
2021-04-30 12:28:06 -04:00
Kyle Brandt 914443c816 Alerting: Fix state cache id duplication (#33480) 2021-04-28 11:42:19 -04:00
David Parrott 788bc2a793 Alerting: refactor state tracker (#33292)
* set processing time

* merge labels and set on response

* use state cache for adding alerts to rules

* minor cleanup

* add support for NoData and Error results

* rename test

* bring in changes from other PRs tha have been merged

* pr feedback

* add integration test

* close state tracker cleanup on context.Done

* fixup test

* rename state tracker

* set EvaluationDuration on Result

* default labels set as constants

* separate cache and state from manager

* use RWMutex in cache
2021-04-23 21:32:25 +02:00
David Parrott ca79206498 Alerting: Handle NoData and Error evaluation results (#33194)
* set processing time

* merge labels and set on response

* use state cache for adding alerts to rules

* minor cleanup

* add support for NoData and Error results

* rename test

* bring in changes from other PRs tha have been merged

* pr feedback

* add integration test

* close state tracker cleanup on context.Done

* fixup test

* not those annotations
2021-04-23 20:47:52 +02:00
gotjosh de0802cf3b Alerting: Fixes the integration test currently failing at master (#33233)
* Alerting: Fixes the integration test currently failing at master

* Skip the state tracker test for now
2021-04-21 14:57:17 -04:00
David Parrott 4be1d84f23 Alerting: Enhancements to /rules (#33085)
* set processing time

* merge labels and set on response

* use state cache for adding alerts to rules

* minor cleanup

* pr feedback

* Do not initialize mutex unnecessarily

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

* linter

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2021-04-21 09:30:03 -07:00
Sofia Papagiannaki 87a70af7eb [Alerting]: Fix updating rule group and add tests (#33074)
* [Alerting]: Fix updating rule group and add test

* Fix updating rule labels
* Set default values for rule no data and error states
if they are missing
* Add test for updating rule

* Test updating annotations

* Apply suggestions from code review

Co-authored-by: gotjosh <josue@grafana.com>

* add test for posting an unknown rule UID

* Fix alert rule validation and add tests

* Remove org id from PostableGrafanaRule

This field was not used; each rule gets the organisation of the user making
the rerquest

* Update pkg/tests/api/alerting/api_alertmanager_test.go

Co-authored-by: gotjosh <josue@grafana.com>
2021-04-21 17:22:58 +03:00
Owen Diehl e37a780e14 Inhouse alerting api (#33129)
* init

* autogens AM route

* POST dashboards/db spec

* POST alert-notifications spec

* fix description

* re inits vendor, updates grafana to master

* go mod updates

* alerting routes

* renames to receivers

* prometheus endpoints

* align config endpoint with cortex, include templates

* Change grafana receiver type

* Update receivers.go

* rename struct to stop swagger thrashing

* add rules API

* index html

* standalone swagger ui html page

* Update README.md

* Expose GrafanaManagedAlert properties

* Some fixes

- /api/v1/rules/{Namespace} should return a map
- update ExtendedUpsertAlertDefinitionCommand properties

* am alerts routes

* rename prom swagger section for clarity, remove example endpoints

* Add missing json and yaml tags

* folder perms

* make folders POST again

* fix grafana receiver type

* rename fodler->namespace for perms

* make ruler json again

* PR fixes

* silences

* fix Ok -> Ack

* Add id to POST /api/v1/silences (#9)

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Add POST /api/v1/alerts (#10)

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* fix silences

* Add testing endpoints

* removes grpc replace directives

* [wip] starts validation

* pkg cleanup

* go mod tidy

* ignores vendor dir

* Change response type for Cortex/Loki alerts

* receiver unmarshaling tests

* ability to split routes between AM & Grafana

* api marshaling & validation

* begins work on routing lib

* [hack] ignores embedded field in generation

* path specific datasource for alerting

* align endpoint names with cloud

* single route per Alerting config

* removes unused routing pkg

* regens spec

* adds datasource param to ruler/prom route paths

* Modifications for supporting migration

* Apply suggestions from code review

* hack for cleaning circular refs in swagger definition

* generates files

* minor fixes for prom endpoints

* decorate prom apis with required: true where applicable

* Revert "generates files"

This reverts commit ef7e975584.

* removes server autogen

* Update imported structs from ngalert

* Fix listing rules response

* Update github.com/prometheus/common dependency

* Update get silence response

* Update get silences response

* adds ruler validation & backend switching

* Fix GET /alertmanager/{DatasourceId}/config/api/v1/alerts response

* Distinct gettable and postable grafana receivers

* Remove permissions routes

* Latest JSON specs

* Fix testing routes

* inline yaml annotation on apirulenode

* yaml test & yamlv3 + comments

* Fix yaml annotations for embedded type

* Rename DatasourceId path parameter

* Implement Backend.String()

* backend zero value is a real backend

* exports DiscoveryBase

* Fix GO initialisms

* Silences: Use PostableSilence as the base struct for creating silences

* Use type alias instead of struct embedding

* More fixes to alertmanager silencing routes

* post and spec JSONs

* Split rule config to postable/gettable

* Fix empty POST /silences payload

Recreating the generated JSON specs fixes the issue
without further modifications

* better yaml unmarshaling for nested yaml docs in cortex-am configs

* regens spec

* re-adds config.receivers

* omitempty to align with prometheus API behavior

* Prefix routes with /api

* Update Alertmanager models

* Make adjustments to follow the Alertmanager API

* ruler: add for and annotations to grafana alert (#45)

* Modify testing API routes

* Fix grafana rule for field type

* Move PostableUserConfig validation to this library

* Fix PostableUserConfig YAML encoding/decoding

* Use common fields for grafana and lotex rules

* Add namespace id in GettableGrafanaRule

* Apply suggestions from code review

* fixup

* more changes

* Apply suggestions from code review

* aligns structure pre merge

* fix new imports & tests

* updates tooling readme

* goimports

* lint

* more linting!!

* revive lint

Co-authored-by: Sofia Papagiannaki <papagian@gmail.com>
Co-authored-by: Domas <domasx2@gmail.com>
Co-authored-by: Sofia Papagiannaki <papagian@users.noreply.github.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Co-authored-by: gotjosh <josue@grafana.com>
Co-authored-by: David Parrott <stomp.box.yo@gmail.com>
Co-authored-by: Kyle Brandt <kyle@grafana.com>
2021-04-19 14:26:04 -04:00
Kyle Brandt 2c862678ab AlertingNG/SSE: Datasource UID and UID/ID only (drop name) (#33039)
SSE still will support ID until dashboard/frontend always requests UID
update *.http examples

Co-authored-by: gotjosh <josue@grafana.com>
2021-04-16 15:29:19 +02:00
David Parrott 555da77527 Dparrott/labels on alert rule (#33057)
* move state tracker tests to /tests

* set default labels on alerts

* handle empty labels in result.Instance

* create annotation on transition to alerting state
2021-04-16 15:11:40 +02:00
David Parrott 567a6a09bd Alerting: Return RuleResponse for api/prometheus/grafana/api/v1/rules (#32919)
* Return RuleResponse for api/prometheus/grafana/api/v1/rules

* change TODO to note

Co-authored-by: gotjosh <josue@grafana.com>

* pr feedback

* test fixup

Co-authored-by: gotjosh <josue@grafana.com>
2021-04-13 17:38:09 -04:00
Sofia Papagiannaki daabf64aa1 [Alerting]: Update scheduler to evaluate rules created by the unified API (#32589)
* Update scheduler

* Fix tests

* Fixes after code review feedback

* lint - add uncommitted modifications

Co-authored-by: kyle <kyle@grafana.com>
2021-04-03 20:13:29 +03:00
Kyle Brandt 948da25c13 ngalerting: represent nil/empty labels the same (#32652)
was getting duplicates of [] and null before
2021-04-02 13:49:45 -04:00
David Parrott 2a8446e435 Alerting: Persist alerts on evaluation and shutdown. Warm cache from DB on startup (#32576)
* Initial commit for state tracking

* basic state transition logic and tests

* constructor. test and interface fixup

* use new sig for sch.definitionRoutine()

* test fixup

* make the linter happy

* more minor linting cleanup

* Alerting: Send alerts from state tracker to notifier

* Add evaluation time and test

Add evaluation time and test

* Add cleanup routine and logging

* Pull in compact.go and reconcile differences

* Save alert transitions and save all state on shutdown

* pr feedback

* WIP

* WIP

* Persist alerts on evaluation and shutdown. Warm cache on startup

* Filter non-firing alerts before sending to notifier

Co-authored-by: Josue Abreu <josue@grafana.com>
2021-04-02 08:11:33 -07:00
David Parrott b1cb74c0c9 Alerting: Send alerts from state tracker to notifier, logging, and cleanup task (#32333)
* Initial commit for state tracking

* basic state transition logic and tests

* constructor. test and interface fixup

* use new sig for sch.definitionRoutine()

* test fixup

* make the linter happy

* more minor linting cleanup

* Alerting: Send alerts from state tracker to notifier

* Add evaluation time and test

Add evaluation time and test

* Add cleanup routine and logging

* Pull in compact.go and reconcile differences

* pr feedback

* pr feedback

Pull in compact.go and reconcile differences

Co-authored-by: Josue Abreu <josue@grafana.com>
2021-03-30 09:37:56 -07:00
David Parrott d33a77a67f Alerting: add state tracker to alerting evaluation (#32298)
* Initial commit for state tracking

* basic state transition logic and tests

* constructor. test and interface fixup

* use new sig for sch.definitionRoutine()

* test fixup

* make the linter happy

* more minor linting cleanup
2021-03-24 15:34:18 -07:00
Sofia Papagiannaki 4ce0a49eac AlertingNG: Split into several packages (#31719)
* AlertingNG: Split into several packages

* Move AlertQuery to models
2021-03-08 22:19:21 +02:00