* Wire up to full alert rule struct
* Extract group change detection logic to dedicated file
* GroupDiff -> GroupDelta for consistency
* Calculate deltas and handle backwards compatible requests
* Separate changes and insert/update/delete as needed
* Regenerate files
* Don't touch the DB if there are no changes
* Quota checking, delete unused file
* Mark modified records as provisioned
* Validation + a couple API layer tests
* Address linter errors
* Fix issue with UID assignment and rule creation
* Propagate top level group fields to all rules
* Tests for repeated updates and versioning
* Tests for quota and provenance checks
* Fix linter errors
* Regenerate
* Factor out some shared logic
* Drop unnecessary multiple nilchecks
* Use alternative strategy for rolling UIDs on inserted rules
* Fix tests, add back nilcheck, refresh UIDs during test
* Address feedback
* Add missing nil-check
* generalize error handling in forking request handlers
* remove MatchesBackend and change test to test Can
* add 404 to route specs
* change backendTypeByUID to getDatasourceByUID of expected type
* use common errors in api testing
* handle 401 in errorToResponse
* replace backend type error with "unexpected datasource type"
* update swagger spec
* Define query param and regenerate
* Add query struct for contact points
* Filter contact points by name in query
* Document that name filter is optional
* Define route and run codegen
* Wire up HTTP layer
* Update API layer and test fakes
* Implement reset of policy tree
* Implement service layer test and authorization bindings
* API layer testing
* Be more specific when injecting settings
* Add validator for mute timing and make it provisionable
* Add tests to ensure prometheus validators are running and errors are propagated
* Internal API for manipulating mute timings
* Define and generate API layer
* Wire up generated code
* Implement API handlers
* Tests for golang layer
* Fix reference bug
* Fix linter and auth tests
* Resolve semantic errors and regenerate
* Remove pointless comment
* Extract out provisioning path param keys, simplify
* Expected number of paths
* Support for documenting stable vs unstable alerting routes
* empty commit, restart drone
* Touch-up references in root makefile and drop trailing escape newline
* Rebase and regenerate
* Extend README with docs for this change
* Generate API for writing templates
* Persist templates app logic layer
* Validate templates
* Extract logic, make set and delete methods
* Drop post route for templates
* Fix response details, wire up remainder of API
* Authorize routes
* Mirror some existing tests on new APIs
* Generate mock for prov store
* Wire up prov store mock, add tests using it
* Cover cases for both storage paths
* Add happy path tests and fix bugs if file contains no template section
* Normalize template content with define statement
* Tests for deletion
* Fix linter error
* Move provenance field to DTO
* empty commit
* ID to name
* Fix in auth too
* Template service
* Add GET routes and implement them
* Generate mock for persist layer
* Unit tests for reading templates
* Set up composition root and get integration tests working
* Fix prealloc issue
* Extract setup boilerplate
* Update AuthorizationTest
* Rebase and resolve
* Fix linter error
* Base-line API for provisioning notification policies
* Wire API up, some simple tests
* Return provenance status through API
* Fix missing call
* Transactions
* Clarity in package dependencies
* Unify receivers in definitions
* Fix issue introduced by receiver change
* Drop unused internal test implementation
* FGAC hooks for provisioning routes
* Polish, swap names
* Asserting on number of exposed routes
* Don't bubble up updated object
* Integrate with new concurrency token feature in store
* Back out duplicated changes
* Remove redundant tests
* Regenerate and create unit tests for API layer
* Integration tests for auth
* Address linter errors
* Put route behind toggle
* Use alternative store API and fix feature toggle in tests
* Fixes, polish
* Fix whitespace
* Re-kick drone
* Rename services to provisioning
* Alerting: Remove internal labels from prometheus compatible API responses
* Appease the linter
* Fix integration tests
* Fix API documentation & linter
* move removal of internal labels to the models
* Add missing OK option to models
* add ok to legacy legacy UI does not support it but it is possible to do so via provisioning.
* use enums in migration so linter would catch missing cases
* API: Using go-swagger for extracting OpenAPI specification from source code
* Merge Grafana Alerting spec
* Include enterprise endpoints (if enabled)
* Serve SwaggerUI under feature flag
* Fix building dev docker images
* Configure swaggerUI
* Add missing json tags
Co-authored-by: Ying WANG <ying.wang@grafana.com>
Co-authored-by: Marcus Efraimsson <marcus.efraimsson@gmail.com>
* (WIP) send alerts to external, internal, or both alertmanagers
* Modify admin configuration endpoint, update swagger docs
* Integration test for admin config updated
* Code review changes
* Fix alertmanagers choice not changing bug, add unit test
* Add AlertmanagersChoice as enum in swagger, code review changes
* Fix API and tests errors
* Change enum from int to string, use 'SendAlertsTo' instead of 'AlertmanagerChoice' where necessary
* Fix tests to reflect last changes
* Keep senders running when alerts are handled just internally
* Check if any external AM has been discovered before sending alerts, update tests
* remove duplicate data from logs
* update comment
* represent alertmanagers choice as an int instead of a string
* default alertmanagers choice to all alertmanagers, test cases
* update definitions and generate spec
Remove validation for labels to be accepted in the Alertmanager, This helps with datasources that produce non-compatible labels.
Adds an "object_matchers" to alert manager routers so we can support labels names with extended characters beyond prometheus/openmetrics. It only does this for the internal Grafana managed Alert Manager.
This requires a change to alert manager, so for now we use grafana/alertmanager which is a slight fork, with the intention of going back to upstream.
The frontend handles the migration of "matchers" -> "object_matchers" when the route is edited and saved. Once this is done, downgrades will not work old versions will not recognize the "object_matchers".
Co-authored-by: Kyle Brandt <kyle@grafana.com>
Co-authored-by: Nathan Rodman <nathanrodman@gmail.com>
Introduces org-level isolation for the Alertmanager and its components.
Silences, Alerts and Contact points are not separated by org and are not shared between them.
Co-authored with @davidmparrott and @papagian
* Alerting: Expose discovered and dropped Alertmanagers
Exposes the API for discovered and dropped Alertmanagers.
* make admin config poll interval configurable
* update after rebase
* wordsmith
* More wordsmithing
* change name of the config
* settings package too
* Alerting: Send alerts to external Alertmanager(s)
Within this PR we're adding support for registering or unregistering
sending to a set of external alertmanagers. A few of the things that are
going are:
- Introduce a new table to hold "admin" (either org or global)
configuration we can change at runtime.
- A new periodic check that polls for this configuration and adjusts the
"senders" accordingly.
- Introduces a new concept of "senders" that are responsible for
shipping the alerts to the external Alertmanager(s). In a nutshell,
this is the Prometheus notifier (the one in charge of sending the alert)
mapped to a multi-tenant map.
There are a few code movements here and there but those are minor, I
tried to keep things intact as much as possible so that we could have an
easier diff.
* Alerting: Implement /status for the notification system
Implements the necessary plumbing to have a /status endpoint on the
notification system.
* Add API examples
* Update API specs
* Update prometheus/common dependency
Co-authored-by: Sofia Papagiannaki <sofia@grafana.com>