* update RenameReceiverInNotificationSettings in DbStore to check for provisioning
* implement renaming in receiver service and provisioning
* do not patch route when stitching
* fix bug in stitching because it returned new name but the old one was expected
* update receiver service to always return result converted from storage model this makes sure that UID and version are consistent with GET\LIST operations
* use provided metadata.name for UID of domain model because rename changes UID and request fails
* remove rename guard
* update UI to not disable receiver name when k8s api enabled
* create should calculate uid from name because new receiver does not have UID yet.
* Replace global authz abstraction with one compatible with uid scope
* Replace GettableApiReceiver with models.Receiver in receiver_svc
* GrafanaIntegrationConfig -> models.Integration
* Implement Create/Update methods
* Add optimistic concurrency to receiver API
* Add scope to ReceiversRead & ReceiversReadSecrets
migrates existing permissions to include implicit global scope
* Add receiver create, update, delete actions
* Check if receiver is used by rules before delete
* On receiver name change update in routes and notification settings
* Improve errors
* Linting
* Include read permissions are requirements for create/update/delete
* Alias ngalert/models to ngmodels to differentiate from v0alpha1 model
* Ensure integration UIDs are valid, unique, and generated if empty
* Validate integration settings on create/update
* Leverage UidToName to GetReceiver instead of GetReceivers
* Remove some unnecessary uses of simplejson
* alerting.notifications.receiver -> alerting.notifications.receivers
* validator -> provenanceValidator
* Only validate the modified receiver
stops existing invalid receivers from preventing modification of a valid
receiver.
* Improve error in Integration.Encrypt
* Remove scope from alert.notifications.receivers:create
* Add todos for receiver renaming
* Use receiverAC precondition checks in k8s api
* Linting
* Optional optimistic concurrency for delete
* make update-workspace
* More specific auth checks in k8s authorize.go
* Add debug log when delete optimistic concurrency is skipped
* Improve error message on authorizer.DecisionDeny
* Keep error for non-forbidden errutil errors
* Alerting: Persist silence state immediately on Create/Delete
Persists the silence state to the kvstore immediately instead of waiting for the
next maintenance run. This is used after Create/Delete to prevent silences from
being lost when a new Alertmanager is started before the state has persisted.
This can happen, for example, in a rolling deployment scenario.
* Fix test that requires real data
* Don't error if silence state persist fails, maintenance will correct
* Add notification settings to storage\domain and API models. Settings are a slice to workaround XORM mapping
* Support validation of notification settings when rules are updated
* Implement route generator for Alertmanager configuration. That fetches all notification settings.
* Update multi-tenant Alertmanager to run the generator before applying the configuration.
* Add notification settings labels to state calculation
* update the Multi-tenant Alertmanager to provide validation for notification settings
* update GET API so only admins can see auto-gen
This PR replaces the vendored models in the migration with their equivalent ngalert models. It also replaces the raw SQL selects and inserts with service calls.
It also fills in some gaps in the testing suite around:
- Migration of alert rules: verifying that the actual data model (queries, conditions) are correct 9a7cfa9
- Secure settings migration: verifying that secure fields remain encrypted for all available notifiers and certain fields migrate from plain text to encrypted secure settings correctly e7d3993
Replacing the checks for custom dashboard ACLs will be replaced in a separate targeted PR as it will be complex enough alone.
* (WIP) Refactor the ImageStore interface to work with our latest alerting repository
* update alerting package
* refactor, new URLExists method in ImageProvider
* tests for the new methods
* fix linter warnings
* use alertingImages as an alias for grafana/alerting/images
* logs about image uris and not found images
* nerf image not found logs
* extract duplicated code to getImageFromURI() method
* refactor getImageFromURI()
* add index on url
* add comment about migration log
* sync generated files
* use tokens or urls in image annotations
* improve tests, fix some comments
* fix empty tokens
* code review changes, check for url before checking for token (support old token formats)
* Alerting: Add endpoint to revert to a previous alertmanager configuration
This endpoint is meant to be used in conjunction with /api/alertmanager/grafana/config/history to
revert to a previously applied alertmanager configuration. This is done by ID instead of raw config
string in order to avoid secure field complications.
* WIP
* skip invalid historic configurations instead of erroring
* add warning log when bad historic config is found
* remove unused custom marshaller for GettableHistoricUserConfig
* add id to historic user config, move limit check to store, fix typo
* swagger spec
* Mark AM configuration as applied
* add missing checks, make linter happy
* fix deadlock, mark as valid on save and on load
* mark configurations only if needed
* check error after applyConfig()
* code review comments
* code review changes
* more code review changes
* clean HistoricConfigFromAlertConfig function
* Extend kvstore to retrieve all items
* Fix comment
* Fix tests
* Change test order
* Move test outside to avoid order conditions
* Update Items to GetAll function and return a map
* Add explanation of map result
* Add description comment
Co-authored-by: Tania B <yalyna.ts@gmail.com>
Adds three functions:
`withStoredImages` iterates over a list of models.Alerts, extracting a stored image's data from storage, if available, and executing a user-provided function.
`withStoredImage` does this for an image attached to a specific alert.
`openImage` finds and opens an image file on disk.
Moves `store.Image` to `models.Image`
Simplifies `channels.ImageStore` interface and updates notifiers that use it to use the simpler methods.
Updates all pkg/alert/notifier/channels to use withStoredImage routines.
If an image token is present in an alert instance, the email notifier will attempt to find a public URL for the image token. If found, it will add that to the email as the `ImageLink` field. If only local file data is available, the notifier will attach the file to the outgoing email using the `EmbeddedImage` field.
This change extracts screenshot data from alert messages via a private annotation `__alertScreenshotToken__` and attaches a URL to a Slack message or uploads the data to an image upload endpoint if needed.
This change also implements a few foundational functions for use in other notifiers.
* Base-line API for provisioning notification policies
* Wire API up, some simple tests
* Return provenance status through API
* Fix missing call
* Transactions
* Clarity in package dependencies
* Unify receivers in definitions
* Fix issue introduced by receiver change
* Drop unused internal test implementation
* FGAC hooks for provisioning routes
* Polish, swap names
* Asserting on number of exposed routes
* Don't bubble up updated object
* Integrate with new concurrency token feature in store
* Back out duplicated changes
* Remove redundant tests
* Regenerate and create unit tests for API layer
* Integration tests for auth
* Address linter errors
* Put route behind toggle
* Use alternative store API and fix feature toggle in tests
* Fixes, polish
* Fix whitespace
* Re-kick drone
* Rename services to provisioning
* Alerting: add collision safe update function for alertmanager configurations
* fix typo
* use bootstrap func for tests
* move hash calculation to store
* remove icons lol
* remove removed field
* Create API test for overwriting invalid alertmanager config
* Avoid requiring alertmanager readiness for config changes
* AlertmanagerSrv depends on functionality rather than concrete types
* Add test for non-ready alertmanagers
* Additional cleanup and polish
* Back out previous integration test changes
* Refactor of tests incorrectly caused a test to become redundant
* Use pre-existing fake secret service
* Drop unused interface
* Test against concrete MultiOrgAlertmanager re-using fake infra from other tests
* Fix linter error
* Empty commit to rerun checks
* Add method GetAllLatestAlertmanagerConfiguration to DBStore
* add method ApplyConfig to AlertManager
* update multiorg alert manager to load all alertmanager configs at once
* Alerting: Persist notification log and silences to the database
This removes the dependency of having persistent disk to run grafana alerting. Instead of regularly flushing the notification log and silences to disk we now flush the binary content of those files to the database encoded as a base64 string.
Introduces org-level isolation for the Alertmanager and its components.
Silences, Alerts and Contact points are not separated by org and are not shared between them.
Co-authored with @davidmparrott and @papagian