Commit Graph

35 Commits

Author SHA1 Message Date
Yuri Tseretyan 0351a37e99 Alerting: Remote Alertmanager to calculate hash of the request payload instead of just the configuration v2 (#109139)
* Revert "Revert "Alerting: Remote Alertmanager to calculate hash of the reques…"

This reverts commit cbf256120e.

* log the decision

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>

---------

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
2025-09-03 14:01:25 +00:00
Santiago 1914fb95ee Remote Alertmanager: Wire up remoteSecondaryAlertmanagerWithRemoteState and fix path (#109253) 2025-08-06 17:33:31 +02:00
Yuri Tseretyan cbf256120e Revert "Alerting: Remote Alertmanager to calculate hash of the request payload instead of just the configuration" (#109086)
Revert "Alerting: Remote Alertmanager to calculate hash of the request payloa…"

This reverts commit 32434810e1.
2025-08-01 21:34:31 +00:00
Yuri Tseretyan 32434810e1 Alerting: Remote Alertmanager to calculate hash of the request payload instead of just the configuration (#108632)
* update CreateGrafanaAlertmanagerConfig to accept UserGrafanaConfig move construction logic to alertmanager
* consolidate building UserGrafanaConfig into buildConfig
* use config to determine whether it needs to be send calculate hash of the entire request struct rather than configuration
2025-07-31 10:25:43 -04:00
Santiago 3c4c3b3f5c Remote Alertmanager: Remove unneeded SmtpFrom and StaticHeaders fields (#108781)
* Remote Alertmanager: Remove unneeded SmtpFrom and StaticHeaders fields

* remove smtpFrom (am struct)

* move smtp field (am struct)

* fix test
2025-07-29 12:14:42 +02:00
Yuri Tseretyan 5097dd5c7d Alerting: Send templates from extra configuration to remote Alertmanager (#107981)
* extract logging of MergedResult into method

* convert GetMergedTemplateDefinitions to return PostableApiTemplate

* update mergeExtracConfigs to return GrafanaAlertmanagerConfig

* pass by value, not pointer

* add template definition to payload

* update tests

* rename to Templates

* log merge results

* fix reference in workspace
2025-07-18 15:16:17 -04:00
Santiago 71a425f912 Remote Alertmanager: Fetch full state from Mimir (#107905)
* Remote Alertmanager: Add method to fetch the full state

* decode and parse the remote Alertmanager state string

* test
2025-07-15 11:38:38 +02:00
Santiago a314b99589 Remote Alertmanager: Use the same struct for Grafana state and Mimir full state (#107791)
Remote Alertmanager: Use the same struct for Grafana stat and Mimir full state
2025-07-11 10:10:30 +02:00
Yuri Tseretyan 4bb6926eee Alerting: Separate configuration model for remote Alertmanager Mimir client (#107741)
* replace PostableUserConfig with GrafanaAlertmanagerConfig to decouple from internal Grafana models
* update alertmanager + tests
* calculate hash of the GrafanaAlertmanagerConfig
2025-07-09 12:42:10 -04:00
Alexander Akhmetov b483a04aec Alerting: Send merged configuration to the remote alertmanager (#107004) 2025-07-02 21:35:24 +02:00
Santiago 3fe73b8de9 Remote Alertmanager: Send SMTP config (#106337)
* (WIP) Remote Alertmanager: Send SMTP config

* send SMTP configs separately

* bring back deleted fields

* actually send stuff over

* remove redundant type, fix comments

* smtp -> smtpConfig

* also send SmtpFrom an StaticHeaders separately

* tests

* restore defaults.ini
2025-06-13 12:44:39 -03:00
Santiago 6c3d89f390 Remote Alertmanager: Add timeouts to the HTTP client (#105279)
* Remote Alertmanage: Add timeouts to the HTTP client

* code review suggestions
2025-05-13 13:25:56 +02:00
Santiago b434925adc Remote Alertmanager: Add tracing to the HTTP client used for POSTing alerts and the readiness check (#105235) 2025-05-12 14:57:42 +03:00
Santiago 51d7aa2bef Remote Alertmanager: Configure SMTP From address (#104925)
* Remote Alertmanager: Configure SMTP From address

* include smtp from address in config comparison

* updte tests

* trigger build

* make linter happy

* trigger build

* fix test
2025-05-12 10:37:27 +02:00
beejeebus 8f79e4882f Replace usage of http.DefaultClient and http.DefaultTransport (#104135)
Remove usage of http.DefaultClient and http.DefaultTransport

Part of grafana/data-sources#484
2025-05-09 13:26:39 -04:00
Yuri Tseretyan c3f00eb403 Alerting: log body of unexpected response from Mimir (#102382)
log body of unexpected response

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
2025-03-19 10:14:05 -04:00
Santiago 7d4895c3c9 Alerting: Use exponential backoff in the remote Alertmanager readiness check (#99756)
* Alerting: Use exponential backoff in the remote Alertmanager readiness check

* fix capitalized error

* remove unnecessary 'for'

* refactor, use time.After() instead of channel
2025-01-29 18:53:30 +01:00
Santiago 361312bbd7 Alerting: Expect 406s from the remote Alertmanager during the readiness check (#99507)
* Alerting: Expect 406s from the remote Alertmanager during the readiness check

* make it clear in the warning logs that we'll attempt to send the confgiuration/state without comparing in case of error pulling the current state/config
2025-01-24 16:26:57 +01:00
Santiago 4f8f82f5f1 Alerting: Fix remote Alertmanager readiness check path (#95063) 2024-10-21 17:24:51 +02:00
Santiago 4c15266a77 Alerting: Add X-Remote-Alertmanager header to the remote AM client (#94913) 2024-10-17 22:38:13 +02:00
Fayzal Ghantiwala e321dbb690 Alerting: Use remote Alertmanager to test templates and receivers when enabled (#91570)
* Initial impl

* Add code to test templates and receivers

* Fix linter

* Fix forked am tests

* Update mimir client

* Remove trailing whitespace

* re-trigger CI
2024-08-15 16:56:14 +01:00
Santiago fce03cd724 Alerting: Send static headers to the remote Alertmanager (#89846) 2024-07-01 17:48:40 +02:00
Santiago fe1309dd96 Alerting: Send external URL to the remote Alertmanager (#89701)
* Alerting: Send external URL to the remote Alertmanager

* test that the URL is sent to the remote Alertmanager

* AppURL -> ExternalURL
2024-06-26 14:02:02 +02:00
Steve Simpson 8eabef1f91 Alerting: Update remote alertmanager to use extended receivers API. (#89253)
* Alerting: Update remote alertmanager to use extended receivers API.

* Update integration test and Mimir image

* Update Mimir image in more places
2024-06-19 12:40:22 +03:00
Alexander Weaver 8491e02caf Alerting: Instrument outbound requests for Loki Historian and Remote Alertmanager with tracing (#89185)
* Add TracedClient

* Handle errors and status codes

* Wire up tracing to normal ASH and loki annotation mapping

* Add tracing to remote alertmanager

* one more spot

* and not or

* More consistency with other grafana traces, lower cardinality name
2024-06-14 13:24:12 -05:00
Santiago e41434c332 Alerting: Promote configuration in the remote Alertmanager (#87388) 2024-05-16 12:06:03 +02:00
Santiago a2ce8fefed Alerting: Use a struct when sending a Grafana AM configuration to the remote Alertmanager (#86451)
* Alerting: Use a struct when sending a Grafana AM configuration to the remote Alertmanager

* remove '-distroless' from mimir image name
2024-04-19 13:04:18 +02:00
Santiago 4ad6d66479 Alerting: Remove ID from UserGrafanaConfig struct (#84602)
* Alerting: Remove ID from UserGrafanaConfig struct

* user custom mimir image withoud id in grafana config

* change mimir image name
2024-03-19 12:47:13 +01:00
Santiago fbbda6c05e Alerting: Retry readiness check to the remote Alertmanager on 5xx status code responses (#81174) 2024-01-24 21:39:06 +01:00
Santiago 6c87d9a1e7 Alerting: Stop retries on 4xx status code responses (remote Alertmanager readiness check) (#80350) 2024-01-11 12:12:35 +01:00
Santiago 9e78faa7ba Alerting: Add metrics to the remote Alertmanager struct (#79835)
* Alerting: Add metrics to the remote Alertmanager struct

* rephrase http_requests_failed description

* make linter happy

* remove unnecessary metrics

* extract timed client to separate package

* use histogram collector from dskit

* remove weaveworks dependency

* capture metrics for all requests to the remote Alertmanager (both clients)

* use the timed client in the MimirAuthRoundTripper

* HTTPRequestsDuration -> HTTPRequestDuration, clean up mimir client factory function

* refactor

* less git diff

* gauge for last readiness check in seconds

* initialize LastReadinesCheck to 0, tweak metric names and descriptions

* add counters for sync attempts/errors

* last config sync and last state sync timestamps (gauges)

* change latency metric name

* metric for remote Alertmanager mode

* code review comments

* move label constants to metrics package
2024-01-10 11:18:24 +01:00
gotjosh cc3c0a2cc2 Alerting: Refactor readiness check (#78799)
* Alerting: Refactor readiness check

Moves the readiness check to the mimir client and removes the need to assert that we have senders - it already has a queue and can hold notifications until we're ready to send them.
---------

Signed-off-by: gotjosh <josue.abreu@gmail.com>
2023-12-12 15:34:54 +00:00
Santiago 73776f37eb Alerting: Send state to the remote Alertmanager (#78538)
* Alerting: Introduce a Mimir client as part of the Remote Alertmanager

Mimir client that understands the new APIs developed for mimir. Very much a WIP still.

* more wip

* appease the linter

* more linting

* add more code

* get state from kvstore, encode, send

* send state to the remote Alertmanager, extract fullstate logic into its own function

* pass kvstore to remote.NewAlertmanager()

* refactor

* add fake kvstore to tests

* tests

* use FileStore to get state

* always log 'completed state upload'

* refactor compareRemoteConfig

* base64-encode the state in the file store

* export silences and nflog filenames, refactor

* log 'completed state/config upload...' regardless of outcome

* add values to the state store in tests

* address code review comments

* log error from filestore

---------

Co-authored-by: gotjosh <josue.abreu@gmail.com>
2023-11-29 12:49:39 +01:00
gotjosh 8120306fea Remote Alertmanager(refactor): Only parse the URL once (#78631)
* Remote Alertmanager(refactor): Only parse the URL once

Exactly what it says in the tin.

Signed-off-by: gotjosh <josue.abreu@gmail.com>

* use the existing tests

Signed-off-by: gotjosh <josue.abreu@gmail.com>

---------

Signed-off-by: gotjosh <josue.abreu@gmail.com>
2023-11-24 11:05:13 +00:00
gotjosh 23fe8f4e9c Alerting: Introduce a Mimir client as part of the Remote Alertmanager (#78357)
* Alerting: Introduce a Mimir client as part of the Remote Alertmanager

This is our first attempt at making Grafana communicate use Mimir as a backend - it uses a new set of APIs that we've developed on the Mimir side to upload the grafana configuration and alertmanager state so that it can then be ported over.

Codewise, we've introduced a couple of things:

A client to isolate in its own package all the communication that happens with Mimir
A few changes to the remote/alertmanager to include uploading the configuration and state when it starts
A few refactors that align a bit better with the design approach that we're thinking
An integration tests again these newly developed APIs using a custom image

---------

Signed-off-by: gotjosh <josue.abreu@gmail.com>
Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>
2023-11-23 16:59:36 +00:00