Commit Graph

20 Commits

Author SHA1 Message Date
Yuri Tseretyan c3f00eb403 Alerting: log body of unexpected response from Mimir (#102382)
log body of unexpected response

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
2025-03-19 10:14:05 -04:00
Santiago 7d4895c3c9 Alerting: Use exponential backoff in the remote Alertmanager readiness check (#99756)
* Alerting: Use exponential backoff in the remote Alertmanager readiness check

* fix capitalized error

* remove unnecessary 'for'

* refactor, use time.After() instead of channel
2025-01-29 18:53:30 +01:00
Santiago 361312bbd7 Alerting: Expect 406s from the remote Alertmanager during the readiness check (#99507)
* Alerting: Expect 406s from the remote Alertmanager during the readiness check

* make it clear in the warning logs that we'll attempt to send the confgiuration/state without comparing in case of error pulling the current state/config
2025-01-24 16:26:57 +01:00
Santiago 4f8f82f5f1 Alerting: Fix remote Alertmanager readiness check path (#95063) 2024-10-21 17:24:51 +02:00
Santiago 4c15266a77 Alerting: Add X-Remote-Alertmanager header to the remote AM client (#94913) 2024-10-17 22:38:13 +02:00
Fayzal Ghantiwala e321dbb690 Alerting: Use remote Alertmanager to test templates and receivers when enabled (#91570)
* Initial impl

* Add code to test templates and receivers

* Fix linter

* Fix forked am tests

* Update mimir client

* Remove trailing whitespace

* re-trigger CI
2024-08-15 16:56:14 +01:00
Santiago fce03cd724 Alerting: Send static headers to the remote Alertmanager (#89846) 2024-07-01 17:48:40 +02:00
Santiago fe1309dd96 Alerting: Send external URL to the remote Alertmanager (#89701)
* Alerting: Send external URL to the remote Alertmanager

* test that the URL is sent to the remote Alertmanager

* AppURL -> ExternalURL
2024-06-26 14:02:02 +02:00
Steve Simpson 8eabef1f91 Alerting: Update remote alertmanager to use extended receivers API. (#89253)
* Alerting: Update remote alertmanager to use extended receivers API.

* Update integration test and Mimir image

* Update Mimir image in more places
2024-06-19 12:40:22 +03:00
Alexander Weaver 8491e02caf Alerting: Instrument outbound requests for Loki Historian and Remote Alertmanager with tracing (#89185)
* Add TracedClient

* Handle errors and status codes

* Wire up tracing to normal ASH and loki annotation mapping

* Add tracing to remote alertmanager

* one more spot

* and not or

* More consistency with other grafana traces, lower cardinality name
2024-06-14 13:24:12 -05:00
Santiago e41434c332 Alerting: Promote configuration in the remote Alertmanager (#87388) 2024-05-16 12:06:03 +02:00
Santiago a2ce8fefed Alerting: Use a struct when sending a Grafana AM configuration to the remote Alertmanager (#86451)
* Alerting: Use a struct when sending a Grafana AM configuration to the remote Alertmanager

* remove '-distroless' from mimir image name
2024-04-19 13:04:18 +02:00
Santiago 4ad6d66479 Alerting: Remove ID from UserGrafanaConfig struct (#84602)
* Alerting: Remove ID from UserGrafanaConfig struct

* user custom mimir image withoud id in grafana config

* change mimir image name
2024-03-19 12:47:13 +01:00
Santiago fbbda6c05e Alerting: Retry readiness check to the remote Alertmanager on 5xx status code responses (#81174) 2024-01-24 21:39:06 +01:00
Santiago 6c87d9a1e7 Alerting: Stop retries on 4xx status code responses (remote Alertmanager readiness check) (#80350) 2024-01-11 12:12:35 +01:00
Santiago 9e78faa7ba Alerting: Add metrics to the remote Alertmanager struct (#79835)
* Alerting: Add metrics to the remote Alertmanager struct

* rephrase http_requests_failed description

* make linter happy

* remove unnecessary metrics

* extract timed client to separate package

* use histogram collector from dskit

* remove weaveworks dependency

* capture metrics for all requests to the remote Alertmanager (both clients)

* use the timed client in the MimirAuthRoundTripper

* HTTPRequestsDuration -> HTTPRequestDuration, clean up mimir client factory function

* refactor

* less git diff

* gauge for last readiness check in seconds

* initialize LastReadinesCheck to 0, tweak metric names and descriptions

* add counters for sync attempts/errors

* last config sync and last state sync timestamps (gauges)

* change latency metric name

* metric for remote Alertmanager mode

* code review comments

* move label constants to metrics package
2024-01-10 11:18:24 +01:00
gotjosh cc3c0a2cc2 Alerting: Refactor readiness check (#78799)
* Alerting: Refactor readiness check

Moves the readiness check to the mimir client and removes the need to assert that we have senders - it already has a queue and can hold notifications until we're ready to send them.
---------

Signed-off-by: gotjosh <josue.abreu@gmail.com>
2023-12-12 15:34:54 +00:00
Santiago 73776f37eb Alerting: Send state to the remote Alertmanager (#78538)
* Alerting: Introduce a Mimir client as part of the Remote Alertmanager

Mimir client that understands the new APIs developed for mimir. Very much a WIP still.

* more wip

* appease the linter

* more linting

* add more code

* get state from kvstore, encode, send

* send state to the remote Alertmanager, extract fullstate logic into its own function

* pass kvstore to remote.NewAlertmanager()

* refactor

* add fake kvstore to tests

* tests

* use FileStore to get state

* always log 'completed state upload'

* refactor compareRemoteConfig

* base64-encode the state in the file store

* export silences and nflog filenames, refactor

* log 'completed state/config upload...' regardless of outcome

* add values to the state store in tests

* address code review comments

* log error from filestore

---------

Co-authored-by: gotjosh <josue.abreu@gmail.com>
2023-11-29 12:49:39 +01:00
gotjosh 8120306fea Remote Alertmanager(refactor): Only parse the URL once (#78631)
* Remote Alertmanager(refactor): Only parse the URL once

Exactly what it says in the tin.

Signed-off-by: gotjosh <josue.abreu@gmail.com>

* use the existing tests

Signed-off-by: gotjosh <josue.abreu@gmail.com>

---------

Signed-off-by: gotjosh <josue.abreu@gmail.com>
2023-11-24 11:05:13 +00:00
gotjosh 23fe8f4e9c Alerting: Introduce a Mimir client as part of the Remote Alertmanager (#78357)
* Alerting: Introduce a Mimir client as part of the Remote Alertmanager

This is our first attempt at making Grafana communicate use Mimir as a backend - it uses a new set of APIs that we've developed on the Mimir side to upload the grafana configuration and alertmanager state so that it can then be ported over.

Codewise, we've introduced a couple of things:

A client to isolate in its own package all the communication that happens with Mimir
A few changes to the remote/alertmanager to include uploading the configuration and state when it starts
A few refactors that align a bit better with the design approach that we're thinking
An integration tests again these newly developed APIs using a custom image

---------

Signed-off-by: gotjosh <josue.abreu@gmail.com>
Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>
2023-11-23 16:59:36 +00:00