grafana

Author	SHA1	Message	Date
Tito Lins	c29ed31c7a	alerting: set model refID if missing/mismatch (#114441 )	2025-11-26 17:59:22 +01:00
Alexander Akhmetov	1d4067216d	Alerting: Add search.folder filter to the Prometheus rules API (#114358 )	2025-11-25 12:19:31 +01:00
Steve Simpson	d50e2a94a3	Alerting: Implement setting of version message in convert API. (#114331 ) Part 2 of adding version messages to the `alert_rule_version` table. This allows setting the message via a header when using the Prometheus conversion API, which can be useful for e.g. linking changes back to source control.	2025-11-25 11:20:28 +01:00
Alexander Soelberg Heidarsson	fe367b570b	Alerting: Align OpenAPI definition with notification settings header (#114363 ) fix: Align OpenAPI definition with notification settings header Signed-off-by: Alexander Soelberg Heidarsson <89837986+alex5517@users.noreply.github.com>	2025-11-24 23:08:02 +01:00
William Wernert	2c18526a78	Alerting: Collate rule_group column as binary (#114365 ) * fix: collate rule_group column as utf8mb4_bin This ensures this column is sorted identically by the database and golang, which should eliminate issues related to a mismatch in sort order. * chore: make test stricter * fix: also add binary collation for postgres	2025-11-24 14:58:43 -05:00
Alexander Akhmetov	67ad43a784	Alerting: Fix rules API pagination when using in-memory filters (#114334 )	2025-11-24 12:48:22 +01:00
Matheus Macabu	b5335d9357	Alerting: Remove dependency on yaml/v2 package (#114348 )	2025-11-24 12:38:44 +01:00
Moustafa Baiou	8108d3c795	Alerting: Add tracing to prometheus rules api (#114282 ) To better observe and identify performance bottlenecks in the prometheus rules API. The following spans were added: - `api.prometheus.RouteGetRuleStatuses` - `api.prometheus.PrepareRuleGroupStatusesV2` The `api.prometheus.PrepareRuleGroupStatusesV2` includes attributes to capture the parameters used in the request to help with debugging and performance analysis.	2025-11-21 15:00:35 -05:00
Denis Vodopianov	0e460a267e	chore : Deprecating FeatureToggles.IsEnabled (#113062 ) * Deprecating features.IsEnabled * add one more nolint * add one more nolint * Give better hints to devs in the deprecation message of IsEnabledGlobally * adding more doc strings * fix linter after rebase * Extend deprecation message	2025-11-21 18:43:42 +01:00
Steve Simpson	cc6e037093	Alerting: Add message to alert_rule_version table (Part 1). (#114194 ) This adds a `message` column to the `alert_rule_version` table. This follows the pattern established for dashboards as closely as possible. A new type is introduced internally for passing the new `message` field around in a type-safe manner, but doing the same for the API types becomes very messy. In that case, a new field is added with omitempty. Note this PR is only: - The column addition - The "read" path; API for listing versions Subsequent PRs will add code to actually set the message when updating rules.	2025-11-20 10:00:27 +01:00
alerting-team[bot]	53c39ccda3	Alerting: Update alerting module to 3befd25883e0d17673e5590cc5c5702bbc031b16 (#114062 ) * [create-pull-request] automated change * fix module path for alerting notify test receivers --------- Co-authored-by: moustafab <27738648+moustafab@users.noreply.github.com> Co-authored-by: Moustafa Baiou <moustafa.baiou@grafana.com>	2025-11-19 21:24:08 +00:00
Dave Henderson	7264803016	chore(deps): Switch to maintained gopkg.in/yaml fork (#114131 ) Signed-off-by: Dave Henderson <dave.henderson@grafana.com>	2025-11-19 15:20:32 -05:00
Alexander Akhmetov	5a0e9e4183	Alerting: Add rule_uid filter support to the Prometheus-compatible list rules API (#114182 )	2025-11-19 20:01:40 +01:00
Moustafa Baiou	1f9ed37b1b	Alerting: Optimize cache metrics updates (#114134 )	2025-11-19 11:39:13 -05:00
Alexander Akhmetov	bbd0fbe9d0	Alerting: Update documentation for Prometheus list rules API (#114057 )	2025-11-18 15:56:54 +01:00
Alexander Akhmetov	633c9a9cb0	Alerting: Add rule_limit parameter to the list rules API (#114055 )	2025-11-18 15:56:40 +01:00
Alexander Akhmetov	da5af29218	Alerting: Add search.rule_group to search rules by rule group name (#113680 ) Co-authored-by: Konrad Lalik <konradlalik@gmail.com>	2025-11-17 20:38:42 +01:00
Yuri Tseretyan	9d928a3ac6	Alerting: Remove stale states in two steps (#114044 )	2025-11-17 18:43:29 +00:00
Alexander Akhmetov	6d7ce16883	Alerting: Add rule_type filter to the rules endpoint (#113701 ) Alerting: Add type filter parameter to the rules endpoint	2025-11-14 13:03:15 +01:00
Yuri Tseretyan	6603abc873	Alerting: Support for imported receivers in API (#112138 ) * add support for converting Mimir integrations to Integration * implement imported config revision * update service to load staged receivers if configured * make sure non-Grafana origin cannot be mutated * set access control metadata for imported origin * set includeImported from feature flag. Disabled for service used by provisioning * add tests for new functionality * add snapshot-based integration test	2025-11-13 15:35:21 +00:00
Alexander Akhmetov	44a92d252b	Alerting: Support rule title search on the backend (#113738 )	2025-11-13 15:52:14 +01:00
Moustafa Baiou	559dab8b1b	Alerting: Fix error when updating Alertmanager config with autogenerated receivers (#113710 ) If an alert rule with an invalid receiver is created it breaks the entire alertmanager configuration rather than preventing the save. This fixes the issue by erroring on save and apply, and logging invalid receivers only when applying the config after an update. Introduced in #111838	2025-11-11 16:53:36 +00:00
Seunghun Shin	c784de6ef5	Alerting: Add compressed periodic save for alert instances (#111803 ) What is this feature? This PR implements compressed periodic save for alert state storage, providing a more efficient alternative to regular periodic saves by grouping alert instances by rule UID and storing them using protobuf and snappy compression. When enabled via the state_compressed_periodic_save_enabled configuration option, the system groups alert instances by their alert rule, compresses each group using protobuf serialization and snappy compression, and processes all rules within a single database transaction at specified intervals instead of syncing after every alert evaluation cycle. Why do we need this feature? During discussions in PR #111357, we identified the need for a compressed approach to periodic alert state storage that could further reduce database load beyond the jitter mechanism. While the jitter feature distributes database operations over time, this compressed periodic save approach reduces the frequency of database operations by batching alert state updates at explicitly declared intervals rather than syncing after every alert evaluation cycle. This approach provides several key benefits: - Reduced Database Frequency: Instead of frequent sync operations tied to alert evaluation cycles, updates occur only at configured intervals - Storage Efficiency: Rule-based grouping with protobuf and snappy compression significantly reduces storage requirements The compressed periodic save complements the existing jitter mechanism by providing an alternative strategy focused on reducing overall database interaction frequency while maintaining data integrity through compression and batching. Who is this feature for? - Platform/Infrastructure teams managing large-scale Grafana deployments with high alert cardinality - Organizations looking to optimize storage costs and database performance for alerting workloads - Production environments with 1000+ alert rules where database write frequency is a concern	2025-11-07 11:51:48 +01:00
Moustafa Baiou	acf0da9b80	Make the ordering of test on case-sensitivity consistent across databases and charsets	2025-11-03 11:36:18 -05:00
Moustafa Baiou	6f7c525213	Alerting: Ensure case-sensitive ordering for alert rule group column The query which fetches alert rules in a paginated manner ordered by `rule_group` can result in strange and inconsistent ordering when the database uses a case-insensitive collation for the `rule_group` column. This can lead to scenarios where rules from different groups are interleaved in the results, making pagination unreliable and the returned number of rule_groups incorrect. Related to #88990	2025-11-03 11:36:18 -05:00
Yuri Tseretyan	a4df6c8bb9	Alerting: Prohibit receivers with empty name (#113064 )	2025-10-29 16:30:38 -04:00
William Wernert	75fb832826	Alerting: Ensure state history client has external labels set (#113101 ) * Ensure state history client has external labels set * Run `make update-workspace` * Add dep owner	2025-10-28 11:35:54 -04:00
Moustafa Baiou	ce246936c4	Alerting: Surface remote AM silence creation errors properly When creating silences in remote Alertmanager instances, all 4xx errors were treated as 500s. This change ensures that 4xx errors are properly surfaced as bad payload errors, allowing callers to handle them appropriately.	2025-10-27 14:21:46 -04:00
Yuri Tseretyan	5673d0b532	Alerting: Skip logging in case of invalid receivers during auto generating policies (#111838 ) * skip logging of invalid receivers during autogen * log warn instead of error	2025-10-27 11:03:06 -04:00
Denis Vodopianov	81683d554d	chore : Deprecating `FeatureToggles.IsEnabledGlobally` (#112885 ) * add deprecation on featuremgmt.IsEnabledGlobally * add nolint reason * add reasonable deprecation message * remove junk edits * add more nolints * addressing review comments * Update pkg/services/featuremgmt/models.go Co-authored-by: Dave Henderson <dave.henderson@grafana.com> --------- Co-authored-by: Dave Henderson <dave.henderson@grafana.com>	2025-10-24 12:02:53 -04:00
Yuri Tseretyan	8b7f119cad	Alerting: Provisioning to fix contact point type on save (#112246 ) fix contact point type on create\update	2025-10-23 11:11:36 -04:00
Yuri Tseretyan	5f9a51418c	Alerting: Fix unmarshalling of GettableStatus to include time intervals (#112602 ) * move test files into test-data * add test for the bug * populate time-intervals of gettableStatus config	2025-10-21 09:28:04 -04:00
Ieva	acbbfde256	AuthZ service: Expand the logic to also evaluate action sets (#112124 ) * expand AuthZ service logic to also evaluate action sets * handle folder creation * fix test * simplify mapper code Co-authored-by: gamab <gabi.mabs@gmail.com> * more accurate variable name Co-authored-by: gamab <gabi.mabs@gmail.com> * break alerting import cycle * Apply suggestion from @gamab --------- Co-authored-by: gamab <gabi.mabs@gmail.com> Co-authored-by: Gabriel MABILLE <gamab@users.noreply.github.com>	2025-10-08 13:37:12 +01:00
Santiago	3f4c9879c9	Remote Alertmanager: Add timeout to the remoteClient (#112157 )	2025-10-08 11:13:02 +00:00
Yuri Tseretyan	7d1c6b6bd2	Alerting: Replace IntegrationConfig with IntegrationSchemaVersion (#112010 ) * remove unused compat functions * update to alerting module from pr * replace IntegrationConfig with IntegrationSchemaVersion * safely resolve a string into integration type * change usages of integration config	2025-10-07 11:08:16 -04:00
Tito Lins	7e63a01a79	alerting: omit optional notification settings fields (#112049 )	2025-10-06 14:23:21 +02:00
Alexander Akhmetov	cd889fef9b	Alerting: Keep extra configurations on main config update (#106958 )	2025-10-06 09:28:38 +02:00
Yuri Tseretyan	d0f79ee60d	Alerting: Update alerting module + refactor (#111761 ) * update alerting module * replace compat with ones from alerting * update type references Receiver and Integration to Status update route in provisioning test that is invalid after recent change * use right type for LINE integration	2025-10-03 10:37:49 -04:00
Yuri Tseretyan	22173da78d	Alerting: Use empty feature manager for creating test state (#111964 )	2025-10-02 19:46:59 +00:00
Alexander Akhmetov	169bf2ce73	Alerting: Add feature toggle to use the old simplified routing hash generation (#111900 ) * Revert "Alerting: Generate simplified routing routes with old fingerprint function (#111893)" This reverts commit `0da9d49896`. * Add alertingUseNewSimplifiedRoutingHashAlgorithm flag * Alerting: Add feature toggle to use the old simplified routing hash generation	2025-10-01 15:21:33 -04:00
Alexander Akhmetov	0da9d49896	Alerting: Generate simplified routing routes with old fingerprint function (#111893 )	2025-10-01 18:45:36 +02:00
Seunghun Shin	512c292e04	Alerting: Add jitter support for periodic alert state storage to reduce database load spikes (#111357 ) What is this feature? This PR implements a jitter mechanism for periodic alert state storage to distribute database load over time instead of processing all alert instances simultaneously. When enabled via the state_periodic_save_jitter_enabled configuration option, the system spreads batch write operations across 85% of the save interval window, preventing database load spikes in high-cardinality alerting environments. Why do we need this feature? In production environments with high alert cardinality, the current periodic batch storage can cause database performance issues by processing all alert instances simultaneously at fixed intervals. Even when using periodic batch storage to improve performance, concentrating all database operations at a single point in time can overwhelm database resources, especially in resource-constrained environments. Rather than performing all INSERT operations at once during the periodic save, distributing these operations across the time window until the next save cycle can maintain more stable service operation within limited database resources. This approach prevents resource saturation by spreading the database load over the available time interval, allowing the system to operate more gracefully within existing resource constraints. For example, with 200,000 alert instances using a 5-minute interval and 4,000 batch size, instead of executing 50 batch operations simultaneously, the jitter mechanism distributes these operations across approximately 4.25 minutes (85% of 5 minutes), with each batch executed roughly every 5.2 seconds. This PR provides system-level protection against such load spikes by distributing operations across time, reducing peak resource usage while maintaining the benefits of periodic batch storage. The jitter mechanism is particularly valuable in resource-constrained environments where maintaining consistent database performance is more critical than precise timing of state updates.	2025-09-29 11:22:36 +02:00
Yuri Tseretyan	b8f23eacd4	Alerting: Migrate to integration schema (#111643 ) * update tests to assert against snapshot * remove channel_config package replaced by schemas from alerting module * update references to use new schema	2025-09-26 09:31:50 -04:00
Yuri Tseretyan	24c10b4fb9	Alerting: Remove usages of ReceiverType (#111508 ) * remove usages of ReceiverType	2025-09-25 16:09:54 -04:00
Santiago	dab39c873f	Remote Alertmanager: Use the correct OrgID when creating the store (#111634 ) * Remote Alertmanager: Use the correct OrgID when creating the store * fix test	2025-09-25 16:53:07 +00:00
Santiago	345b72227f	Alert State History: Remove redundant JSON serialization when merging Loki streams (#111443 )	2025-09-23 20:56:37 +02:00
Santiago	04bc71fa6d	Alert State History: Skip invalid entries when merging streams (#111387 )	2025-09-22 12:29:39 +02:00
Yuri Tseretyan	f166968357	Alerting: Refactoring ConfigRevision methods (#111192 ) * make validateReceiver private * make functions and type alias private * move EncryptedReceivers and DecryptedReceivers to notifier package to reduce exposure of definitions package via legacy_storage * return receivers with Grafana origin after create\update * add tests for ConfigRevision methods	2025-09-19 09:46:35 -04:00
Santiago	8f9d8f1154	Remote Alertmanager: Fix log line in the Mimir client (#111293 )	2025-09-18 10:07:16 +00:00
Yuri Tseretyan	c36b2ae191	Alerting: v0 schema for integrations (mimir) (#110908 ) * generate schema for mimir integrations from schema on front-end * review and fix the settings * Update GetAvailableNotifiersV2 to return mimir as v0 * add version argument to GetSecretKeysForContactPointType * update TestGetSecretKeysForContactPointType to include v0 * add type alias field to contain alternate types that differ from Grafana's * add support for msteamsv2 * update ConfigForIntegrationType to look for alternate type * update IntegrationConfigFromType to use new result of ConfigForIntegrationType * add reference to parent plugin to NotifierPluginVersion to allow getting plugin type by its alias * add tests to ensure consistency * make API response stable * add tests against snapshot + omit optional fields	2025-09-17 09:25:56 -04:00

2046 Commits
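
The ordering problem described in commits 6f7c525213 and 2c18526a78 can be illustrated with a short sketch. It does not touch a database: `sortBinary` and `sortCaseInsensitive` are hypothetical helpers, and lower-casing the keys is only a rough stand-in for a case-insensitive collation. The point is that byte-wise ordering (what Go's `sort.Strings` does, and what a binary collation is meant to match) keeps identical group names adjacent, while a case-insensitive comparison treats "cpu" and "CPU" as equal and can leave them interleaved.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// sortBinary orders group names byte-wise, as Go compares strings.
func sortBinary(groups []string) []string {
	out := append([]string(nil), groups...)
	sort.Strings(out)
	return out
}

// sortCaseInsensitive approximates a case-insensitive collation by
// comparing lower-cased names. Names that differ only in case compare as
// equal, so their relative order is whatever the input order was.
func sortCaseInsensitive(groups []string) []string {
	out := append([]string(nil), groups...)
	sort.SliceStable(out, func(i, j int) bool {
		return strings.ToLower(out[i]) < strings.ToLower(out[j])
	})
	return out
}

func main() {
	groups := []string{"cpu", "CPU", "cpu", "CPU", "Disk"}
	fmt.Println("binary:          ", sortBinary(groups))
	fmt.Println("case-insensitive:", sortCaseInsensitive(groups))
}
```

In the case-insensitive result the rows for "cpu" and "CPU" alternate, which is the kind of interleaving that makes group-based pagination unreliable.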