Seunghun Shin 512c292e04 Alerting: Add jitter support for periodic alert state storage to reduce database load spikes (#111357)
What is this feature?

This PR implements a jitter mechanism for periodic alert state storage to distribute database load over time instead of processing all alert instances simultaneously. When enabled via the state_periodic_save_jitter_enabled configuration option, the system spreads batch write operations across 85% of the save interval window, preventing database load spikes in high-cardinality alerting environments.

Why do we need this feature?

In production environments with high alert cardinality, the current periodic batch storage can cause database performance issues by processing all alert instances simultaneously at fixed intervals. Even when using periodic batch storage to improve performance, concentrating all database operations at a single point in time can overwhelm database resources, especially in resource-constrained environments.

Rather than performing all INSERT operations at once during the periodic save, distributing these operations across the time window until the next save cycle can maintain more stable service operation within limited database resources. This approach prevents resource saturation by spreading the database load over the available time interval, allowing the system to operate more gracefully within existing resource constraints.

For example, with 200,000 alert instances using a 5-minute interval and 4,000 batch size, instead of executing 50 batch operations simultaneously, the jitter mechanism distributes these operations across approximately 4.25 minutes (85% of 5 minutes), with each batch executed roughly every 5.2 seconds.

This PR provides system-level protection against such load spikes by distributing operations across time, reducing peak resource usage while maintaining the benefits of periodic batch storage. The jitter mechanism is particularly valuable in resource-constrained environments where maintaining consistent database performance is more critical than precise timing of state updates.
2025-09-29 11:22:36 +02:00


The open-source platform for monitoring and observability


Grafana allows you to query, visualize, alert on and understand your metrics no matter where they are stored. Create, explore, and share dashboards with your team and foster a data-driven culture:

  • Visualizations: Fast and flexible client-side graphs with a multitude of options. Panel plugins offer many different ways to visualize metrics and logs.
  • Dynamic Dashboards: Create dynamic & reusable dashboards with template variables that appear as dropdowns at the top of the dashboard.
  • Explore Metrics: Explore your data through ad-hoc queries and dynamic drilldown. Split view and compare different time ranges, queries and data sources side by side.
  • Explore Logs: Experience the magic of switching from metrics to logs with preserved label filters. Quickly search through all your logs or stream them live.
  • Alerting: Visually define alert rules for your most important metrics. Grafana will continuously evaluate and send notifications to systems like Slack, PagerDuty, VictorOps, OpsGenie.
  • Mixed Data Sources: Mix different data sources in the same graph! You can specify a data source on a per-query basis. This works even for custom data sources.

Get started

Unsure if Grafana is for you? Watch Grafana in action on play.grafana.org!

Documentation

The Grafana documentation is available at grafana.com/docs.

Contributing

If you're interested in contributing to the Grafana project:

Get involved

This project is tested with BrowserStack.

License

Grafana is distributed under AGPL-3.0-only. For Apache-2.0 exceptions, see LICENSING.md.

Description
The open and composable observability and data visualization platform. Visualize metrics, logs, and traces from multiple sources like Prometheus, Loki, Elasticsearch, InfluxDB, Postgres and many more.
Languages
TypeScript 52.9%
Go 44.3%
CUE 0.7%
Rich Text Format 0.5%
HTML 0.4%
Other 1%