Alerting: Add alert instance labels to Loki log lines in addition to stream labels (#65403)
Add instance labels to log line
(cherry picked from commit de1637afe5)
Co-authored-by: Alexander Weaver <weaver.alex.d@gmail.com>
Alerting: Add "backend" label to state history writes metrics (#65395)
* Add backend label to state history writes metrics
* Update test expectations
(cherry picked from commit dd04757fc9)
Co-authored-by: Alexander Weaver <weaver.alex.d@gmail.com>
* Alerting: Fix stats that display alert count when using unified alerting (#64852)
* Alerting: Fix stats when using unified alerting
(cherry picked from commit 02a8f62021)
* bundle services is not needed for 9.4.x
---------
Co-authored-by: gotjosh <josue.abreu@gmail.com>
Alerting: Fix attachment of external labels to Loki state history log streams (#65140)
Fix attachment of external labels, add tests
(cherry picked from commit 07368dec74)
Co-authored-by: Alexander Weaver <weaver.alex.d@gmail.com>
* Alerting: Switch to snappy-compressed-protobuf for outgoing push requests to Loki (#65077)
* Encode with snappy, always
* JSON encoder type
* Headers
* Copy labels formatter from promtail
* Implement snappy-proto encoding
* Create encoder interface, test both encoders, choose snappy-proto by default
* Make encoder configurable at the LokiCfg level
* Export both encoders
* Touch up comment and tests
* Drop unnecessary conversions after move to plain strings to appease linter
(cherry picked from commit bf54f2672e)
* Sample fields got renamed between 9.4 and main
Alerting: Fix ambiguous handling of equals in labels when bucketing Loki state history streams (#65013)
* Use JSON instead of data.Labels string format as label repr
* Drop debug log line
(cherry picked from commit cc7e5ce62e)
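
The ambiguity addressed by #65013 can be illustrated with a minimal sketch. The `naiveRepr` helper below is a hypothetical stand-in for a `key=value` style string and is not the real `data.Labels` formatter; the label values are made up for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// naiveRepr renders labels as "k=v, k=v" with sorted keys.
// Hypothetical helper for illustration only.
func naiveRepr(labels map[string]string) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, k+"="+labels[k])
	}
	return strings.Join(parts, ", ")
}

// jsonRepr renders labels as JSON. encoding/json sorts map keys and
// escapes values, so distinct label sets yield distinct strings.
func jsonRepr(labels map[string]string) string {
	b, err := json.Marshal(labels)
	if err != nil {
		return "" // not expected for map[string]string
	}
	return string(b)
}

func main() {
	a := map[string]string{"a": "b, c=d"}
	b := map[string]string{"a": "b", "c": "d"}
	fmt.Println("naive collides:", naiveRepr(a) == naiveRepr(b))
	fmt.Println("json collides: ", jsonRepr(a) == jsonRepr(b))
}
```

With a naive string key, both label sets would land in the same bucket; a JSON key keeps them apart.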
Alerting: Elide requests to Loki if nothing should be recorded (#65011)
Exit early if no log streams or annotations
(cherry picked from commit e39d7f44c9)
Co-authored-by: Alexander Weaver <weaver.alex.d@gmail.com>
Vendor errors.Join from Go standard library to avoid version incompatibilities (#64985)
Vendor errors.Join from std lib
(cherry picked from commit 40c5713cbd)
Co-authored-by: Alexander Weaver <weaver.alex.d@gmail.com>
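
For readers unfamiliar with `errors.Join` (added to the standard library in Go 1.20), the following is a rough sketch of what a local copy might look like. It is written from the documented behaviour (nil errors are dropped, messages are joined with newlines) and is not the code vendored in #64985; the sentinel errors are made up for the example.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// joinError wraps several errors into one.
type joinError struct{ errs []error }

// Error joins the individual messages with newlines.
func (e *joinError) Error() string {
	msgs := make([]string, 0, len(e.errs))
	for _, err := range e.errs {
		msgs = append(msgs, err.Error())
	}
	return strings.Join(msgs, "\n")
}

// Unwrap exposes the wrapped errors.
func (e *joinError) Unwrap() []error { return e.errs }

// Join discards nil errors and returns nil if none remain.
func Join(errs ...error) error {
	var kept []error
	for _, err := range errs {
		if err != nil {
			kept = append(kept, err)
		}
	}
	if len(kept) == 0 {
		return nil
	}
	return &joinError{errs: kept}
}

// Hypothetical sentinel errors used only by this example.
var (
	errLokiWrite       = errors.New("loki write failed")
	errAnnotationWrite = errors.New("annotation write failed")
)

func main() {
	fmt.Println(Join(nil, errLokiWrite, errAnnotationWrite))
	fmt.Println(Join(nil, nil) == nil)
}
```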
Alerting: Create new state history "fanout" backend that dispatches to multiple other backends at once (#64774)
* Rename RecordStatesAsync to Record
* Rename QueryStates to Query
* Implement fanout writes
* Implement primary queries
* Simplify error joining
* Add test for query path
* Add tests for writes and error propagation
* Allow fanout backend to be configured
* Touch up log messages and config validation
* Consistent documentation for all backend structs
* Parse and normalize backend names more consistently against an enum
* Touch-ups to documentation
* Improve clarity around multi-record blocking
* Keep primary and secondaries more distinct
* Rename fanout backend to multiple backend
* Simplify config keys for multi backend mode
(cherry picked from commit a31672fa40)
Co-authored-by: Alexander Weaver <weaver.alex.d@gmail.com>
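
The write and query split described in #64774 can be sketched roughly as follows. The `Backend` interface, `memBackend`, and the method signatures are simplified assumptions for illustration; the real backends have different signatures, and this sketch writes sequentially for brevity.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Backend is a simplified, hypothetical state history backend.
type Backend interface {
	Record(ctx context.Context, entries []string) error
	Query(ctx context.Context) ([]string, error)
}

// memBackend keeps entries in memory and can be told to fail writes.
type memBackend struct {
	entries []string
	failErr error
}

func (m *memBackend) Record(_ context.Context, entries []string) error {
	if m.failErr != nil {
		return m.failErr
	}
	m.entries = append(m.entries, entries...)
	return nil
}

func (m *memBackend) Query(_ context.Context) ([]string, error) {
	return m.entries, nil
}

// MultipleBackend writes to every backend but reads only from the primary.
type MultipleBackend struct {
	primary     Backend
	secondaries []Backend
}

// Record attempts every write and joins whatever errors occurred.
func (b *MultipleBackend) Record(ctx context.Context, entries []string) error {
	errs := []error{b.primary.Record(ctx, entries)}
	for _, s := range b.secondaries {
		errs = append(errs, s.Record(ctx, entries))
	}
	return errors.Join(errs...) // nil if every write succeeded
}

// Query is served by the primary backend alone.
func (b *MultipleBackend) Query(ctx context.Context) ([]string, error) {
	return b.primary.Query(ctx)
}

// Hypothetical error used only by this example.
var errSecondaryDown = errors.New("secondary unavailable")

func main() {
	primary := &memBackend{}
	broken := &memBackend{failErr: errSecondaryDown}
	multi := &MultipleBackend{primary: primary, secondaries: []Backend{broken}}

	err := multi.Record(context.Background(), []string{"Normal -> Alerting"})
	fmt.Println("record error:", err)

	got, _ := multi.Query(context.Background())
	fmt.Println("primary has:", got)
}
```

A failing secondary surfaces as an error but does not prevent the primary from storing the entry.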
SQLStore: Fix setting query tries for integration tests (#64944)
* SQLStore: Pass testinfra database configuration to the test database
* Add test
* Bypass gocyclo check for initTestDB
(cherry picked from commit f5cb8c660e)
Co-authored-by: Sofia Papagiannaki <1632407+papagian@users.noreply.github.com>
Alerting: Handful of small adjustments to log levels and parameters (#64572)
Calculate duration earlier in scheduler
(cherry picked from commit 9bcf8819d3)
Co-authored-by: Alexander Weaver <weaver.alex.d@gmail.com>
SQLStore: Fix SQLite error propagation if query retries are disabled (#64904)
* SQLStore: Add test when query retrying is disabled
* Fix condition
* Add test cases for sqlite3.ErrLocked
(cherry picked from commit 41843464d1)
Co-authored-by: Sofia Papagiannaki <1632407+papagian@users.noreply.github.com>
Navigation: handle case when there is no alerting node at all (#64941)
* handle case when there is no alerting node at all
* update backend tests
(cherry picked from commit f4c62a5c5d)
Co-authored-by: Ashley Harrison <ashley.harrison@grafana.com>
Navigation: Correctly create `Alerts and incidents` section when using legacy alerting (#64924)
check for legacy alerting node as well
(cherry picked from commit 54dd8943ca)
Co-authored-by: Ashley Harrison <ashley.harrison@grafana.com>
Alerting: Update scheduler to get updates only from database (#64635)
* stop using the scheduler's Update and Delete methods; all communication must go through the database
* update scheduler's registry to calculate diff before re-setting the cache
* update fetcher to return the diff generated by registry
* update processTick to update rule eval routine if the rule was updated and it is not going to be evaluated at this tick.
* remove references to the scheduler from api package
* remove unused methods in the scheduler
(cherry picked from commit 85a954cd81)
# Conflicts:
# pkg/services/ngalert/schedule/schedule.go
# pkg/services/ngalert/schedule/schedule_unit_test.go
Alerting: Log error but don't fail initialization if state history connection test fails (#64699)
Don't return init error if ping fails, add tests
(cherry picked from commit faef3a8258)
Co-authored-by: Alexander Weaver <weaver.alex.d@gmail.com>
* Navigation: Fix Home logo always going to `/login` (#62658)
* only redirect to /login when anonymous access is disabled
* only search for dashboards when not logged in if anon access is enabled
* fix go logic
* add unit tests
(cherry picked from commit 3336327306)
* remove file i accidentally left in :/
* import correct method
* add metric encyclopedia feature toggle and component
* remove unused button
* move file, add test file
* add tests
* add pagination and tests
* test with 10,000,000 metrics
* remove unused import
* add filter by type
* search alphabetically and add switch to exclude metrics with no metadata
* add suggested functions and filter for functions
* allow user to select variables in encyclopedia
* fix style and tests
* add fuzzy search by either metric name or all metadata
* if missing metadata, remove metadata fuzzy search option, exclude metadata, and filter by type
* add encyclopedia feature tracking
* indicate that metrics are filtered by labels
* handle metric singular or plural
* add tooltips and fix language
* add filtering tests
* change 'search' to 'browse'
* remove functions filter and tests as they are not part of the workflow
* add metric encyclopedia button and show the selected metric as a tag
* fix hanging search and update styles, padding, labels, and groupings
* small performance improvements
* fix tests
* add backend metrics query option
* add loading spinner for start load and backend search
* autofocus search input
* Update docs/sources/setup-grafana/configure-grafana/feature-toggles/index.md
Co-authored-by: Christopher Moyer <35463610+chri2547@users.noreply.github.com>
* run prettier
* fix text for feature toggle
* for license check since https://cla-assistant.io/check/grafana/grafana?pullRequest=<PR#> is not working
* fixing tests
* fix feature toggle docs
* fix feature toggle
* add owner to feature toggle
---------
Co-authored-by: Christopher Moyer <35463610+chri2547@users.noreply.github.com>
(cherry picked from commit 9b6e531549)
Alerting: Fix intermittency when seeding database in rule store tests (#64322)
Force unique IDs when seeding database
(cherry picked from commit 4a1c18abf6)
Co-authored-by: Alexander Weaver <weaver.alex.d@gmail.com>
Alerting: Fix Classic Conditions $values variable (#64243)
This commit fixes a bug in the $values variable in notification
templates when using Classic Conditions. Since Classic Conditions
are not multi-dimensional, the values of each series that exceeded
the condition should be available as a RefID and offset, for example
B0, B1, and so on. Because of this bug, only a single condition was
printed, and it was keyed as B rather than B0.
(cherry picked from commit ed71012ced)
Co-authored-by: George Robinson <george.robinson@grafana.com>
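
As a rough illustration of the RefID-plus-offset keys described in #64243, the sketch below builds such a map and ranges over it with `text/template`. The `valueKeys` and `render` helpers and the plain `float64` values are assumptions made for the example; the real `$values` entries are richer than bare numbers.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// valueKeys keys each series value by RefID plus its offset (B0, B1, ...).
// Hypothetical helper for illustration only.
func valueKeys(refID string, values []float64) map[string]float64 {
	out := make(map[string]float64, len(values))
	for i, v := range values {
		out[fmt.Sprintf("%s%d", refID, i)] = v
	}
	return out
}

// render prints every key/value pair; text/template visits map keys in
// sorted order.
func render(values map[string]float64) (string, error) {
	tmpl, err := template.New("msg").Parse(
		`{{ range $k, $v := . }}{{ $k }}={{ $v }} {{ end }}`)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, values); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := render(valueKeys("B", []float64{91.5, 97}))
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Println(out)
}
```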
Alerting: Expose Prometheus metrics for persisting state history (#63157)
* Create historian metrics and dependency inject
* Record counter for total number of state transitions logged
* Track write failures
* Track current number of active write goroutines
* Record histogram of how long it takes to write history data
* Don't copy the registerer
* Adjust naming of write failures metric
* Introduce WritesTotal to complement WritesFailedTotal
* Measure TransitionsFailedTotal to complement TransitionsTotal
* Rename all to state_history
* Remove redundant Total suffix
* Increment totals all the time, not just on success
* Drop ActiveWriteGoroutines
* Drop PersistDuration in favor of WriteDuration
* Drop unused gauge
* Make writes and writesFailed per org
* Add metric indicating backend and a spot for future metadata
* Drop _batch_ from names and update help
* Add metric for bytes written
* Better pairing of total + failure metric updates
* Few tweaks to wording and naming
* Record info metric during composition
* Create fakeRequester and simple happy path test using it
* Blocking test for the full historian and test for happy path metrics
* Add tests for failure case metrics
* Smoke test for full annotation persistence
* Create test for metrics on annotation persistence, both happy and failing paths
* Address linter complaints
* More linter complaints
* Remove unnecessary whitespace
* Consistency improvements to help texts
* Update tests to match new descs
(cherry picked from commit 19d01dff91)
fix: create temp user no longer sets ID to 0 for all users (#64149)
* fix: create temp user no longer sets ID to 0 for all users
The xorm tag added to the tempuser ID field caused xorm to create all temp users with ID 0. Removing that tag allows xorm to set the ID based on the database result instead. I also added a test which was failing before this.
Fixes #63995
(cherry picked from commit dbb72f2c6e)
Co-authored-by: Kristin Laemmert <mildwonkey@users.noreply.github.com>
* move analytics identifiers to backend
* implement hash function
* grab secret from env
* expose and retrieve intercom secret from config
* concat email with appUrl to ensure uniqueness
* revert to just using email
* Revert "revert to just using email"
This reverts commit 8f10f9b1bc.
* add docstring
(cherry picked from commit d61bcdf4ca)
SQLStore: Enable clientFoundRows for MySQL connections (#64070)
Enable clientFoundRows for MySQL connections
(cherry picked from commit 8ea71d37c2)
Co-authored-by: Alexander Weaver <weaver.alex.d@gmail.com>
Alerting: Use background context for maintenance function (#64065)
(cherry picked from commit e760f22402)
# Conflicts:
# pkg/services/ngalert/notifier/alertmanager.go
Alerting: Fix migration pauses all alert rules on PostgreSQL (#63951)
This commit fixes a serious bug in Grafana 9.4.1 where on upgrade
a migration would pause all existing alert rules and change the
default value of the column to true.
(cherry picked from commit 030f6c948f)
Co-authored-by: George Robinson <george.robinson@grafana.com>
Alerting: Fix boolean default in migration from false to 0 (#63952)
Fix boolean default in migration from false to 0
(cherry picked from commit a05bf41ff9)
Co-authored-by: Alex Moreno <alexander.moreno@grafana.com>
fix(dashboard version service): add DashboardUID to query and responses (#60800)
* fix(dashboard version service): add DashboardUID to query and responses
The DashboardUID was not populated in the response from Get and ListDashboardVersions. This adds the DashboardUID to the Get query (it was already in List) and populates the DashboardUID in the returned DashboardVersionDTOs.
(cherry picked from commit 42be0e106f)
Co-authored-by: Kristin Laemmert <mildwonkey@users.noreply.github.com>
Users: Fix org user always getting org id = 1 on auto assign false (#63708)
* fix org user always getting org id = 1 on auto assign false
* make tests explicit
* use correct cfg in service accounts
* fix api tests
* fix database test of ac
* fix InsertOrgUser returning affected rows as orgID
(cherry picked from commit c8db771939)
* Alerting: Instrument outgoing state history requests using weaveworks/common (#63600)
* Loki backend and client depend on a requester
* Instrument all requests to loki using weaveworks TimedClient
* Construct collector in metrics package
(cherry picked from commit e77621649d)
* Revert all changes to gomod and gosum
* Authn: Anon session service (#63052)
* add anon sessions package
* add usage stat fn
* implement count for cache
* add anonservice to authn broker
* lint
* add tests for remote cache count
* move anon service to services
* wrap tagging in goroutine
* make func used
(cherry picked from commit ff78103a24)
* add local cache to protect multiple writes to DB cache
(cherry picked from commit bdb084736b)
Alerting: Fix incorrect comment in eval.go (#63510)
This commit fixes an incorrect comment in the Result struct in eval.go
that I had written some time ago. The comment now documents the
actual behaviour and content of this field.
(cherry picked from commit f93a9c794d)
Co-authored-by: George Robinson <george.robinson@grafana.com>
Authn: Stat registration (#62934)
* reorganize auth usage stats
* usage stat privilege elevators
* stat count of modified role
* cfg related info
* add authn anon client
* kv store
* ensure anon enabled is collected even if client is not registered
* fix usage stats test
(cherry picked from commit 14a78b58e9)
Alerting: Get alert rules on faults (#61248) (#63051)
* Alerting: get alert rules on faults (#61248)
Two functions that fetch alert rules from the database are updated:
- GetAlertRulesForScheduling
- ListAlertRules
Rows are now scanned one at a time, so rules that deserialize correctly
are still returned when others fail. A single error is logged, indicating
how many rules failed deserialization.
Resolved: #61248
* updates from review comments
(cherry picked from commit 56c8661929)
Co-authored-by: bla2ej <123992384+bla2ej@users.noreply.github.com>
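
The row-by-row approach from #61248 can be sketched as follows. The `rule` type, the JSON rows, and `decodeRules` are hypothetical; the real code scans database rows rather than JSON strings.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rule is a hypothetical, cut-down alert rule.
type rule struct {
	UID   string `json:"uid"`
	Title string `json:"title"`
}

// decodeRules deserializes rows one at a time. Rows that fail are counted
// and skipped, so a single bad row no longer discards the whole result.
func decodeRules(rows []string) (rules []rule, failed int) {
	for _, raw := range rows {
		var r rule
		if err := json.Unmarshal([]byte(raw), &r); err != nil {
			failed++
			continue
		}
		rules = append(rules, r)
	}
	return rules, failed
}

func main() {
	rows := []string{
		`{"uid":"a1","title":"CPU high"}`,
		`{"uid":`, // truncated row that cannot be deserialized
		`{"uid":"b2","title":"Disk full"}`,
	}
	rules, failed := decodeRules(rows)
	if failed > 0 {
		// One summary message instead of one log line per bad row.
		fmt.Printf("failed to deserialize %d alert rule(s)\n", failed)
	}
	fmt.Printf("loaded %d rule(s)\n", len(rules))
}
```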
Alerting: Fix a bug taking screenshots with Dashboard UID (#63220)
This commit fixes a bug where Grafana would fail to take a screenshot if
the same Dashboard UID was present across two or more different orgs.
(cherry picked from commit 1f984409a2)
Co-authored-by: George Robinson <george.robinson@grafana.com>
Provisioning: Parse boolean and numeric values from environment variables (#63085)
(cherry picked from commit a33e316f40)
Co-authored-by: Andres Martinez Gotor <andres.martinez@grafana.com>
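
One plausible way to turn an environment variable's string value into a typed value is sketched below. The `parseValue` helper, the order of the attempts, and the variable name are assumptions for illustration and are not taken from #63085.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// parseValue tries integer, then float, then boolean parsing, and falls
// back to the original string. Integers are tried first because
// strconv.ParseBool also accepts "1" and "0".
func parseValue(s string) interface{} {
	if i, err := strconv.ParseInt(s, 10, 64); err == nil {
		return i
	}
	if f, err := strconv.ParseFloat(s, 64); err == nil {
		return f
	}
	if b, err := strconv.ParseBool(s); err == nil {
		return b
	}
	return s
}

func main() {
	// EXAMPLE_PROVISIONING_FLAG is a made-up variable name.
	if err := os.Setenv("EXAMPLE_PROVISIONING_FLAG", "true"); err != nil {
		fmt.Println("setenv failed:", err)
		return
	}
	v := parseValue(os.Getenv("EXAMPLE_PROVISIONING_FLAG"))
	fmt.Printf("%T %v\n", v, v)
}
```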
Navigation: add a link to starred dashboards in the megamenu (#62685)
add a link to starred dashboards in the megamenu
(cherry picked from commit fc2f7f90f8)
Co-authored-by: Ashley Harrison <ashley.harrison@grafana.com>
Navigation: move Connections plugin to be just after apps (#62801)
move connections plugin to be just after apps
(cherry picked from commit c819e95687)
Co-authored-by: Ashley Harrison <ashley.harrison@grafana.com>
Alerting: Add label query parameters to state history endpoint (#62831)
* Allow equality-only matching of arbitrary labels via query params
* Pre-initialize map
(cherry picked from commit 9eeea8f5ea)
Co-authored-by: Alexander Weaver <weaver.alex.d@gmail.com>
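
Equality-only label matching from query parameters, as described in #62831, could look roughly like this. The `labels_` prefix, the helper names, and the sample query string are assumptions made for the example.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// labelPrefix is an assumed query parameter prefix for label filters.
const labelPrefix = "labels_"

// labelFilters extracts label filters from a raw query string.
func labelFilters(rawQuery string) (map[string]string, error) {
	q, err := url.ParseQuery(rawQuery)
	if err != nil {
		return nil, err
	}
	filters := make(map[string]string) // pre-initialized, never nil
	for key, vals := range q {
		if !strings.HasPrefix(key, labelPrefix) || len(vals) == 0 {
			continue
		}
		filters[strings.TrimPrefix(key, labelPrefix)] = vals[0]
	}
	return filters, nil
}

// matches reports whether every filter equals the corresponding label.
func matches(labels, filters map[string]string) bool {
	for name, want := range filters {
		if got, ok := labels[name]; !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	filters, err := labelFilters("ruleUID=abc&labels_team=infra&labels_severity=critical")
	if err != nil {
		fmt.Println("bad query:", err)
		return
	}
	labels := map[string]string{"team": "infra", "severity": "critical", "host": "db-1"}
	fmt.Println(filters, matches(labels, filters))
}
```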
Alerting: implement loki query for alert state history (#61992)
* Alerting: implement loki query for alert state history
* extract selector building
* add unit tests for selector creation
* backup
* give selectors their own type
* build dataframe
* add some tests
* small changes after manual testing
* use struct client
* golint
* more golint
* Make RuleUID optional for Loki implementation
* Drop initial assumption that we only have one series
* Pare down to three columns, fix timestamp overflows, improve failure cases in loki responses
* Embed structured log lines in the dataframe as objects rather than JSON strings
* Include state history label filter
* Remove dead code
---------
Co-authored-by: Jean-Philippe Quéméner <JohnnyQQQQ@users.noreply.github.com>