grafana/pkg/registry
Roberto Jiménez Sánchez 41276676eb Provisioning: add retry logic for transient errors in Kubernetes client (#114215)
* feat: add retry logic for transient errors in Kubernetes client

Add a retry wrapper for dynamic.ResourceInterface that automatically retries
operations failing with transient errors, using Kubernetes' wait.ExponentialBackoff utility.

- Implements retry logic with exponential backoff for all Kubernetes API operations
- Detects transient errors: ServiceUnavailable, ServerTimeout, TooManyRequests,
  InternalError, Timeout, and network errors
- Uses wait.ExponentialBackoff from k8s.io/apimachinery/pkg/util/wait
- Respects context cancellation
- Includes comprehensive unit tests

Part of https://github.com/grafana/git-ui-sync-project/issues/634

* docs: add detailed documentation for defaultRetryBackoff

Document when retry attempts will happen, what errors trigger retries,
and the retry behavior (attempts, delays, exponential backoff, jitter).

* feat: add logging and increase retry attempts for Kubernetes client

- Add context logger to track retry attempts (Info for retries, Warn for exhaustion)
- Increase retry attempts from 5 to 8 steps (~10 seconds total retry window)
- Document when all retry attempts will fail:
  * API server completely unavailable/unreachable
  * Network connectivity issues persist beyond retry window
  * Consistent transient errors for entire retry duration
  * Context cancellation before retries complete

* chore: update retry client documentation

* fix: resolve linting issues in retry client

- Replace type assertions with errors.As for wrapped errors
- Remove deprecated Temporary() check (deprecated since Go 1.18)
- Update tests to remove temporary error test case

* fix: resolve staticcheck S1008 linting issue in retry_client.go

Simplify return statement to use errors.As directly instead of if-return pattern
2025-11-20 15:12:07 +01:00