Commit Graph

56 Commits

Author SHA1 Message Date
Roberto Jiménez Sánchez
9760eef62f Provisioning: fix multi-tenant and single-tenant authorization (#115435)
* feat(auth): add ExtraAudience option to RoundTripper

Add ExtraAudience option to RoundTripper to allow operators to include
additional audiences (e.g., provisioning group) when connecting to the
multitenant aggregator. This ensures tokens include both the target API
server's audience and the provisioning group audience, which is required
to pass the enforceManagerProperties check.

- Add ExtraAudience RoundTripperOption
- Improve documentation and comments
- Add comprehensive test coverage
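
A minimal sketch of how an extra-audience option could be modelled with Go's functional-options pattern. The types, the `audiencesFor` helper, and the audience strings are hypothetical illustrations, not the actual RoundTripper code from this commit:

```go
package main

import "fmt"

// roundTripperConfig is a hypothetical stand-in for the RoundTripper's settings.
type roundTripperConfig struct {
	audiences []string
}

// RoundTripperOption mutates the configuration (functional-options pattern).
type RoundTripperOption func(*roundTripperConfig)

// ExtraAudience appends one more audience to the token request.
func ExtraAudience(aud string) RoundTripperOption {
	return func(c *roundTripperConfig) {
		c.audiences = append(c.audiences, aud)
	}
}

// audiencesFor starts from the target API server's audience and applies options.
func audiencesFor(target string, opts ...RoundTripperOption) []string {
	cfg := &roundTripperConfig{audiences: []string{target}}
	for _, opt := range opts {
		opt(cfg)
	}
	return cfg.audiences
}

func main() {
	// Placeholder audience names, chosen only for illustration.
	auds := audiencesFor("dashboard.grafana.app", ExtraAudience("provisioning.grafana.app"))
	fmt.Println(auds)
}
```

With this shape, callers that do not pass the option keep the single target audience, so existing behaviour is unchanged by default.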

* fix(operators): add ExtraAudience for dashboards/folders API servers

Operators connecting to dashboards and folders API servers need to include
the provisioning group audience in addition to the target API server's
audience to pass the enforceManagerProperties check.

* provisioning: fix settings/stats authorization for AccessPolicy identities

The settings and stats endpoints were returning 403 for users accessing via
ST->MT because the AccessPolicy identity was routed to the access checker,
which doesn't know about these resources.

This fix handles 'settings' and 'stats' resources before the access checker
path, routing them to the role-based authorization that allows:
- settings: Viewer role (read-only, needed by frontend)
- stats: Admin role (can leak information)

* fix: update BootstrapStep component to remove legacy storage handling and adjust resource counting logic

- Removed legacy storage flag from useResourceStats hook in BootstrapStep.
- Updated BootstrapStepResourceCounting to simplify rendering logic and removed target prop.
- Adjusted tests to reflect changes in resource counting and rendering behavior.

* Revert "fix: update BootstrapStep component to remove legacy storage handling and adjust resource counting logic"

This reverts commit 148802cbb5.

* provisioning: allow any authenticated user for settings/stats endpoints

These are read-only endpoints needed by the frontend:
- settings: returns available repository types and configuration for the wizard
- stats: returns resource counts

Authentication is verified before reaching authorization, so any user who
reaches these endpoints is already authenticated. Requiring specific org
roles failed for AccessPolicy tokens which don't carry traditional roles.

* provisioning: remove redundant admin role check from listFolderFiles

The admin role check in listFolderFiles was redundant (route-level auth already
handles access) and broken for AccessPolicy identities which don't have org roles.

File access is controlled by the AccessClient as documented in the route-level
authorization comment.

* provisioning: add isAdminOrAccessPolicy helper for auth checks

Consolidates authorization logic for provisioning endpoints:
- Adds isAdminOrAccessPolicy() helper that allows admin users OR AccessPolicy identities
- AccessPolicy identities (ST->MT flow) are trusted internal callers without org roles
- Regular users must have admin role (matching frontend navtree restriction)

Used in: authorizeSettings, authorizeStats, authorizeJobs, listFolderFiles

* provisioning: consolidate auth helpers into allowForAdminsOrAccessPolicy

Simplifies authorization by:
- Adding isAccessPolicy() helper for AccessPolicy identity check
- Adding allowForAdminsOrAccessPolicy() that returns Decision directly
- Consolidating stats/settings/jobs into single switch case
- Using consistent pattern in files.go

* provisioning: require admin for files subresource at route level

Aligns route-level authorization with handler-level check in listFolderFiles.
Both now require admin role OR AccessPolicy identity for consistency.

* provisioning: restructure authorization with role-based helpers

Reorganizes authorization code for clarity:

Role-based helpers (all support AccessPolicy for ST->MT flow):
- allowForAdminsOrAccessPolicy: admin role required
- allowForEditorsOrAccessPolicy: editor role required
- allowForViewersOrAccessPolicy: viewer role required

Repository subresources by role:
- Admin: repository CRUD, test, files
- Editor: jobs, resources, sync, history
- Viewer: refs, status (GET only)

Connection subresources by role:
- Admin: connection CRUD
- Viewer: status (GET only)

* provisioning: move refs to admin-only

refs subresource now requires admin role (or AccessPolicy).
Updated documentation comments to reflect current permissions.

* provisioning: add fine-grained permissions for connections

Adds connection permissions following the same pattern as repositories:
- provisioning.connections:create
- provisioning.connections:read
- provisioning.connections:write
- provisioning.connections:delete

Roles:
- fixed:provisioning.connections:reader (granted to Admin)
- fixed:provisioning.connections:writer (granted to Admin)

* provisioning: remove non-existent sync subresource from auth

The sync subresource doesn't exist - syncing is done via the jobs endpoint.
Removed dead code from authorization switch case.

* provisioning: use access checker for fine-grained permissions

Refactors authorization to use b.access.Check() with verb-based checks:

Repository subresources:
- CRUD: uses actual verb (get/create/update/delete)
- test: uses 'update' (write permission)
- files/refs/resources/history/status: uses 'get' (read permission)
- jobs: uses actual verb for jobs resource

Connection subresources:
- CRUD: uses actual verb
- status: uses 'get' (read permission)

The access checker maps verbs to actions defined in accesscontrol.go.
Falls back to admin role for backwards compatibility.

Also removes redundant admin check from listFolderFiles since
authorization is now properly handled at route level.

* provisioning: use verb constants instead of string literals

Uses apiutils.VerbGet, apiutils.VerbUpdate instead of "get", "update".

* provisioning: use access checker for jobs and historicjobs resources

Jobs resource: uses actual verb (create/read/write/delete)
HistoricJobs resource: read-only (historicjobs:read)

* provisioning: allow viewers to access settings endpoint

Settings is read-only and needed by multiple UI pages (not just admin pages).
Stats remains admin-only.

* provisioning: consolidate role-based resource authorization

Extract isRoleBasedResource() and authorizeRoleBasedResource() helpers
to avoid duplicating settings/stats resource checks in multiple places.

* provisioning: use resource name constants instead of hardcoded strings

Replace 'repositories', 'connections', 'jobs', 'historicjobs' with
their corresponding ResourceInfo.GetName() constants.

* provisioning: delegate file authorization to connector

Route level: allow any authenticated user for files subresource
Connector: check repositories:read only for directory listing
Individual file CRUD: handled by DualReadWriter based on actual resource

* provisioning: enhance authorization for files and jobs resources

- Updated file authorization to fall back to the admin role for listing files.
- Introduced the checkAccessForJobs function to manage job permissions, allowing editors to create and manage jobs while keeping historic jobs admin-only.
- Improved error messages for permission denials.

* provisioning: refactor authorization with fine-grained permissions

Authorization changes:
- Use access checker with role-based fallback for backwards compatibility
- Repositories/Connections: admin role fallback
- Jobs: editor role fallback (editors can manage jobs)
- HistoricJobs: admin role fallback (read-only)
- Settings: viewer role (needed by multiple UI pages)
- Stats: admin role

Files subresource:
- Route level allows any authenticated user
- Directory listing checks repositories:read in connector
- Individual file CRUD delegated to DualReadWriter

Refactored checkAccessWithFallback to accept fallback role parameter.
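
The role table above can be sketched roughly as follows. This is an illustration under assumptions: the `Role` type, the `fallbackRoles` map, and the `allowed` function are hypothetical names, and settings and stats are treated as entries in the same table for simplicity. It is not the real checkAccessWithFallback:

```go
package main

import "fmt"

// Role is an ordered org role; a higher value includes the lower ones.
type Role int

const (
	RoleViewer Role = iota + 1
	RoleEditor
	RoleAdmin
)

// fallbackRoles mirrors the per-resource roles listed in the commit message.
var fallbackRoles = map[string]Role{
	"repositories": RoleAdmin,
	"connections":  RoleAdmin,
	"jobs":         RoleEditor,
	"historicjobs": RoleAdmin,
	"settings":     RoleViewer,
	"stats":        RoleAdmin,
}

// allowed grants access when the fine-grained check passed; otherwise it
// falls back to comparing the caller's role with the resource's fallback role.
func allowed(resource string, checkerAllowed bool, userRole Role) bool {
	if checkerAllowed {
		return true
	}
	required, ok := fallbackRoles[resource]
	if !ok {
		return false // unknown resources get no fallback
	}
	return userRole >= required
}

func main() {
	fmt.Println(allowed("jobs", false, RoleEditor))         // editor falls back successfully
	fmt.Println(allowed("repositories", false, RoleEditor)) // admin fallback required
}
```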

* provisioning: refactor access checker integration for improved authorization

Updated the authorization logic to utilize the new access checker across various resources, including files and jobs. This change simplifies the permission checks by removing redundant identity retrieval and enhances error handling. The access checker now supports role-based fallbacks for admin and editor roles, ensuring backward compatibility while streamlining the authorization process for repository and connection subresources.

* provisioning: remove legacy access checker tests and refactor access checker implementation

Deleted the access_checker_test.go file to streamline the codebase and focus on the updated access checker implementation. Refactored the access checker to enhance clarity and maintainability, ensuring it supports role-based fallback behavior. Updated the access checker integration in the API builder to utilize the new fallback role configuration, improving authorization logic across resources.

* refactor: split AccessChecker into TokenAccessChecker and SessionAccessChecker

- Renamed NewMultiTenantAccessChecker -> NewTokenAccessChecker (uses AuthInfoFrom)
- Renamed NewSingleTenantAccessChecker -> NewSessionAccessChecker (uses GetRequester)
- Split into separate files with their own tests
- Added mockery-generated mock for AccessChecker interface
- Names now reflect identity source rather than deployment mode

* fix: correct error message case and use accessWithAdmin for filesConnector

- Fixed error message to use lowercase 'admin role is required'
- Fixed filesConnector to use accessWithAdmin for proper role fallback
- Formatted code

* refactor: reduce cyclomatic complexity in filesConnector.Connect

Split the Connect handler into smaller focused functions:
- handleRequest: main request processing
- createDualReadWriter: setup dependencies
- parseRequestOptions: extract request options
- handleDirectoryListing: GET directory requests
- handleMethodRequest: route to method handlers
- handleGet/handlePost/handlePut/handleDelete: method-specific logic
- handleMove: move operation logic

* security: remove blind TypeAccessPolicy bypass from access checkers

Removed the code that bypassed authorization for TypeAccessPolicy identities.
All identities now go through proper permission verification via the inner
access checker, which will validate permissions from ServiceIdentityClaims.

This addresses the security concern where TypeAccessPolicy was being trusted
blindly without verifying whether the identity came from the wire or in-process.

* feat: allow editors to access repository refs subresource

Change refs authorization from admin to editor fallback so editors can
view repository branches when pushing changes to dashboards/folders.

- Split refs from other read-only subresources (resources, history, status)
- refs now uses accessWithEditor instead of accessWithAdmin
- Updated documentation comment to reflect authorization levels
- Added integration test TestIntegrationProvisioning_RefsPermissions
  verifying editor access and viewer denial

* tests: add authorization tests for missing provisioning API endpoints

Add comprehensive authorization tests for:
- Repository subresources (test, resources, history, status)
- Connection status subresource
- HistoricJobs resource
- Settings and Stats resources

All authorization paths are now covered by integration tests.

* test: fix RefsPermissions test to use GitHub repository

Use github-readonly.json.tmpl template instead of local folder,
since refs endpoint requires a versioned repository that supports
git operations.

* chore: format test files

* fix: make settings/stats authorization work in MT mode

Update authorizeRoleBasedResource to check authlib.AuthInfoFrom(ctx)
for AccessPolicy identity type in addition to identity.GetRequester(ctx).
This ensures AccessPolicy identities are recognized in MT mode where
identity.GetRequester may not set the identity type correctly.

* fix: remove unused authorization helper functions

Remove allowForAdminsOrAccessPolicy and allowForViewersOrAccessPolicy
as they are no longer used after refactoring to use authorizeRoleBasedResource.

* Fix AccessPolicy identity detection in ST authorizer

- Add check for AccessPolicy identities via GetAuthID() in authorizeRoleBasedResource
- Extended JWT may set identity type to TypeUser but AuthID is 'access-policy:...'
- Forward user ID token in X-Grafana-Id header in RoundTripper for aggregator forwarding

* Revert "Fix AccessPolicy identity detection in ST authorizer"

This reverts commit 0f4885e503.

* Add fine-grained permissions for settings and stats endpoints

- Add provisioning.settings:read action (granted to Viewer role)
- Add provisioning.stats:read action (granted to Admin role)
- Add accessWithViewer to APIBuilder for Viewer role fallback
- Use access checker for settings/stats authorization
- Remove role-based authorization functions (isRoleBasedResource, authorizeRoleBasedResource)

This makes settings and stats consistent with other provisioning resources
and works properly in both ST and MT modes via the access checker.

* Remove AUTHORIZATION_COVERAGE.md

* Add provisioning resources to RBAC mapper

- Add connections, settings, stats to provisioning.grafana.app mappings
- Required for authz service to translate K8s verbs to legacy actions
- Fixes 403 errors for settings/stats in MT mode

* refactor: merge access checkers with original fallthrough behavior

Merge tokenAccessChecker and sessionAccessChecker into a unified
access checker that implements the original fallthrough behavior:

1. First try to get identity from access token (authlib.AuthInfoFrom)
2. If token exists AND (is TypeAccessPolicy OR useExclusivelyAccessCheckerForAuthz),
   use the access checker with token identity
3. If no token or conditions not met, fall back to session identity
   (identity.GetRequester) with optional role-based fallback
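
The three steps above reduce to a single decision about which identity source to use. A hedged sketch, with hypothetical type and function names (`tokenInfo`, `identitySource`) and with `exclusive` standing in for the useExclusivelyAccessCheckerForAuthz setting:

```go
package main

import "fmt"

// tokenInfo is a hypothetical stand-in for the auth info found in the context.
type tokenInfo struct {
	isAccessPolicy bool
}

// identitySource returns "token" when the access token should drive the check
// and "session" when the checker should fall through to the session identity.
func identitySource(token *tokenInfo, exclusive bool) string {
	if token != nil && (token.isAccessPolicy || exclusive) {
		return "token"
	}
	return "session"
}

func main() {
	fmt.Println(identitySource(nil, true))                                 // no token: session
	fmt.Println(identitySource(&tokenInfo{isAccessPolicy: true}, false))   // AccessPolicy: token
	fmt.Println(identitySource(&tokenInfo{isAccessPolicy: false}, false))  // other token in ST: session
}
```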

This fixes the issue where settings/stats/connections endpoints were
failing in MT mode because the tokenAccessChecker was returning an error
when there was no auth info in context, instead of falling through to
session-based authorization.

The unified checker now properly handles:
- MT mode: tries token first, falls back to session if no token
- ST mode: only uses token for AccessPolicy identities, otherwise session
- Role fallback: applies when configured and access checker denies

* Revert "refactor: merge access checkers with original fallthrough behavior"

This reverts commit 96451f948b.

* Grant settings view role to all

* fix: use actual request verb for settings/stats authorization

Use a.GetVerb() instead of hardcoded VerbGet for settings and stats
authorization. When listing resources (hitting collection endpoint),
the verb is 'list' not 'get', and this mismatch could cause issues
with the RBAC service.

* debug: add logging to access checkers for authorization debugging

Add klog debug logs (V4 level) to token and session access checkers
to help diagnose why settings/stats authorization is failing while
connections works.

* debug: improve access checker logging with grafana-app-sdk logger

- Use grafana-app-sdk logging.FromContext instead of klog
- Add error wrapping with resource.group format for better context
- Log more details including folder, group, and allowed status
- Log error.Error() for better error message visibility

* chore: use generic log messages in access checkers

* Revert "Grant settings view role to all"

This reverts commit 3f5758cf36.

* fix: use request verb for historicjobs authorization

The original role-based check allowed any verb for admins. To preserve
this behavior with the access checker, we should pass the actual verb
from the request instead of hardcoding VerbGet.

---------

Co-authored-by: Charandas Batra <charandas.batra@grafana.com>
2025-12-19 15:11:35 +01:00
Roberto Jiménez Sánchez
a0751b6e71 Provisioning: Default to folder sync only and block new instance sync repositories (#115569)
* Default to folder sync only and block new instance sync repositories

- Change default allowed_targets to folder-only in backend configuration
- Modify validation to only enforce allowedTargets on CREATE operations
- Add deprecation warning for existing instance sync repositories
- Update frontend defaults and tests to reflect new behavior

Fixes #619

* Update warning message: change 'deprecated' to 'not fully supported'

* Fix health check: don't validate allowedTargets for existing repositories

Health checks for existing repositories should treat them as UPDATE operations,
not CREATE operations, so they don't fail validation for instance sync target.

* Fix tests and update i18n translations

- Update BootstrapStep tests to reflect folder-only default behavior
- Run i18n-extract to update translation file structure

* Fix integration tests

* Fix tests

* Fix provisioning test wizard

* Fix frontend test
2025-12-19 11:44:15 +00:00
Roberto Jiménez Sánchez
7e45a300b9 Provisioning: Remove migration from legacy storage (#112505)
* Deprecate Legacy Storage Migration in Backend

* Change the messaging around legacy storage

* Disable cards to connect

* Commit import changes

* Block repository creation if resources are in legacy storage

* Update error message

* Prettify

* chore: uncomment unified migration

* chore: adapt and fix tests

* Remove legacy storage migration from frontend

* Refactor provisioning job options by removing legacy storage and history fields

- Removed the `History` field from `MigrateJobOptions` and related references in the codebase.
- Eliminated the `LegacyStorage` field from `RepositoryViewList` and its associated comments.
- Updated tests and generated OpenAPI schema to reflect these changes.
- Simplified the `MigrationWorker` by removing dependencies on legacy storage checks.

* Refactor OpenAPI schema and tests to remove deprecated fields

- Removed the `history` field from `MigrateJobOptions` and updated the OpenAPI schema accordingly.
- Eliminated the `legacyStorage` field from `RepositoryViewList` and its associated comments in the schema.
- Updated integration tests to reflect the removal of these fields.

* Fix typescript errors

* Refactor provisioning code to remove legacy storage dependencies

- Eliminated references to `dualwrite.Service` and related legacy storage checks across multiple files.
- Updated `APIBuilder`, `RepositoryController`, and `SyncWorker` to streamline resource handling without legacy storage considerations.
- Adjusted tests to reflect the removal of legacy storage mocks and dependencies, ensuring cleaner and more maintainable code.

* Fix unit tests

* Remove more references to legacy

* Enhance provisioning wizard with migration options

- Added a checkbox for migrating existing resources in the BootstrapStep component.
- Updated the form context to track the new migration option.
- Adjusted the SynchronizeStep and useCreateSyncJob hook to incorporate the migration logic.
- Enhanced localization with new descriptions and labels for migration features.

* Remove unused variable and dualwrite reference in provisioning code

- Eliminated an unused variable declaration in `provisioning_manifest.go`.
- Removed the `nil` reference for dualwrite in `repo_operator.go`, aligning with the standalone operator's assumption of unified storage.

* Update go.mod and go.sum to include new dependencies

- Added `github.com/grafana/grafana-app-sdk` version `0.48.5` and several indirect dependencies including `github.com/getkin/kin-openapi`, `github.com/hashicorp/errwrap`, and others.
- Updated `go.sum` to reflect the new dependencies and their respective versions.

* Refactor provisioning components for improved readability

- Simplified the import statement in HomePage.tsx by removing unnecessary line breaks.
- Consolidated props in the SynchronizeStep component for cleaner code.
- Enhanced the layout of the ProvisioningWizard component by streamlining the rendering of the SynchronizeStep.

* Deprecate MigrationWorker and clean up related comments

- Removed the deprecated MigrationWorker implementation and its associated comments from the provisioning code.
- This change reflects the ongoing effort to eliminate legacy components and improve code maintainability.

* Fix linting issues

* Add explicit comment

* Update useResourceStats hook in BootstrapStep component to accept selected target

- Modified the BootstrapStep component to pass the selected target to the useResourceStats hook.
- Updated related tests to reflect the change in expected arguments for the useResourceStats hook.

* fix(provisioning): Update migrate tests to match export-then-sync behavior for all repository types

Updates test expectations for folder-type repositories to match the
implementation changes where both folder and instance repository types
now run export followed by sync. Only the namespace cleaner is skipped
for folder-type repositories.

Changes:
- Update "should run export and sync for folder-type repositories" test to include export mocks
- Update "should fail when sync job fails for folder-type repositories" test to include export mocks
- Rename test to clarify that both export and sync run for folder types
- Add proper mock expectations for SetMessage, StrictMaxErrors, Process, and ResetResults

All migrate package tests now pass.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* Update provisioning wizard text and improve resource counting display

- Enhanced descriptions for migrating existing resources to clarify that unmanaged resources will also be included.
- Refactored BootstrapStepResourceCounting component to simplify the rendering logic and ensure both external storage and unmanaged resources are displayed correctly.
- Updated alert messages in SynchronizeStep to reflect accurate information regarding resource management during migration.
- Adjusted localization strings for consistency with the new descriptions.

* Update provisioning wizard alert messages for clarity and accuracy

- Revised alert points to indicate that resources can still be modified during migration, with a note on potential export issues.
- Clarified that resources will be marked as managed post-provisioning and that dashboards remain accessible throughout the process.

* Fix issue with triggering the wrong type of job

* Fix export failure when folder already exists in repository

When exporting resources to a repository, if a folder already exists,
the Read() method would fail with "path component is empty" error.

This occurred because:
1. Folders are identified by trailing slash (e.g., "Legacy Folder/")
2. The Read() method passes this path directly to GetTreeByPath()
3. GetTreeByPath() splits the path by "/" creating empty components
4. This causes the "path component is empty" error

The fix strips the trailing slash before calling GetTreeByPath() to
avoid empty path components, while still using the trailing slash
convention to identify directories.

The Create() method already handles this correctly by appending
".keep" to directory paths, which is why the first export succeeded
but subsequent exports failed.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* Fix folder tree not updated when folder already exists in repository

When exporting resources and a folder already exists in the repository,
the folder was not being added to the FolderManager's tree. This caused
subsequent dashboard exports to fail with "folder NOT found in tree".

The fix adds the folder to fm.tree even when it already exists in the
repository, ensuring all folders are available for resource lookups.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* Revert "Merge remote-tracking branch 'origin/uncomment-unified-migration-code' into cleanup/deprecate-legacy-storage-migration-in-provisioning"

This reverts commit 6440fae342, reversing
changes made to ec39fb04f2.

* fix: handle empty folder titles in path construction

- Skip folders with empty titles in dirPath to avoid empty path components
- Skip folders with empty paths before checking if they exist in repository
- Fix unit tests to properly check useResourceStats hook calls with type annotations

* Update workspace

* Fix BootstrapStep tests after reverting unified migration merge

Updated test expectations to match the current component behavior where
resource counts are displayed for both instance and folder sync options.

- Changed 'Empty' count expectation from 3 to 4 (2 cards × 2 counts each)
- Changed '7 resources' test to use findAllByText instead of findByText
  since the count appears in multiple cards

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* Remove bubbletea deps

* Fix workspace

* provisioning: update error message to reference enableMigration config

Update the error message when provisioning cannot be used due to
incompatible data format to instruct users to enable data migration
for folders and dashboards using the enableMigration configuration
introduced in PR #114857.

Also update the test helper to include EnableMigration: true for both
dashboards and folders to match the new configuration pattern.

* provisioning: add comment explaining Mode5 and EnableMigration requirement

Add a comment in the integration test helper explaining that Provisioning
requires Mode5 (unified storage) and EnableMigration (data migration) as
it expects resources to be fully migrated to unified storage.

* Remove migrate resources checkbox from folder type provisioning wizard

- Remove checkbox UI for migrating existing resources in folder type
- Remove migrateExistingResources from migration logic
- Simplify migration to only use requiresMigration flag
- Remove unused translation keys
- Update i18n strings

* Fix linting

* Remove unnecessary React Fragment wrapper in BootstrapStep

* Address comments

---------

Co-authored-by: Rafael Paulovic <rafael.paulovic@grafana.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-17 17:22:17 +01:00
Daniele Stefano Ferru
bf2682712f Provisioning: Add Connection reference to Repository resource (#115419)
* Provisioning: Add Connection reference to Repository resource

* addressing comments
2025-12-16 17:31:53 +00:00
Daniele Stefano Ferru
5ecfc79e14 Provisioning: Add Connection resource (#115272)
* Provisioning: Add Connection resource

* adding some more integration tests

* updating openapi snapshot, linting

* generating FE code, fixing issue in unit tests

* addressing comments

* addressing comments

* adding more integration tests

* fixing rebase issues

* removing linting exception

* addressing comments: improving validation and tests

* adding Connection URL at mutation time, updating tests accordingly

* linting
2025-12-16 14:37:07 +01:00
Gonzalo Trigueros Manzanas
d4a627c5fc Provisioning: Add resource-level warning support. (#115023)
2025-12-12 15:59:45 +01:00
Stephanie Hingtgen
db9afe31e4 Provisioning: Fix panic on watcher when channel is closed (#114439)
2025-12-01 08:24:03 +03:00
Daniele Stefano Ferru
8e4be891c5 Provisioning: add URL and Path in setting response (#114534)
* Provisioning: add URL and Path in setting response

* linting

* marking fields as non-required
2025-11-27 16:06:03 +01:00
Roberto Jiménez Sánchez
a4cbbe10c0 Provisioning: Add retry logic for nanogit client operations (#114216)
* chore(deps): update nanogit to v0.3.0 in go.mod and go.sum files

* Add retry logic for nanogit client operations

- Configure retry logic in withGitContext to ensure all Git operations have retry support
- Use nanogit's ExponentialBackoffRetrier with 8 attempts (~10s retry window)
- Retry transient network errors and HTTP-specific server errors (5xx for GET/DELETE, 429 for all)
- Rename logger function to withGitContext to better reflect its responsibilities

* fix: resolve staticcheck S1008 linting issue in retry_client.go

Simplify return statement to use errors.As directly instead of if-return pattern

* Revert "fix: resolve staticcheck S1008 linting issue in retry_client.go"

This reverts commit bd367b5629.
2025-11-20 13:55:45 +00:00
Roberto Jiménez Sánchez
cdc6a6114c Provisioning: Improve logging and tracing in job processing (#113454)
* Provisioning: Improve logging and tracing in job processing

- Add comprehensive tracing with OpenTelemetry spans across all job operations
- Enhance logging with consistent style: lowercase, concise messages, appropriate log levels
- Use past tense for completed lifecycle events (e.g., 'stopped' vs 'stop')
- Add structured logging with contextual attributes for better searchability
- Handle graceful shutdowns without throwing errors on context cancellation
- Refactor Cleanup method into listExpiredJobs and cleanUpExpiredJob for better code quality
- Avoid double logging by only logging errors when handled locally
- Add tracing and logging to historyjob controller cleanup operations

Files modified:
- pkg/registry/apis/provisioning/jobs/driver.go: Add tracing spans and improve error handling for graceful shutdown
- pkg/registry/apis/provisioning/jobs/concurrent_driver.go: Add tracing and consistent logging
- pkg/registry/apis/provisioning/jobs/persistentstore.go: Add comprehensive tracing and logging to all public methods, refactor cleanup
- apps/provisioning/pkg/controller/historyjob.go: Add tracing and improve logging consistency

* Update pkg/registry/apis/provisioning/jobs/persistentstore.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Refactor logging in persistentstore.go

- Remove debug log statements at the start of job operations for cleaner output
- Maintain structured logging with contextual attributes for improved traceability

Files modified:
- pkg/registry/apis/provisioning/jobs/persistentstore.go: Clean up logging for job operations

* Enhance logging and tracing in provisioning job operations

- Introduce OpenTelemetry spans for better observability in job processing and webhook handling
- Improve structured logging with contextual attributes for key operations
- Remove unnecessary tracing spans in long-running functions to streamline performance
- Update error handling to record errors in spans for better traceability

Files modified:
- pkg/registry/apis/provisioning/controller/repository.go: Add tracing and structured logging to sync job operations
- pkg/registry/apis/provisioning/jobs/concurrent_driver.go: Remove tracing span from long-running function
- pkg/registry/apis/provisioning/jobs/driver.go: Enhance logging and tracing in job processing
- pkg/registry/apis/provisioning/webhooks/webhook.go: Implement tracing and structured logging for webhook connections

* Update pkg/registry/apis/provisioning/jobs/driver.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Improve error handling in ConcurrentJobDriver to differentiate between graceful shutdown and unexpected stops

* Remove unused import in driver.go to clean up code

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-12 14:59:27 +01:00
Roberto Jiménez Sánchez
02464c19b8 Provisioning: Add validation for Job specifications (#113590)
* Validate Job Specs

* Add comprehensive unit test coverage for job validator

- Added 8 new test cases to improve coverage from 88.9% to ~100%
- Tests for migrate action without options
- Tests for delete/move actions with resources (missing kind)
- Tests for move action with valid resources
- Tests for move/delete with both paths and resources
- Tests for move action with invalid source paths
- Tests for push action with valid paths

Now covers all validation paths including resource validation and
edge cases for all job action types.

* Add integration tests for job validation

Added comprehensive integration tests that verify the job validator properly
rejects invalid job specifications via the API:

- Test job without action (required field)
- Test job with invalid action
- Test pull job without pull options
- Test push job without push options
- Test push job with invalid branch name (consecutive dots)
- Test push job with path traversal attempt
- Test delete job without paths or resources
- Test delete job with invalid path (path traversal)
- Test move job without target path
- Test move job without paths or resources
- Test move job with invalid target path (path traversal)
- Test migrate job without migrate options
- Test valid pull job to ensure validation doesn't block legitimate requests

These tests verify that the admission controller properly validates job specs
before they are persisted, ensuring security (path traversal prevention) and
data integrity (required fields/options).

* Remove valid job test case from integration tests

Removed the positive test case as it's not necessary for validation testing.
The integration tests now focus solely on verifying that invalid job specs
are properly rejected by the admission controller.

* Fix movejob_test to expect validation error at creation time

Updated the 'move without target path' test to expect the job creation
to fail with a validation error, rather than expecting the job to be
created and then fail during execution.

This aligns with the new job validation logic which rejects invalid
job specs at the API admission control level (422 Unprocessable Entity)
before they can be persisted.

This is better behavior as it prevents invalid jobs from being created
in the first place, rather than allowing them to be created and then
failing during execution.

* Simplify action validation using slices.Contains

Replaced manual loop with slices.Contains for cleaner, more idiomatic Go code.
This reduces code complexity while maintaining the same validation logic.

- Added import for 'slices' package
- Replaced 8-line loop with 1-line slices.Contains call
- All unit tests pass
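
The change described above can be sketched as follows. This is an illustration only: the `validActions` slice and both function names are hypothetical, and the action strings are simply the ones named in this commit message.

```go
package main

import (
	"fmt"
	"slices"
)

// validActions is a hypothetical list built from the actions named in the
// commit message; the real definitions live in the provisioning API types.
var validActions = []string{"pull", "push", "delete", "move", "migrate"}

// isValidActionLoop is the manual-loop form that the commit replaces.
func isValidActionLoop(action string) bool {
	for _, candidate := range validActions {
		if candidate == action {
			return true
		}
	}
	return false
}

// isValidAction is the equivalent one-line form using slices.Contains
// (standard library, Go 1.21 and later).
func isValidAction(action string) bool {
	return slices.Contains(validActions, action)
}

func main() {
	for _, action := range []string{"pull", "bogus", ""} {
		fmt.Printf("%q: loop=%v contains=%v\n", action, isValidActionLoop(action), isValidAction(action))
	}
}
```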

* Refactor job action validation in ValidateJob function

Removed the hardcoded valid actions array and simplified the validation logic. The function now directly appends an error for invalid actions, improving code clarity and maintainability. This change aligns with the recent updates to job validation, ensuring that invalid job specifications are properly handled.
2025-11-07 16:31:50 +00:00
Ryan McKinley
5c5ecac6ee Provisioning: Ensure name and email are always set for the AuthorSignature (#112594)
* all properties

* lint
2025-10-20 08:45:02 +00:00
Ryan McKinley
f1e456eb01 Provisioning: Watch file system for changes (#112184)
* trigger sync on any change

* better comments

* add deletes to test

* Update apps/provisioning/pkg/repository/local/watch.go

* Update pkg/services/provisioning/dashboards/file_reader.go

* Update apps/provisioning/pkg/repository/local/watch.go

---------

Co-authored-by: Stephanie Hingtgen <stephanie.hingtgen@grafana.com>
2025-10-10 17:26:59 +00:00
Stephanie Hingtgen
047c51be01 Provisioning: Do full sync on resync period when needed (#112144) 2025-10-08 05:25:57 -06:00
Stephanie Hingtgen
d5d1851bc1 Provisioning: Cleanup folders properly with webhooks (#112031) 2025-10-04 21:17:42 +00:00
Ryan McKinley
2486dba881 Provisioning: use kind consistently for provisioning stats (#111977) 2025-10-04 09:02:02 -05:00
Stephanie Hingtgen
3ce9137c19 Provisioning: Refactor to combine validation and test endpoint logic (#111965)
Provisioning: Refactor test endpoint
2025-10-03 01:16:37 -06:00
Stephanie Hingtgen
044407d9dc Provisioning: Allow configurable min interval (#111920) 2025-10-02 09:05:09 +03:00
Stephanie Hingtgen
6f0a8a344a Provisioning: Fix repos stuck in deletion (#111918) 2025-10-02 09:04:55 +03:00
Costa Alexoglou
1b766b9c9f Provisioning finalisers fix 2 (#111679)
* adding some logs to better understand what might be happening

* only focus this PR on improving logging in finalizer handling

* debug log before calling finalizers

* working on finalizers

* removing last todos, adding unit tests

* better use SupportedFinalizers name

* addressing comments

* wip: fix tests and add delete error in status

* chore: codegen

* chore: codegen openapi

* Merge remote-tracking branch 'origin/main' into provisioning-finalisers-fix-2

* update frontend client

* fix: errors in testing

* fix: breaking test

---------

Co-authored-by: Daniele Ferru <daniele.ferru@grafana.com>
Co-authored-by: Ryan McKinley <ryantxu@gmail.com>
2025-09-29 15:21:12 +02:00
Costa Alexoglou
31ae013e8d chore: add validations to test endpoint (#111622)
* chore: add validations to test endpoint

* Validate path

---------

Co-authored-by: Clarity-89 <homes89@ukr.net>
2025-09-25 15:10:13 +00:00
Charandas
64c700e563 Provisioning: kind name should be singular (#111570) 2025-09-24 15:25:41 -07:00
Costa Alexoglou
0c0554da5e fix: avoid child paths in repositories (#111573)
* fix: avoid child paths in repositories

* add another unit test; fix linter

* Update pkg/registry/apis/provisioning/register.go

* skip itself

* fix: failing tests

---------

Co-authored-by: Stephanie Hingtgen <stephanie.hingtgen@grafana.com>
2025-09-24 21:35:06 +00:00
Stephanie Hingtgen
8b1caccc72 Provisioning: Add metrics for repo controller (#111450) 2025-09-22 20:14:03 +00:00
Stephanie Hingtgen
bd550d2f06 Provisioning: Wire up prometheus (#111444) 2025-09-22 09:54:50 -05:00
Stephanie Hingtgen
15ee224da5 Provisioning: Allow disabling of image rendering instance wide (#111359) 2025-09-19 12:40:14 +03:00
Stephanie Hingtgen
cb11bc15fa Provisioning: Allow disabling of instance sync (#111270)
---------

Co-authored-by: Ryan McKinley <ryantxu@gmail.com>
Co-authored-by: Alex Khomenko <Clarity-89@users.noreply.github.com>
2025-09-18 10:40:02 -05:00
Costa Alexoglou
0248a393d7 fix: dashboard upsert with empty ref (#111190)
* fix: dashboard upsert with empty ref

* chore: cleanup

* fix: branches in other git providers and linter
2025-09-17 11:49:15 +02:00
Ryan McKinley
0a79b3bdc5 Chore: Upgrade k8s.io/api v0.34.1 and grafana-app-sdk v0.43.1 (#111009) 2025-09-16 13:35:20 +03:00
Daniele Stefano Ferru
1f7afc6b6a Provisioning: add unit and integration tests for finalizer validation (#111012)
* Add unit tests

* add integration tests
2025-09-12 13:57:31 +02:00
Daniele Stefano Ferru
6b2b949f8f Provisioning: check finalizers when validating Repository object (#110955) 2025-09-11 21:38:41 -05:00
Roberto Jiménez Sánchez
09ef9c8176 Provisioning: Remove again dependency cycle between provisioning app and grafana (#110863)
* Remove dependency cycle between provisioning app and grafana

* Format code

* Fix linting
2025-09-10 14:40:44 +02:00
Stephanie Hingtgen
323738d191 Provisioning: Fix check of who can update (#110835) 2025-09-10 09:04:10 +03:00
Stephanie Hingtgen
8805e93b1d Provisioning: Add better nil check (#110847) 2025-09-09 18:23:11 -05:00
Roberto Jiménez Sánchez
acbc2cf01a Provisioning: Configurable Repository Types in monolith and operators (#110822)
* Configurable repository types in monolith and operator

* Default to Github in operators

* Regenerate wire

* Fix and implement unit tests

* Same types for enterprise tests

* Remove unnecessary conversion

* Remove the issue with import cycles
2025-09-09 19:13:22 +02:00
Daniele Stefano Ferru
76f7836419 Provisioning: correctly use resource clients in controllers (#110737)
* Provisioning: correctly use resource clients in controllers

* better names on fields

* fix struct initialisation

* updating roundtripper tests
2025-09-06 18:13:39 -06:00
Stephanie Hingtgen
9ddc70423b Provisioning: Cleanup tester interface (#110640)
* Provisioning: Cleanup tester interface

* undo accidental change

* cleanup

* cleanup test
2025-09-05 07:47:27 +02:00
Stephanie Hingtgen
b567cde3d3 Provisioning: Reuse controller from registry (#110639) 2025-09-05 01:13:55 +00:00
Roberto Jiménez Sánchez
3d009ff7ed Provisioning: Build and use repository factory in repository controller (#110585)
Build and use repository factory
2025-09-04 13:12:56 +02:00
Roberto Jiménez Sánchez
7d630ec3b1 Provisioning: Refactor tweaks to support MT controllers (#110581)
* Refactor common code to support MT controllers

* Delete original status files
2025-09-04 10:06:50 +00:00
Stephanie Hingtgen
84ae9ea71b Provisioning: Add scaffolding for repo controller (#110543) 2025-09-03 17:30:41 +00:00
Matheus Macabu
1e926a29c0 Secrets: Extract external facing decrypt types to apps (#110432) 2025-09-02 10:30:29 +02:00
Roberto Jiménez Sánchez
4eadc823a9 Provisioning: Move repository package to provisioning app (#110228)
* Move repository package to apps

* Move operators to grafana/grafana

* Go mod tidy

* Own package by git sync team for now

* Merged

* Do not use settings in local extra

* Remove dependency on webhook extra

* Hack to work around issue with secure contracts

* Sync Go modules

* Revert "Move operators to grafana/grafana"

This reverts commit 9f19b30a2e.
2025-09-02 09:45:44 +02:00
Roberto Jiménez Sánchez
4de9ec7310 Provisioning: Fix import cycle between grafana and provisioning app (#110406)
* Move operators to grafana/grafana

* Go mod tidy
2025-09-01 13:29:34 +00:00
Stephanie Hingtgen
232d68fb8c Controllers: Make available as a target (#110357)
* Controllers: Add to build process
* Allow setting through env variables
2025-08-30 12:27:50 +02:00
Roberto Jiménez Sánchez
93a35fc7be Provisioning: Move apifmt, loki and safepath to provisioning app (#110226)
* Move apifmt

* Move safepath

* Move Loki package

* Regenerate Loki mock

* Missing file for Loki
2025-08-27 13:26:48 -05:00
Roberto Jiménez Sánchez
e7ccefcf92 Provisioning: Add Standalone Job Controller Without Job Processing (#109610)
* Add standalone job controller
* Add makefile
* Add limit on the current implementation
* Move job controllers to app package
* Add TLS flags
2025-08-25 08:48:40 +00:00
Ryan McKinley
ce65391067 Provisioning: Use inline secrets for gitsync (#109908)
Co-authored-by: Clarity-89 <homes89@ukr.net>
Co-authored-by: Roberto Jimenez Sanchez <roberto.jimenez@grafana.com>
2025-08-22 18:38:28 +02:00
Roberto Jiménez Sánchez
61d137992b Provisioning: Mark repository as unhealthy if hooks fail (#109788) 2025-08-21 08:32:23 +00:00
Ryan McKinley
fa81fae1e3 Provisioning: Add inline secure values to repository schema (#109594) 2025-08-20 09:05:41 +00:00