rewording after feedback from Support about registries

Signed-off-by: Silvio Moioli <silvio@moioli.net>
This commit is contained in:
Silvio Moioli
2025-06-04 14:18:45 +02:00
parent 674ef8b725
commit 2a9af4d13a
7 changed files with 96 additions and 55 deletions
@@ -28,20 +28,28 @@ The following applications and components generally do not interfere with Ranche
* Cluster API components
* CNIs, CPIs, CSIs
* Cloud controller managers
* Observability and monitoring tools (with the exception of prometheus-rancher-exporter)
* the [Harbor](https://goharbor.io/) container registry
* Observability and monitoring tools (except prometheus-rancher-exporter)
Note that each of these components has its own minimum resource requirements, which must be met in addition to Rancher's.
Note that each of these components has its own minimum resource requirements, which must be met in addition to Rancher's. For high-scale deployments, also consider dedicating separate nodes to non-Rancher software using [taints and tolerations](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/) to minimize interference.
Container registries, in particular, can require significant bandwidth for serving images. Ensure sufficient bandwidth is available, ideally reserved using Quality of Service (QoS) mechanisms, for Rancher.
For high-scale deployments, consider dedicating separate nodes to non-Rancher software using [taints and tolerations](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/) to minimize interference.
The following software can interfere with Rancher performance at scale and is therefore discouraged on the upstream cluster:
The following software can interfere with Rancher performance and is therefore discouraged on the upstream cluster:
* [CrossPlane](https://www.crossplane.io/)
* [Argo CD](https://argoproj.github.io/cd/)
* [Flux](https://fluxcd.io/)
* [prometheus-rancher-exporter](https://github.com/David-VTUK/prometheus-rancher-exporter) (see [issue 33](https://github.com/David-VTUK/prometheus-rancher-exporter/issues/33))
* Container registries such as [Harbor](https://goharbor.io/), which can require significant bandwidth for serving images
### Guidance for Container Registries
Container registries, such as [Harbor](https://goharbor.io/), can consume significant network bandwidth when serving images. This demand increases with the number of images, the frequency of image pulls, and the quantity of clusters and container runtimes they serve. Due to this potential for interference with Rancher UI and API traffic, we recommend against running container registries on the same cluster as the Rancher management server.
Regardless of your deployment strategy for a container registry, ensure sufficient bandwidth is available, ideally reserved using Quality of Service (QoS) mechanisms.
Consider the following recommendations based on your needs:
* **Simple Setups (HA Not a Primary Concern):** A container registry deployed as a single Virtual Machine (VM) can be a viable solution.
* **High Availability (HA) Requirements:** We recommend running the registry in a dedicated Kubernetes cluster. All other clusters should then be configured to pull images from this centralized, HA registry.
* **Very Large-Scale or Complex Network Topologies:** Multiple registry clusters might be necessary. These can be deployed in a hierarchical or federated model to efficiently distribute images and manage traffic.
### Make sure nodes are configured correctly for Kubernetes
It's important to follow K8s and etcd best practices when deploying your nodes, including disabling swap, double checking you have full network connectivity between all machines in the cluster, using unique hostnames, MAC addresses, and product_uuids for every node, checking that all correct ports are opened, and deploying with ssd backed etcd. More details can be found in the [kubernetes docs](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#before-you-begin) and [etcd's performance op guide](https://etcd.io/docs/v3.5/op-guide/performance/).
@@ -63,4 +71,3 @@ However, metrics-driven capacity planning analysis should be the ultimate guidan
Using Rancher, you can monitor the state and processes of your cluster nodes, Kubernetes components, and software deployments through integration with Prometheus, a leading open-source monitoring solution, and Grafana, which lets you visualize the metrics from Prometheus.
After you [enable monitoring](../../../integrations-in-rancher/monitoring-and-alerting/monitoring-and-alerting.md) in the cluster, you can set up alerts to let you know if your cluster is approaching its capacity. You can also use the Prometheus and Grafana monitoring framework to establish a baseline for key metrics as you scale.