Discourage third party software on the upstream cluster

Signed-off-by: Silvio Moioli <silvio@moioli.net>
Silvio Moioli
2025-04-01 18:51:58 +02:00
parent 148c5da3ad
commit d497641fa5
18 changed files with 168 additions and 120 deletions
@@ -10,7 +10,7 @@ This guide contains our recommendations for running the Rancher server, and is i
### Recommended Architecture and Infrastructure
Refer to this [guide](tips-for-running-rancher.md) for our general advice for setting up the Rancher server on a high-availability Kubernetes cluster.
Refer to this [guide](tips-for-running-rancher.md) for our general advice for setting up the Rancher server for a production installation.
### Deployment Strategies
@@ -14,8 +14,32 @@ If you are installing Rancher in a vSphere environment, refer to the best practi
When you set up your high-availability Rancher installation, consider the following:
### Run Rancher on a Separate Cluster
Don't run other workloads or microservices in the Kubernetes cluster that Rancher is installed on.
### Minimize Third-Party Software on the Upstream Cluster
Running Rancher, especially as the number of managed clusters, nodes, and workloads increases, can place a significant load on core Kubernetes components like `etcd` and `kube-apiserver` on the upstream cluster. Third-party software can interfere with the performance of these components and Rancher itself, potentially causing issues.
Every third-party application introduces a risk of interference. To minimize performance and incompatibility issues on the upstream cluster, avoid deploying any applications or components other than essential Kubernetes system components and Rancher.
The following applications and components generally do not interfere with Rancher or Kubernetes system performance:
* Rancher internal components, such as Fleet
* Rancher extensions
* Cluster API components
* CNIs
* Cloud controller managers
* Observability and monitoring tools (with the exception of prometheus-rancher-exporter)
* [SUSE Private Registry](https://documentation.suse.com/cloudnative/suse-private-registry/html/private-registry/index.html)
Remember that each of these has its own minimum resource requirements, which must be met in addition to Rancher's requirements.
In particular, SUSE Private Registry can require significant bandwidth for serving images. Ensure sufficient bandwidth is available (and ideally, reserved using Quality of Service mechanisms) for Rancher.
In high-scale scenarios, consider dedicating separate nodes to non-Rancher software to minimize interference.
The following software has been found to interfere with Rancher performance at scale and is therefore discouraged on the upstream cluster:
* [Crossplane](https://www.crossplane.io/)
* [Argo CD](https://argoproj.github.io/cd/)
* [Flux](https://fluxcd.io/)
* [prometheus-rancher-exporter](https://github.com/David-VTUK/prometheus-rancher-exporter) (see [issue 33](https://github.com/David-VTUK/prometheus-rancher-exporter/issues/33))
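As a rough illustration of how the discouraged list above could be used in an audit, the following Python sketch flags workload names that appear to belong to one of those projects. The name fragments and the `namespace/name` input format are assumptions made for this example, not something Rancher defines, and plain substring matching can produce false positives (for example, `influxdb` contains `flux`).

```python
# Hypothetical name fragments for the discouraged software listed above.
# Adjust them to the namespaces and workload names used in your cluster.
DISCOURAGED = {
    "Crossplane": ("crossplane",),
    "Argo CD": ("argocd", "argo-cd"),
    "Flux": ("flux",),
    "prometheus-rancher-exporter": ("prometheus-rancher-exporter",),
}


def find_discouraged(workloads):
    """Return (workload, product) pairs for names containing a known fragment."""
    hits = []
    for name in workloads:
        lowered = name.lower()
        for product, fragments in DISCOURAGED.items():
            if any(fragment in lowered for fragment in fragments):
                hits.append((name, product))
                break  # report each workload at most once
    return hits
```

The list of workload names would typically come from an inventory of the upstream cluster, such as the namespaced names of its Deployments; how that list is collected is left out of this sketch.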
### Make sure nodes are configured correctly for Kubernetes
It's important to follow Kubernetes and etcd best practices when deploying your nodes: disable swap; verify full network connectivity between all machines in the cluster; use unique hostnames, MAC addresses, and product_uuids for every node; check that all required ports are open; and back etcd with SSD storage. More details can be found in the [Kubernetes docs](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#before-you-begin) and [etcd's performance op guide](https://etcd.io/docs/v3.5/op-guide/performance/).
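The uniqueness requirement above can be checked mechanically once node details have been collected. The following Python sketch assumes a hypothetical inventory, a list of dictionaries with `hostname`, `mac`, and `product_uuid` keys, and reports any value shared by more than one node. How the inventory is gathered is outside the scope of this example.

```python
from collections import Counter

# Fields that must be unique for every node in the cluster.
IDENTITY_FIELDS = ("hostname", "mac", "product_uuid")


def find_duplicate_identities(nodes):
    """Map each identity field to the sorted values used by more than one node."""
    duplicates = {}
    for field in IDENTITY_FIELDS:
        counts = Counter(node[field] for node in nodes)
        repeated = sorted(value for value, count in counts.items() if count > 1)
        if repeated:
            duplicates[field] = repeated
    return duplicates
```

An empty result means no duplicates were found in the supplied inventory. It does not cover the other checks, such as swap, ports, or connectivity.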
@@ -23,23 +23,7 @@ When scaling up Rancher, one typical bottleneck is resource growth in the upstre
### Minimizing Third-Party Software on the Upstream Cluster
Running Rancher at scale can put significant load on internal Kubernetes components, such as `etcd` or `kube-apiserver`. Issues may arise if third-party software interferes with the performance of those components or with Rancher.
Every third-party piece of software carries a risk of interference. To prevent performance issues on the upstream cluster, you should avoid running any other apps or components, beyond Kubernetes system components and Rancher itself.
Software in the following categories generally won't interfere with Rancher or Kubernetes system performance:
* Rancher internal components, such as Fleet
* Rancher extensions
* Cluster API components
* CNIs
* Cloud controller managers
* Observability and monitoring tools (with the exception of prometheus-rancher-exporter)
On the other hand, the following software has been found to interfere with Rancher performance at scale:
* [Crossplane](https://www.crossplane.io/)
* [Argo CD](https://argoproj.github.io/cd/)
* [Flux](https://fluxcd.io/)
* [prometheus-rancher-exporter](https://github.com/David-VTUK/prometheus-rancher-exporter) (see [issue 33](https://github.com/David-VTUK/prometheus-rancher-exporter/issues/33))
The guidance on third-party software in the [general Rancher recommendations](./tips-for-running-rancher.md#minimize-third-party-software-on-the-upstream-cluster) is particularly important in a high-scale context.
### Managing Your Object Counts