mirror of
https://github.com/rancher/rancher-docs.git
synced 2026-05-21 20:35:27 +00:00
Add v2.14 preview docs (#2212)
+100
@@ -0,0 +1,100 @@
---
title: Prerequisites
---

<head>
  <link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/cloud-marketplace/aws-cloud-marketplace/adapter-requirements"/>
</head>

### 1. Setting Up License Manager and Purchasing Support

First, complete the [first step](https://docs.aws.amazon.com/license-manager/latest/userguide/getting-started.html) of the License Manager one-time setup.
Next, go to the AWS Marketplace, locate the "Rancher Premium Support Billing Container Starter Pack", and purchase at least one entitlement.

If you have installed Rancher using the "Rancher Setup" AWS Marketplace offering, skip to [Step 4](#4-create-an-oidc-provider).

> **Note:** Each entitlement grants access to support for a certain number of nodes. You can purchase more entitlements later as needed.

### 2. Create an EKS Cluster

Follow the [Rancher docs](../../../getting-started/installation-and-upgrade/install-upgrade-on-a-kubernetes-cluster/rancher-on-amazon-eks.md) to create an EKS cluster. When you get to the [final step to install Rancher](../../../getting-started/installation-and-upgrade/install-upgrade-on-a-kubernetes-cluster/rancher-on-amazon-eks.md#8-install-the-rancher-helm-chart), **stop and return to this page**. The cluster must meet the following requirements:

- EKS version 1.22.
- Each node in the cluster has access to the registry containing Rancher and its related images.
- Each node in the cluster has access to the ECR repo storing the CSP Adapter.
- Each node in the cluster has access to the License Manager service.
- Each node in the cluster has access to global endpoints for the STS service.

### 3. Install Rancher

In addition to the options specified to install Rancher in the [Rancher docs](../../../getting-started/installation-and-upgrade/install-upgrade-on-a-kubernetes-cluster/rancher-on-amazon-eks.md#8-install-the-rancher-helm-chart), you must also enable extra metrics.
You can do this with the Helm CLI by adding the following options:

```bash
--set extraEnv\[0\].name="CATTLE_PROMETHEUS_METRICS" --set-string extraEnv\[0\].value=true
```

Alternatively, set the same value in a `values.yaml` file:

```yaml
extraEnv:
  - name: "CATTLE_PROMETHEUS_METRICS"
    value: "true"
```

You must also install Rancher version 2.6.7 or higher.

### 4. Create an OIDC Provider

Follow the [AWS documentation](https://docs.aws.amazon.com/eks/latest/userguide/enable-iam-roles-for-service-accounts.html) to create an OIDC provider for the EKS cluster that Rancher is installed on.

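The IAM trust policy in Step 5 needs the region and the ID of this OIDC provider. The following is a minimal sketch of how both could be derived from the cluster's OIDC issuer URL, assuming the URL has the form `https://oidc.eks.<region>.amazonaws.com/id/<id>`. The sample URL and the variable names are illustrative, not values from your account.

```bash
#!/usr/bin/env bash
# Illustrative issuer URL. On a real cluster it can be read with:
#   aws eks describe-cluster --name <cluster-name> \
#     --query "cluster.identity.oidc.issuer" --output text
OIDC_ISSUER="https://oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"

# The provider ID is everything after the last "/".
MY_OIDC_PROVIDER="${OIDC_ISSUER##*/}"

# The region is the third dot-separated field of the host name.
host="${OIDC_ISSUER#https://}"   # strip the scheme
host="${host%%/*}"               # strip the path
MY_AWS_REGION="$(printf '%s' "$host" | cut -d. -f3)"

echo "region=${MY_AWS_REGION} id=${MY_OIDC_PROVIDER}"
```

Only shell parameter expansion and `cut` are used, so the snippet runs without AWS credentials once the issuer URL is known.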
### 5. Create an IAM Role

An IAM role is required for the CSP adapter to check in and check out entitlements.

First, configure the trust policy as shown below. Replace `${MY_AWS_ACC}` with your AWS account number, `${MY_AWS_REGION}` with your AWS region, and `${MY_OIDC_PROVIDER}` with the ID of your OIDC provider:

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::${MY_AWS_ACC}:oidc-provider/oidc.eks.${MY_AWS_REGION}.amazonaws.com/id/${MY_OIDC_PROVIDER}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    "oidc.eks.${MY_AWS_REGION}.amazonaws.com/id/${MY_OIDC_PROVIDER}:sub": "system:serviceaccount:cattle-csp-adapter-system:rancher-csp-adapter",
                    "oidc.eks.${MY_AWS_REGION}.amazonaws.com/id/${MY_OIDC_PROVIDER}:aud": "sts.amazonaws.com"
                }
            }
        }
    ]
}
```

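Because the placeholders in the trust policy use shell-style `${...}` syntax, one way to fill them in is a shell heredoc. The sketch below assumes placeholder values for the account number, region, and provider ID, and writes the result to a file named `trust-policy.json` (a hypothetical file name).

```bash
#!/usr/bin/env bash
# Placeholder values; substitute your own.
MY_AWS_ACC="123456789012"
MY_AWS_REGION="us-east-1"
MY_OIDC_PROVIDER="EXAMPLED539D4633E53DE1B71EXAMPLE"

OIDC_HOST="oidc.eks.${MY_AWS_REGION}.amazonaws.com/id/${MY_OIDC_PROVIDER}"

# An unquoted heredoc delimiter lets the shell expand the variables.
cat > trust-policy.json <<EOF
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::${MY_AWS_ACC}:oidc-provider/${OIDC_HOST}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    "${OIDC_HOST}:sub": "system:serviceaccount:cattle-csp-adapter-system:rancher-csp-adapter",
                    "${OIDC_HOST}:aud": "sts.amazonaws.com"
                }
            }
        }
    ]
}
EOF

echo "Wrote trust-policy.json"
```

Rendering the policy to a file keeps the substituted values reviewable before the role is created.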
Next, attach a policy to the role that grants the following permissions:

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "RancherCSPAdapterPermissions",
            "Effect": "Allow",
            "Action": [
                "license-manager:ListReceivedLicenses",
                "license-manager:CheckoutLicense",
                "license-manager:ExtendLicenseConsumption",
                "license-manager:CheckInLicense",
                "license-manager:GetLicense",
                "license-manager:GetLicenseUsage"
            ],
            "Resource": "*"
        }
    ]
}
```

Save the name of the role. You will need it later when installing the CSP adapter.
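If you prefer the AWS CLI over the console, the role could be created roughly as follows. This is a sketch under assumptions: the role name, the policy name, and the file names `trust-policy.json` and `permissions-policy.json` (holding the two JSON documents above) are all hypothetical. By default the script only prints the commands so that they can be reviewed first.

```bash
#!/usr/bin/env bash
# Hypothetical names; choose your own.
ROLE_NAME="rancher-csp-adapter-role"
POLICY_NAME="rancher-csp-adapter-permissions"

# With DRY_RUN=1 (the default here) commands are printed instead of executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Create the role with the trust policy from the first JSON document.
run aws iam create-role \
  --role-name "$ROLE_NAME" \
  --assume-role-policy-document file://trust-policy.json

# Attach the License Manager permissions as an inline policy.
run aws iam put-role-policy \
  --role-name "$ROLE_NAME" \
  --policy-name "$POLICY_NAME" \
  --policy-document file://permissions-policy.json
```

Setting `DRY_RUN=0` in the environment executes the commands instead of printing them.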
+40
@@ -0,0 +1,40 @@
---
title: AWS Marketplace Integration
---

<head>
  <link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/cloud-marketplace/aws-cloud-marketplace"/>
</head>

## Overview

Rancher offers an integration with the AWS Marketplace which allows users to purchase a support contract with SUSE. This integration allows you to easily adjust your support needs as you start to support more clusters.

## Required Skills

At a minimum, users are expected to have a working knowledge of EKS and peripheral functions such as IAM policies and roles, Route 53 DNS, and the use of the AWS CLI and Helm commands.

## Limitations

- You must be running Rancher v2.6.7 or higher.
- Rancher must be deployed with additional metrics enabled.
- Rancher must be installed on an EKS cluster.
- You must purchase at least one entitlement to Rancher support through AWS Marketplace.
- You may need additional setup to support proxy/airgap use cases. See the [prerequisites](adapter-requirements.md) for more information.

## How to Use

1. Complete the [prerequisite steps](adapter-requirements.md).
2. [Install the CSP Adapter](install-adapter.md).

## FAQ

**Can I purchase support for more nodes later on?**

Yes. Simply go to the AWS Marketplace entry that you used to initially purchase support and increase the number of entitlements.

**Can I use multiple instances of Rancher in the same AWS account?**

Yes. However, each cluster that Rancher is installed in will need to adhere to the prerequisites.

In addition, keep in mind that a given entitlement can only be used by one Rancher management server at a time.
+32
@@ -0,0 +1,32 @@
---
title: Common Issues
---

<head>
  <link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/cloud-marketplace/aws-cloud-marketplace/common-issues"/>
</head>

**After installing the adapter, a banner message appears in Rancher that says "AWS Marketplace Adapter: Unable to run the adapter, please check the adapter logs"**

This error indicates that while the adapter was installed into the cluster, an error has occurred which prevents it from properly checking in or checking out licenses.

This often occurs because the IAM role was not set up properly. Review the [prerequisites](./adapter-requirements.md) and verify that:

- An OIDC provider has been created/associated with the cluster Rancher is running on.
- The IAM role has been configured to trust this OIDC provider.
- The IAM role has at least the permissions outlined in the policy.

If all of the above have been configured correctly, reach out to support for assistance.

**I see a banner message that states, "AWS Marketplace Adapter: You have exceeded your licensed node count. At least x more license(s) are required in AWS to become compliant"**

This message indicates that you do not have enough entitlements for the number of nodes Rancher is currently managing.

Keep in mind the following limitations:

- Each entitlement is valid for a certain number of nodes.
- Every node currently managed by Rancher counts toward your usage total (with the exception of nodes in the cluster Rancher is installed on).
- Each entitlement can be used by at most one Rancher instance. For example, if you have two running Rancher instances in your account (each installed on a separate EKS cluster), then you will need at least two entitlements.

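The rules above amount to a ceiling division per Rancher instance. The sketch below works through one example. All three input numbers are hypothetical; in particular, the number of nodes covered by one entitlement must be taken from your Marketplace listing.

```bash
#!/usr/bin/env bash
# Hypothetical inputs for a single Rancher instance.
NODES_PER_ENTITLEMENT=20   # check your Marketplace listing for the real value
TOTAL_NODES=57             # every node Rancher manages, local cluster included
LOCAL_CLUSTER_NODES=3      # nodes in the cluster Rancher runs on (not counted)

# Nodes in the local cluster do not count toward usage.
billable=$(( TOTAL_NODES - LOCAL_CLUSTER_NODES ))

# Integer ceiling division: round up to whole entitlements.
required=$(( (billable + NODES_PER_ENTITLEMENT - 1) / NODES_PER_ENTITLEMENT ))

echo "billable nodes: ${billable}, entitlements required: ${required}"
```

Because an entitlement cannot be shared between Rancher instances, the calculation has to be repeated for each instance and the results added together.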
This message can also appear if you recently uninstalled and re-installed the adapter. If the adapter loses track of the licenses that it is currently managing, it can take up to an hour to resolve the actual state of the licenses.

+156
@@ -0,0 +1,156 @@
---
title: Installing the Adapter
---

<head>
  <link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/cloud-marketplace/aws-cloud-marketplace/install-adapter"/>
</head>

> **Important:** If you are attempting to re-install the adapter, you may experience errant out-of-compliance messages for up to an hour.

### Rancher vs. Adapter Compatibility Matrix

:::note Important:

Different versions of the CSP adapter rely on features found in specific versions of Rancher.
To deploy and run the adapter successfully, ensure that its version corresponds to your version of Rancher.

:::

| Rancher Version | Adapter Version |
|-----------------|-----------------|
| v2.13.3         | 108.0.0+up8.0.0 |
| v2.13.2         | 108.0.0+up8.0.0 |
| v2.13.1         | 108.0.0+up8.0.0 |
| v2.13.0         | 108.0.0+up8.0.0 |

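When the installation is scripted, the matrix above can be expressed as a small lookup so that an unsupported combination fails early. This is only a sketch: the function name `adapter_version_for` is hypothetical, and the entries simply mirror the table.

```bash
#!/usr/bin/env bash
# Return the adapter chart version for a Rancher version, per the matrix above.
adapter_version_for() {
  case "$1" in
    v2.13.0|v2.13.1|v2.13.2|v2.13.3) echo "108.0.0+up8.0.0" ;;
    *) echo "no adapter version listed for $1" >&2; return 1 ;;
  esac
}

RANCHER_VERSION="v2.13.2"   # illustrative
CSP_ADAPTER_VERSION="$(adapter_version_for "$RANCHER_VERSION")"
echo "CSP_ADAPTER_VERSION=${CSP_ADAPTER_VERSION}"
```

The resulting `CSP_ADAPTER_VERSION` value is the one passed to `--version` when installing the chart.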
### 1. Gain Access to the Local Cluster

> **Note:** Only admin users should have access to the local cluster. Because the CSP adapter must be installed in the local cluster, this installation must be carried out by an admin user.

First, click on the local cluster and download a kubeconfig token. You can then configure your CLI to use this new token with the following command, replacing `$TOKEN_PATH` with the path on your filesystem to the downloaded token:

```bash
export KUBECONFIG=$TOKEN_PATH
```

### 2. Create the Adapter Namespace

Create the namespace that the adapter will be installed in.

```bash
kubectl create ns cattle-csp-adapter-system
```

### 3. Create Certificate Secrets

The adapter requires access to the root CA that Rancher is using so that it can communicate with the Rancher server. You can read more about which certificate options Rancher supports in the [chart options page](../../../getting-started/installation-and-upgrade/installation-references/helm-chart-options.md).

If your Rancher install uses a certificate signed by a recognized Certificate Authority such as Let's Encrypt, then you can safely skip to [Step 4](#4-install-the-chart).

However, if your Rancher install uses a custom certificate such as a Rancher-generated certificate or one signed by a private Certificate Authority, you will need to provide the certificate for this authority in PEM-encoded format so that the adapter can communicate with Rancher.

First, retrieve the certificate that Rancher is using and place it in a file named `ca-additional.pem`. If you are using the Rancher-generated certs option, this can be done with the following command:

```bash
kubectl get secret tls-rancher -n cattle-system -o jsonpath="{.data.tls\.crt}" | base64 -d >> ca-additional.pem
```

Then, create a secret which uses this cert:

```bash
kubectl -n cattle-csp-adapter-system create secret generic tls-ca-additional --from-file=ca-additional.pem
```

> **Important:** Do not change the names of the file or of the created secret. Making changes to these values may result in errors when the adapter runs.

### 4. Install the Chart

First, add the `rancher/charts` repo using the following command:

```bash
helm repo add rancher-charts https://charts.rancher.io
```

Next, install the CSP adapter. You must specify several values, including the account number and the name of the role created in the prerequisites.

Ensure that you use the version of the CSP adapter that matches the version of Rancher that you are running, as defined [above](#rancher-vs-adapter-compatibility-matrix).

In the instructions below, replace `$MY_ACC_NUM` with your AWS account number and `$MY_ROLE_NAME` with the name of the role created in the prerequisites. In addition, replace `$CSP_ADAPTER_VERSION` with the version that matches your Rancher version in the [version matrix](#rancher-vs-adapter-compatibility-matrix).

> **Note:** If you use shell variables, do not specify quotation marks. For example, `MY_ACC_NUM=123456789012` will work, but `MY_ACC_NUM="123456789012"` will fail.

> **Note:** Accounts using the AWS Marketplace listing for the EU and the UK will need to specify an additional `--set image.repository=rancher/rancher-csp-adapter-eu` option. To see if your account needs this option when installing the adapter, refer to the usage instructions of the marketplace listing.

<Tabs>
<TabItem value="Let's Encrypt/ Public Certificate Authority">

```bash
helm install rancher-csp-adapter rancher-charts/rancher-csp-adapter --namespace cattle-csp-adapter-system --set aws.enabled=true --set aws.roleName=$MY_ROLE_NAME --set-string aws.accountNumber=$MY_ACC_NUM --version $CSP_ADAPTER_VERSION
```

Alternatively, you can use a `values.yaml` and specify options like below:

```yaml
aws:
  enabled: true
  accountNumber: "$MY_ACC_NUM"
  roleName: $MY_ROLE_NAME
```

> **Note:** The account number needs to be specified in a string format, like the above, or the installation will fail.

You can then install the adapter with the following command:

```bash
helm install rancher-csp-adapter rancher-charts/rancher-csp-adapter --namespace cattle-csp-adapter-system -f values.yaml --version $CSP_ADAPTER_VERSION
```

</TabItem>
<TabItem value="Private CA Authority / Rancher-generated Certificates">

```bash
helm install rancher-csp-adapter rancher-charts/rancher-csp-adapter --namespace cattle-csp-adapter-system --set aws.enabled=true --set aws.roleName=$MY_ROLE_NAME --set-string aws.accountNumber=$MY_ACC_NUM --set additionalTrustedCAs=true --version $CSP_ADAPTER_VERSION
```

Alternatively, you can use a `values.yaml` and specify options like below:

```yaml
aws:
  enabled: true
  accountNumber: "$MY_ACC_NUM"
  roleName: $MY_ROLE_NAME
additionalTrustedCAs: true
```

> **Note:** The account number needs to be specified in a string format, like the above, or the installation will fail.

You can then install the adapter with the following command:

```bash
helm install rancher-csp-adapter rancher-charts/rancher-csp-adapter --namespace cattle-csp-adapter-system -f values.yaml --version $CSP_ADAPTER_VERSION
```

</TabItem>
</Tabs>

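The two tabs differ only in the `additionalTrustedCAs` option, and the EU/UK listing adds one more `--set`. The sketch below assembles the command from a few switches so that the variants stay in one place. The function name, the switch variables, and the role name are hypothetical, and the script prints the command rather than running it.

```bash
#!/usr/bin/env bash
# Placeholder inputs; replace with your own values.
MY_ACC_NUM=123456789012
MY_ROLE_NAME=rancher-csp-adapter-role   # hypothetical role name
CSP_ADAPTER_VERSION=108.0.0+up8.0.0     # from the compatibility matrix
USE_PRIVATE_CA=true    # true if you created the tls-ca-additional secret
USE_EU_IMAGE=false     # true for the EU/UK Marketplace listing

# Build the argument list in the function's positional parameters so that
# optional flags can be appended one at a time.
helm_install_command() {
  set -- install rancher-csp-adapter rancher-charts/rancher-csp-adapter \
    --namespace cattle-csp-adapter-system \
    --set aws.enabled=true \
    --set "aws.roleName=${MY_ROLE_NAME}" \
    --set-string "aws.accountNumber=${MY_ACC_NUM}" \
    --version "${CSP_ADAPTER_VERSION}"
  if [ "$USE_PRIVATE_CA" = true ]; then
    set -- "$@" --set additionalTrustedCAs=true
  fi
  if [ "$USE_EU_IMAGE" = true ]; then
    set -- "$@" --set image.repository=rancher/rancher-csp-adapter-eu
  fi
  echo "helm $*"
}

# Print the command for review before running it by hand.
helm_install_command
```

Printing first makes it easy to confirm that `--set-string` is used for the account number, which the notes above require.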
### 5. Managing Certificate Updates

If you had to create a secret storing a custom cert in [Step 3](#3-create-certificate-secrets), you will need to update this secret over time as the certificate is rotated.

First, delete the original secret in the `cattle-csp-adapter-system` namespace, using the command below:

```bash
kubectl delete secret tls-ca-additional -n cattle-csp-adapter-system
```

Then, follow the original installation steps in [Step 3](#3-create-certificate-secrets) to recreate the secret with the updated certificate. Because the retrieval command in Step 3 appends to `ca-additional.pem`, remove the old file first so that it contains only the new certificate.

Finally, restart the `rancher-csp-adapter` deployment to ensure that the updated value is made available to the adapter:

```bash
kubectl rollout restart deploy rancher-csp-adapter -n cattle-csp-adapter-system
```

> **Note:** Methods such as cert-manager's [trust operator](https://cert-manager.io/docs/projects/trust/) allow you to automate some of these tasks. Although these methods aren't officially supported, they can reduce how often you need to manually rotate certificates.
+25
@@ -0,0 +1,25 @@
---
title: Uninstalling The Adapter
---

<head>
  <link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/cloud-marketplace/aws-cloud-marketplace/uninstall-adapter"/>
</head>

### 1. Uninstall the adapter chart using Helm.

```bash
helm uninstall rancher-csp-adapter -n cattle-csp-adapter-system
```

### 2. Remove the namespace created for the adapter.

```bash
kubectl delete ns cattle-csp-adapter-system
```

### 3. (Optional) Remove any outstanding user notifications.

```bash
kubectl delete RancherUserNotification csp-compliance
```
+11
@@ -0,0 +1,11 @@
---
title: Cloud Marketplace Integration
---

<head>
  <link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/cloud-marketplace"/>
</head>

Rancher integrates with cloud marketplaces so that you can easily purchase support for installations hosted on certain cloud providers. This integration also lets you generate a supportconfig bundle that you can provide to Rancher support.

This integration supports only AWS.
+57
@@ -0,0 +1,57 @@
---
title: Supportconfig Bundle
---

<head>
  <link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/cloud-marketplace/supportconfig"/>
</head>

After installing the CSP adapter, you can generate a supportconfig bundle. This bundle is a tar file that lets you quickly provide information to support.

These bundles can be created through Rancher or through direct access to the cluster that Rancher is installed on. Note that accessing through Rancher is preferred.

> **Note:** Only admin users can generate/download supportconfig bundles, regardless of method.

## Accessing Through Rancher

First, click on the hamburger menu. Then click the `Get Support` button.



On the next page, click on the `Generate Support Config` button.

> **Note:** If the adapter is not installed, the option to generate the supportconfig bundle will not be present. You must install the CSP adapter to generate a supportconfig bundle.



## Accessing Without Rancher

First, generate a kubeconfig for the cluster that Rancher is installed on.

> **Note:** If Rancher is down, you will not be able to use a kubeconfig token generated by Rancher to access the cluster.

Configure your shell environment to use this kubeconfig token:

```bash
export KUBECONFIG=$MY_KUBECONFIG_PATH
```

It is recommended to create a temporary working directory before retrieving the bundle, as shown below:

```bash
mkdir temp && cd temp
```

Then, retrieve the supportconfig bundle:

```bash
mkdir rancher && kubectl get configmap csp-config -n cattle-csp-adapter-system -o=jsonpath='{.data.data}' >> rancher/config.json && tar -c -f supportconfig_rancher.tar rancher && rm -rf rancher
```

This will create a `supportconfig_rancher.tar` file in your current directory.

Users who run these commands on macOS may experience issues due to incompatibilities between GNU tar and BSD tar. If support has issues reading a supportconfig that you produce, make GNU tar accessible as `gtar` on your path and use the following command instead:

```bash
mkdir rancher && kubectl get configmap csp-config -n cattle-csp-adapter-system -o=jsonpath='{.data.data}' >> rancher/config.json && gtar -c -f supportconfig_rancher.tar rancher && rm -rf rancher
```
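Before sending a bundle to support, it can help to confirm the archive layout. The sketch below reproduces only the packaging step from the commands above in a scratch directory and lists the result. The JSON written to `config.json` is a stand-in for the ConfigMap contents, not the real schema.

```bash
#!/usr/bin/env bash
# Work in a scratch directory so nothing in the current directory is touched.
workdir="$(mktemp -d)"
cd "$workdir" || exit 1

# Stand-in for the data that `kubectl get configmap csp-config ...` would
# write on a real cluster.
mkdir rancher
printf '%s\n' '{"placeholder": true}' > rancher/config.json

# Same packaging step as in the commands above.
tar -c -f supportconfig_rancher.tar rancher && rm -rf rancher

# List the archive members; the bundle should contain rancher/config.json.
tar -t -f supportconfig_rancher.tar
```

Running `tar -t -f supportconfig_rancher.tar` against a real bundle gives the same kind of listing without extracting anything.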
@@ -0,0 +1,14 @@
---
title: Cluster API (CAPI) with Rancher Turtles
---

<head>
  <link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/cluster-api"/>
</head>

[Rancher Turtles](https://turtles.docs.rancher.com/) is a [Kubernetes Operator](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/#operators-in-kubernetes) that manages the lifecycle of provisioned Kubernetes clusters by integrating Cluster API (CAPI) with Rancher. With Rancher Turtles, you can:

- Import CAPI clusters into Rancher, by installing the Rancher Cluster Agent in CAPI provisioned clusters.
- Configure the [CAPI Operator](https://turtles.docs.rancher.com/turtles/stable/en/operator/chart.html#_cluster_api_operator_values).

The [Overview](./overview.md) section outlines installation options, Rancher Turtles architecture, and a brief demo. For more details, see the [Rancher Turtles documentation](https://turtles.docs.rancher.com/).
@@ -0,0 +1,260 @@
---
title: Overview
---

<head>
  <link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/cluster-api/overview"/>
</head>

## Architecture Diagram

Below is a visual representation of the key components of Rancher Turtles and their relationship to Rancher and the Rancher Cluster Agent. Understanding these components is essential for gaining insights into how Rancher leverages the CAPI operator for cluster management.



## Security

[Supply-chain Levels for Software Artifacts (SLSA)](https://slsa.dev/spec/v1.0/about) is a set of incrementally adoptable guidelines for supply chain security, established by industry consensus. The specification set by SLSA is useful for both software producers and consumers: producers can follow SLSA’s guidelines to make their software supply chain more secure, and consumers can use SLSA to make decisions about whether to trust a software package.

Rancher Turtles meets [SLSA Level 3](https://slsa.dev/spec/v1.0/levels#build-l3) requirements, with an appropriately hardened build platform, consistent build processes, and provenance distribution. For more information, visit the [Rancher Turtles Security](https://turtles.docs.rancher.com/turtles/stable/en/security/slsa.html) document.

## Prerequisites

Before installing Rancher Turtles in your Rancher environment, you must disable Rancher's `embedded-cluster-api` functionality. This also includes cleaning up Rancher-specific webhooks that otherwise would conflict with CAPI ones.

To simplify setting up Rancher for installing Rancher Turtles, the official Rancher Turtles Helm chart includes a `pre-install` hook that does the following:

- Disables the `embedded-cluster-api` feature in Rancher.
- Deletes the `mutating-webhook-configuration` and `validating-webhook-configuration` webhooks, as they are no longer needed.

These webhooks can be removed through the Rancher UI as well:

1. In the upper left corner, click **☰** > **Cluster Management**.
1. Select your local cluster.
1. In the left-hand navigation menu, select **More Resources** > **Admission**.
1. From the dropdown, select the Resource pages for `MutatingWebhookConfiguration` and `ValidatingWebhookConfiguration`.
1. On the respective Resource pages, click the **⋮** next to the `mutating-webhook-configuration` and `validating-webhook-configuration` webhooks and select the **Delete** option.

The webhooks can also be accessed by entering the names of the webhooks into the **Resource Search** field.

The following `kubectl` commands can manually remove the necessary webhooks:

```console
kubectl delete mutatingwebhookconfiguration.admissionregistration.k8s.io mutating-webhook-configuration
```

```console
kubectl delete validatingwebhookconfigurations.admissionregistration.k8s.io validating-webhook-configuration
```

Use the following example to disable the `embedded-cluster-api` feature from the console:

1. Create a `feature.yaml` file, with `embedded-cluster-api` set to false:

    ```yaml title="feature.yaml"
    apiVersion: management.cattle.io/v3
    kind: Feature
    metadata:
      name: embedded-cluster-api
    spec:
      value: false
    ```

2. Use `kubectl` to apply the `feature.yaml` file to the cluster:

    ```bash
    kubectl apply -f feature.yaml
    ```

## Installing the Rancher Turtles Operator

You can install the Rancher Turtles operator via the Rancher UI, or with Helm. The first method is recommended for most environments.

:::caution

If you already have the Cluster API (CAPI) Operator installed in your cluster, you must use the [manual Helm installation method](#installing-via-helm).

:::

### Installing via the Rancher UI

By adding the Turtles repository via the Rancher UI, Rancher can process the installation and configuration of the CAPI Extension.

1. Click **☰**. Under **Explore Cluster** in the left navigation menu, select **local**.
1. In the left navigation menu of the **Cluster Dashboard**, select **Apps > Repositories**.
1. Click **Create** to add a new repository.
1. Enter the following:
    - **Name**: turtles
    - **Index URL**: https://rancher.github.io/turtles
1. Wait until the new repository has a status of `Active`.
1. In the left navigation menu, select **Apps > Charts**.
1. Enter "turtles" into the search filter to find the Turtles chart.
1. Click **Rancher Turtles - the Cluster API Extension**.
1. Click **Install > Next > Install**.

This process uses the default values for the Helm chart, which are suitable for most installations. If your configuration requires overriding some of these defaults, you can either specify the values during installation from the Rancher UI or you can [manually install the chart via Helm](#installing-via-helm). For details about available values, see the Rancher Turtles [Helm chart reference guide](https://turtles.docs.rancher.com/turtles/stable/en/operator/chart.html).

The installation may take a few minutes. After it completes, you can see the following new deployments in the cluster:

- `rancher-turtles-system/rancher-turtles-controller-manager`
- `rancher-turtles-system/rancher-turtles-cluster-api-operator`
- `capi-system/capi-controller-manager`

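One way to confirm that these deployments are ready is `kubectl rollout status`. The sketch below only generates the commands from the `<namespace>/<deployment>` names listed above. The function name and the 180-second timeout are illustrative choices, not values required by Rancher Turtles.

```bash
#!/usr/bin/env bash
# Deployments listed above, written as <namespace>/<deployment>.
DEPLOYMENTS="
rancher-turtles-system/rancher-turtles-controller-manager
rancher-turtles-system/rancher-turtles-cluster-api-operator
capi-system/capi-controller-manager
"

# Build one `kubectl rollout status` command per deployment.
rollout_commands() {
  for item in $DEPLOYMENTS; do
    ns="${item%%/*}"      # text before the slash
    name="${item#*/}"     # text after the slash
    echo "kubectl -n ${ns} rollout status deployment/${name} --timeout=180s"
  done
}

# Print the commands for review; they can then be run one by one.
rollout_commands
```

Each printed command blocks until the deployment has finished rolling out or the timeout expires.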
#### Demo

This demo illustrates how to use the Rancher UI to install Rancher Turtles, create/import a CAPI cluster, and install monitoring on the cluster:

<iframe width="560" height="315" src="https://www.youtube.com/embed/lGsr7KfBjgU?si=ORkzuAJjcdXUXMxh" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>

### Installing via Helm

There are two ways to install Rancher Turtles with Helm, depending on whether you include the [CAPI Operator](https://github.com/kubernetes-sigs/cluster-api-operator) as a dependency:

- [Install Rancher Turtles with CAPI Operator as a dependency](#installing-rancher-turtles-with-cluster-api-capi-operator-as-a-helm-dependency).
- [Install Rancher Turtles without CAPI Operator](#installing-rancher-turtles-without-cluster-api-capi-operator-as-a-helm-dependency).

The CAPI Operator is required for installing Rancher Turtles. You can choose whether you want to take care of this dependency yourself or let the Rancher Turtles Helm chart manage it for you. [Installing Turtles with the CAPI Operator as a dependency](#installing-rancher-turtles-with-cluster-api-capi-operator-as-a-helm-dependency) is simpler, but your best option depends on your specific configuration.

The CAPI Operator allows for handling the lifecycle of [CAPI providers](https://turtles.docs.rancher.com/turtles/stable/en/operator/manual.html) using a declarative approach, extending the capabilities of `clusterctl`. To learn more, refer to the [Cluster API Operator book](https://cluster-api-operator.sigs.k8s.io/).

#### Installing Rancher Turtles with `Cluster API (CAPI) Operator` as a Helm dependency

1. Add the Helm repository containing the `rancher-turtles` chart as the first step in installation:

    ```bash
    helm repo add turtles https://rancher.github.io/turtles
    helm repo update
    ```

2. As mentioned before, installing Rancher Turtles requires the [CAPI Operator](https://github.com/kubernetes-sigs/cluster-api-operator). The Helm chart can automatically install it with a minimal set of flags:

    ```bash
    helm install rancher-turtles turtles/rancher-turtles --version <version> \
        -n rancher-turtles-system \
        --dependency-update \
        --create-namespace --wait \
        --timeout 180s
    ```

3. This operation could take a few minutes. After it completes, you can review the installed controllers listed below:

    - `rancher-turtles-controller`
    - `capi-operator`

:::note

- If `cert-manager` is already available in the cluster, disable its installation as a Rancher Turtles dependency. This prevents dependency conflicts:
  `--set cluster-api-operator.cert-manager.enabled=false`
- For a list of Rancher Turtles versions, refer to the [Turtles release page](https://github.com/rancher/turtles/releases).

:::

This is the basic, recommended configuration, which manages the creation of a secret containing the required CAPI feature flags (`CLUSTER_TOPOLOGY`, `EXP_CLUSTER_RESOURCE_SET` and `EXP_MACHINE_POOL` enabled) in the core provider namespace. These feature flags are required to enable additional CAPI functionality.
|
||||
|
||||
If you need to override the default behavior and use an existing secret (or add custom environment variables), you can pass the secret name Helm flag. In this case, as a user, you are in charge of managing the secret creation and its content, including enabling the minimum required features: `CLUSTER_TOPOLOGY`, `EXP_CLUSTER_RESOURCE_SET` and `EXP_MACHINE_POOL`.
|
||||
|
||||
```bash
|
||||
helm install ...
|
||||
# Passing secret name and namespace for additional environment variables
|
||||
--set cluster-api-operator.cluster-api.configSecret.name=<secret-name>
|
||||
```
|
||||
|
||||
The following is an example of a user-managed secret `cluster-api-operator.cluster-api.configSecret.name=variables` with `CLUSTER_TOPOLOGY`, `EXP_CLUSTER_RESOURCE_SET` and `EXP_MACHINE_POOL` feature flags set and an extra custom variable:
|
||||
|
||||
```yaml title="secret.yaml"
|
||||
apiVersion: v1
|
||||
kind: Secret
|
||||
metadata:
|
||||
name: variables
|
||||
namespace: rancher-turtles-system
|
||||
type: Opaque
|
||||
stringData:
|
||||
CLUSTER_TOPOLOGY: "true"
|
||||
EXP_CLUSTER_RESOURCE_SET: "true"
|
||||
EXP_MACHINE_POOL: "true"
|
||||
CUSTOM_ENV_VAR: "false"
|
||||
```
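The secret has to exist before the chart references it, so one possible order is: create the namespace, apply the secret, then install the chart. The following is a sketch under assumptions: the manifest is saved as `secret.yaml`, the chart version is passed as the first argument, and the helper name `install_turtles_with_secret` is hypothetical. The Helm flags are the ones already shown in this section.

```bash
# Hypothetical helper: nothing runs until the function is called.
install_turtles_with_secret() {
  local version="$1"   # chart version, for example one from the Turtles release page
  local ns="rancher-turtles-system"

  # Create the namespace idempotently so the secret has somewhere to live.
  kubectl create namespace "$ns" --dry-run=client -o yaml | kubectl apply -f -

  # Create the user-managed secret named "variables".
  kubectl apply -f secret.yaml

  # Point the chart at the existing secret by name.
  helm install rancher-turtles turtles/rancher-turtles --version "$version" \
    -n "$ns" \
    --dependency-update \
    --create-namespace --wait \
    --timeout 180s \
    --set cluster-api-operator.cluster-api.configSecret.name=variables
}
```

Usage: `install_turtles_with_secret <version>`.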
|
||||
|
||||
:::info
|
||||
|
||||
For detailed information on the values supported by the chart and their usage, refer to [Helm chart options](https://turtles.docs.rancher.com/turtles/stable/en/operator/chart.html).
|
||||
|
||||
:::
|
||||
|
||||
#### Installing Rancher Turtles without `Cluster API (CAPI) Operator` as a Helm dependency
|
||||
|
||||
:::note
|
||||
|
||||
Remember that if you opt for this installation option, you must manage the CAPI Operator installation yourself. You can follow the [manual installation guide](https://turtles.docs.rancher.com/turtles/stable/en/operator/manual.html) in the Rancher Turtles documentation for assistance.
|
||||
|
||||
:::
|
||||
|
||||
1. Add the Helm repository containing the `rancher-turtles` chart as the first step in installation:
|
||||
|
||||
```bash
|
||||
helm repo add turtles https://rancher.github.io/turtles
|
||||
helm repo update
|
||||
```
|
||||
|
||||
2. Install the chart into the `rancher-turtles-system` namespace:
|
||||
|
||||
```bash
|
||||
helm install rancher-turtles turtles/rancher-turtles --version <version> \
|
||||
-n rancher-turtles-system \
|
||||
--set cluster-api-operator.enabled=false \
|
||||
--set cluster-api-operator.cluster-api.enabled=false \
|
||||
--create-namespace --wait \
|
||||
--dependency-update
|
||||
```
|
||||
|
||||
The `--set` flags in the previous command tell Helm to skip installing `cluster-api-operator` as a dependency.
|
||||
|
||||
3. This operation can take a few minutes. After it completes, you can review the installed controller listed below:
|
||||
|
||||
- `rancher-turtles-controller`
|
||||
|
||||
## Uninstalling Rancher Turtles
|
||||
|
||||
:::caution
|
||||
|
||||
When you install Rancher Turtles in your Rancher environment, it enables the CAPI Operator cleanup by default. This includes cleaning up CAPI Operator-specific webhooks and deployments that otherwise cause issues with Rancher provisioning.
|
||||
|
||||
To simplify uninstalling Rancher Turtles (via Rancher or Helm command), the official Rancher Turtles Helm chart includes a `post-delete` hook that does the following:
|
||||
|
||||
- Deletes the `mutating-webhook-configuration` and `validating-webhook-configuration` webhooks that are no longer needed.
|
||||
- Deletes the CAPI `deployments` that are no longer needed.
|
||||
|
||||
:::
|
||||
|
||||
To uninstall Rancher Turtles:
|
||||
|
||||
```bash
|
||||
helm uninstall -n rancher-turtles-system rancher-turtles --cascade foreground --wait
|
||||
```
|
||||
|
||||
This may take a few minutes to complete.
|
||||
|
||||
:::note
|
||||
|
||||
If you used a different release name or namespace for the installation, adjust the command to match your configuration.
|
||||
|
||||
:::
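As a sketch of that customization, the release name and namespace can be turned into parameters. The helper name `uninstall_turtles` is hypothetical. The defaults and flags are taken from the command above.

```bash
# Hypothetical helper: nothing runs until the function is called.
uninstall_turtles() {
  local release="${1:-rancher-turtles}"            # Helm release name
  local namespace="${2:-rancher-turtles-system}"   # namespace of the release
  helm uninstall -n "$namespace" "$release" --cascade foreground --wait
}
```

For example, `uninstall_turtles my-turtles my-namespace` would target a release named `my-turtles` in `my-namespace` (both names are placeholders).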
|
||||
|
||||
After Rancher Turtles is uninstalled, Rancher's `embedded-cluster-api` feature must be re-enabled:
|
||||
|
||||
1. Create a `feature.yaml` file, with `embedded-cluster-api` set to true:
|
||||
|
||||
```yaml title="feature.yaml"
|
||||
apiVersion: management.cattle.io/v3
|
||||
kind: Feature
|
||||
metadata:
|
||||
name: embedded-cluster-api
|
||||
spec:
|
||||
value: true
|
||||
```
|
||||
|
||||
2. Use `kubectl` to apply the `feature.yaml` file to the cluster:
|
||||
|
||||
```bash
|
||||
kubectl apply -f feature.yaml
|
||||
```
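To confirm that the change was applied, you can read the value back from the cluster. This is a minimal sketch. It assumes the `Feature` resource is served under the plural name `features.management.cattle.io`, and the helper name `check_embedded_capi_feature` is hypothetical.

```bash
# Hypothetical helper: prints the current spec.value of the feature flag.
check_embedded_capi_feature() {
  kubectl get features.management.cattle.io embedded-cluster-api \
    -o jsonpath='{.spec.value}{"\n"}'
}
```

If the manifest was applied as written, the printed value should match the `value` field in `feature.yaml`.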
|
||||
+113
@@ -0,0 +1,113 @@
|
||||
---
|
||||
title: Compliance Scans
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/compliance-scans"/>
|
||||
</head>
|
||||
|
||||
Rancher can run a security scan to check whether a cluster is deployed according to security best practices as defined in Kubernetes security benchmarks, such as the ones provided by STIG, BSI or CIS. The Compliance scans can run on any Kubernetes cluster, including hosted Kubernetes providers such as EKS, AKS, and GKE.
|
||||
|
||||
The `rancher-compliance` app leverages <a href="https://github.com/aquasecurity/kube-bench" target="_blank">kube-bench</a>, an open-source tool from Aqua Security, to check the compliance of clusters against Kubernetes Benchmarks. To generate a cluster-wide report, the application uses <a href="https://github.com/vmware-tanzu/sonobuoy" target="_blank">Sonobuoy</a> for report aggregation.
|
||||
|
||||
|
||||
## About the CIS Benchmark
|
||||
|
||||
The Center for Internet Security is a 501(c\)(3) non-profit organization, formed in October 2000, with a mission to "identify, develop, validate, promote, and sustain best practice solutions for cyber defense and build and lead communities to enable an environment of trust in cyberspace". The organization is headquartered in East Greenbush, New York, with members including large corporations, government agencies, and academic institutions.
|
||||
|
||||
CIS Benchmarks are best practices for the secure configuration of a target system. CIS Benchmarks are developed through the generous volunteer efforts of subject matter experts, technology vendors, public and private community members, and the CIS Benchmark Development team.
|
||||
|
||||
[Sign up](https://learn.cisecurity.org/benchmarks) at the CIS website to view the official Benchmark documents.
|
||||
|
||||
## About the Generated Report
|
||||
|
||||
Each scan generates a report that can be viewed in the Rancher UI and downloaded in CSV format.
|
||||
|
||||
By default, the CIS Benchmark v1.6 is used.
|
||||
|
||||
The Benchmark version is included in the generated report.
|
||||
|
||||
The Benchmark provides recommendations of two types: Automated and Manual. Recommendations marked as Manual in the Benchmark are not included in the generated report.
|
||||
|
||||
Some tests are designated as "Not Applicable." These tests will not be run on any CIS scan because of the way that Rancher provisions RKE2/K3s clusters. For information on how test results can be audited, and why some tests are designated to be not applicable, refer to Rancher's [self-assessment guide](../../reference-guides/rancher-security/rancher-security.md#the-cis-benchmark-and-self-assessment) for the corresponding Kubernetes version.
|
||||
|
||||
The report contains the following information:
|
||||
|
||||
| Column in Report | Description |
|
||||
|-------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|
||||
| `id` | The ID number of the CIS Benchmark. |
|
||||
| `description` | The description of the CIS Benchmark test. |
|
||||
| `remediation` | What needs to be fixed in order to pass the test. |
|
||||
| `state` | Indicates if the test passed, failed, was skipped, or was not applicable. |
|
||||
| `node_type` | The node role, which affects which tests are run on the node. Master tests are run on controlplane nodes, etcd tests are run on etcd nodes, and node tests are run on the worker nodes. |
|
||||
| `audit` | This is the audit check that `kube-bench` runs for this test. |
|
||||
| `audit_config` | Any configuration applicable to the audit script. |
|
||||
| `test_info` | Test-related info as reported by `kube-bench`, if any. |
|
||||
| `commands` | Test-related commands as reported by `kube-bench`, if any. |
|
||||
| `config_commands` | Test-related configuration data as reported by `kube-bench`, if any. |
|
||||
| `actual_value` | The test's actual value, present if reported by `kube-bench`. |
|
||||
| `expected_result` | The test's expected result, present if reported by `kube-bench`. |
|
||||
|
||||
Refer to [the table in the cluster hardening guide](../../reference-guides/rancher-security/rancher-security.md) for information on which versions of Kubernetes, the Benchmark, Rancher, and our cluster hardening guide correspond to each other. Also refer to the hardening guide for configuration files of CIS-compliant clusters and information on remediating failed tests.
|
||||
|
||||
## Test Profiles
|
||||
|
||||
The following profiles are available:
|
||||
|
||||
- Generic CIS 1.6
|
||||
- Generic CIS 1.20
|
||||
- Generic CIS 1.23
|
||||
- RKE2 permissive 1.6
|
||||
- RKE2 hardened 1.6
|
||||
- RKE2 permissive 1.20
|
||||
- RKE2 hardened 1.20
|
||||
- RKE2 permissive 1.23
|
||||
- RKE2 hardened 1.23
|
||||
- K3s permissive 1.6
|
||||
- K3s hardened 1.6
|
||||
- K3s permissive 1.20
|
||||
- K3s hardened 1.20
|
||||
- K3s permissive 1.23
|
||||
- K3s hardened 1.23
|
||||
- AKS
|
||||
- EKS
|
||||
- GKE
|
||||
|
||||
You can also customize a profile by saving a set of tests to skip.
|
||||
|
||||
All profiles have a set of not applicable tests that are skipped during the CIS scan. These tests are not applicable because of how an RKE2/K3s cluster manages Kubernetes.
|
||||
|
||||
There are two types of RKE2/K3s cluster scan profiles:
|
||||
|
||||
- **Permissive:** This profile has a set of tests that will be skipped because they fail on a default RKE2/K3s Kubernetes cluster. Besides the list of skipped tests, the profile also does not run the not applicable tests.
|
||||
- **Hardened:** This profile will not skip any tests, except for the non-applicable tests.
|
||||
|
||||
The EKS and GKE cluster scan profiles are based on CIS Benchmark versions that are specific to those types of clusters.
|
||||
|
||||
In order to pass the "Hardened" profile, you will need to follow the steps on the [hardening guide](../../reference-guides/rancher-security/rancher-security.md#rancher-hardening-guide) and use the `cluster.yml` defined in the hardening guide to provision a hardened cluster.
|
||||
|
||||
The default profile and the supported CIS benchmark version depend on the type of cluster that will be scanned:
|
||||
|
||||
The `rancher-compliance` app supports the CIS 1.9 Benchmark version.
|
||||
|
||||
- For RKE2 Kubernetes clusters, the RKE2 Permissive 1.9 profile is the default.
|
||||
- EKS and GKE have their own CIS Benchmarks published by `kube-bench`. The corresponding test profiles are used by default for those clusters.
|
||||
- For cluster types other than RKE2, EKS and GKE, the Generic CIS 1.5 profile is used by default.
|
||||
|
||||
## About Skipped and Not Applicable Tests
|
||||
|
||||
For now, only user-defined skipped tests are marked as skipped in the generated report.
|
||||
|
||||
Any skipped tests that are defined as being skipped by one of the default profiles are marked as not applicable.
|
||||
|
||||
## Roles-based Access Control
|
||||
|
||||
For information about permissions, refer to [this page](rbac-for-compliance-scans.md).
|
||||
|
||||
## Configuration
|
||||
|
||||
For more information about configuring the custom resources for the scans, profiles, and benchmark versions, refer to [this page](configuration-reference.md).
|
||||
|
||||
## How-to Guides
|
||||
|
||||
Please refer to the [Compliance Scan Guides](../../how-to-guides/advanced-user-guides/compliance-scan-guides/compliance-scan-guides.md) to learn how to run Compliance scans.
|
||||
+113
@@ -0,0 +1,113 @@
|
||||
---
|
||||
title: Configuration
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/compliance-scans/configuration-reference"/>
|
||||
</head>
|
||||
|
||||
This configuration reference is intended to help you manage the custom resources created by the `rancher-compliance` application. These resources are used for performing compliance scans on a cluster, skipping tests, setting the test profile that will be used during a scan, and other customization.
|
||||
|
||||
To configure the custom resources, go to the **Cluster Dashboard**. To configure the compliance scans:
|
||||
|
||||
1. In the upper left corner, click **☰ > Cluster Management**.
|
||||
1. On the **Clusters** page, go to the cluster where you want to configure compliance scans and click **Explore**.
|
||||
1. In the left navigation bar, click **Compliance**.
|
||||
|
||||
## Scans
|
||||
|
||||
A scan is created to trigger a compliance scan on the cluster based on the defined profile. A report is created after the scan is completed.
|
||||
|
||||
When configuring a scan, you need to define the name of the scan profile that will be used with the `scanProfileName` directive.
|
||||
|
||||
An example ClusterScan custom resource is below:
|
||||
|
||||
```yaml
|
||||
apiVersion: compliance.cattle.io/v1
|
||||
kind: ClusterScan
|
||||
metadata:
|
||||
name: scan-smnr9
|
||||
spec:
|
||||
scanProfileName: cis-1.10-profile
|
||||
```
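A scan like this can also be created from the command line. The following is a sketch under assumptions: the manifest above is saved as `clusterscan.yaml` (a hypothetical file name), the custom resources are served under the plural names `clusterscans` and `clusterscanreports`, and the helper name `run_cluster_scan` is hypothetical.

```bash
# Hypothetical helper: nothing runs until the function is called.
run_cluster_scan() {
  local manifest="${1:-clusterscan.yaml}"   # path to the ClusterScan manifest

  # Creating the ClusterScan resource triggers the scan.
  kubectl apply -f "$manifest"

  # List scans, and the reports created once scans complete.
  kubectl get clusterscans.compliance.cattle.io
  kubectl get clusterscanreports.compliance.cattle.io
}
```

The report only appears after the scan finishes, so the second listing may be empty at first.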
|
||||
|
||||
## Profiles
|
||||
|
||||
A profile contains the configuration for the compliance scan, which includes the benchmark version to use and any specific tests to skip in that benchmark.
|
||||
|
||||
:::caution
|
||||
|
||||
By default, a few ClusterScanProfiles are installed as part of the `rancher-compliance` chart. If you edit these default benchmarks or profiles, the next chart update resets them. We therefore recommend that you do not edit the default ClusterScanProfiles.
|
||||
|
||||
:::
|
||||
|
||||
Users can clone the ClusterScanProfiles to create custom profiles.
|
||||
|
||||
Skipped tests are listed under the `skipTests` directive.
|
||||
|
||||
When you create a new profile, you will also need to give it a name.
|
||||
|
||||
An example `ClusterScanProfile` is below:
|
||||
|
||||
```yaml
|
||||
apiVersion: compliance.cattle.io/v1
|
||||
kind: ClusterScanProfile
|
||||
metadata:
|
||||
annotations:
|
||||
clusterscanprofile.compliance.cattle.io/builtin: 'true'
|
||||
meta.helm.sh/release-name: rancher-compliance
|
||||
meta.helm.sh/release-namespace: compliance-operator-system
|
||||
creationTimestamp: '2025-09-15T18:09:52Z'
|
||||
generation: 1
|
||||
labels:
|
||||
app.kubernetes.io/managed-by: Helm
|
||||
name: cis-1.10-profile
|
||||
resourceVersion: '93582'
|
||||
uid: 0baad187-1157-46ac-982d-014338847c27
|
||||
spec:
|
||||
benchmarkVersion: cis-1.10
|
||||
skipTests:
|
||||
- '1.1.20'
|
||||
- '1.1.21'
|
||||
```
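A custom profile needs far fewer fields than the built-in example above, because the Helm annotations, labels, and server-generated metadata do not have to be copied. The following sketch writes a minimal manifest to a file. The profile name `cis-1.10-custom-profile` and the file name `custom-profile.yaml` are hypothetical, and the skipped test ID is only an illustration.

```bash
# Write a minimal custom ClusterScanProfile manifest to a local file.
cat > custom-profile.yaml <<'EOF'
apiVersion: compliance.cattle.io/v1
kind: ClusterScanProfile
metadata:
  name: cis-1.10-custom-profile   # hypothetical name
spec:
  benchmarkVersion: cis-1.10
  skipTests:
    - '1.1.20'
EOF
```

The file can then be applied with `kubectl apply -f custom-profile.yaml`.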
|
||||
|
||||
## Benchmark Versions
|
||||
|
||||
A benchmark version is the name of the benchmark to run using `kube-bench`, as well as the valid configuration parameters for that benchmark.
|
||||
|
||||
A `ClusterScanBenchmark` defines the Compliance `BenchmarkVersion` name and test configurations. The `BenchmarkVersion` name is a parameter provided to the `kube-bench` tool.
|
||||
|
||||
By default, a few `BenchmarkVersion` names and test configurations are packaged as part of the Compliance scan application. When this feature is enabled, these default BenchmarkVersions will be automatically installed and available for users to create a ClusterScanProfile.
|
||||
|
||||
:::caution
|
||||
|
||||
If the default BenchmarkVersions are edited, the next chart update will reset them back. Therefore we don't recommend editing the default ClusterScanBenchmarks.
|
||||
|
||||
:::
|
||||
|
||||
A ClusterScanBenchmark consists of the following fields:
|
||||
|
||||
- `ClusterProvider`: This is the cluster provider name for which this benchmark is applicable. For example: RKE2, EKS, GKE, etc. Leave it empty if this benchmark can be run on any cluster type.
|
||||
- `MinKubernetesVersion`: Specifies the cluster's minimum Kubernetes version necessary to run this benchmark. Leave it empty if there is no dependency on a particular Kubernetes version.
|
||||
- `MaxKubernetesVersion`: Specifies the cluster's maximum Kubernetes version necessary to run this benchmark. Leave it empty if there is no dependency on a particular Kubernetes version.
|
||||
|
||||
An example `ClusterScanBenchmark` is below:
|
||||
|
||||
```yaml
|
||||
apiVersion: compliance.cattle.io/v1
|
||||
kind: ClusterScanBenchmark
|
||||
metadata:
|
||||
annotations:
|
||||
meta.helm.sh/release-name: rancher-compliance
|
||||
meta.helm.sh/release-namespace: compliance-operator-system
|
||||
creationTimestamp: '2025-09-15T18:09:52Z'
|
||||
generation: 1
|
||||
labels:
|
||||
app.kubernetes.io/managed-by: Helm
|
||||
name: cis-1.10
|
||||
resourceVersion: '93569'
|
||||
uid: 309e543e-9102-4091-be91-08d7af7fb7a7
|
||||
spec:
|
||||
clusterProvider: ''
|
||||
minKubernetesVersion: 1.28.0
|
||||
```
|
||||
+83
@@ -0,0 +1,83 @@
|
||||
---
|
||||
title: Creating a Custom Benchmark Version for Running a Cluster Scan
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/compliance-scans/custom-benchmark"/>
|
||||
</head>
|
||||
|
||||
Each Benchmark Version defines a set of test configuration files that define the Compliance tests to be run by the <a href="https://github.com/aquasecurity/kube-bench" target="_blank">kube-bench</a> tool.
|
||||
The `rancher-compliance` application installs a few default Benchmark Versions, which are listed under the Compliance application menu.
|
||||
|
||||
|
||||
However, in the following cases, a custom configuration or remediation may be required:
|
||||
|
||||
- Non-standard file locations: When Kubernetes binaries, configuration or certificate paths deviate from upstream benchmark defaults.
|
||||
Example: Unlike traditional Kubernetes, K3s bundles control plane components into a single binary. Therefore, the presence and configuration of the `--anonymous-auth` flag should be verified in the K3s logs (`journalctl`), not via `kube-apiserver` process checks (`ps`).
|
||||
|
||||
- Alternative risk mitigations: When a setup doesn't meet a check but has an equally effective, justified compensating control, or when the check simply doesn't apply to the setup because of its design.
|
||||
Example: By default, K3s embeds the API server within the `k3s` process. There is no API server pod specification file, so verifying that file's permissions is not required.
|
||||
|
||||
## 1. Prepare the Custom Benchmark Version ConfigMap
|
||||
|
||||
To create a custom benchmark version, first you need to create a ConfigMap containing the benchmark version's config files and upload it to your Kubernetes cluster where you want to run the scan.
|
||||
|
||||
The following steps assume that you want to add a custom Benchmark Version named `foo`.
|
||||
|
||||
1. Create a directory named `foo` and inside this directory, place all the config YAML files that the <a href="https://github.com/aquasecurity/kube-bench" target="_blank">kube-bench</a> tool looks for. For example, here are the config YAML files for a Generic CIS 1.5 Benchmark Version: https://github.com/aquasecurity/kube-bench/tree/master/cfg/cis-1.5
|
||||
1. Place the complete `config.yaml` file, which includes all the components that should be tested.
|
||||
1. Add the Benchmark version name to the `target_mapping` section of the `config.yaml`:
|
||||
|
||||
```yaml
|
||||
target_mapping:
|
||||
"foo":
|
||||
- "master"
|
||||
- "node"
|
||||
- "controlplane"
|
||||
- "etcd"
|
||||
- "policies"
|
||||
```
|
||||
1. Upload this directory to your Kubernetes Cluster by creating a ConfigMap:
|
||||
|
||||
```bash
|
||||
kubectl create configmap -n <namespace> foo --from-file=<path to directory foo>
|
||||
```
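Before uploading, it can help to check that the directory is complete. The following is a minimal sketch of such a check. The helper name `check_benchmark_dir` is hypothetical, and it assumes the benchmark name is written as a quoted key (for example `"foo":`), as in the `target_mapping` example above. It only confirms that the key appears somewhere in `config.yaml`, not that it sits under `target_mapping`.

```bash
# Hypothetical helper: checks a benchmark directory before it is uploaded.
# Usage: check_benchmark_dir <directory> <benchmark-version-name>
check_benchmark_dir() {
  local dir="$1" name="$2"

  # The directory must contain the complete config.yaml.
  if [ ! -f "$dir/config.yaml" ]; then
    echo "missing $dir/config.yaml" >&2
    return 1
  fi

  # The benchmark name must appear as a quoted key, for example "foo":
  if ! grep -q "\"$name\":" "$dir/config.yaml"; then
    echo "\"$name\" not found in $dir/config.yaml" >&2
    return 1
  fi
}
```

A possible use is `check_benchmark_dir ./foo foo && kubectl create configmap -n <namespace> foo --from-file=./foo`, so the ConfigMap is only created when the check passes.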
|
||||
|
||||
## 2. Add a Custom Benchmark Version to a Cluster
|
||||
|
||||
1. In the upper left corner, click **☰ > Cluster Management**.
|
||||
1. On the **Clusters** page, go to the cluster where you want to add a custom benchmark and click **Explore**.
|
||||
1. In the left navigation bar, click **Compliance > Benchmark Version**.
|
||||
1. Click **Create**.
|
||||
1. Enter the **Name** and a description for your custom benchmark version.
|
||||
1. Choose the cluster provider that your benchmark version applies to.
|
||||
1. Choose the ConfigMap you have uploaded from the dropdown.
|
||||
1. Add the minimum and maximum Kubernetes version limits applicable, if any.
|
||||
1. Click **Create**.
|
||||
|
||||
## 3. Create a New Profile for the Custom Benchmark Version
|
||||
|
||||
To run a scan using your custom benchmark version, you need to add a new Profile pointing to this benchmark version.
|
||||
|
||||
1. In the upper left corner, click **☰ > Cluster Management**.
|
||||
1. On the **Clusters** page, go to the cluster where you want to add a custom benchmark and click **Explore**.
|
||||
1. In the left navigation bar, click **Compliance > Profile**.
|
||||
1. Click **Create**.
|
||||
1. Provide a **Name** and description. In this example, we name it `foo-profile`.
|
||||
1. Choose the Benchmark Version from the dropdown.
|
||||
1. Click **Create**.
|
||||
|
||||
## 4. Run a Scan Using the Custom Benchmark Version
|
||||
|
||||
Once the Profile pointing to your custom benchmark version `foo` has been created, you can create a new Scan to run the custom test configs in the Benchmark Version.
|
||||
|
||||
To run a scan:
|
||||
|
||||
1. In the upper left corner, click **☰ > Cluster Management**.
|
||||
1. On the **Clusters** page, go to the cluster where you want to add a custom benchmark and click **Explore**.
|
||||
1. In the left navigation bar, click **Compliance > Scan**.
|
||||
1. Click **Create**.
|
||||
1. Choose the new cluster scan profile.
|
||||
1. Click **Create**.
|
||||
|
||||
**Result:** A report is generated with the scan results. To see the results, click the name of the scan that appears.
|
||||
+48
@@ -0,0 +1,48 @@
|
||||
---
|
||||
title: Roles-based Access Control
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/compliance-scans/rbac-for-compliance-scans"/>
|
||||
</head>
|
||||
|
||||
This section describes the permissions required to use the `rancher-compliance` app.
|
||||
|
||||
The `rancher-compliance` app is a cluster-admin only feature by default.
|
||||
|
||||
However, the `rancher-compliance` chart installs these two default `ClusterRoles`:
|
||||
|
||||
- compliance-admin
|
||||
- compliance-view
|
||||
|
||||
In Rancher, only cluster owners and global administrators have `compliance-admin` access by default.
|
||||
|
||||
## Cluster-Admin Access
|
||||
|
||||
Rancher Compliance Scans is a cluster-admin only feature by default.
|
||||
This means only the Rancher global admins, and the cluster’s cluster-owner can:
|
||||
|
||||
- Install/Uninstall the rancher-compliance App
|
||||
- See the navigation links for Compliance CRDs - ClusterScanBenchmarks, ClusterScanProfiles, ClusterScans
|
||||
- List the default ClusterScanBenchmarks and ClusterScanProfiles
|
||||
- Create/Edit/Delete new ClusterScanProfiles
|
||||
- Create/Edit/Delete a new ClusterScan to run the Compliance scan on the cluster
|
||||
- View and Download the ClusterScanReport created after the ClusterScan is complete
|
||||
|
||||
|
||||
## Summary of Default Permissions for Kubernetes Default Roles
|
||||
|
||||
The `rancher-compliance` chart creates two `ClusterRoles` and adds the Compliance CRD access to the following default K8s `ClusterRoles`:
|
||||
|
||||
| ClusterRole created by chart | Default K8s ClusterRole | Permissions given with Role |
|
||||
| ---------------------------- | ----------------------- | --------------------------- |
|
||||
| `compliance-admin` | `admin` | Ability to CRUD `clusterscanbenchmarks`, `clusterscanprofiles`, `clusterscans`, and `clusterscanreports` custom resources |
|
||||
| `compliance-view` | `view` | Ability to list (read) `clusterscanbenchmarks`, `clusterscanprofiles`, `clusterscans`, and `clusterscanreports` custom resources |
|
||||
|
||||
|
||||
By default, only the cluster-owner role can manage and use the `rancher-compliance` feature.
|
||||
|
||||
The other Rancher roles (cluster-member, project-owner, project-member) do not have any default permissions to manage and use rancher-compliance resources.
|
||||
|
||||
But if a cluster-owner wants to delegate access to other users, they can do so by creating ClusterRoleBindings between these users and the above Compliance ClusterRoles manually.
|
||||
There is no automatic role aggregation supported for the `rancher-compliance` ClusterRoles.
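
Such a binding can be created with `kubectl`. The following is a minimal sketch. The helper name `grant_compliance_view` and the binding name are hypothetical, and the user argument has to be the user name as the cluster knows it.

```bash
# Hypothetical helper: binds one user to the compliance-view ClusterRole.
# Nothing runs until the function is called.
grant_compliance_view() {
  local user="$1"   # user name as known to the cluster
  kubectl create clusterrolebinding "compliance-view-${user}" \
    --clusterrole=compliance-view \
    --user="$user"
}
```

Replacing `compliance-view` with `compliance-admin` in the same command would delegate full access instead of read-only access.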
|
||||
@@ -0,0 +1,31 @@
|
||||
---
|
||||
title: Operating System Management with Elemental
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/elemental"/>
|
||||
</head>
|
||||
|
||||
Elemental enables cloud-native host management. Elemental allows you to onboard any machine in any location, whether it's in a datacenter or on the edge, and integrate it seamlessly into Kubernetes while managing your workflows (e.g., OS updates).
|
||||
|
||||
## Elemental with Rancher
|
||||
|
||||
Elemental in Rancher:
|
||||
|
||||
- Is Kubernetes native, which allows you to manage the OS via Elemental in Kubernetes clusters.
|
||||
- Is nondisruptive from a Kubernetes operational perspective.
|
||||
- Is declarative and GitOps friendly.
|
||||
- Allows OCI Image-based flows, which are trusted, deterministic, and predictable.
|
||||
- Works at scale. It enables fleet-sized OS management.
|
||||
|
||||
### When should I use Elemental?
|
||||
|
||||
- Elemental enables cloud-native OS management from Rancher manager. It works with any OS (e.g., SLE Micro vanilla).
|
||||
- Elemental allows cloud-native management for machines in datacenters and on the edge.
|
||||
- Elemental is flexible and allows platform teams to perform all kinds of workflows across their fleet of machines.
|
||||
|
||||
## Elemental with Rancher Prime
|
||||
|
||||
- Already deeply integrated as a GUI extension in Rancher.
|
||||
- Extends the Rancher story to the OS, and works with SLE Micro for Rancher today.
|
||||
|
||||
@@ -0,0 +1,12 @@
|
||||
---
|
||||
title: Fleet Architecture
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/fleet/architecture"/>
|
||||
</head>
|
||||
|
||||
Fleet can manage deployments from git of raw Kubernetes YAML, Helm charts, Kustomize, or any combination of the three. Regardless of the source, all resources are dynamically turned into Helm charts, and Helm is used as the engine to deploy everything in the cluster. This gives you a high degree of control, consistency, and auditability. Fleet focuses not only on the ability to scale, but also on giving you a high degree of control and visibility into exactly what is installed on the cluster.
|
||||
|
||||

|
||||
|
||||
@@ -0,0 +1,27 @@
|
||||
---
|
||||
title: Continuous Delivery with Fleet
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/fleet"/>
|
||||
</head>
|
||||
|
||||
Fleet orchestrates and manages the continuous delivery of applications through the supply chain for fleets of clusters. Fleet organizes the supply chain to help teams deliver with confidence and trust in a timely manner using GitOps as a safe operating model.
|
||||
|
||||
## Fleet with Rancher
|
||||
|
||||
Many users often manage over 10 clusters at a time. Given the proliferation of clusters, continuous delivery is an important part of Rancher. Fleet ensures a reliable continuous delivery experience using GitOps, which is a safe and increasingly common operating model.
|
||||
|
||||
### When should I use Fleet?
|
||||
|
||||
- I need to deploy my monitoring stack (e.g., Grafana, Prometheus) across geographical regions, each with different retention policies.
|
||||
- I am a platform operator and want to provision clusters with all components using a scalable and safe operating model (GitOps).
|
||||
- I am an application developer and want my latest changes to automatically go into my development environment.
|
||||
|
||||
## Fleet with Rancher Prime
|
||||
|
||||
Fleet is already deeply integrated as the Continuous Delivery tool and GitOps Engine in Rancher.
|
||||
|
||||
<!--
|
||||
- In future, we can have additional value adds like sharding controller (Manage shards for user) or notification controller (Event dispatcher/receiver) for prime customer only.
|
||||
-->
|
||||
@@ -0,0 +1,73 @@
|
||||
---
|
||||
title: Overview
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/fleet/overview"/>
|
||||
</head>
|
||||
|
||||
## What is Continuous Delivery with Fleet?
|
||||
|
||||
Continuous Delivery is Rancher's GitOps functionality, which is provided via integration with Fleet.
|
||||
|
||||
- *Cluster engine*: Fleet is a container management and deployment engine designed to offer users more control on the local cluster and constant monitoring through GitOps. Fleet focuses not only on the ability to scale, but it also gives users a high degree of control and visibility to monitor exactly what is installed on the cluster.
|
||||
|
||||
- *Deployment management*: Fleet can manage deployments from git of raw Kubernetes YAML, Helm charts, Kustomize, or any combination of the three. Regardless of the source, all resources are dynamically turned into Helm charts, and Helm is used as the engine to deploy all resources in the cluster. As a result, users can enjoy a high degree of control, consistency, and auditability of their clusters.
|
||||
|
||||
## Architecture
|
||||
|
||||
For information about how Fleet works, see the [Architecture](./architecture.md) page.
|
||||
|
||||
## Accessing Fleet in the Rancher UI
|
||||
|
||||
Fleet comes preinstalled in Rancher and is managed by the **Continuous Delivery** option in the Rancher UI. For additional information on Continuous Delivery and other Fleet troubleshooting tips, refer to the [Fleet troubleshooting documentation](https://fleet.rancher.io/troubleshooting).
|
||||
|
||||
Users can leverage continuous delivery to deploy applications from a git repository to their Kubernetes clusters without any manual operation by following **GitOps** practices.
|
||||
|
||||
Follow the steps below to access Continuous Delivery in the Rancher UI:
|
||||
|
||||
1. Click **☰ > Continuous Delivery**.
|
||||
|
||||
1. Select your namespace at the top of the menu, noting the following:
|
||||
|
||||
- By default, **fleet-default** is selected which includes all downstream clusters that are registered through Rancher.
|
||||
|
||||
- You may switch to **fleet-local**, which only contains the **local** cluster, or you may create your own workspace to which you may assign and move clusters.
|
||||
|
||||
- You can then manage clusters by clicking on **Clusters** on the left navigation bar.
|
||||
|
||||
1. Click on **Gitrepos** on the left navigation bar to deploy the gitrepo into your clusters in the current workspace.
|
||||
|
||||
1. Select your [git repository](https://fleet.rancher.io/gitrepo-add) and [target clusters/cluster group](https://fleet.rancher.io/gitrepo-targets). You can also create the cluster group in the UI by clicking on **Cluster Groups** from the left navigation bar.
|
||||
|
||||
1. Once the gitrepo is deployed, you can monitor the application through the Rancher UI.
|
||||
|
||||
## Windows Support
|
||||
|
||||
For details on support for clusters with Windows nodes, see the [Windows Support](./windows-support.md) page.
|
||||
|
||||
## GitHub Repository
|
||||
|
||||
The Fleet Helm charts are available [here](https://github.com/rancher/fleet/releases).
|
||||
|
||||
## Using Fleet Behind a Proxy
|
||||
|
||||
For details on using Fleet behind a proxy, see the [Using Fleet Behind a Proxy](./use-fleet-behind-a-proxy.md) page.
|
||||
|
||||
## Helm Chart Dependencies
|
||||
|
||||
In order for Helm charts with dependencies to deploy successfully, you must run a manual command (as listed below), as it is up to the user to fulfill the dependency list. If you do not do this and proceed to clone your repository and run `helm install`, your installation will fail because the dependencies will be missing.
|
||||
|
||||
The Helm chart in the git repository must include its dependencies in the `charts` subdirectory. You must run either `helm dependencies update $chart` or `helm dependencies build $chart` locally, then commit the complete `charts` directory to your git repository. Note that you will need to update these commands with the applicable parameters.
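
For context, the dependencies that must end up in the `charts` subdirectory are the ones declared in the chart's `Chart.yaml`. The fragment below is only an illustrative sketch: the chart name, the dependency name, its version, and the repository URL are all hypothetical.

```yaml
# Chart.yaml (all names, versions, and URLs are hypothetical)
apiVersion: v2
name: my-app
version: 0.1.0
dependencies:
- name: my-dependency
  version: 1.2.3
  repository: https://charts.example.com
```

Running the dependency command against a chart like this downloads the declared dependency archive into `charts/`, which is the directory you then commit.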
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
- **Known Issue**: The `clientSecretName` and `helmSecretName` secrets for Fleet gitrepos are not included in backups or restores performed by the [backup-restore-operator](../../how-to-guides/new-user-guides/backup-restore-and-disaster-recovery/back-up-rancher.md#1-install-the-rancher-backup-operator). We will update the community once a permanent solution is in place.
|
||||
|
||||
- **Temporary Workaround**: By default, user-defined secrets are not backed up in Fleet. You must recreate the secrets when performing a disaster recovery restore or when migrating Rancher into a fresh cluster. To modify the resourceSet to include extra resources that you want to back up, refer to the docs [here](https://github.com/rancher/backup-restore-operator#user-flow).
|
||||
|
||||
- **Debug logging**: To enable debug logging of Fleet components, create a new **fleet** entry in the existing **rancher-config** ConfigMap in the **cattle-system** namespace with the value `{"debug": 1, "debugLevel": 1}`. The Fleet application restarts after you save the ConfigMap.
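
As a sketch of the debug logging entry described above, the `fleet` key in the ConfigMap might look like the following. Only the `fleet` entry is shown; this assumes you merge it into the existing ConfigMap (for example, by editing it in place) rather than replacing the other entries already stored there.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: rancher-config
  namespace: cattle-system
data:
  # Keep any existing entries in this ConfigMap; only the new key is shown.
  fleet: '{"debug": 1, "debugLevel": 1}'
```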
|
||||
|
||||
## Documentation
|
||||
|
||||
See the [official Fleet documentation](https://fleet.rancher.io/) to learn more.
|
||||
@@ -0,0 +1,82 @@
|
||||
---
|
||||
title: Using Fleet Behind a Proxy
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/fleet/use-fleet-behind-a-proxy"/>
|
||||
</head>
|
||||
|
||||
In this section, you'll learn how to enable Fleet in a setup that has a Rancher server with a public IP and a Kubernetes cluster that has no public IP, but is configured to use a proxy.
|
||||
|
||||
Rancher does not establish connections with registered downstream clusters. The Rancher agent deployed on the downstream cluster must be able to establish the connection with Rancher.
|
||||
|
||||
To set up Fleet to work behind a proxy, you will need to set the **Agent Environment Variables** for the downstream cluster. These are cluster-level configuration options.
|
||||
|
||||
Through the Rancher UI, you can configure these environment variables for any cluster type, including registered and custom clusters. The variables can be added while editing an existing cluster or while provisioning a new cluster.
|
||||
|
||||
For public downstream clusters, it is sufficient to [set the required environment variables in the Rancher UI.](#setting-environment-variables-in-the-rancher-ui)
|
||||
|
||||
For private nodes or private clusters, the environment variables need to be set on the nodes themselves. Then the environment variables are configured from the Rancher UI, typically when provisioning a custom cluster or when registering the private cluster. For an example of how to set the environment variables on an Ubuntu node in a K3s Kubernetes cluster, see [this section.](#setting-environment-variables-on-private-nodes)
|
||||
|
||||
## Required Environment Variables
|
||||
|
||||
When adding Fleet agent environment variables for the proxy, replace `<PROXY_IP>` with your private proxy IP.
|
||||
|
||||
:::caution
|
||||
|
||||
The `NO_PROXY` environment variable is not standardized, and the accepted format of the value can differ between applications. When configuring the `NO_PROXY` variable in Rancher, the value must adhere to the format expected by Golang.
|
||||
|
||||
Specifically, the value should be a comma-delimited string which only contains IP addresses, CIDR notation, domain names, or special DNS labels (e.g. `*`). For a full description of the expected value format, refer to the [**upstream Golang documentation**](https://pkg.go.dev/golang.org/x/net/http/httpproxy#Config).
|
||||
|
||||
:::
|
||||
|
||||
| Variable Name | Value |
|------------------|--------|
| `HTTP_PROXY` | `http://<PROXY_IP>:8888` |
| `HTTPS_PROXY` | `http://<PROXY_IP>:8888` |
| `NO_PROXY` | `127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,.svc,.cluster.local` |
|
||||
|
||||
## Setting Environment Variables in the Rancher UI
|
||||
|
||||
To add the environment variable to an existing cluster:
|
||||
|
||||
<Tabs groupId="k8s-distro">
|
||||
<TabItem value="RKE2/K3s" default>
|
||||
|
||||
1. Click **☰ > Cluster Management**.
|
||||
1. Go to the cluster where you want to add environment variables and click **⋮ > Edit Config**.
|
||||
1. Click **Agent Environment Vars** under **Cluster configuration**.
|
||||
1. Click **Add**.
|
||||
1. Enter the [required environment variables](#required-environment-variables).
|
||||
1. Click **Save**.
|
||||
|
||||
</TabItem>
|
||||
<TabItem value="RKE">
|
||||
|
||||
1. Click **☰ > Cluster Management**.
|
||||
1. Go to the cluster where you want to add environment variables and click **⋮ > Edit Config**.
|
||||
1. Click **Advanced Options**.
|
||||
1. Click **Add Environment Variable**.
|
||||
1. Enter the [required environment variables](#required-environment-variables).
|
||||
1. Click **Save**.
|
||||
|
||||
</TabItem>
|
||||
</Tabs>
|
||||
|
||||
**Result:** The Fleet agent works behind a proxy.
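
For RKE2/K3s clusters managed as YAML, the same variables may be expressed on the provisioning cluster object. This is a hedged sketch only: it assumes the `agentEnvVars` field of the `provisioning.cattle.io/v1` `Cluster` spec corresponds to the **Agent Environment Vars** setting in the UI, and the cluster name is hypothetical. Only the relevant part of the spec is shown.

```yaml
apiVersion: provisioning.cattle.io/v1
kind: Cluster
metadata:
  name: my-cluster              # hypothetical cluster name
  namespace: fleet-default
spec:
  # Assumed equivalent of the Agent Environment Vars UI setting
  agentEnvVars:
  - name: HTTP_PROXY
    value: "http://<PROXY_IP>:8888"
  - name: HTTPS_PROXY
    value: "http://<PROXY_IP>:8888"
  - name: NO_PROXY
    value: "127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,.svc,.cluster.local"
```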
|
||||
|
||||
## Setting Environment Variables on Private Nodes
|
||||
|
||||
For private nodes and private clusters, the proxy environment variables need to be set on the nodes themselves, as well as configured from the Rancher UI.
|
||||
|
||||
This example shows how the environment variables would be set up on an Ubuntu node in a K3s Kubernetes cluster:
|
||||
|
||||
```
ssh -o ForwardAgent=yes ubuntu@<public_proxy_ip>
ssh <k3s_ip>
export proxy_private_ip=<private_proxy_ip>
export HTTP_PROXY=http://${proxy_private_ip}:8888
export HTTPS_PROXY=http://${proxy_private_ip}:8888
export NO_PROXY=127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,.svc,.cluster.local
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
```
|
||||
@@ -0,0 +1,25 @@
|
||||
---
|
||||
title: Windows Support
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/fleet/windows-support"/>
|
||||
</head>
|
||||
|
||||
Prior to Rancher v2.5.6, the `agent` did not have native Windows manifests on downstream clusters with Windows nodes. This would result in a failing `agent` pod for the cluster.
|
||||
|
||||
If you are upgrading from an older version of Rancher to v2.5.6+, you can deploy a working `agent` with the following workflow *in the downstream cluster*:
|
||||
|
||||
1. Cordon all Windows nodes.
|
||||
1. Apply the below toleration to the `agent` workload.
|
||||
1. Uncordon all Windows nodes.
|
||||
1. Delete all `agent` pods. New pods should be created with the new toleration.
|
||||
1. Once the `agent` pods are running, and auto-update is enabled for Fleet, they should be updated to a Windows-compatible `agent` version.
|
||||
|
||||
```yaml
tolerations:
- effect: NoSchedule
  key: cattle.io/os
  operator: Equal
  value: linux
```
|
||||
@@ -0,0 +1,15 @@
|
||||
---
|
||||
title: Virtualization on Kubernetes with Harvester
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/harvester"/>
|
||||
</head>
|
||||
|
||||
## Harvester
|
||||
|
||||
Introduced in Rancher v2.6.1, Harvester is an open-source hyper-converged infrastructure (HCI) software built on Kubernetes. Harvester installs on bare metal servers and provides integrated virtualization and distributed storage capabilities. Although Harvester operates using Kubernetes, it does not require knowledge of Kubernetes concepts, making it more user-friendly.
|
||||
|
||||
## Harvester with Rancher
|
||||
|
||||
With Rancher Prime and Harvester, IT operators now have access to an enterprise-ready, simple-to-use infrastructure platform that cohesively manages their virtual machines and Kubernetes clusters alongside one another. For more information on the support offering, see the [Support Matrix](https://www.suse.com/suse-harvester/support-matrix/all-supported-versions/harvester-v1-2-0/). With the Rancher Virtualization Management feature, users can import and manage multiple Harvester clusters, leveraging Rancher's authentication features and RBAC for multi-tenancy support.
|
||||
@@ -0,0 +1,42 @@
|
||||
---
|
||||
title: Overview
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/harvester/overview"/>
|
||||
</head>
|
||||
|
||||
Introduced in Rancher v2.6.1, [Harvester](https://docs.harvesterhci.io/) is an open-source hyper-converged infrastructure (HCI) software built on Kubernetes. Harvester installs on bare metal servers and provides integrated virtualization and distributed storage capabilities. Although Harvester operates using Kubernetes, it does not require users to know Kubernetes concepts, making it a more user-friendly application.
|
||||
|
||||
## Feature Flag
|
||||
|
||||
The Harvester feature flag is used to manage access to the Virtualization Management (VM) page in Rancher where users can navigate directly to Harvester clusters and access the Harvester UI. The Harvester feature flag is enabled by default. Click [here](../../how-to-guides/advanced-user-guides/enable-experimental-features/enable-experimental-features.md) for more information on feature flags in Rancher.
|
||||
|
||||
To navigate to the Harvester cluster, click **☰ > Virtualization Management**. From the **Harvester Clusters** page, click one of the clusters listed to go to the single Harvester cluster view.
|
||||
|
||||
* If the Harvester feature flag is enabled, Harvester clusters will be filtered out from any pages or apps (such as Continuous Delivery with Fleet) that list Kubernetes clusters.
|
||||
|
||||
* If the Harvester feature flag is disabled, and a Harvester cluster is imported, the Harvester cluster will be shown in the Rancher cluster list in the Cluster Management page. Harvester clusters will only be shown on the cluster list when the feature flag is off.
|
||||
|
||||
* With the Harvester integration, Harvester clusters can now be imported into Rancher as a cluster type `Harvester`.
|
||||
|
||||
* Users may import a Harvester cluster only on the Virtualization Management page. Importing a cluster on the Cluster Management page is not supported, and a warning will advise you to return to the VM page to do so.
|
||||
|
||||
## Harvester Node Driver
|
||||
|
||||
The [Harvester node driver](https://docs.harvesterhci.io/v1.5/rancher/node/node-driver/) is generally available for K3s and RKE2 options in Rancher. The node driver is available whether or not the Harvester feature flag is enabled. Note that the node driver is off by default. Users may create K3s or RKE2 clusters on Harvester only from the Cluster Management page.
|
||||
|
||||
Harvester allows `.ISO` images to be uploaded and displayed through the Harvester UI, but this is not supported in the Rancher UI. This is because `.ISO` images usually require additional setup that interferes with a clean deployment (without requiring user intervention), and they are not typically used in cloud environments.
|
||||
|
||||
See [Provisioning Drivers](../../how-to-guides/new-user-guides/authentication-permissions-and-global-configuration/about-provisioning-drivers/about-provisioning-drivers.md#node-drivers) for more information on node drivers in Rancher.
|
||||
|
||||
## Port Requirements
|
||||
|
||||
The port requirements for the Harvester cluster can be found [here](https://docs.harvesterhci.io/v1.5/install/requirements#networking).
|
||||
|
||||
In addition, other networking considerations are as follows:
|
||||
|
||||
- Be sure to enable VLAN trunk ports of the physical switch for VM VLAN networks.
|
||||
- Follow the networking setup guidance [here](https://docs.harvesterhci.io/v1.5/networking/index).
|
||||
|
||||
For the port requirements of guest clusters, such as K3s and RKE2, see [these docs](https://docs.harvesterhci.io/v1.5/install/requirements/#guest-clusters).
|
||||
@@ -0,0 +1,18 @@
|
||||
---
|
||||
title: Integrations in Rancher
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher"/>
|
||||
</head>
|
||||
|
||||
Prime is the Rancher ecosystem’s enterprise offering, with additional security, extended lifecycles, and access to Prime-exclusive documentation. Rancher Prime installation assets are hosted on a trusted SUSE registry, owned and managed by Rancher. The trusted Prime registry includes only stable releases that have been community-tested.
|
||||
|
||||
Prime also offers options for production support, as well as add-ons to your subscription that tailor to your commercial needs.
|
||||
|
||||
To learn more and get started with Rancher Prime, please visit [this page](https://www.rancher.com/quick-start).
|
||||
|
||||
import DocCardList from '@theme/DocCardList';
|
||||
import { useCurrentSidebarCategory } from '@docusaurus/theme-common/internal';
|
||||
|
||||
<DocCardList items={useCurrentSidebarCategory().items.slice(0,10)} />
|
||||
+47
@@ -0,0 +1,47 @@
|
||||
---
|
||||
title: Configuration Options
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/istio/configuration-options"/>
|
||||
</head>
|
||||
|
||||
:::warning
|
||||
|
||||
[Rancher-Istio](https://github.com/rancher/charts/tree/release-v2.11/charts/rancher-istio) will be deprecated in Rancher v2.12.0; turn to the [SUSE Rancher Application Collection](https://apps.rancher.io) build of Istio for enhanced security (included in SUSE Rancher Prime subscriptions).
|
||||
|
||||
Detailed information can be found in [this announcement](https://forums.suse.com/t/deprecation-of-rancher-istio/45043).
|
||||
|
||||
:::
|
||||
|
||||
### Egress Support
|
||||
|
||||
By default, the egress gateway is disabled, but it can be enabled on install or upgrade through the `values.yaml` or via the [overlay file](#overlay-file).
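
As an illustration of the overlay approach, an overlay that turns on the egress gateway could look like the sketch below. It uses the `egressGateways` component of the IstioOperator API; the gateway name `istio-egressgateway` is assumed to match the default name used by your installation.

```yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  components:
    egressGateways:
    - name: istio-egressgateway   # assumed default gateway name
      enabled: true
```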
|
||||
|
||||
### Enabling Automatic Sidecar Injection
|
||||
|
||||
Automatic sidecar injection is disabled by default. To enable it, set `sidecarInjectorWebhook.enableNamespacesByDefault=true` in the `values.yaml` on install or upgrade. This automatically enables Istio sidecar injection into all new namespaces that are deployed.
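
Written as nested YAML, the option named above would appear in the values as follows. This is a minimal fragment that shows only this one setting:

```yaml
sidecarInjectorWebhook:
  enableNamespacesByDefault: true
```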
|
||||
|
||||
### Overlay File
|
||||
|
||||
An Overlay File is designed to support extensive configuration of your Istio installation. It allows you to make changes to any values available in the [IstioOperator API](https://istio.io/latest/docs/reference/config/istio.operator.v1alpha1/). This will ensure you can customize the default installation to fit any scenario.
|
||||
|
||||
The Overlay File adds configuration on top of the default installation that is provided by the Istio chart installation. This means you do not need to redefine the components that are already defined for installation.
|
||||
|
||||
For more information on Overlay Files, refer to the [Istio documentation.](https://istio.io/latest/docs/setup/install/istioctl/#configure-component-settings)
|
||||
|
||||
### Selectors and Scrape Configs
|
||||
|
||||
The Monitoring app sets `prometheus.prometheusSpec.ignoreNamespaceSelectors=false`, which enables monitoring across all namespaces by default. This ensures you can view traffic, metrics, and graphs for resources deployed in a namespace with the `istio-injection=enabled` label.
|
||||
|
||||
If you would like to limit Prometheus to specific namespaces, set `prometheus.prometheusSpec.ignoreNamespaceSelectors=true`. Once you do this, you must perform some additional configuration to continue to monitor your resources.
|
||||
|
||||
For details, refer to [this section.](selectors-and-scrape-configurations.md)
|
||||
|
||||
### Additional Steps for Installing Istio on an RKE2 Cluster
|
||||
|
||||
Refer to [this section.](install-istio-on-rke2-cluster.md)
|
||||
|
||||
### Additional Steps for Project Network Isolation
|
||||
|
||||
Refer to [this section.](project-network-isolation.md)
|
||||
+72
@@ -0,0 +1,72 @@
|
||||
---
|
||||
title: Additional Steps for Installing Istio on RKE2 and K3s Clusters
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/istio/configuration-options/install-istio-on-rke2-cluster"/>
|
||||
</head>
|
||||
|
||||
:::warning
|
||||
|
||||
[Rancher-Istio](https://github.com/rancher/charts/tree/release-v2.11/charts/rancher-istio) will be deprecated in Rancher v2.12.0; turn to the [SUSE Rancher Application Collection](https://apps.rancher.io) build of Istio for enhanced security (included in SUSE Rancher Prime subscriptions).
|
||||
|
||||
Detailed information can be found in [this announcement](https://forums.suse.com/t/deprecation-of-rancher-istio/45043).
|
||||
|
||||
:::
|
||||
|
||||
When installing or upgrading the Istio Helm chart through **Apps**:
|
||||
|
||||
1. If you are installing the chart, click **Customize Helm options before install** and click **Next**.
|
||||
1. You will see options for configuring the Istio Helm chart. On the **Components** tab, check the box next to **Enabled CNI**.
|
||||
1. Add a custom overlay file specifying `cniBinDir` and `cniConfDir`. For more information on these options, refer to the [Istio documentation.](https://istio.io/latest/docs/setup/additional-setup/cni/#helm-chart-parameters) An example is below:
|
||||
|
||||
<Tabs>
|
||||
<TabItem value="RKE2">
|
||||
|
||||
```yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  components:
    cni:
      enabled: true
      k8s:
        overlays:
        - apiVersion: "apps/v1"
          kind: "DaemonSet"
          name: "istio-cni-node"
          patches:
          - path: spec.template.spec.containers.[name:install-cni].securityContext.privileged
            value: true
  values:
    cni:
      cniBinDir: /opt/cni/bin
      cniConfDir: /etc/cni/net.d
```
|
||||
</TabItem>
|
||||
<TabItem value="K3s">
|
||||
|
||||
```yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  components:
    cni:
      enabled: true
      k8s:
        overlays:
        - apiVersion: "apps/v1"
          kind: "DaemonSet"
          name: "istio-cni-node"
          patches:
          - path: spec.template.spec.containers.[name:install-cni].securityContext.privileged
            value: true
  values:
    cni:
      cniBinDir: /var/lib/rancher/k3s/data/current/bin
      cniConfDir: /var/lib/rancher/k3s/agent/etc/cni/net.d
```
|
||||
</TabItem>
|
||||
</Tabs>
|
||||
|
||||
**Result:** Now you should be able to utilize Istio as desired, including sidecar injection and monitoring via Kiali.
|
||||
+33
@@ -0,0 +1,33 @@
|
||||
---
|
||||
title: Additional Steps for Project Network Isolation
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/istio/configuration-options/project-network-isolation"/>
|
||||
</head>
|
||||
|
||||
:::warning
|
||||
|
||||
[Rancher-Istio](https://github.com/rancher/charts/tree/release-v2.11/charts/rancher-istio) has been deprecated since Rancher v2.12.0; turn to the [SUSE Rancher Application Collection](https://apps.rancher.io) build of Istio for enhanced security (included in SUSE Rancher Prime subscriptions).
|
||||
|
||||
Detailed information can be found in [this announcement](https://forums.suse.com/t/deprecation-of-rancher-istio/45043).
|
||||
|
||||
:::
|
||||
|
||||
In clusters where:
|
||||
|
||||
- You are using Rancher v2.5.8+ with any RKE2 network plug-in that supports the enforcement of Kubernetes network policies, such as Canal
|
||||
- The Project Network Isolation option is enabled
|
||||
- You install the Istio Ingress module
|
||||
|
||||
The Istio Ingress Gateway pod won't be able to redirect ingress traffic to the workloads by default. This is because all the namespaces will be inaccessible from the namespace where Istio is installed. You have two options.
|
||||
|
||||
The first option is to add a new Network Policy in each of the namespaces where you intend to have ingress controlled by Istio. Your policy should include the following lines:
|
||||
|
||||
```
- podSelector:
    matchLabels:
      app: istio-ingressgateway
```
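
Placed in a complete policy, the selector above might look like the following sketch. The policy name and target namespace are hypothetical. Because the ingress gateway pods run in the `istio-system` namespace, this sketch also adds a `namespaceSelector`; that part is an assumption, and it relies on the `kubernetes.io/metadata.name` label that recent Kubernetes versions set on every namespace.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-istio-ingressgateway   # hypothetical policy name
  namespace: my-app                  # hypothetical namespace with Istio-controlled ingress
spec:
  podSelector: {}                    # applies to all pods in the namespace
  policyTypes:
  - Ingress
  ingress:
  - from:
    # Assumption: match the gateway pods in the namespace where Istio is installed
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: istio-system
      podSelector:
        matchLabels:
          app: istio-ingressgateway
```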
|
||||
|
||||
The second option is to move the `istio-system` namespace to the `system` project, which by default is excluded from the network isolation.
|
||||
+132
@@ -0,0 +1,132 @@
|
||||
---
|
||||
title: Selectors and Scrape Configs
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/istio/configuration-options/selectors-and-scrape-configurations"/>
|
||||
</head>
|
||||
|
||||
:::warning
|
||||
|
||||
[Rancher-Istio](https://github.com/rancher/charts/tree/release-v2.11/charts/rancher-istio) will be deprecated in Rancher v2.12.0; turn to the [SUSE Rancher Application Collection](https://apps.rancher.io) build of Istio for enhanced security (included in SUSE Rancher Prime subscriptions).
|
||||
|
||||
Detailed information can be found in [this announcement](https://forums.suse.com/t/deprecation-of-rancher-istio/45043).
|
||||
|
||||
:::
|
||||
|
||||
The Monitoring app sets `prometheus.prometheusSpec.ignoreNamespaceSelectors=false`, which enables monitoring across all namespaces by default.
|
||||
|
||||
This ensures you can view traffic, metrics, and graphs for resources deployed in a namespace with the `istio-injection=enabled` label.
|
||||
|
||||
If you would like to limit Prometheus to specific namespaces, set `prometheus.prometheusSpec.ignoreNamespaceSelectors=true`. Once you do this, you must perform some additional configuration to continue to monitor your resources.
|
||||
|
||||
|
||||
### Limiting Monitoring to Specific Namespaces by Setting ignoreNamespaceSelectors to True
|
||||
|
||||
To limit monitoring to specific namespaces, you will edit the `ignoreNamespaceSelectors` Helm chart option. You will configure this option when installing or upgrading the Monitoring Helm chart:
|
||||
|
||||
1. When installing or upgrading the Monitoring Helm chart, edit the `values.yaml` and set `prometheus.prometheusSpec.ignoreNamespaceSelectors=true`.
|
||||
1. Complete the install or upgrade.
|
||||
|
||||
**Result:** Prometheus will be limited to specific namespaces, which means one of the following configurations will need to be set up to continue to view data in the various dashboards.
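
Expressed as nested YAML in the Monitoring chart values, the setting from the step above would look like this fragment (only the relevant keys are shown):

```yaml
prometheus:
  prometheusSpec:
    ignoreNamespaceSelectors: true
```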
|
||||
|
||||
### Enabling Prometheus to Detect Resources in Other Namespaces
|
||||
|
||||
There are two different ways to enable Prometheus to detect resources in other namespaces when `prometheus.prometheusSpec.ignoreNamespaceSelectors=true`:
|
||||
|
||||
- **Monitoring specific namespaces:** Add a Service Monitor or Pod Monitor in the namespace with the targets you want to scrape.
|
||||
- **Monitoring across namespaces:** Add an `additionalScrapeConfig` to your rancher-monitoring instance to scrape all targets in all namespaces.
|
||||
|
||||
### Monitoring Specific Namespaces: Create a Service Monitor or Pod Monitor
|
||||
|
||||
This option allows you to define which specific services or pods you would like monitored in a specific namespace.
|
||||
|
||||
The usability tradeoff is that you have to create the service monitor or pod monitor per namespace since you cannot monitor across namespaces.
|
||||
|
||||
:::note Prerequisite:
|
||||
|
||||
Define a ServiceMonitor or PodMonitor for `<your namespace>`. An example ServiceMonitor is provided below.
|
||||
|
||||
:::
|
||||
|
||||
1. Click **☰ > Cluster Management**.
|
||||
1. Go to the cluster that you created and click **Explore**.
|
||||
1. In the top navigation bar, open the kubectl shell.
|
||||
1. If the ServiceMonitor or PodMonitor file is stored locally in your cluster, run `kubectl create -f <name of service/pod monitor file>.yaml`.
|
||||
1. If the ServiceMonitor or PodMonitor is not stored locally, run `cat << EOF | kubectl apply -f -`, paste the file contents into the terminal, then enter `EOF` on its own line to complete the command.
|
||||
1. Run `kubectl label namespace <your namespace> istio-injection=enabled` to enable the envoy sidecar injection.
|
||||
|
||||
**Result:** `<your namespace>` can be scraped by Prometheus.
|
||||
|
||||
<figcaption>Example Service Monitor for Istio Proxies</figcaption>
|
||||
|
||||
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: envoy-stats-monitor
  namespace: istio-system
  labels:
    monitoring: istio-proxies
spec:
  selector:
    matchExpressions:
    - {key: istio-prometheus-ignore, operator: DoesNotExist}
  namespaceSelector:
    any: true
  jobLabel: envoy-stats
  endpoints:
  - path: /stats/prometheus
    targetPort: 15090
    interval: 15s
    relabelings:
    - sourceLabels: [__meta_kubernetes_pod_container_port_name]
      action: keep
      regex: '.*-envoy-prom'
    - action: labeldrop
      regex: "__meta_kubernetes_pod_label_(.+)"
    - sourceLabels: [__meta_kubernetes_namespace]
      action: replace
      targetLabel: namespace
    - sourceLabels: [__meta_kubernetes_pod_name]
      action: replace
      targetLabel: pod_name
```
|
||||
|
||||
### Monitoring across namespaces: Set ignoreNamespaceSelectors to False
|
||||
|
||||
This enables monitoring across namespaces by giving Prometheus additional scrape configurations.
|
||||
|
||||
The usability tradeoff is that all of Prometheus' `additionalScrapeConfigs` are maintained in a single Secret. This could make upgrading difficult if monitoring is already deployed with additionalScrapeConfigs before installing Istio.
|
||||
|
||||
1. When installing or upgrading the Monitoring Helm chart, edit the values.yml and set the `prometheus.prometheusSpec.additionalScrapeConfigs` array to the **Additional Scrape Config** provided below.
|
||||
1. Complete the install or upgrade.
|
||||
|
||||
**Result:** All namespaces with the `istio-injection=enabled` label will be scraped by Prometheus.
|
||||
|
||||
<figcaption>Additional Scrape Config</figcaption>
|
||||
|
||||
```yaml
- job_name: 'istio/envoy-stats'
  scrape_interval: 15s
  metrics_path: /stats/prometheus
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_container_port_name]
    action: keep
    regex: '.*-envoy-prom'
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:15090
    target_label: __address__
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: namespace
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: pod_name
```
|
||||
+75
@@ -0,0 +1,75 @@
|
||||
---
|
||||
title: CPU and Memory Allocations
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/istio/cpu-and-memory-allocations"/>
|
||||
</head>
|
||||
|
||||
:::warning
|
||||
|
||||
[Rancher-Istio](https://github.com/rancher/charts/tree/release-v2.11/charts/rancher-istio) will be deprecated in Rancher v2.12.0; turn to the [SUSE Rancher Application Collection](https://apps.rancher.io) build of Istio for enhanced security (included in SUSE Rancher Prime subscriptions).
|
||||
|
||||
Detailed information can be found in [this announcement](https://forums.suse.com/t/deprecation-of-rancher-istio/45043).
|
||||
|
||||
:::
|
||||
|
||||
This section describes the minimum recommended computing resources for the Istio components in a cluster.
|
||||
|
||||
The CPU and memory allocations for each component are [configurable.](#configuring-resource-allocations)
|
||||
|
||||
Before enabling Istio, we recommend that you confirm that your Rancher worker nodes have enough CPU and memory to run all of the components of Istio.
|
||||
|
||||
:::tip
|
||||
|
||||
In larger deployments, it is strongly advised that the infrastructure be placed on dedicated nodes in the cluster by adding a node selector for each Istio component.
|
||||
|
||||
:::
|
||||
|
||||
The table below shows a summary of the minimum recommended resource requests and limits for the CPU and memory of each core Istio component.
|
||||
|
||||
In Kubernetes, the resource request indicates that the workload will not be deployed on a node unless the node has at least the specified amount of memory and CPU available. If the workload surpasses the limit for CPU or memory, it can be terminated or evicted from the node. For more information on managing resource limits for containers, refer to the [Kubernetes documentation.](https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/)
|
||||
|
||||
| Workload | CPU - Request | Memory - Request | CPU - Limit | Memory - Limit |
|----------------------|---------------|------------|-----------------|-------------------|
| ingress gateway | 100m | 128Mi | 2000m | 1024Mi |
| egress gateway | 100m | 128Mi | 2000m | 1024Mi |
| istiod | 500m | 2048Mi | No limit | No limit |
| proxy | 10m | 10Mi | 2000m | 1024Mi |
| **Totals:** | **710m** | **2314Mi** | **6000m** | **3072Mi** |
|
||||
|
||||
## Configuring Resource Allocations
|
||||
|
||||
You can individually configure the resource allocation for each type of Istio component. This section includes the default resource allocations for each component.
|
||||
|
||||
To make it easier to schedule the workloads to a node, a cluster-admin can reduce the CPU and memory resource requests for the component. However, the default CPU and memory allocations are the minimum that we recommend.
|
||||
|
||||
You can find more information about Istio configuration in the [official Istio documentation](https://istio.io/).
|
||||
|
||||
To configure the resources allocated to an Istio component:
|
||||
|
||||
1. Click **☰ > Cluster Management**.
|
||||
1. Go to the cluster that you created and click **Explore**.
|
||||
1. In the left navigation bar, click **Apps**.
|
||||
1. Click **Installed Apps**.
|
||||
1. Go to the `istio-system` namespace. In one of the Istio workloads, such as `rancher-istio`, click **⋮ > Edit/Upgrade**.
|
||||
1. Click **Upgrade** to edit the base components via changes to the values.yaml or add an [overlay file](configuration-options/configuration-options.md#overlay-file). For more information about editing the overlay file, see [this section.](#editing-the-overlay-file)
|
||||
1. Change the CPU or memory allocations, the nodes where each component will be scheduled to, or the node tolerations.
|
||||
1. Click **Upgrade** to roll out the changes.
|
||||
|
||||
**Result:** The resource allocations for the Istio components are updated.
|
||||
|
||||
### Editing the Overlay File
|
||||
|
||||
The overlay file can contain any of the values in the [Istio Operator spec.](https://istio.io/latest/docs/reference/config/istio.operator.v1alpha1/#IstioOperatorSpec) The overlay file included with the Istio application is just one example of a potential configuration of the overlay file.
|
||||
|
||||
As long as the file contains `kind: IstioOperator` and the YAML options are valid, the file can be used as an overlay.
|
||||
|
||||
In the example overlay file provided with the Istio application, the following section allows you to change Kubernetes resources:
|
||||
|
||||
```
#      k8s:
#        resources:
#          requests:
#            cpu: 200m
```
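
As an illustration of the requirements above, the following is a minimal sketch of a complete overlay file that sets resource requests for Istiod. It assumes the standard `install.istio.io/v1alpha1` API version and the `pilot` component name from the Istio Operator spec; the memory value is a placeholder rather than a recommendation, so compare it against the defaults for your chart version before using it.

```yaml
# Sketch of an overlay file; values are placeholders, not recommendations.
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  components:
    pilot:                 # Istiod, the control plane component
      k8s:
        resources:
          requests:
            cpu: 200m      # same value as the commented example above
            memory: 512Mi  # placeholder value (assumption)
```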
|
||||
@@ -0,0 +1,54 @@
|
||||
---
|
||||
title: Disabling Istio
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/istio/disable-istio"/>
|
||||
</head>
|
||||
|
||||
:::warning
|
||||
|
||||
[Rancher-Istio](https://github.com/rancher/charts/tree/release-v2.11/charts/rancher-istio) will be deprecated in Rancher v2.12.0; turn to the [SUSE Rancher Application Collection](https://apps.rancher.io) build of Istio for enhanced security (included in SUSE Rancher Prime subscriptions).
|
||||
|
||||
Detailed information can be found in [this announcement](https://forums.suse.com/t/deprecation-of-rancher-istio/45043).
|
||||
|
||||
:::
|
||||
|
||||
This section describes how to uninstall Istio in a cluster or disable it in a namespace or workload.
|
||||
|
||||
## Uninstall Istio in a Cluster
|
||||
|
||||
To uninstall Istio,
|
||||
|
||||
1. Click **☰ > Cluster Management**.
|
||||
1. Go to the cluster that you created and click **Explore**.
|
||||
1. In the left navigation bar, click **Apps > Installed Apps**.
|
||||
1. In the `istio-system` namespace, go to `rancher-istio` and click **⋮ > Delete**.
|
||||
1. After `rancher-istio` is deleted, you can then select all the remaining apps in the `istio-system` namespace and click **Delete**.
|
||||
|
||||
**Result:** The `rancher-istio` app in the cluster gets removed. The Istio sidecar cannot be deployed on any workloads in the cluster.
|
||||
|
||||
:::note
|
||||
|
||||
You can no longer disable and re-enable your Istio installation. If you would like to save your settings for a future install, view and save the individual YAML files so that you can refer to or reuse them in future installations.
|
||||
|
||||
:::
|
||||
|
||||
**Troubleshooting Uninstall:** If you didn't follow the uninstall steps, you may encounter a warning during uninstall:
|
||||
|
||||
`Error: uninstallation completed with 1 error(s): unable to build kubernetes objects for delete: unable to recognize "": no matches for kind "MonitoringDashboard" in version "monitoring.kiali.io/v1alpha1"`
|
||||
|
||||
This can have one of two causes. Either you selected all the apps in the `istio-system` namespace and deleted them at the same time, or you deleted the `rancher-istio` chart dependencies before deleting the `rancher-istio` chart itself. Because the uninstall did not complete properly, resources remain in the `istio-system` namespace that you need to clean up manually. To avoid manual cleanup, you can instead install `rancher-istio` again and then uninstall it in the correct order.
|
||||
|
||||
## Disable Istio in a Namespace
|
||||
|
||||
1. Click **☰ > Cluster Management**.
|
||||
1. Go to the cluster that you created and click **Explore**.
|
||||
1. Click **Cluster > Projects/Namespaces**.
|
||||
1. Go to the namespace where you want to disable Istio and click **⋮ > Disable Istio Auto Injection**. Alternately, click the namespace, and then on the namespace detail page, click **⋮ > Disable Istio Auto Injection**.
|
||||
|
||||
**Result:** When workloads are deployed in this namespace, they will not have the Istio sidecar.
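
The UI action above is assumed to correspond to the namespace label that Istio uses for automatic sidecar injection. As a rough sketch, assuming the standard `istio-injection` label and a hypothetical namespace named `example-namespace`, the same state could be declared in YAML:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: example-namespace        # hypothetical namespace name
  labels:
    istio-injection: disabled    # new pods in this namespace get no sidecar
```

Workloads that are already running keep their sidecar until they are redeployed.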
|
||||
|
||||
## Remove the Istio Sidecar from a Workload
|
||||
|
||||
Disable Istio in the namespace, then redeploy the workloads within it. They will be deployed without the Istio sidecar.
|
||||
@@ -0,0 +1,157 @@
|
||||
---
|
||||
title: Istio
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/istio"/>
|
||||
</head>
|
||||
|
||||
:::warning
|
||||
|
||||
[Rancher-Istio](https://github.com/rancher/charts/tree/release-v2.11/charts/rancher-istio) will be deprecated in Rancher v2.12.0; turn to the [SUSE Rancher Application Collection](https://apps.rancher.io) build of Istio for enhanced security (included in SUSE Rancher Prime subscriptions).
|
||||
|
||||
Detailed information can be found in [this announcement](https://forums.suse.com/t/deprecation-of-rancher-istio/45043).
|
||||
|
||||
:::
|
||||
|
||||
[Istio](https://istio.io/) is an open-source tool that makes it easier for DevOps teams to observe, secure, control, and troubleshoot the traffic within a complex network of microservices.
|
||||
|
||||
As a network of microservices changes and grows, the interactions between them can become increasingly difficult to manage and understand. In such a situation, it is useful to have a service mesh as a separate infrastructure layer. Istio's service mesh lets you manipulate traffic between microservices without changing the microservices directly.
|
||||
|
||||
Our integration of Istio is designed so that a Rancher operator, such as an administrator or cluster owner, can deliver Istio to a team of developers. Developers can then use Istio to enforce security policies, troubleshoot problems, or manage traffic for blue/green deployments, canary deployments, or A/B testing.
|
||||
|
||||
This core service mesh provides features that include but are not limited to the following:
|
||||
|
||||
- **Traffic Management** such as ingress and egress routing, circuit breaking, mirroring.
|
||||
- **Security** with resources to authenticate and authorize traffic and users, mTLS included.
|
||||
- **Observability** of logs, metrics, and distributed traffic flows.
|
||||
|
||||
After [setting up Istio](../../how-to-guides/advanced-user-guides/istio-setup-guide/istio-setup-guide.md), you can leverage Istio's control plane functionality through the Rancher UI, `kubectl`, or `istioctl`.
|
||||
|
||||
Istio needs to be set up by a `cluster-admin` before it can be used in a project.
|
||||
|
||||
|
||||
## What's New in Rancher v2.5
|
||||
|
||||
The overall architecture of Istio has been simplified. A single component, Istiod, has been created by combining Pilot, Citadel, Galley and the sidecar injector. Node Agent functionality has also been merged into istio-agent.
|
||||
|
||||
Addons that were previously installed by Istio (cert-manager, Grafana, Jaeger, Kiali, Prometheus, Zipkin) will now need to be installed separately. Istio will support installation of integrations that are from the Istio Project and will maintain compatibility with those that are not.
|
||||
|
||||
A Prometheus integration will still be available through an installation of [Rancher Monitoring](../monitoring-and-alerting/monitoring-and-alerting.md), or by installing your own Prometheus operator. Rancher's Istio chart will also install Kiali by default to ensure you can get a full picture of your microservices out of the box.
|
||||
|
||||
Istio has migrated away from Helm as a way to install Istio and now provides installation through the istioctl binary or Istio Operator. To ensure the easiest interaction with Istio, Rancher's Istio will maintain a Helm chart that utilizes the istioctl binary to manage your Istio installation.
|
||||
|
||||
This Helm chart will be available via the Apps and Marketplace in the UI. A user that has access to the Rancher Chart's catalog will need to set up Istio before it can be used in the project.
|
||||
|
||||
## Tools Bundled with Istio
|
||||
|
||||
Our [Istio](https://istio.io/) installer wraps the istioctl binary commands in a handy Helm chart, including an overlay file option to allow complex customization.
|
||||
|
||||
It also includes the following:
|
||||
|
||||
### Kiali
|
||||
|
||||
[Kiali](https://kiali.io/) is a comprehensive visualization aid used for graphing traffic flow throughout the service mesh. It allows you to see how services are connected, including the traffic rates and latencies between them.
|
||||
|
||||
You can check the health of the service mesh, or drill down to see the incoming and outgoing requests to a single component.
|
||||
|
||||
:::note
|
||||
For Istio installations `103.1.0+up1.19.6` and later, Kiali uses a token value for its authentication strategy. The name of the Kiali service account in Rancher is `kiali`. Use this name if you are writing commands that require you to enter the name of the Kiali service account (for example, if you are trying to generate or retrieve a session token). For more information, refer to the [Kiali token authentication FAQ](https://kiali.io/docs/faq/authentication/).
|
||||
:::
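
One possible way to obtain a token for the `kiali` service account is to create a service account token Secret and read the token that Kubernetes populates in it. The sketch below assumes that Kiali runs in the `istio-system` namespace; the Secret name is hypothetical.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: kiali-token-example            # hypothetical Secret name
  namespace: istio-system              # assumes Kiali is installed here
  annotations:
    # Ties the generated token to the Kiali service account named in the note above.
    kubernetes.io/service-account.name: kiali
type: kubernetes.io/service-account-token
```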
|
||||
|
||||
### Jaeger
|
||||
|
||||
Our Istio installer includes a quick-start, all-in-one installation of [Jaeger,](https://www.jaegertracing.io/) a tool used for tracing distributed systems.
|
||||
|
||||
Note that this is not a production-qualified deployment of Jaeger. This deployment uses an in-memory storage component, while a persistent storage component is recommended for production. For more information on which deployment strategy you may need, refer to the [Jaeger documentation.](https://www.jaegertracing.io/docs/1.65/operator/#production-strategy)
|
||||
|
||||
## Prerequisites
|
||||
|
||||
Before enabling Istio, we recommend that you confirm that your Rancher worker nodes have enough [CPU and memory](cpu-and-memory-allocations.md) to run all of the components of Istio.
|
||||
|
||||
If you are installing Istio on an RKE2 cluster, some additional steps are required. For details, see [this section.](#additional-steps-for-installing-istio-on-an-rke2-cluster)
|
||||
|
||||
## Setup Guide
|
||||
|
||||
Refer to the [setup guide](../../how-to-guides/advanced-user-guides/istio-setup-guide/istio-setup-guide.md) for instructions on how to set up Istio and use it in a project.
|
||||
|
||||
## Remove Istio
|
||||
|
||||
To remove Istio components from a cluster, namespace, or workload, refer to the section on [uninstalling Istio.](disable-istio.md)
|
||||
|
||||
## Accessing Visualizations
|
||||
|
||||
> By default, only cluster-admins have access to Kiali. For instructions on how to allow admin, edit, or view roles to access it, see [this section.](rbac-for-istio.md)
|
||||
|
||||
:::note
|
||||
For Istio installations version `103.1.0+up1.19.6` and later, Kiali uses a token value for its authentication strategy. The name of the Kiali service account in Rancher is `kiali`. Use this name if you are writing commands that require you to enter the name of the Kiali service account (for example, if you are trying to generate or retrieve a session token). For more information, refer to the [Kiali token authentication FAQ](https://kiali.io/docs/faq/authentication/).
|
||||
:::
|
||||
|
||||
After Istio is set up in a cluster, Grafana, Prometheus, and Kiali are available in the Rancher UI.
|
||||
|
||||
To access the Grafana and Prometheus visualizations,
|
||||
|
||||
1. In the upper left corner, click **☰ > Cluster Management**.
|
||||
1. On the **Clusters** page, go to the cluster where you want to see the visualizations and click **Explore**.
|
||||
1. In the left navigation bar, click **Monitoring**.
|
||||
1. Click **Grafana** or any of the other dashboards.
|
||||
|
||||
To access the Kiali visualization,
|
||||
|
||||
1. In the upper left corner, click **☰ > Cluster Management**.
|
||||
1. On the **Clusters** page, go to the cluster where you want to see Kiali and click **Explore**.
|
||||
1. In the left navigation bar, click **Istio**.
|
||||
1. Click **Kiali**. From here you can access the **Traffic Graph** tab or the **Traffic Metrics** tab to see network visualizations and metrics.
|
||||
|
||||
By default, all namespaces are picked up by Prometheus, which makes their data available for Kiali graphs. Refer to [selector/scrape config setup](configuration-options/selectors-and-scrape-configurations.md) if you would like to use a different configuration for Prometheus data scraping.
|
||||
|
||||
Your access to the visualizations depends on your role. Grafana and Prometheus are available only to users with the `cluster-admin` role. The Kiali UI is also available only to `cluster-admin` by default, but a `cluster-admin` can allow other roles to access it by editing the Istio values.yaml.
|
||||
|
||||
## Architecture
|
||||
|
||||
Istio installs a service mesh that uses [Envoy](https://www.envoyproxy.io) sidecar proxies to intercept traffic to each workload. These sidecars intercept and manage service-to-service communication, allowing fine-grained observation and control over traffic within the cluster.
|
||||
|
||||
Only workloads that have the Istio sidecar injected can be tracked and controlled by Istio.
|
||||
|
||||
When a namespace has Istio enabled, new workloads deployed in the namespace will automatically have the Istio sidecar. You need to manually enable Istio in preexisting workloads.
|
||||
|
||||
For more information on the Istio sidecar, refer to the [Istio sidecar-injection docs](https://istio.io/docs/setup/kubernetes/additional-setup/sidecar-injection/). For more information on Istio's architecture, refer to the [Istio Architecture docs](https://istio.io/latest/docs/ops/deployment/architecture/).
|
||||
|
||||
### Multiple Ingresses
|
||||
|
||||
By default, each Rancher-provisioned cluster has one NGINX ingress controller allowing traffic into the cluster. Istio also installs an ingress gateway by default into the `istio-system` namespace. As a result, your cluster will have two ingresses.
|
||||
|
||||

|
||||
|
||||
Additional Istio Ingress gateways can be enabled via the [overlay file](configuration-options/configuration-options.md#overlay-file).
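
As a sketch of what such an overlay might look like, the following IstioOperator fragment keeps the default ingress gateway and enables a second one. It assumes the standard `install.istio.io/v1alpha1` API; the name of the second gateway is hypothetical.

```yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  components:
    ingressGateways:
      - name: istio-ingressgateway            # default gateway
        enabled: true
      - name: example-second-ingressgateway   # hypothetical additional gateway
        namespace: istio-system
        enabled: true
```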
|
||||
|
||||
### Egress Support
|
||||
|
||||
By default, the egress gateway is disabled, but it can be enabled on install or upgrade through the values.yaml or via the [overlay file](configuration-options/configuration-options.md#overlay-file).
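
For the overlay route, a minimal sketch could look like the following. It assumes the standard `install.istio.io/v1alpha1` IstioOperator API and Istio's conventional egress gateway name; the equivalent key in the chart's values.yaml is not shown here because it may differ between chart versions.

```yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  components:
    egressGateways:
      - name: istio-egressgateway   # conventional Istio name (assumption)
        enabled: true
```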
|
||||
|
||||
## Additional Steps for Installing Istio on an RKE2 Cluster
|
||||
|
||||
To install Istio on an RKE2 cluster, follow the steps in [this section.](configuration-options/install-istio-on-rke2-cluster.md)
|
||||
|
||||
## Upgrading Istio in an Air-Gapped Environment
|
||||
|
||||
The Istio pod security policy is now enabled by default. A new value, `installer.releaseMirror.enabled`, has been added to the rancher-istio chart to enable and disable the server that supports air-gapped upgrades. Note that `installer.releaseMirror.enabled` is set to `false` by default. You can set this value as needed when you install or upgrade. Follow the steps below:
|
||||
|
||||
1. Provision an air-gapped Rancher instance and an air-gapped custom cluster in the Rancher UI.
|
||||
2. Install Monitoring in the cluster: **Cluster Explorer -> Apps & Marketplace -> Charts -> Monitoring**.
|
||||
3. Pull all required images for Istio into the private registry you will use in the air-gapped environment.
|
||||
4. Install Istio in the cluster: **Cluster Explorer -> Apps & Marketplace -> Charts -> Istio**.
|
||||
|
||||
:::note
|
||||
|
||||
You can enable [Jaeger](https://www.jaegertracing.io/) and [Kiali](https://kiali.io/) on a fresh Istio install. To ensure that Jaeger and Kiali work, set `installer.releaseMirror.enabled` to `true` in `values.yaml` during installation.
|
||||
|
||||
:::
|
||||
|
||||
5. Upgrade the Istio installation.
|
||||
|
||||
:::caution
|
||||
|
||||
If you haven't already, set `installer.releaseMirror.enabled=true` to upgrade Istio.
|
||||
|
||||
:::
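
As a sketch, the setting described above can be expressed in the chart's `values.yaml` as follows. Only the `installer.releaseMirror.enabled` key comes from this page; the rest of your values file is assumed to stay unchanged.

```yaml
# values.yaml fragment for the rancher-istio chart
installer:
  releaseMirror:
    enabled: true   # defaults to false; required for air-gapped upgrades
```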
|
||||
@@ -0,0 +1,55 @@
|
||||
---
|
||||
title: Role-based Access Control
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/istio/rbac-for-istio"/>
|
||||
</head>
|
||||
|
||||
:::warning
|
||||
|
||||
[Rancher-Istio](https://github.com/rancher/charts/tree/release-v2.11/charts/rancher-istio) will be deprecated in Rancher v2.12.0; turn to the [SUSE Rancher Application Collection](https://apps.rancher.io) build of Istio for enhanced security (included in SUSE Rancher Prime subscriptions).
|
||||
|
||||
Detailed information can be found in [this announcement](https://forums.suse.com/t/deprecation-of-rancher-istio/45043).
|
||||
|
||||
:::
|
||||
|
||||
This section describes the permissions required to access Istio features.
|
||||
|
||||
The Rancher Istio chart installs three `ClusterRoles`.
|
||||
|
||||
## Cluster-Admin Access
|
||||
|
||||
By default, only those with the `cluster-admin` `ClusterRole` can:
|
||||
|
||||
- Install the Istio app in a cluster
|
||||
- Configure resource allocations for Istio
|
||||
|
||||
|
||||
## Admin and Edit Access
|
||||
|
||||
By default, only Admin and Edit roles can:
|
||||
|
||||
- Enable and disable Istio sidecar auto-injection for namespaces
|
||||
- Add the Istio sidecar to workloads
|
||||
- View the traffic metrics and traffic graph for the cluster
|
||||
- Configure Istio's resources (such as the gateway, destination rules, or virtual services)
|
||||
|
||||
## Summary of Default Permissions for Kubernetes Default Roles
|
||||
|
||||
Istio creates three `ClusterRoles` and adds Istio CRD access to the following default K8s `ClusterRoles`:
|
||||
|
||||
| ClusterRole created by chart | Default K8s ClusterRole | Rancher Role |
| ------------------------------:| ---------------------------:|---------:|
| `istio-admin` | admin | Project Owner |
| `istio-edit` | edit | Project Member |
| `istio-view` | view | Read-only |

|
||||
Rancher will continue to use cluster-owner, cluster-member, project-owner, project-member, etc. as role names, but will use the default roles to determine access. For each default K8s `ClusterRole`, there are different Istio CRD permissions and K8s actions (Create ( C ), Get ( G ), List ( L ), Watch ( W ), Update ( U ), Patch ( P ), Delete ( D ), All ( * )) that can be performed.
|
||||
|
||||
|
||||
| CRDs | Admin | Edit | View |
|----------------------------| ------| -----| -----|
| <ul><li>`config.istio.io`</li><ul><li>`adapters`</li><li>`attributemanifests`</li><li>`handlers`</li><li>`httpapispecbindings`</li><li>`httpapispecs`</li><li>`instances`</li><li>`quotaspecbindings`</li><li>`quotaspecs`</li><li>`rules`</li><li>`templates`</li></ul></ul> | GLW | GLW | GLW |
| <ul><li>`networking.istio.io`</li><ul><li>`destinationrules`</li><li>`envoyfilters`</li><li>`gateways`</li><li>`serviceentries`</li><li>`sidecars`</li><li>`virtualservices`</li><li>`workloadentries`</li></ul></ul> | * | * | GLW |
| <ul><li>`security.istio.io`</li><ul><li>`authorizationpolicies`</li><li>`peerauthentications`</li><li>`requestauthentications`</li></ul></ul> | * | * | GLW |

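
To illustrate how read-only access to Istio CRDs can be added to a default Kubernetes role, the following is a sketch of a `ClusterRole` that uses the standard Kubernetes aggregation label for `view`. It is not the manifest that the chart actually installs; the role name is hypothetical and the rules only mirror the GLW column of the table above.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: example-istio-view   # hypothetical name, not the chart's istio-view role
  labels:
    # Standard Kubernetes label that merges these rules into the default view role.
    rbac.authorization.k8s.io/aggregate-to-view: "true"
rules:
  - apiGroups: ["networking.istio.io", "security.istio.io"]
    resources: ["*"]
    verbs: ["get", "list", "watch"]   # G, L, and W in the table above
```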
+35
@@ -0,0 +1,35 @@
|
||||
---
|
||||
title: Kubernetes Distributions
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/kubernetes-distributions"/>
|
||||
</head>
|
||||
|
||||
## K3s
|
||||
|
||||
K3s is a lightweight, fully compliant Kubernetes distribution designed for a range of use cases, including edge computing, IoT, CI/CD, development, and embedding Kubernetes into applications. It simplifies Kubernetes management by packaging the system as a single binary, using sqlite3 as the default storage, and offering a user-friendly launcher. K3s includes essential features such as local storage, load balancing, a Helm chart controller, and the Traefik ingress controller. It minimizes external dependencies and provides a streamlined Kubernetes experience. K3s was donated to the CNCF as a Sandbox Project in June 2020.
|
||||
|
||||
### K3s with Rancher
|
||||
|
||||
- Rancher allows easy provisioning of K3s across a range of platforms including Amazon EC2, DigitalOcean, Azure, vSphere, or existing servers.
|
||||
- Standard Rancher management of Kubernetes clusters including all outlined [cluster management capabilities](../../how-to-guides/new-user-guides/kubernetes-clusters-in-rancher-setup/kubernetes-clusters-in-rancher-setup.md#cluster-management-capabilities-by-cluster-type).
|
||||
|
||||
|
||||
## RKE2
|
||||
|
||||
RKE2 is a compliant Kubernetes distribution developed by Rancher. It is specifically designed for security and compliance within the U.S. Federal Government sector.
|
||||
|
||||
Primary characteristics of RKE2 include:
|
||||
|
||||
1. **Security and Compliance Focus**: RKE2 places a strong emphasis on security and compliance, operating under a "secure by default" framework, making it suitable for government services and highly regulated industries like finance and healthcare.
|
||||
1. **CIS Kubernetes Benchmark Conformance**: RKE2 comes pre-configured to meet the CIS Kubernetes Hardening Benchmark (currently supporting v1.23 and v1.7), with minimal manual intervention required.
|
||||
1. **FIPS 140-2 Compliance**: RKE2 complies with the FIPS 140-2 standard using FIPS-validated crypto modules for its components.
|
||||
1. **Embedded etcd**: RKE2 defaults to using an embedded etcd as its data store. This aligns it more closely with standard Kubernetes practices, allowing better integration with other Kubernetes tools and reducing the risk of misconfiguration.
|
||||
1. **Alignment with Upstream Kubernetes**: RKE2 aims to stay closely aligned with upstream Kubernetes, reducing the risk of non-conformance that may occur when using distributions that deviate from standard Kubernetes practices.
|
||||
1. **Multiple CNI Support**: RKE2 offers support for multiple Container Network Interface (CNI) plugins, including Cilium, Calico, and Multus. This is essential for use cases such as telco distribution centers and factories with various production facilities.
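
As a rough sketch of how the CIS and CNI characteristics above surface in practice, an RKE2 server configuration file might look like the following. It assumes the default config file location and the `profile` and `cni` options; the accepted `profile` value depends on the RKE2 version, so check the RKE2 documentation for your release.

```yaml
# /etc/rancher/rke2/config.yaml (assumed default location)
profile: cis    # CIS hardening profile; the exact value varies by RKE2 version
cni:
  - multus      # Multus is listed first when combined with another CNI plugin
  - cilium
```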
|
||||
|
||||
### RKE2 with Rancher
|
||||
|
||||
- Rancher allows easy provisioning of RKE2 across a range of platforms including Amazon EC2, DigitalOcean, Azure, vSphere, or existing servers.
|
||||
- Standard Rancher management of Kubernetes clusters including all outlined [cluster management capabilities](../../how-to-guides/new-user-guides/kubernetes-clusters-in-rancher-setup/kubernetes-clusters-in-rancher-setup.md#cluster-management-capabilities-by-cluster-type).
|
||||
@@ -0,0 +1,35 @@
|
||||
---
|
||||
title: Advanced Policy Management with Kubewarden
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/kubewarden"/>
|
||||
</head>
|
||||
|
||||
Kubewarden is a Policy Engine that secures and helps manage your cluster resources. It allows for validation and mutation of resource requests via policies, including context-aware policies and verifying image signatures. It can run policies in monitor or enforcing mode and provides an overview of the state of the cluster.
|
||||
|
||||
Kubewarden aims to be the Universal Policy Engine by enabling and simplifying Policy as Code. Kubewarden policies are compiled into WebAssembly: they are small (400 KB to 2 MB), sandboxed, secure, and portable. It aims to be universal by catering to each persona in your organization:
|
||||
|
||||
- Policy User: manage and declare policies using Kubernetes Custom Resources, reuse existing policies written in Rego (OPA and Gatekeeper). Test the policies outside the cluster in CI/CD.
|
||||
- Policy Developer: write policies in your preferred Wasm-compiling language (Rego, Go, Rust, C#, Swift, TypeScript, and more to come). Reuse the ecosystem of tools, libraries, and workflows you already know.
|
||||
- Policy Distributor: policies are OCI artifacts, serve them through your OCI repository and use industry standards in your infrastructure, like Software-Bill-Of-Materials and artifact signatures.
|
||||
- Cluster Operator: Kubewarden is modular (OCI registry, PolicyServers, Audit Scanner, Controller). Configure your deployment to suit your needs, segregating different tenants. Get an overview of past, current, and possible violations across the cluster with the Audit Scanner and the PolicyReports.
|
||||
- Kubewarden Integrator: use it as a platform to write new Kubewarden modules and custom policies.
|
||||
|
||||
## Kubewarden with Rancher
|
||||
|
||||
Kubewarden’s upstream Helm charts are fully integrated as Rancher Apps, providing a UI for the install options. The charts also come with defaults that respect the Rancher stack (for example, not policing Rancher system namespaces), and with a default PolicyServer and default Policies. Users have access to all Kubewarden features and can deploy PolicyServers and Policies manually by interacting with the Kubernetes API (for example, by using `kubectl`).
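
As an illustration of deploying a policy through the Kubernetes API, the following is a sketch of a `ClusterAdmissionPolicy` that evaluates pod creation and update requests in monitor mode. The policy name is hypothetical, and the module reference and version tag are assumptions; check the policy catalog for the current location before applying it.

```yaml
apiVersion: policies.kubewarden.io/v1
kind: ClusterAdmissionPolicy
metadata:
  name: example-no-privileged-pods   # hypothetical policy name
spec:
  # Assumed module location and tag; verify against the policy catalog.
  module: registry://ghcr.io/kubewarden/policies/pod-privileged:v0.2.2
  rules:
    - apiGroups: [""]
      apiVersions: ["v1"]
      resources: ["pods"]
      operations: ["CREATE", "UPDATE"]
  mutating: false
  mode: monitor   # report violations without rejecting requests
```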
|
||||
|
||||
Kubewarden provides a full replacement of the removed Kubernetes Pod Security Policies. Kubewarden also integrates with the new Pod Security Admission feature introduced by a recent version of Kubernetes by augmenting its security capabilities.
|
||||
|
||||
## Kubewarden with Rancher Prime
|
||||
|
||||
The available Rancher UI Extension for Kubewarden integrates it into the Rancher UI. The UI Extension automates the installation and configuration of the Kubewarden stack and configures access to the policies maintained by SUSE. The UI Extension provides access to a curated catalog of ready-to-use policies. Using the UI Extension, one can browse, install, and configure these policies.
|
||||
|
||||
The UI Extension provides an overview of the Kubewarden stack components and their behavior. This includes access to the Kubewarden metrics and trace events. An operator can understand the impact of policies on the cluster and troubleshoot issues.
|
||||
|
||||
In addition, the UI Extension provides the Policy Reporter UI, which gives a visual overview of the compliance status of the Kubernetes cluster. With this UI, an operator can quickly identify all non-compliant Kubernetes resources, understand the reasons for violations and act accordingly.
|
||||
All of this is covered by the Rancher Prime support offering.
|
||||
|
||||
|
||||
|
||||
+12
@@ -0,0 +1,12 @@
|
||||
---
|
||||
title: Custom Resource Configuration
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/logging/custom-resource-configuration"/>
|
||||
</head>
|
||||
|
||||
The following Custom Resource Definitions are used to configure logging:
|
||||
|
||||
- [Flow and ClusterFlow](flows-and-clusterflows.md)
|
||||
- [Output and ClusterOutput](outputs-and-clusteroutputs.md)
|
||||
+79
@@ -0,0 +1,79 @@
|
||||
---
|
||||
title: Flows and ClusterFlows
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/logging/custom-resource-configuration/flows-and-clusterflows"/>
|
||||
</head>
|
||||
|
||||
See the [Logging operator documentation](https://kube-logging.github.io/docs/configuration/flow/) for the full details on how to configure `Flows` and `ClusterFlows`.
|
||||
|
||||
See [Rancher Integration with Logging Services: Troubleshooting](../logging.md#The-Logging-Buffer-Overloads-Pods) for how to resolve memory problems with the logging buffer.
|
||||
|
||||
## Flows
|
||||
|
||||
A `Flow` defines which logs to collect and filter and which output to send the logs to.
|
||||
|
||||
The `Flow` is a namespaced resource, which means logs will only be collected from the namespace that the `Flow` is deployed in.
|
||||
|
||||
`Flows` can be configured by filling out forms in the Rancher UI.
|
||||
|
||||
For more details about the `Flow` custom resource, see [FlowSpec.](https://kube-logging.github.io/docs/configuration/crds/v1beta1/flow_types/)
|
||||
|
||||
### Matches
|
||||
|
||||
Match statements are used to select which containers to pull logs from.
|
||||
|
||||
You can specify match statements to select or exclude logs according to Kubernetes labels, container and host names. Match statements are evaluated in the order they are defined and processed only until the first matching select or exclude rule applies.
|
||||
|
||||
Matches can be configured by filling out the `Flow` or `ClusterFlow` forms in the Rancher UI.
|
||||
|
||||
For detailed examples on using the match statement, see the [official documentation on log routing.](https://kube-logging.github.io/docs/configuration/log-routing/)
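
To make the first-match behavior concrete, here is a sketch of a `Flow` whose `exclude` rule is evaluated before its `select` rule. The Flow, container, and `Output` names are hypothetical. Logs from the `example-sidecar` container in `app: nginx` pods are dropped by the first rule; all other logs from those pods are selected by the second.

```yaml
apiVersion: logging.banzaicloud.io/v1beta1
kind: Flow
metadata:
  name: example-match-flow     # hypothetical Flow name
  namespace: default
spec:
  match:
    # Evaluated first: drop logs from one container in the nginx pods.
    - exclude:
        labels:
          app: nginx
        container_names:
          - example-sidecar    # hypothetical container name
    # Evaluated second: keep everything else from the nginx pods.
    - select:
        labels:
          app: nginx
  localOutputRefs:
    - example-output           # hypothetical Output in the same namespace
```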
|
||||
|
||||
### Filters
|
||||
|
||||
You can define one or more filters within a `Flow`. Filters can perform various actions on the logs, such as adding data, transforming the logs, or parsing values from the records. The filters in the `Flow` are applied in the same order they appear in the definition.
|
||||
|
||||
For a list of filters supported by the Logging operator, see [the official documentation on Fluentd filters](https://kube-logging.github.io/docs/configuration/plugins/filters/).
|
||||
|
||||
Filters need to be configured in YAML.
|
||||
|
||||
### Outputs
|
||||
|
||||
The `Output` receives logs from the `Flow`. Because the `Flow` is a namespaced resource, the `Output` must reside in the same namespace as the `Flow`.
|
||||
|
||||
`Outputs` can be referenced when filling out the `Flow` or `ClusterFlow` forms in the Rancher UI.
|
||||
|
||||
## ClusterFlows
|
||||
|
||||
Matches, filters and `Outputs` are configured for `ClusterFlows` in the same way that they are configured for `Flows`. The key difference is that the `ClusterFlow` is scoped at the cluster level and can configure log collection across all namespaces.
|
||||
|
||||
`ClusterFlows` can be configured by filling out forms in the Rancher UI.
|
||||
|
||||
After a `ClusterFlow` selects logs from all namespaces in the cluster, those logs are collected and sent to the selected `ClusterOutput`.
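
As a minimal sketch, a `ClusterFlow` that sends all cluster logs to a `ClusterOutput` could look like the following. It assumes that the logging operator runs in the `cattle-logging-system` namespace; the `ClusterFlow` and `ClusterOutput` names are hypothetical.

```yaml
apiVersion: logging.banzaicloud.io/v1beta1
kind: ClusterFlow
metadata:
  name: example-all-logs                 # hypothetical name
  namespace: cattle-logging-system       # assumed operator namespace
spec:
  # No match section: logs from every namespace are selected.
  globalOutputRefs:
    - example-cluster-output             # hypothetical ClusterOutput name
```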
|
||||
|
||||
## YAML Example
|
||||
|
||||
The following example `Flow` transforms the log messages from the default namespace and sends them to an S3 `Output`:
|
||||
|
||||
```yaml
apiVersion: logging.banzaicloud.io/v1beta1
kind: Flow
metadata:
  name: flow-sample
  namespace: default
spec:
  filters:
    - parser:
        remove_key_name_field: true
        parse:
          type: nginx
    - tag_normaliser:
        format: ${namespace_name}.${pod_name}.${container_name}
  localOutputRefs:
    - s3-output
  match:
    - select:
        labels:
          app: nginx
```
|
||||
+299
@@ -0,0 +1,299 @@
|
||||
---
|
||||
title: Outputs and ClusterOutputs
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/logging/custom-resource-configuration/outputs-and-clusteroutputs"/>
|
||||
</head>
|
||||
|
||||
See the [Logging operator documentation](https://kube-logging.github.io/docs/configuration/output/) for the full details on how to configure `Outputs` and `ClusterOutputs`.
|
||||
|
||||
See [Rancher Integration with Logging Services: Troubleshooting](../logging.md#The-Logging-Buffer-Overloads-Pods) for how to resolve memory problems with the logging buffer.
|
||||
|
||||
## Outputs
|
||||
|
||||
The `Output` resource defines where your `Flows` can send the log messages. `Outputs` are the final stage for a logging `Flow`.
|
||||
|
||||
The `Output` is a namespaced resource, which means only a `Flow` within the same namespace can access it.
|
||||
|
||||
You can use secrets in these definitions, but they must also be in the same namespace.
|
||||
|
||||
`Outputs` can be configured by filling out forms in the Rancher UI.
|
||||
|
||||
For the details of the `Output` custom resource, see [OutputSpec.](https://kube-logging.github.io/docs/configuration/crds/v1beta1/output_types/)
|
||||
|
||||
The Rancher UI provides forms for configuring the following `Output` types:
|
||||
|
||||
- Amazon ElasticSearch
|
||||
- Azure Storage
|
||||
- Cloudwatch
|
||||
- Datadog
|
||||
- Elasticsearch
|
||||
- File
|
||||
- Fluentd
|
||||
- GCS
|
||||
- Kafka
|
||||
- Kinesis Stream
|
||||
- LogDNA
|
||||
- LogZ
|
||||
- Loki
|
||||
- New Relic
|
||||
- Splunk
|
||||
- SumoLogic
|
||||
- Syslog
|
||||
|
||||
The Rancher UI provides forms for configuring the `Output` type, target, and access credentials if applicable.
|
||||
|
||||
For example configuration for each logging plugin supported by the logging operator, see the [Logging operator documentation](https://kube-logging.github.io/docs/configuration/plugins/outputs/).
|
||||
|
||||
## ClusterOutputs
|
||||
|
||||
`ClusterOutput` defines an `Output` without namespace restrictions. It is only effective when deployed in the same namespace as the logging operator.
|
||||
|
||||
`ClusterOutputs` can be configured by filling out forms in the Rancher UI.
|
||||
|
||||
For the details of the `ClusterOutput` custom resource, see [ClusterOutput.](https://kube-logging.github.io/docs/configuration/crds/v1beta1/clusteroutput_types/)
|
||||
|
||||
## YAML Examples
|
||||
|
||||
Once logging is installed, you can use these examples to help craft your own logging pipeline.
|
||||
|
||||
- [Cluster Output to ElasticSearch](#cluster-output-to-elasticsearch)
|
||||
- [Output to Splunk](#output-to-splunk)
|
||||
- [Output to Syslog](#output-to-syslog)
|
||||
- [Unsupported Outputs](#unsupported-outputs)
|
||||
|
||||
### Cluster Output to ElasticSearch
|
||||
|
||||
Let's say you want to send all logs in your cluster to an `elasticsearch` cluster. First, we create a `ClusterOutput`.
|
||||
|
||||
```yaml
apiVersion: logging.banzaicloud.io/v1beta1
kind: ClusterOutput
metadata:
  name: "example-es"
  namespace: "cattle-logging-system"
spec:
  elasticsearch:
    host: elasticsearch.example.com
    port: 9200
    scheme: http
```
|
||||
|
||||
We have created this `ClusterOutput`, with a basic Elasticsearch configuration, in the same namespace as our operator: `cattle-logging-system`. Any time we create a `ClusterFlow` or `ClusterOutput`, we have to put it in the `cattle-logging-system` namespace.
|
||||
|
||||
Now that we have configured where we want the logs to go, let's configure all logs to go to that `ClusterOutput`.
|
||||
|
||||
```yaml
apiVersion: logging.banzaicloud.io/v1beta1
kind: ClusterFlow
metadata:
  name: "all-logs"
  namespace: "cattle-logging-system"
spec:
  globalOutputRefs:
    - "example-es"
```
|
||||
|
||||
We should now see our configured index with logs in it.
|
||||
|
||||
|
||||
### Output to Splunk
|
||||
|
||||
What if we have an application team that only wants logs from a specific namespace sent to a `splunk` server? For this case, we can use namespaced `Outputs` and `Flows`.
|
||||
|
||||
Before we start, let's set up that team's application: `coolapp`.
|
||||
|
||||
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: devteam
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coolapp
  namespace: devteam
  labels:
    app: coolapp
spec:
  replicas: 2
  selector:
    matchLabels:
      app: coolapp
  template:
    metadata:
      labels:
        app: coolapp
    spec:
      containers:
        - name: generator
          image: paynejacob/loggenerator:latest
```
|
||||
|
||||
With `coolapp` running, we will follow a similar path as when we created a `ClusterOutput`. However, unlike `ClusterOutputs`, we create our `Output` in our application's namespace.
|
||||
|
||||
```yaml
apiVersion: logging.banzaicloud.io/v1beta1
kind: Output
metadata:
  name: "devteam-splunk"
  namespace: "devteam"
spec:
  splunkHec:
    hec_host: splunk.example.com
    hec_port: 8088
    protocol: http
```
|
||||
|
||||
Once again, let's feed our `Output` some logs:
|
||||
|
||||
```yaml
apiVersion: logging.banzaicloud.io/v1beta1
kind: Flow
metadata:
  name: "devteam-logs"
  namespace: "devteam"
spec:
  localOutputRefs:
    - "devteam-splunk"
```
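If the team later wants only the `coolapp` logs rather than everything in the namespace, a `Flow` can be narrowed with a label selector. The following is a sketch only: the `Flow` name `coolapp-logs` is hypothetical, and it assumes the pods carry the `app: coolapp` label set by the deployment earlier in this example.

```yaml
apiVersion: logging.banzaicloud.io/v1beta1
kind: Flow
metadata:
  name: "coolapp-logs"          # hypothetical name
  namespace: "devteam"
spec:
  match:
    - select:
        labels:
          app: coolapp          # only pods with this label are selected
  localOutputRefs:
    - "devteam-splunk"
```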
|
||||
|
||||
|
||||
### Output to Syslog
|
||||
|
||||
Let's say you want to send all logs in your cluster to a `syslog` server. First, we create a `ClusterOutput`:
|
||||
|
||||
```yaml
apiVersion: logging.banzaicloud.io/v1beta1
kind: ClusterOutput
metadata:
  name: "example-syslog"
  namespace: "cattle-logging-system"
spec:
  syslog:
    buffer:
      timekey: 30s
      timekey_use_utc: true
      timekey_wait: 10s
      flush_interval: 5s
    format:
      type: json
      app_name_field: test
    host: syslog.example.com
    insecure: true
    port: 514
    transport: tcp
```
|
||||
|
||||
Now that we have configured where we want the logs to go, let's configure all logs to go to that `ClusterOutput`.
|
||||
|
||||
```yaml
apiVersion: logging.banzaicloud.io/v1beta1
kind: ClusterFlow
metadata:
  name: "all-logs"
  namespace: cattle-logging-system
spec:
  globalOutputRefs:
    - "example-syslog"
```
|
||||
|
||||
### Unsupported Outputs
|
||||
|
||||
For the final example, we create an `Output` to write logs to a destination that is not supported out of the box:
|
||||
|
||||
:::note Note on syslog:
|
||||
|
||||
`syslog` is a supported `Output`. However, this example still provides an overview on using unsupported plugins.
|
||||
|
||||
:::
|
||||
|
||||
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: syslog-config
  namespace: cattle-logging-system
type: Opaque
stringData:
  fluent-bit.conf: |
    [INPUT]
        Name              forward
        Port              24224

    [OUTPUT]
        Name              syslog
        InstanceName      syslog-output
        Match             *
        Addr              syslog.example.com
        Port              514
        Cluster           ranchers

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fluentbit-syslog-forwarder
  namespace: cattle-logging-system
  labels:
    output: syslog
spec:
  selector:
    matchLabels:
      output: syslog
  template:
    metadata:
      labels:
        output: syslog
    spec:
      containers:
        - name: fluentbit
          image: paynejacob/fluent-bit-out-syslog:latest
          ports:
            - containerPort: 24224
          volumeMounts:
            - mountPath: "/fluent-bit/etc/"
              name: configuration
      volumes:
        - name: configuration
          secret:
            secretName: syslog-config
---
apiVersion: v1
kind: Service
metadata:
  name: syslog-forwarder
  namespace: cattle-logging-system
spec:
  selector:
    output: syslog
  ports:
    - protocol: TCP
      port: 24224
      targetPort: 24224
---
apiVersion: logging.banzaicloud.io/v1beta1
kind: ClusterFlow
metadata:
  name: all-logs
  namespace: cattle-logging-system
spec:
  globalOutputRefs:
    - syslog
---
apiVersion: logging.banzaicloud.io/v1beta1
kind: ClusterOutput
metadata:
  name: syslog
  namespace: cattle-logging-system
spec:
  forward:
    servers:
      - host: "syslog-forwarder.cattle-logging-system"
    require_ack_response: false
    ignore_network_errors_at_startup: false
```
|
||||
|
||||
Let's break down what is happening here. First, we create a deployment of a `fluentbit` container that has the additional `syslog` plugin and accepts logs forwarded from `fluentd`. Next, we create a `ClusterOutput` configured as a forwarder to our deployment. The `fluentbit` deployment will then forward all logs to the configured `syslog` destination.
|
||||
@@ -0,0 +1,32 @@
|
||||
---
|
||||
title: Logging Architecture
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/logging/logging-architecture"/>
|
||||
</head>
|
||||
|
||||
This section summarizes the architecture of the Rancher logging application.
|
||||
|
||||
For more details about how the Logging operator works, see the [official documentation.](https://kube-logging.github.io/docs/#architecture)
|
||||
|
||||
## How the Logging Operator Works
|
||||
|
||||
The Logging operator automates the deployment and configuration of a Kubernetes logging pipeline. It deploys and configures a Fluent Bit DaemonSet on every node to collect container and application logs from the node file system.
|
||||
|
||||
Fluent Bit queries the Kubernetes API and enriches the logs with metadata about the pods, and transfers both the logs and the metadata to Fluentd. Fluentd receives, filters, and transfers logs to multiple `Outputs`.
|
||||
|
||||
The following custom resources are used to define how logs are filtered and sent to their `Outputs`:
|
||||
|
||||
- A `Flow` is a namespaced custom resource that uses filters and selectors to route log messages to the appropriate `Outputs`.
|
||||
- A `ClusterFlow` is used to route cluster-level log messages.
|
||||
- An `Output` is a namespaced resource that defines where the log messages are sent.
|
||||
- A `ClusterOutput` defines an `Output` that is available from all `Flows` and `ClusterFlows`.
|
||||
|
||||
Each `Flow` must reference an `Output`, and each `ClusterFlow` must reference a `ClusterOutput`.
|
||||
|
||||
The following figure from the [Logging Operator documentation](https://kube-logging.github.io/docs/#architecture) shows the new logging architecture:
|
||||
|
||||
<figcaption>How the Logging Operator Works with Fluentd and Fluent Bit</figcaption>
|
||||
|
||||

|
||||
+97
@@ -0,0 +1,97 @@
|
||||
---
|
||||
title: rancher-logging Helm Chart Options
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/logging/logging-helm-chart-options"/>
|
||||
</head>
|
||||
|
||||
## Enable/Disable Windows Node Logging
|
||||
|
||||
You can enable or disable Windows node logging by setting `global.cattle.windows.enabled` to either `true` or `false` in the `values.yaml`.
|
||||
|
||||
By default, Windows node logging will be enabled if the Cluster Dashboard UI is used to install the logging application on a Windows cluster.
|
||||
|
||||
In this scenario, setting `global.cattle.windows.enabled` to `false` will disable Windows node logging on the cluster.
|
||||
When disabled, logs will still be collected from Linux nodes within the Windows cluster.
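For reference, a `values.yaml` fragment that turns Windows node logging off might look like the sketch below. It only spells out the `global.cattle.windows.enabled` key named above as nested YAML; the rest of your values file is assumed to stay unchanged.

```yaml
global:
  cattle:
    windows:
      enabled: false   # set to true to collect logs from Windows nodes
```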
|
||||
|
||||
:::note
|
||||
|
||||
Currently an [issue](https://github.com/rancher/rancher/issues/32325) exists where Windows nodeAgents are not deleted when performing a `helm upgrade` after disabling Windows logging in a Windows cluster. In this scenario, users may need to manually remove the Windows nodeAgents if they are already installed.
|
||||
|
||||
:::
|
||||
|
||||
## Working with a Custom Docker Root Directory
|
||||
|
||||
If using a custom Docker root directory, you can set `global.dockerRootDirectory` in `values.yaml`.
|
||||
|
||||
This will ensure that the Logging CRs created will use your specified path rather than the default Docker `data-root` location.
|
||||
|
||||
Note that this only affects Linux nodes.
|
||||
|
||||
If there are any Windows nodes in the cluster, the change will not be applicable to those nodes.
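As an illustration, the setting could be expressed in `values.yaml` as shown below. The path `/mnt/docker` is a hypothetical example; substitute the `data-root` that your Docker daemon actually uses.

```yaml
global:
  dockerRootDirectory: /mnt/docker   # hypothetical path, replace with your Docker data-root
```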
|
||||
|
||||
## Adding NodeSelector Settings and Tolerations for Custom Taints
|
||||
|
||||
You can add your own `nodeSelector` settings and add `tolerations` for additional taints by editing the logging Helm chart values. For details, see [this page.](taints-and-tolerations.md)
|
||||
|
||||
## Enabling the Logging Application to Work with SELinux
|
||||
|
||||
:::note Requirements:
|
||||
|
||||
Logging v2 was tested with SELinux on RHEL/CentOS 7 and 8.
|
||||
|
||||
:::
|
||||
|
||||
[Security-Enhanced Linux (SELinux)](https://en.wikipedia.org/wiki/Security-Enhanced_Linux) is a security enhancement to Linux. After being historically used by government agencies, SELinux is now an industry standard and is enabled by default on CentOS 7 and 8.
|
||||
|
||||
To use Logging v2 with SELinux, we recommend installing the `rancher-selinux` RPM according to these [instructions](../../reference-guides/rancher-security/selinux-rpm/selinux-rpm.md).
|
||||
|
||||
Then, when installing the logging application, configure the chart to be SELinux aware by changing `global.seLinux.enabled` to `true` in the `values.yaml`.
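A sketch of the corresponding `values.yaml` fragment follows. It simply writes the `global.seLinux.enabled` key named above as nested YAML and assumes the `rancher-selinux` RPM is already installed on the nodes.

```yaml
global:
  seLinux:
    enabled: true   # make the logging chart SELinux aware
```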
|
||||
|
||||
## Additional Logging Sources
|
||||
|
||||
By default, Rancher collects logs for [control plane components](https://kubernetes.io/docs/concepts/overview/components/#control-plane-components) and [node components](https://kubernetes.io/docs/concepts/overview/components/#node-components) for all cluster types.
|
||||
|
||||
In some cases, Rancher may be able to collect additional logs.
|
||||
|
||||
The following table summarizes the sources from which additional logs may be collected for each node type:
|
||||
|
||||
| Logging Source | Linux Nodes (including in Windows cluster) | Windows Nodes |
| --- | --- | --- |
| RKE2 | ✓ | |
| K3s | ✓ | |
| AKS | ✓ | |
| EKS | ✓ | |
| GKE | ✓ | |
|
||||
|
||||
To enable hosted Kubernetes providers as additional logging sources, select the **Enable enhanced cloud provider logging** option when installing or upgrading the Logging Helm chart.
|
||||
|
||||
When enabled, Rancher collects all additional node and control plane logs the provider has made available, which may vary between providers.
|
||||
|
||||
If you're already using a cloud provider's own logging solution such as AWS CloudWatch or Google Cloud operations suite (formerly Stackdriver), it is not necessary to enable this option as the native solution will have unrestricted access to all logs.
|
||||
|
||||
## Systemd Configuration
|
||||
|
||||
In Rancher logging, `systemdLogPath` must be configured for K3s and RKE2 Kubernetes distributions.
|
||||
|
||||
K3s and RKE2 Kubernetes distributions log to journald, which is the subsystem of systemd that is used for logging. In order to collect these logs, the `systemdLogPath` needs to be defined. While the `/run/log/journal` directory is used by default, some Linux distributions do not default to this path. For example, Ubuntu defaults to `/var/log/journal`. To determine your `systemdLogPath` configuration, see the steps below.
|
||||
|
||||
**Steps for Systemd Configuration:**
|
||||
|
||||
* Run `cat /etc/systemd/journald.conf | grep -E ^\#?Storage | cut -d"=" -f2` on one of your nodes.
* If `persistent` is returned, your `systemdLogPath` should be `/var/log/journal`.
* If `volatile` is returned, your `systemdLogPath` should be `/run/log/journal`.
* If `auto` is returned, check if `/var/log/journal` exists.
  * If `/var/log/journal` exists, then use `/var/log/journal`.
  * If `/var/log/journal` does not exist, then use `/run/log/journal`.
|
||||
|
||||
:::note
|
||||
|
||||
If any value not described above is returned, Rancher Logging will not be able to collect control plane logs. To address this issue, you will need to perform the following actions on every control plane node:
|
||||
|
||||
* Set `Storage=volatile` in journald.conf.
|
||||
* Reboot your machine.
|
||||
* Set `systemdLogPath` to `/run/log/journal`.
|
||||
|
||||
:::
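Once you know the path, it can be set in the chart values. The fragment below is a sketch that assumes `systemdLogPath` is a top-level key in the chart's `values.yaml` and that the node uses persistent journald storage; verify the key's location against your chart version.

```yaml
# Assumption: journald reported `persistent`, or `auto` with /var/log/journal present
systemdLogPath: /var/log/journal
```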
|
||||
@@ -0,0 +1,152 @@
|
||||
---
|
||||
title: Rancher Integration with Logging Services
|
||||
description: Rancher integrates with popular logging services. Learn the requirements and benefits of integrating with logging services, and enable logging on your cluster.
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/logging"/>
|
||||
</head>
|
||||
|
||||
The [Logging operator](https://kube-logging.github.io/docs/) now powers Rancher's logging solution in place of the former, in-house solution.
|
||||
|
||||
## Enabling Logging
|
||||
|
||||
You can enable logging for a Rancher-managed cluster by going to the Apps page and installing the logging app.
|
||||
|
||||
1. Go to the cluster where you want to install logging and click **Apps**.
|
||||
1. Click the **Logging** app.
|
||||
1. Scroll to the bottom of the Helm chart README and click **Install**.
|
||||
|
||||
**Result:** The logging app is deployed in the `cattle-logging-system` namespace.
|
||||
|
||||
## Uninstall Logging
|
||||
|
||||
1. Go to the cluster where you want to uninstall logging and click **Apps**.
|
||||
1. Click **Installed Apps**.
|
||||
1. Go to the `cattle-logging-system` namespace and check the boxes for `rancher-logging` and `rancher-logging-crd`.
|
||||
1. Click **Delete**.
|
||||
1. Confirm **Delete**.
|
||||
|
||||
**Result:** `rancher-logging` is uninstalled.
|
||||
|
||||
## Architecture
|
||||
|
||||
For more information about how the logging application works, see [this section.](logging-architecture.md)
|
||||
|
||||
|
||||
|
||||
## Role-based Access Control
|
||||
|
||||
Rancher logging has two roles, `logging-admin` and `logging-view`. For more information on how and when to use these roles, see [this page.](rbac-for-logging.md)
|
||||
|
||||
## Configuring Logging Custom Resources
|
||||
|
||||
To manage `Flows`, `ClusterFlows`, `Outputs`, and `ClusterOutputs`:
|
||||
|
||||
1. In the upper left corner, click **☰ > Cluster Management**.
|
||||
1. On the **Clusters** page, go to the cluster where you want to configure logging custom resources and click **Explore**.
|
||||
1. In the left navigation bar, click **Logging**.
|
||||
|
||||
### Flows and ClusterFlows
|
||||
|
||||
For help with configuring `Flows` and `ClusterFlows`, see [this page.](custom-resource-configuration/flows-and-clusterflows.md)
|
||||
|
||||
### Outputs and ClusterOutputs
|
||||
|
||||
For help with configuring `Outputs` and `ClusterOutputs`, see [this page.](custom-resource-configuration/outputs-and-clusteroutputs.md)
|
||||
|
||||
## Using a custom HostTailer image
|
||||
|
||||
To use a custom image for the `HostTailer` resource, you need to specify the image in the `containerOverrides` section of each `fileTailer` of the `HostTailer` resource.
|
||||
|
||||
```yaml
apiVersion: logging-extensions.banzaicloud.io/v1alpha1
kind: HostTailer
metadata:
  name: cluster-system-log
spec:
  workloadMetaOverrides:
    annotations: {}
    labels: {}
  fileTailers:
    - disabled: false
      name: kubelet-log
      path: /var/lib/rancher/rke2/agent/logs/*.log
      containerOverrides:
        image: <your_registry>/<your_image>:<your_tag>
    - disabled: false
      name: containerd-log
      path: /var/lib/rancher/rke2/agent/containerd/*.log
      containerOverrides:
        image: <your_registry>/<your_image>:<your_tag>
    - name: kube-audit
      path: /var/log/kube-audit/audit-log.json
      disabled: false
      containerOverrides:
        image: <your_registry>/<your_image>:<your_tag>
```
|
||||
|
||||
## Configuring the Logging Helm Chart
|
||||
|
||||
For a list of options that can be configured when the logging application is installed or upgraded, see [this page.](logging-helm-chart-options.md)
|
||||
|
||||
### Windows Support
|
||||
|
||||
You can [enable logging](logging-helm-chart-options.md#enabledisable-windows-node-logging) from Windows nodes.
|
||||
|
||||
|
||||
### Working with a Custom Docker Root Directory
|
||||
|
||||
For details on using a custom Docker root directory, see [this section.](logging-helm-chart-options.md#working-with-a-custom-docker-root-directory)
|
||||
|
||||
|
||||
### Working with Taints and Tolerations
|
||||
|
||||
For information on how to use taints and tolerations with the logging application, see [this page.](taints-and-tolerations.md)
|
||||
|
||||
|
||||
### Logging V2 with SELinux
|
||||
|
||||
For information on enabling the logging application for SELinux-enabled nodes, see [this section.](logging-helm-chart-options.md#enabling-the-logging-application-to-work-with-selinux)
|
||||
|
||||
### Additional Logging Sources
|
||||
|
||||
By default, Rancher collects logs for control plane components and node components for all cluster types. In some cases additional logs can be collected. For details, see [this section.](logging-helm-chart-options.md#additional-logging-sources)
|
||||
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### The Logging Buffer Overloads Pods
|
||||
|
||||
Depending on your configuration, the default buffer size may be too large and cause pod failures. One way to reduce the load is to lower the logger's flush interval. This prevents logs from overfilling the buffer. You can also add more flush threads to handle moments when many logs are attempting to fill the buffer at once.
|
||||
|
||||
For a more complete description of how to configure the logging buffer to suit your organization's needs, see the official Logging operator documentation on [buffers](https://kube-logging.github.io/docs/configuration/plugins/outputs/buffer/) and on [Fluentd configuration](https://kube-logging.github.io/docs/logging-infrastructure/fluentd/).
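As a rough illustration of the two adjustments described above, the sketch below adds a `buffer` section to an Elasticsearch `ClusterOutput`. The resource name `example-es-tuned` and the values `5s` and `4` are hypothetical starting points, not recommendations; tune them against your own log volume.

```yaml
apiVersion: logging.banzaicloud.io/v1beta1
kind: ClusterOutput
metadata:
  name: "example-es-tuned"        # hypothetical name
  namespace: "cattle-logging-system"
spec:
  elasticsearch:
    host: elasticsearch.example.com
    port: 9200
    scheme: http
    buffer:
      flush_interval: 5s          # illustrative: flush the buffer more frequently
      flush_thread_count: 4       # illustrative: more threads for bursts of logs
```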
|
||||
|
||||
### The `cattle-logging` Namespace Being Recreated
|
||||
|
||||
If your cluster previously deployed logging from the global view in the legacy Rancher UI, you may encounter an issue where its `cattle-logging` namespace is continually being recreated.
|
||||
|
||||
The solution is to delete all `clusterloggings.management.cattle.io` and `projectloggings.management.cattle.io` custom resources from the cluster specific namespace in the management cluster.
|
||||
The existence of these custom resources causes Rancher to create the `cattle-logging` namespace in the downstream cluster if it does not exist.
|
||||
|
||||
The cluster namespace matches the cluster ID, so we need to find the cluster ID for each cluster.
|
||||
|
||||
1. In the upper left corner, click **☰ > Cluster Management**.
|
||||
1. On the **Clusters** page, go to the cluster you want to get the ID of and click **Explore**.
|
||||
1. Copy the `<cluster-id>` portion from one of the URLs below. The `<cluster-id>` portion is the cluster namespace name.
|
||||
|
||||
```bash
# Cluster Management UI
https://<your-url>/c/<cluster-id>/

# Cluster Dashboard
https://<your-url>/dashboard/c/<cluster-id>/
```
|
||||
|
||||
Now that we have the `<cluster-id>` namespace, we can delete the CRs that cause `cattle-logging` to be continually recreated.
|
||||
*Warning:* ensure that logging, the version installed from the global view in the legacy Rancher UI, is not currently in use.
|
||||
|
||||
```bash
kubectl delete crd clusterloggings.management.cattle.io -n <cluster-id>
kubectl delete crd projectloggings.management.cattle.io -n <cluster-id>
```
|
||||
@@ -0,0 +1,27 @@
|
||||
---
|
||||
title: Role-based Access Control for Logging
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/logging/rbac-for-logging"/>
|
||||
</head>
|
||||
|
||||
Rancher logging has two roles, `logging-admin` and `logging-view`.
|
||||
|
||||
- `logging-admin` gives users full access to namespaced `Flows` and `Outputs`
|
||||
- `logging-view` allows users to *view* namespaced `Flows` and `Outputs`, and `ClusterFlows` and `ClusterOutputs`
|
||||
|
||||
:::note Why choose one role over the other?
|
||||
|
||||
Edit access to `ClusterFlow` and `ClusterOutput` resources is powerful. Any user with it has edit access for all logs in the cluster.
|
||||
|
||||
:::
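To make the role choice concrete, the sketch below grants a single user `logging-admin` within one namespace. It assumes the chart installs `logging-admin` as a `ClusterRole` that can be bound with a namespaced `RoleBinding`; the binding name, namespace, and user are hypothetical. Confirm the role kind in your cluster before applying anything like this.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: devteam-logging-admin     # hypothetical binding name
  namespace: devteam              # hypothetical namespace
subjects:
  - kind: User
    name: jane@example.com        # hypothetical user
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: logging-admin             # assumption: installed as a ClusterRole
  apiGroup: rbac.authorization.k8s.io
```

Because the binding is namespaced, the user's access is limited to `Flows` and `Outputs` in that namespace and does not extend to `ClusterFlows` or `ClusterOutputs`.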
|
||||
|
||||
In Rancher, the cluster administrator role is the only role with full access to all `rancher-logging` resources. Cluster members are not able to edit or read any logging resources. Project owners and members have the following privileges:
|
||||
|
||||
Project Owners | Project Members
--- | ---
able to create namespaced `Flows` and `Outputs` in their projects' namespaces | only able to view the `Flows` and `Outputs` in projects' namespaces
can collect logs from anything in their projects' namespaces | cannot collect any logs in their projects' namespaces
|
||||
|
||||
Both project owners and project members require at least *one* namespace in their project to use logging. If they do not, then they may not see the logging button in the top nav dropdown.
|
||||
@@ -0,0 +1,69 @@
|
||||
---
|
||||
title: Working with Taints and Tolerations
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/logging/taints-and-tolerations"/>
|
||||
</head>
|
||||
|
||||
"Tainting" a Kubernetes node causes that node to repel pods.
|
||||
|
||||
Unless the pods have a `toleration` for that node's taint, they will run on other nodes in the cluster.
|
||||
|
||||
[Taints and tolerations](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/) can work in conjunction with the `nodeSelector` [field](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector) within the `PodSpec`, which enables the *opposite* effect of a taint.
|
||||
|
||||
Using `nodeSelector` gives pods an affinity towards certain nodes.
|
||||
|
||||
Both provide control over which node(s) the pod will run on.
|
||||
|
||||
- [Default Implementation in Rancher's Logging Stack](#default-implementation-in-ranchers-logging-stack)
|
||||
- [Adding NodeSelector Settings and Tolerations for Custom Taints](#adding-nodeselector-settings-and-tolerations-for-custom-taints)
|
||||
|
||||
|
||||
## Default Implementation in Rancher's Logging Stack
|
||||
|
||||
By default, Rancher taints all Linux nodes with `cattle.io/os=linux`, and does not taint Windows nodes.
|
||||
The logging stack pods have `tolerations` for this taint, which enables them to run on Linux nodes.
|
||||
Moreover, most logging stack pods run on Linux only and have a `nodeSelector` added to ensure they run on Linux nodes.
|
||||
|
||||
This example Pod YAML file shows a nodeSelector being used with a toleration:
|
||||
|
||||
```yaml
apiVersion: v1
kind: Pod
# metadata...
spec:
  # containers...
  tolerations:
    - key: cattle.io/os
      operator: "Equal"
      value: "linux"
      effect: NoSchedule
  nodeSelector:
    kubernetes.io/os: linux
```
|
||||
|
||||
In the above example, we ensure that our pod only runs on Linux nodes, and we add a `toleration` for the taint we have on all of our Linux nodes.
|
||||
|
||||
You can do the same with Rancher's existing taints, or with your own custom ones.
|
||||
|
||||
## Adding NodeSelector Settings and Tolerations for Custom Taints
|
||||
|
||||
If you would like to add your own `nodeSelector` settings, or if you would like to add `tolerations` for additional taints, you can pass the following to the chart's values.
|
||||
|
||||
```yaml
tolerations:
  # insert tolerations...
nodeSelector:
  # insert nodeSelector...
```
|
||||
|
||||
These values will add both settings to the `fluentd`, `fluentbit`, and `logging-operator` containers.
|
||||
Essentially, these are global settings for all pods in the logging stack.
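A filled-in version of these global values might look like the sketch below. The taint key `example.com/dedicated`, its value, and the node label `example.com/pool` are hypothetical; replace them with the taints and labels present on your nodes.

```yaml
tolerations:
  - key: "example.com/dedicated"   # hypothetical custom taint key
    operator: "Equal"
    value: "logging"
    effect: "NoSchedule"
nodeSelector:
  example.com/pool: logging        # hypothetical node label
```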
|
||||
|
||||
However, if you would like to add tolerations for *only* the `fluentbit` container, you can add the following to the chart's values.
|
||||
|
||||
```yaml
fluentbit_tolerations:
  # insert tolerations list for fluentbit containers only...
```
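For instance, to let only the `fluentbit` pods run on nodes that carry a custom taint, the list could be filled in as sketched below. The taint key and value are hypothetical examples.

```yaml
fluentbit_tolerations:
  - key: "example.com/gpu"         # hypothetical taint key
    operator: "Exists"             # tolerate the taint regardless of its value
    effect: "NoSchedule"
```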
|
||||
@@ -0,0 +1,15 @@
|
||||
---
|
||||
title: Cloud Native Storage with Longhorn
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/longhorn"/>
|
||||
</head>
|
||||
|
||||
## Longhorn
|
||||
|
||||
Longhorn is an official [Cloud Native Computing Foundation (CNCF)](https://cncf.io/) project that delivers a powerful cloud-native distributed storage platform for Kubernetes that can run anywhere. When combined with Rancher, Longhorn makes the deployment of highly available persistent block storage in your Kubernetes environment easy, fast, and reliable.
|
||||
|
||||
## Longhorn with Rancher
|
||||
|
||||
With Rancher Prime and Longhorn, users can deploy Longhorn with one click from the Rancher catalog and manage its lifecycle on managed clusters, including installation, upgrades, and node draining for graceful operations. Longhorn with Rancher also provides mixed cluster support with Windows, Rancher hosted images, UI Proxy access through Rancher, and Rancher monitoring with Longhorn metrics.
|
||||
@@ -0,0 +1,74 @@
|
||||
---
|
||||
title: Overview
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/longhorn/overview"/>
|
||||
</head>
|
||||
|
||||
[Longhorn](https://longhorn.io/) is a lightweight, reliable, and easy-to-use distributed block storage system for Kubernetes.
|
||||
|
||||
Longhorn is free, open source software. Originally developed by Rancher Labs, it is now being developed as an incubating project of the Cloud Native Computing Foundation. It can be installed on any Kubernetes cluster with Helm, with kubectl, or with the Rancher UI. You can learn more about its architecture [here.](https://longhorn.io/docs/latest/concepts/)
|
||||
|
||||
With Longhorn, you can:
|
||||
|
||||
- Use Longhorn volumes as persistent storage for the distributed stateful applications in your Kubernetes cluster
|
||||
- Partition your block storage into Longhorn volumes so that you can use Kubernetes volumes with or without a cloud provider
|
||||
- Replicate block storage across multiple nodes and data centers to increase availability
|
||||
- Store backup data in external storage such as NFS or AWS S3
|
||||
- Create cross-cluster disaster recovery volumes so that data from a primary Kubernetes cluster can be quickly recovered from backup in a second Kubernetes cluster
|
||||
- Schedule recurring snapshots of a volume, and schedule recurring backups to NFS or S3-compatible secondary storage
|
||||
- Restore volumes from backup
|
||||
- Upgrade Longhorn without disrupting persistent volumes
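As a small example of the first item in the list above, a workload can request a Longhorn volume through an ordinary `PersistentVolumeClaim`. This sketch assumes a default installation that created a StorageClass named `longhorn`; the claim name, namespace, and size are hypothetical.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-longhorn-pvc      # hypothetical claim name
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn      # assumption: StorageClass from a default install
  resources:
    requests:
      storage: 2Gi                # illustrative size
```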
|
||||
|
||||
<figcaption>Longhorn Dashboard</figcaption>
|
||||
|
||||

|
||||
|
||||
## Installing Longhorn with Rancher
|
||||
|
||||
1. Fulfill all [Installation Requirements.](https://longhorn.io/docs/latest/deploy/install/#installation-requirements)
|
||||
1. Go to the cluster where you want to install Longhorn.
|
||||
1. Click **Apps**.
|
||||
1. Click **Charts**.
|
||||
1. Click **Longhorn**.
|
||||
1. Optional: To customize the initial settings, click **Longhorn Default Settings** and edit the configuration. For help customizing the settings, refer to the [Longhorn documentation.](https://longhorn.io/docs/latest/references/settings/)
|
||||
1. Click **Install**.
|
||||
|
||||
**Result:** Longhorn is deployed in the Kubernetes cluster.
|
||||
|
||||
## Accessing Longhorn from the Rancher UI
|
||||
|
||||
1. Go to the cluster where Longhorn is installed. In the left navigation menu, click **Longhorn**.
|
||||
1. On this page, you can edit Kubernetes resources managed by Longhorn. To view the Longhorn UI, click the **Longhorn** button in the **Overview** section.
|
||||
|
||||
**Result:** You will be taken to the Longhorn UI, where you can manage your Longhorn volumes and their replicas in the Kubernetes cluster, as well as secondary backups of your Longhorn storage that may exist in another Kubernetes cluster or in S3.
|
||||
|
||||
## Uninstalling Longhorn from the Rancher UI
|
||||
|
||||
1. Go to the cluster where Longhorn is installed and click **Apps**.
|
||||
1. Click **Installed Apps**.
|
||||
1. Go to the `longhorn-system` namespace and check the boxes next to the `longhorn` and `longhorn-crd` apps.
|
||||
1. Click **Delete**, and confirm **Delete**.
|
||||
|
||||
**Result:** Longhorn is uninstalled.
|
||||
|
||||
## GitHub Repository
|
||||
|
||||
The Longhorn project is available [here.](https://github.com/longhorn/longhorn)
|
||||
|
||||
## Documentation
|
||||
|
||||
The Longhorn documentation is [here.](https://longhorn.io/docs/)
|
||||
|
||||
## Architecture
|
||||
|
||||
Longhorn creates a dedicated storage controller for each volume and synchronously replicates the volume across multiple replicas stored on multiple nodes.
|
||||
|
||||
The storage controller and replicas are themselves orchestrated using Kubernetes.
|
||||
|
||||
You can learn more about its architecture [here.](https://longhorn.io/docs/latest/concepts/)
|
||||
|
||||
<figcaption>Longhorn Architecture</figcaption>
|
||||
|
||||

|
||||
+121
@@ -0,0 +1,121 @@
|
||||
---
|
||||
title: Built-in Dashboards
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/monitoring-and-alerting/built-in-dashboards"/>
|
||||
</head>
|
||||
|
||||
## Grafana UI
|
||||
|
||||
[Grafana](https://grafana.com/grafana/) allows you to query, visualize, alert on, and understand your metrics no matter where they are stored. Create, explore, and share dashboards with your team and foster a data-driven culture.
|
||||
|
||||
To see the default dashboards for time series data visualization, go to the Grafana UI.
|
||||
|
||||
### Customizing Grafana
|
||||
|
||||
To view and customize the PromQL queries powering the Grafana dashboard, see [this page.](../../how-to-guides/advanced-user-guides/monitoring-alerting-guides/customize-grafana-dashboard.md)
|
||||
|
||||
### Persistent Grafana Dashboards
|
||||
|
||||
To create a persistent Grafana dashboard, see [this page.](../../how-to-guides/advanced-user-guides/monitoring-alerting-guides/create-persistent-grafana-dashboard.md)
|
||||
|
||||
### Access to Grafana
|
||||
|
||||
For information about role-based access control for Grafana, see [this section.](rbac-for-monitoring.md#role-based-access-control-for-grafana)
|
||||
|
||||
|
||||
## Alertmanager UI
|
||||
|
||||
When `rancher-monitoring` is installed, the Prometheus Alertmanager UI is deployed, allowing you to view your alerts and the current Alertmanager configuration.
|
||||
|
||||
:::note
|
||||
|
||||
This section assumes familiarity with how monitoring components work together. For more information about Alertmanager, see [How Alertmanager Works.](how-monitoring-works.md#3-how-alertmanager-works)
|
||||
|
||||
:::
|
||||
|
||||
### Accessing the Alertmanager UI
|
||||
|
||||
The Alertmanager UI lets you see the most recently fired alerts.
|
||||
|
||||
:::note Prerequisite:
|
||||
|
||||
The `rancher-monitoring` application must be installed.
|
||||
|
||||
:::
|
||||
|
||||
To see the Alertmanager UI:
|
||||
|
||||
1. In the upper left corner, click **☰ > Cluster Management**.
|
||||
1. On the **Clusters** page, go to the cluster where you want to see the Alertmanager UI and click **Explore**.
|
||||
1. In the left navigation bar, click **Monitoring**.
|
||||
1. Click **Alertmanager**.
|
||||
|
||||
**Result:** The Alertmanager UI opens in a new tab. For help with configuration, refer to the [official Alertmanager documentation.](https://prometheus.io/docs/alerting/latest/alertmanager/)
|
||||
|
||||
For more information on configuring Alertmanager in Rancher, see [this page.](../../how-to-guides/advanced-user-guides/monitoring-v2-configuration-guides/advanced-configuration/alertmanager.md)
|
||||
|
||||
<figcaption>The Alertmanager UI</figcaption>
|
||||
|
||||

|
||||
|
||||
|
||||
### Viewing Default Alerts
|
||||
|
||||
To see alerts that are fired by default, go to the Alertmanager UI and click **Expand all groups**.
|
||||
|
||||
|
||||
## Prometheus UI
|
||||
|
||||
By default, the [kube-state-metrics service](https://github.com/kubernetes/kube-state-metrics) provides a wealth of information about CPU and memory utilization to the monitoring application. These metrics cover Kubernetes resources across namespaces. This means that in order to see resource metrics for a service, you don't need to create a new ServiceMonitor for it. Because the data is already in the time series database, you can go to the Prometheus UI and run a PromQL query to get the information. The same query can be used to configure a Grafana dashboard to show a graph of those metrics over time.
|
||||
|
||||
To see the Prometheus UI, install `rancher-monitoring`. Then:
|
||||
|
||||
1. In the upper left corner, click **☰ > Cluster Management**.
|
||||
1. On the **Clusters** page, go to the cluster where you want to see the Prometheus UI and click **Explore**.
|
||||
1. In the left navigation bar, click **Monitoring**.
|
||||
1. Click **Prometheus Graph**.
|
||||
|
||||
<figcaption>Prometheus Graph UI</figcaption>
|
||||
|
||||

|
||||
|
||||
### Viewing the Prometheus Targets
|
||||
|
||||
To see what services you are monitoring, you will need to see your targets. Targets are set up by ServiceMonitors and PodMonitors as sources to scrape metrics from. You won't need to directly edit targets, but the Prometheus UI can be useful for giving you an overview of all of the sources of metrics that are being scraped.
|
||||
|
||||
To see the Prometheus Targets, install `rancher-monitoring`. Then:
|
||||
|
||||
|
||||
1. In the upper left corner, click **☰ > Cluster Management**.
|
||||
1. On the **Clusters** page, go to the cluster where you want to see the Prometheus targets and click **Explore**.
|
||||
1. In the left navigation bar, click **Monitoring**.
|
||||
1. Click **Prometheus Targets**.
|
||||
|
||||
<figcaption>Targets in the Prometheus UI</figcaption>
|
||||
|
||||

|
||||
|
||||
### Viewing the PrometheusRules
|
||||
|
||||
When you define a Rule (which is declared within a RuleGroup in a PrometheusRule resource), the [spec of the Rule itself](https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api-reference/api.md#rule) contains labels that are used by Alertmanager to figure out which Route should receive a certain Alert.
|
||||
|
||||
To see the PrometheusRules, install `rancher-monitoring`. Then:
|
||||
|
||||
1. In the upper left corner, click **☰ > Cluster Management**.
|
||||
1. On the **Clusters** page, go to the cluster where you want to see the PrometheusRules and click **Explore**.
|
||||
1. In the left navigation bar, click **Monitoring**.
|
||||
1. Click **Prometheus Rules**.
|
||||
|
||||
You can also see the rules in the Prometheus UI:
|
||||
|
||||
<figcaption>Rules in the Prometheus UI</figcaption>
|
||||
|
||||

|
||||
|
||||
For more information on configuring PrometheusRules in Rancher, see [this page.](../../how-to-guides/advanced-user-guides/monitoring-v2-configuration-guides/advanced-configuration/prometheusrules.md)
|
||||
|
||||
## Legacy UI
|
||||
|
||||
For information on the dashboards available in v2.2 to v2.4 of Rancher, before the introduction of the `rancher-monitoring` application, see the [Rancher v2.0—v2.4 docs](https://github.com/rancher/rancher-docs/tree/main/archived_docs/en/version-2.0-2.4/explanations/integrations-in-rancher/cluster-monitoring/viewing-metrics.md).
|
||||
+249
@@ -0,0 +1,249 @@
|
||||
---
|
||||
title: How Monitoring Works
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/monitoring-and-alerting/how-monitoring-works"/>
|
||||
</head>
|
||||
|
||||
## 1. Architecture Overview
|
||||
|
||||
_**The following sections describe how data flows through the Monitoring V2 application:**_
|
||||
|
||||
### Prometheus Operator
|
||||
|
||||
Prometheus Operator observes ServiceMonitors, PodMonitors, and PrometheusRules being created. When the Prometheus configuration resources are created, Prometheus Operator calls the Prometheus API to sync the new configuration. As the diagram at the end of this section shows, the Prometheus Operator acts as the intermediary between Prometheus and Kubernetes, calling the Prometheus API to synchronize Prometheus with the monitoring-related resources in Kubernetes.
|
||||
|
||||
### ServiceMonitors and PodMonitors
|
||||
|
||||
ServiceMonitors and PodMonitors declaratively specify targets, such as Services and Pods, that need to be monitored.
|
||||
|
||||
- Targets are scraped on a recurring schedule based on the configured Prometheus scrape interval, and the metrics that are scraped are stored into the Prometheus Time Series Database (TSDB).
|
||||
|
||||
- To perform the scrape, ServiceMonitors and PodMonitors are defined with label selectors that determine which Services or Pods should be scraped, and with endpoints that determine how the scrape should happen on the given target (e.g., scrape `/metrics` on TCP port 10252, proxying through IP address x.x.x.x).
|
||||
|
||||
- Out of the box, Monitoring V2 comes with certain pre-configured exporters that are deployed based on the type of Kubernetes cluster that it is deployed on. For more information, see [Scraping and Exposing Metrics](#5-scraping-and-exposing-metrics).
|
||||
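As an illustration of the label selectors and endpoints described above, here is a minimal ServiceMonitor sketch. The names, namespace, labels, and port name are hypothetical; only the resource kind and field names come from the Prometheus Operator API.

```yaml
# Hypothetical ServiceMonitor: scrapes every Service labeled app=example-app
# in example-namespace on its named "metrics" port.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  namespace: example-namespace
spec:
  # Label selector: which Services become scrape targets.
  selector:
    matchLabels:
      app: example-app
  namespaceSelector:
    matchNames:
      - example-namespace
  # Endpoints: how each selected target is scraped.
  endpoints:
    - port: metrics      # name of a port defined on the Service
      path: /metrics
      interval: 30s
```

The `interval` on an endpoint overrides the global scrape interval for that target only; omit it to inherit the Prometheus default.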
|
||||
### How PushProx Works
|
||||
|
||||
- Certain internal Kubernetes components are scraped via a proxy deployed as part of Monitoring V2 called **PushProx**. The Kubernetes components that expose metrics to Prometheus through PushProx are the following:
|
||||
`kube-controller-manager`, `kube-scheduler`, `etcd`, and `kube-proxy`.
|
||||
|
||||
- For each PushProx exporter, we deploy one PushProx client onto all target nodes. For example, a PushProx client is deployed onto all controlplane nodes for kube-controller-manager, all etcd nodes for kube-etcd, and all nodes for kubelet.
|
||||
|
||||
- We deploy exactly one PushProx proxy per exporter. The process for exporting metrics is as follows:
|
||||
|
||||
1. The PushProx Client establishes an outbound connection with the PushProx Proxy.
|
||||
1. The client then polls the proxy for scrape requests that have come into the proxy.
|
||||
1. When the proxy receives a scrape request from Prometheus, the client sees it as a result of the poll.
|
||||
1. The client scrapes the internal component.
|
||||
1. The internal component responds by pushing metrics back to the proxy.
|
||||
|
||||
|
||||
<figcaption><br/>Process for Exporting Metrics with PushProx:<br/></figcaption>
|
||||
|
||||

|
||||
|
||||
### PrometheusRules
|
||||
|
||||
PrometheusRules allow users to define rules for what metrics or time series database queries should result in alerts being fired. Rules are evaluated on an interval.
|
||||
|
||||
- **Recording rules** create a new time series based on existing series that have been collected. They are frequently used to precompute complex queries.
|
||||
- **Alerting rules** run a particular query and fire an alert from Prometheus for each time series that the query returns.
|
||||
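The difference between the two rule types can be sketched in a single PrometheusRule. This is an illustrative example, not a rule shipped with `rancher-monitoring`: the resource name, group name, recorded series name, and threshold are hypothetical.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-rules                  # hypothetical
  namespace: cattle-monitoring-system
spec:
  groups:
    - name: example.rules
      interval: 1m                     # how often this group is evaluated
      rules:
        # Recording rule: precomputes a query and stores it as a new series.
        - record: 'namespace:container_cpu_usage_seconds:rate5m'
          expr: 'sum by (namespace) (rate(container_cpu_usage_seconds_total[5m]))'
        # Alerting rule: fires when the precomputed series exceeds a threshold
        # for ten minutes (threshold chosen arbitrarily for illustration).
        - alert: ExampleNamespaceHighCpu
          expr: 'namespace:container_cpu_usage_seconds:rate5m > 4'
          for: 10m
```

Building the alert on the recorded series means the expensive aggregation runs once per interval instead of once per rule that needs it.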
|
||||
### Alert Routing
|
||||
|
||||
Once Prometheus determines that an alert needs to be fired, alerts are forwarded to **Alertmanager**.
|
||||
|
||||
- Alerts contain labels that come from the PromQL query itself and additional labels and annotations that can be provided as part of specifying the initial PrometheusRule.
|
||||
|
||||
- Before receiving any alerts, Alertmanager will use the **routes** and **receivers** specified in its configuration to form a routing tree on which all incoming alerts are evaluated. Each node of the routing tree can specify additional grouping, labeling, and filtering that needs to happen based on the labels attached to the Prometheus alert. A node on the routing tree (usually a leaf node) can also specify that an alert that reaches it needs to be sent out to a configured Receiver, e.g., Slack, PagerDuty, SMS, etc. Note that Alertmanager will send an alert first to **alertingDriver**, then alertingDriver will forward the alert to the proper destination.
|
||||
|
||||
- Routes and receivers are also stored in the Kubernetes API via the Alertmanager Secret. When the Secret is updated, Alertmanager is also updated automatically. Note that routing occurs via labels only (not via annotations, etc.).
|
||||
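A routing tree of this kind might look like the following Alertmanager configuration fragment. It is a sketch under assumptions: the receiver names, Slack channel, and webhook URL are placeholders, and the `matchers` syntax assumes a reasonably recent Alertmanager release.

```yaml
route:
  # Top-level route: every alert enters here.
  receiver: default-receiver
  group_by: ['alertname', 'namespace']
  routes:
    # Child route: alerts labeled team="front-end" go to Slack.
    - receiver: front-end-slack
      matchers:
        - 'team="front-end"'
receivers:
  # A receiver with no integrations: alerts that end here are not sent anywhere.
  - name: default-receiver
  - name: front-end-slack
    slack_configs:
      - api_url: https://hooks.slack.com/services/REPLACE/WITH/WEBHOOK  # placeholder
        channel: '#front-end-alerts'
```

Because matching uses labels only, an annotation such as `summary` cannot be used in `matchers`.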
|
||||
## 2. How Prometheus Works
|
||||
|
||||
### Storing Time Series Data
|
||||
|
||||
After collecting metrics from exporters, Prometheus stores the time series in a local on-disk time series database. Prometheus optionally integrates with remote systems, but `rancher-monitoring` uses local storage for the time series database.
|
||||
|
||||
Once stored, users can query this TSDB using PromQL, the query language for Prometheus.
|
||||
|
||||
PromQL queries can be visualized in one of two ways:
|
||||
|
||||
1. By supplying the query in Prometheus's Graph UI, which will show a simple graphical view of the data.
|
||||
1. By creating a Grafana Dashboard that contains the PromQL query and additional formatting directives that label axes, add units, change colors, use alternative visualizations, etc.
|
||||
|
||||
### Defining Rules for Prometheus
|
||||
|
||||
Rules define queries that Prometheus needs to execute on a regular `evaluationInterval` to perform certain actions, such as firing an alert (alerting rules) or precomputing a query based on others existing in its TSDB (recording rules). These rules are encoded in PrometheusRules custom resources. When PrometheusRule custom resources are created or updated, the Prometheus Operator observes the change and calls the Prometheus API to synchronize the set of rules that Prometheus is currently evaluating on a regular interval.
|
||||
|
||||
A PrometheusRule allows you to define one or more RuleGroups. Each RuleGroup consists of a set of Rule objects that can each represent either an alerting or a recording rule with the following fields:
|
||||
|
||||
- The name of the new alert or record
|
||||
- A PromQL expression for the new alert or record
|
||||
- Labels that should be attached to the alert or record that identify it (e.g. cluster name or severity)
|
||||
- Annotations that encode any additional important pieces of information that need to be displayed on the notification for an alert (e.g. summary, description, message, runbook URL, etc.). This field is not required for recording rules.
|
||||
|
||||
Upon evaluating a [rule](https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api-reference/api.md#rule), Prometheus runs the provided PromQL query, adds the provided labels, and runs the appropriate action for the rule. If the rule triggers an alert, Prometheus also adds the provided annotations. For example, an Alerting Rule that adds `team: front-end` as a label to the provided PromQL query will append that label to the fired alert, which will allow Alertmanager to forward the alert to the correct Receiver.
|
||||
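The four fields listed above map directly onto a Rule entry. The sketch below assumes a hypothetical `http_requests_total` metric with `job` and `code` labels, and a hypothetical runbook URL; the `team: front-end` label mirrors the example in the preceding paragraph.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: front-end-alerts               # hypothetical
  namespace: cattle-monitoring-system
spec:
  groups:
    - name: front-end.alerts
      rules:
        - alert: FrontEndHighErrorRate                      # name of the alert
          # PromQL expression (metric and label names are assumptions)
          expr: 'sum(rate(http_requests_total{job="front-end", code=~"5.."}[5m])) > 1'
          for: 5m
          labels:                                           # identify and route the alert
            team: front-end
            severity: warning
          annotations:                                      # shown on the notification
            summary: Front-end 5xx rate is elevated
            description: More than one 5xx response per second over the last 5 minutes.
            runbook_url: https://example.com/runbooks/front-end-errors
```

A recording rule would use `record:` in place of `alert:` and would omit `for` and `annotations`.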
|
||||
### Alerting and Recording Rules
|
||||
|
||||
Prometheus doesn't maintain the state of whether alerts are active. It fires alerts repetitively at every evaluation interval, relying on Alertmanager to group and filter the alerts into meaningful notifications.
|
||||
|
||||
The `evaluation_interval` constant defines how often Prometheus evaluates its alerting rules against the time series database. Similar to the `scrape_interval`, the `evaluation_interval` also defaults to one minute.
|
||||
|
||||
The rules are contained in a set of rule files. Rule files include both alerting rules and recording rules, but only alerting rules result in alerts being fired after their evaluation.
|
||||
|
||||
For recording rules, Prometheus runs a query, then stores it as a time series. This synthetic time series is useful for storing the results of an expensive or time-consuming query so that it can be queried more quickly in the future.
|
||||
|
||||
Alerting rules are more commonly used. Whenever an alerting rule's expression returns one or more time series, Prometheus fires an alert for each of them.
|
||||
|
||||
The Rule file adds labels and annotations to alerts before firing them, depending on the use case:
|
||||
|
||||
- Labels indicate information that identifies the alert and could affect the routing of the alert. For example, when sending an alert about a certain container, the container ID could be used as a label.
|
||||
|
||||
- Annotations denote information that doesn't affect where an alert is routed, for example, a runbook or an error message.
|
||||
|
||||
## 3. How Alertmanager Works
|
||||
|
||||
The Alertmanager handles alerts sent by client applications such as the Prometheus server. It takes care of the following tasks:
|
||||
|
||||
- Deduplicating, grouping, and routing alerts to the correct receiver integration such as email, PagerDuty, or OpsGenie
|
||||
|
||||
- Silencing and inhibition of alerts
|
||||
|
||||
- Tracking alerts that fire over time
|
||||
|
||||
- Sending out the status of whether an alert is currently firing, or if it is resolved
|
||||
|
||||
### Alerts Forwarded by alertingDrivers
|
||||
|
||||
When alertingDrivers are installed, a `Service` is created that can be used as the receiver's URL for Teams or SMS, based on the alertingDriver's configuration. The URL in the Receiver points to the alertingDriver, so Alertmanager sends the alert first to the alertingDriver, which then forwards it to the proper destination.
|
||||
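In Alertmanager terms, a receiver that hands alerts to an alertingDriver is a webhook receiver whose URL is the driver's in-cluster Service. The fragment below is only a sketch: the receiver name, Service name, namespace, port, and path are all hypothetical and depend on how the driver was installed.

```yaml
receivers:
  - name: teams-via-alerting-driver        # hypothetical receiver name
    webhook_configs:
      # Hypothetical in-cluster URL of the alertingDriver Service.
      - url: http://example-alerting-driver.example-namespace.svc:8080/example-path
        send_resolved: true                # also notify when the alert resolves
```

The driver, not Alertmanager, holds the credentials and formatting logic for the final destination.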
|
||||
### Routing Alerts to Receivers
|
||||
|
||||
Alertmanager coordinates where alerts are sent. It allows you to group alerts based on labels and fire them based on whether certain labels are matched. One top-level route accepts all alerts. From there, Alertmanager continues routing alerts to receivers based on whether they match the conditions of the next route.
|
||||
|
||||
While the Rancher UI forms only allow editing a routing tree that is two levels deep, you can configure more deeply nested routing structures by editing the Alertmanager Secret.
|
||||
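For example, a routing tree nested more deeply than the UI forms allow could be written as follows in the Alertmanager configuration. Only the `route` block is shown; the receiver names and label values are hypothetical, and each receiver would need a matching entry under `receivers`.

```yaml
route:
  receiver: default-receiver                 # level 1: accepts all alerts
  routes:
    - matchers: ['team="front-end"']         # level 2
      receiver: front-end-slack
      routes:
        - matchers: ['severity="critical"']  # level 3
          receiver: front-end-pager
          routes:
            - matchers: ['namespace="checkout"']   # level 4
              receiver: checkout-oncall
```

An alert is handed to the deepest route whose matchers, and those of all its ancestors, match the alert's labels.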
|
||||
### Configuring Multiple Receivers
|
||||
|
||||
By editing the forms in the Rancher UI, you can set up a Receiver resource with all the information Alertmanager needs to send alerts to your notification system.
|
||||
|
||||
By editing custom YAML in the Alertmanager or Receiver configuration, you can also send alerts to multiple notification systems. For more information, see the section on configuring [Receivers.](../../reference-guides/monitoring-v2-configuration/receivers.md#configuring-multiple-receivers)
|
||||
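As a sketch of what such custom YAML might contain, a single Alertmanager receiver can list more than one integration, and an alert routed to it is sent to all of them. The receiver name, channel, webhook URL, and routing key below are placeholders.

```yaml
receivers:
  - name: front-end-all                      # hypothetical receiver name
    # Both integrations below are notified for every alert routed here.
    slack_configs:
      - api_url: https://hooks.slack.com/services/REPLACE/WITH/WEBHOOK  # placeholder
        channel: '#front-end-alerts'
    pagerduty_configs:
      - routing_key: REPLACE_WITH_INTEGRATION_KEY                       # placeholder
```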
|
||||
## 4. Monitoring V2 Specific Components
|
||||
|
||||
Prometheus Operator introduces a set of [Custom Resource Definitions](https://github.com/prometheus-operator/prometheus-operator#customresourcedefinitions) that allow users to deploy and manage Prometheus and Alertmanager instances by creating and modifying those custom resources on a cluster.
|
||||
|
||||
Prometheus Operator will automatically update your Prometheus configuration based on the live state of the resources and configuration options that are edited in the Rancher UI.
|
||||
|
||||
### Resources Deployed by Default
|
||||
|
||||
By default, a set of resources curated by the [kube-prometheus](https://github.com/prometheus-operator/kube-prometheus) project are deployed onto your cluster as part of installing the Rancher Monitoring Application to set up a basic Monitoring/Alerting stack.
|
||||
|
||||
The resources that get deployed onto your cluster to support this solution can be found in the [`rancher-monitoring`](https://github.com/rancher/charts/tree/main/charts/rancher-monitoring) Helm chart, which closely tracks the upstream [kube-prometheus-stack](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack) Helm chart maintained by the Prometheus community with certain changes tracked in the [CHANGELOG.md](https://github.com/rancher/charts/blob/main/charts/rancher-monitoring/CHANGELOG.md).
|
||||
|
||||
### Default Exporters
|
||||
|
||||
Monitoring V2 deploys three default exporters that provide additional metrics for Prometheus to store:
|
||||
|
||||
1. `node-exporter`: exposes hardware and OS metrics for Linux hosts. For more information on `node-exporter`, refer to the [upstream documentation](https://prometheus.io/docs/guides/node-exporter/).
|
||||
|
||||
1. `windows-exporter`: exposes hardware and OS metrics for Windows hosts (only deployed on Windows clusters). For more information on `windows-exporter`, refer to the [upstream documentation](https://github.com/prometheus-community/windows_exporter).
|
||||
|
||||
1. `kube-state-metrics`: exposes additional metrics that track the state of resources contained in the Kubernetes API (e.g., pods, workloads, etc.). For more information on `kube-state-metrics`, refer to the [upstream documentation](https://github.com/kubernetes/kube-state-metrics/tree/master/docs).
|
||||
|
||||
ServiceMonitors and PodMonitors will scrape these exporters, as defined [here](#defining-what-metrics-are-scraped). Prometheus stores these metrics, and you can query the results via either Prometheus's UI or Grafana.
|
||||
|
||||
See the [architecture](#1-architecture-overview) section for more information on recording rules, alerting rules, and Alertmanager.
|
||||
|
||||
### Components Exposed in the Rancher UI
|
||||
|
||||
When the monitoring application is installed, you will be able to edit the following components in the Rancher UI:
|
||||
|
||||
| Component | Type of Component | Purpose and Common Use Cases for Editing |
|
||||
|--------------|------------------------|---------------------------|
|
||||
| ServiceMonitor | Custom resource | Sets up Kubernetes Services to scrape custom metrics from. Automatically updates the scrape configuration in the Prometheus custom resource. |
|
||||
| PodMonitor | Custom resource | Sets up Kubernetes Pods to scrape custom metrics from. Automatically updates the scrape configuration in the Prometheus custom resource. |
|
||||
| Receiver | Configuration block (part of Alertmanager) | Modifies information on where to send an alert (e.g., Slack, PagerDuty, etc.) and any necessary information to send the alert (e.g., TLS certs, proxy URLs, etc.). Automatically updates the Alertmanager custom resource. |
|
||||
| Route | Configuration block (part of Alertmanager) | Modifies the routing tree that is used to filter, label, and group alerts based on labels and send them to the appropriate Receiver. Automatically updates the Alertmanager custom resource. |
|
||||
| PrometheusRule | Custom resource | Defines additional queries that need to trigger alerts or define materialized views of existing series that are within Prometheus's TSDB. Automatically updates the Prometheus custom resource. |
|
||||
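Of the components in the table, a PodMonitor is one of the simplest to write by hand. The following sketch selects Pods directly rather than going through a Service; the names, namespace, label, and port name are hypothetical.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: example-pods
  namespace: example-namespace
spec:
  # Which Pods to scrape.
  selector:
    matchLabels:
      app: example-app
  # How to scrape them: a named container port and an HTTP path.
  podMetricsEndpoints:
    - port: metrics
      path: /metrics
      interval: 30s
```

A PodMonitor is useful when the workload has no Service in front of it, such as a batch job or a DaemonSet that only exposes metrics.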
|
||||
### PushProx
|
||||
|
||||
PushProx allows Prometheus to scrape metrics across a network boundary, which prevents users from having to expose metrics ports for internal Kubernetes components on each node in a Kubernetes cluster.
|
||||
|
||||
Since the metrics for Kubernetes components are generally exposed on the host network of nodes in the cluster, PushProx deploys a DaemonSet of clients that sit on the hostNetwork of each node and make an outbound connection to a single proxy that is sitting on the Kubernetes API. Prometheus can then be configured to proxy scrape requests through the proxy to each client, which allows it to scrape metrics from the internal Kubernetes components without requiring any inbound node ports to be open.
|
||||
|
||||
Refer to [Scraping Metrics with PushProx](#scraping-metrics-with-pushprox) for more.
|
||||
|
||||
## 5. Scraping and Exposing Metrics
|
||||
|
||||
### Defining what Metrics are Scraped
|
||||
|
||||
ServiceMonitors and PodMonitors define targets that are intended for Prometheus to scrape. The [Prometheus custom resource](https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/getting-started/design.md#prometheus) tells Prometheus which ServiceMonitors or PodMonitors it should use to find out where to scrape metrics from.
|
||||
|
||||
The Prometheus Operator observes the ServiceMonitors and PodMonitors. When it observes that they are created or updated, it calls the Prometheus API to update the scrape configuration in the Prometheus custom resource and keep it in sync with the scrape configuration in the ServiceMonitors or PodMonitors. This scrape configuration tells Prometheus which endpoints to scrape metrics from and how it will label the metrics from those endpoints.
|
||||
|
||||
Prometheus scrapes all of the metrics defined in its scrape configuration at every `scrape_interval`, which is one minute by default.
|
||||
|
||||
The scrape configuration can be viewed as part of the Prometheus custom resource that is exposed in the Rancher UI.
|
||||
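For orientation, the fields of a Prometheus custom resource that control discovery and timing look roughly like this. It is an illustrative fragment with a hypothetical name, not the resource that `rancher-monitoring` generates, and in practice that resource is normally managed through the chart rather than edited by hand.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: example-prometheus             # hypothetical
  namespace: cattle-monitoring-system
spec:
  scrapeInterval: 1m                   # how often targets are scraped
  evaluationInterval: 1m               # how often rules are evaluated
  # Empty selectors match every object of that kind.
  serviceMonitorSelector: {}
  serviceMonitorNamespaceSelector: {}
  podMonitorSelector: {}
  podMonitorNamespaceSelector: {}
  ruleSelector: {}
```

Narrowing a selector, for example with `matchLabels`, restricts which ServiceMonitors, PodMonitors, or PrometheusRules this Prometheus instance picks up.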
|
||||
### How the Prometheus Operator Sets up Metrics Scraping
|
||||
|
||||
The Prometheus Deployment or StatefulSet scrapes metrics, and the configuration of Prometheus is controlled by the Prometheus custom resources. The Prometheus Operator watches for Prometheus and Alertmanager resources, and when they are created, the Prometheus Operator creates a Deployment or StatefulSet for Prometheus or Alertmanager with the user-defined configuration.
|
||||
|
||||
When the Prometheus Operator observes ServiceMonitors, PodMonitors, and PrometheusRules being created, it knows that the scrape configuration needs to be updated in Prometheus. It updates Prometheus by first updating the configuration and rules files in the volumes of Prometheus's Deployment or StatefulSet. Then it calls the Prometheus API to sync the new configuration, which results in the Prometheus Deployment or StatefulSet being modified in place.
|
||||
|
||||
### How Kubernetes Component Metrics are Exposed
|
||||
|
||||
Prometheus scrapes metrics from deployments known as [exporters,](https://prometheus.io/docs/instrumenting/exporters/) which export the time series data in a format that Prometheus can ingest. In Prometheus, time series consist of streams of timestamped values belonging to the same metric and the same set of labeled dimensions.
|
||||
|
||||
### Scraping Metrics with PushProx
|
||||
|
||||
Certain internal Kubernetes components are scraped via a proxy deployed as part of Monitoring V2 called PushProx. For detailed information on PushProx, refer [here](#how-pushprox-works) and to the above [architecture](#1-architecture-overview) section.
|
||||
|
||||
### Scraping Metrics
|
||||
|
||||
The following Kubernetes components are directly scraped by Prometheus:
|
||||
|
||||
- kubelet*
|
||||
- ingress-nginx**
|
||||
- coreDns/kubeDns
|
||||
- kube-api-server
|
||||
|
||||
\* You can optionally use `hardenedKubelet.enabled` to use a PushProx, but that is not the default.
|
||||
|
||||
\*\* For RKE2 clusters, ingress-nginx is deployed by default and treated as an internal Kubernetes component.
|
||||
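As a sketch, switching kubelet scraping to PushProx would be a one-key change in the Helm values of the `rancher-monitoring` chart, assuming the dotted name `hardenedKubelet.enabled` maps onto nested keys in the usual Helm way.

```yaml
# Helm values fragment (assumed nesting of hardenedKubelet.enabled).
hardenedKubelet:
  enabled: true   # scrape kubelet through PushProx instead of directly
```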
|
||||
### Scraping Metrics Based on Kubernetes Distribution
|
||||
|
||||
Metrics are scraped differently based on the Kubernetes distribution. For help with terminology, refer [here](#terminology). For details, see the table below:
|
||||
|
||||
<figcaption>How Metrics are Exposed to Prometheus</figcaption>
|
||||
|
||||
| Kubernetes Component | RKE2 | KubeADM | K3s |
|
||||
|-----|-----|-----|-----|
|
||||
| kube-controller-manager | rke2ControllerManager.enabled | kubeAdmControllerManager.enabled | k3sServer.enabled |
|
||||
| kube-scheduler | rke2Scheduler.enabled | kubeAdmScheduler.enabled | k3sServer.enabled |
|
||||
| etcd | rke2Etcd.enabled | kubeAdmEtcd.enabled | Not available |
|
||||
| kube-proxy | rke2Proxy.enabled | kubeAdmProxy.enabled | k3sServer.enabled |
|
||||
| kubelet | Collects metrics directly exposed by kubelet | Collects metrics directly exposed by kubelet | Collects metrics directly exposed by kubelet |
|
||||
| ingress-nginx* | Collects metrics directly exposed by kubelet, Exposed by rke2IngressNginx.enabled | Not available | Not available |
|
||||
| coreDns/kubeDns | Collects metrics directly exposed by coreDns/kubeDns | Collects metrics directly exposed by coreDns/kubeDns | Collects metrics directly exposed by coreDns/kubeDns |
|
||||
| kube-api-server | Collects metrics directly exposed by kube-api-server | Collects metrics directly exposed by kube-api-server | Collects metrics directly exposed by kube-api-server |
|
||||
|
||||
\* For RKE2 clusters, ingress-nginx is deployed by default and treated as an internal Kubernetes component.
|
||||
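The option names in the table read as Helm values for the `rancher-monitoring` chart. A values fragment for an RKE2 cluster might therefore look like the sketch below, assuming each dotted name maps onto nested keys; check the chart's own `values.yaml` for the defaults that apply to your version.

```yaml
# Helm values fragment for an RKE2 cluster (key names taken from the table).
rke2ControllerManager:
  enabled: true
rke2Scheduler:
  enabled: true
rke2Etcd:
  enabled: true
rke2Proxy:
  enabled: true
rke2IngressNginx:
  enabled: true
```

On K3s, the single `k3sServer.enabled` key covers kube-controller-manager, kube-scheduler, and kube-proxy, as the table shows.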
|
||||
### Terminology
|
||||
|
||||
- **kube-scheduler:** The internal Kubernetes component that uses information in the pod spec to decide on which node to run a pod.
|
||||
- **kube-controller-manager:** The internal Kubernetes component that is responsible for node management (detecting if a node fails), pod replication and endpoint creation.
|
||||
- **etcd:** The internal Kubernetes component that is the distributed key/value store which Kubernetes uses for persistent storage of all cluster information.
|
||||
- **kube-proxy:** The internal Kubernetes component that watches the API server for changes to pods and services in order to keep the network up to date.
|
||||
- **kubelet:** The internal Kubernetes component that watches the API server for pods on a node and makes sure they are running.
|
||||
- **ingress-nginx:** An Ingress controller for Kubernetes using NGINX as a reverse proxy and load balancer.
|
||||
- **coreDns/kubeDns:** The internal Kubernetes component responsible for DNS.
|
||||
- **kube-api-server:** The main internal Kubernetes component that is responsible for exposing APIs for the other master components.
|
||||
+102
@@ -0,0 +1,102 @@
|
||||
---
|
||||
title: Monitoring and Alerting
|
||||
description: Prometheus lets you view metrics from your different Rancher and Kubernetes objects. Learn about the scope of monitoring and how to enable cluster monitoring
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/monitoring-and-alerting"/>
|
||||
</head>
|
||||
|
||||
The `rancher-monitoring` application can quickly deploy leading open-source monitoring and alerting solutions onto your cluster.
|
||||
|
||||
Introduced in Rancher v2.5, the application is powered by [Prometheus](https://prometheus.io/), [Grafana](https://grafana.com/grafana/), [Alertmanager](https://prometheus.io/docs/alerting/latest/alertmanager/), the [Prometheus Operator](https://github.com/prometheus-operator/prometheus-operator), and the [Prometheus adapter.](https://github.com/DirectXMan12/k8s-prometheus-adapter)
|
||||
|
||||
For information on V1 monitoring and alerting, available in Rancher v2.2 up to v2.4, please see the Rancher v2.0—v2.4 docs on [cluster monitoring](https://github.com/rancher/rancher-docs/tree/main/archived_docs/en/version-2.0-2.4/explanations/integrations-in-rancher/cluster-monitoring/cluster-monitoring.md), [alerting](https://github.com/rancher/rancher-docs/tree/main/archived_docs/en/version-2.0-2.4/explanations/integrations-in-rancher/cluster-alerts/cluster-alerts.md), [notifiers](https://github.com/rancher/rancher-docs/tree/main/archived_docs/en/version-2.0-2.4/explanations/integrations-in-rancher/notifiers.md) and other [tools](https://github.com/rancher/rancher-docs/tree/main/archived_docs/en/version-2.0-2.4/reference-guides/rancher-project-tools/rancher-project-tools.md).
|
||||
|
||||
## Features
|
||||
|
||||
Prometheus lets you view metrics from your Rancher and Kubernetes objects. Using timestamps, Prometheus lets you query and view these metrics in easy-to-read graphs and visuals, either through the Rancher UI or Grafana, which is an analytics viewing platform deployed along with Prometheus.
|
||||
|
||||
By viewing data that Prometheus scrapes from your cluster control plane, nodes, and deployments, you can stay on top of everything happening in your cluster. You can then use these analytics to better run your organization: stop system emergencies before they start, develop maintenance strategies, or restore crashed servers.
|
||||
|
||||
The monitoring application:
|
||||
|
||||
- Monitors the state and processes of your cluster nodes, Kubernetes components, and software deployments.
|
||||
- Defines alerts based on metrics collected via Prometheus.
|
||||
- Creates custom Grafana dashboards.
|
||||
- Configures alert-based notifications via email, Slack, PagerDuty, etc. using Prometheus Alertmanager.
|
||||
- Defines precomputed, frequently needed or computationally expensive expressions as new time series based on metrics collected via Prometheus.
|
||||
- Exposes collected metrics from Prometheus to the Kubernetes Custom Metrics API via Prometheus Adapter for use in HPA.
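
As a rough sketch of the recording-rule item in the list above, a `PrometheusRule` resource could precompute a cluster-wide CPU utilization series. The resource name, the `cattle-monitoring-system` namespace, the group name, and the recorded series name are illustrative assumptions, not defaults confirmed by this page.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-recording-rules        # hypothetical name
  namespace: cattle-monitoring-system  # assumed namespace of the monitoring stack
spec:
  groups:
    - name: example.cluster.recording  # hypothetical group name
      rules:
        # Store the result of a frequently needed expression as a new time series.
        - record: cluster:cpu_utilization:ratio   # hypothetical series name
          expr: |-
            1 - avg(irate(node_cpu_seconds_total{mode="idle"}[5m]))
```

Dashboards and alerts can then query the recorded series by name instead of re-evaluating the full expression.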
|
||||
|
||||
See [How Monitoring Works](how-monitoring-works.md) for an explanation of how the monitoring components work together.
|
||||
|
||||
## Default Components and Deployments
|
||||
|
||||
### Built-in Dashboards
|
||||
|
||||
By default, the monitoring application deploys Grafana dashboards (curated by the [kube-prometheus](https://github.com/prometheus-operator/kube-prometheus) project) onto a cluster.
|
||||
|
||||
It also deploys an Alertmanager UI and a Prometheus UI. For more information about these tools, see [Built-in Dashboards.](built-in-dashboards.md)
|
||||
### Default Metrics Exporters
|
||||
|
||||
By default, Rancher Monitoring deploys exporters (such as [node-exporter](https://github.com/prometheus/node_exporter) and [kube-state-metrics](https://github.com/kubernetes/kube-state-metrics)).
|
||||
|
||||
These default exporters expose CPU and memory metrics for all components of your Kubernetes cluster, including your workloads, and Prometheus scrapes them automatically.
|
||||
|
||||
### Default Alerts
|
||||
|
||||
The monitoring application deploys some alerts by default. To see the default alerts, go to the [Alertmanager UI](built-in-dashboards.md#alertmanager-ui) and click **Expand all groups.**
|
||||
|
||||
### Components Exposed in the Rancher UI
|
||||
|
||||
For a list of monitoring components exposed in the Rancher UI, along with common use cases for editing them, see [this section.](how-monitoring-works.md#components-exposed-in-the-rancher-ui)
|
||||
|
||||
## Role-based Access Control
|
||||
|
||||
For more information on configuring access to monitoring, see [this page.](rbac-for-monitoring.md)
|
||||
|
||||
:::note
|
||||
|
||||
Rancher and Project read permissions don't necessarily apply to monitoring resources. See [monitoring-ui-view](rbac-for-monitoring.md#additional-monitoring-clusterroles) for more details.
|
||||
|
||||
:::
|
||||
|
||||
## Guides
|
||||
|
||||
- [Enable monitoring](../../how-to-guides/advanced-user-guides/monitoring-alerting-guides/enable-monitoring.md)
|
||||
- [Uninstall monitoring](../../how-to-guides/advanced-user-guides/monitoring-alerting-guides/uninstall-monitoring.md)
|
||||
- [Monitoring workloads](../../how-to-guides/advanced-user-guides/monitoring-alerting-guides/set-up-monitoring-for-workloads.md)
|
||||
- [Customizing Grafana dashboards](../../how-to-guides/advanced-user-guides/monitoring-alerting-guides/customize-grafana-dashboard.md)
|
||||
- [Persistent Grafana dashboards](../../how-to-guides/advanced-user-guides/monitoring-alerting-guides/create-persistent-grafana-dashboard.md)
|
||||
- [Debugging high memory usage](../../how-to-guides/advanced-user-guides/monitoring-alerting-guides/debug-high-memory-usage.md)
|
||||
|
||||
## Configuration
|
||||
|
||||
### Configuring Monitoring Resources in Rancher
|
||||
|
||||
The configuration reference assumes familiarity with how monitoring components work together. For more information, see [How Monitoring Works.](how-monitoring-works.md)
|
||||
|
||||
- [ServiceMonitor and PodMonitor](../../reference-guides/monitoring-v2-configuration/servicemonitors-and-podmonitors.md)
|
||||
- [Receiver](../../reference-guides/monitoring-v2-configuration/receivers.md)
|
||||
- [Route](../../reference-guides/monitoring-v2-configuration/routes.md)
|
||||
- [PrometheusRule](../../how-to-guides/advanced-user-guides/monitoring-v2-configuration-guides/advanced-configuration/prometheusrules.md)
|
||||
- [Prometheus](../../how-to-guides/advanced-user-guides/monitoring-v2-configuration-guides/advanced-configuration/prometheus.md)
|
||||
- [Alertmanager](../../how-to-guides/advanced-user-guides/monitoring-v2-configuration-guides/advanced-configuration/alertmanager.md)
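
To give a sense of what these resources look like, here is a minimal `ServiceMonitor` sketch. Every name, label, and port in it is a hypothetical placeholder; the linked reference pages remain the authority on which fields Rancher exposes.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app              # hypothetical name
  namespace: example-namespace   # hypothetical namespace
spec:
  selector:
    matchLabels:
      app: example-app           # must match the labels on the target Service
  endpoints:
    - port: metrics              # name of the Service port that serves metrics
      path: /metrics
      interval: 30s              # assumed scrape interval
```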
|
||||
|
||||
### Configuring Helm Chart Options
|
||||
|
||||
For more information on `rancher-monitoring` chart options, including options to set resource limits and requests, see [Helm Chart Options](../../reference-guides/monitoring-v2-configuration/helm-chart-options.md).
|
||||
|
||||
## Windows Cluster Support
|
||||
|
||||
To fully deploy Monitoring V2 on Windows, every Windows host must run [wins](https://github.com/rancher/wins) v0.1.0 or later.
|
||||
|
||||
For more details on how to upgrade wins on existing Windows hosts, see [Windows cluster support for Monitoring V2.](windows-support.md)
|
||||
|
||||
## Known Issues
|
||||
|
||||
There is a [known issue](https://github.com/rancher/rancher/issues/28787#issuecomment-693611821) in which K3s clusters require more memory than the default allocation. If you enable monitoring on a K3s cluster, set `prometheus.prometheusSpec.resources.limits.memory` to `2500Mi` and `prometheus.prometheusSpec.resources.requests.memory` to `1750Mi`.
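
A values-file fragment for these settings might look like the following. It assumes the chart nests the values under the standard Kubernetes `resources` layout (`limits` and `requests`), so check the key paths against the Helm chart options for your chart version.

```yaml
prometheus:
  prometheusSpec:
    resources:
      limits:
        memory: 2500Mi   # upper bound recommended above for K3s
      requests:
        memory: 1750Mi   # request recommended above for K3s
```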
|
||||
|
||||
See [Debugging High Memory Usage](../../how-to-guides/advanced-user-guides/monitoring-alerting-guides/debug-high-memory-usage.md) for advice and recommendations.
|
||||
+369
@@ -0,0 +1,369 @@
|
||||
---
|
||||
title: PromQL Expression Reference
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/monitoring-and-alerting/promql-expressions"/>
|
||||
</head>
|
||||
|
||||
The PromQL expressions in this doc can be used to configure alerts.
|
||||
|
||||
For more information about querying the Prometheus time series database, refer to the official [Prometheus documentation.](https://prometheus.io/docs/prometheus/latest/querying/basics/)
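
As an illustration, an expression from this page can serve as the `expr` of an alerting rule. The sketch below uses the Cluster CPU Utilization summary expression; the resource name, namespace, threshold of `0.9`, and `for` duration are assumptions chosen for the example.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-cluster-alerts         # hypothetical name
  namespace: cattle-monitoring-system  # assumed namespace of the monitoring stack
spec:
  groups:
    - name: example.cluster.alerts     # hypothetical group name
      rules:
        - alert: ClusterCPUUtilizationHigh
          # Summary expression compared against an assumed 90% threshold.
          expr: |-
            1 - (avg(irate(node_cpu_seconds_total{mode="idle"}[5m]))) > 0.9
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "Cluster CPU utilization has stayed above 90% for 10 minutes."
```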
|
||||
|
||||
|
||||
## Cluster Metrics
|
||||
|
||||
### Cluster CPU Utilization
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | `1 - (avg(irate(node_cpu_seconds_total{mode="idle"}[5m])) by (instance))` |
|
||||
| Summary | `1 - (avg(irate(node_cpu_seconds_total{mode="idle"}[5m])))` |
|
||||
|
||||
### Cluster Load Average
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | <table><tr><td>load1</td><td>`sum(node_load1) by (instance) / count(node_cpu_seconds_total{mode="system"}) by (instance)`</td></tr><tr><td>load5</td><td>`sum(node_load5) by (instance) / count(node_cpu_seconds_total{mode="system"}) by (instance)`</td></tr><tr><td>load15</td><td>`sum(node_load15) by (instance) / count(node_cpu_seconds_total{mode="system"}) by (instance)`</td></tr></table> |
|
||||
| Summary | <table><tr><td>load1</td><td>`sum(node_load1) by (instance) / count(node_cpu_seconds_total{mode="system"})`</td></tr><tr><td>load5</td><td>`sum(node_load5) by (instance) / count(node_cpu_seconds_total{mode="system"})`</td></tr><tr><td>load15</td><td>`sum(node_load15) by (instance) / count(node_cpu_seconds_total{mode="system"})`</td></tr></table> |
|
||||
|
||||
### Cluster Memory Utilization
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | `1 - sum(node_memory_MemAvailable_bytes) by (instance) / sum(node_memory_MemTotal_bytes) by (instance)` |
|
||||
| Summary | `1 - sum(node_memory_MemAvailable_bytes) / sum(node_memory_MemTotal_bytes)` |
|
||||
|
||||
### Cluster Disk Utilization
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | `(sum(node_filesystem_size_bytes{device!="rootfs"}) by (instance) - sum(node_filesystem_free_bytes{device!="rootfs"}) by (instance)) / sum(node_filesystem_size_bytes{device!="rootfs"}) by (instance)` |
|
||||
| Summary | `(sum(node_filesystem_size_bytes{device!="rootfs"}) - sum(node_filesystem_free_bytes{device!="rootfs"})) / sum(node_filesystem_size_bytes{device!="rootfs"})` |
|
||||
|
||||
### Cluster Disk I/O
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | <table><tr><td>read</td><td>`sum(rate(node_disk_read_bytes_total[5m])) by (instance)`</td></tr><tr><td>written</td><td>`sum(rate(node_disk_written_bytes_total[5m])) by (instance)`</td></tr></table> |
|
||||
| Summary | <table><tr><td>read</td><td>`sum(rate(node_disk_read_bytes_total[5m]))`</td></tr><tr><td>written</td><td>`sum(rate(node_disk_written_bytes_total[5m]))`</td></tr></table> |
|
||||
|
||||
### Cluster Network Packets
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | <table><tr><td>receive-dropped</td><td><code>sum(rate(node_network_receive_drop_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*"}[5m])) by (instance)</code></td></tr><tr><td>receive-errs</td><td><code>sum(rate(node_network_receive_errs_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*"}[5m])) by (instance)</code></td></tr><tr><td>receive-packets</td><td><code>sum(rate(node_network_receive_packets_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*"}[5m])) by (instance)</code></td></tr><tr><td>transmit-dropped</td><td><code>sum(rate(node_network_transmit_drop_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*"}[5m])) by (instance)</code></td></tr><tr><td>transmit-errs</td><td><code>sum(rate(node_network_transmit_errs_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*"}[5m])) by (instance)</code></td></tr><tr><td>transmit-packets</td><td><code>sum(rate(node_network_transmit_packets_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*"}[5m])) by (instance)</code></td></tr></table> |
|
||||
| Summary | <table><tr><td>receive-dropped</td><td><code>sum(rate(node_network_receive_drop_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*"}[5m]))</code></td></tr><tr><td>receive-errs</td><td><code>sum(rate(node_network_receive_errs_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*"}[5m]))</code></td></tr><tr><td>receive-packets</td><td><code>sum(rate(node_network_receive_packets_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*"}[5m]))</code></td></tr><tr><td>transmit-dropped</td><td><code>sum(rate(node_network_transmit_drop_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*"}[5m]))</code></td></tr><tr><td>transmit-errs</td><td><code>sum(rate(node_network_transmit_errs_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*"}[5m]))</code></td></tr><tr><td>transmit-packets</td><td><code>sum(rate(node_network_transmit_packets_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*"}[5m]))</code></td></tr></table> |
|
||||
|
||||
### Cluster Network I/O
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | <table><tr><td>receive</td><td><code>sum(rate(node_network_receive_bytes_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*"}[5m])) by (instance)</code></td></tr><tr><td>transmit</td><td><code>sum(rate(node_network_transmit_bytes_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*"}[5m])) by (instance)</code></td></tr></table> |
|
||||
| Summary | <table><tr><td>receive</td><td><code>sum(rate(node_network_receive_bytes_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*"}[5m]))</code></td></tr><tr><td>transmit</td><td><code>sum(rate(node_network_transmit_bytes_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*"}[5m]))</code></td></tr></table> |
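
The `device!~` matchers in the two network tables above are printed with spaces around each `|`. Assuming those spaces exist only to keep the Markdown table cells intact, a rule that uses the receive expression would write the alternation without them, roughly as follows. The group and series names are hypothetical.

```yaml
groups:
  - name: example.cluster.network   # hypothetical group name
    rules:
      - record: cluster:network_receive_bytes:rate5m   # hypothetical series name
        # Alternation written without spaces so each branch matches the bare device name.
        expr: |-
          sum(rate(node_network_receive_bytes_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*"}[5m]))
```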
|
||||
|
||||
## Node Metrics
|
||||
|
||||
### Node CPU Utilization
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | `avg(irate(node_cpu_seconds_total{mode!="idle", instance=~"$instance"}[5m])) by (mode)` |
|
||||
| Summary | `1 - (avg(irate(node_cpu_seconds_total{mode="idle", instance=~"$instance"}[5m])))` |
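
The `$instance` placeholder in the node expressions appears to be a dashboard template variable. Prometheus does not substitute it when it evaluates a rule, so an alert needs a concrete matcher. A sketch with a hypothetical `node-1.*` pattern and an assumed threshold:

```yaml
groups:
  - name: example.node.alerts   # hypothetical group name
    rules:
      - alert: NodeCPUUtilizationHigh
        # "$instance" replaced by a concrete (hypothetical) instance pattern.
        expr: |-
          1 - (avg(irate(node_cpu_seconds_total{mode="idle", instance=~"node-1.*"}[5m]))) > 0.9
        for: 10m
        labels:
          severity: warning
```

To cover every node with one rule, drop the `instance` matcher and aggregate with `by (instance)` instead, as the cluster-level Detail expressions do.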
|
||||
|
||||
### Node Load Average
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | <table><tr><td>load1</td><td>`sum(node_load1{instance=~"$instance"}) / count(node_cpu_seconds_total{mode="system",instance=~"$instance"})`</td></tr><tr><td>load5</td><td>`sum(node_load5{instance=~"$instance"}) / count(node_cpu_seconds_total{mode="system",instance=~"$instance"})`</td></tr><tr><td>load15</td><td>`sum(node_load15{instance=~"$instance"}) / count(node_cpu_seconds_total{mode="system",instance=~"$instance"})`</td></tr></table> |
|
||||
| Summary | <table><tr><td>load1</td><td>`sum(node_load1{instance=~"$instance"}) / count(node_cpu_seconds_total{mode="system",instance=~"$instance"})`</td></tr><tr><td>load5</td><td>`sum(node_load5{instance=~"$instance"}) / count(node_cpu_seconds_total{mode="system",instance=~"$instance"})`</td></tr><tr><td>load15</td><td>`sum(node_load15{instance=~"$instance"}) / count(node_cpu_seconds_total{mode="system",instance=~"$instance"})`</td></tr></table> |
|
||||
|
||||
### Node Memory Utilization
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | `1 - sum(node_memory_MemAvailable_bytes{instance=~"$instance"}) / sum(node_memory_MemTotal_bytes{instance=~"$instance"})` |
|
||||
| Summary | `1 - sum(node_memory_MemAvailable_bytes{instance=~"$instance"}) / sum(node_memory_MemTotal_bytes{instance=~"$instance"})` |
|
||||
|
||||
### Node Disk Utilization
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | `(sum(node_filesystem_size_bytes{device!="rootfs",instance=~"$instance"}) by (device) - sum(node_filesystem_free_bytes{device!="rootfs",instance=~"$instance"}) by (device)) / sum(node_filesystem_size_bytes{device!="rootfs",instance=~"$instance"}) by (device)` |
|
||||
| Summary | `(sum(node_filesystem_size_bytes{device!="rootfs",instance=~"$instance"}) - sum(node_filesystem_free_bytes{device!="rootfs",instance=~"$instance"})) / sum(node_filesystem_size_bytes{device!="rootfs",instance=~"$instance"})` |
|
||||
|
||||
### Node Disk I/O
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | <table><tr><td>read</td><td>`sum(rate(node_disk_read_bytes_total{instance=~"$instance"}[5m]))`</td></tr><tr><td>written</td><td>`sum(rate(node_disk_written_bytes_total{instance=~"$instance"}[5m]))`</td></tr></table> |
|
||||
| Summary | <table><tr><td>read</td><td>`sum(rate(node_disk_read_bytes_total{instance=~"$instance"}[5m]))`</td></tr><tr><td>written</td><td>`sum(rate(node_disk_written_bytes_total{instance=~"$instance"}[5m]))`</td></tr></table> |
|
||||
|
||||
### Node Network Packets
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | <table><tr><td>receive-dropped</td><td><code>sum(rate(node_network_receive_drop_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*",instance=~"$instance"}[5m])) by (device)</code></td></tr><tr><td>receive-errs</td><td><code>sum(rate(node_network_receive_errs_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*",instance=~"$instance"}[5m])) by (device)</code></td></tr><tr><td>receive-packets</td><td><code>sum(rate(node_network_receive_packets_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*",instance=~"$instance"}[5m])) by (device)</code></td></tr><tr><td>transmit-dropped</td><td><code>sum(rate(node_network_transmit_drop_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*",instance=~"$instance"}[5m])) by (device)</code></td></tr><tr><td>transmit-errs</td><td><code>sum(rate(node_network_transmit_errs_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*",instance=~"$instance"}[5m])) by (device)</code></td></tr><tr><td>transmit-packets</td><td><code>sum(rate(node_network_transmit_packets_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*",instance=~"$instance"}[5m])) by (device)</code></td></tr></table> |
|
||||
| Summary | <table><tr><td>receive-dropped</td><td><code>sum(rate(node_network_receive_drop_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*",instance=~"$instance"}[5m]))</code></td></tr><tr><td>receive-errs</td><td><code>sum(rate(node_network_receive_errs_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*",instance=~"$instance"}[5m]))</code></td></tr><tr><td>receive-packets</td><td><code>sum(rate(node_network_receive_packets_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*",instance=~"$instance"}[5m]))</code></td></tr><tr><td>transmit-dropped</td><td><code>sum(rate(node_network_transmit_drop_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*",instance=~"$instance"}[5m]))</code></td></tr><tr><td>transmit-errs</td><td><code>sum(rate(node_network_transmit_errs_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*",instance=~"$instance"}[5m]))</code></td></tr><tr><td>transmit-packets</td><td><code>sum(rate(node_network_transmit_packets_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*",instance=~"$instance"}[5m]))</code></td></tr></table> |
|
||||
|
||||
### Node Network I/O
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | <table><tr><td>receive</td><td><code>sum(rate(node_network_receive_bytes_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*",instance=~"$instance"}[5m])) by (device)</code></td></tr><tr><td>transmit</td><td><code>sum(rate(node_network_transmit_bytes_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*",instance=~"$instance"}[5m])) by (device)</code></td></tr></table> |
|
||||
| Summary | <table><tr><td>receive</td><td><code>sum(rate(node_network_receive_bytes_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*",instance=~"$instance"}[5m]))</code></td></tr><tr><td>transmit</td><td><code>sum(rate(node_network_transmit_bytes_total{device!~"lo | veth.* | docker.* | flannel.* | cali.* | cbr.*",instance=~"$instance"}[5m]))</code></td></tr></table> |
|
||||
|
||||
## Etcd Metrics
|
||||
|
||||
### Etcd Has a Leader
|
||||
|
||||
`max(etcd_server_has_leader)`
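
`etcd_server_has_leader` reports `1` when a member sees a leader and `0` otherwise, so this expression can back a simple alert. The rule name, `for` duration, and severity below are assumptions.

```yaml
groups:
  - name: example.etcd.alerts   # hypothetical group name
    rules:
      - alert: EtcdNoLeader
        # Fires only when no etcd member reports a leader.
        expr: max(etcd_server_has_leader) == 0
        for: 1m
        labels:
          severity: critical
```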
|
||||
|
||||
### Number of Times the Leader Changes
|
||||
|
||||
`max(etcd_server_leader_changes_seen_total)`
|
||||
|
||||
### Number of Failed Proposals
|
||||
|
||||
`sum(etcd_server_proposals_failed_total)`
|
||||
|
||||
### GRPC Client Traffic
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | <table><tr><td>in</td><td>`sum(rate(etcd_network_client_grpc_received_bytes_total[5m])) by (instance)`</td></tr><tr><td>out</td><td>`sum(rate(etcd_network_client_grpc_sent_bytes_total[5m])) by (instance)`</td></tr></table> |
|
||||
| Summary | <table><tr><td>in</td><td>`sum(rate(etcd_network_client_grpc_received_bytes_total[5m]))`</td></tr><tr><td>out</td><td>`sum(rate(etcd_network_client_grpc_sent_bytes_total[5m]))`</td></tr></table> |
|
||||
|
||||
### Peer Traffic
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | <table><tr><td>in</td><td>`sum(rate(etcd_network_peer_received_bytes_total[5m])) by (instance)`</td></tr><tr><td>out</td><td>`sum(rate(etcd_network_peer_sent_bytes_total[5m])) by (instance)`</td></tr></table> |
|
||||
| Summary | <table><tr><td>in</td><td>`sum(rate(etcd_network_peer_received_bytes_total[5m]))`</td></tr><tr><td>out</td><td>`sum(rate(etcd_network_peer_sent_bytes_total[5m]))`</td></tr></table> |
|
||||
|
||||
### DB Size
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | `sum(etcd_debugging_mvcc_db_total_size_in_bytes) by (instance)` |
|
||||
| Summary | `sum(etcd_debugging_mvcc_db_total_size_in_bytes)` |
|
||||
|
||||
### Active Streams
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | <table><tr><td>lease-watch</td><td>`sum(grpc_server_started_total{grpc_service="etcdserverpb.Lease",grpc_type="bidi_stream"}) by (instance) - sum(grpc_server_handled_total{grpc_service="etcdserverpb.Lease",grpc_type="bidi_stream"}) by (instance)`</td></tr><tr><td>watch</td><td>`sum(grpc_server_started_total{grpc_service="etcdserverpb.Watch",grpc_type="bidi_stream"}) by (instance) - sum(grpc_server_handled_total{grpc_service="etcdserverpb.Watch",grpc_type="bidi_stream"}) by (instance)`</td></tr></table> |
|
||||
| Summary | <table><tr><td>lease-watch</td><td>`sum(grpc_server_started_total{grpc_service="etcdserverpb.Lease",grpc_type="bidi_stream"}) - sum(grpc_server_handled_total{grpc_service="etcdserverpb.Lease",grpc_type="bidi_stream"})`</td></tr><tr><td>watch</td><td>`sum(grpc_server_started_total{grpc_service="etcdserverpb.Watch",grpc_type="bidi_stream"}) - sum(grpc_server_handled_total{grpc_service="etcdserverpb.Watch",grpc_type="bidi_stream"})`</td></tr></table> |
|
||||
|
||||
### Raft Proposals
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | <table><tr><td>applied</td><td>`sum(increase(etcd_server_proposals_applied_total[5m])) by (instance)`</td></tr><tr><td>committed</td><td>`sum(increase(etcd_server_proposals_committed_total[5m])) by (instance)`</td></tr><tr><td>pending</td><td>`sum(increase(etcd_server_proposals_pending[5m])) by (instance)`</td></tr><tr><td>failed</td><td>`sum(increase(etcd_server_proposals_failed_total[5m])) by (instance)`</td></tr></table> |
|
||||
| Summary | <table><tr><td>applied</td><td>`sum(increase(etcd_server_proposals_applied_total[5m]))`</td></tr><tr><td>committed</td><td>`sum(increase(etcd_server_proposals_committed_total[5m]))`</td></tr><tr><td>pending</td><td>`sum(increase(etcd_server_proposals_pending[5m]))`</td></tr><tr><td>failed</td><td>`sum(increase(etcd_server_proposals_failed_total[5m]))`</td></tr></table> |
|
||||
|
||||
### RPC Rate
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | <table><tr><td>total</td><td>`sum(rate(grpc_server_started_total{grpc_type="unary"}[5m])) by (instance)`</td></tr><tr><td>fail</td><td>`sum(rate(grpc_server_handled_total{grpc_type="unary",grpc_code!="OK"}[5m])) by (instance)`</td></tr></table> |
|
||||
| Summary | <table><tr><td>total</td><td>`sum(rate(grpc_server_started_total{grpc_type="unary"}[5m]))`</td></tr><tr><td>fail</td><td>`sum(rate(grpc_server_handled_total{grpc_type="unary",grpc_code!="OK"}[5m]))`</td></tr></table> |
|
||||
|
||||
### Disk Operations
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | <table><tr><td>commit-called-by-backend</td><td>`sum(rate(etcd_disk_backend_commit_duration_seconds_sum[1m])) by (instance)`</td></tr><tr><td>fsync-called-by-wal</td><td>`sum(rate(etcd_disk_wal_fsync_duration_seconds_sum[1m])) by (instance)`</td></tr></table> |
|
||||
| Summary | <table><tr><td>commit-called-by-backend</td><td>`sum(rate(etcd_disk_backend_commit_duration_seconds_sum[1m]))`</td></tr><tr><td>fsync-called-by-wal</td><td>`sum(rate(etcd_disk_wal_fsync_duration_seconds_sum[1m]))`</td></tr></table> |
|
||||
|
||||
### Disk Sync Duration
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | <table><tr><td>wal</td><td>`histogram_quantile(0.99, sum(rate(etcd_disk_wal_fsync_duration_seconds_bucket[5m])) by (instance, le))`</td></tr><tr><td>db</td><td>`histogram_quantile(0.99, sum(rate(etcd_disk_backend_commit_duration_seconds_bucket[5m])) by (instance, le))`</td></tr></table> |
|
||||
| Summary | <table><tr><td>wal</td><td>`sum(histogram_quantile(0.99, sum(rate(etcd_disk_wal_fsync_duration_seconds_bucket[5m])) by (instance, le)))`</td></tr><tr><td>db</td><td>`sum(histogram_quantile(0.99, sum(rate(etcd_disk_backend_commit_duration_seconds_bucket[5m])) by (instance, le)))`</td></tr></table> |
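
A possible alert built on the `wal` Detail expression is sketched below. The 0.5 second threshold, the `for` duration, and the rule name are assumptions for illustration; tune them to your storage.

```yaml
groups:
  - name: example.etcd.disk   # hypothetical group name
    rules:
      - alert: EtcdWALFsyncSlow
        # 99th percentile WAL fsync latency per etcd member, in seconds.
        expr: |-
          histogram_quantile(0.99, sum(rate(etcd_disk_wal_fsync_duration_seconds_bucket[5m])) by (instance, le)) > 0.5
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "etcd WAL fsync p99 on {{ $labels.instance }} is above 0.5s."
```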
|
||||
|
||||
## Kubernetes Components Metrics
|
||||
|
||||
### API Server Request Latency
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | `avg(apiserver_request_latencies_sum / apiserver_request_latencies_count) by (instance, verb) /1e+06` |
|
||||
| Summary | `avg(apiserver_request_latencies_sum / apiserver_request_latencies_count) by (instance) /1e+06` |
|
||||
|
||||
### API Server Request Rate
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | `sum(rate(apiserver_request_count[5m])) by (instance, code)` |
|
||||
| Summary | `sum(rate(apiserver_request_count[5m])) by (instance)` |
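
Newer Kubernetes releases expose these API server metrics under different names: `apiserver_request_total` in place of `apiserver_request_count`, and `apiserver_request_duration_seconds` in place of `apiserver_request_latencies`. If the expressions above return no data, equivalents along these lines may work. The group and recorded series names are hypothetical, and because the newer latency metric is already in seconds, the `/1e+06` conversion is dropped.

```yaml
groups:
  - name: example.apiserver   # hypothetical group name
    rules:
      - record: instance_code:apiserver_request:rate5m
        expr: sum(rate(apiserver_request_total[5m])) by (instance, code)
      - record: instance_verb:apiserver_request_duration_seconds:avg
        expr: |-
          avg(apiserver_request_duration_seconds_sum / apiserver_request_duration_seconds_count) by (instance, verb)
```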
|
||||
|
||||
### Scheduling Failed Pods
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | `sum(kube_pod_status_scheduled{condition="false"})` |
|
||||
| Summary | `sum(kube_pod_status_scheduled{condition="false"})` |
|
||||
|
||||
### Controller Manager Queue Depth
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | <table><tr><td>volumes</td><td>`sum(volumes_depth) by (instance)`</td></tr><tr><td>deployment</td><td>`sum(deployment_depth) by (instance)`</td></tr><tr><td>replicaset</td><td>`sum(replicaset_depth) by (instance)`</td></tr><tr><td>service</td><td>`sum(service_depth) by (instance)`</td></tr><tr><td>serviceaccount</td><td>`sum(serviceaccount_depth) by (instance)`</td></tr><tr><td>endpoint</td><td>`sum(endpoint_depth) by (instance)`</td></tr><tr><td>daemonset</td><td>`sum(daemonset_depth) by (instance)`</td></tr><tr><td>statefulset</td><td>`sum(statefulset_depth) by (instance)`</td></tr><tr><td>replicationmanager</td><td>`sum(replicationmanager_depth) by (instance)`</td></tr></table> |
|
||||
| Summary | <table><tr><td>volumes</td><td>`sum(volumes_depth)`</td></tr><tr><td>deployment</td><td>`sum(deployment_depth)`</td></tr><tr><td>replicaset</td><td>`sum(replicaset_depth)`</td></tr><tr><td>service</td><td>`sum(service_depth)`</td></tr><tr><td>serviceaccount</td><td>`sum(serviceaccount_depth)`</td></tr><tr><td>endpoint</td><td>`sum(endpoint_depth)`</td></tr><tr><td>daemonset</td><td>`sum(daemonset_depth)`</td></tr><tr><td>statefulset</td><td>`sum(statefulset_depth)`</td></tr><tr><td>replicationmanager</td><td>`sum(replicationmanager_depth)`</td></tr></table> |
|
||||
|
||||
### Scheduler E2E Scheduling Latency
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | `histogram_quantile(0.99, sum(scheduler_e2e_scheduling_latency_microseconds_bucket) by (le, instance)) / 1e+06` |
|
||||
| Summary | `sum(histogram_quantile(0.99, sum(scheduler_e2e_scheduling_latency_microseconds_bucket) by (le, instance)) / 1e+06)` |
|
||||
|
||||
### Scheduler Preemption Attempts
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | `sum(rate(scheduler_total_preemption_attempts[5m])) by (instance)` |
|
||||
| Summary | `sum(rate(scheduler_total_preemption_attempts[5m]))` |
|
||||
|
||||
### Ingress Controller Connections
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | <table><tr><td>reading</td><td>`sum(nginx_ingress_controller_nginx_process_connections{state="reading"}) by (instance)`</td></tr><tr><td>waiting</td><td>`sum(nginx_ingress_controller_nginx_process_connections{state="waiting"}) by (instance)`</td></tr><tr><td>writing</td><td>`sum(nginx_ingress_controller_nginx_process_connections{state="writing"}) by (instance)`</td></tr><tr><td>accepted</td><td>`sum(ceil(increase(nginx_ingress_controller_nginx_process_connections_total{state="accepted"}[5m]))) by (instance)`</td></tr><tr><td>active</td><td>`sum(ceil(increase(nginx_ingress_controller_nginx_process_connections_total{state="active"}[5m]))) by (instance)`</td></tr><tr><td>handled</td><td>`sum(ceil(increase(nginx_ingress_controller_nginx_process_connections_total{state="handled"}[5m]))) by (instance)`</td></tr></table> |
|
||||
| Summary | <table><tr><td>reading</td><td>`sum(nginx_ingress_controller_nginx_process_connections{state="reading"})`</td></tr><tr><td>waiting</td><td>`sum(nginx_ingress_controller_nginx_process_connections{state="waiting"})`</td></tr><tr><td>writing</td><td>`sum(nginx_ingress_controller_nginx_process_connections{state="writing"})`</td></tr><tr><td>accepted</td><td>`sum(ceil(increase(nginx_ingress_controller_nginx_process_connections_total{state="accepted"}[5m])))`</td></tr><tr><td>active</td><td>`sum(ceil(increase(nginx_ingress_controller_nginx_process_connections_total{state="active"}[5m])))`</td></tr><tr><td>handled</td><td>`sum(ceil(increase(nginx_ingress_controller_nginx_process_connections_total{state="handled"}[5m])))`</td></tr></table> |
|
||||
|
||||
### Ingress Controller Request Process Time
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | `topk(10, histogram_quantile(0.95,sum by (le, host, path)(rate(nginx_ingress_controller_request_duration_seconds_bucket{host!="_"}[5m]))))` |
|
||||
| Summary | `topk(10, histogram_quantile(0.95,sum by (le, host)(rate(nginx_ingress_controller_request_duration_seconds_bucket{host!="_"}[5m]))))` |
|
||||
|
||||
## Rancher Logging Metrics
|
||||
|
||||
|
||||
### Fluentd Buffer Queue Rate
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | `sum(rate(fluentd_output_status_buffer_queue_length[5m])) by (instance)` |
|
||||
| Summary | `sum(rate(fluentd_output_status_buffer_queue_length[5m]))` |
|
||||
|
||||
### Fluentd Input Rate
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | `sum(rate(fluentd_input_status_num_records_total[5m])) by (instance)` |
|
||||
| Summary | `sum(rate(fluentd_input_status_num_records_total[5m]))` |
|
||||
|
||||
### Fluentd Output Errors Rate
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | `sum(rate(fluentd_output_status_num_errors[5m])) by (type)` |
|
||||
| Summary | `sum(rate(fluentd_output_status_num_errors[5m]))` |
|
||||
|
||||
### Fluentd Output Rate
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | `sum(rate(fluentd_output_status_num_records_total[5m])) by (instance)` |
|
||||
| Summary | `sum(rate(fluentd_output_status_num_records_total[5m]))` |
|
||||
|
||||
## Workload Metrics
|
||||
|
||||
### Workload CPU Utilization
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | <table><tr><td>cfs throttled seconds</td><td>`sum(rate(container_cpu_cfs_throttled_seconds_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)`</td></tr><tr><td>user seconds</td><td>`sum(rate(container_cpu_user_seconds_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)`</td></tr><tr><td>system seconds</td><td>`sum(rate(container_cpu_system_seconds_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)`</td></tr><tr><td>usage seconds</td><td>`sum(rate(container_cpu_usage_seconds_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)`</td></tr></table> |
|
||||
| Summary | <table><tr><td>cfs throttled seconds</td><td>`sum(rate(container_cpu_cfs_throttled_seconds_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))`</td></tr><tr><td>user seconds</td><td>`sum(rate(container_cpu_user_seconds_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))`</td></tr><tr><td>system seconds</td><td>`sum(rate(container_cpu_system_seconds_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))`</td></tr><tr><td>usage seconds</td><td>`sum(rate(container_cpu_usage_seconds_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))`</td></tr></table> |
|
||||
|
||||
### Workload Memory Utilization
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | `sum(container_memory_working_set_bytes{namespace="$namespace",pod_name=~"$podName", container_name!=""}) by (pod_name)` |
|
||||
| Summary | `sum(container_memory_working_set_bytes{namespace="$namespace",pod_name=~"$podName", container_name!=""})` |
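
The workload expressions use the `pod_name` and `container_name` labels, and `$namespace` and `$podName` appear to be dashboard variables. On newer Kubernetes versions the container metrics carry `pod` and `container` labels instead. A sketch of the memory expression under that assumption, with hypothetical namespace, pod pattern, and series name:

```yaml
groups:
  - name: example.workload   # hypothetical group name
    rules:
      - record: pod:container_memory_working_set_bytes:sum   # hypothetical series name
        # Uses the newer "pod" and "container" labels and concrete matcher values.
        expr: |-
          sum(container_memory_working_set_bytes{namespace="example-namespace",pod=~"example-app-.*",container!=""}) by (pod)
```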
|
||||
|
||||
### Workload Network Packets
|
||||
|
||||
| Catalog | Expression |
|
||||
| --- | --- |
|
||||
| Detail | <table><tr><td>receive-packets</td><td>`sum(rate(container_network_receive_packets_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)`</td></tr><tr><td>receive-dropped</td><td>`sum(rate(container_network_receive_packets_dropped_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)`</td></tr><tr><td>receive-errors</td><td>`sum(rate(container_network_receive_errors_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)`</td></tr><tr><td>transmit-packets</td><td>`sum(rate(container_network_transmit_packets_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)`</td></tr><tr><td>transmit-dropped</td><td>`sum(rate(container_network_transmit_packets_dropped_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)`</td></tr><tr><td>transmit-errors</td><td>`sum(rate(container_network_transmit_errors_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)`</td></tr></table> |
| Summary | <table><tr><td>receive-packets</td><td>`sum(rate(container_network_receive_packets_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))`</td></tr><tr><td>receive-dropped</td><td>`sum(rate(container_network_receive_packets_dropped_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))`</td></tr><tr><td>receive-errors</td><td>`sum(rate(container_network_receive_errors_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))`</td></tr><tr><td>transmit-packets</td><td>`sum(rate(container_network_transmit_packets_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))`</td></tr><tr><td>transmit-dropped</td><td>`sum(rate(container_network_transmit_packets_dropped_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))`</td></tr><tr><td>transmit-errors</td><td>`sum(rate(container_network_transmit_errors_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))`</td></tr></table> |
|
||||
|
||||
### Workload Network I/O
|
||||
|
||||
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>receive</td><td>`sum(rate(container_network_receive_bytes_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)`</td></tr><tr><td>transmit</td><td>`sum(rate(container_network_transmit_bytes_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)`</td></tr></table> |
| Summary | <table><tr><td>receive</td><td>`sum(rate(container_network_receive_bytes_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))`</td></tr><tr><td>transmit</td><td>`sum(rate(container_network_transmit_bytes_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))`</td></tr></table> |
|
||||
|
||||
### Workload Disk I/O
|
||||
|
||||
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>read</td><td>`sum(rate(container_fs_reads_bytes_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)`</td></tr><tr><td>write</td><td>`sum(rate(container_fs_writes_bytes_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)`</td></tr></table> |
| Summary | <table><tr><td>read</td><td>`sum(rate(container_fs_reads_bytes_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))`</td></tr><tr><td>write</td><td>`sum(rate(container_fs_writes_bytes_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))`</td></tr></table> |
|
||||
|
||||
## Pod Metrics
|
||||
|
||||
### Pod CPU Utilization
|
||||
|
||||
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>cfs throttled seconds</td><td>`sum(rate(container_cpu_cfs_throttled_seconds_total{container_name!="POD",namespace="$namespace",pod_name="$podName", container_name!=""}[5m])) by (container_name)`</td></tr><tr><td>usage seconds</td><td>`sum(rate(container_cpu_usage_seconds_total{container_name!="POD",namespace="$namespace",pod_name="$podName", container_name!=""}[5m])) by (container_name)`</td></tr><tr><td>system seconds</td><td>`sum(rate(container_cpu_system_seconds_total{container_name!="POD",namespace="$namespace",pod_name="$podName", container_name!=""}[5m])) by (container_name)`</td></tr><tr><td>user seconds</td><td>`sum(rate(container_cpu_user_seconds_total{container_name!="POD",namespace="$namespace",pod_name="$podName", container_name!=""}[5m])) by (container_name)`</td></tr></table> |
| Summary | <table><tr><td>cfs throttled seconds</td><td>`sum(rate(container_cpu_cfs_throttled_seconds_total{container_name!="POD",namespace="$namespace",pod_name="$podName", container_name!=""}[5m]))`</td></tr><tr><td>usage seconds</td><td>`sum(rate(container_cpu_usage_seconds_total{container_name!="POD",namespace="$namespace",pod_name="$podName", container_name!=""}[5m]))`</td></tr><tr><td>system seconds</td><td>`sum(rate(container_cpu_system_seconds_total{container_name!="POD",namespace="$namespace",pod_name="$podName", container_name!=""}[5m]))`</td></tr><tr><td>user seconds</td><td>`sum(rate(container_cpu_user_seconds_total{container_name!="POD",namespace="$namespace",pod_name="$podName", container_name!=""}[5m]))`</td></tr></table> |
|
||||
|
||||
### Pod Memory Utilization
|
||||
|
||||
| Catalog | Expression |
| --- | --- |
| Detail | `sum(container_memory_working_set_bytes{container_name!="POD",namespace="$namespace",pod_name="$podName",container_name!=""}) by (container_name)` |
| Summary | `sum(container_memory_working_set_bytes{container_name!="POD",namespace="$namespace",pod_name="$podName",container_name!=""})` |
|
||||
|
||||
### Pod Network Packets
|
||||
|
||||
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>receive-packets</td><td>`sum(rate(container_network_receive_packets_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr><tr><td>receive-dropped</td><td>`sum(rate(container_network_receive_packets_dropped_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr><tr><td>receive-errors</td><td>`sum(rate(container_network_receive_errors_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr><tr><td>transmit-packets</td><td>`sum(rate(container_network_transmit_packets_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr><tr><td>transmit-dropped</td><td>`sum(rate(container_network_transmit_packets_dropped_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr><tr><td>transmit-errors</td><td>`sum(rate(container_network_transmit_errors_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr></table> |
| Summary | <table><tr><td>receive-packets</td><td>`sum(rate(container_network_receive_packets_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr><tr><td>receive-dropped</td><td>`sum(rate(container_network_receive_packets_dropped_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr><tr><td>receive-errors</td><td>`sum(rate(container_network_receive_errors_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr><tr><td>transmit-packets</td><td>`sum(rate(container_network_transmit_packets_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr><tr><td>transmit-dropped</td><td>`sum(rate(container_network_transmit_packets_dropped_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr><tr><td>transmit-errors</td><td>`sum(rate(container_network_transmit_errors_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr></table> |
|
||||
|
||||
### Pod Network I/O
|
||||
|
||||
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>receive</td><td>`sum(rate(container_network_receive_bytes_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr><tr><td>transmit</td><td>`sum(rate(container_network_transmit_bytes_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr></table> |
| Summary | <table><tr><td>receive</td><td>`sum(rate(container_network_receive_bytes_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr><tr><td>transmit</td><td>`sum(rate(container_network_transmit_bytes_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr></table> |
|
||||
|
||||
### Pod Disk I/O
|
||||
|
||||
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>read</td><td>`sum(rate(container_fs_reads_bytes_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m])) by (container_name)`</td></tr><tr><td>write</td><td>`sum(rate(container_fs_writes_bytes_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m])) by (container_name)`</td></tr></table> |
| Summary | <table><tr><td>read</td><td>`sum(rate(container_fs_reads_bytes_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr><tr><td>write</td><td>`sum(rate(container_fs_writes_bytes_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr></table> |
|
||||
|
||||
## Container Metrics
|
||||
|
||||
### Container CPU Utilization
|
||||
|
||||
| Catalog | Expression |
| --- | --- |
| cfs throttled seconds | `sum(rate(container_cpu_cfs_throttled_seconds_total{namespace="$namespace",pod_name="$podName",container_name="$containerName"}[5m]))` |
| usage seconds | `sum(rate(container_cpu_usage_seconds_total{namespace="$namespace",pod_name="$podName",container_name="$containerName"}[5m]))` |
| system seconds | `sum(rate(container_cpu_system_seconds_total{namespace="$namespace",pod_name="$podName",container_name="$containerName"}[5m]))` |
| user seconds | `sum(rate(container_cpu_user_seconds_total{namespace="$namespace",pod_name="$podName",container_name="$containerName"}[5m]))` |
|
||||
|
||||
### Container Memory Utilization
|
||||
|
||||
`sum(container_memory_working_set_bytes{namespace="$namespace",pod_name="$podName",container_name="$containerName"})`
|
||||
|
||||
### Container Disk I/O
|
||||
|
||||
| Catalog | Expression |
| --- | --- |
| read | `sum(rate(container_fs_reads_bytes_total{namespace="$namespace",pod_name="$podName",container_name="$containerName"}[5m]))` |
| write | `sum(rate(container_fs_writes_bytes_total{namespace="$namespace",pod_name="$podName",container_name="$containerName"}[5m]))` |
|
||||
+266
@@ -0,0 +1,266 @@
|
||||
---
title: Role-based Access Control
---

<head>
  <link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/monitoring-and-alerting/rbac-for-monitoring"/>
</head>
|
||||
|
||||
This section describes the expectations for RBAC for Rancher Monitoring.
|
||||
|
||||
|
||||
## Cluster Admins
|
||||
|
||||
By default, only those with the cluster-admin `ClusterRole` should be able to:

- Install the `rancher-monitoring` App onto a cluster and perform all other relevant configuration during chart deployment
  - e.g. whether default dashboards are created, what exporters are deployed onto the cluster to collect metrics, etc.
- Create / modify / delete Prometheus deployments in the cluster via Prometheus CRs
- Create / modify / delete Alertmanager deployments in the cluster via Alertmanager CRs
- Persist new Grafana dashboards or datasources via creating ConfigMaps in the appropriate namespace
- Expose certain Prometheus metrics to the k8s Custom Metrics API for HPA via a Secret in the `cattle-monitoring-system` namespace
|
||||
|
||||
## Users with Kubernetes ClusterRole-based Permissions
|
||||
|
||||
The `rancher-monitoring` chart installs the following three `ClusterRoles`. By default, they aggregate into the corresponding k8s `ClusterRoles`:
|
||||
|
||||
| ClusterRole | Aggregates To Default K8s ClusterRole |
| ------------------------------| ---------------------------|
| `monitoring-admin` | `admin` |
| `monitoring-edit` | `edit` |
| `monitoring-view` | `view` |
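
The aggregation described above relies on the standard Kubernetes aggregated-ClusterRole labels. As a rough sketch, a ClusterRole that aggregates into the built-in `edit` role could look like the manifest below. The name `example-monitoring-edit` and the rule list are assumptions made for illustration; this is not the manifest shipped by the `rancher-monitoring` chart.

```yaml
# Hypothetical sketch: the name and rules are illustrative,
# not copied from the rancher-monitoring chart.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: example-monitoring-edit
  labels:
    # Standard Kubernetes label: the rules of this ClusterRole are
    # merged into the built-in `edit` ClusterRole.
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
rules:
  # Full access to scrape and rule configuration.
  - apiGroups: ["monitoring.coreos.com"]
    resources: ["servicemonitors", "podmonitors", "prometheusrules"]
    verbs: ["*"]
  # Read-only access to the Prometheus and Alertmanager CRs.
  - apiGroups: ["monitoring.coreos.com"]
    resources: ["prometheuses", "alertmanagers"]
    verbs: ["get", "list", "watch"]
```

With a label like this in place, anyone already bound to `edit` picks up the extra rules without any additional binding.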
|
||||
|
||||
These `ClusterRoles` provide different levels of access to the Monitoring CRDs based on the actions that can be performed:
|
||||
|
||||
| CRDs (monitoring.coreos.com) | Admin | Edit | View |
| ------------------------------| ---------------------------| ---------------------------| ---------------------------|
| <ul><li>`prometheuses`</li><li>`alertmanagers`</li></ul>| Get, List, Watch | Get, List, Watch | Get, List, Watch |
| <ul><li>`servicemonitors`</li><li>`podmonitors`</li><li>`prometheusrules`</li></ul>| * | * | Get, List, Watch |
|
||||
|
||||
On a high level, the following permissions are assigned by default as a result.
|
||||
|
||||
### Users with Kubernetes Admin/Edit Permissions
|
||||
|
||||
Only those with the cluster-admin, admin or edit `ClusterRole` should be able to:

- Modify the scrape configuration of Prometheus deployments via ServiceMonitor and PodMonitor CRs
- Modify the alerting / recording rules of a Prometheus deployment via PrometheusRules CRs
|
||||
|
||||
### Users with Kubernetes View Permissions
|
||||
|
||||
Only those who have some Kubernetes `ClusterRole` should be able to:

- View the configuration of Prometheuses that are deployed within the cluster
- View the configuration of Alertmanagers that are deployed within the cluster
- View the scrape configuration of Prometheus deployments via ServiceMonitor and PodMonitor CRs
- View the alerting/recording rules of a Prometheus deployment via PrometheusRules CRs
|
||||
|
||||
### Additional Monitoring Roles
|
||||
|
||||
Monitoring also creates additional `Roles` that are not assigned to users by default but are created within the cluster. They can be bound to a namespace by deploying a `RoleBinding` that references them. To define a `RoleBinding` with `kubectl` instead of through Rancher, see [Assigning Roles and ClusterRoles with kubectl](#assigning-roles-and-clusterroles-with-kubectl).
|
||||
|
||||
Admins should use these roles to provide more fine-grained access to users:
|
||||
|
||||
| Role | Purpose |
| ------------------------------| ---------------------------|
| monitoring-config-admin | Allow admins to assign roles to users to be able to view / modify Secrets and ConfigMaps within the cattle-monitoring-system namespace. Modifying Secrets / ConfigMaps in this namespace could allow users to alter the cluster's Alertmanager configuration, Prometheus Adapter configuration, additional Grafana datasources, TLS secrets, etc. |
| monitoring-config-edit | Allow admins to assign roles to users to be able to view / modify Secrets and ConfigMaps within the cattle-monitoring-system namespace. Modifying Secrets / ConfigMaps in this namespace could allow users to alter the cluster's Alertmanager configuration, Prometheus Adapter configuration, additional Grafana datasources, TLS secrets, etc. |
| monitoring-config-view | Allow admins to assign roles to users to be able to view Secrets and ConfigMaps within the cattle-monitoring-system namespace. Viewing Secrets / ConfigMaps in this namespace could allow users to observe the cluster's Alertmanager configuration, Prometheus Adapter configuration, additional Grafana datasources, TLS secrets, etc. |
| monitoring-dashboard-admin | Allow admins to assign roles to users to be able to edit / view ConfigMaps within the cattle-dashboards namespace. ConfigMaps in this namespace will correspond to Grafana Dashboards that are persisted onto the cluster. |
| monitoring-dashboard-edit | Allow admins to assign roles to users to be able to edit / view ConfigMaps within the cattle-dashboards namespace. ConfigMaps in this namespace will correspond to Grafana Dashboards that are persisted onto the cluster. |
| monitoring-dashboard-view | Allow admins to assign roles to users to be able to view ConfigMaps within the cattle-dashboards namespace. ConfigMaps in this namespace will correspond to Grafana Dashboards that are persisted onto the cluster. |
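
As an illustration of how one of these roles might be granted, the sketch below binds `monitoring-dashboard-edit` to a single user in the `cattle-dashboards` namespace. The binding name and the user ID `u-example` are placeholders assumed for this example; substitute the ID of a real user in your cluster.

```yaml
# Hypothetical example: the binding name and user ID are placeholders.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: example-dashboard-edit
  # The Role and the binding must live in the same namespace.
  namespace: cattle-dashboards
roleRef:
  kind: Role
  name: monitoring-dashboard-edit
  apiGroup: rbac.authorization.k8s.io
subjects:
  - kind: User
    name: u-example # placeholder; replace with a real user ID
    apiGroup: rbac.authorization.k8s.io
```

Because this is a `RoleBinding` to a namespaced `Role`, the access it grants stays confined to ConfigMaps in `cattle-dashboards`.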
|
||||
|
||||
|
||||
### Assigning Monitoring Roles through Custom Roles
|
||||
|
||||
Admins may assign custom roles in the Rancher UI for administering, editing, and viewing monitoring. These roles are created by default when the monitoring app is installed, and they are aggregated into the corresponding Kubernetes `admin`, `edit`, and `view` `ClusterRoles`.
|
||||
|
||||
:::note Important
|
||||
|
||||
The UI won't offer `monitoring-admin`, `monitoring-edit`, and `monitoring-view` options when users are being added to a cluster. These monitoring roles can only be assigned by manually creating a custom role that inherits from Project Owner and Project Monitoring View roles.
|
||||
|
||||
:::
|
||||
|
||||
1. Create the custom role:

    1.1 Click **☰ > Users & Authentication > Roles**.

    1.2 Select the appropriate tab, e.g., **Cluster** role. Then click **Create Cluster Role**.

    1.3 In the **Name** field, create a custom role such as `View Monitoring`, `Edit Monitoring`, or `Admin Monitoring`.

    1.4 Click **Inherit From > Add Resource**, then select the Kubernetes role, as applicable, from the dropdown.

    1.5 Click **Create**.

2. Assign the custom role to a new user:

    2.1 Click **☰ > Cluster Management > Cluster Explore > Cluster > Cluster Members > Add**.

    2.2 Search for the new user's name in the **Select Member** options displayed.

    2.3 Assign the new custom role from **Cluster Permissions** to the new user.

    2.4 Click **Create**.

**Result:** The new user should now be able to see the monitoring tools.
|
||||
|
||||
### Additional Monitoring ClusterRoles
|
||||
|
||||
Monitoring also creates additional `ClusterRoles` that aren't assigned to users by default but are created within the cluster. They aren't aggregated by default but can be bound to a namespace or to the whole cluster by deploying a `RoleBinding` or `ClusterRoleBinding` that references them. To define a `RoleBinding` with `kubectl` instead of through Rancher, see [Assigning Roles and ClusterRoles with kubectl](#assigning-roles-and-clusterroles-with-kubectl).
|
||||
|
||||
| Role | Purpose |
| ------------------------------| ---------------------------|
| monitoring-ui-view | This ClusterRole allows users with write access to the project to view metrics graphs for the specified cluster in the Rancher UI. This is done by granting Read-only access to external Monitoring UIs. Users with this role have permission to list the Prometheus, Alertmanager, and Grafana endpoints and make GET requests to Prometheus, Alertmanager, and Grafana UIs through the Rancher proxy. <br/> <br/> This role doesn't grant access to monitoring endpoints. As a result, users with this role won't be able to view cluster monitoring graphs and dashboards in the Rancher UI; however, they are able to access the monitoring Grafana, Prometheus, and Alertmanager UIs if provided those links. |
|
||||
|
||||
:::note
|
||||
|
||||
A user bound to the **View Monitoring** Rancher role and read-only project permissions can't view links in the Monitoring UI. They can still access external monitoring UIs if provided links to those UIs. If you wish to grant access to users with the **View Monitoring** role and read-only project permissions, move the `cattle-monitoring-system` namespace into the project.
|
||||
|
||||
:::
|
||||
|
||||
### Assigning Roles and ClusterRoles with kubectl
|
||||
|
||||
#### Using `kubectl create`
|
||||
|
||||
One method is to use either `kubectl create clusterrolebinding` or `kubectl create rolebinding` to assign a `Role` or `ClusterRole`. This is shown in the following examples:
|
||||
|
||||
- Assign to a specific user:

<Tabs groupId="role-type">
<TabItem value="clusterrolebinding">

```plain
kubectl create clusterrolebinding my-binding --clusterrole=monitoring-ui-view --user=u-l4npx
```

</TabItem>
<TabItem value="rolebinding">

```plain
kubectl create rolebinding my-binding --clusterrole=monitoring-ui-view --user=u-l4npx --namespace=my-namespace
```

</TabItem>
</Tabs>
|
||||
- Assign to all authenticated users:

<Tabs groupId="role-type">
<TabItem value="clusterrolebinding">

```plain
kubectl create clusterrolebinding my-binding --clusterrole=monitoring-ui-view --group=system:authenticated
```

</TabItem>
<TabItem value="rolebinding">

```plain
kubectl create rolebinding my-binding --clusterrole=monitoring-ui-view --group=system:authenticated --namespace=my-namespace
```

</TabItem>
</Tabs>
|
||||
|
||||
#### Using YAML Files
|
||||
|
||||
Another method is to define bindings in YAML files that you create. You must first configure the `RoleBinding` or `ClusterRoleBinding` with a YAML file. Then, apply the configuration changes by running the `kubectl apply` command.
|
||||
|
||||
- **Roles**: Below is an example YAML file to help you configure `RoleBindings` in Kubernetes. You'll need to fill in the name below.
|
||||
|
||||
:::note
|
||||
|
||||
Names are case-sensitive.
|
||||
|
||||
:::
|
||||
|
||||
```yaml
# monitoring-config-view-role-binding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: monitoring-config-view
  namespace: cattle-monitoring-system
roleRef:
  kind: Role
  name: monitoring-config-view
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: User
  name: u-b4qkhsnliz # this can be found via `kubectl get users -A`
  apiGroup: rbac.authorization.k8s.io
```
|
||||
|
||||
- **kubectl**: Below is an example of a `kubectl` command used to apply the binding you've created in the YAML file. Remember to fill in your YAML filename accordingly.
|
||||
```plain
kubectl apply -f monitoring-config-view-role-binding.yaml
```
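
A `ClusterRoleBinding` can be written the same way. The sketch below is the YAML counterpart of the `kubectl create clusterrolebinding` example for a specific user shown earlier, reusing the binding name `my-binding` and the example user ID `u-l4npx` from that command. The filename in the comment is an assumption for illustration.

```yaml
# monitoring-ui-view-cluster-role-binding.yaml (hypothetical filename)
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  # ClusterRoleBindings are cluster-scoped, so no namespace is set.
  name: my-binding
roleRef:
  kind: ClusterRole
  name: monitoring-ui-view
  apiGroup: rbac.authorization.k8s.io
subjects:
  - kind: User
    name: u-l4npx # example user ID reused from the kubectl examples
    apiGroup: rbac.authorization.k8s.io
```

Apply it with `kubectl apply -f`, in the same way as the `RoleBinding` file.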
|
||||
|
||||
## Users with Rancher Based Permissions
|
||||
|
||||
The relationship between the default roles deployed by Rancher (i.e. cluster-owner, cluster-member, project-owner, project-member), the default Kubernetes roles, and the roles deployed by the rancher-monitoring chart is detailed in the table below:
|
||||
|
||||
<figcaption>Default Rancher Permissions and Corresponding Kubernetes ClusterRoles</figcaption>
|
||||
|
||||
| Rancher Role | Kubernetes Role | Monitoring ClusterRole / Role | ClusterRoleBinding or RoleBinding? |
| --------- | --------- | --------- | --------- |
| cluster-owner | cluster-admin | N/A | ClusterRoleBinding |
| cluster-member | admin | monitoring-admin | ClusterRoleBinding |
| project-owner | admin | monitoring-admin | RoleBinding within Project namespace |
| project-member | edit | monitoring-edit | RoleBinding within Project namespace |
|
||||
|
||||
In addition to these default roles, the following Rancher project roles can be applied to members of your cluster to provide access to monitoring. These Rancher roles are tied to ClusterRoles deployed by the monitoring chart:
|
||||
|
||||
<figcaption>Non-default Rancher Permissions and Corresponding Kubernetes ClusterRoles</figcaption>
|
||||
|
||||
| Rancher Role | Kubernetes ClusterRole | Available In Rancher From | Available in Monitoring v2 From |
|--------------------------|-------------------------------|-------|------|
| View Monitoring* | [monitoring-ui-view](#additional-monitoring-clusterroles) | 2.4.8+ | 9.4.204+ |
|
||||
|
||||
:::note
|
||||
|
||||
A user bound to the **View Monitoring** Rancher role and read-only project permissions can't view links in the Monitoring UI. They can still access external monitoring UIs if provided links to those UIs. If you wish to grant access to users with the **View Monitoring** role and read-only project permissions, move the `cattle-monitoring-system` namespace into the project.
|
||||
|
||||
:::
|
||||
|
||||
### Differences in 2.5.x
|
||||
|
||||
Users with the project-member or project-owner role will not be given access to either Prometheus or Grafana in Rancher 2.5.x, since Grafana and Prometheus are only created at the cluster level.
|
||||
|
||||
In addition, while project owners will still be only able to add ServiceMonitors / PodMonitors that scrape resources within their project's namespace by default, PrometheusRules are not scoped to a single namespace / project. Therefore, any alert rules or recording rules created by project-owners within their project namespace will be applied across the entire cluster, although they will be unable to view / edit / delete any rules that were created outside the project's namespace.
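
To make the scoping behavior concrete, here is a minimal sketch of a `PrometheusRule` that a project owner might create inside their own project namespace. The resource name, the namespace `my-project-namespace`, the alert name, and the threshold are all placeholders assumed for illustration. The `container_name` label follows the expressions used elsewhere in these docs and may differ on your Kubernetes version.

```yaml
# Hypothetical example: names, namespace, and threshold are placeholders.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-project-rules
  namespace: my-project-namespace
spec:
  groups:
    - name: example.rules
      rules:
        - alert: ExampleHighMemoryWorkingSet
          # The expression has no namespace matcher, so it is evaluated
          # against series collected from the entire cluster.
          expr: sum(container_memory_working_set_bytes{container_name!=""}) by (namespace) > 1e9
          for: 10m
          labels:
            severity: warning
```

Although the object itself lives in one namespace, nothing in the rule restricts which series it evaluates, which is why rules created by project owners apply cluster-wide.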
|
||||
|
||||
### Assigning Additional Access
|
||||
|
||||
If cluster-admins would like to provide additional admin/edit access to users outside of the roles offered by the rancher-monitoring chart, the following table identifies the potential impact:
|
||||
|
||||
| CRDs (monitoring.coreos.com) | Can it cause impact outside of a namespace / project? | Impact |
|----------------------------| ------| ----------------------------|
| `prometheuses`| Yes, this resource can scrape metrics from any targets across the entire cluster (unless the Operator itself is otherwise configured). | User will be able to define the configuration of new cluster-level Prometheus deployments that should be created in the cluster. |
| `alertmanagers`| No | User will be able to define the configuration of new cluster-level Alertmanager deployments that should be created in the cluster. Note: if you just want to allow users to configure settings like Routes and Receivers, you should just provide access to the Alertmanager Config Secret instead. |
| <ul><li>`servicemonitors`</li><li>`podmonitors`</li></ul>| No, not by default; this is configurable via `ignoreNamespaceSelectors` on the Prometheus CR. | User will be able to set up scrapes by Prometheus on endpoints exposed by Services / Pods within the namespace they are given this permission in. |
| `prometheusrules`| Yes, PrometheusRules are cluster-scoped. | User will be able to define alert or recording rules on Prometheus based on any series collected across the entire cluster. |
|
||||
|
||||
| k8s Resources | Namespace | Can it cause impact outside of a namespace / project? | Impact |
|----------------------------| ------| ------| ----------------------------|
| <ul><li>`secrets`</li><li>`configmaps`</li></ul>| `cattle-monitoring-system` | Yes, Configs and Secrets in this namespace can impact the entire monitoring / alerting pipeline. | User will be able to create or edit Secrets / ConfigMaps such as the Alertmanager Config, Prometheus Adapter Config, TLS secrets, additional Grafana datasources, etc. This can have broad impact on all cluster monitoring / alerting. |
| <ul><li>`secrets`</li><li>`configmaps`</li></ul>| `cattle-dashboards` | Yes, Configs and Secrets in this namespace can create dashboards that make queries on all metrics collected at a cluster-level. | User will be able to create Secrets / ConfigMaps that persist new Grafana Dashboards only. |
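
As one way to grant the lowest-impact access from the tables above, a cluster admin could define a namespaced `Role` limited to `servicemonitors` and `podmonitors`. The sketch below assumes a hypothetical role name and a namespace called `my-namespace`; neither is created by the rancher-monitoring chart.

```yaml
# Hypothetical example: role name and namespace are placeholders.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: example-scrape-config-edit
  namespace: my-namespace
rules:
  # Only scrape configuration; no access to prometheuses,
  # alertmanagers, or prometheusrules.
  - apiGroups: ["monitoring.coreos.com"]
    resources: ["servicemonitors", "podmonitors"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
```

A user bound to this `Role` through a `RoleBinding` in the same namespace can manage scrape targets there without being able to create cluster-wide alerting rules.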
|
||||
|
||||
## Role-based Access Control for Grafana
|
||||
|
||||
Rancher allows any users who are authenticated by Kubernetes and have access to the Grafana service deployed by the Rancher Monitoring chart to access Grafana via the Rancher Dashboard UI. By default, all users who are able to access Grafana are given the [Viewer](https://grafana.com/docs/grafana/latest/permissions/organization_roles/#viewer-role) role, which allows them to view any of the default dashboards deployed by Rancher.
|
||||
|
||||
However, users can choose to log in to Grafana as an [Admin](https://grafana.com/docs/grafana/latest/permissions/organization_roles/#admin-role) if necessary. The default Admin username and password for the Grafana instance will be `admin`/`prom-operator`, but alternative credentials can also be supplied on deploying or upgrading the chart.
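
Alternative credentials are typically supplied as Helm values. The fragment below is a sketch that assumes the chart exposes the Grafana subchart's `adminUser` and `adminPassword` keys under `grafana`, as the upstream Grafana Helm chart does; check the values file of your `rancher-monitoring` version before relying on it. The password shown is a placeholder.

```yaml
# Assumed values keys (grafana.adminUser / grafana.adminPassword);
# verify against the values.yaml of your chart version.
grafana:
  adminUser: admin
  adminPassword: change-me # placeholder; use a strong secret instead
```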
|
||||
|
||||
To see the Grafana UI, install `rancher-monitoring`. Then:
|
||||
|
||||
1. In the upper left corner, click **☰ > Cluster Management**.
1. On the **Clusters** page, go to the cluster where you want to see the visualizations and click **Explore**.
1. In the left navigation bar, click **Monitoring**.
1. Click **Grafana**.
|
||||
|
||||
<figcaption>Cluster Compute Resources Dashboard in Grafana</figcaption>
|
||||
|
||||

|
||||
|
||||
<figcaption>Default Dashboards in Grafana</figcaption>
|
||||
|
||||

|
||||
+43
@@ -0,0 +1,43 @@
|
||||
---
title: Windows Cluster Support for Monitoring V2
---

<head>
  <link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/monitoring-and-alerting/windows-support"/>
</head>
|
||||
|
||||
Monitoring V2 can be deployed on a Windows cluster to scrape metrics from Windows nodes using [prometheus-community/windows_exporter](https://github.com/prometheus-community/windows_exporter) (previously named `wmi_exporter`).
|
||||
|
||||
## Cluster Requirements
|
||||
|
||||
Monitoring V2 for Windows can only scrape metrics from Windows hosts that have a minimum `wins` version of v0.1.0. To be able to fully deploy Monitoring V2 for Windows, all of your hosts must meet this requirement.
|
||||
|
||||
### Upgrading Existing Clusters to wins v0.1.0
|
||||
|
||||
If the cluster was provisioned before Rancher 2.5.8 (even if the current Rancher version is 2.5.8), you will not be able to successfully deploy Monitoring V2 for Windows until you upgrade the wins version on each host to at least v0.1.0.
|
||||
|
||||
To facilitate this upgrade, Rancher 2.5.8 introduced a new Helm chart called `rancher-wins-upgrader`.
|
||||
|
||||
1. Deploy `rancher-wins-upgrader` with the following override:
|
||||
|
||||
```yaml
# Masquerading bootstraps the wins-upgrader installation via
# a previously whitelisted process path since the normal install path,
# c:\etc\rancher\wins\wins-upgrade.exe is not normally whitelisted.
# In this case, we are using the previously whitelisted process
# path used by Monitoring V1.
masquerade:
  enabled: true
  as: c:\\etc\wmi-exporter\wmi-exporter.exe
```
|
||||
|
||||
2. Once all your hosts have been successfully upgraded, please ensure that you deploy the Helm chart once again with default values to avoid conflicts with the following settings:
|
||||
|
||||
```yaml
masquerade:
  enabled: false
```
|
||||
|
||||
**Result:** The hosts are ready for Monitoring V2 to be installed. You may choose to uninstall the `rancher-wins-upgrader` chart or keep it in your cluster to facilitate future upgrades.
|
||||
|
||||
For more information on how it can be used, please see the [README.md](https://github.com/rancher/wins/blob/master/charts/rancher-wins-upgrader/README.md) of the chart.
|
||||
@@ -0,0 +1,27 @@
|
||||
---
title: Container Security with NeuVector
---

<head>
  <link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/neuvector"/>
</head>
|
||||
|
||||
NeuVector is the only 100% open source, Zero Trust container security platform. Continuously scan throughout the container lifecycle. Remove security roadblocks. Bake in security policies at the start to maximize developer agility. NeuVector provides vulnerability and compliance scanning and management from build to production. The unique NeuVector run-time protection protects network connections within and ingress/egress to the cluster with a Layer7 container firewall. Additionally, NeuVector monitors process and file activity in containers and on hosts to stop unauthorized activity.
|
||||
|
||||
## NeuVector with Rancher
|
||||
|
||||
All NeuVector features are available through Rancher with integrated deployment and single sign-on to the NeuVector console. Rancher cluster admins are able to deploy and manage the NeuVector deployment on their clusters and easily configure NeuVector through Helm values, ConfigMaps, custom resource definitions (CRDs) and the NeuVector console.
|
||||
|
||||
With NeuVector and Rancher:

- Deploy, manage and secure multiple clusters.
- Manage and report vulnerabilities and compliance results for Rancher workloads and nodes.
|
||||
|
||||
## NeuVector Prime with Rancher Prime
|
||||
|
||||
The NeuVector UI Extension for Rancher Manager is available and supported for Rancher Prime and NeuVector Prime customers. This extension provides:
|
||||
|
||||
- Automated deployment of NeuVector, including the Rancher Prime NeuVector Extension dashboard.
|
||||
- Access to important security information from each cluster, such as critical security events, vulnerability scan results, and ingress/egress exposures.
|
||||
- Integrated vulnerability (CVE) and compliance scan results directly in Rancher resources such as nodes and containers/pods.
|
||||
- Integrated actions such as manual triggers of scans on Rancher resources.
|
||||
@@ -0,0 +1,170 @@
|
||||
---
|
||||
title: Overview
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/neuvector/overview"/>
|
||||
</head>
|
||||
|
||||
[NeuVector 5.x](https://open-docs.neuvector.com/) is an open-source container-centric security platform that is integrated with Rancher. NeuVector offers real-time compliance, visibility, and protection for critical applications and data during runtime. NeuVector provides a firewall, container process/file system monitoring, security auditing with CIS benchmarks, and vulnerability scanning. For more information on Rancher security, please see the [security documentation](../../reference-guides/rancher-security/rancher-security.md).
|
||||
|
||||
NeuVector can be enabled through a Helm chart that may be installed either through **Apps** or through the **Cluster Tools** button in the Rancher UI. Once the Helm chart is installed, users can easily [deploy and manage NeuVector clusters within Rancher](https://open-docs.neuvector.com/deploying/rancher#deploy-and-manage-neuvector-through-rancher-apps-marketplace).
|
||||
|
||||
## Installing NeuVector with Rancher
|
||||
|
||||
The NeuVector Helm chart is used to manage access to the NeuVector UI in Rancher, where users can navigate directly to deploy and manage their NeuVector clusters.
|
||||
|
||||
**To navigate to and install the NeuVector chart through Apps:**
|
||||
|
||||
1. Click **☰ > Cluster Management**.
|
||||
1. On the Clusters page, go to the cluster where you want to deploy NeuVector, and click **Explore**.
|
||||
1. Go to **Apps > Charts**, and install **NeuVector** from the chart repo.
|
||||
1. Different cluster types require different container runtimes. When configuring Helm chart values, go to the **Container Runtime** section, and select your runtime in accordance with the cluster type. Finally, click **Install** again.
|
||||
|
||||
Some examples are as follows:
|
||||
|
||||
- K3s and RKE2: `k3scontainerd`
|
||||
- AKS: `containerd` for v1.19 and up
|
||||
- EKS: `docker` for v1.22 and below; `containerd` for v1.23 and up
|
||||
- GKE: `containerd` (see the [Google docs](https://cloud.google.com/kubernetes-engine/docs/concepts/using-containerd) for more)
|
||||
|
||||
:::note
|
||||
|
||||
Only one container runtime engine may be selected at a time during installation.
|
||||
|
||||
:::
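
The runtime examples above can be sketched as a small lookup. This is a hypothetical helper for illustration only (the function name `neuvector_runtime` is not part of Rancher or NeuVector); it simply encodes the list above and assumes you pass the Kubernetes minor version for AKS and EKS.

```sh
#!/bin/sh
# Hypothetical helper (not part of Rancher or NeuVector): prints the
# container runtime value listed above for a given cluster type.
# Usage: neuvector_runtime <cluster-type> [kubernetes-minor-version]
neuvector_runtime() {
  cluster_type="$1"
  k8s_minor="${2:-}"
  case "$cluster_type" in
    k3s|rke2) echo "k3scontainerd" ;;
    gke)      echo "containerd" ;;
    aks)
      # The list above only covers AKS v1.19 and up.
      if [ -n "$k8s_minor" ] && [ "$k8s_minor" -ge 19 ]; then
        echo "containerd"
      else
        echo "no runtime listed for AKS below v1.19" >&2
        return 1
      fi ;;
    eks)
      if [ -z "$k8s_minor" ]; then
        echo "EKS requires a Kubernetes minor version" >&2
        return 1
      elif [ "$k8s_minor" -le 22 ]; then
        echo "docker"
      else
        echo "containerd"
      fi ;;
    *)
      echo "unknown cluster type: $cluster_type" >&2
      return 1 ;;
  esac
}
```

For example, `neuvector_runtime eks 23` prints the value to select in the **Container Runtime** section for an EKS v1.23 cluster. Cluster types not covered by the list return a non-zero status rather than a guess.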
|
||||
|
||||
**To navigate to and install the NeuVector chart through Cluster Tools:**
|
||||
|
||||
1. Click **☰ > Cluster Management**.
|
||||
1. On the Clusters page, go to the cluster where you want to deploy NeuVector, and click **Explore**.
|
||||
1. Click on **Cluster Tools** at the bottom of the left navigation bar.
|
||||
1. Repeat step 4 above to select your container runtime accordingly, then click **Install** again.
|
||||
|
||||
## Accessing NeuVector from the Rancher UI
|
||||
|
||||
1. Navigate to the cluster explorer of the cluster where NeuVector is installed. In the left navigation bar, click **NeuVector**.
|
||||
1. Click the external link to go to the NeuVector UI. Once the link is selected, users must accept the `END USER LICENSE AGREEMENT` to access the NeuVector UI.
|
||||
|
||||
## Uninstalling NeuVector from the Rancher UI
|
||||
|
||||
**To uninstall from Apps:**
|
||||
|
||||
1. Click **☰ > Cluster Management**.
|
||||
1. Under **Apps**, click **Installed Apps**.
|
||||
1. Under `cattle-neuvector-system`, select the NeuVector app (and the associated CRD if desired), then click **Delete**.
|
||||
|
||||
**To uninstall from Cluster Tools:**
|
||||
|
||||
1. Click **☰ > Cluster Management**.
|
||||
1. Click on **Cluster Tools** at the bottom-left of the screen, then click on the trash can icon under the NeuVector chart. Select `Delete the CRD associated with this app` if desired, then click **Delete**.
|
||||
|
||||
## GitHub Repository
|
||||
|
||||
The NeuVector project is available [here](https://github.com/neuvector/neuvector).
|
||||
|
||||
## Documentation
|
||||
|
||||
The NeuVector documentation is [here](https://open-docs.neuvector.com/).
|
||||
|
||||
## Architecture
|
||||
|
||||
The NeuVector security solution contains four types of security containers: Controllers, Enforcers, Managers, and Scanners. A special container called an All-in-One is also provided to combine the Controller, Enforcer, and Manager functions all in one container, primarily for Docker-native deployments. There is also an Updater which, when run, will update the CVE database.
|
||||
|
||||
- **Controller:** Manages the NeuVector Enforcer container; provides REST APIs for the management console.
|
||||
- **Enforcer:** Enforces security policies.
|
||||
- **Manager:** Provides a web-UI and CLI console to manage the NeuVector platform.
|
||||
- **All-in-One:** Includes the Controller, Enforcer, and Manager.
|
||||
- **Scanner:** Performs the vulnerability and compliance scanning for images, containers, and nodes.
|
||||
- **Updater:** Updates the CVE database for NeuVector (when run); redeploys scanner pods.
|
||||
|
||||
<figcaption>**NeuVector Security Containers:**</figcaption>
|
||||
|
||||

|
||||
|
||||
<figcaption>**NeuVector Architecture:**</figcaption>
|
||||
|
||||

|
||||
|
||||
To learn more about NeuVector's architecture, see the [NeuVector architecture overview](https://open-docs.neuvector.com/basics/overview#architecture).
|
||||
|
||||
## CPU and Memory Allocations
|
||||
|
||||
Below are the minimum recommended computing resources for the NeuVector chart installation in a default deployment. Note that the resource limit is not set.
|
||||
|
||||
| Container  | CPU - Request | Memory - Request |
|------------|---------------|------------------|
| Controller | 3 (1GB 1vCPU needed per controller) | * |
| Enforcer   | On all nodes (500MB .5vCPU) | 1GB |
| Manager    | 1 (500MB .5vCPU) | * |
| Scanner    | 3 (100MB .5vCPU) | * |
|
||||
|
||||
\* Minimum 1GB of memory total required for Controller, Manager, and Scanner containers combined.
|
||||
|
||||
## Hardened Cluster Support - Calico and Canal
|
||||
|
||||
NeuVector components Controller and Enforcer are deployable if PSP is set to true.
|
||||
|
||||
**Applicable to NeuVector chart version 100.0.0+up2.2.0 only:**
|
||||
|
||||
For Manager, Scanner, and Updater components, additional configuration is required as shown below:
|
||||
|
||||
```sh
kubectl patch deploy neuvector-manager-pod -n cattle-neuvector-system --patch '{"spec":{"template":{"spec":{"securityContext":{"runAsUser": 5400}}}}}'
kubectl patch deploy neuvector-scanner-pod -n cattle-neuvector-system --patch '{"spec":{"template":{"spec":{"securityContext":{"runAsUser": 5400}}}}}'
kubectl patch cronjob neuvector-updater-pod -n cattle-neuvector-system --patch '{"spec":{"jobTemplate":{"spec":{"template":{"spec":{"securityContext":{"runAsUser": 5400}}}}}}}'
```
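
The three patches above differ only in how deeply the pod template is nested: Deployments carry it under `spec.template`, while the CronJob wraps it once more under `spec.jobTemplate`. A minimal sketch of that relationship, using hypothetical helper names (`runas_patch` and `cronjob_runas_patch` are illustrative, not part of any chart or CLI):

```sh
#!/bin/sh
# Hypothetical helper: prints the pod-template securityContext patch
# used for the Manager and Scanner Deployments. $1 is the runAsUser UID.
runas_patch() {
  printf '{"spec":{"template":{"spec":{"securityContext":{"runAsUser": %s}}}}}\n' "$1"
}

# CronJobs nest the pod template under jobTemplate, so the same patch
# is wrapped one level deeper for the Updater.
cronjob_runas_patch() {
  printf '{"spec":{"jobTemplate":%s}}\n' "$(runas_patch "$1")"
}
```

With these defined, `kubectl patch deploy neuvector-manager-pod -n cattle-neuvector-system --patch "$(runas_patch 5400)"` is equivalent to the first command above. To confirm the value afterwards, `kubectl get deploy neuvector-manager-pod -n cattle-neuvector-system -o jsonpath='{.spec.template.spec.securityContext.runAsUser}'` should print the UID that was applied.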
|
||||
|
||||
You will need to set additional configuration for your hardened cluster environment.
|
||||
|
||||
:::note
|
||||
You must update your config in both RKE2 and K3s hardened clusters as shown below.
|
||||
:::
|
||||
|
||||
1. Click **☰ > Cluster Management**.
|
||||
1. Go to the cluster that you created and click **Explore**.
|
||||
1. In the left navigation bar, click **Apps**.
|
||||
1. Install (or upgrade to) NeuVector version `100.0.1+up2.2.2`.
|
||||
|
||||
- Under **Edit Options** > **Other Configuration**, enable **Pod Security Policy** by checking the box. Note that you must also enter a value greater than `zero` for `Manager runAsUser ID`, `Scanner runAsUser ID`, and `Updater runAsUser ID`:
|
||||
|
||||

|
||||
|
||||
1. Click **Install** at the bottom-right to complete.
|
||||
|
||||
## SELinux-enabled Cluster Support - Calico and Canal
|
||||
|
||||
To enable SELinux on RKE2 clusters, follow the steps below:
|
||||
|
||||
- NeuVector components Controller and Enforcer are deployable if PSP is set to true.
|
||||
|
||||
|
||||
**Applicable to NeuVector chart version 100.0.0+up2.2.0 only:**
|
||||
|
||||
- For Manager, Scanner, and Updater components, additional configuration is required as shown below:
|
||||
|
||||
```sh
kubectl patch deploy neuvector-manager-pod -n cattle-neuvector-system --patch '{"spec":{"template":{"spec":{"securityContext":{"runAsUser": 5400}}}}}'
kubectl patch deploy neuvector-scanner-pod -n cattle-neuvector-system --patch '{"spec":{"template":{"spec":{"securityContext":{"runAsUser": 5400}}}}}'
kubectl patch cronjob neuvector-updater-pod -n cattle-neuvector-system --patch '{"spec":{"jobTemplate":{"spec":{"template":{"spec":{"securityContext":{"runAsUser": 5400}}}}}}}'
```
|
||||
|
||||
## Cluster Support in an Air-Gapped Environment
|
||||
|
||||
- All NeuVector components are deployable on a cluster in an air-gapped environment without any additional configuration needed.
|
||||
|
||||
## Support Limitations
|
||||
|
||||
* Only admins and cluster owners are currently supported.
|
||||
|
||||
* Fleet multi-cluster deployment is not supported.
|
||||
|
||||
* NeuVector is not supported on a Windows cluster.
|
||||
|
||||
## Other Limitations
|
||||
|
||||
* Currently, NeuVector feature chart installation fails when a NeuVector partner chart already exists. To work around this issue, uninstall the NeuVector partner chart and reinstall the NeuVector feature chart.
|
||||
|
||||
* Sometimes when the controllers are not ready, the NeuVector UI is not accessible from the Rancher UI. During this time, controllers will try to restart, and it takes a few minutes for the controllers to be active.
|
||||
|
||||
* Container runtime is not auto-detected for different cluster types when installing the NeuVector chart. To work around this, you can specify the runtime manually.
|
||||
@@ -0,0 +1,34 @@
|
||||
---
|
||||
title: Kubernetes on the Desktop with Rancher Desktop
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/rancher-desktop"/>
|
||||
</head>
|
||||
|
||||
|
||||
Rancher Desktop bundles together essential tools for developing and testing cloud-native applications from your desktop.
|
||||
|
||||
If you're working from your local machine on apps intended for cloud environments, you normally need a lot of preparation. You need to select a container run-time, install Kubernetes and popular utilities, and possibly set up a virtual machine. Installing components individually and getting them to work together can be a time-consuming process.
|
||||
|
||||
To reduce the complexity, Rancher Desktop offers teams the following key features:
|
||||
|
||||
- Simple and easy installation on macOS, Linux and Windows operating systems.
|
||||
- K3s, a ready-to-use, light-weight Kubernetes distribution.
|
||||
- The ability to easily switch between Kubernetes versions.
|
||||
- A GUI-based cluster dashboard powered by Rancher to explore your local cluster.
|
||||
- Freedom to choose your container engine: dockerd (moby) or containerd.
|
||||
- Preference settings to configure the application to suit your needs.
|
||||
- Bundled tools required for your container, for Kubernetes-based development, and for operation workflows.
|
||||
- Periodic updates to keep bundled tools up to date.
|
||||
- Integration with popular tools/IDEs, including VS Code and Skaffold.
|
||||
- Image & Registry access control.
|
||||
- Support for Docker extensions.
|
||||
|
||||
Visit the [Rancher Desktop](https://rancherdesktop.io) website and read the [docs](https://docs.rancherdesktop.io/) to learn more.
|
||||
|
||||
To install Rancher Desktop on your machine, refer to the [installation guide](https://docs.rancherdesktop.io/getting-started/installation).
|
||||
|
||||
## Trying Rancher on Rancher Desktop
|
||||
|
||||
Rancher Desktop offers the setup and tools you need to easily try out containerized, Helm-based applications. You can get started with the Rancher Kubernetes Management platform using Rancher Desktop, by following this [how-to guide](https://docs.rancherdesktop.io/how-to-guides/rancher-on-rancher-desktop).
|
||||
@@ -0,0 +1,198 @@
|
||||
---
|
||||
title: Rancher Extensions
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/rancher-extensions"/>
|
||||
</head>
|
||||
|
||||
Extensions allow users, developers, partners, and customers to extend and enhance the Rancher UI. In addition, users can make changes and create enhancements to their UI functionality independent of Rancher releases. Extensions will enable users to build on top of Rancher to better tailor it to their respective environments. Note that users will also have the ability to update to new versions as well as roll back to a previous version.
|
||||
|
||||
Extensions are Helm charts that can only be installed once into a cluster; therefore, these charts have been simplified and separated from the general Helm charts listed under **Apps**.
|
||||
|
||||
Examples of built-in Rancher extensions are Fleet, Explorer, and Harvester. Examples of other extensions that use the Extensions API that can be manually added are Kubewarden and Elemental.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
> You must log in as an admin in order to view and interact with the extensions management page.
|
||||
|
||||
## Installing Extensions
|
||||
|
||||
1. Click **☰ > Extensions** under **Configuration**.
|
||||
|
||||
2. If not already installed in **Apps**, you must enable the extension operator by clicking the **Enable** button.
|
||||
|
||||
- If your installation is not air-gapped, click **OK** to add the Rancher extension repository. If it is air-gapped, uncheck the box first, then click **OK**.
|
||||
|
||||

|
||||
|
||||
3. On the **Extensions** page, click on the **Available** tab to select which extensions you want to install.
|
||||
|
||||
4. If no extensions are showing as available, you may manually add repos as follows:
|
||||
|
||||
4.1. In the upper right of the screen, click **⋮ > Manage Repositories > Create**.
|
||||
|
||||
4.2. Add the desired repo name, making sure to also specify the Git Repo URL and the Git Branch.
|
||||
|
||||
4.3. Click **Create** in the lower right again to complete.
|
||||
|
||||

|
||||
|
||||
5. Under the **Available** tab, click **Install** on the desired extension and version as in the example below. You can also update your extension from this screen, as the button to **Update** will appear on the extension if one is available.
|
||||
|
||||

|
||||
|
||||
6. Click the **Reload** page button that will appear after your extension successfully installs. Note that a logged-in user who has just installed an extension will not see a change to the UI **unless** they reload the page.
|
||||
|
||||

|
||||
|
||||
## Updating and Upgrading Extensions
|
||||
|
||||
1. Click **☰ > Extensions** under **Configuration**.
|
||||
1. Select the **Updates** tab.
|
||||
1. Click **Update**.
|
||||
|
||||
If there is a new version of the extension, there will also be an **Update** button visible on the associated card for the extension in the **Available** tab.
|
||||
|
||||
## Deleting Extensions
|
||||
|
||||
1. Click **☰**, then click on the name of your local cluster.
|
||||
1. From the sidebar, select **Apps > Installed Apps**.
|
||||
1. Find the name of the chart you want to delete and select the checkbox next to it.
|
||||
1. Click **Delete**.
|
||||
|
||||
## Deleting Extension Repositories
|
||||
|
||||
1. Click **☰ > Extensions** under **Configuration**.
|
||||
1. On the top right, click **⋮ > Manage Repositories**.
|
||||
1. Find the name of the extension repository you want to delete. Select the checkbox next to the repository name, then click **Delete**.
|
||||
|
||||
## Deleting Extension Repository Container Images
|
||||
|
||||
1. Click **☰**, then select **Extensions**, under **Configuration**.
|
||||
1. On the top right, click **⋮ > Manage Extension Catalogs**.
|
||||
1. Find the name of the container image you want to delete, then click **⋮ > Uninstall**.
|
||||
|
||||
## Uninstalling Extensions
|
||||
|
||||
There are two ways to uninstall or disable an extension:
|
||||
|
||||
1. Under the **Installed** tab, click the **Uninstall** button on the extension you wish to remove.
|
||||
|
||||

|
||||
|
||||
1. On the extensions management page, click **⋮ > Disable Extension Support**. This will disable all installed extensions.
|
||||
|
||||

|
||||
|
||||
:::caution
|
||||
|
||||
You must reload the page after disabling extensions or display issues may occur.
|
||||
|
||||
:::
|
||||
|
||||
## Developing Extensions
|
||||
|
||||
To learn how to develop your own extensions, refer to the official [Getting Started](https://rancher.github.io/dashboard/extensions/extensions-getting-started) guide.
|
||||
|
||||
## Working with Extensions in an Air-gapped Environment
|
||||
|
||||
If you intend to work with extensions in an air-gapped environment, you must perform some extra steps before you can complete certain tasks.
|
||||
|
||||
### Accessing Rancher UI Extensions in an Air-Gapped Environment
|
||||
|
||||
Rancher provides some extensions, such as Kubewarden and Elemental, through the `ui-plugin-catalog` container image at https://hub.docker.com/r/rancher/ui-plugin-catalog/tags. If you're trying to install these extensions in an air-gapped environment, you must make the `ui-plugin-catalog` image accessible.
|
||||
|
||||
1. Mirror the `ui-plugin-catalog` image to a private registry:
|
||||
|
||||
```bash
export REGISTRY_ENDPOINT=<my-private-registry-endpoint> # e.g. "my-private-registry.com"
docker pull rancher/ui-plugin-catalog:<tag>
docker tag rancher/ui-plugin-catalog:<tag> $REGISTRY_ENDPOINT/rancher/ui-plugin-catalog:<tag>
docker push $REGISTRY_ENDPOINT/rancher/ui-plugin-catalog:<tag>
```

2. Use the mirrored image to create a Kubernetes [deployment](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/):
|
||||
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ui-plugin-catalog
  namespace: cattle-ui-plugin-system
  labels:
    catalog.cattle.io/ui-extensions-catalog-image: ui-plugin-catalog
spec:
  replicas: 1
  selector:
    matchLabels:
      catalog.cattle.io/ui-extensions-catalog-image: ui-plugin-catalog
  template:
    metadata:
      namespace: cattle-ui-plugin-system
      labels:
        catalog.cattle.io/ui-extensions-catalog-image: ui-plugin-catalog
    spec:
      containers:
      - name: server
        image: <my-private-registry-endpoint>/rancher/ui-plugin-catalog:<tag>
        imagePullPolicy: Always
      imagePullSecrets:
      - name: <my-registry-credentials>
```
|
||||
3. Expose the deployment by creating a [ClusterIP service](https://kubernetes.io/docs/concepts/services-networking/service/#type-clusterip):
|
||||
```yaml
apiVersion: v1
kind: Service
metadata:
  name: ui-plugin-catalog-svc
  namespace: cattle-ui-plugin-system
spec:
  ports:
    - name: catalog-svc-port
      port: 8080
      protocol: TCP
      targetPort: 8080
  selector:
    catalog.cattle.io/ui-extensions-catalog-image: ui-plugin-catalog
  type: ClusterIP
```
|
||||
4. Create a [ClusterRepo](../how-to-guides/new-user-guides/helm-charts-in-rancher/helm-charts-in-rancher.md) that targets the ClusterIP service:
|
||||
```yaml
apiVersion: catalog.cattle.io/v1
kind: ClusterRepo
metadata:
  name: ui-plugin-catalog-repo
spec:
  url: http://ui-plugin-catalog-svc.cattle-ui-plugin-system:8080
```
|
||||
|
||||
After you successfully set up these resources, you can install the extensions from the `ui-plugin-charts` manifest into your air-gapped environment.
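
The four resources above are tied together by two derived strings: the mirrored image reference used in the Deployment, and the in-cluster URL that the ClusterRepo points at (`http://<service>.<namespace>:<port>`). The sketch below shows how each is composed; the function names and the example tag `1.0.0` are hypothetical and only serve the illustration.

```sh
#!/bin/sh
# Hypothetical helpers showing how the names in steps 1-4 fit together.

# Image reference pushed in step 1 and used by the Deployment in step 2.
# Usage: mirrored_image_ref <registry-endpoint> <tag>
mirrored_image_ref() {
  printf '%s/rancher/ui-plugin-catalog:%s\n' "$1" "$2"
}

# In-cluster URL for the ClusterRepo in step 4, built from the Service
# name, namespace, and port defined in step 3.
# Usage: cluster_repo_url <service-name> <namespace> <port>
cluster_repo_url() {
  printf 'http://%s.%s:%s\n' "$1" "$2" "$3"
}
```

Calling `cluster_repo_url ui-plugin-catalog-svc cattle-ui-plugin-system 8080` reproduces the `spec.url` value from step 4. If you rename the Service or change its port, the ClusterRepo URL has to change with it.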
|
||||
|
||||
### Importing and Installing Extensions in an Air-gapped Environment
|
||||
|
||||
1. Find the address of the container image repository that you want to import as an extension. You should import and use the latest tagged version of the image to ensure you receive the latest features and security updates.
|
||||
- **(Optional)** If the container image is private: [Create](../how-to-guides/new-user-guides/kubernetes-resources-setup/secrets.md) a registry secret within the `cattle-ui-plugin-system` namespace. Enter the domain of the image address in the **Registry Domain Name** field.
|
||||
1. Click **☰**, then select **Extensions**, under **Configuration**.
|
||||
1. On the top right, click **⋮ > Manage Extension Catalogs**.
|
||||
1. Select the **Import Extension Catalog** button.
|
||||
1. Enter the image address in the **Catalog Image Reference** field.
|
||||
* **(Optional)** If the container image is private, select the secret you just created from the **Pull Secrets** drop-down menu.
|
||||
1. Click **Load**. The extension will now be **Pending**.
|
||||
1. Return to the **Extensions** page.
|
||||
1. Select the **Available** tab, and click **Reload** to make sure that the list of extensions is up to date.
|
||||
1. Find the extension you just added, and click **Install**.
|
||||
|
||||
### Updating and Upgrading an Extensions Repository in an Air-gapped Environment
|
||||
|
||||
Extensions repositories that aren't air-gapped are automatically updated. If the repository is air-gapped, you must update it manually.
|
||||
|
||||
First, mirror the latest changes to your private registry by following the same steps for initially [importing and installing an extension repository](#importing-and-installing-extensions-in-an-air-gapped-environment).
|
||||
|
||||
After you mirror the latest changes, follow these steps:
|
||||
|
||||
1. Click **☰ > Local**.
|
||||
1. From the sidebar, select **Workloads > Deployments**.
|
||||
1. From the namespaces dropdown menu, select **cattle-ui-plugin-system**.
|
||||
1. Find the **cattle-ui-plugin-system** namespace.
|
||||
1. Select the `ui-plugin-catalog` deployment.
|
||||
1. Click **⋮ > Edit config**.
|
||||
1. Update the **Container Image** field within the deployment's container with the latest image.
|
||||
1. Click **Save**.
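
If you prefer the command line, the UI steps above roughly correspond to a single `kubectl set image` call. The sketch below assumes the deployment was created from the manifest shown earlier on this page, so the container is named `server`; the function name `update_catalog_image` and the `KUBECTL` override variable are hypothetical conveniences, not Rancher features.

```sh
#!/bin/sh
# Sketch only: points the ui-plugin-catalog deployment at a newly mirrored image.
# Assumes the container is named "server", as in the Deployment manifest above.
# Set KUBECTL=echo to print the arguments instead of contacting a cluster.
# Usage: update_catalog_image <registry-endpoint> <tag>
update_catalog_image() {
  ${KUBECTL:-kubectl} set image deployment/ui-plugin-catalog \
    "server=$1/rancher/ui-plugin-catalog:$2" \
    -n cattle-ui-plugin-system
}
```

Running it with `KUBECTL=echo` first is a cheap way to check the image reference before applying the change to the local cluster.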
|
||||
+21
@@ -0,0 +1,21 @@
|
||||
---
|
||||
title: SUSE Observability
|
||||
---
|
||||
|
||||
<head>
|
||||
<link rel="canonical" href="https://ranchermanager.docs.rancher.com/integrations-in-rancher/suse-observability"/>
|
||||
</head>
|
||||
|
||||
SUSE Observability is a complete observability solution that provides deep insights into the health of your clusters and nodes, and the workloads running on them. Designed to give you clear visibility into your entire Kubernetes environment, SUSE Observability’s full-stack approach allows you to seamlessly explore everything from services to infrastructure within a single platform, eliminating the need for multiple observability tools.
|
||||
|
||||
SUSE Observability securely collects and correlates data, offering actionable insights into both existing and potential issues in your cluster. This helps you address current problems swiftly and take preventative measures against future challenges.
|
||||
|
||||
The intuitive dashboards highlight problem areas and offer remediation steps, guiding you from issue identification to root cause analysis, and ultimately to resolution, in the quickest possible time.
|
||||
|
||||
For more information and to set up SUSE Observability in your SUSE Rancher-managed Kubernetes cluster, please refer to the [documentation](https://docs.stackstate.com/).
|
||||
|
||||
:::note
|
||||
|
||||
The documentation portal for SUSE Observability is currently under development. In the coming months, the portal will be rolled out featuring comprehensive guides, tutorials, and references to support you on your SUSE Observability journey. Stay tuned!
|
||||
|
||||
:::
|
||||