Merge pull request #2749 from catherineluse/staging-merge

Pull master into staging
This commit is contained in:
Catherine Luse
2020-10-05 20:53:06 -07:00
committed by GitHub
61 changed files with 4899 additions and 180 deletions
+23
@@ -314,6 +314,29 @@ rpm -i https://rpm.rancher.io/k3s-selinux-0.1.1-rc1.el7.noarch.rpm
To force the install script to log a warning rather than fail, you can set the following environment variable: `INSTALL_K3S_SELINUX_WARN=true`.
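For example, when using the standard install script, the variable can be set inline (a minimal sketch, assuming the usual `get.k3s.io` installer; combine with any other install variables you normally pass):
```bash
# Log a warning instead of failing if the SELinux policy cannot be applied
curl -sfL https://get.k3s.io | INSTALL_K3S_SELINUX_WARN=true sh -
```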
The way that SELinux enforcement is enabled or disabled depends on the K3s version. Prior to v1.19.x, SELinux enablement for the builtin containerd was automatic but could be disabled by passing `--disable-selinux`. With v1.19.x and beyond, enabling SELinux must be affirmatively configured via the `--selinux` flag or config file entry. Servers and agents that specify both the `--selinux` and (deprecated) `--disable-selinux` flags will fail to start.
Using a custom `--data-dir` under SELinux is not supported. To customize it, you would most likely need to write your own custom policy. For guidance, you could refer to the [containers/container-selinux](https://github.com/containers/container-selinux) repository, which contains the SELinux policy files for Container Runtimes, and the [rancher/k3s-selinux](https://github.com/rancher/k3s-selinux) repository, which contains the SELinux policy for K3s.
{{% tabs %}}
{{% tab "K3s v1.19.1+k3s1" %}}
To leverage experimental SELinux, specify the `--selinux` flag when starting K3s servers and agents.
This option can also be specified in the K3s [configuration file:]({{<baseurl>}}/k3s/latest/en/installation/install-options/#configuration-file)
```
selinux: true
```
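Equivalently, the flag can be passed on the command line when starting the server (a minimal sketch; add your other server flags as needed):
```bash
k3s server --selinux
```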
The `--disable-selinux` option should not be used. It is deprecated, and in future minor releases it will either be ignored or unrecognized, resulting in an error.
{{%/tab%}}
{{% tab "K3s prior to v1.19.1+k3s1" %}}
You can turn off SELinux enforcement in the embedded containerd by launching K3s with the `--disable-selinux` flag.
{{%/tab%}}
{{% /tabs %}}
Note that support for SELinux in containerd is still under development. Progress can be tracked in [this pull request](https://github.com/containerd/cri/pull/1246).
+1 -1
@@ -56,4 +56,4 @@ A unique node ID can be appended to the hostname by launching K3s servers or age
# Automatically Deployed Manifests
The [manifests](https://github.com/rancher/k3s/tree/master/manifests) located at the directory path `/var/lib/rancher/k3s/server/manifests` are bundled into the K3s binary at build time.
The [manifests](https://github.com/rancher/k3s/tree/master/manifests) located at the directory path `/var/lib/rancher/k3s/server/manifests` are bundled into the K3s binary at build time. These will be installed at runtime by the [rancher/helm-controller.](https://github.com/rancher/helm-controller#helm-controller)
@@ -0,0 +1,46 @@
---
title: Backup and Restore Embedded etcd Datastore (Experimental)
shortTitle: Backup and Restore
weight: 26
---
_Available as of v1.19.1+k3s1_
In this section, you'll learn how to create backups of the K3s cluster data and to restore the cluster from backup.
> This is an experimental feature available for K3s clusters with an embedded etcd datastore. If you installed K3s with an external datastore, refer to the upstream documentation for the database for information on backing up the cluster data.
### Creating Snapshots
Snapshots are enabled by default.
The snapshot directory defaults to `${data-dir}/server/db/snapshots` (`/var/lib/rancher/k3s/server/db/snapshots` with the default data directory).
To configure the snapshot interval or the number of retained snapshots, refer to the [options.](#options)
### Restoring a Cluster from a Snapshot
When K3s is restored from backup, the old data directory will be moved to `${data-dir}/server/db/etcd-old/`. K3s then attempts to restore the snapshot by creating a new data directory and starting etcd as a new single-member K3s cluster.
To restore the cluster from backup, run K3s with the `--cluster-reset` option, with the `--cluster-reset-restore-path` also given:
```
./k3s server \
--cluster-reset \
--cluster-reset-restore-path=<PATH-TO-SNAPSHOT>
```
**Result:** A message in the logs says that K3s can be restarted without the flags. Start K3s again; it should run successfully and be restored from the specified snapshot.
### Options
These options can be passed in with the command line, or in the [configuration file,]({{<baseurl>}}/k3s/latest/en/installation/install-options/#configuration-file ) which may be easier to use.
| Options | Description |
| ----------- | --------------- |
| `--etcd-disable-snapshots` | Disable automatic etcd snapshots |
| `--etcd-snapshot-schedule-cron` value | Snapshot interval time in cron spec, e.g. every 5 hours: `0 */5 * * *` (default: `0 */12 * * *`) |
| `--etcd-snapshot-retention` value | Number of snapshots to retain (default: 5) |
| `--etcd-snapshot-dir` value | Directory to save db snapshots. (Default location: `${data-dir}/db/snapshots`) |
| `--cluster-reset` | Forget all peers and become the sole member of a new cluster. This can also be set with the environment variable `[$K3S_CLUSTER_RESET]`. |
| `--cluster-reset-restore-path` value | Path to the snapshot file to be restored |
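For example, a server could be started with a custom snapshot schedule and retention policy (a sketch; the values are illustrative):
```bash
# Snapshot every 6 hours and keep the 10 most recent snapshots
k3s server \
  --etcd-snapshot-schedule-cron="0 */6 * * *" \
  --etcd-snapshot-retention=10
```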
+55 -51
@@ -3,48 +3,28 @@ title: Helm
weight: 42
---
K3s release _v1.17.0+k3s.1_ added support for Helm 3. You can access the Helm 3 documentation [here](https://helm.sh/docs/intro/quickstart/).
Helm is the package management tool of choice for Kubernetes. Helm charts provide templating syntax for Kubernetes YAML manifest documents. With Helm we can create configurable deployments instead of just using static files. For more information about creating your own catalog of deployments, check out the docs at [https://helm.sh/docs/intro/quickstart/](https://helm.sh/docs/intro/quickstart/).
Helm is the package management tool of choice for Kubernetes. Helm charts provide templating syntax for Kubernetes YAML manifest documents. With Helm we can create configurable deployments instead of just using static files. For more information about creating your own catalog of deployments, check out the docs at https://helm.sh/.
K3s does not require any special configuration to start using Helm 3. Just be sure you have properly set up your kubeconfig as per the section about [cluster access.](../cluster-access)
K3s does not require any special configuration to use with Helm command-line tools. Just be sure you have properly set up your kubeconfig as per the section about [cluster access](../cluster-access). K3s does include some extra functionality to make deploying both traditional Kubernetes resource manifests and Helm Charts even easier with the [rancher/helm-release CRD.](#using-the-helm-crd)
This section covers the following topics:
- [Upgrading Helm](#upgrading-helm)
- [Deploying manifests and Helm charts](#deploying-manifests-and-helm-charts)
- [Automatically Deploying Manifests and Helm Charts](#automatically-deploying-manifests-and-helm-charts)
- [Using the Helm CRD](#using-the-helm-crd)
- [Customizing Packaged Components with HelmChartConfig](#customizing-packaged-components-with-helmchartconfig)
- [Upgrading from Helm v2](#upgrading-from-helm-v2)
### Upgrading Helm
### Automatically Deploying Manifests and Helm Charts
If you were using Helm v2 in previous versions of K3s, you may upgrade to v1.17.0+k3s.1 or newer and Helm 2 will still function. If you wish to migrate to Helm 3, [this](https://helm.sh/blog/migrate-from-helm-v2-to-helm-v3/) blog post by Helm explains how to use a plugin to successfully migrate. Refer to the official Helm 3 documentation [here](https://helm.sh/docs/) for more information. K3s will handle either Helm v2 or Helm v3 as of v1.17.0+k3s.1. Just be sure you have properly set your kubeconfig as per the examples in the section about [cluster access.](../cluster-access)
Any Kubernetes manifests found in `/var/lib/rancher/k3s/server/manifests` will automatically be deployed to K3s in a manner similar to `kubectl apply`. Manifests deployed in this manner are managed as AddOn custom resources, and can be viewed by running `kubectl get addon -A`. You will find AddOns for packaged components such as CoreDNS, Local-Storage, Traefik, etc. AddOns are created automatically by the deploy controller, and are named based on their filename in the manifests directory.
Note that Helm 3 no longer requires Tiller and the `helm init` command. Refer to the official documentation for details.
It is also possible to deploy Helm charts as AddOns. K3s includes a [Helm Controller](https://github.com/rancher/helm-controller/) that manages Helm charts using a HelmChart Custom Resource Definition (CRD).
### Deploying Manifests and Helm Charts
### Using the Helm CRD
Any file found in `/var/lib/rancher/k3s/server/manifests` will automatically be deployed to Kubernetes in a manner similar to `kubectl apply`.
> **Note:** K3s versions through v0.5.0 used `k3s.cattle.io/v1` as the apiVersion for HelmCharts. This has been changed to `helm.cattle.io/v1` for later versions.
It is also possible to deploy Helm charts. K3s supports a CRD controller for installing charts. A YAML file specification can look as follows (example taken from `/var/lib/rancher/k3s/server/manifests/traefik.yaml`):
```yaml
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: traefik
  namespace: kube-system
spec:
  chart: stable/traefik
  set:
    rbac.enabled: "true"
    ssl.enabled: "true"
```
Keep in mind that the `namespace` in your HelmChart resource metadata section should always be `kube-system`, because the K3s deploy controller is configured to watch this namespace for new HelmChart resources. If you want to specify the namespace for the actual Helm release, you can do so using the `targetNamespace` key under the `spec` directive, as shown in the configuration example below.
> **Note:** In order for the Helm Controller to know which version of Helm to use to Auto-Deploy a helm app, please specify the `helmVersion` in the spec of your YAML file.
Also note that besides `set`, you can use `valuesContent` under the `spec` directive. And it's okay to use both of them:
The [HelmChart resource definition](https://github.com/rancher/helm-controller#helm-controller) captures most of the options you would normally pass to the `helm` command-line tool. Here's an example of how you might deploy Grafana from the default chart repository, overriding some of the default chart values. Note that the HelmChart resource itself is in the `kube-system` namespace, but the chart's resources will be deployed to the `monitoring` namespace.
```yaml
apiVersion: helm.cattle.io/v1
@@ -68,34 +48,58 @@ spec:
enabled: true
```
K3s versions `<= v0.5.0` used `k3s.cattle.io` for the API group of HelmCharts. This has been changed to `helm.cattle.io` for later versions.
#### HelmChart Field Definitions
### Using the Helm CRD
| Field | Default | Description | Helm Argument / Flag Equivalent |
|-------|---------|-------------|-------------------------------|
| name | | Helm Chart name | NAME |
| spec.chart | | Helm Chart name in repository, or complete HTTPS URL to chart archive (.tgz) | CHART |
| spec.targetNamespace | default | Helm Chart target namespace | `--namespace` |
| spec.version | | Helm Chart version (when installing from repository) | `--version` |
| spec.repo | | Helm Chart repository URL | `--repo` |
| spec.helmVersion | v3 | Helm version to use (`v2` or `v3`) | |
| spec.bootstrap | False | Set to True if this chart is needed to bootstrap the cluster (Cloud Controller Manager, etc) | |
| spec.set | | Override simple default Chart values. These take precedence over options set via valuesContent. | `--set` / `--set-string` |
| spec.valuesContent | | Override complex default Chart values via YAML file content | `--values` |
| spec.chartContent | | Base64-encoded chart archive .tgz - overrides spec.chart | CHART |
You can deploy a third-party Helm chart using an example like this:
Content placed in `/var/lib/rancher/k3s/server/static/` can be accessed anonymously via the Kubernetes APIServer from within the cluster. This URL can be templated using the special variable `%{KUBERNETES_API}%` in the `spec.chart` field. For example, the packaged Traefik component loads its chart from `https://%{KUBERNETES_API}%/static/charts/traefik-1.81.0.tgz`.
### Customizing Packaged Components with HelmChartConfig
To allow overriding values for packaged components that are deployed as HelmCharts (such as Traefik), K3s versions starting with v1.19.0+k3s1 support customizing deployments via HelmChartConfig resources. The HelmChartConfig resource must match the name and namespace of its corresponding HelmChart, and supports providing additional `valuesContent`, which is passed to the `helm` command as an additional value file.
> **Note:** HelmChart `spec.set` values override HelmChart and HelmChartConfig `spec.valuesContent` settings.
For example, to customize the packaged Traefik ingress configuration, you can create a file named `/var/lib/rancher/k3s/server/manifests/traefik-config.yaml` and populate it with the following content:
```yaml
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik
  namespace: kube-system
spec:
  valuesContent: |-
    image: traefik
    imageTag: v1.7.26-alpine
    proxyProtocol:
      enabled: true
      trustedIPs:
        - 10.0.0.0/8
    forwardedHeaders:
      enabled: true
      trustedIPs:
        - 10.0.0.0/8
    ssl:
      enabled: true
      permanentRedirect: false
```
You can install a specific version of a Helm chart using an example like this:
### Upgrading from Helm v2
```yaml
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: stable/nginx-ingress
  namespace: kube-system
spec:
  chart: nginx-ingress
  version: 1.24.4
  targetNamespace: default
```
> **Note:** K3s versions starting with v1.17.0+k3s.1 support Helm v3, and will use it by default. Helm v2 charts can be used by setting `helmVersion: v2` in the spec.
If you were using Helm v2 in previous versions of K3s, you may upgrade to v1.17.0+k3s.1 or newer and Helm 2 will still function. If you wish to migrate to Helm 3, [this](https://helm.sh/blog/migrate-from-helm-v2-to-helm-v3/) blog post by Helm explains how to use a plugin to successfully migrate. Refer to the official Helm 3 documentation [here](https://helm.sh/docs/) for more information. K3s will handle either Helm v2 or Helm v3 as of v1.17.0+k3s.1. Just be sure you have properly set your kubeconfig as per the examples in the section about [cluster access.](../cluster-access)
Note that Helm 3 no longer requires Tiller and the `helm init` command. Refer to the official documentation for details.
@@ -7,7 +7,7 @@ The ability to run Kubernetes using a datastore other than etcd sets K3s apart f
* If your team doesn't have expertise in operating etcd, you can choose an enterprise-grade SQL database like MySQL or PostgreSQL
* If you need to run a simple, short-lived cluster in your CI/CD environment, you can use the embedded SQLite database
* If you wish to deploy Kubernetes on the edge and require a highly available solution but can't afford the operational overhead of managing a database at the edge, you can use K3s's embedded HA datastore built on top of DQLite (currently experimental)
* If you wish to deploy Kubernetes on the edge and require a highly available solution but can't afford the operational overhead of managing a database at the edge, you can use K3s's embedded HA datastore built on top of embedded etcd (currently experimental)
K3s supports the following datastore options:
@@ -16,7 +16,7 @@ K3s supports the following datastore options:
* [MySQL](https://www.mysql.com/) (certified against version 5.7)
* [MariaDB](https://mariadb.org/) (certified against version 10.3.20)
* [etcd](https://etcd.io/) (certified against version 3.3.15)
* Embedded [DQLite](https://dqlite.io/) for High Availability (experimental)
* Embedded etcd for High Availability (experimental)
### External Datastore Configuration Parameters
If you wish to use an external datastore such as PostgreSQL, MySQL, or etcd you must set the `datastore-endpoint` parameter so that K3s knows how to connect to it. You may also specify parameters to configure the authentication and encryption of the connection. The below table summarizes these parameters, which can be passed as either CLI flags or environment variables.
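For example, a server pointed at an external MySQL database might be started like this (a sketch; the username, password, host, and database name are placeholders, so adjust the connection string to your environment):
```bash
k3s server \
  --datastore-endpoint="mysql://username:password@tcp(hostname:3306)/k3s"
```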
@@ -94,5 +94,6 @@ K3S_DATASTORE_KEYFILE='/path/to/client.key' \
k3s server
```
### Embedded DQLite for HA (Experimental)
K3s's use of DQLite is similar to its use of SQLite. It is simple to set up and manage. As such, there is no external configuration or additional steps to take in order to use this option. Please see [High Availability with Embedded DB (Experimental)]({{<baseurl>}}/k3s/latest/en/installation/ha-embedded/) for instructions on how to run with this option.
### Embedded etcd for HA (Experimental)
Please see [High Availability with Embedded DB (Experimental)]({{<baseurl>}}/k3s/latest/en/installation/ha-embedded/) for instructions on how to run with this option.
@@ -3,9 +3,15 @@ title: "High Availability with Embedded DB (Experimental)"
weight: 40
---
As of v1.0.0, K3s is previewing support for running a highly available control plane without the need for an external database. This means there is no need to manage an external etcd or SQL datastore in order to run a reliable production-grade setup. While this feature is currently experimental, we expect it to be the primary architecture for running HA K3s clusters in the future.
K3s is previewing support for running a highly available control plane without the need for an external database. This means there is no need to manage an external etcd or SQL datastore.
This architecture is achieved by embedding a dqlite database within the K3s server process. DQLite is short for "distributed SQLite." According to https://dqlite.io, it is "*a fast, embedded, persistent SQL database with Raft consensus that is perfect for fault-tolerant IoT and Edge devices.*" This makes it a natural fit for K3s.
In K3s 1.0.0, Dqlite was used as the experimental embedded database. In K3s v1.19.1+, embedded etcd is used.
Please note that upgrades from experimental Dqlite to experimental embedded etcd are not supported. If you attempt an upgrade it will not succeed and data will be lost.
### Embedded etcd (Experimental)
_Available as of K3s v1.19.1_
To run K3s in this mode, you must have an odd number of server nodes. We recommend starting with three nodes.
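For example, the first server can initialize the embedded etcd cluster with the `--cluster-init` flag (a sketch; `SECRET` is a placeholder token that the other servers must reuse):
```bash
K3S_TOKEN=SECRET k3s server --cluster-init
```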
@@ -20,3 +26,13 @@ K3S_TOKEN=SECRET k3s server --server https://<ip or hostname of server1>:6443
```
Now you have a highly available control plane. Joining additional worker nodes to the cluster follows the same procedure as a single server cluster.
### Embedded Dqlite (Deprecated)
> **Warning:** Experimental etcd replaced experimental Dqlite in the K3s v1.19.1 release. This is a breaking change. Please note that upgrades from experimental Dqlite to experimental embedded etcd are not supported. If you attempt an upgrade it will not succeed and data will be lost.
As of v1.0.0, K3s previewed support for running a highly available control plane without the need for an external database.
This architecture is achieved by embedding a Dqlite database within the K3s server process. DQLite is short for "distributed SQLite." According to https://dqlite.io, it is "*a fast, embedded, persistent SQL database with Raft consensus that is perfect for fault-tolerant IoT and Edge devices.*"
To run K3s with the embedded Dqlite database, follow the same steps as the [embedded etcd database](#embedded-etcd-experimental) using a K3s release between v1.0.0 and v1.19.1.
@@ -48,7 +48,7 @@ To configure TLS certificates when launching server nodes, refer to the [datasto
> **Note:** The same installation options available to single-server installs are also available for high-availability installs. For more details, see the [Installation and Configuration Options]({{<baseurl>}}/k3s/latest/en/installation/install-options/) documentation.
By default, server nodes will be schedulable and thus your workloads can get launched on them. If you wish to have a dedicated control plane where no user workloads will run, you can use taints. The <span style='white-space: nowrap'>`node-taint`</span> parameter will allow you to configure nodes with taints, for example <span style='white-space: nowrap'>`--node-taint k3s-controlplane=true:NoExecute`</span>.
By default, server nodes will be schedulable and thus your workloads can get launched on them. If you wish to have a dedicated control plane where no user workloads will run, you can use taints. The <span style='white-space: nowrap'>`node-taint`</span> parameter will allow you to configure nodes with taints, for example <span style='white-space: nowrap'>`--node-taint CriticalAddonsOnly=true:NoExecute`</span>.
Once you've launched the `k3s server` process on all server nodes, ensure that the cluster has come up properly with `k3s kubectl get nodes`. You should see your server nodes in the Ready state.
@@ -9,6 +9,9 @@ This page focuses on the options that can be used when you set up K3s for the fi
- [Options for installation from binary](#options-for-installation-from-binary)
- [Registration options for the K3s server](#registration-options-for-the-k3s-server)
- [Registration options for the K3s agent](#registration-options-for-the-k3s-agent)
- [Configuration File](#configuration-file)
In addition to configuring K3s with environment variables and CLI arguments, K3s can also use a [config file.](#configuration-file)
For more advanced options, refer to [this page.]({{<baseurl>}}/k3s/latest/en/advanced)
@@ -71,3 +74,36 @@ For details on configuring the K3s server, refer to the [server configuration re
### Registration Options for the K3s Agent
For details on configuring the K3s agent, refer to the [agent configuration reference.]({{<baseurl>}}/k3s/latest/en/installation/install-options/agent-config)
### Configuration File
In addition to configuring K3s with environment variables and CLI arguments, K3s can also use a config file.
By default, values present in a YAML file located at `/etc/rancher/k3s/config.yaml` will be used on install.
An example of a basic `server` config file is below:
```yaml
write-kubeconfig-mode: "0644"
tls-san:
- "foo.local"
node-label:
- "foo=bar"
- "something=amazing"
```
In general, CLI arguments map to their respective YAML key, with repeatable CLI arguments being represented as YAML lists.
An identical configuration using solely CLI arguments is shown below to demonstrate this:
```bash
k3s server \
--write-kubeconfig-mode "0644" \
--tls-san "foo.local" \
--node-label "foo=bar" \
--node-label "something=amazing"
```
It is also possible to use both a configuration file and CLI arguments. In these situations, values will be loaded from both sources, but CLI arguments will take precedence. For repeatable arguments such as `--node-label`, the CLI arguments will overwrite all values in the list.
Finally, the location of the config file can be changed either through the cli argument `--config FILE, -c FILE`, or the environment variable `$K3S_CONFIG_FILE`.
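For example, an alternate config file location could be supplied either way (a sketch; the file path is a placeholder):
```bash
k3s server --config /etc/rancher/k3s/server-config.yaml
# or equivalently
K3S_CONFIG_FILE=/etc/rancher/k3s/server-config.yaml k3s server
```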
@@ -33,7 +33,7 @@ If you are using **Alpine Linux**, follow [these steps]({{<baseurl>}}/k3s/latest
Hardware requirements scale based on the size of your deployments. Minimum recommendations are outlined here.
* RAM: 512MB Minimum
* RAM: 512MB Minimum (we recommend at least 1GB)
* CPU: 1 Minimum
#### Disks
@@ -25,7 +25,7 @@ Mirrors is a directive that defines the names and endpoints of the private regis
```
mirrors:
  mycustomreg.com:
    endpoint:
      - "https://mycustomreg.com:5000"
```
@@ -41,6 +41,7 @@ Directive | Description
`cert_file` | The client certificate path that will be used to authenticate with the registry
`key_file` | The client key path that will be used to authenticate with the registry
`ca_file` | Defines the CA certificate path to be used to verify the registry's server cert file
`insecure_skip_verify` | Boolean that defines if TLS verification should be skipped for the registry
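As an illustration of how these directives fit together, a `configs` entry in `registries.yaml` might look roughly like the following sketch (the registry name and certificate paths are placeholders, not values from this document):
```yaml
configs:
  "mycustomreg.com:5000":
    tls:
      cert_file: /path/to/client.crt   # client certificate used to authenticate
      key_file: /path/to/client.key    # client key used to authenticate
      ca_file: /path/to/ca.crt         # CA certificate used to verify the registry
      insecure_skip_verify: false      # keep TLS verification enabled
```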
The credentials consist of either username/password or authentication token:
+65 -12
@@ -3,21 +3,27 @@ title: "Networking"
weight: 35
---
>**Note:** CNI options are covered in detail on the [Installation Network Options]({{<baseurl>}}/k3s/latest/en/installation/network-options/) page. Please reference that page for details on Flannel and the various flannel backend options or how to set up your own CNI.
This page explains how CoreDNS, the Traefik Ingress controller, and Klipper service load balancer work within K3s.
Open Ports
----------
Please reference the [Installation Requirements]({{<baseurl>}}/k3s/latest/en/installation/installation-requirements/#networking) page for port information.
Refer to the [Installation Network Options]({{<baseurl>}}/k3s/latest/en/installation/network-options/) page for details on Flannel configuration options and backend selection, or how to set up your own CNI.
CoreDNS
-------
For information on which ports need to be opened for K3s, refer to the [Installation Requirements.]({{<baseurl>}}/k3s/latest/en/installation/installation-requirements/#networking)
- [CoreDNS](#coredns)
- [Traefik Ingress Controller](#traefik-ingress-controller)
- [Service Load Balancer](#service-load-balancer)
- [How the Service LB Works](#how-the-service-lb-works)
- [Usage](#usage)
- [Excluding the Service LB from Nodes](#excluding-the-service-lb-from-nodes)
- [Disabling the Service LB](#disabling-the-service-lb)
# CoreDNS
CoreDNS is deployed on start of the agent. To disable, run each server with the `--disable coredns` option.
If you don't install CoreDNS, you will need to install a cluster DNS provider yourself.
Traefik Ingress Controller
--------------------------
# Traefik Ingress Controller
[Traefik](https://traefik.io/) is a modern HTTP reverse proxy and load balancer made to deploy microservices with ease. It simplifies networking complexity while designing, deploying, and running applications.
@@ -25,15 +31,59 @@ Traefik is deployed by default when starting the server. For more information se
The Traefik ingress controller will use ports 80, 443, and 8080 on the host (i.e. these will not be usable for HostPort or NodePort).
You can tweak traefik to meet your needs by setting options in the traefik.yaml file. Refer to the official [Traefik for Helm Configuration Parameters](https://github.com/helm/charts/tree/master/stable/traefik#configuration) readme for more information.
Traefik can be configured by editing the `traefik.yaml` file. To prevent k3s from using or overwriting the modified version, deploy k3s with `--no-deploy traefik` and store the modified copy in the `k3s/server/manifests` directory. For more information, refer to the official [Traefik for Helm Configuration Parameters.](https://github.com/helm/charts/tree/master/stable/traefik#configuration)
To disable it, start each server with the `--disable traefik` option.
Service Load Balancer
---------------------
# Service Load Balancer
K3s includes a basic service load balancer that uses available host ports. If you try to create a load balancer that listens on port 80, for example, it will try to find a free host in the cluster for port 80. If no port is available, the load balancer will stay in Pending.
Any service load balancer (LB) can be leveraged in your Kubernetes cluster. K3s provides a load balancer known as [Klipper Load Balancer](https://github.com/rancher/klipper-lb) that uses available host ports.
Upstream Kubernetes allows a Service of type LoadBalancer to be created, but doesn't include the implementation of the LB. Some LB services require a cloud provider such as Amazon EC2 or Microsoft Azure. By contrast, the K3s service LB makes it possible to use an LB service without a cloud provider.
### How the Service LB Works
K3s creates a controller that creates a Pod for the service load balancer, which is a Kubernetes object of kind [Service.](https://kubernetes.io/docs/concepts/services-networking/service/)
For each service load balancer, a [DaemonSet](https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/) is created. The DaemonSet creates a pod with the `svc` prefix on each node.
The Service LB controller listens for other Kubernetes Services. After it finds a Service, it creates a proxy Pod for the service using a DaemonSet on all of the nodes. This Pod becomes a proxy to the other Service, so that for example, requests coming to port 8000 on a node could be routed to your workload on port 8888.
If the Service LB runs on a node that has an external IP, it uses the external IP.
If multiple Services are created, a separate DaemonSet is created for each Service.
It is possible to run multiple Services on the same node, as long as they use different ports.
If you try to create a Service LB that listens on port 80, the Service LB will try to find a free host in the cluster for port 80. If no host with that port is available, the LB will stay in Pending.
### Usage
Create a [Service of type LoadBalancer](https://kubernetes.io/docs/concepts/services-networking/service/#loadbalancer) in K3s.
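For example, a minimal manifest might look like the following (the name, selector, and ports are illustrative):
```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-lb-service
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
    - port: 80         # port the Service LB exposes on a node
      targetPort: 8080 # port your workload listens on
```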
### Excluding the Service LB from Nodes
To exclude nodes from running the Service LB, add the following label to the nodes that should run it:
```
svccontroller.k3s.cattle.io/enablelb
```
If the label is used, the service load balancer only runs on the labeled nodes.
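For example, a node can be labeled with `kubectl` (the node name is a placeholder, and a value such as `true` is commonly used):
```bash
kubectl label node my-node svccontroller.k3s.cattle.io/enablelb=true
```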
### Disabling the Service LB
To disable the embedded load balancer, run the server with the `--disable servicelb` option. This is necessary if you wish to run a different load balancer, such as MetalLB.
# Nodes Without a Hostname
@@ -41,3 +91,6 @@ Nodes Without a Hostname
Some cloud providers, such as Linode, will create machines with "localhost" as the hostname and others may not have a hostname set at all. This can cause problems with domain name resolution. You can run K3s with the `--node-name` flag or the `K3S_NODE_NAME` environment variable to pass the node name and resolve this issue.
+2
@@ -8,3 +8,5 @@ This section describes how to upgrade your K3s cluster.
[Upgrade basics]({{< baseurl >}}/k3s/latest/en/upgrades/basic/) describes several techniques for upgrading your cluster manually. It can also be used as a basis for upgrading through third-party Infrastructure-as-Code tools like [Terraform](https://www.terraform.io/).
[Automated upgrades]({{< baseurl >}}/k3s/latest/en/upgrades/automated/) describes how to perform Kubernetes-native automated upgrades using Rancher's [system-upgrade-controller](https://github.com/rancher/system-upgrade-controller).
> The experimental embedded Dqlite data store was deprecated in K3s v1.19.1. Please note that upgrades from experimental Dqlite to experimental embedded etcd are not supported. If you attempt an upgrade it will not succeed and data will be lost.
@@ -38,4 +38,4 @@ For example, on my current test system, I have set the kernel boot line to:
printk.devkmsg=on console=tty1 rancher.autologin=tty1 console=ttyS0 rancher.autologin=ttyS0 rancher.state.dev=LABEL=RANCHER_STATE rancher.state.autoformat=[/dev/sda,/dev/vda] rancher.rm_usr loglevel=8 netconsole=+9999@10.0.2.14/,514@192.168.42.223/
```
The kernel boot parameters can be set during installation using `sudo ros install --append "...."`, or on an installed RancherOS system, by running `sudo ros config syslinx` (which will start vi in a container, editing the `global.cfg` boot config file.
The kernel boot parameters can be set during installation using `sudo ros install --append "...."`, or on an installed RancherOS system, by running `sudo ros config syslinux` (which will start vi in a container, editing the `global.cfg` boot config file).
@@ -3,7 +3,7 @@ title: Quick Start
weight: 1
---
If you have a specific RanchersOS machine requirements, please check out our [guides on running RancherOS]({{< baseurl >}}/os/v1.x/en/installation/platform/). With the rest of this guide, we'll start up a RancherOS using [Docker machine]({{< baseurl >}}/os/v1.x/en/installation/workstation//docker-machine/) and show you some of what RancherOS can do.
If you have specific RancherOS machine requirements, please check out our [guides on running RancherOS]({{< baseurl >}}/os/v1.x/en/installation/). With the rest of this guide, we'll start up RancherOS using [Docker Machine]({{< baseurl >}}/os/v1.x/en/installation/workstation//docker-machine/) and show you some of what RancherOS can do.
### Launching RancherOS using Docker Machine
@@ -100,7 +100,7 @@ The table below details the parameters for the group schema configuration.
| Search Attribute | Attribute used to construct search filters when adding groups to clusters or projects. See description of user schema `Search Attribute`. |
| Search Filter | This filter gets applied to the list of groups that is searched when Rancher attempts to add groups to a site access list or tries to add groups to clusters or projects. For example, a group search filter could be <code>(&#124;(cn=group1)(cn=group2))</code>. Note: If the search filter does not use [valid AD search syntax,](https://docs.microsoft.com/en-us/windows/win32/adsi/search-filter-syntax) the list of groups will be empty. |
| Group DN Attribute | The name of the group attribute whose format matches the values in the user attribute describing the user's memberships. See `User Member Attribute`. |
| Nested Group Membership | This settings defines whether Rancher should resolve nested group memberships. Use only if your organisation makes use of these nested memberships (ie. you have groups that contain other groups as members). |
| Nested Group Membership | This setting defines whether Rancher should resolve nested group memberships. Use only if your organisation makes use of these nested memberships (i.e. you have groups that contain other groups as members). We advise avoiding nested groups when possible. |
---
@@ -24,9 +24,36 @@ If your organization uses Keycloak Identity Provider (IdP) for user authenticati
><sup>1</sup>: Optionally, you can enable either one or both of these settings.
><sup>2</sup>: Rancher SAML metadata won't be generated until a SAML provider is configured and saved.
{{< img "/img/rancher/keycloak/keycloak-saml-client-configuration.png" "">}}
- In the new SAML client, create Mappers to expose the users' fields
- Add all "Builtin Protocol Mappers"
{{< img "/img/rancher/keycloak/keycloak-saml-client-builtin-mappers.png" "">}}
- Create a new "Group list" mapper to map the member attribute to a user's groups
{{< img "/img/rancher/keycloak/keycloak-saml-client-group-mapper.png" "">}}
- Export a `metadata.xml` file from your Keycloak client:
From the `Installation` tab, choose the `SAML Metadata IDPSSODescriptor` format option and download your file.
>**Note**
> Keycloak versions 6.0.0 and up no longer provide the IDP metadata under the `Installation` tab.
> You can still get the XML from the following url:
>
> `https://{KEYCLOAK-URL}/auth/realms/{REALM-NAME}/protocol/saml/descriptor`
>
> The XML obtained from this URL contains `EntitiesDescriptor` as the root element. Rancher expects the root element to be `EntityDescriptor` rather than `EntitiesDescriptor`. So before passing this XML to Rancher, follow these steps to adjust it:
>
> * Copy any attributes from `EntitiesDescriptor` that are not already present onto the `EntityDescriptor` element.
> * Remove the `<EntitiesDescriptor>` tag from the beginning.
> * Remove the `</EntitiesDescriptor>` from the end of the xml.
>
> You are left with something similar to the example below:
>
> ```
> <EntityDescriptor xmlns="urn:oasis:names:tc:SAML:2.0:metadata" xmlns:dsig="http://www.w3.org/2000/09/xmldsig#" entityID="https://{KEYCLOAK-URL}/auth/realms/{REALM-NAME}">
> ....
> </EntityDescriptor>
> ```
## Configuring Keycloak in Rancher
@@ -35,18 +62,18 @@ If your organization uses Keycloak Identity Provider (IdP) for user authenticati
1. Select **Keycloak**.
1. Complete the **Configure Keycloak Account** form. Keycloak IdP lets you specify what data store you want to use. You can either add a database or use an existing LDAP server. For example, if you select your Active Directory (AD) server, the examples below describe how you can map AD attributes to fields within Rancher.
1. Complete the **Configure Keycloak Account** form.
| Field | Description |
| ------------------------- | ----------------------------------------------------------------------------- |
| Display Name Field | The AD attribute that contains the display name of users. |
| User Name Field | The AD attribute that contains the user name/given name. |
| UID Field | An AD attribute that is unique to every user. |
| Groups Field | Make entries for managing group memberships. |
| Rancher API Host | The URL for your Rancher Server. |
| Private Key / Certificate | A key/certificate pair to create a secure shell between Rancher and your IdP. |
| IDP-metadata | The `metadata.xml` file that you exported from your IdP server. |
| Field | Description |
| ------------------------- | -------------------------------------------------------------------------------------- |
| Display Name Field | The attribute that contains the display name of users. <br/><br/>Example: `givenName` |
| User Name Field | The attribute that contains the user name/given name. <br/><br/>Example: `email` |
| UID Field | An attribute that is unique to every user. <br/><br/>Example: `email` |
| Groups Field | Make entries for managing group memberships. <br/><br/>Example: `member` |
| Rancher API Host | The URL for your Rancher Server. |
| Private Key / Certificate | A key/certificate pair to create a secure shell between Rancher and your IdP. |
| IDP-metadata | The `metadata.xml` file that you exported from your IdP server. |
>**Tip:** You can generate a key/certificate pair using an openssl command. For example:
>
@@ -96,25 +123,3 @@ Try configuring and saving keycloak as your SAML provider and then accessing the
* Check your Keycloak log.
* If the log displays `request validation failed: org.keycloak.common.VerificationException: SigAlg was null`, set `Client Signature Required` to `OFF` in your Keycloak client.
### Keycloak 6.0.0+: IDPSSODescriptor missing from options
Keycloak versions 6.0.0 and up no longer provide the IDP metadata under the `Installation` tab.
You can still get the XML from the following url:
`https://{KEYCLOAK-URL}/auth/realms/{REALM-NAME}/protocol/saml/descriptor`
The XML obtained from this URL contains `EntitiesDescriptor` as the root element. Rancher expects the root element to be `EntityDescriptor` rather than `EntitiesDescriptor`. So before passing this XML to Rancher, follow these steps to adjust it:
* Copy all the tags from `EntitiesDescriptor` to the `EntityDescriptor`.
* Remove the `<EntitiesDescriptor>` tag from the beginning.
* Remove the `</EntitiesDescriptor>` from the end of the xml.
You are left with something similar to the example below:
```
<EntityDescriptor xmlns="urn:oasis:names:tc:SAML:2.0:metadata" xmlns:dsig="http://www.w3.org/2000/09/xmldsig#" entityID="https://{KEYCLOAK-URL}/auth/realms/{REALM-NAME}">
....
</EntityDescriptor>
```
@@ -33,6 +33,8 @@ The option to refresh the Kubernetes metadata is available for administrators by
To force Rancher to refresh the Kubernetes metadata, a manual refresh action is available under **Tools > Drivers > Refresh Kubernetes Metadata** in the upper right corner.
You can configure Rancher to only refresh metadata when desired by setting `refresh-interval-minutes` to `0` (see below) and using this button to perform the metadata refresh manually.
### Configuring the Metadata Synchronization
> Only administrators can change these settings.
@@ -88,4 +90,4 @@ After new Kubernetes versions are loaded into the Rancher setup, additional step
1. Download `rancher-images.txt`.
1. Prepare the private registry using the same steps during the [air gap install]({{<baseurl>}}/rancher/v2.x/en/installation/other-installation-methods/air-gap/populate-private-registry), but instead of using the `rancher-images.txt` from the releases page, use the one obtained from the previous steps.
**Result:** The air gap installation of Rancher can now sync the Kubernetes metadata. If you update your private registry when new versions of Kubernetes are released, you can provision clusters with the new version without having to upgrade Rancher.
**Result:** The air gap installation of Rancher can now sync the Kubernetes metadata. If you update your private registry when new versions of Kubernetes are released, you can provision clusters with the new version without having to upgrade Rancher.
@@ -51,6 +51,8 @@ The steps to add custom roles differ depending on the version of Rancher.
1. Use the **Grant Resources** options to assign individual [Kubernetes API endpoints](https://kubernetes.io/docs/reference/) to the role.
> When viewing the resources associated with default roles created by Rancher, if there are multiple Kubernetes API resources on one line item, the resource will have `(Custom)` appended to it. These are not custom resources but just an indication that there are multiple Kubernetes API resources as one resource.
> The Resource text field provides a method to search for pre-defined Kubernetes API resources, or enter a custom resource name for the grant. The pre-defined or `(Custom)` resource must be selected from the dropdown, after entering a resource name into this field.
You can also choose the individual cURL methods (`Create`, `Delete`, `Get`, etc.) available for use with each endpoint you assign.
@@ -82,6 +84,8 @@ The steps to add custom roles differ depending on the version of Rancher.
1. Use the **Grant Resources** options to assign individual [Kubernetes API endpoints](https://kubernetes.io/docs/reference/) to the role.
> When viewing the resources associated with default roles created by Rancher, if there are multiple Kubernetes API resources on one line item, the resource will have `(Custom)` appended to it. These are not custom resources but just an indication that there are multiple Kubernetes API resources as one resource.
> The Resource text field provides a method to search for pre-defined Kubernetes API resources, or enter a custom resource name for the grant. The pre-defined or `(Custom)` resource must be selected from the dropdown, after entering a resource name into this field.
You can also choose the individual cURL methods (`Create`, `Delete`, `Get`, etc.) available for use with each endpoint you assign.
@@ -109,6 +113,9 @@ To create a custom global role based on an existing role,
1. Enter a name for the role.
1. Optional: To assign the custom role default for new users, go to the **New User Default** section and click **Yes: Default role for new users.**
1. In the **Grant Resources** section, select the Kubernetes resource operations that will be enabled for users with the custom role.
> The Resource text field provides a method to search for pre-defined Kubernetes API resources, or enter a custom resource name for the grant. The pre-defined or `(Custom)` resource must be selected from the dropdown, after entering a resource name into this field.
1. Click **Save.**
### Creating a Custom Global Role that Does Not Copy Rules from Another Role
@@ -120,6 +127,9 @@ Custom global roles don't have to be based on existing roles. To create a custom
1. Enter a name for the role.
1. Optional: To assign the custom role default for new users, go to the **New User Default** section and click **Yes: Default role for new users.**
1. In the **Grant Resources** section, select the Kubernetes resource operations that will be enabled for users with the custom role.
> The Resource text field provides a method to search for pre-defined Kubernetes API resources, or enter a custom resource name for the grant. The pre-defined or `(Custom)` resource must be selected from the dropdown, after entering a resource name into this field.
1. Click **Save.**
## Deleting a Custom Global Role
@@ -7,8 +7,6 @@ _Permissions_ are individual access rights that you can assign when selecting a
Global Permissions define user authorization outside the scope of any particular cluster. Out-of-the-box, there are three default global permissions: `Administrator`, `Standard User` and `User-base`.
In Rancher v2.5, a restricted-admin role was added.
- **Administrator:** These users have full control over the entire Rancher system and all clusters within it.
- <a id="user"></a>**Standard User:** These users can create new clusters and use them. Standard users can also assign other users permissions to their clusters.
@@ -7,6 +7,7 @@ By default, some cluster-level API tokens are generated with infinite time-to-li
You can deactivate API tokens by deleting them or by deactivating the user account.
### Deleting tokens
To delete a token,
1. Go to the list of all tokens in the Rancher API view at `https://<Rancher-Server-IP>/v3/tokens`.
@@ -19,7 +20,7 @@ Here is the complete list of tokens that are generated with `ttl=0`:
| Token | Description |
|-------|-------------|
| `kubeconfig-*` | Kubeconfig token |
| `kubeconfig-*` | Kubeconfig token |
| `kubectl-shell-*` | Access to `kubectl` shell in the browser |
| `agent-*` | Token for agent deployment |
| `compose-token-*` | Token for compose |
@@ -27,3 +28,21 @@ Here is the complete list of tokens that are generated with `ttl=0`:
| `*-pipeline*` | Pipeline token for project |
| `telemetry-*` | Telemetry token |
| `drain-node-*` | Token for drain (we use `kubectl` for drain because there is no native Kubernetes API) |
### Setting TTL on Kubeconfig Tokens
_**Available as of v2.4.6**_
Starting with Rancher v2.4.6, admins can set a global TTL on kubeconfig tokens. Once the token expires, the `kubectl` command will require the user to authenticate to Rancher again.
1. Disable the `kubeconfig-generate-token` setting in the Rancher API view at `https://<Rancher-Server-IP>/v3/settings/kubeconfig-generate-token`. This setting instructs Rancher to no longer automatically generate a token when a user clicks to download a kubeconfig file. The kubeconfig file will now provide a command to log in to Rancher.
2. Edit the setting and set the value to `false`.
3. Go to the `kubeconfig-token-ttl-minutes` setting in the Rancher API view at `https://<Rancher-Server-IP>/v3/settings/kubeconfig-token-ttl-minutes`. By default, `kubeconfig-token-ttl-minutes` is 960 (16 hours).
4. Edit the setting and set the value to the desired duration in minutes.
_**Note:**_ This value cannot exceed the max-ttl of API tokens (`https://<Rancher-Server-IP>/v3/settings/auth-token-max-ttl-minutes`). In Rancher v2.4.6, `auth-token-max-ttl-minutes` is set to 1440 (24 hours) by default. Starting with Rancher v2.4.7, `auth-token-max-ttl-minutes` defaults to 0, allowing tokens to never expire, similar to v2.4.5.
@@ -0,0 +1,134 @@
---
title: Restoring Backups—Kubernetes installs
shortTitle: Kubernetes Installs
weight: 370
aliases:
- /rancher/v2.x/en/installation/after-installation/ha-backup-and-restoration/
---
This procedure describes how to use RKE to restore a snapshot of the Rancher Kubernetes cluster.
This will restore the Kubernetes configuration and the Rancher database and state.
> **Note:** This document covers clusters set up with RKE >= v0.2.x, for older RKE versions refer to the [RKE Documentation]({{<baseurl>}}/rke/latest/en/etcd-snapshots/restoring-from-backup).
## Restore Outline
<!-- TOC -->
- [1. Preparation](#1-preparation)
- [2. Place Snapshot](#2-place-snapshot)
- [3. Configure RKE](#3-configure-rke)
- [4. Restore the Database and bring up the Cluster](#4-restore-the-database-and-bring-up-the-cluster)
<!-- /TOC -->
### 1. Preparation
It is advised that you run the restore from your local host or a jump box/bastion where your cluster yaml, rke statefile, and kubeconfig are stored. You will need [RKE]({{<baseurl>}}/rke/latest/en/installation/) and [kubectl]({{<baseurl>}}/rancher/v2.x/en/faq/kubectl/) CLI utilities installed locally.
Prepare by creating 3 new nodes to be the target for the restored Rancher instance. We recommend that you start with fresh nodes and a clean state. For clarification on the requirements, review the [Installation Requirements](https://rancher.com/docs/rancher/v2.x/en/installation/requirements/).
Alternatively you can re-use the existing nodes after clearing Kubernetes and Rancher configurations. This will destroy the data on these nodes. See [Node Cleanup]({{<baseurl>}}/rancher/v2.x/en/faq/cleaning-cluster-nodes/) for the procedure.
> **IMPORTANT:** Before starting the restore make sure all the Kubernetes services on the old cluster nodes are stopped. We recommend powering off the nodes to be sure.
### 2. Place Snapshot
As of RKE v0.2.0, snapshots can be saved to an S3-compatible backend. To restore your cluster from a snapshot stored in an S3-compatible backend, you can skip this step and retrieve the snapshot in [4. Restore the Database and bring up the Cluster](#4-restore-the-database-and-bring-up-the-cluster). Otherwise, you will need to place the snapshot directly on one of the etcd nodes.
Pick one of the clean nodes that will have the etcd role assigned and place the zip-compressed snapshot file in `/opt/rke/etcd-snapshots` on that node.
> **Note:** Because of a current limitation in RKE, the restore process does not work correctly if `/opt/rke/etcd-snapshots` is a NFS share that is mounted on all nodes with the etcd role. The easiest options are to either keep `/opt/rke/etcd-snapshots` as a local folder during the restore process and only mount the NFS share there after it has been completed, or to only mount the NFS share to one node with an etcd role in the beginning.
### 3. Configure RKE
Use your original `rancher-cluster.yml` and `rancher-cluster.rkestate` files. If they are not stored in a version control system, it is a good idea to back them up before making any changes.
```
cp rancher-cluster.yml rancher-cluster.yml.bak
cp rancher-cluster.rkestate rancher-cluster.rkestate.bak
```
If the replaced or cleaned nodes have been configured with new IP addresses, modify the `rancher-cluster.yml` file to ensure the address and optional internal_address fields reflect the new addresses.
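For example, a node entry in `rancher-cluster.yml` might be updated along these lines (the addresses, user, and roles are placeholders):
```yaml
nodes:
  - address: 203.0.113.11          # new public IP of the replacement node
    internal_address: 10.0.0.11    # optional internal IP
    user: ubuntu
    role: [controlplane, worker, etcd]
```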
> **IMPORTANT:** You should not rename the `rancher-cluster.yml` or `rancher-cluster.rkestate` files. It is important that the filenames match each other.
### 4. Restore the Database and bring up the Cluster
You will now use the RKE command-line tool with the `rancher-cluster.yml` and the `rancher-cluster.rkestate` configuration files to restore the etcd database and bring up the cluster on the new nodes.
> **Note:** Ensure your `rancher-cluster.rkestate` is present in the same directory as the `rancher-cluster.yml` file before starting the restore, as this file contains the certificate data for the cluster.
#### Restoring from a Local Snapshot
When restoring etcd from a local snapshot, the snapshot is assumed to be located on the target node in the directory `/opt/rke/etcd-snapshots`.
```
rke etcd snapshot-restore --name snapshot-name --config ./rancher-cluster.yml
```
> **Note:** The `--name` parameter expects the filename of the snapshot without the extension.
#### Restoring from a Snapshot in S3
_Available as of RKE v0.2.0_
When restoring etcd from a snapshot located in an S3 compatible backend, the command needs the S3 information in order to connect to the S3 backend and retrieve the snapshot.
```
$ rke etcd snapshot-restore --config ./rancher-cluster.yml --name snapshot-name \
--s3 --access-key S3_ACCESS_KEY --secret-key S3_SECRET_KEY \
--bucket-name s3-bucket-name --s3-endpoint s3.amazonaws.com \
--folder folder-name # Available as of v2.3.0
```
#### Options for `rke etcd snapshot-restore`
S3 specific options are only available for RKE v0.2.0+.
| Option | Description | S3 Specific |
| --- | --- | ---|
| `--name` value | Specify snapshot name | |
| `--config` value | Specify an alternate cluster YAML file (default: "cluster.yml") [$RKE_CONFIG] | |
| `--s3` | Enabled backup to s3 |* |
| `--s3-endpoint` value | Specify s3 endpoint url (default: "s3.amazonaws.com") | * |
| `--access-key` value | Specify s3 accessKey | *|
| `--secret-key` value | Specify s3 secretKey | *|
| `--bucket-name` value | Specify s3 bucket name | *|
| `--folder` value | Specify s3 folder in the bucket name _Available as of v2.3.0_ | *|
| `--region` value | Specify the s3 bucket location (optional) | *|
| `--ssh-agent-auth` | [Use SSH Agent Auth defined by SSH_AUTH_SOCK]({{<baseurl>}}/rke/latest/en/config-options/#ssh-agent) | |
| `--ignore-docker-version` | [Disable Docker version check]({{<baseurl>}}/rke/latest/en/config-options/#supported-docker-versions) | |
#### Testing the Cluster
Once RKE completes it will have created a credentials file in the local directory. Configure `kubectl` to use the `kube_config_rancher-cluster.yml` credentials file and check on the state of the cluster. See [Installing and Configuring kubectl]({{<baseurl>}}/rancher/v2.x/en/faq/kubectl/#configuration) for details.
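For example, assuming the credentials file was created in the current directory:
```bash
export KUBECONFIG=$(pwd)/kube_config_rancher-cluster.yml
kubectl get nodes
```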
#### Check Kubernetes Pods
Wait for the pods running in `kube-system`, `ingress-nginx` and the `rancher` pod in `cattle-system` to return to the `Running` state.
> **Note:** `cattle-cluster-agent` and `cattle-node-agent` pods will be in an `Error` or `CrashLoopBackOff` state until Rancher server is up and the DNS/Load Balancer have been pointed at the new cluster.
```
kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
cattle-system cattle-cluster-agent-766585f6b-kj88m 0/1 Error 6 4m
cattle-system cattle-node-agent-wvhqm 0/1 Error 8 8m
cattle-system rancher-78947c8548-jzlsr 0/1 Running 1 4m
ingress-nginx default-http-backend-797c5bc547-f5ztd 1/1 Running 1 4m
ingress-nginx nginx-ingress-controller-ljvkf 1/1 Running 1 8m
kube-system canal-4pf9v 3/3 Running 3 8m
kube-system cert-manager-6b47fc5fc-jnrl5 1/1 Running 1 4m
kube-system kube-dns-7588d5b5f5-kgskt 3/3 Running 3 4m
kube-system kube-dns-autoscaler-5db9bbb766-s698d 1/1 Running 1 4m
kube-system metrics-server-97bc649d5-6w7zc 1/1 Running 1 4m
kube-system tiller-deploy-56c4cf647b-j4whh 1/1 Running 1 4m
```
#### Finishing Up
Rancher should now be running and available to manage your Kubernetes clusters. Review the [recommended architecture]({{<baseurl>}}/rancher/v2.x/en/installation/k8s-install/#recommended-architecture) for Kubernetes installations and update the endpoints for Rancher DNS or the Load Balancer that you built during Step 1 of the Kubernetes install ([1. Create Nodes and Load Balancer]({{<baseurl>}}/rancher/v2.x/en/installation/k8s-install/create-nodes-lb/#load-balancer)) to target the new cluster. Once the endpoints are updated, the agents on your managed clusters should automatically reconnect. This may take 10-15 minutes due to reconnect back off timeouts.
> **IMPORTANT:** Remember to save your updated RKE config (`rancher-cluster.yml`), state file (`rancher-cluster.rkestate`), and `kubectl` credentials (`kube_config_rancher-cluster.yml`) in a safe place for future maintenance, for example in a version control system.
@@ -35,3 +35,5 @@ Rancher contains a variety of tools that aren't included in Kubernetes to assist
- Monitoring
- Istio Service Mesh
- OPA Gatekeeper
For more information, see [Tools]({{<baseurl>}}/rancher/v2.x/en/cluster-admin/tools/)
@@ -13,6 +13,11 @@ This kubeconfig file and its contents are specific to the cluster you are viewin
After you download the kubeconfig file, you will be able to use the kubeconfig file and its Kubernetes [contexts](https://kubernetes.io/docs/reference/kubectl/cheatsheet/#kubectl-context-and-configuration) to access your downstream cluster.
_Available as of v2.4.6_
If admins have [enforced TTL on kubeconfig tokens](../../api/api-tokens/#setting-ttl-on-kubeconfig-tokens), the kubeconfig file requires the [Rancher CLI](../cli) to be present in your PATH.
### Two Authentication Methods for RKE Clusters
If the cluster is not an [RKE cluster,]({{<baseurl>}}/rancher/v2.x/en/cluster-provisioning/rke-clusters/) the kubeconfig file allows you to access the cluster in only one way: you authenticate with the Rancher server, and Rancher then allows you to run kubectl commands on the cluster.
@@ -0,0 +1,31 @@
---
title: Release Notes
---
# Important note on Istio 1.5.x versions
When upgrading from any 1.4 version of Istio to any 1.5 version, the Rancher installer will delete several resources in order to complete the upgrade, at which point they will be immediately re-installed. This includes the `istio-reader-service-account`. If your Istio installation is using this service account, be aware that any secrets tied to the service account will be deleted. Most notably, this will **break specific [multi-cluster deployments](https://archive.istio.io/v1.4/docs/setup/install/multicluster/)**. Downgrades back to 1.4 are not possible.
See the official upgrade notes for additional information on the 1.5 release and upgrading from 1.4: https://istio.io/latest/news/releases/1.5.x/announcing-1.5/upgrade-notes/
> **Note:** Rancher continues to use the Helm installation method, which produces a different architecture from an istioctl installation.
## Istio 1.5.9 release notes
**Bug fixes**
* The Kiali traffic graph is now working [#28109](https://github.com/rancher/rancher/issues/28109)
**Known Issues**
* The Kiali traffic graph is offset in the UI [#28207](https://github.com/rancher/rancher/issues/28207)
## Istio 1.5.8 release notes
**Known Issues**
* The Kiali traffic graph is currently not working [#24924](https://github.com/istio/istio/issues/24924)
@@ -0,0 +1,489 @@
---
title: Prometheus Custom Metrics Adapter
weight: 5
---
After you've enabled [cluster level monitoring]({{< baseurl >}}/rancher/v2.x/en/cluster-admin/tools/monitoring/#enabling-cluster-monitoring), you can view the metrics data from Rancher. You can also deploy the Prometheus custom metrics adapter so that you can use the HPA with metrics stored in cluster monitoring.
## Deploy Prometheus Custom Metrics Adapter
We are going to use the [Prometheus custom metrics adapter](https://github.com/DirectXMan12/k8s-prometheus-adapter/releases/tag/v0.5.0), version v0.5.0. This is a great example of the [custom metrics server](https://github.com/kubernetes-incubator/custom-metrics-apiserver). You must be the *cluster owner* to execute the following steps.
- Get the service account that cluster monitoring is using. It is configured in the workload with ID `statefulset:cattle-prometheus:prometheus-cluster-monitoring`. If you didn't customize anything, the service account name should be `cluster-monitoring`.
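As a sanity check, you could confirm which service account the monitoring StatefulSet references with something like the following sketch (the StatefulSet name comes from the workload ID above and may differ if you customized monitoring):

```bash
# Print the service account used by the cluster monitoring StatefulSet
kubectl -n cattle-prometheus get statefulset prometheus-cluster-monitoring \
  -o jsonpath='{.spec.template.spec.serviceAccountName}'
```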
- Grant permission to that service account. You will need two kinds of permission.
One role is `extension-apiserver-authentication-reader` in `kube-system`, so you will need to create a `RoleBinding` in `kube-system`. This permission allows the adapter to read the API aggregation configuration from the ConfigMap in `kube-system`.
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: custom-metrics-auth-reader
namespace: kube-system
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
name: cluster-monitoring
namespace: cattle-prometheus
```
The other one is the cluster role `system:auth-delegator`, so you will need to create a `ClusterRoleBinding`. This grants the adapter permission to perform subject access reviews.
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: custom-metrics:system:auth-delegator
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:auth-delegator
subjects:
- kind: ServiceAccount
name: cluster-monitoring
namespace: cattle-prometheus
```
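If you save the two bindings above as YAML files, they can be applied with `kubectl`; the file names below are only examples:

```bash
# Grant the cluster-monitoring service account the two permissions described above
kubectl apply -f custom-metrics-auth-reader-rolebinding.yaml
kubectl apply -f custom-metrics-system-auth-delegator-clusterrolebinding.yaml
```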
- Create the configuration for the custom metrics adapter. The following is an example configuration. Configuration details are explained later in this page.
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: adapter-config
namespace: cattle-prometheus
data:
config.yaml: |
rules:
- seriesQuery: '{__name__=~"^container_.*",container_name!="POD",namespace!="",pod_name!=""}'
seriesFilters: []
resources:
overrides:
namespace:
resource: namespace
pod_name:
resource: pod
name:
matches: ^container_(.*)_seconds_total$
as: ""
metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>,container_name!="POD"}[1m])) by (<<.GroupBy>>)
- seriesQuery: '{__name__=~"^container_.*",container_name!="POD",namespace!="",pod_name!=""}'
seriesFilters:
- isNot: ^container_.*_seconds_total$
resources:
overrides:
namespace:
resource: namespace
pod_name:
resource: pod
name:
matches: ^container_(.*)_total$
as: ""
metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>,container_name!="POD"}[1m])) by (<<.GroupBy>>)
- seriesQuery: '{__name__=~"^container_.*",container_name!="POD",namespace!="",pod_name!=""}'
seriesFilters:
- isNot: ^container_.*_total$
resources:
overrides:
namespace:
resource: namespace
pod_name:
resource: pod
name:
matches: ^container_(.*)$
as: ""
metricsQuery: sum(<<.Series>>{<<.LabelMatchers>>,container_name!="POD"}) by (<<.GroupBy>>)
- seriesQuery: '{namespace!="",__name__!~"^container_.*"}'
seriesFilters:
- isNot: .*_total$
resources:
template: <<.Resource>>
name:
matches: ""
as: ""
metricsQuery: sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)
- seriesQuery: '{namespace!="",__name__!~"^container_.*"}'
seriesFilters:
- isNot: .*_seconds_total
resources:
template: <<.Resource>>
name:
matches: ^(.*)_total$
as: ""
metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)
- seriesQuery: '{namespace!="",__name__!~"^container_.*"}'
seriesFilters: []
resources:
template: <<.Resource>>
name:
matches: ^(.*)_seconds_total$
as: ""
metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)
resourceRules:
cpu:
containerQuery: sum(rate(container_cpu_usage_seconds_total{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)
nodeQuery: sum(rate(container_cpu_usage_seconds_total{<<.LabelMatchers>>, id='/'}[1m])) by (<<.GroupBy>>)
resources:
overrides:
instance:
resource: node
namespace:
resource: namespace
pod_name:
resource: pod
containerLabel: container_name
memory:
containerQuery: sum(container_memory_working_set_bytes{<<.LabelMatchers>>}) by (<<.GroupBy>>)
nodeQuery: sum(container_memory_working_set_bytes{<<.LabelMatchers>>,id='/'}) by (<<.GroupBy>>)
resources:
overrides:
instance:
resource: node
namespace:
resource: namespace
pod_name:
resource: pod
containerLabel: container_name
window: 1m
```
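Assuming you save the ConfigMap above to a file (the file name below is just an example), it can be created in the cluster with:

```bash
# Create the adapter configuration in the cattle-prometheus namespace
kubectl apply -f adapter-config.yaml
```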
- Create HTTPS TLS certs for your API server. You can use the following command to create a self-signed certificate.
```bash
openssl req -new -newkey rsa:4096 -x509 -sha256 -days 365 -nodes -out serving.crt -keyout serving.key -subj "/C=CN/CN=custom-metrics-apiserver.cattle-prometheus.svc.cluster.local"
# serving.crt and serving.key will be created in the current directory. Next, create a secret in the cattle-prometheus namespace.
kubectl create secret generic -n cattle-prometheus cm-adapter-serving-certs --from-file=serving.key=./serving.key --from-file=serving.crt=./serving.crt
```
- Then you can create the Prometheus custom metrics adapter. You will also need a service for this deployment. Creating them via Import YAML in Rancher will do. Please create those resources in the `cattle-prometheus` namespace.
Here is the Prometheus custom metrics adapter deployment.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: custom-metrics-apiserver
name: custom-metrics-apiserver
namespace: cattle-prometheus
spec:
replicas: 1
selector:
matchLabels:
app: custom-metrics-apiserver
template:
metadata:
labels:
app: custom-metrics-apiserver
name: custom-metrics-apiserver
spec:
serviceAccountName: cluster-monitoring
containers:
- name: custom-metrics-apiserver
image: directxman12/k8s-prometheus-adapter-amd64:v0.5.0
args:
- --secure-port=6443
- --tls-cert-file=/var/run/serving-cert/serving.crt
- --tls-private-key-file=/var/run/serving-cert/serving.key
- --logtostderr=true
- --prometheus-url=http://prometheus-operated/
- --metrics-relist-interval=1m
- --v=10
- --config=/etc/adapter/config.yaml
ports:
- containerPort: 6443
volumeMounts:
- mountPath: /var/run/serving-cert
name: volume-serving-cert
readOnly: true
- mountPath: /etc/adapter/
name: config
readOnly: true
- mountPath: /tmp
name: tmp-vol
volumes:
- name: volume-serving-cert
secret:
secretName: cm-adapter-serving-certs
- name: config
configMap:
name: adapter-config
- name: tmp-vol
emptyDir: {}
```
Here is the service of the deployment.
```yaml
apiVersion: v1
kind: Service
metadata:
name: custom-metrics-apiserver
namespace: cattle-prometheus
spec:
ports:
- port: 443
targetPort: 6443
selector:
app: custom-metrics-apiserver
```
- Create API service for your custom metric server.
```yaml
apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
name: v1beta1.custom.metrics.k8s.io
spec:
service:
name: custom-metrics-apiserver
namespace: cattle-prometheus
group: custom.metrics.k8s.io
version: v1beta1
insecureSkipTLSVerify: true
groupPriorityMinimum: 100
versionPriority: 100
```
- Then you can verify your custom metrics server with `kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1`. If the API returns data, the metrics server has been set up successfully.
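As a sketch, the verification call below lists the metric names generated from the rules in your adapter configuration; the pretty-printing step assumes Python is available and can be dropped:

```bash
# List all custom metrics currently exposed by the adapter (pretty-printed)
kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 | python -m json.tool
```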
- You can now create an HPA with custom metrics. You will need to create an nginx deployment in your namespace first.
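One minimal way to create that deployment is sketched below, assuming the public `nginx` image is acceptable for testing; replace `<your-namespace>` with the namespace you are working in:

```bash
# Create a simple nginx deployment for the HPA to target
kubectl -n <your-namespace> create deployment nginx --image=nginx
```

With the deployment in place, an HPA that scales on the custom `memory_usage_bytes` metric exposed by the adapter could look like the following example.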
```yaml
kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2beta1
metadata:
name: nginx
spec:
scaleTargetRef:
# point the HPA at the nginx deployment you just created
apiVersion: apps/v1
kind: Deployment
name: nginx
# autoscale between 1 and 10 replicas
minReplicas: 1
maxReplicas: 10
metrics:
# use a "Pods" metric, which takes the average of the
# given metric across all pods controlled by the autoscaling target
- type: Pods
pods:
metricName: memory_usage_bytes
targetAverageValue: 5000000
```
You should then see your nginx deployment scaling up, which confirms that HPA with custom metrics is working.
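A sketch of commands you might use to watch the HPA react; the names assume the example above:

```bash
# Watch the HPA pick up the custom metric and adjust the replica count
kubectl -n <your-namespace> get hpa nginx --watch

# Inspect the raw metric values the adapter reports for the nginx pods
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/<your-namespace>/pods/*/memory_usage_bytes"
```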
## Configuration of the Prometheus custom metrics adapter
> Refer to https://github.com/DirectXMan12/k8s-prometheus-adapter/blob/master/docs/config.md
The adapter determines which metrics to expose, and how to expose them,
through a set of "discovery" rules. Each rule is executed independently
(so make sure that your rules are mutually exclusive), and specifies each
of the steps the adapter needs to take to expose a metric in the API.
Each rule can be broken down into roughly four parts:
- *Discovery*, which specifies how the adapter should find all Prometheus
metrics for this rule.
- *Association*, which specifies how the adapter should determine which
Kubernetes resources a particular metric is associated with.
- *Naming*, which specifies how the adapter should expose the metric in
the custom metrics API.
- *Querying*, which specifies how a request for a particular metric on one
or more Kubernetes objects should be turned into a query to Prometheus.
A more comprehensive configuration file can be found in the [Prometheus custom metrics adapter repository](https://github.com/DirectXMan12/k8s-prometheus-adapter), but a basic config with one rule might look like:
```yaml
rules:
# this rule matches cumulative cAdvisor metrics measured in seconds
- seriesQuery: '{__name__=~"^container_.*",container_name!="POD",namespace!="",pod_name!=""}'
  resources:
    # skip specifying generic resource<->label mappings, and just
    # attach only pod and namespace resources by mapping label names to group-resources
    overrides:
      namespace: {resource: "namespace"}
      pod_name: {resource: "pod"}
  # specify that the `container_` and `_seconds_total` suffixes should be removed.
  # this also introduces an implicit filter on metric family names
  name:
    # we use the value of the capture group implicitly as the API name
    # we could also explicitly write `as: "$1"`
    matches: "^container_(.*)_seconds_total$"
  # specify how to construct a query to fetch samples for a given series
  # This is a Go template where the `.Series` and `.LabelMatchers` string values
  # are available, and the delimiters are `<<` and `>>` to avoid conflicts with
  # the prometheus query language
  metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>,container_name!="POD"}[2m])) by (<<.GroupBy>>)'
```
### Discovery
Discovery governs the process of finding the metrics that you want to
expose in the custom metrics API. There are two fields that factor into
discovery: `seriesQuery` and `seriesFilters`.
`seriesQuery` specifies a Prometheus series query (as passed to the
`/api/v1/series` endpoint in Prometheus) to use to find some set of
Prometheus series. The adapter will strip the label values from this
series, and then use the resulting metric-name-label-names combinations
later on.
In many cases, `seriesQuery` will be sufficient to narrow down the list of
Prometheus series. However, sometimes (especially if two rules might
otherwise overlap), it's useful to do additional filtering on metric
names. In this case, `seriesFilters` can be used. After the list of
series is returned from `seriesQuery`, each series has its metric name
filtered through any specified filters.
Filters may be either:
- `is: <regex>`, which matches any series whose name matches the specified
regex.
- `isNot: <regex>`, which matches any series whose name does not match the
specified regex.
For example:
```yaml
# match all cAdvisor metrics that aren't measured in seconds
seriesQuery: '{__name__=~"^container_.*_total",container_name!="POD",namespace!="",pod_name!=""}'
seriesFilters:
  - isNot: "^container_.*_seconds_total"
```
### Association
Association governs the process of figuring out which Kubernetes resources
a particular metric could be attached to. The `resources` field controls
this process.
There are two ways to associate resources with a particular metric. In
both cases, the value of the label becomes the name of the particular
object.
One way is to specify that any label name that matches some particular
pattern refers to some group-resource based on the label name. This can
be done using the `template` field. The pattern is specified as a Go
template, with the `Group` and `Resource` fields representing group and
resource. You don't necessarily have to use the `Group` field (in which
case the group is guessed by the system). For instance:
```yaml
# any label `kube_<group>_<resource>` becomes <group>.<resource> in Kubernetes
resources:
  template: "kube_<<.Group>>_<<.Resource>>"
```
The other way is to specify that some particular label represents some
particular Kubernetes resource. This can be done using the `overrides`
field. Each override maps a Prometheus label to a Kubernetes
group-resource. For instance:
```yaml
# the microservice label corresponds to the apps.deployment resource
resources:
  overrides:
    microservice: {group: "apps", resource: "deployment"}
```
These two can be combined, so you can specify both a template and some
individual overrides.
The resources mentioned can be any resource available in your Kubernetes
cluster, as long as you've got a corresponding label.
### Naming
Naming governs the process of converting a Prometheus metric name into
a metric in the custom metrics API, and vice versa. It's controlled by
the `name` field.
Naming is controlled by specifying a pattern to extract an API name from
a Prometheus name, and potentially a transformation on that extracted
value.
The pattern is specified in the `matches` field, and is just a regular
expression. If not specified, it defaults to `.*`.
The transformation is specified by the `as` field. You can use any
capture groups defined in the `matches` field. If the `matches` field
doesn't contain capture groups, the `as` field defaults to `$0`. If it
contains a single capture group, the `as` field defaults to `$1`.
Otherwise, it's an error not to specify the as field.
For example:
```yaml
# turn any name <name>_total into <name>_per_second
# e.g. http_requests_total becomes http_requests_per_second
name:
matches: "^(.*)_total$"
as: "${1}_per_second"
```
### Querying
Querying governs the process of actually fetching values for a particular
metric. It's controlled by the `metricsQuery` field.
The `metricsQuery` field is a Go template that gets turned into
a Prometheus query, using input from a particular call to the custom
metrics API. A given call to the custom metrics API is distilled down to
a metric name, a group-resource, and one or more objects of that
group-resource. These get turned into the following fields in the
template:
- `Series`: the metric name
- `LabelMatchers`: a comma-separated list of label matchers matching the
given objects. Currently, this is the label for the particular
group-resource, plus the label for namespace, if the group-resource is
namespaced.
- `GroupBy`: a comma-separated list of labels to group by. Currently,
this contains the group-resource label used in `LabelMatchers`.
For instance, suppose we had a series `http_requests_total` (exposed as
`http_requests_per_second` in the API) with labels `service`, `pod`,
`ingress`, `namespace`, and `verb`. The first four correspond to
Kubernetes resources. Then, if someone requested the metric
`pods/http_requests_per_second` for the pods `pod1` and `pod2` in the
`somens` namespace, we'd have:
- `Series: "http_requests_total"`
- `LabelMatchers: "pod=~\"pod1|pod2\",namespace=\"somens\""`
- `GroupBy`: `pod`
Additionally, there are two advanced fields that are "raw" forms of other
fields:
- `LabelValuesByName`: a map mapping the labels and values from the
`LabelMatchers` field. The values are pre-joined by `|`
(for use with the `=~` matcher in Prometheus).
- `GroupBySlice`: the slice form of `GroupBy`.
In general, you'll probably want to use the `Series`, `LabelMatchers`, and
`GroupBy` fields. The other two are for advanced usage.
The query is expected to return one value for each object requested. The
adapter will use the labels on the returned series to associate a given
series back to its corresponding object.
For example:
```yaml
# convert cumulative cAdvisor metrics into rates calculated over 2 minutes
metricsQuery: "sum(rate(<<.Series>>{<<.LabelMatchers>>,container_name!="POD"}[2m])) by (<<.GroupBy>>)"
```
@@ -0,0 +1,430 @@
---
title: Prometheus Expressions
weight: 4
---
The PromQL expressions in this doc can be used to configure [alerts.]({{<baseurl>}}/rancher/v2.x/en/cluster-admin/tools/alerts/)
> Before expressions can be used in alerts, monitoring must be enabled. For more information, refer to the documentation on enabling monitoring [at the cluster level]({{<baseurl>}}/rancher/v2.x/en/cluster-admin/tools/monitoring/#enabling-cluster-monitoring) or [at the project level.]({{<baseurl>}}/rancher/v2.x/en/project-admin/tools/monitoring/#enabling-project-monitoring)
For more information about querying Prometheus, refer to the official [Prometheus documentation.](https://prometheus.io/docs/prometheus/latest/querying/basics/)
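If you want to try an expression by hand before using it in an alert, one approach is to port-forward to the cluster monitoring Prometheus and query its HTTP API. This is only a sketch; the service and namespace names assume the default cluster monitoring installation:

```bash
# Forward the Prometheus query port to your workstation
kubectl -n cattle-prometheus port-forward svc/prometheus-operated 9090:9090 &

# Evaluate the cluster CPU utilization summary expression against the Prometheus HTTP API
curl -G 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=1 - (avg(irate(node_cpu_seconds_total{mode="idle"}[5m])))'
```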
<!-- TOC -->
- [Cluster Metrics](#cluster-metrics)
- [Cluster CPU Utilization](#cluster-cpu-utilization)
- [Cluster Load Average](#cluster-load-average)
- [Cluster Memory Utilization](#cluster-memory-utilization)
- [Cluster Disk Utilization](#cluster-disk-utilization)
- [Cluster Disk I/O](#cluster-disk-i-o)
- [Cluster Network Packets](#cluster-network-packets)
- [Cluster Network I/O](#cluster-network-i-o)
- [Node Metrics](#node-metrics)
- [Node CPU Utilization](#node-cpu-utilization)
- [Node Load Average](#node-load-average)
- [Node Memory Utilization](#node-memory-utilization)
- [Node Disk Utilization](#node-disk-utilization)
- [Node Disk I/O](#node-disk-i-o)
- [Node Network Packets](#node-network-packets)
- [Node Network I/O](#node-network-i-o)
- [Etcd Metrics](#etcd-metrics)
- [Etcd Has a Leader](#etcd-has-a-leader)
- [Number of Times the Leader Changes](#number-of-times-the-leader-changes)
- [Number of Failed Proposals](#number-of-failed-proposals)
- [GRPC Client Traffic](#grpc-client-traffic)
- [Peer Traffic](#peer-traffic)
- [DB Size](#db-size)
- [Active Streams](#active-streams)
- [Raft Proposals](#raft-proposals)
- [RPC Rate](#rpc-rate)
- [Disk Operations](#disk-operations)
- [Disk Sync Duration](#disk-sync-duration)
- [Kubernetes Components Metrics](#kubernetes-components-metrics)
- [API Server Request Latency](#api-server-request-latency)
- [API Server Request Rate](#api-server-request-rate)
- [Scheduling Failed Pods](#scheduling-failed-pods)
- [Controller Manager Queue Depth](#controller-manager-queue-depth)
- [Scheduler E2E Scheduling Latency](#scheduler-e2e-scheduling-latency)
- [Scheduler Preemption Attempts](#scheduler-preemption-attempts)
- [Ingress Controller Connections](#ingress-controller-connections)
- [Ingress Controller Request Process Time](#ingress-controller-request-process-time)
- [Rancher Logging Metrics](#rancher-logging-metrics)
- [Fluentd Buffer Queue Rate](#fluentd-buffer-queue-rate)
- [Fluentd Input Rate](#fluentd-input-rate)
- [Fluentd Output Errors Rate](#fluentd-output-errors-rate)
- [Fluentd Output Rate](#fluentd-output-rate)
- [Workload Metrics](#workload-metrics)
- [Workload CPU Utilization](#workload-cpu-utilization)
- [Workload Memory Utilization](#workload-memory-utilization)
- [Workload Network Packets](#workload-network-packets)
- [Workload Network I/O](#workload-network-i-o)
- [Workload Disk I/O](#workload-disk-i-o)
- [Pod Metrics](#pod-metrics)
- [Pod CPU Utilization](#pod-cpu-utilization)
- [Pod Memory Utilization](#pod-memory-utilization)
- [Pod Network Packets](#pod-network-packets)
- [Pod Network I/O](#pod-network-i-o)
- [Pod Disk I/O](#pod-disk-i-o)
- [Container Metrics](#container-metrics)
- [Container CPU Utilization](#container-cpu-utilization)
- [Container Memory Utilization](#container-memory-utilization)
- [Container Disk I/O](#container-disk-i-o)
<!-- /TOC -->
# Cluster Metrics
### Cluster CPU Utilization
| Catalog | Expression |
| --- | --- |
| Detail | `1 - (avg(irate(node_cpu_seconds_total{mode="idle"}[5m])) by (instance))` |
| Summary | `1 - (avg(irate(node_cpu_seconds_total{mode="idle"}[5m])))` |
### Cluster Load Average
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>load1</td><td>`sum(node_load1) by (instance) / count(node_cpu_seconds_total{mode="system"}) by (instance)`</td></tr><tr><td>load5</td><td>`sum(node_load5) by (instance) / count(node_cpu_seconds_total{mode="system"}) by (instance)`</td></tr><tr><td>load15</td><td>`sum(node_load15) by (instance) / count(node_cpu_seconds_total{mode="system"}) by (instance)`</td></tr></table> |
| Summary | <table><tr><td>load1</td><td>`sum(node_load1) by (instance) / count(node_cpu_seconds_total{mode="system"})`</td></tr><tr><td>load5</td><td>`sum(node_load5) by (instance) / count(node_cpu_seconds_total{mode="system"})`</td></tr><tr><td>load15</td><td>`sum(node_load15) by (instance) / count(node_cpu_seconds_total{mode="system"})`</td></tr></table> |
### Cluster Memory Utilization
| Catalog | Expression |
| --- | --- |
| Detail | `1 - sum(node_memory_MemAvailable_bytes) by (instance) / sum(node_memory_MemTotal_bytes) by (instance)` |
| Summary | `1 - sum(node_memory_MemAvailable_bytes) / sum(node_memory_MemTotal_bytes)` |
### Cluster Disk Utilization
| Catalog | Expression |
| --- | --- |
| Detail | `(sum(node_filesystem_size_bytes{device!="rootfs"}) by (instance) - sum(node_filesystem_free_bytes{device!="rootfs"}) by (instance)) / sum(node_filesystem_size_bytes{device!="rootfs"}) by (instance)` |
| Summary | `(sum(node_filesystem_size_bytes{device!="rootfs"}) - sum(node_filesystem_free_bytes{device!="rootfs"})) / sum(node_filesystem_size_bytes{device!="rootfs"})` |
### Cluster Disk I/O
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>read</td><td>`sum(rate(node_disk_read_bytes_total[5m])) by (instance)`</td></tr><tr><td>written</td><td>`sum(rate(node_disk_written_bytes_total[5m])) by (instance)`</td></tr></table> |
| Summary | <table><tr><td>read</td><td>`sum(rate(node_disk_read_bytes_total[5m]))`</td></tr><tr><td>written</td><td>`sum(rate(node_disk_written_bytes_total[5m]))`</td></tr></table> |
### Cluster Network Packets
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>receive-dropped</td><td><code>sum(rate(node_network_receive_drop_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*"}[5m])) by (instance)</code></td></tr><tr><td>receive-errs</td><td><code>sum(rate(node_network_receive_errs_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*"}[5m])) by (instance)</code></td></tr><tr><td>receive-packets</td><td><code>sum(rate(node_network_receive_packets_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*"}[5m])) by (instance)</code></td></tr><tr><td>transmit-dropped</td><td><code>sum(rate(node_network_transmit_drop_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*"}[5m])) by (instance)</code></td></tr><tr><td>transmit-errs</td><td><code>sum(rate(node_network_transmit_errs_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*"}[5m])) by (instance)</code></td></tr><tr><td>transmit-packets</td><td><code>sum(rate(node_network_transmit_packets_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*"}[5m])) by (instance)</code></td></tr></table> |
| Summary | <table><tr><td>receive-dropped</td><td><code>sum(rate(node_network_receive_drop_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*"}[5m]))</code></td></tr><tr><td>receive-errs</td><td><code>sum(rate(node_network_receive_errs_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*"}[5m]))</code></td></tr><tr><td>receive-packets</td><td><code>sum(rate(node_network_receive_packets_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*"}[5m]))</code></td></tr><tr><td>transmit-dropped</td><td><code>sum(rate(node_network_transmit_drop_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*"}[5m]))</code></td></tr><tr><td>transmit-errs</td><td><code>sum(rate(node_network_transmit_errs_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*"}[5m]))</code></td></tr><tr><td>transmit-packets</td><td><code>sum(rate(node_network_transmit_packets_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*"}[5m]))</code></td></tr></table> |
### Cluster Network I/O
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>receive</td><td><code>sum(rate(node_network_receive_bytes_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*"}[5m])) by (instance)</code></td></tr><tr><td>transmit</td><td><code>sum(rate(node_network_transmit_bytes_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*"}[5m])) by (instance)</code></td></tr></table> |
| Summary | <table><tr><td>receive</td><td><code>sum(rate(node_network_receive_bytes_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*"}[5m]))</code></td></tr><tr><td>transmit</td><td><code>sum(rate(node_network_transmit_bytes_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*"}[5m]))</code></td></tr></table> |
# Node Metrics
### Node CPU Utilization
| Catalog | Expression |
| --- | --- |
| Detail | `avg(irate(node_cpu_seconds_total{mode!="idle", instance=~"$instance"}[5m])) by (mode)` |
| Summary | `1 - (avg(irate(node_cpu_seconds_total{mode="idle", instance=~"$instance"}[5m])))` |
### Node Load Average
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>load1</td><td>`sum(node_load1{instance=~"$instance"}) / count(node_cpu_seconds_total{mode="system",instance=~"$instance"})`</td></tr><tr><td>load5</td><td>`sum(node_load5{instance=~"$instance"}) / count(node_cpu_seconds_total{mode="system",instance=~"$instance"})`</td></tr><tr><td>load15</td><td>`sum(node_load15{instance=~"$instance"}) / count(node_cpu_seconds_total{mode="system",instance=~"$instance"})`</td></tr></table> |
| Summary | <table><tr><td>load1</td><td>`sum(node_load1{instance=~"$instance"}) / count(node_cpu_seconds_total{mode="system",instance=~"$instance"})`</td></tr><tr><td>load5</td><td>`sum(node_load5{instance=~"$instance"}) / count(node_cpu_seconds_total{mode="system",instance=~"$instance"})`</td></tr><tr><td>load15</td><td>`sum(node_load15{instance=~"$instance"}) / count(node_cpu_seconds_total{mode="system",instance=~"$instance"})`</td></tr></table> |
### Node Memory Utilization
| Catalog | Expression |
| --- | --- |
| Detail | `1 - sum(node_memory_MemAvailable_bytes{instance=~"$instance"}) / sum(node_memory_MemTotal_bytes{instance=~"$instance"})` |
| Summary | `1 - sum(node_memory_MemAvailable_bytes{instance=~"$instance"}) / sum(node_memory_MemTotal_bytes{instance=~"$instance"}) ` |
### Node Disk Utilization
| Catalog | Expression |
| --- | --- |
| Detail | `(sum(node_filesystem_size_bytes{device!="rootfs",instance=~"$instance"}) by (device) - sum(node_filesystem_free_bytes{device!="rootfs",instance=~"$instance"}) by (device)) / sum(node_filesystem_size_bytes{device!="rootfs",instance=~"$instance"}) by (device)` |
| Summary | `(sum(node_filesystem_size_bytes{device!="rootfs",instance=~"$instance"}) - sum(node_filesystem_free_bytes{device!="rootfs",instance=~"$instance"})) / sum(node_filesystem_size_bytes{device!="rootfs",instance=~"$instance"})` |
### Node Disk I/O
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>read</td><td>`sum(rate(node_disk_read_bytes_total{instance=~"$instance"}[5m]))`</td></tr><tr><td>written</td><td>`sum(rate(node_disk_written_bytes_total{instance=~"$instance"}[5m]))`</td></tr></table> |
| Summary | <table><tr><td>read</td><td>`sum(rate(node_disk_read_bytes_total{instance=~"$instance"}[5m]))`</td></tr><tr><td>written</td><td>`sum(rate(node_disk_written_bytes_total{instance=~"$instance"}[5m]))`</td></tr></table> |
### Node Network Packets
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>receive-dropped</td><td><code>sum(rate(node_network_receive_drop_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*",instance=~"$instance"}[5m])) by (device)</code></td></tr><tr><td>receive-errs</td><td><code>sum(rate(node_network_receive_errs_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*",instance=~"$instance"}[5m])) by (device)</code></td></tr><tr><td>receive-packets</td><td><code>sum(rate(node_network_receive_packets_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*",instance=~"$instance"}[5m])) by (device)</code></td></tr><tr><td>transmit-dropped</td><td><code>sum(rate(node_network_transmit_drop_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*",instance=~"$instance"}[5m])) by (device)</code></td></tr><tr><td>transmit-errs</td><td><code>sum(rate(node_network_transmit_errs_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*",instance=~"$instance"}[5m])) by (device)</code></td></tr><tr><td>transmit-packets</td><td><code>sum(rate(node_network_transmit_packets_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*",instance=~"$instance"}[5m])) by (device)</code></td></tr></table> |
| Summary | <table><tr><td>receive-dropped</td><td><code>sum(rate(node_network_receive_drop_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*",instance=~"$instance"}[5m]))</code></td></tr><tr><td>receive-errs</td><td><code>sum(rate(node_network_receive_errs_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*",instance=~"$instance"}[5m]))</code></td></tr><tr><td>receive-packets</td><td><code>sum(rate(node_network_receive_packets_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*",instance=~"$instance"}[5m]))</code></td></tr><tr><td>transmit-dropped</td><td><code>sum(rate(node_network_transmit_drop_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*",instance=~"$instance"}[5m]))</code></td></tr><tr><td>transmit-errs</td><td><code>sum(rate(node_network_transmit_errs_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*",instance=~"$instance"}[5m]))</code></td></tr><tr><td>transmit-packets</td><td><code>sum(rate(node_network_transmit_packets_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*",instance=~"$instance"}[5m]))</code></td></tr></table> |
### Node Network I/O
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>receive</td><td><code>sum(rate(node_network_receive_bytes_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*",instance=~"$instance"}[5m])) by (device)</code></td></tr><tr><td>transmit</td><td><code>sum(rate(node_network_transmit_bytes_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*",instance=~"$instance"}[5m])) by (device)</code></td></tr></table> |
| Summary | <table><tr><td>receive</td><td><code>sum(rate(node_network_receive_bytes_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*",instance=~"$instance"}[5m]))</code></td></tr><tr><td>transmit</td><td><code>sum(rate(node_network_transmit_bytes_total{device!~"lo &#124; veth.* &#124; docker.* &#124; flannel.* &#124; cali.* &#124; cbr.*",instance=~"$instance"}[5m]))</code></td></tr></table> |
# Etcd Metrics
### Etcd Has a Leader
`max(etcd_server_has_leader)`
### Number of Times the Leader Changes
`max(etcd_server_leader_changes_seen_total)`
### Number of Failed Proposals
`sum(etcd_server_proposals_failed_total)`
### GRPC Client Traffic
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>in</td><td>`sum(rate(etcd_network_client_grpc_received_bytes_total[5m])) by (instance)`</td></tr><tr><td>out</td><td>`sum(rate(etcd_network_client_grpc_sent_bytes_total[5m])) by (instance)`</td></tr></table> |
| Summary | <table><tr><td>in</td><td>`sum(rate(etcd_network_client_grpc_received_bytes_total[5m]))`</td></tr><tr><td>out</td><td>`sum(rate(etcd_network_client_grpc_sent_bytes_total[5m]))`</td></tr></table> |
### Peer Traffic
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>in</td><td>`sum(rate(etcd_network_peer_received_bytes_total[5m])) by (instance)`</td></tr><tr><td>out</td><td>`sum(rate(etcd_network_peer_sent_bytes_total[5m])) by (instance)`</td></tr></table> |
| Summary | <table><tr><td>in</td><td>`sum(rate(etcd_network_peer_received_bytes_total[5m]))`</td></tr><tr><td>out</td><td>`sum(rate(etcd_network_peer_sent_bytes_total[5m]))`</td></tr></table> |
### DB Size
| Catalog | Expression |
| --- | --- |
| Detail | `sum(etcd_debugging_mvcc_db_total_size_in_bytes) by (instance)` |
| Summary | `sum(etcd_debugging_mvcc_db_total_size_in_bytes)` |
### Active Streams
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>lease-watch</td><td>`sum(grpc_server_started_total{grpc_service="etcdserverpb.Lease",grpc_type="bidi_stream"}) by (instance) - sum(grpc_server_handled_total{grpc_service="etcdserverpb.Lease",grpc_type="bidi_stream"}) by (instance)`</td></tr><tr><td>watch</td><td>`sum(grpc_server_started_total{grpc_service="etcdserverpb.Watch",grpc_type="bidi_stream"}) by (instance) - sum(grpc_server_handled_total{grpc_service="etcdserverpb.Watch",grpc_type="bidi_stream"}) by (instance)`</td></tr></table> |
| Summary | <table><tr><td>lease-watch</td><td>`sum(grpc_server_started_total{grpc_service="etcdserverpb.Lease",grpc_type="bidi_stream"}) - sum(grpc_server_handled_total{grpc_service="etcdserverpb.Lease",grpc_type="bidi_stream"})`</td></tr><tr><td>watch</td><td>`sum(grpc_server_started_total{grpc_service="etcdserverpb.Watch",grpc_type="bidi_stream"}) - sum(grpc_server_handled_total{grpc_service="etcdserverpb.Watch",grpc_type="bidi_stream"})`</td></tr></table> |
### Raft Proposals
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>applied</td><td>`sum(increase(etcd_server_proposals_applied_total[5m])) by (instance)`</td></tr><tr><td>committed</td><td>`sum(increase(etcd_server_proposals_committed_total[5m])) by (instance)`</td></tr><tr><td>pending</td><td>`sum(increase(etcd_server_proposals_pending[5m])) by (instance)`</td></tr><tr><td>failed</td><td>`sum(increase(etcd_server_proposals_failed_total[5m])) by (instance)`</td></tr></table> |
| Summary | <table><tr><td>applied</td><td>`sum(increase(etcd_server_proposals_applied_total[5m]))`</td></tr><tr><td>committed</td><td>`sum(increase(etcd_server_proposals_committed_total[5m]))`</td></tr><tr><td>pending</td><td>`sum(increase(etcd_server_proposals_pending[5m]))`</td></tr><tr><td>failed</td><td>`sum(increase(etcd_server_proposals_failed_total[5m]))`</td></tr></table> |
### RPC Rate
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>total</td><td>`sum(rate(grpc_server_started_total{grpc_type="unary"}[5m])) by (instance)`</td></tr><tr><td>fail</td><td>`sum(rate(grpc_server_handled_total{grpc_type="unary",grpc_code!="OK"}[5m])) by (instance)`</td></tr></table> |
| Summary | <table><tr><td>total</td><td>`sum(rate(grpc_server_started_total{grpc_type="unary"}[5m]))`</td></tr><tr><td>fail</td><td>`sum(rate(grpc_server_handled_total{grpc_type="unary",grpc_code!="OK"}[5m]))`</td></tr></table> |
### Disk Operations
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>commit-called-by-backend</td><td>`sum(rate(etcd_disk_backend_commit_duration_seconds_sum[1m])) by (instance)`</td></tr><tr><td>fsync-called-by-wal</td><td>`sum(rate(etcd_disk_wal_fsync_duration_seconds_sum[1m])) by (instance)`</td></tr></table> |
| Summary | <table><tr><td>commit-called-by-backend</td><td>`sum(rate(etcd_disk_backend_commit_duration_seconds_sum[1m]))`</td></tr><tr><td>fsync-called-by-wal</td><td>`sum(rate(etcd_disk_wal_fsync_duration_seconds_sum[1m]))`</td></tr></table> |
### Disk Sync Duration
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>wal</td><td>`histogram_quantile(0.99, sum(rate(etcd_disk_wal_fsync_duration_seconds_bucket[5m])) by (instance, le))`</td></tr><tr><td>db</td><td>`histogram_quantile(0.99, sum(rate(etcd_disk_backend_commit_duration_seconds_bucket[5m])) by (instance, le))`</td></tr></table> |
| Summary | <table><tr><td>wal</td><td>`sum(histogram_quantile(0.99, sum(rate(etcd_disk_wal_fsync_duration_seconds_bucket[5m])) by (instance, le)))`</td></tr><tr><td>db</td><td>`sum(histogram_quantile(0.99, sum(rate(etcd_disk_backend_commit_duration_seconds_bucket[5m])) by (instance, le)))`</td></tr></table> |
# Kubernetes Components Metrics
### API Server Request Latency
| Catalog | Expression |
| --- | --- |
| Detail | `avg(apiserver_request_latencies_sum / apiserver_request_latencies_count) by (instance, verb) /1e+06` |
| Summary | `avg(apiserver_request_latencies_sum / apiserver_request_latencies_count) by (instance) /1e+06` |
### API Server Request Rate
| Catalog | Expression |
| --- | --- |
| Detail | `sum(rate(apiserver_request_count[5m])) by (instance, code)` |
| Summary | `sum(rate(apiserver_request_count[5m])) by (instance)` |
### Scheduling Failed Pods
| Catalog | Expression |
| --- | --- |
| Detail | `sum(kube_pod_status_scheduled{condition="false"})` |
| Summary | `sum(kube_pod_status_scheduled{condition="false"})` |
### Controller Manager Queue Depth
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>volumes</td><td>`sum(volumes_depth) by instance`</td></tr><tr><td>deployment</td><td>`sum(deployment_depth) by instance`</td></tr><tr><td>replicaset</td><td>`sum(replicaset_depth) by instance`</td></tr><tr><td>service</td><td>`sum(service_depth) by instance`</td></tr><tr><td>serviceaccount</td><td>`sum(serviceaccount_depth) by instance`</td></tr><tr><td>endpoint</td><td>`sum(endpoint_depth) by instance`</td></tr><tr><td>daemonset</td><td>`sum(daemonset_depth) by instance`</td></tr><tr><td>statefulset</td><td>`sum(statefulset_depth) by instance`</td></tr><tr><td>replicationmanager</td><td>`sum(replicationmanager_depth) by instance`</td></tr></table> |
| Summary | <table><tr><td>volumes</td><td>`sum(volumes_depth)`</td></tr><tr><td>deployment</td><td>`sum(deployment_depth)`</td></tr><tr><td>replicaset</td><td>`sum(replicaset_depth)`</td></tr><tr><td>service</td><td>`sum(service_depth)`</td></tr><tr><td>serviceaccount</td><td>`sum(serviceaccount_depth)`</td></tr><tr><td>endpoint</td><td>`sum(endpoint_depth)`</td></tr><tr><td>daemonset</td><td>`sum(daemonset_depth)`</td></tr><tr><td>statefulset</td><td>`sum(statefulset_depth)`</td></tr><tr><td>replicationmanager</td><td>`sum(replicationmanager_depth)`</td></tr></table> |
### Scheduler E2E Scheduling Latency
| Catalog | Expression |
| --- | --- |
| Detail | `histogram_quantile(0.99, sum(scheduler_e2e_scheduling_latency_microseconds_bucket) by (le, instance)) / 1e+06` |
| Summary | `sum(histogram_quantile(0.99, sum(scheduler_e2e_scheduling_latency_microseconds_bucket) by (le, instance)) / 1e+06)` |
### Scheduler Preemption Attempts
| Catalog | Expression |
| --- | --- |
| Detail | `sum(rate(scheduler_total_preemption_attempts[5m])) by (instance)` |
| Summary | `sum(rate(scheduler_total_preemption_attempts[5m]))` |
### Ingress Controller Connections
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>reading</td><td>`sum(nginx_ingress_controller_nginx_process_connections{state="reading"}) by (instance)`</td></tr><tr><td>waiting</td><td>`sum(nginx_ingress_controller_nginx_process_connections{state="waiting"}) by (instance)`</td></tr><tr><td>writing</td><td>`sum(nginx_ingress_controller_nginx_process_connections{state="writing"}) by (instance)`</td></tr><tr><td>accepted</td><td>`sum(ceil(increase(nginx_ingress_controller_nginx_process_connections_total{state="accepted"}[5m]))) by (instance)`</td></tr><tr><td>active</td><td>`sum(ceil(increase(nginx_ingress_controller_nginx_process_connections_total{state="active"}[5m]))) by (instance)`</td></tr><tr><td>handled</td><td>`sum(ceil(increase(nginx_ingress_controller_nginx_process_connections_total{state="handled"}[5m]))) by (instance)`</td></tr></table> |
| Summary | <table><tr><td>reading</td><td>`sum(nginx_ingress_controller_nginx_process_connections{state="reading"})`</td></tr><tr><td>waiting</td><td>`sum(nginx_ingress_controller_nginx_process_connections{state="waiting"})`</td></tr><tr><td>writing</td><td>`sum(nginx_ingress_controller_nginx_process_connections{state="writing"})`</td></tr><tr><td>accepted</td><td>`sum(ceil(increase(nginx_ingress_controller_nginx_process_connections_total{state="accepted"}[5m])))`</td></tr><tr><td>active</td><td>`sum(ceil(increase(nginx_ingress_controller_nginx_process_connections_total{state="active"}[5m])))`</td></tr><tr><td>handled</td><td>`sum(ceil(increase(nginx_ingress_controller_nginx_process_connections_total{state="handled"}[5m])))`</td></tr></table> |
### Ingress Controller Request Process Time
| Catalog | Expression |
| --- | --- |
| Detail | `topk(10, histogram_quantile(0.95,sum by (le, host, path)(rate(nginx_ingress_controller_request_duration_seconds_bucket{host!="_"}[5m]))))` |
| Summary | `topk(10, histogram_quantile(0.95,sum by (le, host)(rate(nginx_ingress_controller_request_duration_seconds_bucket{host!="_"}[5m]))))` |
# Rancher Logging Metrics
### Fluentd Buffer Queue Rate
| Catalog | Expression |
| --- | --- |
| Detail | `sum(rate(fluentd_output_status_buffer_queue_length[5m])) by (instance)` |
| Summary | `sum(rate(fluentd_output_status_buffer_queue_length[5m]))` |
### Fluentd Input Rate
| Catalog | Expression |
| --- | --- |
| Detail | `sum(rate(fluentd_input_status_num_records_total[5m])) by (instance)` |
| Summary | `sum(rate(fluentd_input_status_num_records_total[5m]))` |
### Fluentd Output Errors Rate
| Catalog | Expression |
| --- | --- |
| Detail | `sum(rate(fluentd_output_status_num_errors[5m])) by (type)` |
| Summary | `sum(rate(fluentd_output_status_num_errors[5m]))` |
### Fluentd Output Rate
| Catalog | Expression |
| --- | --- |
| Detail | `sum(rate(fluentd_output_status_num_records_total[5m])) by (instance)` |
| Summary | `sum(rate(fluentd_output_status_num_records_total[5m]))` |
# Workload Metrics
### Workload CPU Utilization
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>cfs throttled seconds</td><td>`sum(rate(container_cpu_cfs_throttled_seconds_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)`</td></tr><tr><td>user seconds</td><td>`sum(rate(container_cpu_user_seconds_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)`</td></tr><tr><td>system seconds</td><td>`sum(rate(container_cpu_system_seconds_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)`</td></tr><tr><td>usage seconds</td><td>`sum(rate(container_cpu_usage_seconds_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)`</td></tr></table> |
| Summary | <table><tr><td>cfs throttled seconds</td><td>`sum(rate(container_cpu_cfs_throttled_seconds_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))`</td></tr><tr><td>user seconds</td><td>`sum(rate(container_cpu_user_seconds_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))`</td></tr><tr><td>system seconds</td><td>`sum(rate(container_cpu_system_seconds_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))`</td></tr><tr><td>usage seconds</td><td>`sum(rate(container_cpu_usage_seconds_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))`</td></tr></table> |
### Workload Memory Utilization
| Catalog | Expression |
| --- | --- |
| Detail | `sum(container_memory_working_set_bytes{namespace="$namespace",pod_name=~"$podName", container_name!=""}) by (pod_name)` |
| Summary | `sum(container_memory_working_set_bytes{namespace="$namespace",pod_name=~"$podName", container_name!=""})` |
### Workload Network Packets
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>receive-packets</td><td>`sum(rate(container_network_receive_packets_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)`</td></tr><tr><td>receive-dropped</td><td>`sum(rate(container_network_receive_packets_dropped_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)`</td></tr><tr><td>receive-errors</td><td>`sum(rate(container_network_receive_errors_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)`</td></tr><tr><td>transmit-packets</td><td>`sum(rate(container_network_transmit_packets_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)`</td></tr><tr><td>transmit-dropped</td><td>`sum(rate(container_network_transmit_packets_dropped_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)`</td></tr><tr><td>transmit-errors</td><td>`sum(rate(container_network_transmit_errors_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)`</td></tr></table> |
| Summary | <table><tr><td>receive-packets</td><td>`sum(rate(container_network_receive_packets_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))`</td></tr><tr><td>receive-dropped</td><td>`sum(rate(container_network_receive_packets_dropped_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))`</td></tr><tr><td>receive-errors</td><td>`sum(rate(container_network_receive_errors_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))`</td></tr><tr><td>transmit-packets</td><td>`sum(rate(container_network_transmit_packets_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))`</td></tr><tr><td>transmit-dropped</td><td>`sum(rate(container_network_transmit_packets_dropped_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))`</td></tr><tr><td>transmit-errors</td><td>`sum(rate(container_network_transmit_errors_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))`</td></tr></table> |
### Workload Network I/O
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>receive</td><td>`sum(rate(container_network_receive_bytes_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)`</td></tr><tr><td>transmit</td><td>`sum(rate(container_network_transmit_bytes_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)`</td></tr></table> |
| Summary | <table><tr><td>receive</td><td>`sum(rate(container_network_receive_bytes_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))`</td></tr><tr><td>transmit</td><td>`sum(rate(container_network_transmit_bytes_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))`</td></tr></table> |
### Workload Disk I/O
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>read</td><td>`sum(rate(container_fs_reads_bytes_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)`</td></tr><tr><td>write</td><td>`sum(rate(container_fs_writes_bytes_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)`</td></tr></table> |
| Summary | <table><tr><td>read</td><td>`sum(rate(container_fs_reads_bytes_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))`</td></tr><tr><td>write</td><td>`sum(rate(container_fs_writes_bytes_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))`</td></tr></table> |
# Pod Metrics
### Pod CPU Utilization
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>cfs throttled seconds</td><td>`sum(rate(container_cpu_cfs_throttled_seconds_total{container_name!="POD",namespace="$namespace",pod_name="$podName", container_name!=""}[5m])) by (container_name)`</td></tr><tr><td>usage seconds</td><td>`sum(rate(container_cpu_usage_seconds_total{container_name!="POD",namespace="$namespace",pod_name="$podName", container_name!=""}[5m])) by (container_name)`</td></tr><tr><td>system seconds</td><td>`sum(rate(container_cpu_system_seconds_total{container_name!="POD",namespace="$namespace",pod_name="$podName", container_name!=""}[5m])) by (container_name)`</td></tr><tr><td>user seconds</td><td>`sum(rate(container_cpu_user_seconds_total{container_name!="POD",namespace="$namespace",pod_name="$podName", container_name!=""}[5m])) by (container_name)`</td></tr></table> |
| Summary | <table><tr><td>cfs throttled seconds</td><td>`sum(rate(container_cpu_cfs_throttled_seconds_total{container_name!="POD",namespace="$namespace",pod_name="$podName", container_name!=""}[5m]))`</td></tr><tr><td>usage seconds</td><td>`sum(rate(container_cpu_usage_seconds_total{container_name!="POD",namespace="$namespace",pod_name="$podName", container_name!=""}[5m]))`</td></tr><tr><td>system seconds</td><td>`sum(rate(container_cpu_system_seconds_total{container_name!="POD",namespace="$namespace",pod_name="$podName", container_name!=""}[5m]))`</td></tr><tr><td>user seconds</td><td>`sum(rate(container_cpu_user_seconds_total{container_name!="POD",namespace="$namespace",pod_name="$podName", container_name!=""}[5m]))`</td></tr></table> |
### Pod Memory Utilization
| Catalog | Expression |
| --- | --- |
| Detail | `sum(container_memory_working_set_bytes{container_name!="POD",namespace="$namespace",pod_name="$podName",container_name!=""}) by (container_name)` |
| Summary | `sum(container_memory_working_set_bytes{container_name!="POD",namespace="$namespace",pod_name="$podName",container_name!=""})` |
### Pod Network Packets
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>receive-packets</td><td>`sum(rate(container_network_receive_packets_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr><tr><td>receive-dropped</td><td>`sum(rate(container_network_receive_packets_dropped_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr><tr><td>receive-errors</td><td>`sum(rate(container_network_receive_errors_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr><tr><td>transmit-packets</td><td>`sum(rate(container_network_transmit_packets_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr><tr><td>transmit-dropped</td><td>`sum(rate(container_network_transmit_packets_dropped_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr><tr><td>transmit-errors</td><td>`sum(rate(container_network_transmit_errors_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr></table> |
| Summary | <table><tr><td>receive-packets</td><td>`sum(rate(container_network_receive_packets_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr><tr><td>receive-dropped</td><td>`sum(rate(container_network_receive_packets_dropped_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr><tr><td>receive-errors</td><td>`sum(rate(container_network_receive_errors_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr><tr><td>transmit-packets</td><td>`sum(rate(container_network_transmit_packets_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr><tr><td>transmit-dropped</td><td>`sum(rate(container_network_transmit_packets_dropped_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr><tr><td>transmit-errors</td><td>`sum(rate(container_network_transmit_errors_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr></table> |
### Pod Network I/O
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>receive</td><td>`sum(rate(container_network_receive_bytes_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr><tr><td>transmit</td><td>`sum(rate(container_network_transmit_bytes_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr></table> |
| Summary | <table><tr><td>receive</td><td>`sum(rate(container_network_receive_bytes_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr><tr><td>transmit</td><td>`sum(rate(container_network_transmit_bytes_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr></table> |
### Pod Disk I/O
| Catalog | Expression |
| --- | --- |
| Detail | <table><tr><td>read</td><td>`sum(rate(container_fs_reads_bytes_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m])) by (container_name)`</td></tr><tr><td>write</td><td>`sum(rate(container_fs_writes_bytes_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m])) by (container_name)`</td></tr></table> |
| Summary | <table><tr><td>read</td><td>`sum(rate(container_fs_reads_bytes_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr><tr><td>write</td><td>`sum(rate(container_fs_writes_bytes_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))`</td></tr></table> |
# Container Metrics
### Container CPU Utilization
| Catalog | Expression |
| --- | --- |
| cfs throttled seconds | `sum(rate(container_cpu_cfs_throttled_seconds_total{namespace="$namespace",pod_name="$podName",container_name="$containerName"}[5m]))` |
| usage seconds | `sum(rate(container_cpu_usage_seconds_total{namespace="$namespace",pod_name="$podName",container_name="$containerName"}[5m]))` |
| system seconds | `sum(rate(container_cpu_system_seconds_total{namespace="$namespace",pod_name="$podName",container_name="$containerName"}[5m]))` |
| user seconds | `sum(rate(container_cpu_user_seconds_total{namespace="$namespace",pod_name="$podName",container_name="$containerName"}[5m]))` |
### Container Memory Utilization
`sum(container_memory_working_set_bytes{namespace="$namespace",pod_name="$podName",container_name="$containerName"})`
### Container Disk I/O
| Catalog | Expression |
| --- | --- |
| read | `sum(rate(container_fs_reads_bytes_total{namespace="$namespace",pod_name="$podName",container_name="$containerName"}[5m]))` |
| write | `sum(rate(container_fs_writes_bytes_total{namespace="$namespace",pod_name="$podName",container_name="$containerName"}[5m]))` |
@@ -50,8 +50,7 @@ When upgrading the Kubernetes version of a cluster, we recommend that you:
1. Take a snapshot.
1. Initiate a Kubernetes upgrade.
1. If the upgrade fails, revert the cluster to the pre-upgrade Kubernetes version. Before restoring the cluster from the snapshot in the etcd datastore, the cluster should be running the pre-upgrade Kubernetes version.
1. Restore the cluster from the etcd snapshot.
1. If the upgrade fails, revert the cluster to the pre-upgrade Kubernetes version. This is achieved by selecting the **Restore etcd and Kubernetes version** option. This will return your cluster to the pre-upgrade Kubernetes version before restoring the etcd snapshot.
The restore operation will work on a cluster that is not in a healthy or active state.
{{% /tab %}}
@@ -158,4 +157,4 @@ A failed node could be in many different states:
- User drains a node while upgrade is in process, so there are no kubelets on the node
- The upgrade itself failed
If the max unavailable number of nodes is reached during an upgrade, Rancher user clusters will be stuck in updating state and not move forward with upgrading any other control plane nodes. It will continue to evaluate the set of unavailable nodes in case one of the nodes becomes available. If the node cannot be fixed, you must remove the node in order to continue the upgrade.
If the max unavailable number of nodes is reached during an upgrade, Rancher user clusters will be stuck in updating state and not move forward with upgrading any other control plane nodes. It will continue to evaluate the set of unavailable nodes in case one of the nodes becomes available. If the node cannot be fixed, you must remove the node in order to continue the upgrade.
@@ -0,0 +1,22 @@
---
headless: true
---
| Action | [Rancher launched Kubernetes Clusters]({{<baseurl>}}/rancher/v2.x/en/cluster-provisioning/rke-clusters/) | [Hosted Kubernetes Clusters]({{<baseurl>}}/rancher/v2.x/en/cluster-provisioning/hosted-kubernetes-clusters/) | [Imported Clusters]({{<baseurl>}}/rancher/v2.x/en/cluster-provisioning/imported-clusters) |
| --- | --- | ---| ---|
| [Using kubectl and a kubeconfig file to Access a Cluster]({{<baseurl>}}/rancher/v2.x/en/cluster-admin/cluster-access/kubectl/) | ✓ | ✓ | ✓ |
| [Managing Cluster Members]({{<baseurl>}}/rancher/v2.x/en/cluster-admin/cluster-access/cluster-members/) | ✓ | ✓ | ✓ |
| [Editing and Upgrading Clusters]({{<baseurl>}}/rancher/v2.x/en/cluster-admin/editing-clusters/) | ✓ | ✓ | * |
| [Managing Nodes]({{<baseurl>}}/rancher/v2.x/en/cluster-admin/nodes) | ✓ | ✓ | ✓ |
| [Managing Persistent Volumes and Storage Classes]({{<baseurl>}}/rancher/v2.x/en/cluster-admin/volumes-and-storage/) | ✓ | ✓ | ✓ |
| [Managing Projects, Namespaces and Workloads]({{<baseurl>}}/rancher/v2.x/en/cluster-admin/projects-and-namespaces/) | ✓ | ✓ | ✓ |
| [Using App Catalogs]({{<baseurl>}}/rancher/v2.x/en/catalog/) | ✓ | ✓ | ✓ |
| [Configuring Tools (Alerts, Notifiers, Logging, Monitoring, Istio)]({{<baseurl>}}/rancher/v2.x/en/cluster-admin/tools/) | ✓ | ✓ | ✓ |
| [Cloning Clusters]({{<baseurl>}}/rancher/v2.x/en/cluster-admin/cloning-clusters/)| ✓ | ✓ | |
| [Ability to rotate certificates]({{<baseurl>}}/rancher/v2.x/en/cluster-admin/certificate-rotation/) | ✓ | | |
| [Ability to back up your Kubernetes Clusters]({{<baseurl>}}/rancher/v2.x/en/cluster-admin/backing-up-etcd/) | ✓ | | |
| [Ability to recover and restore etcd]({{<baseurl>}}/rancher/v2.x/en/cluster-admin/restoring-etcd/) | ✓ | | |
| [Cleaning Kubernetes components when clusters are no longer reachable from Rancher]({{<baseurl>}}/rancher/v2.x/en/cluster-admin/cleaning-cluster-nodes/) | ✓ | | |
| [Configuring Pod Security Policies]({{<baseurl>}}/rancher/v2.x/en/cluster-admin/pod-security-policy/) | ✓ | | |
| [Running Security Scans]({{<baseurl>}}/rancher/v2.x/en/security/security-scan/) | ✓ | | |
\* Cluster configuration options can't be edited for imported clusters, except for [K3s clusters.]({{<baseurl>}}/rancher/v2.x/en/cluster-provisioning/imported-clusters/#additional-features-for-imported-k3s-clusters)
@@ -6,6 +6,18 @@ aliases:
- /rancher/v2.x/en/tasks/clusters/creating-a-cluster/create-cluster-azure/
---
In this section, you'll learn how to set up a Kubernetes cluster in Azure through Rancher. During this process, Rancher will provision new nodes in Azure.
- [Creating an Azure Cluster](#creating-an-azure-cluster)
- [Creating an Azure Node Template](#creating-an-azure-node-template)
- [Preparation in Azure](#preparation-in-azure)
- [Creating the Template](#creating-the-template)
- [Template Configuration](#template-configuration)
# Creating an Azure Cluster
> **Prerequisite:** Before Rancher can create a cluster in Azure, a node template needs to be created using your Azure credentials and configuration. For details, see [this section.](#creating-an-azure-node-template)
Use {{< product >}} to create a Kubernetes cluster in Azure.
1. From the **Clusters** page, click **Add Cluster**.
@@ -16,35 +28,62 @@ Use {{< product >}} to create a Kubernetes cluster in Azure.
4. {{< step_create-cluster_member-roles >}}
5. {{< step_create-cluster_cluster-options >}}
5. {{< step_create-cluster_cluster-options >}} For more information, see the [cluster configuration reference.](../../options)
6. {{< step_create-cluster_node-pools >}}
1. Click **Add Node Template**.
7. **Optional:** Add additional node pools.
2. Complete the **Azure Options** form.
- **Account Access** stores your account information for authenticating with Azure. Note: As of v2.2.0, account access information is stored as a cloud credential. Cloud credentials are stored as Kubernetes secrets. Multiple node templates can use the same cloud credential. You can use an existing cloud credential or create a new one. To create a new cloud credential, enter **Name** and **Account Access** data, then click **Create.**
- **Placement** sets the geographical region where your cluster is hosted and other location metadata.
- **Network** configures the networking used in your cluster.
- **Instance** customizes your VM configuration.
3. {{< step_rancher-template >}}
4. Click **Create**.
5. **Optional:** Add additional node pools.
<br>
7. Review your options to confirm they're correct. Then click **Create**.
8. Review your options to confirm they're correct. Then click **Create**.
{{< result_create-cluster >}}
# Optional Next Steps
### Optional Next Steps
After creating your cluster, you can access it through the Rancher UI. As a best practice, we recommend setting up these alternate ways of accessing your cluster:
- **Access your cluster with the kubectl CLI:** Follow [these steps]({{<baseurl>}}/rancher/v2.x/en/cluster-admin/cluster-access/kubectl/#accessing-clusters-with-kubectl-on-your-workstation) to access clusters with kubectl on your workstation. In this case, you will be authenticated through the Rancher server's authentication proxy, then Rancher will connect you to the downstream cluster. This method lets you manage the cluster without the Rancher UI.
- **Access your cluster with the kubectl CLI, using the authorized cluster endpoint:** Follow [these steps]({{<baseurl>}}/rancher/v2.x/en/cluster-admin/cluster-access/kubectl/#authenticating-directly-with-a-downstream-cluster) to access your cluster with kubectl directly, without authenticating through Rancher. We recommend setting up this alternative method to access your cluster so that in case you can't connect to Rancher, you can still access the cluster.
- **Access your cluster with the kubectl CLI, using the authorized cluster endpoint:** Follow [these steps]({{<baseurl>}}/rancher/v2.x/en/cluster-admin/cluster-access/kubectl/#authenticating-directly-with-a-downstream-cluster) to access your cluster with kubectl directly, without authenticating through Rancher. We recommend setting up this alternative method to access your cluster so that in case you can't connect to Rancher, you can still access the cluster.
# Creating an Azure Node Template
Creating a node template for Azure will allow Rancher to provision new nodes when it sets up a Kubernetes cluster in Azure.
### Preparation in Azure
Before creating a **node template** in Rancher using a cloud infrastructure such as Azure, we must configure Rancher to allow the manipulation of resources in an Azure subscription.
To do this, we will first create a new Azure **service principal (SP)** in Azure **Active Directory (AD)**, which, in Azure, is an application user who has permission to manage Azure resources.
The following is a template `az cli` script that you can run to create the service principal; fill in your SP name, role, and scope:
```
az ad sp create-for-rbac --name="<Rancher ServicePrincipal name>" --role="Contributor" --scopes="/subscriptions/<subscription Id>"
```
The creation of this service principal returns three pieces of identification information: the application ID (also called the client ID), the client secret, and the tenant ID. This information will be used in the following section when adding the **node template**.
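If you prefer to capture those three values directly from the command output, one option is to use the Azure CLI's built-in `--query` (JMESPath) support. This is only a sketch that reuses the placeholders from the script above; adjust it to your environment:

```
# Create the service principal and print only the values needed for the node template:
# the client ID (appId), the client secret (password), and the tenant ID.
az ad sp create-for-rbac \
  --name="<Rancher ServicePrincipal name>" \
  --role="Contributor" \
  --scopes="/subscriptions/<subscription Id>" \
  --query "{clientId: appId, clientSecret: password, tenantId: tenant}" \
  --output json
```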
### Creating the Template
1. Click **Add Node Template**.
1. Complete the **Azure Options** form. For help filling out the form, refer to [Configuration](#azure-node-template-configuration) below.
1. Click **Create**.
**Result:** The node template can be used during the cluster creation process.
### Template Configuration
- **Account Access** stores your account information for authenticating with Azure. Note: As of v2.2.0, account access information is stored as a cloud credential. Cloud credentials are stored as Kubernetes secrets. Multiple node templates can use the same cloud credential. You can use an existing cloud credential or create a new one. To create a new cloud credential, enter **Name** and **Account Access** data, then click **Create.**
- **Placement** sets the geographical region where your cluster is hosted and other location metadata.
- **Network** configures the networking used in your cluster.
- **Instance** customizes your VM configuration.
{{< step_rancher-template >}}
@@ -15,6 +15,8 @@ Use Rancher to create a Kubernetes cluster in Amazon EC2.
- [Example IAM Policy to allow encrypted EBS volumes](#example-iam-policy-to-allow-encrypted-ebs-volumes)
- **IAM Policy added as Permission** to the user. See [Amazon Documentation: Adding Permissions to a User (Console)](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_change-permissions.html#users_change_permissions-add-console) how to attach it to an user.
> **Note:** Rancher v2.4.6 and v2.4.7 had an issue where the `kms:ListKeys` permission was required to create, edit, or clone Amazon EC2 node templates. This requirement was removed in v2.4.8.
# Creating an EC2 Cluster
The steps to create a cluster differ based on your Rancher version.
@@ -117,6 +119,10 @@ After creating your cluster, you can access it through the Rancher UI. As a best
- **Access your cluster with the kubectl CLI:** Follow [these steps]({{<baseurl>}}/rancher/v2.x/en/cluster-admin/cluster-access/kubectl/#accessing-clusters-with-kubectl-on-your-workstation) to access clusters with kubectl on your workstation. In this case, you will be authenticated through the Rancher server's authentication proxy, then Rancher will connect you to the downstream cluster. This method lets you manage the cluster without the Rancher UI.
- **Access your cluster with the kubectl CLI, using the authorized cluster endpoint:** Follow [these steps]({{<baseurl>}}/rancher/v2.x/en/cluster-admin/cluster-access/kubectl/#authenticating-directly-with-a-downstream-cluster) to access your cluster with kubectl directly, without authenticating through Rancher. We recommend setting up this alternative method to access your cluster so that in case you can't connect to Rancher, you can still access the cluster.
# IAM Policies
> **Note:** Rancher v2.4.6 and v2.4.7 had an issue where the `kms:ListKeys` permission was required to create, edit, or clone Amazon EC2 node templates. This requirement was removed in v2.4.8.
### Example IAM Policy
```json
@@ -5,7 +5,11 @@ weight: 3
---
> **Prerequisite:**
> Set up the Rancher Kubernetes cluster. As of Rancher v2.5, Rancher can be installed on any Kubernetes cluster. This cluster can use upstream Kubernetes, or it can use one of Rancher's Kubernetes distributions, or it can be a managed Kubernetes cluster from a provider such as Amazon EKS.
> Set up the Rancher server's local Kubernetes cluster.
>
> - As of Rancher v2.5, Rancher can be installed on any Kubernetes cluster. This cluster can use upstream Kubernetes, or it can use one of Rancher's Kubernetes distributions, or it can be a managed Kubernetes cluster from a provider such as Amazon EKS.
> - In Rancher v2.4.x, Rancher needs to be installed on a K3s Kubernetes cluster or an RKE Kubernetes cluster.
> - In Rancher prior to v2.4, Rancher needs to be installed on an RKE Kubernetes cluster.
# Install the Rancher Helm Chart
@@ -0,0 +1,14 @@
---
title: Installing Rancher behind an HTTP Proxy
weight: 4
---
In many enterprise environments, servers or VMs running on premises do not have direct Internet access, but must connect to external services through an HTTP(S) proxy for security reasons. This tutorial shows, step by step, how to set up a highly available Rancher installation in such an environment.
Alternatively, it is also possible to set up Rancher completely air-gapped without any Internet access. This process is described in detail in the [Rancher docs]({{<baseurl>}}/rancher/v2.x/en/installation/other-installation-methods/air-gap/).
# Installation Outline
1. [Set up infrastructure]({{<baseurl>}}/rancher/v2.x/en/installation/other-installation-methods/behind-proxy/prepare-nodes/)
2. [Set up a Kubernetes cluster]({{<baseurl>}}/rancher/v2.x/en/installation/other-installation-methods/behind-proxy/launch-kubernetes/)
3. [Install Rancher]({{<baseurl>}}/rancher/v2.x/en/installation/other-installation-methods/behind-proxy/install-rancher/)
@@ -0,0 +1,86 @@
---
title: 3. Install Rancher
weight: 300
---
Now that you have a running RKE cluster, you can install Rancher in it. For security reasons all traffic to Rancher must be encrypted with TLS. For this tutorial you are going to automatically issue a self-signed certificate through [cert-manager](https://cert-manager.io/). In a real-world use-case you will likely use Let's Encrypt or provide your own certificate. For more details see [SSL configuration]({{<baseurl>}}/rancher/v2.x/en/installation/k8s-install/helm-rancher/#4-choose-your-ssl-configuration).
> **Note:** These installation instructions assume you are using Helm 3.
### Install cert-manager
Add the cert-manager helm repository:
```
helm repo add jetstack https://charts.jetstack.io
```
Create a namespace for cert-manager:
```
kubectl create namespace cert-manager
```
Install the CustomResourceDefinitions of cert-manager:
```
kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v0.15.2/cert-manager.crds.yaml
```
And install it with Helm. Note that cert-manager also needs your proxy configured in case it needs to communicate with Let's Encrypt or other external certificate issuers:
```
helm upgrade --install cert-manager jetstack/cert-manager \
--namespace cert-manager --version v0.15.2 \
--set http_proxy=http://${proxy_host} \
--set https_proxy=http://${proxy_host} \
--set no_proxy=127.0.0.0/8\\,10.0.0.0/8\\,172.16.0.0/12\\,192.168.0.0/16
```
Now you should wait until cert-manager is finished starting up:
```
kubectl rollout status deployment -n cert-manager cert-manager
kubectl rollout status deployment -n cert-manager cert-manager-webhook
```
### Install Rancher
Next you can install Rancher itself. First add the helm repository:
```
helm repo add rancher-latest https://releases.rancher.com/server-charts/latest
```
Create a namespace:
```
kubectl create namespace cattle-system
```
And install Rancher with Helm. Rancher also needs a proxy configuration so that it can communicate with external application catalogs or retrieve Kubernetes version update metadata:
```
helm upgrade --install rancher rancher-latest/rancher \
--namespace cattle-system \
--set hostname=rancher.example.com \
--set proxy=http://${proxy_host}
```
Wait for the deployment to finish rolling out:
```
kubectl rollout status deployment -n cattle-system rancher
```
You can now navigate to `https://rancher.example.com` and start using Rancher.
> **Note:** If you don't intend to send telemetry data, opt out of [telemetry]({{<baseurl>}}/rancher/v2.x/en/faq/telemetry/) during the initial login. Leaving this active in an air-gapped environment can cause issues if the sockets cannot be opened successfully.
### Additional Resources
These resources could be helpful when installing Rancher:
- [Rancher Helm chart options]({{<baseurl>}}/rancher/v2.x/en/installation/options/chart-options/)
- [Adding TLS secrets]({{<baseurl>}}/rancher/v2.x/en/installation/options/tls-secrets/)
- [Troubleshooting Rancher Kubernetes Installations]({{<baseurl>}}/rancher/v2.x/en/installation/options/troubleshooting/)
@@ -0,0 +1,151 @@
---
title: '2. Install Kubernetes'
weight: 200
---
Once the infrastructure is ready, you can continue with setting up an RKE cluster to install Rancher in.
### Installing Docker
First, you have to install Docker and set up the HTTP proxy on all three Linux nodes. To do so, perform the following steps on all three nodes.
For convenience, export the IP address and port of your proxy into an environment variable and set up the HTTP_PROXY variables for your current shell:
```
export proxy_host="10.0.0.5:8888"
export HTTP_PROXY=http://${proxy_host}
export HTTPS_PROXY=http://${proxy_host}
export NO_PROXY=127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16
```
Next configure apt to use this proxy when installing packages. If you are not using Ubuntu, you have to adapt this step accordingly:
```
cat <<EOF | sudo tee /etc/apt/apt.conf.d/proxy.conf > /dev/null
Acquire::http::Proxy "http://${proxy_host}/";
Acquire::https::Proxy "http://${proxy_host}/";
EOF
```
Now you can install Docker:
```
curl -sL https://releases.rancher.com/install-docker/19.03.sh | sh
```
Then ensure that your current user is able to access the Docker daemon without sudo:
```
sudo usermod -aG docker YOUR_USERNAME
```
And configure the Docker daemon to use the proxy to pull images:
```
sudo mkdir -p /etc/systemd/system/docker.service.d
cat <<EOF | sudo tee /etc/systemd/system/docker.service.d/http-proxy.conf > /dev/null
[Service]
Environment="HTTP_PROXY=http://${proxy_host}"
Environment="HTTPS_PROXY=http://${proxy_host}"
Environment="NO_PROXY=127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16"
EOF
```
To apply the configuration, restart the Docker daemon:
```
sudo systemctl daemon-reload
sudo systemctl restart docker
```
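Optionally, you can verify that the daemon actually picked up the proxy settings. The following commands should show the configured proxy:

```
# The Environment= entries from the systemd drop-in should appear here.
sudo systemctl show --property=Environment docker

# "docker info" also reports the configured HTTP/HTTPS proxy.
docker info | grep -i proxy
```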
### Creating the RKE Cluster
To create and interact with the cluster, you need several command line tools on the host from which you have SSH access to the Linux nodes:
* [RKE CLI binary]({{<baseurl>}}/rke/latest/en/installation/#download-the-rke-binary)
```
sudo curl -fsSL -o /usr/local/bin/rke https://github.com/rancher/rke/releases/download/v1.1.4/rke_linux-amd64
sudo chmod +x /usr/local/bin/rke
```
* [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/)
```
curl -LO "https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl"
chmod +x ./kubectl
sudo mv ./kubectl /usr/local/bin/kubectl
```
* [helm](https://helm.sh/docs/intro/install/)
```
curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3
chmod +x get_helm.sh
sudo ./get_helm.sh
```
Next, create a YAML file that describes the RKE cluster. Ensure that the IP addresses of the nodes and the SSH username are correct. For more information on the cluster YAML, have a look at the [RKE documentation]({{<baseurl>}}/rke/latest/en/example-yamls/).
```
nodes:
- address: 10.0.1.200
user: ubuntu
role: [controlplane,worker,etcd]
- address: 10.0.1.201
user: ubuntu
role: [controlplane,worker,etcd]
- address: 10.0.1.202
user: ubuntu
role: [controlplane,worker,etcd]
services:
etcd:
backup_config:
interval_hours: 12
retention: 6
```
After that, you can create the Kubernetes cluster by running:
```
rke up --config rancher-cluster.yaml
```
RKE creates a state file called `rancher-cluster.rkestate`, which is needed if you want to perform updates, modify your cluster configuration, or restore it from a backup. It also creates a `kube_config_rancher-cluster.yaml` file that you can use to connect to the remote Kubernetes cluster locally with tools like kubectl or Helm. Make sure to save all of these files in a secure location, for example by putting them into a version control system.
To have a look at your cluster run:
```
export KUBECONFIG=kube_config_rancher-cluster.yaml
kubectl cluster-info
kubectl get pods --all-namespaces
```
You can also verify that your external load balancer works and that the DNS entry is set up correctly. If you send a request to either, you should receive an HTTP 404 response from the ingress controller:
```
$ curl 10.0.1.100
default backend - 404
$ curl rancher.example.com
default backend - 404
```
### Save Your Files
> **Important**
> The files mentioned below are needed to maintain, troubleshoot and upgrade your cluster.
Save a copy of the following files in a secure location:
- `rancher-cluster.yaml`: The RKE cluster configuration file.
- `kube_config_rancher-cluster.yaml`: The [Kubeconfig file]({{<baseurl>}}/rke/latest/en/kubeconfig/) for the cluster; this file contains credentials for full access to the cluster.
- `rancher-cluster.rkestate`: The [Kubernetes Cluster State file]({{<baseurl>}}/rke/latest/en/installation/#kubernetes-cluster-state); this file contains the current state of the cluster, including the RKE configuration and the certificates.
> **Note:** The "rancher-cluster" parts of the latter two file names depend on how you name the RKE cluster configuration file.
### Issues or errors?
See the [Troubleshooting]({{<baseurl>}}/rancher/v2.x/en/installation/options/troubleshooting/) page.
### [Next: Install Rancher](../install-rancher)
@@ -0,0 +1,61 @@
---
title: '1. Set up Infrastructure'
weight: 100
---
In this section, you will provision the underlying infrastructure for your Rancher management server with internet access through an HTTP proxy.
To install the Rancher management server on a high-availability RKE cluster, we recommend setting up the following infrastructure:
- **Three Linux nodes,** typically virtual machines, in an infrastructure provider such as Amazon's EC2, Google Compute Engine, or vSphere.
- **A load balancer** to direct front-end traffic to the three nodes.
- **A DNS record** to map a URL to the load balancer. This will become the Rancher server URL, and downstream Kubernetes clusters will need to reach it.
These nodes must be in the same region/data center. You may place these servers in separate availability zones.
### Why three nodes?
In an RKE cluster, Rancher server data is stored on etcd. This etcd database runs on all three nodes.
The etcd database requires an odd number of nodes so that it can always elect a leader with a majority of the etcd cluster. If the etcd database cannot elect a leader, etcd can suffer from [split brain](https://www.quora.com/What-is-split-brain-in-distributed-systems), requiring the cluster to be restored from backup. If one of the three etcd nodes fails, the two remaining nodes can elect a leader because they have the majority of the total number of etcd nodes.
### 1. Set up Linux Nodes
These hosts will connect to the internet through an HTTP proxy.
Make sure that your nodes fulfill the general installation requirements for [OS, container runtime, hardware, and networking.]({{<baseurl>}}/rancher/v2.x/en/installation/requirements/)
For an example of one way to set up Linux nodes, refer to this [tutorial]({{<baseurl>}}/rancher/v2.x/en/installation/options/ec2-node) for setting up nodes as instances in Amazon EC2.
### 2. Set up the Load Balancer
You will also need to set up a load balancer to direct traffic to the Rancher replicas on all three nodes. That will prevent an outage of any single node from taking down communications to the Rancher management server.
When Kubernetes gets set up in a later step, the RKE tool will deploy an NGINX Ingress controller. This controller will listen on ports 80 and 443 of the worker nodes, answering traffic destined for specific hostnames.
When Rancher is installed (also in a later step), the Rancher system creates an Ingress resource. That Ingress tells the NGINX Ingress controller to listen for traffic destined for the Rancher hostname. The NGINX Ingress controller, when receiving traffic destined for the Rancher hostname, will forward that traffic to the running Rancher pods in the cluster.
For your implementation, consider if you want or need to use a Layer-4 or Layer-7 load balancer:
- **A layer-4 load balancer** is the simpler of the two choices, in which you are forwarding TCP traffic to your nodes. We recommend configuring your load balancer as a Layer 4 balancer, forwarding traffic to ports TCP/80 and TCP/443 to the Rancher management cluster nodes. The Ingress controller on the cluster will redirect HTTP traffic to HTTPS and terminate SSL/TLS on port TCP/443. The Ingress controller will forward traffic to port TCP/80 to the Ingress pod in the Rancher deployment.
- **A layer-7 load balancer** is a bit more complicated but can offer features that you may want. For instance, a layer-7 load balancer is capable of handling TLS termination at the load balancer, as opposed to Rancher doing TLS termination itself. This can be beneficial if you want to centralize your TLS termination in your infrastructure. Layer-7 load balancing also offers the capability for your load balancer to make decisions based on HTTP attributes such as cookies, etc. that a layer-4 load balancer is not able to concern itself with. If you decide to terminate the SSL/TLS traffic on a layer-7 load balancer, you will need to use the `--set tls=external` option when installing Rancher in a later step. For more information, refer to the [Rancher Helm chart options.]({{<baseurl>}}/rancher/v2.x/en/installation/options/chart-options/#external-tls-termination)
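If you do choose external TLS termination, the Rancher install command in the later step would carry the corresponding chart option. The following is only a sketch with a placeholder hostname; the full set of options is covered in the linked chart options page:

```
helm upgrade --install rancher rancher-latest/rancher \
  --namespace cattle-system \
  --set hostname=rancher.example.com \
  --set tls=external
```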
For an example showing how to set up an NGINX load balancer, refer to [this page.]({{<baseurl>}}/rancher/v2.x/en/installation/options/nginx/)
For a how-to guide for setting up an Amazon ELB Network Load Balancer, refer to [this page.]({{<baseurl>}}/rancher/v2.x/en/installation/options/nlb/)
> **Important:**
> Do not use this load balancer (i.e., the `local` cluster Ingress) to load balance applications other than Rancher following installation. Sharing this Ingress with other applications may result in websocket errors to Rancher following Ingress configuration reloads for other apps. We recommend dedicating the `local` cluster to Rancher and no other applications.
### 3. Set up the DNS Record
Once you have set up your load balancer, you will need to create a DNS record to send traffic to this load balancer.
Depending on your environment, this may be an A record pointing to the LB IP, or it may be a CNAME pointing to the load balancer hostname. In either case, make sure this record is the hostname that you intend Rancher to respond on.
You will need to specify this hostname in a later step when you install Rancher, and it is not possible to change it later. Make sure that your decision is a final one.
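Once the record has been created, you can quickly verify it from a workstation before moving on. Assuming the example hostname `rancher.example.com` used later in this tutorial:

```
# An A record should return the load balancer IP; a CNAME should return the load balancer hostname.
dig +short rancher.example.com
```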
For a how-to guide for setting up a DNS record to route domain traffic to an Amazon ELB load balancer, refer to the [official AWS documentation.](https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/routing-to-elb-load-balancer)
### [Next: Set up a Kubernetes cluster]({{<baseurl>}}/rancher/v2.x/en/installation/other-installation-methods/behind-proxy/launch-kubernetes/)
@@ -161,11 +161,11 @@ Placeholder | Description
```
docker run -d --volumes-from rancher-data \
--restart=unless-stopped \
- 80:80 -p 443:443 \
- /<CERT_DIRECTORY>/<FULL_CHAIN.pem>:/etc/rancher/ssl/cert.pem \
- /<CERT_DIRECTORY>/<PRIVATE_KEY.pem>:/etc/rancher/ssl/key.pem \
- /<CERT_DIRECTORY>/<CA_CERTS.pem>:/etc/rancher/ssl/cacerts.pem \
--privileged
-p 80:80 -p 443:443 \
-v /<CERT_DIRECTORY>/<FULL_CHAIN.pem>:/etc/rancher/ssl/cert.pem \
-v /<CERT_DIRECTORY>/<PRIVATE_KEY.pem>:/etc/rancher/ssl/key.pem \
-v /<CERT_DIRECTORY>/<CA_CERTS.pem>:/etc/rancher/ssl/cacerts.pem \
--privileged \
rancher/rancher:<RANCHER_VERSION_TAG>
```
@@ -188,9 +188,9 @@ Placeholder | Description
```
docker run -d --volumes-from rancher-data \
--restart=unless-stopped \
- 80:80 -p 443:443 \
- /<CERT_DIRECTORY>/<FULL_CHAIN.pem>:/etc/rancher/ssl/cert.pem \
- /<CERT_DIRECTORY>/<PRIVATE_KEY.pem>:/etc/rancher/ssl/key.pem \
-p 80:80 -p 443:443 \
-v /<CERT_DIRECTORY>/<FULL_CHAIN.pem>:/etc/rancher/ssl/cert.pem \
-v /<CERT_DIRECTORY>/<PRIVATE_KEY.pem>:/etc/rancher/ssl/key.pem \
rancher/rancher:<RANCHER_VERSION_TAG> \
--privileged \
--no-cacerts
@@ -0,0 +1,18 @@
---
title: Installing Docker
weight: 1
---
Docker must be installed on any node that runs the Rancher server.
There are a couple of options for installing Docker. One option is to refer to the [official Docker documentation](https://docs.docker.com/install/) about how to install Docker on Linux. The steps will vary based on the Linux distribution.
Another option is to use one of Rancher's Docker installation scripts, which are available for most recent versions of Docker.
For example, this command could be used to install Docker 19.03 on Ubuntu:
```
curl https://releases.rancher.com/install-docker/19.03.sh | sh
```
Rancher has installation scripts for every version of upstream Docker that Kubernetes supports. To find out whether a script is available for installing a certain Docker version, refer to this [GitHub repository,](https://github.com/rancher/install-docker) which contains all of Rancher's Docker installation scripts.
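After running one of the install scripts, it is worth confirming that the expected Docker version is installed and that the daemon is running, for example:

```
# Print the installed Docker version and check the daemon status.
docker --version
sudo systemctl status docker --no-pager
```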
@@ -148,7 +148,7 @@ If you have private registries, catalogs or a proxy that intercepts certificates
Once the Rancher deployment is created, copy your CA certs in pem format into a file named `ca-additional.pem` and use `kubectl` to create the `tls-ca-additional` secret in the `cattle-system` namespace.
```plain
kubectl -n cattle-system create secret generic tls-ca-additional --from-file=ca-additional.pem
kubectl -n cattle-system create secret generic tls-ca-additional --from-file=ca-additional.pem=./ca-additional.pem
```
### Private Registry and Air Gap Installs
@@ -26,11 +26,9 @@ If you are using a private CA, Rancher requires a copy of the CA certificate whi
Copy the CA certificate into a file named `cacerts.pem` and use `kubectl` to create the `tls-ca` secret in the `cattle-system` namespace.
>**Important:** Make sure the file is called `cacerts.pem` as Rancher uses that filename to configure the CA certificate.
```
kubectl -n cattle-system create secret generic tls-ca \
--from-file=cacerts.pem
--from-file=cacerts.pem=./cacerts.pem
```
> **Note:** The configured `tls-ca` secret is retrieved when Rancher starts. On a running Rancher installation the updated CA will take effect after new Rancher pods are started.
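One way to start new Rancher pods so that an updated `tls-ca` secret takes effect is to trigger a rolling restart of the deployment, for example:

```
# Restart the Rancher deployment so that new pods pick up the updated CA certificate.
kubectl -n cattle-system rollout restart deployment/rancher
kubectl -n cattle-system rollout status deployment/rancher
```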
@@ -1,6 +1,9 @@
---
title: Setting up Amazon ELB Network Load Balancer
weight: 5
aliases:
- /rancher/v2.x/en/installation/ha/create-nodes-lb/nlb
- /rancher/v2.x/en/installation/k8s-install/create-nodes-lb/nlb
---
This how-to guide describes how to set up a Network Load Balancer (NLB) in Amazon's EC2 service that will direct traffic to multiple instances on EC2.
@@ -15,6 +15,7 @@ The following table lists some of the most noteworthy issues to be considered wh
Upgrade Scenario | Issue
---|---
Upgrading to v2.4.6 or v2.4.7 | These Rancher versions had an issue where the `kms:ListKeys` permission was required to create, edit, or clone Amazon EC2 node templates. This requirement was removed in v2.4.8.
Upgrading to v2.3.0+ | Any user provisioned cluster will be automatically updated upon any edit as tolerations were added to the images used for Kubernetes provisioning.
Upgrading to v2.2.0-v2.2.x | Rancher introduced the [system charts](https://github.com/rancher/system-charts) repository which contains all the catalog items required for features such as monitoring, logging, alerting and global DNS. To be able to use these features in an air gap install, you will need to mirror the `system-charts` repository locally and configure Rancher to use that repository. Please follow the instructions to [configure Rancher system charts]({{<baseurl>}}/rancher/v2.x/en/installation/options/local-system-charts/#setting-up-system-charts-for-rancher-prior-to-v2-3-0).
Upgrading from v2.0.13 or earlier | If your cluster's certificates have expired, you will need to perform [additional steps]({{<baseurl>}}/rancher/v2.x/en/cluster-admin/certificate-rotation/#rotating-expired-certificates-after-upgrading-older-rancher-versions) to rotate the certificates.
@@ -16,6 +16,7 @@ If you installed Rancher using the RKE Add-on yaml, follow the directions to [mi
>
> - If you are upgrading to Rancher v2.5 from a Rancher server that was started with the Helm chart option `--add-local=false`, you will need to drop that flag when upgrading. Otherwise, the Rancher server will not start. The `restricted-admin` role can be used to continue restricting access to the local cluster. For more information, see [this section.]({{<baseurl>}}/rancher/v2.x/en/admin-settings/rbac/global-permissions/#upgrading-from-rancher-with-a-hidden-local-cluster)
> - [Let's Encrypt will be blocking cert-manager instances older than 0.8.0 starting November 1st 2019.](https://community.letsencrypt.org/t/blocking-old-cert-manager-versions/98753) Upgrade cert-manager to the latest version by following [these instructions.]({{<baseurl>}}/rancher/v2.x/en/installation/options/upgrading-cert-manager)
> - Helm should be run from the same location as your kubeconfig file (where you run your kubectl commands from; if you installed K8s with RKE, the config will have been created in the directory you ran `rke up` in), or it should manually target the kubeconfig for the intended cluster with the `--kubeconfig` flag (see: https://helm.sh/docs/helm/helm/)
> - The upgrade instructions assume you are using Helm 3. For migration of installs started with Helm 2, refer to the official [Helm 2 to 3 migration docs.](https://helm.sh/blog/migrate-from-helm-v2-to-helm-v3/) The [Helm 2 upgrade page here]({{<baseurl>}}/rancher/v2.x/en/upgrades/upgrades/ha/helm2) provides a copy of the older upgrade instructions that used Helm 2, and it is intended to be used if upgrading to Helm 3 is not feasible.
> - If you are upgrading Rancher from v2.x to v2.3+, and you are using external TLS termination, you will need to edit the cluster.yml to [enable using forwarded host headers.]({{<baseurl>}}/rancher/v2.x/en/installation/options/chart-options/#configuring-ingress-for-external-tls-when-using-nginx-v0-25)
@@ -102,7 +103,16 @@ helm upgrade rancher rancher-<CHART_REPO>/rancher \
--set hostname=rancher.my.org
```
> **Note:** There will be many more options from the previous step that need to be appended.
> **Note:** The above is an example, there may be more values from the previous step that need to be appended.
Alternatively, it's possible to reuse current values and make small changes with the `--reuse-values` flag. For example, to only change the Rancher version:
```
helm upgrade rancher rancher-<CHART_REPO>/rancher \
--namespace cattle-system \
--reuse-values \
--version=2.4.5
```
{{% /accordion %}}
@@ -22,7 +22,7 @@ Prometheus [CPU Reservation](https://kubernetes.io/docs/concepts/configuration/m
Prometheus [Memory Limit](https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#meaning-of-memory) | Memory resource limit for the Prometheus pod.
Prometheus [Memory Reservation](https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#meaning-of-memory) | Memory resource requests for the Prometheus pod.
Selector | Ability to select the nodes in which Prometheus and Grafana pods are deployed to. To use this option, the nodes must have labels.
Advanced Options | Since monitoring is an [application](https://github.com/rancher/system-charts/tree/dev/charts/rancher-monitoring) from the [Rancher catalog]({{<baseurl>}}/rancher/v2.x/en/catalog/), it can be [configured like other catalog application]({{<baseurl>}}/rancher/v2.x/en/catalog/apps/#configuration-options). _Warning: Any modification to the application without understanding the entire application can lead to catastrophic errors._
Advanced Options | Since monitoring is an [application](https://github.com/rancher/system-charts/tree/dev/charts/rancher-monitoring) from the [Rancher catalog]({{<baseurl>}}/rancher/v2.x/en/catalog/), it can be [configured like any other catalog application]({{<baseurl>}}/rancher/v2.x/en/catalog/catalog-config/). _Warning: Any modification to the application without understanding the entire application can lead to catastrophic errors._
## Node Exporter
@@ -20,3 +20,22 @@ Configure kubectl by visiting your cluster in the Rancher Web UI then clicking o
Run `kubectl cluster-info` or `kubectl get pods` successfully.
## Authentication with kubectl and kubeconfig Tokens with TTL
_**Available as of v2.4.6**_
_Requirements_
If admins have [enforced TTL on kubeconfig tokens](../../api/api-tokens/#setting-ttl-on-kubeconfig-tokens), the kubeconfig file requires the [Rancher CLI](../cli) to be present in your PATH when you run `kubectl`. Otherwise, you'll see an error like:
`Unable to connect to the server: getting credentials: exec: exec: "rancher": executable file not found in $PATH`.
This feature enables kubectl to authenticate with the Rancher server and get a new kubeconfig token when required. The following auth providers are currently supported:
1. Local
2. Active Directory
3. FreeIpa, OpenLdap
4. SAML providers - Ping, Okta, ADFS, Keycloak, Shibboleth
When you first run kubectl, for example, `kubectl get pods`, it will ask you to pick an auth provider and log in with the Rancher server.
The kubeconfig token is cached in the path where you run kubectl, under `./.cache/token`. This token is valid until [it expires](../../api/api-tokens/#expiration-period) or [is deleted from the Rancher server](../../api/api-tokens/#deleting-tokens).
Upon expiration, the next `kubectl get pods` will ask you to log in with the Rancher server again.
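If you want to force a fresh login before the cached token expires, you can simply delete the cached token file from the directory where you run kubectl:

```
# Remove the cached kubeconfig token; the next kubectl command will prompt you to log in again.
rm -f ./.cache/token
```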
@@ -12,9 +12,7 @@ The following steps will quickly deploy a Rancher Server on AWS with a single no
- [Amazon AWS Account](https://aws.amazon.com/account/): An Amazon AWS Account is required to create resources for deploying Rancher and Kubernetes.
- [Amazon AWS Access Key](https://docs.aws.amazon.com/general/latest/gr/managing-aws-access-keys.html): Use this link to follow a tutorial to create an Amazon AWS Access Key if you don't have one yet.
- [Amazon AWS Key Pair](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html#having-ec2-create-your-key-pair) Use this link and follow instructions to create a Key Pair.
- Install [Terraform](https://www.terraform.io/downloads.html): Used to provision the server and cluster in Amazon AWS.
- Install the [RKE terraform provider.](https://github.com/rancher/terraform-provider-rke#installing-the-provider) You will need to download the binary for the Terraform provider for RKE that corresponds to the operating system of your workstation. Then you will need to move the binary into your Terraform plugin directory. The name of the directory will depend on your operating system. For more information how to install Terraform plugins, refer to the [Terraform documentation.](https://www.terraform.io/docs/plugins/basics.html#installing-a-plugin)
## Getting Started
@@ -36,7 +34,6 @@ Suggestions include:
- `aws_region` - Amazon AWS region, choose the closest instead of the default
- `prefix` - Prefix for all created resources
- `instance_type` - EC2 instance size used, minimum is `t3a.medium` but `t3a.large` or `t3a.xlarge` could be used if within budget
- `ssh_key_file_name` - Use a specific SSH key instead of `~/.ssh/id_rsa` (public key is assumed to be `${ssh_key_file_name}.pub`)
1. Run `terraform init`.
@@ -48,7 +45,7 @@ Suggestions include:
Outputs:
rancher_node_ip = xx.xx.xx.xx
rancher_server_url = https://ec2-xx-xx-xx-xx.compute-1.amazonaws.com
rancher_server_url = https://rancher.xx.xx.xx.xx.xip.io
workload_node_ip = yy.yy.yy.yy
```
@@ -56,7 +53,7 @@ Suggestions include:
#### Result
Two Kubernetes clusters are deployed into your AWS account, one running Rancher Server and the other ready for experimentation deployments.
Two Kubernetes clusters are deployed into your AWS account, one running Rancher Server and the other ready for experimentation deployments. Please note that while this setup is a great way to explore Rancher functionality, a production setup should follow our high availability setup guidelines.
### What's Next?
@@ -37,8 +37,6 @@ Suggestions include:
1. Run `terraform init`.
1. Install the [RKE terraform provider](https://github.com/rancher/terraform-provider-rke), see [installation instructions](https://github.com/rancher/terraform-provider-rke#using-the-provider).
1. To initiate the creation of the environment, run `terraform apply --auto-approve`. Then wait for output similar to the following:
```
@@ -38,8 +38,6 @@ Suggestions include:
1. Run `terraform init`.
1. Install the [RKE terraform provider](https://github.com/rancher/terraform-provider-rke), see [installation instructions](https://github.com/rancher/terraform-provider-rke#using-the-provider).
1. To initiate the creation of the environment, run `terraform apply --auto-approve`. Then wait for output similar to the following:
```
@@ -48,7 +46,7 @@ Suggestions include:
Outputs:
rancher_node_ip = xx.xx.xx.xx
rancher_server_url = https://xx-xx-xx-xx.nip.io
rancher_server_url = https://rancher.xx.xx.xx.xx.xip.io
workload_node_ip = yy.yy.yy.yy
```
@@ -43,8 +43,6 @@ Suggestions include:
1. Run `terraform init`.
1. Install the [RKE terraform provider](https://github.com/rancher/terraform-provider-rke), see [installation instructions](https://github.com/rancher/terraform-provider-rke#using-the-provider).
1. To initiate the creation of the environment, run `terraform apply --auto-approve`. Then wait for output similar to the following:
```
@@ -53,7 +51,7 @@ Suggestions include:
Outputs:
rancher_node_ip = xx.xx.xx.xx
rancher_server_url = https://xx-xx-xx-xx.nip.io
rancher_server_url = https://rancher.xx.xx.xx.xx.xip.io
workload_node_ip = yy.yy.yy.yy
```
@@ -10,6 +10,13 @@ The following steps quickly deploy a Rancher Server with a single node cluster a
- [Virtualbox](https://www.virtualbox.org): The virtual machines that Vagrant provisions need to be provisioned to VirtualBox.
- At least 4GB of free RAM.
### Note
- Vagrant will require plugins to create VirtualBox VMs. Install them with the following commands:
`vagrant plugin install vagrant-vboxmanage`
`vagrant plugin install vagrant-vbguest`
## Getting Started
1. Clone [Rancher Quickstart](https://github.com/rancher/quickstart) to a folder using `git clone https://github.com/rancher/quickstart`.
@@ -21,7 +28,7 @@ The following steps quickly deploy a Rancher Server with a single node cluster a
- Change the number of nodes and the memory allocations, if required. (`node.count`, `node.cpus`, `node.memory`)
- Change the password of the `admin` user for logging into Rancher. (`default_password`)
4. To initiate the creation of the environment run, `vagrant up`.
4. To initiate the creation of the environment run, `vagrant up --provider=virtualbox`.
5. Once provisioning finishes, go to `https://172.22.101.101` in the browser. The default user/password is `admin/admin`.
File diff suppressed because it is too large
@@ -0,0 +1,720 @@
---
title: Hardening Guide v2.4
weight: 99
---
This document provides prescriptive guidance for hardening a production installation of Rancher v2.4. It outlines the configurations and controls required to address Kubernetes benchmark controls from the Center for Internet Security (CIS).
> This hardening guide describes how to secure the nodes in your cluster, and it is recommended to follow this guide before installing Kubernetes.
This hardening guide is intended to be used with specific versions of the CIS Kubernetes Benchmark, Kubernetes, and Rancher:
Hardening Guide Version | Rancher Version | CIS Benchmark Version | Kubernetes Version
------------------------|----------------|-----------------------|------------------
Hardening Guide v2.4 | Rancher v2.4 | Benchmark v1.5 | Kubernetes 1.15
[Click here to download a PDF version of this document](https://releases.rancher.com/documents/security/2.4/Rancher_Hardening_Guide.pdf)
### Overview
This document provides prescriptive guidance for hardening a production installation of Rancher v2.4 with Kubernetes v1.15. It outlines the configurations required to address Kubernetes benchmark controls from the Center for Internet Security (CIS).
For more detail about evaluating a hardened cluster against the official CIS benchmark, refer to the [CIS Benchmark Rancher Self-Assessment Guide - Rancher v2.4]({{< baseurl >}}/rancher/v2.x/en/security/benchmark-2.4/).
#### Known Issues
- Rancher **exec shell** and **view logs** for pods are **not** functional in a CIS 1.5 hardened setup when only a public IP is provided when registering custom nodes. This functionality requires a private IP to be provided when registering the custom nodes.
- When setting the `default_pod_security_policy_template_id:` to `restricted`, Rancher creates **RoleBindings** and **ClusterRoleBindings** on the default service accounts. The CIS 1.5 5.1.5 check requires that the default service accounts have no roles or cluster roles bound to them apart from the defaults. In addition, the default service accounts should be configured such that they do not provide a service account token and do not have any explicit rights assignments.
### Configure Kernel Runtime Parameters
The following `sysctl` configuration is recommended for all node types in the cluster. Set the following parameters in `/etc/sysctl.d/90-kubelet.conf`:
```
vm.overcommit_memory=1
vm.panic_on_oom=0
kernel.panic=10
kernel.panic_on_oops=1
kernel.keys.root_maxbytes=25000000
```
Run `sysctl -p /etc/sysctl.d/90-kubelet.conf` to enable the settings.
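To confirm that the parameters are active, you can read them back, for example:

```
# Each value should match the settings from /etc/sysctl.d/90-kubelet.conf.
sysctl vm.overcommit_memory vm.panic_on_oom kernel.panic kernel.panic_on_oops kernel.keys.root_maxbytes
```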
### Configure `etcd` user and group
A user account and group for the **etcd** service must be set up prior to installing RKE. The **uid** and **gid** for the **etcd** user will be used in the RKE **config.yml** to set the proper permissions for files and directories at installation time.
#### Create the `etcd` user and group
To create the **etcd** user and group, run the following console commands.
The commands below use `52034` for the **uid** and **gid** for example purposes. Any valid unused **uid** or **gid** could be used in lieu of `52034`.
```
groupadd --gid 52034 etcd
useradd --comment "etcd service account" --uid 52034 --gid 52034 etcd
```
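You can confirm that the account was created with the expected IDs before updating the RKE configuration:

```
# Should report uid=52034(etcd) gid=52034(etcd) groups=52034(etcd).
id etcd
```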
Update the RKE **config.yml** with the **uid** and **gid** of the **etcd** user:
``` yaml
services:
etcd:
gid: 52034
uid: 52034
```
#### Set `automountServiceAccountToken` to `false` for `default` service accounts
Kubernetes provides a default service account which is used by cluster workloads where no specific service account is assigned to the pod. Where access to the Kubernetes API from a pod is required, a specific service account should be created for that pod, and rights granted to that service account. The default service account should be configured such that it does not provide a service account token and does not have any explicit rights assignments.
For each namespace, including **default** and **kube-system** on a standard RKE install, the **default** service account must include this value:
```
automountServiceAccountToken: false
```
Save the following yaml to a file called `account_update.yaml`
``` yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: default
automountServiceAccountToken: false
```
Create a bash script file called `account_update.sh`. Be sure to `chmod +x account_update.sh` so the script has execute permissions.
```
#!/bin/bash -e
for namespace in $(kubectl get namespaces -A -o json | jq -r '.items[].metadata.name'); do
kubectl patch serviceaccount default -n ${namespace} -p "$(cat account_update.yaml)"
done
```
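Run the script once the cluster is up and your kubeconfig is in place (note that it relies on `jq` being installed). A quick spot check on a single namespace afterwards might look like this:

```
# Patch the default service account in every namespace.
./account_update.sh

# Spot-check one namespace; the field should now be reported as false.
kubectl get serviceaccount default -n default -o yaml | grep automountServiceAccountToken
```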
### Ensure that all Namespaces have Network Policies defined
Running different applications on the same Kubernetes cluster creates a risk of one
compromised application attacking a neighboring application. Network segmentation is
important to ensure that containers can communicate only with those they are supposed
to. A network policy is a specification of how selections of pods are allowed to
communicate with each other and other network endpoints.
Network Policies are namespace scoped. When a network policy is introduced to a given
namespace, all traffic not allowed by the policy is denied. However, if there are no network
policies in a namespace all traffic will be allowed into and out of the pods in that
namespace. To enforce network policies, a CNI (container network interface) plugin must be enabled.
This guide uses [canal](https://github.com/projectcalico/canal) to provide the policy enforcement.
Additional information about CNI providers can be found
[here](https://rancher.com/blog/2019/2019-03-21-comparing-kubernetes-cni-providers-flannel-calico-canal-and-weave/)
Once a CNI provider is enabled on a cluster a default network policy can be applied. For reference purposes a
**permissive** example is provided below. If you want to allow all traffic to all pods in a namespace
(even if policies are added that cause some pods to be treated as “isolated”),
you can create a policy that explicitly allows all traffic in that namespace. Save the following `yaml` as
`default-allow-all.yaml`. Additional [documentation](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
about network policies can be found on the Kubernetes site.
> This `NetworkPolicy` is not recommended for production use
``` yaml
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: default-allow-all
spec:
podSelector: {}
ingress:
- {}
egress:
- {}
policyTypes:
- Ingress
- Egress
```
Create a bash script file called `apply_networkPolicy_to_all_ns.sh`. Be sure to
`chmod +x apply_networkPolicy_to_all_ns.sh` so the script has execute permissions.
```
#!/bin/bash -e
for namespace in $(kubectl get namespaces -A -o json | jq -r '.items[].metadata.name'); do
kubectl apply -f default-allow-all.yaml -n ${namespace}
done
```
Execute this script to apply the `default-allow-all.yaml` **permissive** `NetworkPolicy` to all namespaces.
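You can then list the policies to confirm that every namespace received one:

```
# Every namespace should now contain a "default-allow-all" NetworkPolicy.
kubectl get networkpolicy --all-namespaces
```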
### Reference Hardened RKE `cluster.yml` configuration
The reference `cluster.yml` is used by the RKE CLI and provides the configuration needed to achieve a hardened install
of Rancher Kubernetes Engine (RKE). The install [documentation](https://rancher.com/docs/rke/latest/en/installation/)
provides additional details about the configuration items. This reference `cluster.yml` does not include the required **nodes** directive, which will vary depending on your environment. Documentation for node configuration can be found here: https://rancher.com/docs/rke/latest/en/config-options/nodes
``` yaml
# If you intend to deploy Kubernetes in an air-gapped environment,
# please consult the documentation on how to configure custom RKE images.
kubernetes_version: "v1.15.9-rancher1-1"
enable_network_policy: true
default_pod_security_policy_template_id: "restricted"
# the nodes directive is required and will vary depending on your environment
# documentation for node configuration can be found here:
# https://rancher.com/docs/rke/latest/en/config-options/nodes
nodes:
services:
etcd:
uid: 52034
gid: 52034
kube-api:
pod_security_policy: true
secrets_encryption_config:
enabled: true
audit_log:
enabled: true
admission_configuration:
event_rate_limit:
enabled: true
kube-controller:
extra_args:
feature-gates: "RotateKubeletServerCertificate=true"
scheduler:
image: ""
extra_args: {}
extra_binds: []
extra_env: []
kubelet:
generate_serving_certificate: true
extra_args:
feature-gates: "RotateKubeletServerCertificate=true"
protect-kernel-defaults: "true"
tls-cipher-suites: "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_128_GCM_SHA256"
extra_binds: []
extra_env: []
cluster_domain: ""
infra_container_image: ""
cluster_dns_server: ""
fail_swap_on: false
kubeproxy:
image: ""
extra_args: {}
extra_binds: []
extra_env: []
network:
plugin: ""
options: {}
mtu: 0
node_selector: {}
authentication:
strategy: ""
sans: []
webhook: null
addons: |
---
apiVersion: v1
kind: Namespace
metadata:
name: ingress-nginx
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: default-psp-role
namespace: ingress-nginx
rules:
- apiGroups:
- extensions
resourceNames:
- default-psp
resources:
- podsecuritypolicies
verbs:
- use
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: default-psp-rolebinding
namespace: ingress-nginx
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: default-psp-role
subjects:
- apiGroup: rbac.authorization.k8s.io
kind: Group
name: system:serviceaccounts
- apiGroup: rbac.authorization.k8s.io
kind: Group
name: system:authenticated
---
apiVersion: v1
kind: Namespace
metadata:
name: cattle-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: default-psp-role
namespace: cattle-system
rules:
- apiGroups:
- extensions
resourceNames:
- default-psp
resources:
- podsecuritypolicies
verbs:
- use
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: default-psp-rolebinding
namespace: cattle-system
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: default-psp-role
subjects:
- apiGroup: rbac.authorization.k8s.io
kind: Group
name: system:serviceaccounts
- apiGroup: rbac.authorization.k8s.io
kind: Group
name: system:authenticated
---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
name: restricted
spec:
requiredDropCapabilities:
- NET_RAW
privileged: false
allowPrivilegeEscalation: false
defaultAllowPrivilegeEscalation: false
fsGroup:
rule: RunAsAny
runAsUser:
rule: MustRunAsNonRoot
seLinux:
rule: RunAsAny
supplementalGroups:
rule: RunAsAny
volumes:
- emptyDir
- secret
- persistentVolumeClaim
- downwardAPI
- configMap
- projected
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: psp:restricted
rules:
- apiGroups:
- extensions
resourceNames:
- restricted
resources:
- podsecuritypolicies
verbs:
- use
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: psp:restricted
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: psp:restricted
subjects:
- apiGroup: rbac.authorization.k8s.io
kind: Group
name: system:serviceaccounts
- apiGroup: rbac.authorization.k8s.io
kind: Group
name: system:authenticated
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: tiller
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: tiller
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: cluster-admin
subjects:
- kind: ServiceAccount
name: tiller
namespace: kube-system
addons_include: []
system_images:
etcd: ""
alpine: ""
nginx_proxy: ""
cert_downloader: ""
kubernetes_services_sidecar: ""
kubedns: ""
dnsmasq: ""
kubedns_sidecar: ""
kubedns_autoscaler: ""
coredns: ""
coredns_autoscaler: ""
kubernetes: ""
flannel: ""
flannel_cni: ""
calico_node: ""
calico_cni: ""
calico_controllers: ""
calico_ctl: ""
calico_flexvol: ""
canal_node: ""
canal_cni: ""
canal_flannel: ""
canal_flexvol: ""
weave_node: ""
weave_cni: ""
pod_infra_container: ""
ingress: ""
ingress_backend: ""
metrics_server: ""
windows_pod_infra_container: ""
ssh_key_path: ""
ssh_cert_path: ""
ssh_agent_auth: false
authorization:
mode: ""
options: {}
ignore_docker_version: false
private_registries: []
ingress:
provider: ""
options: {}
node_selector: {}
extra_args: {}
dns_policy: ""
extra_envs: []
extra_volumes: []
extra_volume_mounts: []
cluster_name: ""
prefix_path: ""
addon_job_timeout: 0
bastion_host:
address: ""
port: ""
user: ""
ssh_key: ""
ssh_key_path: ""
ssh_cert: ""
ssh_cert_path: ""
monitoring:
provider: ""
options: {}
node_selector: {}
restore:
restore: false
snapshot_name: ""
dns: null
```
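Once the `nodes` directive has been filled in for your environment, the hardened cluster can be provisioned with the RKE CLI in the usual way, for example:

```
# Provision the hardened cluster from the reference configuration.
rke up --config cluster.yml
```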
### Reference Hardened RKE Template configuration
The reference RKE Template provides the configuration needed to achieve a hardened install of Kubernetes.
RKE Templates are used to provision Kubernetes and define Rancher settings. Follow the Rancher
[documentation](https://rancher.com/docs/rancher/v2.x/en/installation) for additional installation and RKE Template details.
``` yaml
#
# Cluster Config
#
default_pod_security_policy_template_id: restricted
docker_root_dir: /var/lib/docker
enable_cluster_alerting: false
enable_cluster_monitoring: false
enable_network_policy: true
#
# Rancher Config
#
rancher_kubernetes_engine_config:
addon_job_timeout: 30
addons: |-
---
apiVersion: v1
kind: Namespace
metadata:
name: ingress-nginx
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: default-psp-role
namespace: ingress-nginx
rules:
- apiGroups:
- extensions
resourceNames:
- default-psp
resources:
- podsecuritypolicies
verbs:
- use
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: default-psp-rolebinding
namespace: ingress-nginx
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: default-psp-role
subjects:
- apiGroup: rbac.authorization.k8s.io
kind: Group
name: system:serviceaccounts
- apiGroup: rbac.authorization.k8s.io
kind: Group
name: system:authenticated
---
apiVersion: v1
kind: Namespace
metadata:
name: cattle-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: default-psp-role
namespace: cattle-system
rules:
- apiGroups:
- extensions
resourceNames:
- default-psp
resources:
- podsecuritypolicies
verbs:
- use
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: default-psp-rolebinding
namespace: cattle-system
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: default-psp-role
subjects:
- apiGroup: rbac.authorization.k8s.io
kind: Group
name: system:serviceaccounts
- apiGroup: rbac.authorization.k8s.io
kind: Group
name: system:authenticated
---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
name: restricted
spec:
requiredDropCapabilities:
- NET_RAW
privileged: false
allowPrivilegeEscalation: false
defaultAllowPrivilegeEscalation: false
fsGroup:
rule: RunAsAny
runAsUser:
rule: MustRunAsNonRoot
seLinux:
rule: RunAsAny
supplementalGroups:
rule: RunAsAny
volumes:
- emptyDir
- secret
- persistentVolumeClaim
- downwardAPI
- configMap
- projected
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: psp:restricted
rules:
- apiGroups:
- extensions
resourceNames:
- restricted
resources:
- podsecuritypolicies
verbs:
- use
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: psp:restricted
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: psp:restricted
subjects:
- apiGroup: rbac.authorization.k8s.io
kind: Group
name: system:serviceaccounts
- apiGroup: rbac.authorization.k8s.io
kind: Group
name: system:authenticated
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: tiller
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: tiller
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: cluster-admin
subjects:
- kind: ServiceAccount
name: tiller
namespace: kube-system
ignore_docker_version: true
kubernetes_version: v1.15.9-rancher1-1
#
# If you are using calico on AWS
#
# network:
# plugin: calico
# calico_network_provider:
# cloud_provider: aws
#
# # To specify flannel interface
#
# network:
# plugin: flannel
# flannel_network_provider:
# iface: eth1
#
# # To specify flannel interface for canal plugin
#
# network:
# plugin: canal
# canal_network_provider:
# iface: eth1
#
network:
mtu: 0
plugin: canal
#
# services:
# kube-api:
# service_cluster_ip_range: 10.43.0.0/16
# kube-controller:
# cluster_cidr: 10.42.0.0/16
# service_cluster_ip_range: 10.43.0.0/16
# kubelet:
# cluster_domain: cluster.local
# cluster_dns_server: 10.43.0.10
#
services:
etcd:
backup_config:
enabled: false
interval_hours: 12
retention: 6
safe_timestamp: false
creation: 12h
extra_args:
election-timeout: '5000'
heartbeat-interval: '500'
gid: 52034
retention: 72h
snapshot: false
uid: 52034
kube_api:
always_pull_images: false
audit_log:
enabled: true
event_rate_limit:
enabled: true
pod_security_policy: true
secrets_encryption_config:
enabled: true
service_node_port_range: 30000-32767
kube_controller:
extra_args:
address: 127.0.0.1
feature-gates: RotateKubeletServerCertificate=true
profiling: 'false'
terminated-pod-gc-threshold: '1000'
kubelet:
extra_args:
anonymous-auth: 'false'
event-qps: '0'
feature-gates: RotateKubeletServerCertificate=true
make-iptables-util-chains: 'true'
protect-kernel-defaults: 'true'
streaming-connection-idle-timeout: 1800s
tls-cipher-suites: >-
TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_128_GCM_SHA256
fail_swap_on: false
generate_serving_certificate: true
scheduler:
extra_args:
address: 127.0.0.1
profiling: 'false'
ssh_agent_auth: false
windows_prefered_cluster: false
```
### Hardened Reference Ubuntu 18.04 LTS **cloud-config**:
The reference **cloud-config** is typically used in cloud infrastructure environments to manage the configuration of
compute instances. The reference config applies the Ubuntu operating system-level settings
required before installing Kubernetes.
``` yaml
#cloud-config
packages:
- curl
- jq
runcmd:
- sysctl -w vm.overcommit_memory=1
- sysctl -w kernel.panic=10
- sysctl -w kernel.panic_on_oops=1
- curl https://releases.rancher.com/install-docker/18.09.sh | sh
- usermod -aG docker ubuntu
- return=1; while [ $return != 0 ]; do sleep 2; docker ps; return=$?; done
- addgroup --gid 52034 etcd
- useradd --comment "etcd service account" --uid 52034 --gid 52034 etcd
write_files:
- path: /etc/sysctl.d/kubelet.conf
owner: root:root
permissions: "0644"
content: |
vm.overcommit_memory=1
kernel.panic=10
kernel.panic_on_oops=1
```
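After an instance provisioned with this **cloud-config** boots, the kernel settings, Docker install, and etcd service account can be verified before Kubernetes is installed. A quick sanity check, assuming a standard Ubuntu 18.04 image:
```
sysctl vm.overcommit_memory kernel.panic kernel.panic_on_oops
id etcd
docker version
```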
@@ -3,4 +3,4 @@ title: Security Scans
weight: 299
---
The documentation about CIS security scans has moved [here.]({{<baseurl>}}/rancher/v2.x/en/cis-scans)
The documentation about CIS security scans has moved [here.]({{<baseurl>}}/rancher/v2.x/en/cis-scans)
@@ -28,6 +28,9 @@ API Keys are composed of four components:
3. **Optional:** Enter a description for the API key and select an expiration period or a scope. We recommend setting an expiration date.
The API key won't be valid after expiration. Shorter expiration periods are more secure.
_Available as of v2.4.6_
The expiration period is bound by `v3/settings/auth-token-max-ttl-minutes`. If the requested period exceeds the max-ttl, the API key is created with the max-ttl as its expiration period.
A scope will limit the API key so that it will only work against the Kubernetes API of the specified cluster. If the cluster is configured with an Authorized Cluster Endpoint, you will be able to use a scoped token directly against the cluster's API without proxying through the Rancher server. See [Authorized Cluster Endpoints]({{<baseurl>}}/rancher/v2.x/en/overview/architecture/#4-authorized-cluster-endpoint) for more information.
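For example, a scoped API key can be presented as a bearer token directly to the downstream cluster's Kubernetes API. A rough sketch, assuming `<access-key>:<secret-key>` is the scoped API key, `<fqdn>` is the Authorized Cluster Endpoint, and the cluster CA certificate has been saved locally as `cluster-ca.pem` (the endpoint and port are placeholders for your environment):
```
curl --cacert cluster-ca.pem \
  -H "Authorization: Bearer <access-key>:<secret-key>" \
  https://<fqdn>:6443/version
```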
@@ -34,8 +34,6 @@ You can upload this snapshot directly to an S3 backend with the [S3 options]({{<
$ rke etcd snapshot-save --name snapshot.db --config cluster.yml
```
{{< img "/img/rke/rke-etcd-backup.png" "etcd snapshot" >}}
### 2. Simulate a Node Failure
To simulate the failure, let's power down `node2`.
@@ -133,8 +131,6 @@ Back up the Kubernetes cluster by taking a local snapshot:
$ rke etcd snapshot-save --name snapshot.db --config cluster.yml
```
{{< img "/img/rke/rke-etcd-backup.png" "etcd snapshot" >}}
<a id="store-the-snapshot-externally-rke-prior-to-v0.2.0"></a>
### 2. Store the Snapshot Externally
+2 -2
View File
@@ -7,8 +7,8 @@ weight: 50
RKE is a fast, versatile Kubernetes installer that you can use to install Kubernetes on your Linux hosts. You can get started in a couple of quick and easy steps:
1. [Download the RKE Binary](#download-the-rke-binary)
1. [Alternative RKE macOS Install - Homebrew](#alternative-rke-macos-install---homebrew)
1. [Alternative RKE macOS Install - MacPorts](#alternative-rke-macos-install---macports)
1. [Alternative RKE macOS Install - Homebrew](#alternative-rke-macos-x-install-homebrew)
1. [Alternative RKE macOS Install - MacPorts](#alternative-rke-macos-install-macports)
1. [Prepare the Nodes for the Kubernetes Cluster](#prepare-the-nodes-for-the-kubernetes-cluster)
1. [Creating the Cluster Configuration File](#creating-the-cluster-configuration-file)
1. [Deploying Kubernetes with RKE](#deploying-kubernetes-with-rke)
@@ -39,13 +39,11 @@ The following certificates must exist in the certificate directory.
| Kube Node | kube-node.pem | kube-node-key.pem |
| Apiserver Proxy Client | kube-apiserver-proxy-client.pem | kube-apiserver-proxy-client-key.pem |
| Etcd Nodes | kube-etcd-x-x-x-x.pem | kube-etcd-x-x-x-x-key.pem |
| Kube Api Request Header CA | kube-apiserver-requestheader-ca.pem* | kube-apiserver-requestheader-ca-key.pem** |
| Kube Api Request Header CA | kube-apiserver-requestheader-ca.pem* | kube-apiserver-requestheader-ca-key.pem |
| Service Account Token | - | kube-service-account-token-key.pem |
\* Is the same as kube-ca.pem
\** Is the same as kube-ca-key
## Generating Certificate Signing Requests (CSRs) and Keys
If you want to create and sign the certificates by a real Certificate Authority (CA), you can use RKE to generate a set of Certificate Signing Requests (CSRs) and keys. Using the `rke cert generate-csr` command, you can generate the CSRs and keys.
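A sketch of generating the CSRs and private keys into a local directory; the flag names are assumptions based on the RKE CLI and should be confirmed with `rke cert generate-csr --help`:
```
rke cert generate-csr --config cluster.yml --cert-dir ./certs
```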
@@ -26,7 +26,7 @@ You can add/remove only worker nodes, by running `rke up --update-only`. This wi
In order to remove the Kubernetes components from nodes, you use the `rke remove` command.
> **Warning:** This command is irreversible and will destroy the Kubernetes cluster, including etcd snapshots on S3. If there is a disaster and your cluster is inaccessible, refer to the process for [restoring your cluster from a snapshot]({{<baseurl>}}rke/latest/en/etcd-snapshots/#etcd-disaster-recovery).
> **Warning:** This command is irreversible and will destroy the Kubernetes cluster, including etcd snapshots on S3. If there is a disaster and your cluster is inaccessible, refer to the process for [restoring your cluster from a snapshot]({{<baseurl>}}/rke/latest/en/etcd-snapshots/#etcd-disaster-recovery).
The `rke remove` command does the following to each node in the `cluster.yml`:
+3 -3
View File
@@ -2035,9 +2035,9 @@ lodash.sortby@^4.7.0:
integrity sha1-7dFMgk4sycHgsKG0K7UhBRakJDg=
lodash@^4.13.1, lodash@^4.17.10, lodash@^4.17.5:
version "4.17.15"
resolved "https://registry.yarnpkg.com/lodash/-/lodash-4.17.15.tgz#b447f6670a0455bbfeedd11392eff330ea097548"
integrity sha512-8xOcRHvCjnocdS5cpwXQXVzmmh5e5+saE2QGoeQmbKmRS6J3VQppPOIt0MnmE+4xlZoumy0GPG0D0MVIQbNA1A==
version "4.17.19"
resolved "https://registry.yarnpkg.com/lodash/-/lodash-4.17.19.tgz#e48ddedbe30b3321783c5b4301fbd353bc1e4a4b"
integrity sha512-JNvd8XER9GQX0v2qJgsaN/mzFCNA5BRe/j8JN9d+tWyGLSodKQHKFicdwNYzWwI3wjRnaKPsGj1XkBjx/F96DQ==
loose-envify@^1.0.0:
version "1.4.0"