Merge pull request #1405 from ChrisMcKee/master

additional detail to rancher ha (back/restore) and etcd snapshots
This commit is contained in:
Denise
2019-05-08 10:57:56 -07:00
committed by GitHub
3 changed files with 89 additions and 18 deletions
@@ -44,37 +44,54 @@ To take recurring snapshots, enable the `etcd-snapshot` service, which is a serv
**To Enable Recurring Snapshots:**
1. Open `rancher-cluster.yml` with your favorite text editor.
2. Add one of the following code blocks, depending on your RKE version, to the bottom of the file:
_Pre 0.2.0_
```yaml
services:
  etcd:
    snapshot: true # enables recurring etcd snapshots
    creation: 6h0s # time increment between snapshots
    retention: 24h # time increment before snapshot purge
```
_Post 0.2.0 (note: S3 backup is optional)_
```yaml
services:
  etcd:
    backup_config:
      enabled: true # enables recurring etcd snapshots
      interval_hours: 6 # time increment between snapshots
      retention: 60 # time in days before snapshot purge
      s3_backup_config: # optional
        access_key: "myaccesskey"
        secret_key: "myaccesssecret"
        bucket_name: "my-backup-bucket"
        endpoint: "s3.eu-west-1.amazonaws.com"
        region: "eu-west-1"
```
3. Edit the code according to your requirements.
4. Save and close `rancher-cluster.yml`.
5. Open **Terminal** and change directory to the location of the RKE binary. Your `rancher-cluster.yml` file must reside in the same directory.
6. Run the following command:
```
rke up --config rancher-cluster.yml
```
**Result:** RKE is configured to take recurring snapshots of `etcd` on all nodes running the `etcd` role. Snapshots are saved to the following directory: `/opt/rke/etcd-snapshots/`.
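You can sanity-check that recurring snapshots are actually being written by inspecting the snapshot directory on an `etcd` node. A minimal sketch, assuming the default directory and the 6-hour `creation` interval from the example config (the `check_snapshots` helper is illustrative, not part of RKE):

```shell
#!/bin/sh
# Illustrative helper: report whether the newest file in a snapshot
# directory is fresher than the configured interval (360 min = 6h).
check_snapshots() {
  dir="$1"
  newest=$(ls -t "$dir" 2>/dev/null | head -n 1)
  if [ -z "$newest" ]; then
    echo "no snapshots found in $dir"
    return 1
  fi
  # `find -mmin +360` prints the file only if it is older than 6 hours
  if [ -n "$(find "$dir/$newest" -mmin +360 2>/dev/null)" ]; then
    echo "stale: $newest"
    return 2
  fi
  echo "ok: $newest"
}

# On an etcd node:
# check_snapshots /opt/rke/etcd-snapshots
```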
#### Option B: One-Time Snapshots
When you're about to upgrade Rancher or restore it to a previous snapshot, you should snapshot your live image so that you have a backup of `etcd` in its last known state.
**To Take a One-Time Local Snapshot:**
1. Open **Terminal** and change directory to the location of the RKE binary. Your `rancher-cluster.yml` file must reside in the same directory.
@@ -86,7 +103,23 @@ When you're about to upgrade Rancher or restore it to a previous snapshot, you s
**Result:** RKE takes a snapshot of `etcd` running on each `etcd` node. The file is saved to `/opt/rke/etcd-snapshots`.
**To Take a One-Time S3 Snapshot:**
1. Open **Terminal** and change directory to the location of the RKE binary. Your `rancher-cluster.yml` file must reside in the same directory.
2. Enter the following command. Replace `snapshot-name` with any name that you want to use for the snapshot (e.g. `upgrade`), and substitute your own S3 credentials, bucket name, and endpoint.
```shell
rke etcd snapshot-save --config rancher-cluster.yml --name snapshot-name \
--s3 --access-key S3_ACCESS_KEY --secret-key S3_SECRET_KEY \
--bucket-name s3-bucket-name --s3-endpoint s3.amazonaws.com
```
*The snapshot is saved in `/opt/rke/etcd-snapshots` as well as uploaded to the S3 backend.*
### 2. Backup Local Snapshots to a Safe Location
> Note: This step is handled for you automatically when S3 backups are enabled.
After taking the `etcd` snapshots, save them to a safe location so that they're unaffected if your cluster experiences a disaster scenario. This location should be persistent.
@@ -98,3 +131,6 @@ In this documentation, as an example, we're using Amazon S3 as our safe location
root@node:~# s3cmd mb s3://rke-etcd-snapshots
root@node:~# s3cmd put /opt/rke/etcd-snapshots/snapshot.db s3://rke-etcd-snapshots/
```
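For pre-0.2.0 setups, where RKE does not upload snapshots for you, the `s3cmd put` step above can be scheduled rather than run by hand. A hedged sketch of a cron entry (the paths and bucket name are taken from the example above; adjust the schedule to match your snapshot `creation` interval):

```
# /etc/cron.d/etcd-snapshot-sync -- illustrative only
0 */6 * * * root s3cmd sync /opt/rke/etcd-snapshots/ s3://rke-etcd-snapshots/
```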
@@ -33,10 +33,38 @@ We recommend that you start with fresh nodes and a clean state. Alternatively yo
### 2. Place Snapshot and PKI Bundle
**Local Snapshots**
Pick one of the clean nodes. That node will be the "target node" for the initial restore. Place the snapshot and PKI certificate bundle files in the `/opt/rke/etcd-snapshots` directory on the "target node".
* Snapshot - `<snapshot>.db`
* PKI Bundle - `pki.bundle.tar.gz` *(pre-RKE 0.2.0 only; from 0.2.0 onwards you should have a `cluster.rkestate` file instead)*
***Continue to step 3***
**Remote Snapshots** (Rancher 2.1 / RKE 0.2.0 onwards)
With your `cluster.rkestate` file present, run the RKE restore from S3:
```shell
rke etcd snapshot-restore --config rancher-cluster-restore.yml \
--name snap-shot-name.db \
--s3 --access-key KEY --secret-key SECRET \
--bucket-name my-rancher-etcd-backup-bucket \
--s3-endpoint s3.amazonaws.com \
--region eu-west-2
```
Once the process has completed, if Rancher was installed via Helm, the UI will load (this can take a few minutes).
At this point the restoration is complete.
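Because the cluster components can take a few minutes to come up, the post-restore check can be scripted as a poll-until-ready loop. A sketch (the `wait_for` helper and the kubeconfig filename are illustrative, not part of RKE):

```shell
#!/bin/sh
# Illustrative: retry a command until it succeeds or attempts run out.
wait_for() {
  attempts="$1"; shift
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      echo "ready"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "timed out"
  return 1
}

# wait_for 60 kubectl --kubeconfig kube_config_rancher-cluster-restore.yml get nodes
```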
> Note: At this point it is a good idea to ensure your `kube_config_cluster.yml` and `cluster.rkestate` are backed up and preserved for any future maintenance.
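The two files from the note above can be bundled in one step before being copied off-site. A minimal sketch (the helper name and the `s3cmd` destination bucket are assumptions):

```shell
#!/bin/sh
# Illustrative helper (not part of RKE): bundle the kubeconfig and
# cluster state files, ready for off-site upload.
backup_cluster_state() {
  src_dir="$1"
  out_file="$2"
  tar -czf "$out_file" -C "$src_dir" \
    kube_config_cluster.yml cluster.rkestate
}

# backup_cluster_state . cluster-state-backup.tar.gz
# s3cmd put cluster-state-backup.tar.gz s3://my-backup-bucket/  # bucket name is an example
```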
### 3. Configure RKE
@@ -158,14 +158,18 @@ $ rke etcd snapshot-restore --config cluster.yml --name mysnapshot
_Available as of v0.2.0_
> **Note:** Ensure your `cluster.rkestate` is present before starting the restore, as this contains your certificate data for the cluster
When restoring etcd from a snapshot located in S3, the command needs the S3 information in order to connect to the S3 backend and retrieve the snapshot.
```shell
$ rke etcd snapshot-restore --config cluster.yml --name snapshot-name \
--s3 --access-key S3_ACCESS_KEY --secret-key S3_SECRET_KEY \
--bucket-name s3-bucket-name --s3-endpoint s3.amazonaws.com
```
## Example
> **Note:** If you are restoring a cluster that had Rancher installed, the UI should start up after a few minutes; you don't need to re-run Helm.
### Example Scenario of restoring from a Local Snapshot
In this example, the Kubernetes cluster was deployed on two AWS nodes.
@@ -185,7 +189,7 @@ $ rke etcd snapshot-save --name snapshot.db --config cluster.yml
![etcd snapshot]({{< baseurl >}}/img/rke/rke-etcd-backup.png)
### Store the Snapshot Externally in S3
As of v0.2.0, this step is no longer required, as RKE can upload and download snapshots automatically from S3 by adding in [S3 options](#options-for-rke-etcd-snapshot-save) when running the `rke etcd snapshot-save` command.
@@ -253,7 +257,7 @@ nodes:
After the new node is added to the `cluster.yml`, run `rke etcd snapshot-restore` to launch `etcd` from the backup. The snapshot and `pki.bundle.tar.gz` file are expected to be saved at `/opt/rke/etcd-snapshots`.
As of v0.2.0, if you want to directly retrieve the snapshot from S3, add in the [S3 options](#options-for-rke-etcd-snapshot-restore).
> **Note:** As of v0.2.0, the file **pki.bundle.tar.gz** is no longer required for the restore process, as the certificates needed for the restore are preserved within the `cluster.rkestate` file.
```
$ rke etcd snapshot-restore --name snapshot.db --config cluster.yml
@@ -294,3 +298,6 @@ docker container inspect rke-bundle-cert
```
The important thing to note is the mounts of the container and location of the **pki.bundle.tar.gz**.