From 48eb889ebf1f2f8da40249fa43413bc996281e75 Mon Sep 17 00:00:00 2001
From: Catherine Luse
Date: Mon, 21 Oct 2019 16:31:24 -0700
Subject: [PATCH] Fix merge conflict error

---
 .../rke/latest/en/etcd-snapshots/_index.md | 250 ------------------
 1 file changed, 250 deletions(-)

diff --git a/content/rke/latest/en/etcd-snapshots/_index.md b/content/rke/latest/en/etcd-snapshots/_index.md
index e565453cc17..d973feb3d2f 100644
--- a/content/rke/latest/en/etcd-snapshots/_index.md
+++ b/content/rke/latest/en/etcd-snapshots/_index.md
@@ -25,257 +25,7 @@ You can use RKE to [restore your cluster from backup]({{}}/rke/latest/e
 
 # Example Scenarios
 
-<<<<<<< Updated upstream
 These [example scenarios]({{}}/rke/latest/en/etcd-snapshots/example-scenarios) for backup and restore are different based on your version of RKE.
-=======
-### IAM Support for Storing Snapshots in S3
-In addition to API access keys, RKE supports using IAM roles for S3 authentication. The cluster etcd nodes must be assigned an IAM role that has read/write access to the designated backup bucket on S3. Also, the nodes must have network access to the S3 endpoint specified.
-
- To give an application access to S3, refer to the AWS documentation on [Using an IAM Role to Grant Permissions to Applications Running on Amazon EC2 Instances.](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-ec2.html)
-
-### Local One-Time Snapshot Example
-
-```
-$ rke etcd snapshot-save --config cluster.yml --name snapshot-name
-```
-
-The snapshot is saved in `/opt/rke/etcd-snapshots`
-
-### One-Time Snapshots uploaded to S3 Example
-
-_Available as of v0.2.0_
-
-```
-$ rke etcd snapshot-save --config cluster.yml --name snapshot-name \
---s3 --access-key S3_ACCESS_KEY --secret-key S3_SECRET_KEY \
---bucket-name s3-bucket-name --s3-endpoint s3.amazonaws.com
-```
-
-The snapshot is saved in `/opt/rke/etcd-snapshots` as well as uploaded to the S3 backend.
-
-## Recurring Snapshots
-
-To schedule automatic recurring etcd snapshots, you can enable the `etcd-snapshot` service with [extra configuration options the etcd service](#options-for-the-etcd-snapshot-service). `etcd-snapshot` runs in a service container alongside the `etcd` container. By default, the `etcd-snapshot` service takes a snapshot for every node that has the `etcd` role and stores them to local disk in `/opt/rke/etcd-snapshots`. If you set up the [options for S3](#options-for-the-etcd-snapshot-service), the snapshot will also be uploaded to the S3 backend.
-
-Prior to v0.2.0, along with the snapshots, RKE saves a backup of the certificates, i.e. a file named `pki.bundle.tar.gz`, in the same location. The snapshot and pki bundle file are required for the restore process in versions prior to v0.2.0.
-
-When a cluster is launched with the `etcd-snapshot` service enabled, you can view the `etcd-rolling-snapshots` logs to confirm backups are being created automatically.
-
-```
-$ docker logs etcd-rolling-snapshots
-
-time="2018-05-04T18:39:16Z" level=info msg="Initializing Rolling Backups" creation=1m0s retention=24h0m0s
-time="2018-05-04T18:40:16Z" level=info msg="Created backup" name="2018-05-04T18:40:16Z_etcd" runtime=108.332814ms
-time="2018-05-04T18:41:16Z" level=info msg="Created backup" name="2018-05-04T18:41:16Z_etcd" runtime=92.880112ms
-time="2018-05-04T18:42:16Z" level=info msg="Created backup" name="2018-05-04T18:42:16Z_etcd" runtime=83.67642ms
-time="2018-05-04T18:43:16Z" level=info msg="Created backup" name="2018-05-04T18:43:16Z_etcd" runtime=86.298499ms
-```
-
-### Options for the `Etcd-Snapshot` Service
-
-Depending on your version of RKE, the options used to configure recurring snapshots may be different.
-
-_Available as of v0.2.0_
-
-|Option|Description| S3 Specific |
-|---|---| --- |
-|**interval_hours**| The duration in hours between recurring backups. This supercedes the `creation` option and will override it if both are specified.| |
-|**retention**| The number of snapshots to retain before rotation. This supercedes the `retention` option and will override it if both are specified.| |
-|**bucket_name**| S3 bucket name where backups will be stored| * |
-|**access_key**| S3 access key with permission to access the backup bucket.| * |
-|**secret_key** |S3 secret key with permission to access the backup bucket.| * |
-|**region** |S3 region for the backup bucket. This is optional.| * |
-|**endpoint** |S3 regions endpoint for the backup bucket.| * |
-
-
- ```yaml
-services:
-  etcd:
-    backup_config:
-      interval_hours: 12
-      retention: 6
-      s3backupconfig:
-        access_key: S3_ACCESS_KEY
-        secret_key: S3_SECRET_KEY
-        bucket_name: s3-bucket-name
-        region: ""
-        endpoint: s3.amazonaws.com
-```
-
-#### Prior to v0.2.0
-
-|Option|Description|
-|---|---|
-|**Snapshot**|By default, the recurring snapshot service is disabled. To enable the service, you need to define it as part of `etcd` and set it to `true`.|
-|**Creation**|By default, the snapshot service will take snapshots every 5 minutes (`5m0s`). You can change the time between snapshots as part of the `creation` directive for the `etcd` service.|
-|**Retention**|By default, all snapshots are saved for 24 hours (`24h`) before being deleted and purged. You can change how long to store a snapshot as part of the `retention` directive for the `etcd` service.|
-
-```yaml
-services:
-  etcd:
-    snapshot: true
-    creation: 5m0s
-    retention: 24h
-```
-
-## Etcd Disaster Recovery
-
-If there is a disaster with your Kubernetes cluster, you can use `rke etcd snapshot-restore` to recover your etcd. This command reverts etcd to a specific snapshot. RKE also removes the old `etcd` container before creating a new `etcd` cluster using the snapshot that you have chosen.
-
->**Warning:** Restoring an etcd snapshot deletes your current etcd cluster and replaces it with a new one. Before you run the `rke etcd snapshot-restore` command, you should back up any important data in your cluster.
-
-The snapshot used to restore your etcd cluster can either be stored locally in `/opt/rke/etcd-snapshots` or from a S3 compatible backend. The S3 backend option is available as of v0.2.0.
-
-### Options for `rke etcd snapshot-restore`
-
-| Option | Description | S3 Specific |
-| --- | --- | ---|
-| `--name` value | Specify snapshot name | |
-| `--config` value | Specify an alternate cluster YAML file (default: "cluster.yml") [$RKE_CONFIG] | |
-| `--s3` | Enabled backup to s3 |* |
-| `--s3-endpoint` value | Specify s3 endpoint url (default: "s3.amazonaws.com") | * |
-| `--access-key` value | Specify s3 accessKey | *|
-| `--secret-key` value | Specify s3 secretKey | *|
-| `--bucket-name` value | Specify s3 bucket name | *|
-| `--region` value | Specify the s3 bucket location (optional) | *|
-| `--ssh-agent-auth` | [Use SSH Agent Auth defined by SSH_AUTH_SOCK]({{< baseurl >}}/rke/latest/en/config-options/#ssh-agent) | |
-| `--ignore-docker-version` | [Disable Docker version check]({{< baseurl >}}/rke/latest/en/config-options/#supported-docker-versions) |
-
-### Example of Restoring from a Local Snapshot
-
-When restoring etcd from a local snapshot, the snapshot is assumed to be located in `/opt/rke/etcd-snapshots`. In versions prior to v0.2.0, the `pki.bundle.tar.gz` file is also expected to be in the same location. As of v0.2.0, this file is no longer needed as v0.2.0 has changed how the [Kubernetes cluster state is stored]({{< baseurl >}}/rke/latest/en/installation/#kubernetes-cluster-state).
-
-```
-$ rke etcd snapshot-restore --config cluster.yml --name mysnapshot
-```
-
-### Example of Restoring from a Snapshot in S3
-
-_Available as of v0.2.0_
-
-> **Note:** Ensure your `cluster.rkestate` is present before starting the restore, as this contains your certificate data for the cluster
-
-When restoring etcd from a snapshot located in S3, the command needs the S3 information in order to connect to the S3 backend and retrieve the snapshot.
-
-```shell
-$ rke etcd snapshot-restore --config cluster.yml --name snapshot-name \
---s3 --access-key S3_ACCESS_KEY --secret-key S3_SECRET_KEY \
---bucket-name s3-bucket-name --s3-endpoint s3.amazonaws.com
-```
-> **Note:** if you were restoring a cluster that had rancher installed the UI should start-up after a few minutes; you don't need to re-run helm.
-
-### Example Scenario of restoring from a Local Snapshot
-
-In this example, the Kubernetes cluster was deployed on two AWS nodes.
-
-| Name | IP | Role |
-|:-----:|:--------:|:----------------------:|
-| node1 | 10.0.0.1 | [controlplane, worker] |
-| node2 | 10.0.0.2 | [etcd] |
-
-### Back up the `etcd` cluster
-
-Take a local snapshot of the Kubernetes cluster. As of v0.2.0, you can also upload this snapshot directly to a S3 backend with the [S3 options](#options-for-rke-etcd-snapshot-save).
-
-```
-$ rke etcd snapshot-save --name snapshot.db --config cluster.yml
-```
-
-{{< img "/img/rke/rke-etcd-backup.png" "etcd snapshot">}}
-
-
-### Store the Snapshot Externally in S3
-
-As of v0.2.0, this step is no longer required, as RKE can upload and download snapshots automatically from S3 by adding in [S3 options](#options-for-rke-etcd-snapshot-save) when running the `rke etcd snapshot-save` command.
-
-After taking the etcd snapshot on `node2`, we recommend saving this backup in a persistence place. One of the options is to save the backup and `pki.bundle.tar.gz` file on a S3 bucket or tape backup.
-
-> **Note:** As of v0.2.0, the file **pki.bundle.tar.gz** is no longer required for the restore process.
-
-```
-# If you're using an AWS host and have the ability to connect to S3
-root@node2:~# s3cmd mb s3://rke-etcd-backup
-root@node2:~# s3cmd /opt/rke/etcd-snapshots/snapshot.db /opt/rke/etcd-snapshots/pki.bundle.tar.gz s3://rke-etcd-backup/
-```
-
-### Place the backup on a new node
-
-To simulate the failure, let's power down `node2`.
-
-```
-root@node2:~# poweroff
-```
-
-| Name | IP | Role |
-|:-----:|:--------:|:----------------------:|
-| node1 | 10.0.0.1 | [controlplane, worker] |
-| ~~node2~~ | ~~10.0.0.2~~ | ~~[etcd]~~ |
-| node3 | 10.0.0.3 | [etcd] |
-| | | |
-
-
-Before restoring etcd and running `rke up`, we need to retrieve the backup saved on S3 to a new node, e.g. `node3`. As of v0.2.0, you can directly retrieve the snapshot from S3 when running the restore command, so this step is for users who stored the snapshot externally without using the integrated S3 options.
-
-```
-# Make a Directory
-root@node3:~# mkdir -p /opt/rke/etcdbackup
-# Get the Backup from S3
-root@node3:~# s3cmd get s3://rke-etcd-backup/snapshot.db /opt/rke/etcd-snapshots/snapshot.db
-# Get the pki bundle from S3, only needed prior to v0.2.0
-root@node3:~# s3cmd get s3://rke-etcd-backup/pki.bundle.tar.gz /opt/rke/etcd-snapshots/pki.bundle.tar.gz
-```
-
-### Restore `etcd` on the new node from the backup
-
-Before updating and restoring etcd, you will need to add the new node into the Kubernetes cluster with the `etcd` role. In the `cluster.yml`, comment out the old node and add in the new node. `
-
-```yaml
-nodes:
-  - address: 10.0.0.1
-    hostname_override: node1
-    user: ubuntu
-    role:
-      - controlplane
-      - worker
-#  - address: 10.0.0.2
-#    hostname_override: node2
-#    user: ubuntu
-#    role:
-#      - etcd
-  - address: 10.0.0.3
-    hostname_override: node3
-    user: ubuntu
-    role:
-      - etcd
-```
-
-After the new node is added to the `cluster.yml`, run `rke etcd snapshot-restore` to launch `etcd` from the backup. The snapshot and `pki.bundle.tar.gz` file are expected to be saved at `/opt/rke/etcd-snapshots`.
-As of v0.2.0, if you want to directly retrieve the snapshot from S3, add in the [S3 options](#options-for-rke-etcd-snapshot-restore).
-
-> **Note:** As of v0.2.0, the file **pki.bundle.tar.gz** is no longer required for the restore process as the certificates required to restore are preserved within the `cluster.rkestate`
-
-```
-$ rke etcd snapshot-restore --name snapshot.db --config cluster.yml
-```
-
-Finally, we need to restore the operations on the cluster by making the Kubernetes API point to the new `etcd` by running `rke up` again using the new `cluster.yml`.
-
-```
-$ rke up --config cluster.yml
-```
-
-Confirm that your Kubernetes cluster is functional by checking the pods on your cluster.
-
-```
-> kubectl get pods
-NAME                     READY     STATUS    RESTARTS   AGE
-nginx-65899c769f-kcdpr   1/1       Running   0          17s
-nginx-65899c769f-pc45c   1/1       Running   0          17s
-nginx-65899c769f-qkhml   1/1       Running   0          17s
-```
->>>>>>> Stashed changes
 
 ## Troubleshooting