diff --git a/docs/how-to-guides/new-user-guides/kubernetes-clusters-in-rancher-setup/migrate-to-an-out-of-tree-cloud-provider/migrate-to-out-of-tree-azure.md b/docs/how-to-guides/new-user-guides/kubernetes-clusters-in-rancher-setup/migrate-to-an-out-of-tree-cloud-provider/migrate-to-out-of-tree-azure.md
new file mode 100644
index 00000000000..9f77591a582
--- /dev/null
+++ b/docs/how-to-guides/new-user-guides/kubernetes-clusters-in-rancher-setup/migrate-to-an-out-of-tree-cloud-provider/migrate-to-out-of-tree-azure.md
@@ -0,0 +1,211 @@
+---
+title: Migrating Azure In-tree to Out-of-tree
+---
+
+
+
+
+Kubernetes is moving away from maintaining cloud providers in-tree.
+
+Starting with Kubernetes 1.29, in-tree cloud providers have been disabled. To continue using the in-tree Azure cloud provider, you must turn off the `DisableCloudProviders` and `DisableKubeletCloudCredentialProvider` feature gates; otherwise, migrate from the in-tree cloud provider to the out-of-tree provider. You can disable the required feature gates by setting `feature-gates=DisableCloudProviders=false` as an additional argument for the cluster's Kubelet, Controller Manager, and API Server in the advanced cluster configuration. Additionally, set `DisableKubeletCloudCredentialProvider=false` in the Kubelet's arguments to keep the in-tree functionality for authenticating to Azure container registries for image pull credentials. See the [upstream docs](https://github.com/kubernetes/kubernetes/pull/117503) for more details.
+
+In Kubernetes v1.30 and later, the in-tree cloud providers have been removed. Rancher allows you to upgrade to Kubernetes v1.30 when you migrate from an in-tree to an out-of-tree provider.
+
+To migrate from the in-tree cloud provider to the out-of-tree Azure cloud provider, you must stop the existing cluster's kube controller manager and install the Azure cloud controller manager.
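If you need to keep the in-tree provider running on Kubernetes v1.29 while you plan the migration, the feature gates above can be expressed in the cluster YAML. The fragment below is a sketch, not a definitive configuration: it assumes the same RKE2 `rkeConfig` schema as the examples later in this guide, and the argument names should be verified against your cluster's advanced configuration before applying.

```yaml
# Sketch: re-enable the in-tree cloud provider on Kubernetes v1.29.
# The rkeConfig field placement below is an assumption based on the examples
# in this guide; confirm it against your cluster's full YAML.
spec:
  rkeConfig:
    machineGlobalConfig:
      kube-apiserver-arg:
        - feature-gates=DisableCloudProviders=false
      kube-controller-manager-arg:
        - feature-gates=DisableCloudProviders=false
    machineSelectorConfig:
      - config:
          kubelet-arg:
            - feature-gates=DisableCloudProviders=false,DisableKubeletCloudCredentialProvider=false
```

This only buys time on v1.29; the gates cannot be re-enabled on v1.30 and later, where the in-tree provider is removed entirely.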
+
+If it's acceptable to have some downtime during migration, follow the instructions to [set up an external cloud provider](../set-up-cloud-providers/azure.md#using-the-out-of-tree-azure-cloud-provider). These instructions outline how to configure the out-of-tree cloud provider for a newly provisioned cluster. During setup, there will be some downtime, as there is a time gap between when the old cloud provider stops running and when the new cloud provider starts to run.
+
+If your setup can't tolerate any control plane downtime, you must enable leader migration. This facilitates a smooth transition from the controllers in the kube controller manager to their counterparts in the cloud controller manager.
+
+:::note Important:
+The Kubernetes [cloud controller migration documentation](https://kubernetes.io/docs/tasks/administer-cluster/controller-manager-leader-migration/#before-you-begin) states that it's possible to migrate with the same Kubernetes version, but assumes that the migration is part of a Kubernetes upgrade. Refer to the Kubernetes documentation on [migrating to use the cloud controller manager](https://kubernetes.io/docs/tasks/administer-cluster/controller-manager-leader-migration/) to see if you need to customize your setup before migrating. Confirm your [migration configuration values](https://kubernetes.io/docs/tasks/administer-cluster/controller-manager-leader-migration/#default-configuration). If your cloud provider provides an implementation of the Node IPAM controller, you also need to [migrate the IPAM controller](https://kubernetes.io/docs/tasks/administer-cluster/controller-manager-leader-migration/#node-ipam-controller-migration).
+
+Starting with Kubernetes v1.26, in-tree persistent volume types `kubernetes.io/azure-disk` and `kubernetes.io/azure-file` are deprecated and no longer supported.
There are no plans to remove these drivers following their deprecation; however, you should migrate to the corresponding CSI drivers, `disk.csi.azure.com` and `file.csi.azure.com`. To review the migration options for your storage classes and upgrade your cluster to use the Azure Disks and Azure Files CSI drivers, see [Migrate from in-tree to CSI drivers](https://learn.microsoft.com/en-us/azure/aks/csi-migrate-in-tree-volumes).
+:::
+
+
+
+
+1. Update the cluster config to enable leader migration:
+
+```yaml
+spec:
+  rkeConfig:
+    machineSelectorConfig:
+      - config:
+          kube-controller-manager-arg:
+            - enable-leader-migration
+        machineLabelSelector:
+          matchExpressions:
+            - key: rke.cattle.io/control-plane-role
+              operator: In
+              values:
+                - 'true'
+```
+
+Note that the cloud provider is still `azure` at this step:
+
+```yaml
+spec:
+  rkeConfig:
+    machineGlobalConfig:
+      cloud-provider-name: azure
+```
+
+2. Cordon control plane nodes so that Azure cloud controller pods run on nodes only after upgrading to the external cloud provider:
+
+```shell
+kubectl cordon -l "node-role.kubernetes.io/control-plane=true"
+```
+
+3. To deploy the Azure cloud controller manager, use any of the available options:
+- UI: Follow steps 1-10 of [Helm chart installation from UI](../set-up-cloud-providers/azure.md#helm-chart-installation-from-ui) to install the cloud controller manager chart.
+- CLI: Follow steps 1-4 of [Helm chart installation from CLI](../set-up-cloud-providers/azure.md#helm-chart-installation-from-cli).
+- Update the cluster's additional manifest: Follow steps 2-3 to [install the cloud controller manager chart](../set-up-cloud-providers/azure.md#using-the-out-of-tree-azure-cloud-provider).
+
+Confirm that the chart is installed but that the new pods aren't running yet due to the cordoned control plane nodes.
+
+4. 
To enable leader migration, add `--enable-leader-migration` to the container arguments of `cloud-controller-manager`: + +```shell +kubectl -n kube-system patch deployment cloud-controller-manager \ +--type=json \ +-p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--enable-leader-migration"}]' +``` + +5. Update the provisioning cluster to change the cloud provider and remove leader migration args from the kube controller manager. + If upgrading the Kubernetes version, set the Kubernetes version as well in the `spec.kubernetesVersion` section of the cluster YAML file. + +```yaml +spec: + rkeConfig: + machineGlobalConfig: + cloud-provider-name: external +``` + +Remove `enable-leader-migration` from the kube controller manager: + +```yaml +spec: + rkeConfig: + machineSelectorConfig: + - config: + kube-controller-manager-arg: + - enable-leader-migration + machineLabelSelector: + matchExpressions: + - key: rke.cattle.io/control-plane-role + operator: In + values: + - 'true' +``` + +6. Uncordon control plane nodes so that Azure cloud controller pods now run on nodes: + +```shell +kubectl uncordon -l "node-role.kubernetes.io/control-plane=true" +``` + +7. Update the cluster. The `cloud-controller-manager` pods should now be running. + +```shell +kubectl rollout status deployment -n kube-system cloud-controller-manager +kubectl rollout status daemonset -n kube-system cloud-node-manager +``` + +8. The cloud provider is responsible for setting the ProviderID of the node. Check if all nodes are initialized with the ProviderID: + +```shell +kubectl describe nodes | grep "ProviderID" +``` + +9. (Optional) You can also disable leader migration after the upgrade, as leader migration is not required with only one cloud-controller-manager. + Update the `cloud-controller-manager` deployment to remove leader migration from the container arguments: + +```yaml +- --enable-leader-migration=true +``` + + + + + +1. 
Update the cluster config to enable leader migration in `cluster.yml`: + +```yaml +services: + kube-controller: + extra_args: + enable-leader-migration: "true" +``` + +Note that the cloud provider is still `azure` at this step: + +```yaml +cloud_provider: + name: azure +``` + +2. Cordon the control plane nodes, so that Azure cloud controller pods run on nodes only after upgrading to the external cloud provider: + +```shell +kubectl cordon -l "node-role.kubernetes.io/controlplane=true" +``` + +3. To install the Azure cloud controller manager, follow the same steps as when installing Azure cloud provider on a new cluster: +- UI: Follow steps 1-10 of [Helm chart installation from UI](../set-up-cloud-providers/azure.md#helm-chart-installation-from-ui) to install the cloud controller manager chart. +- CLI: Follow steps 1-4 of [Helm chart installation from CLI](../set-up-cloud-providers/azure.md#helm-chart-installation-from-cli) to install the cloud controller manager chart. + +4. Confirm that the chart is installed but that the new pods aren't running yet due to cordoned controlplane nodes. After updating the cluster in the next step, RKE will upgrade and uncordon each node, and schedule `cloud-controller-manager` pods. + +5. To enable leader migration, add `--enable-leader-migration` to the container arguments of `cloud-controller-manager`: + +```shell +kubectl -n kube-system patch deployment cloud-controller-manager \ +--type=json \ +-p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--enable-leader-migration"}]' +``` + +6. Update `cluster.yml` to change the cloud provider to `external` and remove the leader migration arguments from the kube-controller. + +```yaml +rancher_kubernetes_engine_config: + cloud_provider: + name: external +``` + +Remove `enable-leader-migration` if you don't want it enabled in your cluster: + +```yaml +services: + kube-controller: + extra_args: + enable-leader-migration: "true" +``` + +7. 
If you're upgrading the cluster's Kubernetes version, set the Kubernetes version as well. + +8. Update the cluster. The `cloud-controller-manager` pods should now be running. + +```shell +kubectl rollout status deployment -n kube-system cloud-controller-manager +kubectl rollout status daemonset -n kube-system cloud-node-manager +``` + +9. The cloud provider is responsible for setting the ProviderID of the node. Verify that all nodes are initialized with the ProviderID: + +```shell +kubectl describe nodes | grep "ProviderID" +``` + +10. (Optional) You can also disable leader migration after the upgrade, as leader migration is not required with only one cloud-controller-manager. + Update the `cloud-controller-manager` deployment to remove leader migration from the container arguments: + +```yaml +- --enable-leader-migration=true +``` + + + + diff --git a/docs/how-to-guides/new-user-guides/kubernetes-clusters-in-rancher-setup/set-up-cloud-providers/amazon.md b/docs/how-to-guides/new-user-guides/kubernetes-clusters-in-rancher-setup/set-up-cloud-providers/amazon.md index 6bf997b36f2..b49ca3f3ca3 100644 --- a/docs/how-to-guides/new-user-guides/kubernetes-clusters-in-rancher-setup/set-up-cloud-providers/amazon.md +++ b/docs/how-to-guides/new-user-guides/kubernetes-clusters-in-rancher-setup/set-up-cloud-providers/amazon.md @@ -737,7 +737,7 @@ nodeSelector: 10. 
Install the chart and confirm that the Daemonset `aws-cloud-controller-manager` deploys successfully: ```shell -kubectl rollout status daemonset -n kube-system aws-cloud-controller-manager +kubectl rollout status deployment -n kube-system aws-cloud-controller-manager ``` diff --git a/docs/how-to-guides/new-user-guides/kubernetes-clusters-in-rancher-setup/set-up-cloud-providers/azure.md b/docs/how-to-guides/new-user-guides/kubernetes-clusters-in-rancher-setup/set-up-cloud-providers/azure.md index 8720aa0760e..c291376354a 100644 --- a/docs/how-to-guides/new-user-guides/kubernetes-clusters-in-rancher-setup/set-up-cloud-providers/azure.md +++ b/docs/how-to-guides/new-user-guides/kubernetes-clusters-in-rancher-setup/set-up-cloud-providers/azure.md @@ -6,6 +6,17 @@ title: Setting up the Azure Cloud Provider +:::note Important: + +In Kubernetes 1.30 and later, you must use an out-of-tree Azure cloud provider. The Azure cloud provider has been [removed completely](https://github.com/kubernetes/kubernetes/pull/122857), and won't work after an upgrade to Kubernetes 1.30. The steps listed below are still required to set up an Azure cloud provider. You can [set up an out-of-tree cloud provider](#using-the-out-of-tree-azure-cloud-provider) after completing the prerequisites for Azure. + +You can also [migrate from an in-tree to an out-of-tree Azure cloud provider](../migrate-to-an-out-of-tree-cloud-provider/migrate-to-out-of-tree-azure.md) on Kubernetes 1.29 and earlier. All existing clusters must migrate prior to upgrading to v1.30 in order to stay functional. + +Starting with Kubernetes 1.29, in-tree cloud providers have been disabled. You must disable `DisableCloudProviders` and `DisableKubeletCloudCredentialProvider` to use the in-tree Azure cloud provider. You can do this by setting `feature-gates=DisableCloudProviders=false` as an additional argument for the cluster's Kubelet, Controller Manager, and API Server in the advanced cluster configuration. 
Additionally, set `DisableKubeletCloudCredentialProvider=false` in the Kubelet's arguments to enable in-tree functionality for authenticating to Azure container registries for image pull credentials. See [upstream docs](https://github.com/kubernetes/kubernetes/pull/117503) for more details. + +Starting with Kubernetes version 1.26, in-tree persistent volume types `kubernetes.io/azure-disk` and `kubernetes.io/azure-file` are deprecated and will no longer be supported. For new clusters, [install the CSI drivers](#installing-csi-drivers), or migrate to the corresponding CSI drivers `disk.csi.azure.com` and `file.csi.azure.com` by following the [upstream migration documentation](https://learn.microsoft.com/en-us/azure/aks/csi-migrate-in-tree-volumes). +::: + When using the `Azure` cloud provider, you can leverage the following capabilities: - **Load Balancers:** Launches an Azure Load Balancer within a specific Network Security Group. @@ -76,12 +87,15 @@ Only hosts expected to be load balancer back ends need to be in this group. ## RKE2 Cluster Set-up in Rancher +:::note Important: +This section is valid only for creating clusters with the in-tree cloud provider. +::: + 1. Choose "Azure" from the Cloud Provider drop-down in the Cluster Configuration section. -1. * Supply the Cloud Provider Configuration. Note that Rancher will automatically create a new Network Security Group, Resource Group, Availability Set, Subnet, and Virtual Network. If you already have some or all of these created, you will need to specify them before creating the cluster. - * You can click on "Show Advanced" to see more of these automatically generated names and update them if - necessary. Your Cloud Provider Configuration **must** match the fields in the Machine Pools section. If you have multiple pools, they must all use the same Resource Group, Availability Set, Subnet, Virtual Network, and Network Security Group. - * An example is provided below. You will modify it as needed. +2. 
Supply the Cloud Provider Configuration. Note that Rancher automatically creates a new Network Security Group, Resource Group, Availability Set, Subnet, and Virtual Network. If you already have some or all of these created, you must specify them before creating the cluster. + * Click **Show Advanced** to view or edit these automatically generated names. Your Cloud Provider Configuration **must** match the fields in the **Machine Pools** section. If you have multiple pools, they must all use the same Resource Group, Availability Set, Subnet, Virtual Network, and Network Security Group. + * An example is provided below. Modify it as needed.
Example Cloud Provider Config @@ -110,6 +124,492 @@ Only hosts expected to be load balancer back ends need to be in this group.
-1. Under the **Cluster Configuration > Advanced** section, click **Add** under **Additional Controller Manager Args** and add this flag: `--configure-cloud-routes=false`
+3. Under the **Cluster Configuration > Advanced** section, click **Add** under **Additional Controller Manager Args** and add this flag: `--configure-cloud-routes=false`
 
-1. Click the **Create** button to submit the form and create the cluster.
+4. Click **Create** to submit the form and create the cluster.
+
+## Cloud Provider Configuration
+
+Rancher automatically creates a new Network Security Group, Resource Group, Availability Set, Subnet, and Virtual Network. If you already have some or all of these created, you must specify them before creating the cluster. You can check **RKE1 Node Templates** or **RKE2 Machine Pools** to view or edit these automatically generated names.
+
+**Refer to the full list of configuration options in the [upstream docs](https://cloud-provider-azure.sigs.k8s.io/install/configs/).**
+
+:::note
+1. `useInstanceMetadata` must be set to `true` for the cloud provider to correctly configure `providerID`.
+2. `excludeMasterFromStandardLB` must be set to `false` if you need to add nodes labeled `node-role.kubernetes.io/master` to the backend of the Azure Load Balancer (ALB).
+3. `loadBalancerSku` can be set to `basic` or `standard`. The Basic SKU will be retired in September 2025. Refer to the [Azure upstream docs](https://learn.microsoft.com/en-us/azure/virtual-network/ip-services/public-ip-basic-upgrade-guidance#basic-sku-vs-standard-sku) for more information.
+:::
+
+Azure supports reading the cloud config from Kubernetes secrets. The secret is a serialized version of the azure.json file. When the secret changes, the cloud controller manager reloads the configuration without restarting the pod. It is recommended that the Helm chart read the Cloud Provider Config from the secret.
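As a sketch of one way to create such a secret without writing the manifest by hand, assuming your serialized config already exists in a local `azure.json` file (the file name and path here are illustrative):

```shell
# Create the cloud-config secret that the chart reads. The key name
# "cloud-config" matches the stringData key used in the example manifest;
# azure.json is an assumed local file containing your serialized config.
kubectl -n kube-system create secret generic azure-cloud-config \
  --from-file=cloud-config=azure.json
```

If you prefer a reviewable manifest (for GitOps, for example), the YAML example that follows produces the same secret declaratively.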
+
+Note that the chart reads the Cloud Provider Config from a given secret name in the `kube-system` namespace. Because the cloud controller manager reads Kubernetes secrets, RBAC also needs to be configured. An example secret for the Cloud Provider Config is shown below. Modify it as needed and create the secret.
+
+```yaml
+# azure-cloud-config.yaml
+apiVersion: v1
+kind: Secret
+metadata:
+  name: azure-cloud-config
+  namespace: kube-system
+type: Opaque
+stringData:
+  cloud-config: |-
+    {
+      "cloud": "AzurePublicCloud",
+      "tenantId": "",
+      "subscriptionId": "",
+      "aadClientId": "",
+      "aadClientSecret": "",
+      "resourceGroup": "docker-machine",
+      "location": "westus",
+      "subnetName": "docker-machine",
+      "securityGroupName": "rancher-managed-kqmtsjgJ",
+      "securityGroupResourceGroup": "docker-machine",
+      "vnetName": "docker-machine-vnet",
+      "vnetResourceGroup": "docker-machine",
+      "primaryAvailabilitySetName": "docker-machine",
+      "routeTableResourceGroup": "docker-machine",
+      "cloudProviderBackoff": false,
+      "useManagedIdentityExtension": false,
+      "useInstanceMetadata": true,
+      "loadBalancerSku": "standard",
+      "excludeMasterFromStandardLB": false
+    }
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRole
+metadata:
+  labels:
+    kubernetes.io/cluster-service: "true"
+  name: system:azure-cloud-provider-secret-getter
+rules:
+  - apiGroups: [""]
+    resources: ["secrets"]
+    resourceNames: ["azure-cloud-config"]
+    verbs:
+      - get
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRoleBinding
+metadata:
+  labels:
+    kubernetes.io/cluster-service: "true"
+  name: system:azure-cloud-provider-secret-getter
+roleRef:
+  apiGroup: rbac.authorization.k8s.io
+  kind: ClusterRole
+  name: system:azure-cloud-provider-secret-getter
+subjects:
+  - kind: ServiceAccount
+    name: azure-cloud-config
+    namespace: kube-system
+```
+
+## Using the Out-of-tree Azure Cloud Provider
+
+
+
+
+1. Select **External** from the **Cloud Provider** drop-down in the **Cluster Configuration** section.
+ +2. Prepare the Cloud Provider Configuration to set it in the next step. Note that Rancher automatically creates a new Network Security Group, Resource Group, Availability Set, Subnet, and Virtual Network. If you already have some or all of these created, you must specify them before creating the cluster. + - Click **Show Advanced** to view or edit these automatically generated names. Your Cloud Provider Configuration **must** match the fields in the **Machine Pools** section. If you have multiple pools, they must all use the same Resource Group, Availability Set, Subnet, Virtual Network, and Network Security Group. + +3. Under **Cluster Configuration > Advanced**, click **Add** under **Additional Controller Manager Args** and add this flag: `--configure-cloud-routes=false`. + +Note that the chart reads the Cloud Provider Config from the secret in the `kube-system` namespace. An example secret for the Cloud Provider Config is shown below. Modify it as needed. Refer to the full list of configuration options in the [upstream docs](https://cloud-provider-azure.sigs.k8s.io/install/configs/). 
+
+```yaml
+apiVersion: helm.cattle.io/v1
+kind: HelmChart
+metadata:
+  name: azure-cloud-controller-manager
+  namespace: kube-system
+spec:
+  chart: cloud-provider-azure
+  repo: https://raw.githubusercontent.com/kubernetes-sigs/cloud-provider-azure/master/helm/repo
+  targetNamespace: kube-system
+  bootstrap: true
+  valuesContent: |-
+    infra:
+      clusterName:
+    cloudControllerManager:
+      cloudConfigSecretName: azure-cloud-config
+      cloudConfig: null
+      clusterCIDR: null
+      enableDynamicReloading: 'true'
+      nodeSelector:
+        node-role.kubernetes.io/control-plane: 'true'
+      allocateNodeCidrs: 'false'
+      hostNetworking: true
+      caCertDir: /etc/ssl
+      configureCloudRoutes: 'false'
+      enabled: true
+      tolerations:
+        - effect: NoSchedule
+          key: node-role.kubernetes.io/master
+        - effect: NoSchedule
+          key: node-role.kubernetes.io/control-plane
+          value: 'true'
+        - effect: NoSchedule
+          key: node.cloudprovider.kubernetes.io/uninitialized
+          value: 'true'
+---
+apiVersion: v1
+kind: Secret
+metadata:
+  name: azure-cloud-config
+  namespace: kube-system
+type: Opaque
+stringData:
+  cloud-config: |-
+    {
+      "cloud": "AzurePublicCloud",
+      "tenantId": "",
+      "subscriptionId": "",
+      "aadClientId": "",
+      "aadClientSecret": "",
+      "resourceGroup": "docker-machine",
+      "location": "westus",
+      "subnetName": "docker-machine",
+      "securityGroupName": "rancher-managed-kqmtsjgJ",
+      "securityGroupResourceGroup": "docker-machine",
+      "vnetName": "docker-machine-vnet",
+      "vnetResourceGroup": "docker-machine",
+      "primaryAvailabilitySetName": "docker-machine",
+      "routeTableResourceGroup": "docker-machine",
+      "cloudProviderBackoff": false,
+      "useManagedIdentityExtension": false,
+      "useInstanceMetadata": true,
+      "loadBalancerSku": "standard",
+      "excludeMasterFromStandardLB": false
+    }
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRole
+metadata:
+  labels:
+    kubernetes.io/cluster-service: "true"
+  name: system:azure-cloud-provider-secret-getter
+rules:
+  - apiGroups: [""]
+    resources: ["secrets"]
+    resourceNames: ["azure-cloud-config"]
+    verbs:
+      - get
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRoleBinding
+metadata:
+  labels:
+    kubernetes.io/cluster-service: "true"
+  name: system:azure-cloud-provider-secret-getter
+roleRef:
+  apiGroup: rbac.authorization.k8s.io
+  kind: ClusterRole
+  name: system:azure-cloud-provider-secret-getter
+subjects:
+  - kind: ServiceAccount
+    name: azure-cloud-config
+    namespace: kube-system
+```
+
+4. Click **Create** to submit the form and create the cluster.
+
+
+
+
+
+1. Choose **External** from the **Cloud Provider** drop-down in the **Cluster Options** section. This sets `--cloud-provider=external` for Kubernetes components.
+
+2. Install the `cloud-provider-azure` chart after the cluster finishes provisioning. Note that the cluster is not successfully provisioned, and nodes remain in an `uninitialized` state, until you deploy the cloud controller manager. This can be done [manually using the CLI](#helm-chart-installation-from-cli), or via [Helm charts in the UI](#helm-chart-installation-from-ui).
+
+Refer to the [official Azure upstream documentation](https://cloud-provider-azure.sigs.k8s.io/install/azure-ccm/) for more details on deploying the Cloud Controller Manager.
+
+
+
+
+### Helm Chart Installation from CLI
+
+Official upstream docs for [Helm chart installation](https://github.com/kubernetes-sigs/cloud-provider-azure/tree/master/helm/cloud-provider-azure) can be found on GitHub.
+
+1. Create an `azure-cloud-config` secret with the required [cloud provider config](#cloud-provider-configuration).
+
+```shell
+kubectl apply -f azure-cloud-config.yaml
+```
+
+2. Add the Helm repository:
+
+```shell
+helm repo add azure-cloud-controller-manager https://raw.githubusercontent.com/kubernetes-sigs/cloud-provider-azure/master/helm/repo
+helm repo update
+```
+
+3. 
Create a `values.yaml` file with the following contents to override the default `values.yaml`: + + + + +```yaml +# values.yaml +infra: + clusterName: +cloudControllerManager: + cloudConfigSecretName: azure-cloud-config + cloudConfig: null + clusterCIDR: null + enableDynamicReloading: 'true' + configureCloudRoutes: 'false' + allocateNodeCidrs: 'false' + caCertDir: /etc/ssl + enabled: true + replicas: 1 + hostNetworking: true + nodeSelector: + node-role.kubernetes.io/control-plane: 'true' + tolerations: + - effect: NoSchedule + key: node-role.kubernetes.io/master + - effect: NoSchedule + key: node-role.kubernetes.io/control-plane + value: 'true' + - effect: NoSchedule + key: node.cloudprovider.kubernetes.io/uninitialized + value: 'true' +``` + + + + + +```yaml +# values.yaml +cloudControllerManager: + cloudConfigSecretName: azure-cloud-config + cloudConfig: null + clusterCIDR: null + enableDynamicReloading: 'true' + configureCloudRoutes: 'false' + allocateNodeCidrs: 'false' + caCertDir: /etc/ssl + enabled: true + replicas: 1 + hostNetworking: true + nodeSelector: + node-role.kubernetes.io/controlplane: 'true' + node-role.kubernetes.io/control-plane: null + tolerations: + - effect: NoSchedule + key: node-role.kubernetes.io/controlplane + value: 'true' + - effect: NoSchedule + key: node.cloudprovider.kubernetes.io/uninitialized + value: 'true' +infra: + clusterName: +``` + + + + +4. Install the Helm chart: + +```shell +helm upgrade --install cloud-provider-azure azure-cloud-controller-manager/cloud-provider-azure -n kube-system --values values.yaml +``` + +Verify that the Helm chart installed successfully: + +```shell +helm status cloud-provider-azure -n kube-system +``` + +5. (Optional) Verify that the cloud controller manager update succeeded: + +```shell +kubectl rollout status deployment -n kube-system cloud-controller-manager +kubectl rollout status daemonset -n kube-system cloud-node-manager +``` + +6. 
The cloud provider is responsible for setting the ProviderID of the node. Check if all nodes are initialized with the ProviderID: + +```shell +kubectl describe nodes | grep "ProviderID" +``` + +### Helm Chart Installation from UI + +1. Click **☰**, then select the name of the cluster from the left navigation. + +2. Select **Apps** > **Repositories**. + +3. Click the **Create** button. + +4. Enter `https://raw.githubusercontent.com/kubernetes-sigs/cloud-provider-azure/master/helm/repo` in the **Index URL** field. + +5. Select **Apps** > **Charts** from the left navigation and install **cloud-provider-azure** chart. + +6. Select the namespace, `kube-system`, and enable **Customize Helm options before install**. + +7. Replace `cloudConfig: /etc/kubernetes/azure.json` to read from the Cloud Config Secret and enable dynamic reloading: + +```yaml + cloudConfigSecretName: azure-cloud-config + enableDynamicReloading: 'true' +``` + +8. Update the following fields as required: + +```yaml + allocateNodeCidrs: 'false' + configureCloudRoutes: 'false' + clusterCIDR: null +``` + + + + +9. Rancher-provisioned RKE2 nodes have the selector `node-role.kubernetes.io/control-plane` set to `true`. Update the nodeSelector: +```yaml +nodeSelector: + node-role.kubernetes.io/control-plane: 'true' +``` + + + + +10. Rancher-provisioned RKE nodes are tainted `node-role.kubernetes.io/controlplane`. Update tolerations and the nodeSelector: + +```yaml +tolerations: + - effect: NoSchedule + key: node.cloudprovider.kubernetes.io/uninitialized + value: 'true' + - effect: NoSchedule + value: 'true' + key: node-role.kubernetes.io/controlplane +``` + +```yaml +nodeSelector: + node-role.kubernetes.io/controlplane: 'true' +``` + + + +11. Install the chart and confirm that the cloud controller and cloud node manager deployed successfully: + +```shell +kubectl rollout status deployment -n kube-system cloud-controller-manager +kubectl rollout status daemonset -n kube-system cloud-node-manager +``` + +12. 
The cloud provider is responsible for setting the ProviderID of the node. Check if all nodes are initialized with the ProviderID: + +```shell +kubectl describe nodes | grep "ProviderID" +``` + +### Installing CSI Drivers + +Install [Azure Disk CSI driver](https://github.com/kubernetes-sigs/azuredisk-csi-driver) or [Azure File CSI Driver](https://github.com/kubernetes-sigs/azurefile-csi-driver) to access [Azure Disk](https://azure.microsoft.com/en-us/services/storage/disks/) or [Azure File](https://azure.microsoft.com/en-us/services/storage/disks/) volumes respectively. + +The steps to install the Azure Disk CSI driver are shown below. You can install the Azure File CSI Driver in a similar manner by following the [helm installation documentation](https://github.com/kubernetes-sigs/azurefile-csi-driver/blob/master/charts/README.md). + +::: note Important: + +Clusters must be provisioned using `Managed Disk` to use Azure Disk. You can configure this when creating **RKE1 Node Templates** or **RKE2 Machine Pools*. + +::: + +Official upstream docs for [Helm chart installation](https://github.com/kubernetes-sigs/azuredisk-csi-driver/blob/master/charts/README.md) can be found on Github. + +1. Add and update the helm repository: + +```shell +helm repo add azuredisk-csi-driver https://raw.githubusercontent.com/kubernetes-sigs/azuredisk-csi-driver/master/charts +helm repo update azuredisk-csi-driver +``` + +1. Install the chart as shown below, updating the --version argument as needed. Refer to the full list of latest chart configurations in the [upstream docs](https://github.com/kubernetes-sigs/azuredisk-csi-driver/blob/master/charts/README.md#latest-chart-configuration). + +```shell +helm install azuredisk-csi-driver azuredisk-csi-driver/azuredisk-csi-driver --namespace kube-system --version v1.30.1 --set controller.cloudConfigSecretName=azure-cloud-config --set controller.cloudConfigSecretNamespace=kube-system --set controller.runOnControlPlane=true +``` + +2. 
(Optional) Verify that the azuredisk-csi-driver installation succeeded:
+
+```shell
+kubectl --namespace=kube-system get pods --selector="app.kubernetes.io/name=azuredisk-csi-driver" --watch
+```
+
+3. Provision an example Storage Class:
+
+```shell
+cat <
+```
+ +```yaml +spec: + rkeConfig: + machineGlobalConfig: + cloud-provider-name: external +``` + +Remove `enable-leader-migration` from the kube controller manager: + +```yaml +spec: + rkeConfig: + machineSelectorConfig: + - config: + kube-controller-manager-arg: + - enable-leader-migration + machineLabelSelector: + matchExpressions: + - key: rke.cattle.io/control-plane-role + operator: In + values: + - 'true' +``` + +6. Uncordon control plane nodes so that Azure cloud controller pods now run on nodes: + +```shell +kubectl uncordon -l "node-role.kubernetes.io/control-plane=true" +``` + +7. Update the cluster. The `cloud-controller-manager` pods should now be running. + +```shell +kubectl rollout status deployment -n kube-system cloud-controller-manager +kubectl rollout status daemonset -n kube-system cloud-node-manager +``` + +8. The cloud provider is responsible for setting the ProviderID of the node. Check if all nodes are initialized with the ProviderID: + +```shell +kubectl describe nodes | grep "ProviderID" +``` + +9. (Optional) You can also disable leader migration after the upgrade, as leader migration is not required with only one cloud-controller-manager. + Update the `cloud-controller-manager` deployment to remove leader migration from the container arguments: + +```yaml +- --enable-leader-migration=true +``` + + + + + +1. Update the cluster config to enable leader migration in `cluster.yml`: + +```yaml +services: + kube-controller: + extra_args: + enable-leader-migration: "true" +``` + +Note that the cloud provider is still `azure` at this step: + +```yaml +cloud_provider: + name: azure +``` + +2. Cordon the control plane nodes, so that Azure cloud controller pods run on nodes only after upgrading to the external cloud provider: + +```shell +kubectl cordon -l "node-role.kubernetes.io/controlplane=true" +``` + +3. 
To install the Azure cloud controller manager, follow the same steps as when installing Azure cloud provider on a new cluster: +- UI: Follow steps 1-10 of [Helm chart installation from UI](../set-up-cloud-providers/azure.md#helm-chart-installation-from-ui) to install the cloud controller manager chart. +- CLI: Follow steps 1-4 of [Helm chart installation from CLI](../set-up-cloud-providers/azure.md#helm-chart-installation-from-cli) to install the cloud controller manager chart. + +4. Confirm that the chart is installed but that the new pods aren't running yet due to cordoned controlplane nodes. After updating the cluster in the next step, RKE will upgrade and uncordon each node, and schedule `cloud-controller-manager` pods. + +5. To enable leader migration, add `--enable-leader-migration` to the container arguments of `cloud-controller-manager`: + +```shell +kubectl -n kube-system patch deployment cloud-controller-manager \ +--type=json \ +-p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--enable-leader-migration"}]' +``` + +6. Update `cluster.yml` to change the cloud provider to `external` and remove the leader migration arguments from the kube-controller. + +```yaml +rancher_kubernetes_engine_config: + cloud_provider: + name: external +``` + +Remove `enable-leader-migration` if you don't want it enabled in your cluster: + +```yaml +services: + kube-controller: + extra_args: + enable-leader-migration: "true" +``` + +7. If you're upgrading the cluster's Kubernetes version, set the Kubernetes version as well. + +8. Update the cluster. The `cloud-controller-manager` pods should now be running. + +```shell +kubectl rollout status deployment -n kube-system cloud-controller-manager +kubectl rollout status daemonset -n kube-system cloud-node-manager +``` + +9. The cloud provider is responsible for setting the ProviderID of the node. 
Verify that all nodes are initialized with the ProviderID: + +```shell +kubectl describe nodes | grep "ProviderID" +``` + +10. (Optional) You can also disable leader migration after the upgrade, as leader migration is not required with only one cloud-controller-manager. + Update the `cloud-controller-manager` deployment to remove leader migration from the container arguments: + +```yaml +- --enable-leader-migration=true +``` + + + + diff --git a/versioned_docs/version-2.9/how-to-guides/new-user-guides/kubernetes-clusters-in-rancher-setup/set-up-cloud-providers/amazon.md b/versioned_docs/version-2.9/how-to-guides/new-user-guides/kubernetes-clusters-in-rancher-setup/set-up-cloud-providers/amazon.md index 6bf997b36f2..b49ca3f3ca3 100644 --- a/versioned_docs/version-2.9/how-to-guides/new-user-guides/kubernetes-clusters-in-rancher-setup/set-up-cloud-providers/amazon.md +++ b/versioned_docs/version-2.9/how-to-guides/new-user-guides/kubernetes-clusters-in-rancher-setup/set-up-cloud-providers/amazon.md @@ -737,7 +737,7 @@ nodeSelector: 10. 
Install the chart and confirm that the Daemonset `aws-cloud-controller-manager` deploys successfully: ```shell -kubectl rollout status daemonset -n kube-system aws-cloud-controller-manager +kubectl rollout status deployment -n kube-system aws-cloud-controller-manager ``` diff --git a/versioned_docs/version-2.9/how-to-guides/new-user-guides/kubernetes-clusters-in-rancher-setup/set-up-cloud-providers/azure.md b/versioned_docs/version-2.9/how-to-guides/new-user-guides/kubernetes-clusters-in-rancher-setup/set-up-cloud-providers/azure.md index 8720aa0760e..c291376354a 100644 --- a/versioned_docs/version-2.9/how-to-guides/new-user-guides/kubernetes-clusters-in-rancher-setup/set-up-cloud-providers/azure.md +++ b/versioned_docs/version-2.9/how-to-guides/new-user-guides/kubernetes-clusters-in-rancher-setup/set-up-cloud-providers/azure.md @@ -6,6 +6,17 @@ title: Setting up the Azure Cloud Provider +:::note Important: + +In Kubernetes 1.30 and later, you must use an out-of-tree Azure cloud provider. The Azure cloud provider has been [removed completely](https://github.com/kubernetes/kubernetes/pull/122857), and won't work after an upgrade to Kubernetes 1.30. The steps listed below are still required to set up an Azure cloud provider. You can [set up an out-of-tree cloud provider](#using-the-out-of-tree-azure-cloud-provider) after completing the prerequisites for Azure. + +You can also [migrate from an in-tree to an out-of-tree Azure cloud provider](../migrate-to-an-out-of-tree-cloud-provider/migrate-to-out-of-tree-azure.md) on Kubernetes 1.29 and earlier. All existing clusters must migrate prior to upgrading to v1.30 in order to stay functional. + +Starting with Kubernetes 1.29, in-tree cloud providers have been disabled. You must disable `DisableCloudProviders` and `DisableKubeletCloudCredentialProvider` to use the in-tree Azure cloud provider. 
You can do this by setting `feature-gates=DisableCloudProviders=false` as an additional argument for the cluster's Kubelet, Controller Manager, and API Server in the advanced cluster configuration. Additionally, set `DisableKubeletCloudCredentialProvider=false` in the Kubelet's arguments to enable in-tree functionality for authenticating to Azure container registries for image pull credentials. See [upstream docs](https://github.com/kubernetes/kubernetes/pull/117503) for more details. + +Starting with Kubernetes version 1.26, in-tree persistent volume types `kubernetes.io/azure-disk` and `kubernetes.io/azure-file` are deprecated and will no longer be supported. For new clusters, [install the CSI drivers](#installing-csi-drivers), or migrate to the corresponding CSI drivers `disk.csi.azure.com` and `file.csi.azure.com` by following the [upstream migration documentation](https://learn.microsoft.com/en-us/azure/aks/csi-migrate-in-tree-volumes). +::: + When using the `Azure` cloud provider, you can leverage the following capabilities: - **Load Balancers:** Launches an Azure Load Balancer within a specific Network Security Group. @@ -76,12 +87,15 @@ Only hosts expected to be load balancer back ends need to be in this group. ## RKE2 Cluster Set-up in Rancher +:::note Important: +This section is valid only for creating clusters with the in-tree cloud provider. +::: + 1. Choose "Azure" from the Cloud Provider drop-down in the Cluster Configuration section. -1. * Supply the Cloud Provider Configuration. Note that Rancher will automatically create a new Network Security Group, Resource Group, Availability Set, Subnet, and Virtual Network. If you already have some or all of these created, you will need to specify them before creating the cluster. - * You can click on "Show Advanced" to see more of these automatically generated names and update them if - necessary. Your Cloud Provider Configuration **must** match the fields in the Machine Pools section. 
If you have multiple pools, they must all use the same Resource Group, Availability Set, Subnet, Virtual Network, and Network Security Group. - * An example is provided below. You will modify it as needed. +2. Supply the Cloud Provider Configuration. Note that Rancher automatically creates a new Network Security Group, Resource Group, Availability Set, Subnet, and Virtual Network. If you already have some or all of these created, you must specify them before creating the cluster. + * Click **Show Advanced** to view or edit these automatically generated names. Your Cloud Provider Configuration **must** match the fields in the **Machine Pools** section. If you have multiple pools, they must all use the same Resource Group, Availability Set, Subnet, Virtual Network, and Network Security Group. + * An example is provided below. Modify it as needed.
Example Cloud Provider Config @@ -110,6 +124,492 @@ Only hosts expected to be load balancer back ends need to be in this group.
-1. Under the **Cluster Configuration > Advanced** section, click **Add** under **Additional Controller Manager Args** and add this flag: `--configure-cloud-routes=false`
+3. Under the **Cluster Configuration > Advanced** section, click **Add** under **Additional Controller Manager Args** and add this flag: `--configure-cloud-routes=false`
 
-1. Click the **Create** button to submit the form and create the cluster.
+4. Click **Create** to submit the form and create the cluster.
+
+## Cloud Provider Configuration
+
+Rancher automatically creates a new Network Security Group, Resource Group, Availability Set, Subnet, and Virtual Network. If you already have some or all of these created, you will need to specify them before creating the cluster. You can check **RKE1 Node Templates** or **RKE2 Machine Pools** to view or edit these automatically generated names.
+
+**Refer to the full list of configuration options in the [upstream docs](https://cloud-provider-azure.sigs.k8s.io/install/configs/).**
+
+:::note
+1. `useInstanceMetadata` must be set to `true` for the cloud provider to correctly configure `providerID`.
+2. `excludeMasterFromStandardLB` must be set to `false` if you need to add nodes labeled `node-role.kubernetes.io/master` to the backend of the Azure Load Balancer (ALB).
+3. `loadBalancerSku` can be set to `basic` or `standard`. Basic SKU public IP addresses will be retired in September 2025. Refer to the [Azure upstream docs](https://learn.microsoft.com/en-us/azure/virtual-network/ip-services/public-ip-basic-upgrade-guidance#basic-sku-vs-standard-sku) for more information.
+:::
+
+Azure supports reading the cloud config from Kubernetes secrets. The secret is a serialized version of the `azure.json` file. When the secret changes, the cloud controller manager reloads the configuration without restarting the pod. It is recommended that the Helm chart read the Cloud Provider Config from the secret.
+
+Note that the chart reads the Cloud Provider Config from a given secret name in the `kube-system` namespace. Because the cloud provider reads this Kubernetes secret, RBAC must also be configured. An example secret for the Cloud Provider Config is shown below. Modify it as needed and create the secret.
+
+ ```yaml
+# azure-cloud-config.yaml
+apiVersion: v1
+kind: Secret
+metadata:
+  name: azure-cloud-config
+  namespace: kube-system
+type: Opaque
+stringData:
+  cloud-config: |-
+    {
+      "cloud": "AzurePublicCloud",
+      "tenantId": "",
+      "subscriptionId": "",
+      "aadClientId": "",
+      "aadClientSecret": "",
+      "resourceGroup": "docker-machine",
+      "location": "westus",
+      "subnetName": "docker-machine",
+      "securityGroupName": "rancher-managed-kqmtsjgJ",
+      "securityGroupResourceGroup": "docker-machine",
+      "vnetName": "docker-machine-vnet",
+      "vnetResourceGroup": "docker-machine",
+      "primaryAvailabilitySetName": "docker-machine",
+      "routeTableResourceGroup": "docker-machine",
+      "cloudProviderBackoff": false,
+      "useManagedIdentityExtension": false,
+      "useInstanceMetadata": true,
+      "loadBalancerSku": "standard",
+      "excludeMasterFromStandardLB": false
+    }
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRole
+metadata:
+  labels:
+    kubernetes.io/cluster-service: "true"
+  name: system:azure-cloud-provider-secret-getter
+rules:
+  - apiGroups: [""]
+    resources: ["secrets"]
+    resourceNames: ["azure-cloud-config"]
+    verbs:
+      - get
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRoleBinding
+metadata:
+  labels:
+    kubernetes.io/cluster-service: "true"
+  name: system:azure-cloud-provider-secret-getter
+roleRef:
+  apiGroup: rbac.authorization.k8s.io
+  kind: ClusterRole
+  name: system:azure-cloud-provider-secret-getter
+subjects:
+  - kind: ServiceAccount
+    name: azure-cloud-config
+    namespace: kube-system
+ ```
+
+## Using the Out-of-tree Azure Cloud Provider
+
+
+
+
+1. Select **External** from the **Cloud Provider** drop-down in the **Cluster Configuration** section.
+ +2. Prepare the Cloud Provider Configuration to set it in the next step. Note that Rancher automatically creates a new Network Security Group, Resource Group, Availability Set, Subnet, and Virtual Network. If you already have some or all of these created, you must specify them before creating the cluster. + - Click **Show Advanced** to view or edit these automatically generated names. Your Cloud Provider Configuration **must** match the fields in the **Machine Pools** section. If you have multiple pools, they must all use the same Resource Group, Availability Set, Subnet, Virtual Network, and Network Security Group. + +3. Under **Cluster Configuration > Advanced**, click **Add** under **Additional Controller Manager Args** and add this flag: `--configure-cloud-routes=false`. + +Note that the chart reads the Cloud Provider Config from the secret in the `kube-system` namespace. An example secret for the Cloud Provider Config is shown below. Modify it as needed. Refer to the full list of configuration options in the [upstream docs](https://cloud-provider-azure.sigs.k8s.io/install/configs/). 
+
+ ```yaml
+apiVersion: helm.cattle.io/v1
+kind: HelmChart
+metadata:
+  name: azure-cloud-controller-manager
+  namespace: kube-system
+spec:
+  chart: cloud-provider-azure
+  repo: https://raw.githubusercontent.com/kubernetes-sigs/cloud-provider-azure/master/helm/repo
+  targetNamespace: kube-system
+  bootstrap: true
+  valuesContent: |-
+    infra:
+      clusterName:
+    cloudControllerManager:
+      cloudConfigSecretName: azure-cloud-config
+      cloudConfig: null
+      clusterCIDR: null
+      enableDynamicReloading: 'true'
+      nodeSelector:
+        node-role.kubernetes.io/control-plane: 'true'
+      allocateNodeCidrs: 'false'
+      hostNetworking: true
+      caCertDir: /etc/ssl
+      configureCloudRoutes: 'false'
+      enabled: true
+      tolerations:
+        - effect: NoSchedule
+          key: node-role.kubernetes.io/master
+        - effect: NoSchedule
+          key: node-role.kubernetes.io/control-plane
+          value: 'true'
+        - effect: NoSchedule
+          key: node.cloudprovider.kubernetes.io/uninitialized
+          value: 'true'
+---
+apiVersion: v1
+kind: Secret
+metadata:
+  name: azure-cloud-config
+  namespace: kube-system
+type: Opaque
+stringData:
+  cloud-config: |-
+    {
+      "cloud": "AzurePublicCloud",
+      "tenantId": "",
+      "subscriptionId": "",
+      "aadClientId": "",
+      "aadClientSecret": "",
+      "resourceGroup": "docker-machine",
+      "location": "westus",
+      "subnetName": "docker-machine",
+      "securityGroupName": "rancher-managed-kqmtsjgJ",
+      "securityGroupResourceGroup": "docker-machine",
+      "vnetName": "docker-machine-vnet",
+      "vnetResourceGroup": "docker-machine",
+      "primaryAvailabilitySetName": "docker-machine",
+      "routeTableResourceGroup": "docker-machine",
+      "cloudProviderBackoff": false,
+      "useManagedIdentityExtension": false,
+      "useInstanceMetadata": true,
+      "loadBalancerSku": "standard",
+      "excludeMasterFromStandardLB": false
+    }
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRole
+metadata:
+  labels:
+    kubernetes.io/cluster-service: "true"
+  name: system:azure-cloud-provider-secret-getter
+rules:
+  - apiGroups: [""]
+    resources: ["secrets"]
+    resourceNames: ["azure-cloud-config"]
+    verbs:
+      - get
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRoleBinding
+metadata:
+  labels:
+    kubernetes.io/cluster-service: "true"
+  name: system:azure-cloud-provider-secret-getter
+roleRef:
+  apiGroup: rbac.authorization.k8s.io
+  kind: ClusterRole
+  name: system:azure-cloud-provider-secret-getter
+subjects:
+  - kind: ServiceAccount
+    name: azure-cloud-config
+    namespace: kube-system
+ ```
+
+4. Click **Create** to submit the form and create the cluster.
+
+
+
+
+
+1. Choose **External** from the **Cloud Provider** drop-down in the **Cluster Options** section. This sets `--cloud-provider=external` for Kubernetes components.
+
+2. Install the `cloud-provider-azure` chart after the cluster finishes provisioning. Note that the cluster is not successfully provisioned, and nodes remain in an `uninitialized` state, until you deploy the cloud controller manager. This can be done [manually using the CLI](#helm-chart-installation-from-cli) or via [Helm charts in the UI](#helm-chart-installation-from-ui).
+
+Refer to the [official Azure upstream documentation](https://cloud-provider-azure.sigs.k8s.io/install/azure-ccm/) for more details on deploying the Cloud Controller Manager.
+
+
+
+
+### Helm Chart Installation from CLI
+
+Official upstream docs for [Helm chart installation](https://github.com/kubernetes-sigs/cloud-provider-azure/tree/master/helm/cloud-provider-azure) can be found on GitHub.
+
+1. Create an `azure-cloud-config` secret with the required [cloud provider config](#cloud-provider-configuration):
+
+```shell
+kubectl apply -f azure-cloud-config.yaml
+```
+
+2. Add the Helm repository:
+
+```shell
+helm repo add azure-cloud-controller-manager https://raw.githubusercontent.com/kubernetes-sigs/cloud-provider-azure/master/helm/repo
+helm repo update
+```
+
+3. 
Create a `values.yaml` file with the following contents to override the default `values.yaml`: + + + + +```yaml +# values.yaml +infra: + clusterName: +cloudControllerManager: + cloudConfigSecretName: azure-cloud-config + cloudConfig: null + clusterCIDR: null + enableDynamicReloading: 'true' + configureCloudRoutes: 'false' + allocateNodeCidrs: 'false' + caCertDir: /etc/ssl + enabled: true + replicas: 1 + hostNetworking: true + nodeSelector: + node-role.kubernetes.io/control-plane: 'true' + tolerations: + - effect: NoSchedule + key: node-role.kubernetes.io/master + - effect: NoSchedule + key: node-role.kubernetes.io/control-plane + value: 'true' + - effect: NoSchedule + key: node.cloudprovider.kubernetes.io/uninitialized + value: 'true' +``` + + + + + +```yaml +# values.yaml +cloudControllerManager: + cloudConfigSecretName: azure-cloud-config + cloudConfig: null + clusterCIDR: null + enableDynamicReloading: 'true' + configureCloudRoutes: 'false' + allocateNodeCidrs: 'false' + caCertDir: /etc/ssl + enabled: true + replicas: 1 + hostNetworking: true + nodeSelector: + node-role.kubernetes.io/controlplane: 'true' + node-role.kubernetes.io/control-plane: null + tolerations: + - effect: NoSchedule + key: node-role.kubernetes.io/controlplane + value: 'true' + - effect: NoSchedule + key: node.cloudprovider.kubernetes.io/uninitialized + value: 'true' +infra: + clusterName: +``` + + + + +4. Install the Helm chart: + +```shell +helm upgrade --install cloud-provider-azure azure-cloud-controller-manager/cloud-provider-azure -n kube-system --values values.yaml +``` + +Verify that the Helm chart installed successfully: + +```shell +helm status cloud-provider-azure -n kube-system +``` + +5. (Optional) Verify that the cloud controller manager update succeeded: + +```shell +kubectl rollout status deployment -n kube-system cloud-controller-manager +kubectl rollout status daemonset -n kube-system cloud-node-manager +``` + +6. 
The cloud provider is responsible for setting the ProviderID of the node. Check if all nodes are initialized with the ProviderID: + +```shell +kubectl describe nodes | grep "ProviderID" +``` + +### Helm Chart Installation from UI + +1. Click **☰**, then select the name of the cluster from the left navigation. + +2. Select **Apps** > **Repositories**. + +3. Click the **Create** button. + +4. Enter `https://raw.githubusercontent.com/kubernetes-sigs/cloud-provider-azure/master/helm/repo` in the **Index URL** field. + +5. Select **Apps** > **Charts** from the left navigation and install **cloud-provider-azure** chart. + +6. Select the namespace, `kube-system`, and enable **Customize Helm options before install**. + +7. Replace `cloudConfig: /etc/kubernetes/azure.json` to read from the Cloud Config Secret and enable dynamic reloading: + +```yaml + cloudConfigSecretName: azure-cloud-config + enableDynamicReloading: 'true' +``` + +8. Update the following fields as required: + +```yaml + allocateNodeCidrs: 'false' + configureCloudRoutes: 'false' + clusterCIDR: null +``` + + + + +9. Rancher-provisioned RKE2 nodes have the selector `node-role.kubernetes.io/control-plane` set to `true`. Update the nodeSelector: +```yaml +nodeSelector: + node-role.kubernetes.io/control-plane: 'true' +``` + + + + +10. Rancher-provisioned RKE nodes are tainted `node-role.kubernetes.io/controlplane`. Update tolerations and the nodeSelector: + +```yaml +tolerations: + - effect: NoSchedule + key: node.cloudprovider.kubernetes.io/uninitialized + value: 'true' + - effect: NoSchedule + value: 'true' + key: node-role.kubernetes.io/controlplane +``` + +```yaml +nodeSelector: + node-role.kubernetes.io/controlplane: 'true' +``` + + + +11. Install the chart and confirm that the cloud controller and cloud node manager deployed successfully: + +```shell +kubectl rollout status deployment -n kube-system cloud-controller-manager +kubectl rollout status daemonset -n kube-system cloud-node-manager +``` + +12. 
The cloud provider is responsible for setting the ProviderID of the node. Check that all nodes are initialized with the ProviderID:
+
+```shell
+kubectl describe nodes | grep "ProviderID"
+```
+
+### Installing CSI Drivers
+
+Install the [Azure Disk CSI driver](https://github.com/kubernetes-sigs/azuredisk-csi-driver) or [Azure File CSI Driver](https://github.com/kubernetes-sigs/azurefile-csi-driver) to access [Azure Disk](https://azure.microsoft.com/en-us/services/storage/disks/) or [Azure File](https://azure.microsoft.com/en-us/services/storage/files/) volumes respectively.
+
+The steps to install the Azure Disk CSI driver are shown below. You can install the Azure File CSI Driver in a similar manner by following the [Helm installation documentation](https://github.com/kubernetes-sigs/azurefile-csi-driver/blob/master/charts/README.md).
+
+:::note Important:
+
+Clusters must be provisioned using `Managed Disk` to use Azure Disk. You can configure this when creating **RKE1 Node Templates** or **RKE2 Machine Pools**.
+
+:::
+
+Official upstream docs for [Helm chart installation](https://github.com/kubernetes-sigs/azuredisk-csi-driver/blob/master/charts/README.md) can be found on GitHub.
+
+1. Add and update the Helm repository:
+
+```shell
+helm repo add azuredisk-csi-driver https://raw.githubusercontent.com/kubernetes-sigs/azuredisk-csi-driver/master/charts
+helm repo update azuredisk-csi-driver
+```
+
+2. Install the chart as shown below, updating the `--version` argument as needed. Refer to the full list of the latest chart configurations in the [upstream docs](https://github.com/kubernetes-sigs/azuredisk-csi-driver/blob/master/charts/README.md#latest-chart-configuration).
+
+```shell
+helm install azuredisk-csi-driver azuredisk-csi-driver/azuredisk-csi-driver --namespace kube-system --version v1.30.1 --set controller.cloudConfigSecretName=azure-cloud-config --set controller.cloudConfigSecretNamespace=kube-system --set controller.runOnControlPlane=true
+```
+
+3. 
(Optional) Verify that the azuredisk-csi-driver installation succeeded:
+
+```shell
+kubectl --namespace=kube-system get pods --selector="app.kubernetes.io/name=azuredisk-csi-driver" --watch
+```
+
+4. Provision an example Storage Class. The manifest below is a minimal sketch using the `disk.csi.azure.com` provisioner; the class name and `skuName` are illustrative, so adjust them for your environment:
+
+```shell
+cat <<EOF | kubectl apply -f -
+apiVersion: storage.k8s.io/v1
+kind: StorageClass
+metadata:
+  name: managed-csi
+provisioner: disk.csi.azure.com
+parameters:
+  skuName: StandardSSD_LRS
+reclaimPolicy: Delete
+volumeBindingMode: WaitForFirstConsumer
+allowVolumeExpansion: true
+EOF
+```
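+
+With the CSI driver and a StorageClass in place, workloads can request Azure Disk volumes through a PersistentVolumeClaim. The manifest below is a minimal illustration; the class name `managed-csi` is an assumed example and must match a StorageClass that actually exists in your cluster:
+
+```yaml
+apiVersion: v1
+kind: PersistentVolumeClaim
+metadata:
+  name: azure-disk-pvc
+spec:
+  accessModes:
+    - ReadWriteOnce
+  storageClassName: managed-csi
+  resources:
+    requests:
+      storage: 10Gi
+```
+
+After applying the claim, `kubectl get pvc azure-disk-pvc` shows it as `Pending` until a pod consumes it (with `WaitForFirstConsumer` binding), then `Bound` once the disk is provisioned.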