# docs: audit cluster management section #485

Iheanacho-ai wants to merge 1 commit into siderolabs:main.
## Conversation
Force-pushed from 7685b14 to eaab1ce.
Signed-off-by: Amarachi Iheanacho <amarachi.iheanacho@siderolabs.com>
Force-pushed from eaab1ce to a2e0bb3.
```yaml
machines:
  - <existing-worker-uuid>
  - <new-worker-uuid> # add the new machine UUID here
```
You should add a section covering the machine-set case. There it's just a number change, with no UUIDs involved.
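For the machine-set case, a sketch of what that section could show. This assumes the cluster template's `machineClass` block; the class name is hypothetical, and the exact field names should be checked against the Omni cluster template reference:

```yaml
kind: Workers
machineClass:
  name: general-purpose # hypothetical machine class name
  size: 4               # scale workers by changing this number; no UUIDs needed
```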
> To remove a control plane node:

```yaml
kind: ControlPlane
```
nit: your example above puts the ControlPlane example first and workers second; keep the ordering consistent.
```diff
 import { version } from '/snippets/custom-variables.mdx';

-Refer to the general guide on creating a cluster to get started. To create a hybrid cluster, navigate to the cluster, then apply the following cluster patch by clicking on "Config Patches", and create a new patch with the target of "Cluster":
+A hybrid cluster is a Kubernetes cluster whose nodes span multiple networks or infrastructure types, for example, a mix of bare metal machines, cloud virtual machines, on-premises virtual machines, or single-board computers (SBCs).
```
SBC isn't a different type than bare metal.
```diff
 A hybrid cluster is a Kubernetes cluster whose nodes span multiple networks or infrastructure types, for example, a mix of bare metal machines, cloud virtual machines, on-premises virtual machines, or single-board computers (SBCs).

 <img src="./images/create-a-hybrid-cluster-create-patch-kubescan-enabled.png" alt="Create Patch"/>
 By default, Kubernetes assumes all nodes can reach each other directly on the same network. When nodes are spread across different networks, this assumption breaks down. <a href={`../../talos/${version}/networking/kubespan`}>Kubespan</a> addresses this by establishing an encrypted WireGuard tunnel between every node in the cluster, so that all nodes can communicate securely regardless of where they are hosted.
```
Suggested change:

```diff
-By default, Kubernetes assumes all nodes can reach each other directly on the same network. When nodes are spread across different networks, this assumption breaks down. <a href={`../../talos/${version}/networking/kubespan`}>Kubespan</a> addresses this by establishing an encrypted WireGuard tunnel between every node in the cluster, so that all nodes can communicate securely regardless of where they are hosted.
+Kubernetes requires that all nodes can reach each other directly without NAT. When nodes are spread across different networks, this assumption breaks down. <a href={`../../talos/${version}/networking/kubespan`}>Kubespan</a> addresses this by establishing an encrypted WireGuard tunnel between every node in the cluster. The tunnel flattens the network so all nodes can communicate securely regardless of where they are hosted.
```
````diff
 3. Select **Config Patches** from the dropdown.
 4. Click **Create Patch** to open the **Create Patch** page.
 5. Apply the following patch:
 ```yaml
````
Talos 1.13 is going to have multi-document machine configs, so we might want to add tabs for that now. This config will still work, but the new config types are recommended.
```diff
 </Tab>
 </Tabs>

 Once this patch is applied, all node-to-node traffic in the cluster will be encrypted using WireGuard, allowing nodes to communicate with each other securely regardless of which network they are on.
```
We should add a warning that network throughput with WireGuard is significantly lower than without it. If people want native network throughput between nodes on the same network, they need to set up the `filters.excludeAdvertisedNetworks` configuration.
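A sketch of what that warning's example could look like, assuming the `filters.excludeAdvertisedNetworks` field mentioned above sits under `machine.network.kubespan` (the field path and CIDR are assumptions to verify against the Talos KubeSpan reference):

```yaml
machine:
  network:
    kubespan:
      enabled: true
      filters:
        # Hypothetical local subnet: traffic between nodes on this network
        # bypasses the WireGuard tunnel, restoring native throughput.
        excludeAdvertisedNetworks:
          - 192.168.0.0/16
```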
```diff
@@ -1,45 +1,77 @@
 ---
 title: Expose an HTTP Service from a Cluster
```
Suggested change:

```diff
-title: Expose an HTTP Service from a Cluster
+title: Expose a Workload via Service Proxy
```
I'm not sure if this would be easier to search for or clearer about what the document is for. The original title could apply to load balancers or any other way of exposing a service.
```diff
 omni-kube-service-exposer.sidero.dev/label: Sample Nginx
-#omni-kube-service-exposer.sidero.dev/prefix: myservice
+omni-kube-service-exposer.sidero.dev/prefix: myservice
 omni-kube-service-exposer.sidero.dev/icon: H4sICB0B1mQAA25naW54LXN2Z3JlcG8tY29tLnN2ZwBdU8ly2zAMvfcrWPZKwiTANWM5015yyiHdDr1kNLZsa0axvKix8/cFJbvNdCRCEvEAPDxQ8/vLSydem+Op7XeVtGCkaHbLftXuNpX8Pax1kveL+UetxY9919erZiWG/k58+/kgvjb7Xonz+Qyn182RP2DZvyjx0OyaYz30x38o8dhemqP43vfdSWi9+DDnCHFuV8O2ksmY/UWKbdNutsPfz9e2OX/pL5U0wghCvqVgqrtTJbfDsL+bzUrhM0F/3MzQGDPjlHIxH9qhaxbrtmueh7d987zbtLvLfDZtz/f1sBWrSj5aD9klhVswwdfWgLNJXR+GL6sgRwSP6QmRd53yELzCCMmRShCjqyFmLOsWwCiIKS01GJOUA0qZHQUby5ZXlsAGjkv8wmuK00A+gDfxoD1DSREQOm0teBdVgOA4wqdY1i0i+AiG4lOGbFEhg7icZWJIgCMz+It1DA/hYDQXScxVjyyohpCprBt7SswylJze49htVNxQjk6xDuSXTAs12OQgUGLWMRenLj4pTsNb11SSde/uPhmbA2U5e6c3qxBiEdhTOhhO77CIwxvJ55p7NVlN1owX+xkOJhUb3M1OTuShAZpQIoK72mtcSF5bwExLoxECjsqzssgIzdMLB2IdiPViApHbsTwhH1KNkIgFHO2tTOB54pjfXu3k4QLechmK9lCGzfm9s0XbQtmWfqa4NB0Oo1lzVtUsx6wjKxtYBcKSMkJOyGzJBbYxBM0aBypZfdBRJyDCz0zNRjXZKw0D/J75KFApFvPVTt73kv/6b0Lr9bqMp/wziz8W9M/pAwQAAA==
 spec:
   containers:
     - name: workload-proxy-example-nginx
       image: nginx:stable-alpine-slim
```
We should switch this image to our example workload: https://docs.siderolabs.com/talos/v1.12/getting-started/deploy-first-workload
````diff
 ```

-### Troubleshooting
+## Troubleshoot exposed services
````
There's also a Kubernetes workload that runs in the cluster which they can inspect for troubleshooting. We should mention that somewhere and show them how to look at its logs.
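A sketch of what that troubleshooting step could show, using standard kubectl commands. The namespace and workload name here are placeholders, not the exposer's real ones:

```shell
# Locate the exposer workload (placeholder filter; adjust to the real name)
kubectl get pods --all-namespaces | grep -i exposer

# Inspect its logs for errors
kubectl logs --namespace <exposer-namespace> <exposer-pod>
```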
```diff
 ---

-This guide will walk you through the steps to import an existing Talos cluster into Omni.
+If you have an existing Talos cluster running outside of Omni, you can import it so that Omni can manage it going forward. The import process connects your Talos nodes to Omni, preserves the existing cluster configuration as config patches, and registers the cluster as a managed resource, without resetting or disrupting the running workloads.
```
Suggested change:

```diff
-If you have an existing Talos cluster running outside of Omni, you can import it so that Omni can manage it going forward. The import process connects your Talos nodes to Omni, preserves the existing cluster configuration as config patches, and registers the cluster as a managed resource, without resetting or disrupting the running workloads.
+If you have existing Talos clusters running without Omni management, you can import them to be managed by Omni. The import process connects your Talos nodes to Omni, preserves the existing cluster configuration as config patches, and registers the cluster as a managed resource, without resetting or disrupting the running workloads.
```
```diff
-<Info>
-This is an experimental feature. It does not support Talos installations with custom built Linux kernel or custom built extensions.
-</Info>
+> **Note:** This is an experimental feature. Clusters with a custom-built Linux kernel or custom-built extensions are not supported. If your cluster uses either of these, do not proceed with this guide.
```
I don't think this is considered experimental anymore
```diff
 ### Step 3: Unlock the cluster

 When you are ready for Omni to begin managing the cluster, unlock it by running the following command, replacing `<cluster-name>` with the name of your cluster:
```
We should mention what happens when a cluster is unlocked: the Talos endpoint is changed to Omni, and other patches will be applied. Users can see the pending changes in the Omni UI; I think there's a CLI to see the diff too.
```diff
 Understanding how Omni handles schematics and config patches during import helps you anticipate the outcome and troubleshoot any issues that arise.

 ### Image schematic
```
This should be higher in the guide, and maybe set as a prerequisite.
```diff
 The import command uses the `--initial-talos-version` and `--initial-kubernetes-version` values to generate the default machine config that Omni would produce for each node. It then compares that default config against the actual config running on each node and generates a config patch representing the difference, effectively capturing all customisations made to the cluster since it was first created.

 Certain machine config fields that are not permitted on Omni-managed clusters are excluded from the generated patches.
```
```diff
 - **Use Option 2** if the cluster was unlocked and further modified after import, making the backed-up configs potentially out of date.

 Cluster has to be in `locked` state to be able to abort an import operation.
 ### Option 1: Restore from backup
```
```diff
@@ -1,40 +1,46 @@
 ---
 title: Restore Etcd of a Cluster Managed by Cluster Templates
```
Is the process different when the cluster isn't managed by cluster templates? Maybe we can shorten this title.
```diff
-The output will look like this:
+The output will look similar to this:
```
````diff
 ```bash
-omnictl get clusteruuid my-cluster
+omnictl get clusteruuid <cluster-name>
````
You use `<cluster-name>` in three commands. Maybe we should export a variable so the other commands are copy/pastable.
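A sketch of that suggestion, assuming a POSIX shell; `my-cluster` stands in for the real cluster name:

```shell
# Set the cluster name once...
export CLUSTER_NAME=my-cluster

# ...then each documented command can be written against the variable, e.g.:
echo "omnictl get clusteruuid ${CLUSTER_NAME}"
```

Each of the three commands would then use `${CLUSTER_NAME}` instead of a `<cluster-name>` placeholder, making them copy/pastable as-is.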
```diff
-The output will look like this:
+The output will look similar to this:
```
```diff
 Omni upgrades control plane nodes first, verifying that the etcd cluster is healthy and will remain healthy after each node leaves the etcd cluster before proceeding.

-> Note: you cannot lock control plane nodes, as it is not supported to have the Kubernetes version of a worker higher than that of the control plane nodes in a cluster - this may result in API version incompatibility.
+For each node, Omni drains and cordons it, updates the OS, then uncordons it. All upgrades use the `--preserve=true` flag, which retains ephemeral data on the node.
```
I think preserve is the default behavior since ~1.10 (maybe 1.11); no flag is needed, even with talosctl.
You can still point out that ephemeral data (including container images) and user volumes on the node are not erased.
```diff
 ### What happens during a Kubernetes upgrade

 Kubernetes upgrades are non-disruptive to workloads and proceed in the following order:
```
Do workloads restart during the upgrade? For some reason I thought they did, which would be disruptive.
```diff
 ### Apply updated Kubernetes manifests

 Omni does not automatically apply updates to Kubernetes bootstrap manifests during an upgrade.
```
You may want to say what this includes: CoreDNS, kube-proxy, the CNI.
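A sketch of how the doc could show applying those manifests after an upgrade, assuming the `omnictl cluster kubernetes manifest-sync` command and its `--dry-run` flag behave as in the Omni CLI reference (`my-cluster` is a placeholder):

```shell
# Preview pending bootstrap-manifest changes (CoreDNS, kube-proxy, etc.)
omnictl cluster kubernetes manifest-sync my-cluster --dry-run

# Apply them once reviewed
omnictl cluster kubernetes manifest-sync my-cluster
```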
```diff
 #### Format of the audit log
 <Tabs>
 <Tab title="UI">
```
You can test it at https://latest.omni.stage-pnap.managed.siderolabs.io/
```diff
-title: Audit logs
-description: View and manage activity logs in Omni.
+title: Audit Logs
+description: View, configure, and interpret activity logs in Omni.
```
People often ask how they can export their audit logs to a different log platform. We don't have a solution for them right now, but we should at least have a section that answers that question, and we can update it when we do have a solution.




No description provided.