TalosEtcdBackup
The TalosEtcdBackup custom resource enables automated etcd backups for Talos control planes, streaming snapshots directly to S3-compatible storage without local disk usage.
Overview
TalosEtcdBackup provides a declarative way to back up etcd data from Talos control planes. The operator:
- Connects to the Talos API to stream an etcd snapshot
- Uploads the snapshot directly to S3 (or S3-compatible storage)
- Uses zero local disk I/O for minimal footprint
- Updates status conditions to track backup progress
Features
- Zero Local I/O: Snapshots stream directly from Talos API to S3
- Memory Efficient: Uses streaming to minimize memory footprint
- S3 Compatible: Works with AWS S3, MinIO, Ceph, and other S3-compatible storage
- Secure: Credentials stored in Kubernetes secrets
- Status Tracking: Updates conditions for easy monitoring
Specification
Required Fields
- spec.talosControlPlaneRef: Reference to the TalosControlPlane to back up
- spec.backupStorage.s3.bucket: S3 bucket name
- spec.backupStorage.s3.region: AWS region
- spec.backupStorage.s3.accessKeyID: Secret reference for access key ID
- spec.backupStorage.s3.secretAccessKey: Secret reference for secret access key
Optional Fields
- spec.backupStorage.s3.endpoint: Custom S3 endpoint (for S3-compatible storage)
- spec.backupStorage.s3.insecureSkipTLSVerify: Skip TLS verification (default: false)
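As a rough pre-flight check, the required fields listed above can be verified before a manifest is applied. The sketch below is illustrative only: `REQUIRED_PATHS` and `missing_required_fields` are hypothetical helper names, not part of the operator, and the manifest is assumed to have already been parsed into a Python dict (for example with a YAML library).

```python
# Dotted paths of the fields this document lists as required.
REQUIRED_PATHS = (
    "spec.talosControlPlaneRef",
    "spec.backupStorage.s3.bucket",
    "spec.backupStorage.s3.region",
    "spec.backupStorage.s3.accessKeyID",
    "spec.backupStorage.s3.secretAccessKey",
)


def missing_required_fields(manifest: dict) -> list[str]:
    """Return the required dotted paths that are absent or empty in `manifest`."""
    missing = []
    for path in REQUIRED_PATHS:
        node = manifest
        for part in path.split("."):
            if not isinstance(node, dict) or part not in node:
                missing.append(path)
                break
            node = node[part]
        else:
            # The path exists; treat None, "" and {} as "not set".
            if node in (None, "", {}):
                missing.append(path)
    return missing
```

An empty list means every required field is present. This only checks structure; it does not confirm that the referenced TalosControlPlane or secret actually exists in the cluster.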
Example
1. Create S3 Credentials Secret
apiVersion: v1
kind: Secret
metadata:
  name: my-s3-credentials
  namespace: default
type: Opaque
stringData:
  accessKeyID: "YOUR_ACCESS_KEY_ID"
  secretAccessKey: "YOUR_SECRET_ACCESS_KEY"
2. Create TalosEtcdBackup Resource
apiVersion: talos.alperen.cloud/v1alpha1
kind: TalosEtcdBackup
metadata:
  name: my-backup
spec:
  talosControlPlaneRef:
    name: my-controlplane
  backupStorage:
    s3:
      bucket: my-etcd-backups
      region: us-west-2
      accessKeyID:
        name: my-s3-credentials
        key: accessKeyID
      secretAccessKey:
        name: my-s3-credentials
        key: secretAccessKey
3. Check Backup Status
kubectl get talosetcdbackup my-backup -o yaml
Look for the status.conditions field to check backup progress:
- Progressing: Backup is in progress
- Ready: Backup completed successfully
- Failed: Backup encountered an error
Using S3-Compatible Storage
For MinIO, Ceph, or other S3-compatible storage, add the endpoint field:
spec:
  backupStorage:
    s3:
      bucket: my-backups
      region: us-east-1
      endpoint: https://minio.example.com
      insecureSkipTLSVerify: false  # Set to true for self-signed certs
      accessKeyID:
        name: my-s3-credentials
        key: accessKeyID
      secretAccessKey:
        name: my-s3-credentials
        key: secretAccessKey
Backup Location
Backups are stored in S3 with the following key pattern:
talos-operator-etcd-backups/<controlplane-name>/etcd-snapshot-<timestamp>.db
Example:
talos-operator-etcd-backups/my-controlplane/etcd-snapshot-2025-10-04T11-30-00Z.db
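When scripting against the bucket (for example, to find the newest snapshot), it can help to build and parse keys of this shape. The following is a minimal sketch, assuming the timestamp is UTC and formatted exactly as in the example key; that format is inferred from the single example here, not confirmed against the operator's source. The function names are hypothetical.

```python
import re
from datetime import datetime, timezone

KEY_PREFIX = "talos-operator-etcd-backups"
# Timestamp layout inferred from the example key (colons replaced by hyphens).
STAMP_FORMAT = "%Y-%m-%dT%H-%M-%SZ"
_KEY_RE = re.compile(rf"^{KEY_PREFIX}/([^/]+)/etcd-snapshot-(.+)\.db$")


def backup_object_key(controlplane_name: str, taken_at: datetime) -> str:
    """Build a key for a snapshot taken at `taken_at` (a timezone-aware datetime)."""
    stamp = taken_at.astimezone(timezone.utc).strftime(STAMP_FORMAT)
    return f"{KEY_PREFIX}/{controlplane_name}/etcd-snapshot-{stamp}.db"


def parse_backup_key(key: str) -> tuple[str, datetime]:
    """Split a key into (control plane name, UTC timestamp); raise ValueError if it does not match."""
    match = _KEY_RE.match(key)
    if match is None:
        raise ValueError(f"not an etcd backup key: {key!r}")
    name, stamp = match.groups()
    taken_at = datetime.strptime(stamp, STAMP_FORMAT).replace(tzinfo=timezone.utc)
    return name, taken_at
```

Sorting the parsed timestamps for one control plane then gives the backups in chronological order.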
Status Conditions
The operator updates status conditions throughout the backup lifecycle:
| Condition | Status | Reason | Description |
|---|---|---|---|
| Progressing | True | BackupInProgress | Backup is currently running |
| Progressing | False | BackupCompleted | Backup finished |
| Ready | True | BackupSucceeded | Backup completed successfully |
| Failed | True | BackupFailed | Backup encountered an error |
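The table can be reduced to a single state for alerting or scripting. The sketch below assumes the conditions are exposed in the usual Kubernetes shape, a list of objects with `type` and `status` fields where `status` is the string "True" or "False"; the exact field layout of this resource is not shown in this document, and `backup_state` is a hypothetical helper name.

```python
def backup_state(conditions: list[dict]) -> str:
    """Map status conditions to 'failed', 'succeeded', 'in-progress', or 'unknown'."""
    # Collect the condition types whose status is currently "True".
    active = {c.get("type") for c in conditions if c.get("status") == "True"}
    # Failure takes precedence, then success, then an in-flight backup.
    if "Failed" in active:
        return "failed"
    if "Ready" in active:
        return "succeeded"
    if "Progressing" in active:
        return "in-progress"
    return "unknown"
```

The input could come from the `status.conditions` list in the output of `kubectl get talosetcdbackup my-backup -o json`, parsed with `json.loads`.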
Best Practices
- Credentials Management: Always use Kubernetes secrets for S3 credentials
- Bucket Lifecycle: Configure S3 lifecycle policies for automatic cleanup
- Monitoring: Watch status conditions for backup health
- Testing: Test restores regularly to ensure backups are valid
- Network: Ensure the operator has network access to the S3 endpoint
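For the bucket lifecycle recommendation, an expiration rule scoped to the backup prefix is one option. The sketch below builds a lifecycle configuration in the JSON shape that the S3 PutBucketLifecycleConfiguration API accepts. The rule ID and the 30-day retention used in the example are arbitrary placeholders, `lifecycle_configuration` is a hypothetical helper, and support for lifecycle rules on S3-compatible stores should be checked against that store's documentation.

```python
import json


def lifecycle_configuration(retention_days: int,
                            prefix: str = "talos-operator-etcd-backups/") -> dict:
    """Build an S3 lifecycle configuration that expires backups after `retention_days`."""
    if retention_days < 1:
        raise ValueError("retention_days must be at least 1")
    return {
        "Rules": [
            {
                "ID": "expire-etcd-backups",      # placeholder rule name
                "Filter": {"Prefix": prefix},      # only objects under the backup prefix
                "Status": "Enabled",
                "Expiration": {"Days": retention_days},
            }
        ]
    }


if __name__ == "__main__":
    # Write the rule to a file that can be passed to the AWS CLI.
    with open("lifecycle.json", "w") as fh:
        json.dump(lifecycle_configuration(30), fh, indent=2)
```

The resulting file can be applied with `aws s3api put-bucket-lifecycle-configuration --bucket my-etcd-backups --lifecycle-configuration file://lifecycle.json`. Choose a retention period that still leaves enough restore points for your recovery objectives.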
Troubleshooting
Backup Fails with "Failed to get TalosControlPlane"
Ensure:
- The referenced TalosControlPlane exists in the same namespace
- The TalosControlPlane is in a ready state with valid bundle config
Backup Fails with "Failed to get secret"
Ensure:
- The secret exists in the same namespace as the TalosEtcdBackup
- The secret contains the correct keys (accessKeyID and secretAccessKey)
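The key check can be scripted against a Secret manifest or the JSON returned by `kubectl get secret my-s3-credentials -o json`. This is a small sketch with a hypothetical helper name; it assumes the key names used in the examples in this document, so pass different names if your `key` references differ.

```python
def missing_secret_keys(secret: dict,
                        required: tuple[str, ...] = ("accessKeyID", "secretAccessKey")) -> list[str]:
    """Return the required keys not present in the Secret's `data` or `stringData`."""
    # Secrets read from the cluster carry base64 values under `data`;
    # hand-written manifests may use plain-text `stringData` instead.
    present = set(secret.get("data") or {}) | set(secret.get("stringData") or {})
    return [key for key in required if key not in present]
```

An empty result means both keys exist; it does not validate that the credentials themselves are correct.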
Backup Fails with "Failed to upload snapshot to S3"
Check:
- S3 credentials have write permissions to the bucket
- Network connectivity to S3 endpoint
- Bucket exists and is in the correct region
- For custom endpoints, verify the endpoint URL is correct
Backup Hangs in Progressing State
- Check operator logs for detailed error messages
- Verify Talos API is accessible from the operator
- Ensure sufficient network bandwidth for snapshot transfer
Performance Considerations
The backup process streams data directly from Talos to S3:
- Memory Usage: Minimal, as data is streamed (not buffered)
- Disk Usage: Zero, no local storage is used
- Network: Bandwidth consumed scales with snapshot size
- Duration: Depends on snapshot size and network speed
For large etcd databases, ensure:
- Adequate network bandwidth between operator and S3
- Sufficient Talos API timeout settings
- Monitoring of backup completion times
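To get a feel for how snapshot size and network speed interact, a back-of-the-envelope estimate is often enough. The sketch below is a simple lower bound: it assumes the slowest link between Talos, the operator, and S3 sustains the given bandwidth, and it ignores protocol overhead, TLS, and retries. `estimated_transfer_seconds` is a hypothetical helper name.

```python
def estimated_transfer_seconds(snapshot_bytes: int, bandwidth_mbps: float) -> float:
    """Lower-bound transfer time for a snapshot over a link of `bandwidth_mbps` megabits/s."""
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth_mbps must be positive")
    bits = snapshot_bytes * 8                  # bytes to bits
    return bits / (bandwidth_mbps * 1_000_000)  # megabits/s to bits/s
```

For example, a 1 GB (1,000,000,000 byte) snapshot over a sustained 100 Mbit/s link needs at least 80 seconds. Comparing such an estimate with observed backup durations can help decide whether a slow backup is a bandwidth problem or something else.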