Kubernetes Addon
Deploy a managed Kubernetes cluster on a Voltage Park Cluster instance using the Spheron K8s addon.
Prerequisites
- Select a compatible offer. Use the Cluster filter on the dashboard to browse compatible Voltage Park Cluster offers.
- Select a Kubernetes version. Available Kubernetes versions are listed in the deployment form when configuring the instance.
Enabling the Addon
Enable the Kubernetes addon from the deployment form on the dashboard. Select a Kubernetes version when configuring the instance. The cluster provisions automatically during deployment.
Wait for the deployment status to change to running before retrieving the kubeconfig.
Extracting kubeconfig
Once the deployment status is running, retrieve the kubeconfig from the instance details drawer in the dashboard. Save it locally:
mkdir -p ~/.kube
# Copy the kubeconfig value from the deployment details into ~/.kube/config
chmod 600 ~/.kube/config
# Verify cluster connectivity
kubectl get nodes
Expected output:
NAME            STATUS   ROLES           AGE   VERSION
spheron-gpu-0   Ready    control-plane   5m    v1.35.0
Deploying a GPU Workload
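Before scheduling GPU work from a script, it can help to gate on node readiness instead of reading the kubectl get nodes table by eye. The following is a minimal sketch, not part of the Spheron tooling: all_nodes_ready is a hypothetical helper name, and it assumes its input is the output of kubectl get nodes --no-headers, where the second column is the node status.

```shell
# Hypothetical helper (not a Spheron or kubectl command).
# Reads `kubectl get nodes --no-headers` output on stdin and succeeds only
# if at least one node is listed and every node's STATUS is exactly "Ready".
# A status such as "Ready,SchedulingDisabled" is treated as not ready.
all_nodes_ready() {
  awk '
    NF == 0 { next }                      # skip blank lines
    { total++ }                           # count every node line
    $2 != "Ready" {                       # column 2 is STATUS
      bad++
      print "not ready: " $1 " (" $2 ")" > "/dev/stderr"
    }
    END { exit (total == 0 || bad > 0) ? 1 : 0 }
  '
}
```

A possible use is to chain it in front of a deployment step, for example: kubectl get nodes --no-headers | all_nodes_ready && kubectl apply -f gpu-pod.yaml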
Example Pod spec requesting all 8 H100 GPUs. Save it as gpu-pod.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: gpu-training
spec:
  containers:
    - name: trainer
      image: nvcr.io/nvidia/pytorch:24.01-py3
      command: ["python3", "-c", "import torch; print(torch.cuda.device_count())"]
      resources:
        limits:
          nvidia.com/gpu: 8
  restartPolicy: Never
Apply it:
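kubectl apply -f gpu-pod.yaml
# Wait for the Pod to finish; the first run can take a while because the image must be pulled
kubectl wait --for=jsonpath='{.status.phase}'=Succeeded pod/gpu-training --timeout=15m
kubectl logs gpu-training
For smaller allocations, change nvidia.com/gpu: 8 to the number of GPUs your workload needs.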
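When the GPU count changes often, editing the manifest by hand can be replaced with a small generator. The sketch below is illustrative only: render_gpu_pod is a hypothetical helper name, and it simply reprints the example Pod spec above with the nvidia.com/gpu limit taken from its first argument (defaulting to 8).

```shell
# Hypothetical helper (not part of the Spheron tooling).
# Prints the example Pod manifest with the requested GPU count.
# Usage: render_gpu_pod [GPU_COUNT]   (default: 8)
render_gpu_pod() {
  gpus="${1:-8}"
  # Reject empty values, non-digits, zero, and leading zeros.
  case "$gpus" in
    ''|*[!0-9]*|0*)
      echo "GPU count must be a positive integer, got: $gpus" >&2
      return 1
      ;;
  esac
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: gpu-training
spec:
  containers:
    - name: trainer
      image: nvcr.io/nvidia/pytorch:24.01-py3
      command: ["python3", "-c", "import torch; print(torch.cuda.device_count())"]
      resources:
        limits:
          nvidia.com/gpu: ${gpus}
  restartPolicy: Never
EOF
}
```

For example, render_gpu_pod 4 > gpu-pod.yaml would write a manifest that requests 4 GPUs, which can then be applied with kubectl apply -f gpu-pod.yaml as shown above.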
Cluster Health Check
Monitor the health of a running cluster using the pre-provisioned Grafana dashboard.
Grafana Dashboard
A pre-provisioned Grafana URL for cluster metrics (GPU utilization, memory, network) is available in the instance details in the dashboard. Open the URL directly in a browser. No additional setup is required.
Additional Resources
- Instance Types: Cluster requirements for the K8s addon
- Regions & Providers: Voltage Park capabilities
- Distributed Training: Multi-GPU training without K8s
- Volume Mounting (Voltage Park): Persistent storage for K8s workloads