Instance Types

Spheron GPU offerings are organized along two axes: interruptibility and hardware isolation.

Interruptibility determines whether a running instance can be reclaimed by the provider. Spot instances can be interrupted at any time. Dedicated instances carry a 99.95% SLA and are not reclaimed after deployment.

Hardware isolation determines how your workload accesses the underlying hardware. Spot instances always run in a VM. Dedicated instances come in three sub-types:

VM: Isolated virtual machine on shared physical hardware. This is the default for most GPU offers across all providers.
Bare Metal: Full access to a physical server with no hypervisor layer. Identified by the BAREMETAL suffix in the GPU type name on the dashboard.
Cluster: Dedicated bare-metal server with 8 GPUs and a high-speed hardware interconnect. Supports InfiniBand (3.2 Tbps) or Ethernet (100 Gbps). Identified by the CLUSTER designation in the offers list. Currently available on Voltage Park only; additional providers are in progress.

Decision Matrix

Category	Sub-type	Interruption Risk	Hardware	Interconnect	Best For	Relative Price
Spot	VM	Yes (provider can reclaim anytime)	Shared (VM)	N/A	Experiments, fault-tolerant batch jobs with checkpointing	Lowest
Dedicated	VM	No (99.95% SLA)	Shared (VM)	N/A	Production inference, interactive sessions	Medium
Dedicated	Bare Metal	No (99.95% SLA)	Full physical server	N/A	Workloads requiring no virtualization overhead	Medium-High
Dedicated	Cluster	No (99.95% SLA)	Full physical server (8 GPUs)	InfiniBand (3.2 Tbps) or Ethernet (100 Gbps)	Multi-GPU distributed training (DDP, DeepSpeed), K8s on bare metal	Highest
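The matrix can also be read as a short decision procedure. The sketch below is illustrative only: the function name and its three boolean inputs are assumptions made for this example and are not part of any Spheron API. It also assumes that hardware requirements take priority over price.

```python
def recommend_instance_type(tolerates_interruption: bool,
                            needs_cluster: bool = False,
                            needs_bare_metal: bool = False) -> str:
    """Map workload requirements to a row of the decision matrix.

    Hypothetical helper for illustration; not part of any Spheron SDK.
    """
    # Multi-GPU distributed training or K8s on bare metal maps to Cluster.
    if needs_cluster:
        return "Dedicated Cluster"
    # Workloads that cannot accept virtualization overhead map to Bare Metal.
    if needs_bare_metal:
        return "Dedicated Bare Metal"
    # Checkpointed, fault-tolerant jobs can take the lowest-priced option.
    if tolerates_interruption:
        return "Spot VM"
    # Everything else (production inference, interactive sessions).
    return "Dedicated VM"


print(recommend_instance_type(tolerates_interruption=True))
print(recommend_instance_type(tolerates_interruption=False))
```

Checking the hardware requirements first reflects the matrix: only the Dedicated sub-types offer bare-metal or Cluster hardware, so a job that needs either cannot run on Spot regardless of how well it tolerates interruption.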

Spot

Spot instances are VM-based and typically 30 to 60% cheaper than Dedicated instances. The provider can reclaim them at any time.
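As a rough worked example of the 30 to 60% figure, the sketch below converts a Dedicated hourly price into the corresponding Spot price range. The $2.00 price and the function name are hypothetical; actual prices vary by GPU and provider.

```python
from typing import Tuple


def spot_price_range(dedicated_hourly: float,
                     min_discount: float = 0.30,
                     max_discount: float = 0.60) -> Tuple[float, float]:
    """Return the (lowest, highest) expected Spot price per hour.

    The default discounts mirror the typical 30 to 60% range; they are
    not guaranteed for any specific offer.
    """
    lowest = dedicated_hourly * (1 - max_discount)
    highest = dedicated_hourly * (1 - min_discount)
    return round(lowest, 2), round(highest, 2)


# Hypothetical Dedicated price of $2.00 per GPU-hour.
low, high = spot_price_range(2.00)
print(f"${low:.2f} to ${high:.2f} per hour")  # $0.80 to $1.40 per hour
```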

When to use Spot:

Experiments and prototyping
Batch training jobs with checkpointing enabled
Other fault-tolerant workloads that save their state to a persistent volume

Handling interruption:

Save checkpoints to a persistent volume at regular intervals so that training can resume from the last checkpoint if the instance is reclaimed:

import torch

# Inside the training loop: save a checkpoint every N steps to the mounted volume.
# step, checkpoint_interval, model, and optimizer come from your training script.
if step % checkpoint_interval == 0:
    torch.save({
        'step': step,
        'model_state_dict': model.state_dict(),
        'optimizer_state_dict': optimizer.state_dict(),
    }, '/checkpoints/checkpoint_latest.pt')

See the Volume Mounting guides to set up a persistent volume at /checkpoints.
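Because a Spot instance can be reclaimed at any time, a reclaim could in principle land in the middle of a checkpoint write and leave a truncated file. One common way to guard against this is to write to a temporary file and rename it into place. The sketch below uses only the standard library; `atomic_save` and its `save_fn` parameter are hypothetical names chosen for this example.

```python
import os
import tempfile


def atomic_save(save_fn, path):
    """Write via save_fn to a temp file, then rename it over `path`.

    Hypothetical helper: save_fn is any callable that takes a file path
    and writes the checkpoint there, e.g. lambda p: torch.save(state, p).
    """
    directory = os.path.dirname(path) or "."
    # Create the temp file in the same directory so the rename stays on one
    # filesystem, which is what makes os.replace atomic on POSIX systems.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    os.close(fd)
    try:
        save_fn(tmp_path)
        os.replace(tmp_path, path)
    finally:
        # Remove the temp file if the save or the rename did not complete.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
```

In a training loop this could be called as `atomic_save(lambda p: torch.save(state, p), '/checkpoints/checkpoint_latest.pt')`. On restart, a script can check whether the checkpoint file exists, load it with `torch.load`, restore the model and optimizer with `load_state_dict`, and continue from the saved `step`.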

Dedicated

Dedicated instances carry a 99.95% SLA and are not reclaimed by the provider after deployment. Three hardware sub-types are available under Dedicated.
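To put the 99.95% figure in concrete terms, the sketch below converts an availability percentage into a downtime budget. It assumes the SLA is measured as uptime over a 30-day window; the text above does not specify the measurement period, so treat the result as an illustration rather than a contractual number.

```python
def allowed_downtime_minutes(sla_percent: float, period_days: float = 30) -> float:
    """Downtime budget implied by an uptime percentage over a period.

    The 30-day default is an assumption made for this example.
    """
    period_minutes = period_days * 24 * 60
    return period_minutes * (1 - sla_percent / 100)


print(f"{allowed_downtime_minutes(99.95):.1f} minutes per 30 days")  # 21.6 minutes per 30 days
```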

VM

Dedicated VM is the standard virtualized offering. Your workload runs in an isolated VM on shared physical hardware. This is the most common instance type across providers and covers the majority of GPU offers on the platform.

When to use Dedicated VM:

Production inference servers
Interactive training sessions
Demos and customer-facing workloads
Any job requiring uninterrupted runtime without needing bare-metal performance

Bare Metal

Dedicated Bare Metal gives you full access to a physical server with no hypervisor or VM layer. You get the entire machine, which eliminates virtualization overhead and provides predictable hardware performance. GPU count varies by offer and provider; Bare Metal servers range from single-GPU configurations up to multi-GPU servers.

Bare Metal GPU offers are shown with the BAREMETAL suffix in the GPU type name on the dashboard. They appear alongside standard VM offers in the GPU offers list; there is no separate section or tab for them.

When to use Dedicated Bare Metal:

Workloads sensitive to virtualization overhead
Use cases that require direct hardware access for performance or compliance reasons

On the Dashboard

Open the Spheron dashboard and go to Deploy GPUs.
Browse the GPU offers list. Look for the BAREMETAL suffix in the GPU Type field of each offer card.
Select the matching offer, complete the deployment form (OS image, SSH key, optional startup script), and deploy.

Cluster

Dedicated Cluster is Voltage Park's bare-metal GPU offering for multi-GPU workloads. Each Cluster instance is an entire 8-GPU physical server with no hypervisor layer. Cluster instances are the only sub-type that supports multi-GPU distributed training and the Kubernetes addon.

Two interconnect options are available:

Interconnect	Bandwidth	Notes
InfiniBand	3.2 Tbps	H100 SXM5 with NVLink; optimal for all-reduce-heavy DDP and ZeRO-3 training
Ethernet	100 Gbps	Lower cost; suitable for distributed jobs where network bandwidth is not the bottleneck
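To get a feel for the gap between the two options, the sketch below computes a lower bound on the time needed to move a payload at each line rate. The 14 GB payload is a hypothetical example (roughly the fp16 gradients of a 7-billion-parameter model), and the calculation ignores protocol overhead, latency, and the extra traffic of collective algorithms such as all-reduce, so real numbers will be higher.

```python
def min_transfer_seconds(payload_gb: float, link_gbps: float) -> float:
    """Lower bound on the time to move payload_gb gigabytes over a link.

    Assumes the full line rate is achieved, which real workloads rarely do.
    """
    link_gb_per_s = link_gbps / 8  # 8 bits per byte
    return payload_gb / link_gb_per_s


INFINIBAND_GBPS = 3200  # 3.2 Tbps
ETHERNET_GBPS = 100     # 100 Gbps

payload_gb = 14.0  # hypothetical payload size
print(f"InfiniBand: {min_transfer_seconds(payload_gb, INFINIBAND_GBPS):.3f} s")
print(f"Ethernet:   {min_transfer_seconds(payload_gb, ETHERNET_GBPS):.3f} s")
```

The absolute times matter less than the ratio: at line rate the InfiniBand option moves the same payload 32 times faster, which is why it suits jobs that synchronize gradients on every step.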

Cluster instances are currently available on Voltage Park only. Support for additional providers is in progress.

Cluster offers are identifiable by the CLUSTER designation in the offers list.

When to use Cluster:

Large-scale distributed training (PyTorch DDP, DeepSpeed ZeRO-3) on bare-metal H100 hardware
Kubernetes workloads on bare-metal GPU nodes (enable the K8s addon)
Multi-GPU jobs requiring maximum GPU-to-GPU bandwidth (use InfiniBand)

Selecting Instance Type via Dashboard

When deploying from the Spheron dashboard:

Go to Deploy GPUs and browse the GPU offers list.
Each offer card shows the GPU model, vCPUs, RAM, storage, and price.
Identify the offering type by the offer details:
- Spot: labeled as interruptible; typically shows a spot price alongside the regular price.
- Dedicated VM: standard VM price with no interruption risk; the most common offering across providers.
- Dedicated Bare Metal: identified by BAREMETAL in the GPU type name on the dashboard (e.g., H100_SXM5_BAREMETAL).
- Dedicated Cluster: Voltage Park multi-GPU bare-metal offers showing 8 GPUs. Identified by the CLUSTER designation in the offer name.
Select the offer that matches your workload requirements and click Deploy.
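The identification rules above can be sketched as a small classifier. This is a client-side illustration, not a Spheron API: the function name is hypothetical, the `interruptible` flag is assumed to be known to the caller, and the CLUSTER designation is assumed to appear as a substring of the name. Only `H100_SXM5_BAREMETAL` is a name taken from this page; the other names in the example are made up.

```python
def classify_offer(gpu_type: str, interruptible: bool = False) -> str:
    """Classify an offer using the naming conventions described above.

    Assumptions: CLUSTER appears somewhere in the name, BAREMETAL is a
    suffix, and `interruptible` marks Spot offers.
    """
    name = gpu_type.upper()
    if interruptible:
        return "Spot"
    if "CLUSTER" in name:
        return "Dedicated Cluster"
    if name.endswith("BAREMETAL"):
        return "Dedicated Bare Metal"
    return "Dedicated VM"


print(classify_offer("H100_SXM5_BAREMETAL"))            # Dedicated Bare Metal
print(classify_offer("H100_SXM5", interruptible=True))  # Spot
```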

Additional Resources

Distributed Training guide: Cluster H100 bare-metal setup
Kubernetes addon guide: Enable K8s on Voltage Park Cluster instances
Volume Mounting: Persistent storage for checkpoints
Cost Optimization: Choosing the right instance type for your budget
API Reference: Filter GPU offers by instance type programmatically