
Instance Types

Spheron GPU offerings are organized along two axes: interruptibility and hardware isolation.

Interruptibility determines whether a running instance can be reclaimed by the provider. Spot instances can be interrupted at any time. Dedicated instances carry a 99.95% SLA and are not reclaimed after deployment.

Hardware isolation determines how your workload accesses the underlying hardware. Spot always runs in a VM. Dedicated instances come in three sub-types:

  • VM: Isolated virtual machine on shared physical hardware. This is the default for most GPU offers across all providers.
  • Bare Metal: Full access to a physical server with no hypervisor layer. Identified by the BAREMETAL suffix in the GPU type name on the dashboard.
  • Cluster: Dedicated bare-metal server with 8 GPUs and a high-speed hardware interconnect. Supports InfiniBand (3.2 Tbps) or Ethernet (100 Gbps). Identified by the CLUSTER designation in the offers list. Currently available on Voltage Park only; additional providers are in progress.

Decision Matrix

| Category | Sub-type | Interruption Risk | Hardware | Interconnect | Best For | Relative Price |
| --- | --- | --- | --- | --- | --- | --- |
| Spot | VM | Yes (provider can reclaim anytime) | Shared (VM) | N/A | Experiments, fault-tolerant batch jobs with checkpointing | Lowest |
| Dedicated | VM | No (99.95% SLA) | Shared (VM) | N/A | Production inference, interactive sessions | Medium |
| Dedicated | Bare Metal | No (99.95% SLA) | Full physical server | N/A | Workloads requiring no virtualization overhead | Medium-High |
| Dedicated | Cluster | No (99.95% SLA) | Full physical server (8 GPUs) | InfiniBand (3.2 Tbps) or Ethernet (100 Gbps) | Multi-GPU distributed training (DDP, DeepSpeed), K8s on bare metal | Highest |
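
The matrix can also be read as a short decision procedure. The sketch below is illustrative only: the function name and its three boolean inputs are hypothetical and not part of any Spheron API. The returned labels are the category and sub-type names from the table.

```python
def recommend_instance_type(
    tolerates_interruption: bool,
    needs_multi_gpu_distributed: bool,
    needs_no_hypervisor: bool,
) -> str:
    """Map workload requirements to a row of the decision matrix (hypothetical helper)."""
    # Distributed training and K8s on bare metal are listed only under Cluster.
    if needs_multi_gpu_distributed:
        return "Dedicated Cluster"
    # Bare Metal removes the virtualization layer.
    if needs_no_hypervisor:
        return "Dedicated Bare Metal"
    # Spot is the cheapest row, but only suits work that survives reclamation.
    if tolerates_interruption:
        return "Spot VM"
    # Default: uninterrupted runtime without bare-metal requirements.
    return "Dedicated VM"


print(recommend_instance_type(True, False, False))
print(recommend_instance_type(False, True, False))
```

The checks are ordered from the most specific hardware requirement to the least, so a workload that needs distributed training is routed to Cluster even if it could also tolerate interruption.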

Spot

Spot instances are VM-based and typically 30 to 60% cheaper than Dedicated instances. The provider can reclaim them at any time.
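
To judge whether the discount outweighs the cost of interruptions, it can help to compute how much extra runtime a Spot job can absorb before it costs as much as a Dedicated one. The calculation below is a back-of-the-envelope sketch: the function name is illustrative, and the only figures taken from this page are the 30% and 60% discount bounds.

```python
def break_even_runtime_multiplier(spot_discount: float) -> float:
    """Return how many times longer a Spot job may run before it costs as much as Dedicated.

    spot_discount is the fractional saving, e.g. 0.30 for "30% cheaper".
    """
    if not 0.0 <= spot_discount < 1.0:
        raise ValueError("spot_discount must be in [0, 1)")
    # Spot hourly price = (1 - discount) * Dedicated hourly price, so total cost
    # is equal when Spot runtime = Dedicated runtime / (1 - discount).
    return 1.0 / (1.0 - spot_discount)


for discount in (0.30, 0.60):
    multiplier = break_even_runtime_multiplier(discount)
    print(f"{discount:.0%} cheaper -> break-even at {multiplier:.2f}x runtime")
```

At a 30% discount the job can take up to roughly 1.43 times as long (restarts and recomputed steps included) before Spot stops being cheaper; at 60% the margin is 2.5 times.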

When to use Spot:
  • Experiments and prototyping
  • Batch training jobs with checkpointing enabled
  • Fault-tolerant workloads with checkpointing to a persistent volume

Handling interruption:

Save checkpoints to a persistent volume at regular intervals so that training can resume from the last saved step if the instance is reclaimed:

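import torch

# Inside the training loop: `step`, `model`, `optimizer`, and `checkpoint_interval`
# are defined by your training script.
# Save a checkpoint every `checkpoint_interval` steps to the mounted volume.
if step % checkpoint_interval == 0: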
    torch.save({
        'step': step,
        'model_state_dict': model.state_dict(),
        'optimizer_state_dict': optimizer.state_dict(),
    }, '/checkpoints/checkpoint_latest.pt')

See the Volume Mounting guides to set up a persistent volume at /checkpoints.
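
Because a Spot instance can be reclaimed while a checkpoint is being written, one common precaution is to write to a temporary file and rename it into place, so the previous checkpoint stays intact if the write is cut short. The helper below is a minimal sketch of that pattern using only the Python standard library. `atomic_save` is a hypothetical name, not part of PyTorch or Spheron, and whether a rename is atomic on a network-backed volume is an assumption worth verifying for your storage.

```python
import os
import tempfile
from typing import Callable


def atomic_save(write_fn: Callable[[str], None], final_path: str) -> None:
    """Write via write_fn to a temp file in the target directory, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(final_path))
    # Create the temp file next to the target so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    os.close(fd)
    try:
        write_fn(tmp_path)
        # os.replace overwrites final_path in a single step on POSIX filesystems.
        os.replace(tmp_path, final_path)
    except BaseException:
        # Clean up the partial file; the previous checkpoint is left untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


# Usage with the checkpoint above (assumes `state` is the dict passed to torch.save):
# atomic_save(lambda p: torch.save(state, p), '/checkpoints/checkpoint_latest.pt')
```

If the instance is reclaimed mid-write, the cleanup branch never runs and a stray `.tmp` file may remain on the volume, but `checkpoint_latest.pt` still holds the last complete checkpoint. On restart, a training script can check whether that file exists, load it with `torch.load`, restore the model and optimizer with `load_state_dict`, and continue from the saved `step`.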

Dedicated

Dedicated instances carry a 99.95% SLA and are not reclaimed by the provider after deployment. Three hardware sub-types are available under Dedicated.
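
For a sense of scale, the 99.95% figure can be converted into a downtime budget. The arithmetic below is a sketch only: this page does not state the SLA's measurement window or exclusions, so the 30-day and 365-day periods are assumptions chosen for illustration, and the function name is hypothetical.

```python
def allowed_downtime_minutes(sla_percent: float, period_days: float) -> float:
    """Maximum downtime (in minutes) that still meets an availability target over a period."""
    period_minutes = period_days * 24 * 60
    return period_minutes * (1.0 - sla_percent / 100.0)


print(f"{allowed_downtime_minutes(99.95, 30):.1f} minutes per 30 days")
print(f"{allowed_downtime_minutes(99.95, 365) / 60:.2f} hours per 365 days")
```

Under those assumed windows, 99.95% availability corresponds to about 21.6 minutes of downtime per 30 days, or about 4.38 hours per year.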

VM

Dedicated VM is the standard virtualized offering. Your workload runs in an isolated VM on shared physical hardware. This is the most common instance type across providers and covers the majority of GPU offers on the platform.

When to use Dedicated VM:
  • Production inference servers
  • Interactive training sessions
  • Demos and customer-facing workloads
  • Any job requiring uninterrupted runtime without needing bare-metal performance

Bare Metal

Dedicated Bare Metal gives you full access to a physical server with no hypervisor or VM layer. You get the entire machine, which eliminates virtualization overhead and provides predictable hardware performance. GPU count varies by offer and provider; Bare Metal servers range from single-GPU configurations up to multi-GPU servers.

Bare Metal GPU offers are displayed with the BAREMETAL suffix in the GPU type name on the dashboard. They appear alongside standard VM offers in the GPU offers list; there is no separate section or tab for them.

When to use Dedicated Bare Metal:
  • Workloads sensitive to virtualization overhead
  • Use cases that require direct hardware access for performance or compliance reasons

On the Dashboard

  1. Open the Spheron dashboard and go to Deploy GPUs.
  2. Browse the GPU offers list. Look for the BAREMETAL suffix in the GPU Type field of each offer card.
  3. Select the matching offer, complete the deployment form (OS image, SSH key, optional startup script), and deploy.

Cluster

Dedicated Cluster is Voltage Park's bare-metal GPU offering for multi-GPU workloads. Each Cluster instance is an entire 8-GPU physical server with no hypervisor layer. Cluster instances are the only sub-type that supports multi-GPU distributed training and the Kubernetes addon.

Two interconnect options are available:

| Interconnect | Bandwidth | Notes |
| --- | --- | --- |
| InfiniBand | 3.2 Tbps | H100 SXM5 with NVLink; optimal for all-reduce-heavy DDP and ZeRO-3 training |
| Ethernet | 100 Gbps | Lower cost; suitable for distributed jobs where network bandwidth is not the bottleneck |
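
To gauge whether network bandwidth is likely to be the bottleneck, a rough lower bound on transfer time at each line rate can be computed. The sketch below is idealized: it ignores protocol overhead, latency, and the extra traffic of a real collective such as all-reduce. The 14 GB payload (7 billion parameters at 2 bytes each) is an assumption chosen for illustration, and the function name is hypothetical.

```python
def ideal_transfer_seconds(payload_bytes: float, link_gbps: float) -> float:
    """Lower-bound time to move payload_bytes over a link running at its full line rate."""
    bits = payload_bytes * 8
    return bits / (link_gbps * 1e9)


# Illustrative payload only: 7e9 parameters * 2 bytes = 14 GB (an assumption).
payload = 14e9
for name, gbps in (("InfiniBand", 3200), ("Ethernet", 100)):
    print(f"{name}: {ideal_transfer_seconds(payload, gbps):.3f} s")
```

At these line rates the same payload takes about 0.035 s over InfiniBand and about 1.12 s over Ethernet, a 32x difference. If a transfer of that size happens every training step, the Ethernet figure can dominate step time; if communication is infrequent or payloads are small, the cheaper option may be sufficient.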

Cluster instances are currently available on Voltage Park only. Support for additional providers is in progress.

Cluster offers are identifiable by the CLUSTER designation in the offers list.

When to use Cluster:
  • Large-scale distributed training (PyTorch DDP, DeepSpeed ZeRO-3) on bare-metal H100 hardware
  • Kubernetes workloads on bare-metal GPU nodes (enable the K8s addon)
  • Multi-GPU jobs requiring maximum GPU-to-GPU bandwidth (use InfiniBand)

Selecting Instance Type via Dashboard

When deploying from the Spheron dashboard:

  1. Go to Deploy GPUs and browse the GPU offers list.
  2. Each offer card shows the GPU model, vCPUs, RAM, storage, and price.
  3. Identify the offering type by the offer details:
    • Spot: labeled as interruptible; typically shows a spot price alongside the regular price.
    • Dedicated VM: standard VM price with no interruption risk; the most common offering across providers.
    • Dedicated Bare Metal: identified by BAREMETAL in the GPU type name on the dashboard (e.g., H100_SXM5_BAREMETAL).
    • Dedicated Cluster: Voltage Park multi-GPU bare-metal offers showing 8 GPUs. Identified by the CLUSTER designation in the offer name.
  4. Select the offer that matches your workload requirements and click Deploy.
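
The identification rules in step 3 can be expressed as a small classifier, for example when filtering a list of offer names copied from the dashboard. This is a sketch under stated assumptions: `classify_offer` and its `interruptible` flag are hypothetical, the only naming example given on this page is H100_SXM5_BAREMETAL, and the exact form of the CLUSTER designation (shown here as `H100_SXM5_CLUSTER`) is a guess.

```python
def classify_offer(gpu_type_name: str, interruptible: bool = False) -> str:
    """Classify an offer from its GPU type name and an interruptible flag (hypothetical helper)."""
    name = gpu_type_name.upper()
    # Spot always runs in a VM, so the interruptible flag decides first.
    if interruptible:
        return "Spot"
    # Cluster offers carry the CLUSTER designation (exact format assumed).
    if "CLUSTER" in name:
        return "Dedicated Cluster"
    # Bare Metal offers end with the BAREMETAL suffix.
    if name.endswith("BAREMETAL"):
        return "Dedicated Bare Metal"
    # Everything else is the standard virtualized offering.
    return "Dedicated VM"


for offer in ("H100_SXM5_BAREMETAL", "H100_SXM5_CLUSTER", "H100_SXM5"):
    print(offer, "->", classify_offer(offer))
```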

Additional Resources