Skip to content

Guides to Deploy on Spheron

Where to Start

  • Training a model? → Start with Distributed Training for multi-GPU, or pick any RTX 4090 Spot instance for single-GPU fine-tuning
  • Running inference?vLLM Server for an OpenAI-compatible API; Ollama for interactive local usage
  • Running an AI node? → See the AI Nodes section below

Training

Model training guides, from single-GPU fine-tuning to large-scale distributed runs.

Distributed Training (PyTorch DDP)

Multi-GPU PyTorch DDP and DeepSpeed ZeRO-3 on a Voltage Park bare-metal H100 NVLink cluster (up to 8× H100). Covers torchrun, gradient checkpointing, BF16 precision, checkpoint persistence, and GPU monitoring.

Hardware: Voltage Park Cluster (H100 NVLink, up to 8 GPUs)

LLM Inference

Deploy and serve large language models on Spheron GPU instances.

vLLM Inference Server

OpenAI-compatible inference server using vLLM on H100 or A100. Includes a systemd service for persistence, SSH tunnel access, and performance tuning flags.

Hardware: H100 80GB (7B–13B models) · 2× A100 80GB (30B+)

Ollama + Open WebUI

Browser-based chat interface backed by Ollama on an RTX 4090. Docker Compose setup with GPU passthrough; pull any model with one command.

Hardware: RTX 4090 (24GB VRAM)

Qwen3-Omni-30B-A3B

Multimodal AI model with 30B parameters supporting text, audio, and vision inputs.

Qwen3-VL 4B & 8B

Vision-language models in 4B and 8B parameter versions for image understanding.

Chandra OCR

Specialized OCR model for document processing and text extraction.

Soulx Podcast-1.7B

Compact 1.7B parameter model optimized for podcast and audio content generation.

Janus CoderV-8B

Code generation and understanding model with 8B parameters.

Baidu Ernie-4.5-VL-28B-A3B

Advanced vision-language model from Baidu with 28B parameters.

AI Nodes

Deploy and run specialized AI network nodes.

Gonka AI Node

Deploy Gonka AI node infrastructure for AI compute network participation.

Pluralis Node 0

Set up and run Pluralis Node 0 for distributed AI network participation.

Additional Resources