Guides to Deploy on Spheron

Comprehensive deployment guides for running AI nodes and large language models on Spheron.

Available Deployment Guides

🤖 AI Nodes

Deploy and run specialized AI network nodes:

Gonka AI Node

Deploy Gonka AI node infrastructure for decentralized AI workloads.

Pluralis Node 0

Set up and run Pluralis Node 0 for distributed AI network participation.

Inference Devnet Node

Deploy inference devnet nodes for AI model serving infrastructure.

Gensyn AI Node

Run Gensyn AI nodes to participate in a decentralized machine learning network.

🧠 Large Language Models (LLMs)

Deploy and run production-ready LLMs:

Qwen3-Omni-30B-A3B

Multimodal AI model with 30B parameters supporting text, audio, and vision inputs.

Qwen3-VL 4B & 8B

Vision-language models in 4B and 8B parameter versions for image understanding.

Chandra OCR

Specialized OCR model for document processing and text extraction.

Soulx Podcast-1.7B

Compact 1.7B-parameter model optimized for podcast and audio content generation.

Janus CoderV-8B

Code generation and understanding model with 8B parameters.

Baidu Ernie-4.5-VL-28B-A3B

Advanced vision-language model from Baidu with 28B parameters.

Getting Started

Before deploying models or nodes:

  1. Ensure you have an active Spheron AI account
  2. Check GPU requirements
    • Each guide lists minimum GPU specifications
    • See Platform Overview for available GPUs
  3. Review pricing
  4. Understand resource needs
    • VRAM requirements vary by model size
    • Storage needs depend on model and dataset size
    • Network bandwidth for multi-node setups
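As a rough way to reason about step 4, inference VRAM can be estimated from parameter count and precision. The sketch below is a common rule of thumb (weights plus ~20% overhead for activations and KV cache), not a Spheron-specific calculator; actual usage depends on context length, batch size, and serving framework.

```python
def estimate_vram_gb(params_billions, bytes_per_param=2, overhead=1.2):
    """Rough inference VRAM estimate in GB: model weights plus ~20%
    overhead for activations and KV cache.

    bytes_per_param: 2 for FP16/BF16, 1 for 8-bit, 0.5 for 4-bit.
    """
    return params_billions * bytes_per_param * overhead

# An 8B model in FP16 needs roughly 19 GB; 4-bit quantization fits
# the same model in under 5 GB.
print(round(estimate_vram_gb(8), 1))                       # 19.2
print(round(estimate_vram_gb(8, bytes_per_param=0.5), 1))  # 4.8
```

This is why the guides span 16GB-80GB of VRAM: a 30B-parameter model in FP16 already needs on the order of 72 GB before quantization.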

Deployment Types

Quick Deploy

Most guides include quick deployment options:

  • Pre-configured instances
  • One-click setup scripts
  • Automated environment setup

Custom Deployment

For advanced users:

  • Manual configuration
  • Custom model parameters
  • Fine-tuning options
  • Multi-GPU setups

Production Deployment

Enterprise-ready configurations:

  • High-availability setups
  • Load balancing
  • Monitoring and logging
  • Auto-scaling considerations

Common Requirements

Hardware

  • GPU: Varies by model (RTX 4090 minimum for most)
  • VRAM: 16GB-80GB depending on model
  • Storage: 50GB-500GB for models and data
  • RAM: 32GB+ recommended

Software

  • OS: Ubuntu 22.04 LTS (recommended)
  • CUDA: 11.8+ or 12.x
  • Python: 3.9-3.11
  • Docker: Optional but recommended
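A quick way to verify an instance against the list above is to check the interpreter version and look up the expected tools on `PATH`. This is a minimal sketch; the exact tool set depends on your image, and a missing `docker` is fine since it is optional.

```python
import shutil
import sys

# Sanity-check the software stack against the requirements above.
major, minor = sys.version_info[:2]
supported = (3, 9) <= (major, minor) <= (3, 11)
print(f"Python {major}.{minor}",
      "(supported)" if supported else "(outside 3.9-3.11)")

# Look up CUDA tooling and Docker on PATH; shutil.which returns None
# when a tool is absent.
for tool in ("nvcc", "nvidia-smi", "docker"):
    print(f"{tool}:", shutil.which(tool) or "not found")
```

Running this once after the instance boots catches most version mismatches before a deployment script fails halfway through.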

Network

  • Stable internet connection
  • Open ports for API access
  • Sufficient bandwidth for data transfer
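To confirm an API port is actually reachable after deployment, a simple TCP probe is often enough. The port below (8000) is only an illustrative default for inference servers, not a Spheron-mandated value; substitute your instance's address and port.

```python
import socket

def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within
    the timeout, False otherwise."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example probe against a hypothetical local inference API.
print(port_open("127.0.0.1", 8000))
```

If the probe fails, check the instance's firewall rules and that the serving process is bound to the expected interface, not just `localhost`.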

Support

Need help with deployment?

  • Documentation: Each guide has detailed step-by-step instructions
  • Community: Join Discord for help
  • Quick Start: See Quick Start for basic deployment
  • Connection Guides: Check Connecting to Instances for access methods

Next Steps

Choose a deployment guide above to get started with your AI workload on Spheron.