Guides to Deploy on Spheron
Comprehensive deployment guides for running AI nodes and large language models on Spheron.
Available Deployment Guides
🤖 AI Nodes
Deploy and run specialized AI network nodes:
Gonka AI Node
Deploy Gonka AI node infrastructure for decentralized AI workloads.
Pluralis Node 0
Set up and run Pluralis Node 0 for distributed AI network participation.
Inference Devnet Node
Deploy inference devnet nodes for AI model serving infrastructure.
Gensyn AI Node
Run Gensyn AI nodes to participate in the decentralized machine learning network.
🧠 Large Language Models (LLMs)
Deploy and run production-ready LLMs:
Qwen3-Omni-30B-A3B
Multimodal AI model with 30B parameters supporting text, audio, and vision inputs.
Qwen3-VL 4B & 8B
Vision-language models in 4B and 8B parameter versions for image understanding.
Chandra OCR
Specialized OCR model for document processing and text extraction.
Soulx Podcast-1.7B
Compact 1.7B parameter model optimized for podcast and audio content generation.
Janus CoderV-8B
Code generation and understanding model with 8B parameters.
Baidu Ernie-4.5-VL-28B-A3B
Advanced vision-language model from Baidu with 28B parameters.
Getting Started
Before deploying models or nodes:
- Ensure you have an active Spheron AI account
  - See Getting Started if you're new
- Check GPU requirements
  - Each guide lists minimum GPU specifications
  - See Platform Overview for available GPUs
- Review pricing
  - Check Billing for cost information
  - Consider Reserved GPUs for long-term deployments
- Understand resource needs
  - VRAM requirements vary by model size
  - Storage needs depend on model and dataset size
  - Network bandwidth for multi-node setups
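The resource checklist above can be turned into a rough back-of-the-envelope calculation. The sketch below assumes FP16 weights (2 bytes per parameter) and a ~20% overhead factor for activations and KV cache; these are common rules of thumb, not Spheron-published figures, and quantized models need proportionally less.

```python
def estimate_vram_gb(params_billion: float, bytes_per_param: int = 2,
                     overhead: float = 1.2) -> float:
    """Rough VRAM estimate in GB: weight memory plus an assumed
    ~20% overhead for activations and KV cache."""
    return params_billion * bytes_per_param * overhead

# An 8B model in FP16 needs roughly 19 GB, which is why a 16 GB
# card is a tight fit; a 30B model needs roughly 72 GB.
print(estimate_vram_gb(8))    # ~19.2
print(estimate_vram_gb(30))   # ~72.0
```

This is only a starting point for sizing a GPU; the individual model guides list the tested minimums.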
Deployment Types
Quick Deploy
Most guides include quick deployment options:
- Pre-configured instances
- One-click setup scripts
- Automated environment setup
Custom Deployment
For advanced users:
- Manual configuration
- Custom model parameters
- Fine-tuning options
- Multi-GPU setups
Production Deployment
Enterprise-ready configurations:
- High-availability setups
- Load balancing
- Monitoring and logging
- Auto-scaling considerations
Common Requirements
Hardware
- GPU: Varies by model (RTX 4090 minimum for most)
- VRAM: 16GB-80GB depending on model
- Storage: 50GB-500GB for models and data
- RAM: 32GB+ recommended
Software
- OS: Ubuntu 22.04 LTS (recommended)
- CUDA: 11.8+ or 12.x
- Python: 3.9-3.11
- Docker: Optional but recommended
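The software prerequisites above can be sanity-checked with a short script before starting a deployment. This is a minimal sketch using only the Python standard library; the supported Python range and tool names come from the list above, and `check_prerequisites` is an illustrative helper, not part of any Spheron tooling.

```python
import shutil
import sys

def check_prerequisites(min_py=(3, 9), max_py=(3, 11)):
    """Return a list of warning strings about the local environment."""
    warnings = []
    py = sys.version_info[:2]
    if not (min_py <= py <= max_py):
        warnings.append(
            f"Python {py[0]}.{py[1]} is outside the supported "
            f"range {min_py[0]}.{min_py[1]}-{max_py[0]}.{max_py[1]}"
        )
    # nvidia-smi implies a working NVIDIA driver; Docker is optional
    # but recommended, so its absence is only reported, not fatal.
    for tool in ("nvidia-smi", "docker"):
        if shutil.which(tool) is None:
            warnings.append(f"{tool} not found on PATH")
    return warnings

if __name__ == "__main__":
    for w in check_prerequisites():
        print("WARNING:", w)
```

Running this on a fresh instance surfaces missing drivers or an unsupported Python version before a model download wastes time and bandwidth.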
Network
- Stable internet connection
- Open ports for API access
- Sufficient bandwidth for data transfer
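Once a port is opened for API access, it helps to confirm it is actually reachable. A minimal sketch using Python's standard library; `port_reachable` is an illustrative helper (the host and port are placeholders for your instance's address), not a Spheron API.

```python
import socket

def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds
    within the given timeout, False otherwise."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: verify a model server is answering on its API port
# before pointing clients at it.
# port_reachable("my-instance.example.com", 8000)
```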
Support
Need help with deployment?
- Documentation: Each guide has detailed step-by-step instructions
- Community: Join Discord for help
- Quick Start: Follow the Quick Start guide for a basic first deployment
- Connection Guides: Check Connecting to Instances for access methods
Additional Resources
- Security Best Practices - Secure your deployments
- SSH Connection - Connect to your instances
- Jupyter Notebook - Interactive development
- API Reference - Automate deployments
Choose a deployment guide above to get started with your AI workload on Spheron.