DataCrunch - Mounting Shared Storage

Mount persistent shared filesystems on DataCrunch instances using the Network File System (NFS) protocol.

Overview

DataCrunch shared filesystems (SFS) use the NFS protocol for high-performance network-attached storage. Each filesystem is accessible via a datacenter-specific NFS endpoint and is identified by a unique pseudopath.

  • Protocol: NFS
  • Connection: datacenter-region NFS endpoint
  • Performance: parallel connections via the nconnect mount option

Prerequisites

Before starting, ensure you have:

  1. Created a volume with provider: datacrunch via the Volumes API
  2. Attached the shared filesystem to your DataCrunch instance
  3. Retrieved the following from the Spheron AI dashboard:
    • Pseudopath — starts with / (e.g., /grgwrgwrg-20f77f6a)
    • Datacenter location slug (e.g., fin-01)
  4. SSH access to your DataCrunch instance

Mounting Process

Connect to Your Instance

SSH into your DataCrunch GPU instance using the connection details from the instance's Details drawer:

ssh ubuntu@your-instance-ip

See SSH Connection Setup for detailed connection instructions.

Install NFS Client Tools

Ensure the NFS client is installed on your instance (Ubuntu):

sudo apt-get update
sudo apt-get install -y nfs-common
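The nconnect mount option used below also needs client-side support, which landed in Linux kernel 5.3 (stock Ubuntu images ship far newer kernels). A quick sketch to confirm the running kernel is new enough, using nothing beyond uname and shell parameter expansion:

```shell
# nconnect support was added to the Linux NFS client in kernel 5.3.
KERNEL_VERSION="$(uname -r)"           # e.g. 5.15.0-91-generic
MAJOR="${KERNEL_VERSION%%.*}"
REST="${KERNEL_VERSION#*.}"
MINOR="${REST%%.*}"
if [ "$MAJOR" -gt 5 ] || { [ "$MAJOR" -eq 5 ] && [ "$MINOR" -ge 3 ]; }; then
  NCONNECT_OK=yes
else
  NCONNECT_OK=no
fi
echo "kernel $KERNEL_VERSION, nconnect supported: $NCONNECT_OK"
```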

Create Mount Directory

Create a directory where you'll mount the shared filesystem. Replace <SFS_NAME> with your preferred directory name:

sudo mkdir -p /mnt/<SFS_NAME>

Common choices include:

  • /mnt/storage - Standard mount location
  • /mnt/datasets - For ML/AI datasets
  • /mnt/models - For model checkpoints

Mount the Filesystem

Mount the shared filesystem. Replace the placeholders with your values:

  • <DC> — datacenter location slug (e.g., fin-01)
  • <PSEUDOPATH> — filesystem pseudopath (e.g., /grgwrgwrg-20f77f6a)
  • <SFS_NAME> — your mount directory name
sudo mount -t nfs -o nconnect=16 nfs.<DC>.datacrunch.io:<PSEUDOPATH> /mnt/<SFS_NAME>

Example:

sudo mount -t nfs -o nconnect=16 nfs.fin-01.datacrunch.io:/grgwrgwrg-20f77f6a /mnt/storage

Persist the Mount in /etc/fstab (Recommended)

Add the filesystem to /etc/fstab so it remounts automatically on reboot:

grep -qxF 'nfs.<DC>.datacrunch.io:<PSEUDOPATH> /mnt/<SFS_NAME> nfs defaults,nconnect=16 0 0' /etc/fstab \
  || echo 'nfs.<DC>.datacrunch.io:<PSEUDOPATH> /mnt/<SFS_NAME> nfs defaults,nconnect=16 0 0' | sudo tee -a /etc/fstab

Example:

grep -qxF 'nfs.fin-01.datacrunch.io:/grgwrgwrg-20f77f6a /mnt/storage nfs defaults,nconnect=16 0 0' /etc/fstab \
  || echo 'nfs.fin-01.datacrunch.io:/grgwrgwrg-20f77f6a /mnt/storage nfs defaults,nconnect=16 0 0' | sudo tee -a /etc/fstab
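The grep -qxF … || echo … pattern is idempotent: re-running it never duplicates the entry, because grep -qxF matches the exact whole line and suppresses the append when it is already present. A sketch against a scratch file (the real command targets /etc/fstab) shows the behavior:

```shell
# Demonstrate the append-once pattern on a scratch file instead of /etc/fstab.
SCRATCH="$(mktemp)"
ENTRY='nfs.fin-01.datacrunch.io:/grgwrgwrg-20f77f6a /mnt/storage nfs defaults,nconnect=16 0 0'

# Run the guarded append twice; the echo only fires when the entry is missing.
grep -qxF "$ENTRY" "$SCRATCH" || echo "$ENTRY" >> "$SCRATCH"
grep -qxF "$ENTRY" "$SCRATCH" || echo "$ENTRY" >> "$SCRATCH"

COUNT="$(grep -cxF "$ENTRY" "$SCRATCH")"
echo "entry appears $COUNT time(s)"
rm -f "$SCRATCH"
```

After editing /etc/fstab you can apply the entry without rebooting via sudo mount -a, and dry-run the file's syntax with sudo mount -fav.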

Verify Mount

Confirm the filesystem mounted successfully:

df -h

Look for your shared filesystem in the output:

Filesystem                                          Size  Used Avail Use% Mounted on
...
nfs.fin-01.datacrunch.io:/grgwrgwrg-20f77f6a       100G  1.0G   99G   1% /mnt/storage

Set Permissions (Optional)

Make the mounted storage writable by your user:

sudo chown ubuntu:ubuntu /mnt/<SFS_NAME>

Replace ubuntu:ubuntu with your username if different.
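To confirm the permissions took effect, try creating and removing a file. This sketch uses a temporary directory as a stand-in so it runs anywhere; on the instance, point MOUNT_DIR at your actual /mnt/<SFS_NAME>:

```shell
# Stand-in directory; replace with /mnt/<SFS_NAME> to check the real mount.
MOUNT_DIR="$(mktemp -d)"
if touch "$MOUNT_DIR/.write-test" 2>/dev/null; then
  rm -f "$MOUNT_DIR/.write-test"
  STATUS=writable
else
  STATUS=read-only
fi
echo "$MOUNT_DIR is $STATUS"
rmdir "$MOUNT_DIR"
```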

Where to Find the Pseudopath

After attaching the volume to an instance, open the Volumes page in the Spheron AI dashboard. Select your volume from the sidebar — the Pseudopath is displayed in the volume details. It starts with / and looks like /grgwrgwrg-20f77f6a.

The datacenter location slug is also visible there in the NFS mount command (e.g., nfs.fin-01.datacrunch.io).


Mounting to Every Node in a Cluster

To mount the shared filesystem across all nodes in a cluster at once, use pdsh (install it first; the -a flag targets every host in your configured node list). Replace the placeholders as above:

pdsh -a "sudo mkdir -vp /mnt/<SFS_NAME> && grep -qxF 'nfs.<DC>.datacrunch.io:<PSEUDOPATH> /mnt/<SFS_NAME> nfs defaults,nconnect=16 0 0' /etc/fstab || echo 'nfs.<DC>.datacrunch.io:<PSEUDOPATH> /mnt/<SFS_NAME> nfs defaults,nconnect=16 0 0' | sudo tee -a /etc/fstab && sudo mount /mnt/<SFS_NAME>"

Using Your Mounted Volume (DataCrunch)

Accessing the Storage

Once mounted, use the shared filesystem like any local directory:

# Navigate to the storage
cd /mnt/storage
 
# Create files and directories
mkdir my-project
echo "Hello, storage!" > my-project/readme.txt
 
# List contents
ls -lh /mnt/storage/

Checking Storage Usage

# Check space on the mounted filesystem
df -h /mnt/storage
 
# Check detailed disk usage
du -sh /mnt/storage/*

Working with Large Datasets

The mounted filesystem is ideal for:

  • ML/AI training datasets
  • Model checkpoints and artifacts
  • Shared data across multiple instances in the same datacenter
  • Persistent application data

# Example: Download dataset to storage
cd /mnt/storage
wget https://example.com/large-dataset.tar.gz
tar -xzf large-dataset.tar.gz

Unmounting Shared Storage (DataCrunch)

Unmount when swapping volumes, detaching the filesystem, or terminating the instance.

Unmount the Filesystem

sudo umount /mnt/<SFS_NAME>

If you get a "target is busy" error, check for active processes:

# Check what's using the storage
lsof /mnt/<SFS_NAME>
 
# Or use fuser
fuser -m /mnt/<SFS_NAME>

Stop those processes (or cd out of the mount directory), then retry. As a last resort, a lazy unmount detaches the mount point immediately and completes the cleanup once it is no longer busy:

sudo umount -l /mnt/<SFS_NAME>

Remove from File System Table

Remove the entry from /etc/fstab to prevent auto-mount on next boot:

# Edit fstab manually
sudo nano /etc/fstab
 
# Or remove automatically with sed
sudo sed -i '/datacrunch\.io.*nfs/d' /etc/fstab

Troubleshooting (DataCrunch)

Mount Fails with "Connection Refused"

Cause: Wrong datacenter slug, pseudopath, or filesystem not shared with this instance.

Solution:
  1. Verify the volume is attached to your instance in the Spheron AI dashboard
  2. Double-check the datacenter slug and pseudopath
  3. Confirm the instance and filesystem are in the same datacenter location
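To rule out basic network problems, you can probe the NFS endpoint directly. NFS serves on TCP port 2049; this sketch uses bash's built-in /dev/tcp, with the document's example endpoint standing in for yours:

```shell
# Hypothetical endpoint built from the example datacenter slug; substitute yours.
NFS_HOST="nfs.fin-01.datacrunch.io"
# NFS listens on TCP port 2049; bash's /dev/tcp opens a raw TCP connection.
if timeout 3 bash -c "exec 3<>/dev/tcp/$NFS_HOST/2049" 2>/dev/null; then
  STATUS=reachable
else
  STATUS=unreachable
fi
echo "$NFS_HOST:2049 is $STATUS"
```

If the port is unreachable, the slug is likely wrong or the filesystem is not shared with this instance; if it is reachable but the mount still fails, re-check the pseudopath.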

Mount Fails with "No such file or directory"

Cause: Mount point directory doesn't exist.

Solution:
sudo mkdir -p /mnt/<SFS_NAME>
sudo mount -t nfs -o nconnect=16 nfs.<DC>.datacrunch.io:<PSEUDOPATH> /mnt/<SFS_NAME>

Filesystem Not Mounting on Boot

Cause: Network not ready when fstab mounts are processed.

Solution: Add the _netdev option to the fstab entry:

nfs.<DC>.datacrunch.io:<PSEUDOPATH> /mnt/<SFS_NAME> nfs defaults,nconnect=16,_netdev 0 0

The _netdev option tells the system to wait for network before mounting.
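On systemd-based images (stock Ubuntu qualifies), an alternative is to mount on first access instead of at boot, which sidesteps network-ordering issues entirely. A sketch of the fstab entry using systemd's automount support (noauto skips the boot-time mount; x-systemd.automount mounts on first access):

```
nfs.<DC>.datacrunch.io:<PSEUDOPATH> /mnt/<SFS_NAME> nfs defaults,nconnect=16,_netdev,noauto,x-systemd.automount 0 0
```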

Performance Issues

Cause: Suboptimal nconnect value or high network latency.

Solutions:
  • Use nconnect=16, the maximum the Linux NFS client allows; higher values are rejected at mount time
  • Ensure instance and filesystem are in the same datacenter

Fstab entry with the maximum parallel connections:

nfs.<DC>.datacrunch.io:<PSEUDOPATH> /mnt/<SFS_NAME> nfs defaults,nconnect=16 0 0
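To quantify throughput rather than guess, time a sequential write with dd. The sketch writes to a temporary directory so it runs anywhere; point TARGET at /mnt/<SFS_NAME> to measure the NFS mount itself:

```shell
# Stand-in target; set TARGET=/mnt/<SFS_NAME> to benchmark the NFS mount.
TARGET="$(mktemp -d)"
# Write 64 MiB; conv=fsync flushes the data before dd reports a rate.
RESULT="$(dd if=/dev/zero of="$TARGET/.tp-test" bs=1M count=64 conv=fsync 2>&1 | tail -n 1)"
echo "$RESULT"
rm -f "$TARGET/.tp-test"
rmdir "$TARGET"
```

Run it once against local disk and once against the mount to see how much latency the network adds.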

Checking Mount Status

# List all NFS mounts
mount | grep nfs
 
# Check NFS statistics
nfsstat
 
# Verify fstab syntax
sudo mount -fav

Best Practices (DataCrunch)

Organization:
  • Use descriptive mount points: /mnt/data, /mnt/models, /mnt/datasets
  • Create subdirectories for different projects
  • Document what data is stored where
Performance:
  • Use nconnect=16 (the Linux NFS client maximum) for better throughput
  • Keep instances and shared filesystems in the same datacenter location
Data Safety:
  • Maintain backups of critical data
  • Filesystems persist independently of instances, but data can still be lost due to accidental deletion
  • Test backup and restore procedures
Security:
  • Restrict access using filesystem permissions
  • Only mount filesystems shared explicitly with your instance
  • Audit who has access to shared storage

Quick Reference

DataCrunch NFS Mounting

Setup Process:
  1. Install nfs-common package
  2. Create mount directory (/mnt/<SFS_NAME>)
  3. Mount with sudo mount -t nfs -o nconnect=16 nfs.<DC>.datacrunch.io:<PSEUDOPATH> /mnt/<SFS_NAME>
  4. Persist in /etc/fstab
  5. Verify with df -h
Key Configuration:
nfs.<DC>.datacrunch.io:<PSEUDOPATH> /mnt/<SFS_NAME> nfs defaults,nconnect=16 0 0
Common Operations:
  • Mount: sudo mount -t nfs -o nconnect=16 nfs.<DC>.datacrunch.io:<PSEUDOPATH> /mnt/<SFS_NAME>
  • Unmount: sudo umount /mnt/<SFS_NAME>
  • Check status: df -h /mnt/<SFS_NAME>
  • Monitor: nfsstat

Protocol: Network File System (NFS)

Key Benefits:
  • ✅ Persistent storage independent of instances
  • ✅ Shared access across multiple instances in the same datacenter
  • ✅ Automatic mounting on boot (with fstab configuration)
  • ✅ No data loss when instances terminate
  • ✅ High-performance parallel connections via nconnect

Additional Resources

Need Help?

  • General Questions: Return to Volume Mounting Overview
  • API Issues: Check Volume API Reference
  • Use chat support in the Spheron AI dashboard for real-time assistance
  • Provider-Specific: Contact Spheron support for infrastructure issues