Comparison
| | Workspace (container) | VM cluster |
|---|---|---|
| Isolation | Shared node, containerized | Dedicated nodes, not shared |
| Privileges | Container-scoped, no host or kernel access | Full root access on each VM |
| Privileged containers | No; you’re already inside a container | Yes; run any container, including `--privileged`, with full capabilities and device mounts |
| OS and kernel | Container image only; shared host kernel | Your OS, kernel modules, and drivers |
| Multi-node | Single node | 1–8 nodes |
| InfiniBand | — | Default, between nodes |
| Storage | Optional Cluster storage or Object storage | 2 TiB boot disk per node (not durable), plus optional NFS shared storage |
| Billing | On-demand, per hour | Reserved contract, prepaid |
When to choose a workspace
- A container already covers your workload — you don’t need kernel-, driver-, or OS-level control.
- You want to start experimenting immediately for short, exploratory work.
- You prefer no commitment — pay per hour for what you actually use, when capacity happens to be available.
When to choose a VM cluster
- You need kernel-, driver-, or OS-level control — install custom kernel modules, specific driver versions, Slurm, Kubernetes, your own orchestrator, or host-level security agents.
- You need multi-node distributed training — InfiniBand is provisioned by default for high-bandwidth, low-latency RDMA between your nodes.
- You need capacity locked in at your chosen start date and guaranteed available for the entire contract — so a project plan doesn’t depend on “is a GPU free right now?”
- You want lower per-GPU pricing than on-demand, with deeper discounts on longer reservation terms.
- You want dedicated hardware — each VM owns its physical server, with no other customer sharing your GPUs, CPU, memory, or network.
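
The criteria in the two lists above can be read as a simple decision rule. The following is a minimal sketch in Python, not part of any VESSL tooling: the `Requirements` class, its field names, and the `recommend` function are hypothetical and exist only to make the rule explicit. It assumes that any single VM-cluster criterion is enough to tip the choice.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    """Hypothetical description of a workload; field names are illustrative."""
    nodes: int = 1                          # number of nodes the workload spans
    needs_host_control: bool = False        # kernel modules, drivers, OS, privileged containers
    needs_reserved_capacity: bool = False   # capacity guaranteed for the whole contract
    needs_dedicated_hardware: bool = False  # no other customer on the physical server


def recommend(req: Requirements) -> str:
    """Return "vm-cluster" or "workspace" following the criteria listed above."""
    # The comparison table gives 1-8 nodes for a VM cluster and one for a workspace.
    if not 1 <= req.nodes <= 8:
        raise ValueError("node count must be between 1 and 8")
    if (
        req.nodes > 1
        or req.needs_host_control
        or req.needs_reserved_capacity
        or req.needs_dedicated_hardware
    ):
        return "vm-cluster"
    # Nothing requires more than a container on a shared node.
    return "workspace"


print(recommend(Requirements()))                         # workspace
print(recommend(Requirements(nodes=4)))                  # vm-cluster
print(recommend(Requirements(needs_host_control=True)))  # vm-cluster
```

Pricing is deliberately left out of the sketch: a lower per-GPU rate favors a VM cluster only if you can commit to a prepaid reservation.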
What about batch jobs?
A job is another containerized option: it runs a script to completion on a single node, non-interactively. Submit a fine-tuning or evaluation run and let it finish, with no infrastructure to manage. Like a workspace, it has no root access and no multi-node InfiniBand. Choose a VM cluster when you need multi-node distributed training with InfiniBand, root- or kernel-level control, or a dedicated reserved cluster.

A VM cluster is billed as a prepaid reserved contract and can’t be paid with VESSL Cloud credits. See Pricing and billing.
