

Overview

A job runs a single command in a container and exits when the command finishes. Use it for training, inference, and batch processing workloads. Jobs are submitted only through the CLI with vesslctl job create; the console shows status, logs, and metrics. New to the CLI? Start with the CLI Quickstart.

Prerequisites

  • Authenticated vesslctl: Run vesslctl auth login to complete the browser OAuth flow. See the CLI Quickstart for the full setup.
  • Available credit balance: Job creation is blocked when the balance is zero or negative. Add a payment method and top up from Billing.
  • (Optional) Persistent volume for outputs: Create an Object storage or Cluster storage volume ahead of time if you need to keep model checkpoints or other artifacts.
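Assuming vesslctl is already installed (see the CLI Quickstart), a quick preflight covering the prerequisites above might look like this:

```shell
# Confirm the CLI is on PATH before anything else
command -v vesslctl >/dev/null || { echo "vesslctl not found; install it first"; exit 1; }

vesslctl auth login          # completes the browser OAuth flow
vesslctl cluster list        # clusters you can schedule onto
vesslctl resource-spec list  # available resource specs
```

Credit balance and payment methods are managed from Billing in the console, not the CLI.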

Submit

Provide a cluster, resource spec, container image, command to run, and any volumes or environment variables you need.
vesslctl job create \
  --name my-training-job \
  --resource-spec <spec-slug> \
  --image quay.io/vessl-ai/torch:2.9.1-cuda13.0.1-py3.13-slim \
  --cmd "python train.py --epochs 10" \
  --env WANDB_API_KEY=<your-key> \
  --object-volume <volume-slug>:/output \
  --tag training-run-2026-04
Run vesslctl cluster list and vesslctl resource-spec list to discover available clusters and GPU specs. See vesslctl job create for the full flag reference.

Persist job output

Jobs run in ephemeral containers — anything written outside a mounted volume disappears when the job ends. Attach at least one persistent volume so your outputs survive.
  • Object storage (--object-volume): Shared across clusters, ideal for final artifacts like trained models and evaluation metrics. Mount at a dedicated path such as /output.
  • Cluster storage (--cluster-volume): Fast in-cluster storage, ideal for intermediate checkpoints during long training. Mount at /workspace or similar.
vesslctl job create \
  --name my-training-job \
  --resource-spec <spec-slug> \
  --image quay.io/vessl-ai/torch:2.9.1-cuda13.0.1-py3.13-slim \
  --object-volume <output-volume-slug>:/output \
  --cmd "python train.py --output /output"
Temporary storage is cleared when a job ends, even one that succeeds. If your training script writes to /tmp or the current working directory without a mounted volume, the results are lost.
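Combining the two flags above, a long training run can mount cluster storage for fast intermediate checkpoints and object storage for final artifacts. The volume slugs, spec slug, and train.py flags are placeholders:

```shell
vesslctl job create \
  --name my-training-job \
  --resource-spec <spec-slug> \
  --image quay.io/vessl-ai/torch:2.9.1-cuda13.0.1-py3.13-slim \
  --cluster-volume <scratch-volume-slug>:/workspace \
  --object-volume <output-volume-slug>:/output \
  --cmd "python train.py --checkpoint-dir /workspace --output /output"
```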

Reuse a job configuration

Export an existing job’s configuration as JSON and resubmit it later:
vesslctl job export my-job-abc123 > job-config.json
vesslctl job create --file job-config.json
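If you want to tweak the exported configuration before resubmitting, for instance to avoid a name collision, a small edit step between export and create can look like the sketch below. The "name" field is an assumption about the exported schema, so check your own job-config.json:

```shell
# Stand-in for `vesslctl job export my-job-abc123 > job-config.json`;
# the real export's schema may differ, and "name" below is an assumption.
echo '{"name": "my-training-job"}' > job-config.json

# Bump the job name so the resubmitted copy does not collide with the original.
python3 - <<'EOF'
import json

with open("job-config.json") as f:
    cfg = json.load(f)

cfg["name"] = cfg["name"] + "-v2"

with open("job-v2.json", "w") as f:
    json.dump(cfg, f, indent=2)
EOF

cat job-v2.json
# Then resubmit with: vesslctl job create --file job-v2.json
```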

Submit a job from inside a workspace

Workspaces ship with vesslctl pre-authenticated via a workload token, so you can iterate on a script in JupyterLab and submit the same code as a batch job from the same shell. See vesslctl workspace for details.
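For example, from a terminal inside the workspace you can submit the script you have been editing without running vesslctl auth login first. The job name, spec slug, and command below are illustrative:

```shell
# The workload token authenticates this call automatically.
vesslctl job create \
  --name train-from-workspace \
  --resource-spec <spec-slug> \
  --image quay.io/vessl-ai/torch:2.9.1-cuda13.0.1-py3.13-slim \
  --cmd "python train.py"
```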

See also