
Image Generation Guides

Deploy GPU-accelerated text-to-image generation models on Spheron GPU instances.

VRAM requirements

| Model | Min VRAM | Recommended GPU |
| --- | --- | --- |
| Stable Diffusion 1.5 | 6 GB | Any 6 GB+ GPU |
| SDXL | 10 GB | RTX 4090 (24 GB) |
| FLUX.1-dev | 16 GB | RTX 4090 (24 GB) |
| Stable Diffusion 3.5 Medium | 12 GB | RTX 4090 (24 GB) |
| Stable Diffusion 3.5 Large | 24 GB | RTX 4090 (24 GB) or A100 40 GB |
| FLUX.2-dev | 80 GB | H100 80 GB (FP8) or H200 141 GB |
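
As a quick way to read the table, the sketch below lists which models fit in a given amount of VRAM. The minimum-VRAM figures are copied from the table; the helper name `models_that_fit` is illustrative and not part of any Spheron tooling.

```python
# Minimum VRAM (GB) per model, copied from the VRAM requirements table.
MIN_VRAM_GB = {
    "Stable Diffusion 1.5": 6,
    "SDXL": 10,
    "FLUX.1-dev": 16,
    "Stable Diffusion 3.5 Medium": 12,
    "Stable Diffusion 3.5 Large": 24,
    "FLUX.2-dev": 80,
}


def models_that_fit(vram_gb):
    """Return models whose minimum VRAM is <= vram_gb, smallest requirement first."""
    fits = [(need, name) for name, need in MIN_VRAM_GB.items() if need <= vram_gb]
    return [name for need, name in sorted(fits)]


# Example: which models can run on a 24 GB card such as an RTX 4090?
print(models_that_fit(24))
```

Meeting the minimum only means the model should load; the recommended GPU column remains the better guide for comfortable headroom.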

Available guides

FLUX.1 & FLUX.2

Black Forest Labs text-to-image models. FLUX.1-dev delivers state-of-the-art photorealism on an RTX 4090; FLUX.2 targets the highest fidelity and runs on an H100.

Hardware: RTX 4090 24GB (FLUX.1-dev) · H100 80GB (FLUX.2)

Stable Diffusion 3.5 & SDXL

Stability AI diffusion models, from SD 1.5 (6GB) through SDXL (10GB) to SD 3.5 Large (24GB). Includes a FastAPI /generate endpoint for programmatic use.

Hardware: 6–24GB VRAM depending on model variant
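
For a rough idea of what programmatic use of the /generate endpoint could look like, here is a minimal client sketch. The request schema is an assumption: the JSON body with a single `prompt` field, the base URL, and the helper name `build_generate_request` are all hypothetical, so check the guide for the fields the server actually accepts.

```python
import json
import urllib.request


def build_generate_request(base_url, prompt):
    """Build a POST request for the /generate endpoint.

    ASSUMPTION: the body shape {"prompt": ...} is hypothetical; the real
    endpoint may expect different or additional fields.
    """
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Building the request does not contact the server.
# "http://localhost:8000" is a placeholder address, not a documented default.
req = build_generate_request("http://localhost:8000", "a lighthouse at dusk")
print(req.get_method(), req.full_url)
```

Sending it is a matter of passing `req` to `urllib.request.urlopen`; how the image comes back (raw bytes, base64, or a file path) depends on how the server in the guide is written.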

ComfyUI

Node-based visual workflow server for image generation. Runs as a Docker container on port 8188, with SSH tunnel setup for access. Supports custom workflows via a JSON API.

Hardware: RTX 4090 24GB (recommended)
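
The sketch below shows one way to queue a workflow over the JSON API once the SSH tunnel is up. It assumes the tunnel forwards local port 8188 to the instance and that the workflow was exported from ComfyUI in its API format; the user and host in the tunnel command are placeholders, and the helper names are illustrative.

```python
import json
import urllib.request

# ASSUMPTION: an SSH tunnel is already forwarding local port 8188, e.g.
#   ssh -L 8188:localhost:8188 <user>@<instance-ip>
COMFYUI_URL = "http://127.0.0.1:8188"


def build_prompt_request(workflow, base_url=COMFYUI_URL):
    """Wrap an API-format workflow dict in a POST request to ComfyUI's /prompt route."""
    payload = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_workflow(workflow, base_url=COMFYUI_URL):
    """Send the workflow to the server and return its decoded JSON reply."""
    with urllib.request.urlopen(build_prompt_request(workflow, base_url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Only `queue_workflow` touches the network, so the request can be built and inspected locally before the tunnel is open.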

What's next