Image Generation Guides
Deploy text-to-image generation models on Spheron GPU instances.
VRAM requirements
| Model | Min VRAM | Recommended GPU |
|---|---|---|
| Stable Diffusion 1.5 | 6 GB | Any 6 GB+ GPU |
| SDXL | 10 GB | RTX 4090 (24 GB) |
| FLUX.1-dev | 16 GB | RTX 4090 (24 GB) |
| Stable Diffusion 3.5 Medium | 12 GB | RTX 4090 (24 GB) |
| Stable Diffusion 3.5 Large | 24 GB | RTX 4090 (24 GB) or A100 40 GB |
| FLUX.2-dev | 80 GB | H100 80 GB (FP8) or H200 141 GB |
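As a quick way to apply the table, the minimal sketch below lists which models fit in a given amount of VRAM. The dictionary copies the Min VRAM column above; the function name `models_that_fit` is illustrative and not part of any Spheron tooling.

```python
# Minimum VRAM in GB per model, copied from the table above.
MIN_VRAM_GB = {
    "Stable Diffusion 1.5": 6,
    "SDXL": 10,
    "FLUX.1-dev": 16,
    "Stable Diffusion 3.5 Medium": 12,
    "Stable Diffusion 3.5 Large": 24,
    "FLUX.2-dev": 80,
}


def models_that_fit(vram_gb: float) -> list[str]:
    """Return the models whose minimum VRAM is at most vram_gb, smallest first."""
    fitting = [(need, name) for name, need in MIN_VRAM_GB.items() if need <= vram_gb]
    return [name for _, name in sorted(fitting)]
```

For example, `models_that_fit(24)` returns every model in the table except FLUX.2-dev. Minimum VRAM is a lower bound only; the Recommended GPU column leaves headroom for larger batches and resolutions.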
Available guides
FLUX.1 & FLUX.2
Black Forest Labs text-to-image models. FLUX.1-dev delivers state-of-the-art photorealism on an RTX 4090; FLUX.2 runs on an H100 for the highest fidelity.
Hardware: RTX 4090 24 GB (FLUX.1-dev) · H100 80 GB (FLUX.2)
Stable Diffusion 3.5 & SDXL
Stability AI diffusion models, from SD 1.5 (6 GB) through SDXL (10 GB) to SD 3.5 Large (24 GB). Includes a FastAPI /generate endpoint for programmatic use.
Hardware: 6 to 24 GB VRAM, depending on model variant
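To illustrate what programmatic use of a /generate endpoint might look like, here is a minimal client sketch using only the Python standard library. The request schema is not given on this page, so the JSON field names (`prompt`, `steps`), the helper names, and the example address are all assumptions; match them to the request model defined in the guide's FastAPI app.

```python
import json
import urllib.request


def build_generate_request(base_url: str, prompt: str, steps: int = 30) -> urllib.request.Request:
    """Build a JSON POST request for a /generate endpoint.

    The field names "prompt" and "steps" are assumed, not confirmed by the guide.
    """
    body = json.dumps({"prompt": prompt, "steps": steps}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 300.0) -> bytes:
    """Send the request and return the raw response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read()
```

A call such as `send(build_generate_request("http://127.0.0.1:8000", "a lighthouse at dusk"))` would then return whatever the server responds with; the host and port here are placeholders. The long default timeout reflects that a single generation can take a while on larger models.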
ComfyUI
Node-based visual workflow server for image generation. Runs as a Docker container on port 8188 and is reached through an SSH tunnel. Supports custom workflows via its JSON API.
Hardware: RTX 4090 24 GB (recommended)
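The sketch below shows one way to queue a workflow through ComfyUI's HTTP API once port 8188 is forwarded to your machine. It assumes the tunnel exposes the server at 127.0.0.1:8188 and that the workflow dictionary was exported from ComfyUI in API format; the helper names are illustrative.

```python
import json
import urllib.request
import uuid

# Assumes a tunnel such as: ssh -L 8188:localhost:8188 <user>@<instance-ip>
COMFYUI_URL = "http://127.0.0.1:8188"


def build_prompt_request(workflow: dict, base_url: str = COMFYUI_URL) -> urllib.request.Request:
    """Wrap an API-format workflow in a JSON payload for POST /prompt."""
    payload = {"prompt": workflow, "client_id": uuid.uuid4().hex}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/prompt",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_workflow(workflow: dict, base_url: str = COMFYUI_URL) -> dict:
    """Queue the workflow on the server and return its parsed JSON response."""
    request = build_prompt_request(workflow, base_url)
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)
```

Calling `queue_workflow(workflow)` submits the job and returns immediately; the response typically carries an identifier for the queued prompt, which can be used to look up the results later.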
What's next
- Instance Types: Choose the right GPU for image generation
- Cost Optimization: Spot instances for batch image generation
- Networking: SSH tunneling and port access
- Templates & Images: Copy-ready startup scripts