# Stable Diffusion 3.5 & SDXL
Deploy Stable Diffusion 3.5 and SDXL from Stability AI on Spheron GPU instances. A FastAPI wrapper exposes a /generate endpoint for programmatic image generation.
## Recommended hardware
| Model | Min VRAM | Recommended GPU | Instance Type |
|---|---|---|---|
| SD 1.5 | 6 GB | Any 6 GB+ GPU | Spot |
| SDXL Base | 10 GB | RTX 4090 (24 GB) | Spot or Dedicated |
| SD 3.5 Medium | 10 GB | RTX 4090 (24 GB) | Dedicated |
| SD 3.5 Large | 24 GB | RTX 4090 (24 GB) or A100 40 GB | Dedicated |
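When scripting deployments, the table above can be encoded as a quick pre-launch sanity check. This is an illustrative sketch only; the model keys are made up here, and the thresholds simply mirror the table:

```python
# Minimum VRAM in GB per model, mirroring the hardware table above.
MIN_VRAM_GB = {
    "sd-1.5": 6,
    "sdxl-base": 10,
    "sd-3.5-medium": 10,
    "sd-3.5-large": 24,
}

def fits(model: str, gpu_vram_gb: int) -> bool:
    """Return True if a GPU with gpu_vram_gb of VRAM meets the model's minimum."""
    return gpu_vram_gb >= MIN_VRAM_GB[model]

print(fits("sd-3.5-large", 24))  # an RTX 4090 (24 GB) meets the minimum
```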
## Prerequisites
- A running Spheron GPU instance (see Instance Types)
- SSH access to the instance
- A HuggingFace account with access granted to `stabilityai/stable-diffusion-3.5-medium` and `stabilityai/stable-diffusion-3.5-large`: the SD 3.5 models are gated and require accepting the Stability Community License on the model pages before downloading.
## Manual setup
Use these steps to set up the server manually after SSH-ing into your instance. This works on any provider regardless of cloud-init support.
### Step 1: Connect to your instance
```shell
ssh <user>@<ipAddress>
```

Replace `<user>` with the username shown in the instance details panel (e.g., `ubuntu` for Spheron AI instances) and `<ipAddress>` with your instance's public IP.
### Step 2: Install dependencies
```shell
sudo apt-get update -y
sudo apt-get install -y python3-pip
pip install diffusers transformers accelerate torch fastapi uvicorn pillow sentencepiece huggingface_hub
```

### Step 2a: Authenticate with HuggingFace (SD 3.5 only)
SD 3.5 models require a HuggingFace token. Skip this step if you are using SD 1.5 or SDXL only.
```shell
huggingface-cli login --token <your-hf-token>
```

Replace `<your-hf-token>` with a token from huggingface.co/settings/tokens.
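Alternatively, set the token as an environment variable; `huggingface_hub` reads `HF_TOKEN` automatically, so no explicit login call is needed:

```shell
export HF_TOKEN=<your-hf-token>
```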
### Step 3: Create the server script
```shell
cat > /opt/sd_server.py << 'EOF'
import io, base64

from fastapi import FastAPI
from pydantic import BaseModel
import torch
from diffusers import StableDiffusion3Pipeline

app = FastAPI()

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3.5-medium",
    torch_dtype=torch.bfloat16,
).to("cuda")

class GenerateRequest(BaseModel):
    prompt: str
    negative_prompt: str = ""
    width: int = 1024
    height: int = 1024
    num_inference_steps: int = 40
    guidance_scale: float = 4.5

@app.post("/generate")
def generate(req: GenerateRequest):
    image = pipe(
        req.prompt,
        negative_prompt=req.negative_prompt,
        width=req.width,
        height=req.height,
        num_inference_steps=req.num_inference_steps,
        guidance_scale=req.guidance_scale,
    ).images[0]
    buf = io.BytesIO()
    image.save(buf, format="PNG")
    return {"image_b64": base64.b64encode(buf.getvalue()).decode()}

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
EOF
```

For SDXL, replace the pipeline class with `StableDiffusionXLPipeline` and the model ID with `stabilityai/stable-diffusion-xl-base-1.0`.
### Step 4: Start the server
Run the server in the foreground to verify it works:
```shell
python3 /opt/sd_server.py
```

Press Ctrl+C to stop.
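On first start the model weights are downloaded and loaded, which can take several minutes, so the server may not answer immediately. A small client-side readiness check, using only the standard library (the URL and polling intervals here are assumptions, not part of the server):

```python
import time
import urllib.error
import urllib.request

def wait_for_server(url: str, attempts: int = 30, delay: float = 10.0) -> bool:
    """Poll until the server accepts connections, or give up after `attempts` tries."""
    for _ in range(attempts):
        try:
            # Any HTTP response means the server is up and listening.
            urllib.request.urlopen(url, timeout=5)
            return True
        except urllib.error.HTTPError:
            # An HTTP error status (404, 405, ...) still means the server is up.
            return True
        except OSError:
            time.sleep(delay)
    return False
```

For example, `wait_for_server("http://localhost:8000/docs")` works because FastAPI serves its interactive docs at `/docs` by default.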
### Step 5: Run as a background service
To keep the server running after you close your SSH session, create a systemd service:
```shell
sudo tee /etc/systemd/system/sd-server.service > /dev/null << 'EOF'
[Unit]
Description=Stable Diffusion Image Generation Server
After=network.target

[Service]
Type=simple
ExecStart=/usr/bin/python3 /opt/sd_server.py
Restart=on-failure
RestartSec=10

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable sd-server
sudo systemctl start sd-server
```

## Accessing the server
### SSH tunnel
```shell
ssh -L 8000:localhost:8000 <user>@<ipAddress>
```

### Usage example
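One caveat when choosing sizes: diffusion pipelines expect `width` and `height` divisible by the model's downsampling factor, and a multiple of 16 is a safe choice for both SDXL and SD 3.5. A helper to snap arbitrary sizes (the function and its rounding policy are illustrative assumptions, not part of the server):

```python
def snap(dim: int, multiple: int = 16) -> int:
    """Round dim to the nearest multiple, never below one multiple."""
    return max(multiple, round(dim / multiple) * multiple)

print(snap(1023))  # -> 1024
```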
```python
import requests
import base64
from PIL import Image
import io

response = requests.post(
    "http://localhost:8000/generate",
    json={
        "prompt": "A serene Japanese garden with cherry blossoms, soft morning light",
        "negative_prompt": "blurry, low quality, distorted",
        "width": 1024,
        "height": 1024,
        "num_inference_steps": 40,
    },
)
response.raise_for_status()

image_bytes = base64.b64decode(response.json()["image_b64"])
image = Image.open(io.BytesIO(image_bytes))
image.save("garden.png")
print("Image saved to garden.png")
```

### Check server logs
```shell
journalctl -u sd-server -f
```

## Cloud-init startup script (optional)
If your provider supports cloud-init, you can paste this into the Startup Script field when deploying to automate the setup above.
### SD 3.5 Medium (RTX 4090)
```yaml
#cloud-config
write_files:
  - path: /opt/sd_server.py
    content: |
      import io, base64
      from fastapi import FastAPI
      from pydantic import BaseModel
      import torch
      from diffusers import StableDiffusion3Pipeline

      app = FastAPI()

      pipe = StableDiffusion3Pipeline.from_pretrained(
          "stabilityai/stable-diffusion-3.5-medium",
          torch_dtype=torch.bfloat16,
      ).to("cuda")

      class GenerateRequest(BaseModel):
          prompt: str
          negative_prompt: str = ""
          width: int = 1024
          height: int = 1024
          num_inference_steps: int = 40
          guidance_scale: float = 4.5

      @app.post("/generate")
      def generate(req: GenerateRequest):
          image = pipe(
              req.prompt,
              negative_prompt=req.negative_prompt,
              width=req.width,
              height=req.height,
              num_inference_steps=req.num_inference_steps,
              guidance_scale=req.guidance_scale,
          ).images[0]
          buf = io.BytesIO()
          image.save(buf, format="PNG")
          return {"image_b64": base64.b64encode(buf.getvalue()).decode()}

      if __name__ == "__main__":
          import uvicorn
          uvicorn.run(app, host="0.0.0.0", port=8000)
  - path: /etc/systemd/system/sd-server.service
    content: |
      [Unit]
      Description=Stable Diffusion Image Generation Server
      After=network.target

      [Service]
      Type=simple
      Environment=HF_TOKEN=<your-hf-token>
      ExecStart=/usr/bin/python3 /opt/sd_server.py
      Restart=on-failure
      RestartSec=10

      [Install]
      WantedBy=multi-user.target
runcmd:
  - apt-get update -y
  - apt-get install -y python3-pip
  - pip install diffusers transformers accelerate torch fastapi uvicorn pillow sentencepiece huggingface_hub
  - systemctl daemon-reload
  - systemctl enable sd-server
  - systemctl start sd-server
```

Replace `<your-hf-token>` with your HuggingFace token; `huggingface_hub` reads the `HF_TOKEN` environment variable automatically, so no login step is needed.

For SDXL, replace the pipeline class with `StableDiffusionXLPipeline` and the model ID with `stabilityai/stable-diffusion-xl-base-1.0`.
## What's next
- FLUX.1 & FLUX.2: Higher quality text-to-image models
- ComfyUI: Node-based workflow interface
- Image Generation Overview: VRAM requirements and model comparison
- Networking: SSH tunneling and port access