# Stable Diffusion 3.5 & SDXL
Deploy Stable Diffusion 3.5 and SDXL from Stability AI on Spheron GPU instances. A FastAPI wrapper exposes a /generate endpoint for programmatic image generation.
## Recommended hardware
| Model | Min VRAM | Recommended GPU | Instance Type |
|---|---|---|---|
| SD 1.5 | 6 GB | Any 6 GB+ GPU | Spot |
| SDXL Base | 10 GB | RTX 4090 (24 GB) | Spot or Dedicated |
| SD 3.5 Medium | 10 GB | RTX 4090 (24 GB) | Dedicated |
| SD 3.5 Large | 24 GB | RTX 4090 (24 GB) or A100 40 GB | Dedicated |
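When scripting deployments, the table above can be encoded as a quick pre-launch sanity check. This is an illustrative sketch only; the model keys are made up here, and the thresholds simply mirror the table:

```python
# Minimum VRAM in GB per model, mirroring the hardware table above.
MIN_VRAM_GB = {
    "sd-1.5": 6,
    "sdxl-base": 10,
    "sd-3.5-medium": 10,
    "sd-3.5-large": 24,
}

def fits(model: str, gpu_vram_gb: int) -> bool:
    """Return True if a GPU with gpu_vram_gb of VRAM meets the model's minimum."""
    return gpu_vram_gb >= MIN_VRAM_GB[model]

print(fits("sd-3.5-large", 24))  # an RTX 4090 (24 GB) meets the minimum
```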
## Prerequisites
- A running Spheron GPU instance (see Instance Types)
- SSH access to the instance
- A HuggingFace account with access granted to `stabilityai/stable-diffusion-3.5-medium` and `stabilityai/stable-diffusion-3.5-large`: the SD 3.5 models are gated and require accepting the Stability Community License on the model pages before downloading.
## Manual setup
Use these steps to set up the server manually after SSH-ing into your instance. This works on any provider regardless of cloud-init support.
### Step 1: Connect to your instance
```shell
ssh <user>@<ipAddress>
```

Replace `<user>` with the username shown in the instance details panel (e.g., `ubuntu` for Spheron AI instances) and `<ipAddress>` with your instance's public IP.
### Step 2: Install dependencies
```shell
sudo apt-get update -y
sudo apt-get install -y python3-pip
pip install diffusers transformers accelerate torch fastapi uvicorn pillow sentencepiece huggingface_hub
```

### Step 2a: Authenticate with HuggingFace (SD 3.5 only)
SD 3.5 models require a HuggingFace token. Skip this step if you are using SD 1.5 or SDXL only.
```shell
huggingface-cli login --token <your-hf-token>
```

Replace `<your-hf-token>` with a token from huggingface.co/settings/tokens.
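Alternatively, set the token as an environment variable; `huggingface_hub` reads `HF_TOKEN` automatically, so no explicit login call is needed:

```shell
export HF_TOKEN=<your-hf-token>
```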
### Step 3: Create the server script
```shell
cat > /opt/sd_server.py << 'EOF'
import io, base64

from fastapi import FastAPI
from pydantic import BaseModel
import torch
from diffusers import StableDiffusion3Pipeline

app = FastAPI()

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3.5-medium",
    torch_dtype=torch.bfloat16,
).to("cuda")

class GenerateRequest(BaseModel):
    prompt: str
    negative_prompt: str = ""
    width: int = 1024
    height: int = 1024
    num_inference_steps: int = 40
    guidance_scale: float = 4.5

@app.post("/generate")
def generate(req: GenerateRequest):
    image = pipe(
        req.prompt,
        negative_prompt=req.negative_prompt,
        width=req.width,
        height=req.height,
        num_inference_steps=req.num_inference_steps,
        guidance_scale=req.guidance_scale,
    ).images[0]
    buf = io.BytesIO()
    image.save(buf, format="PNG")
    return {"image_b64": base64.b64encode(buf.getvalue()).decode()}

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
EOF
```

For SDXL, replace the pipeline class with `StableDiffusionXLPipeline` and the model ID with `stabilityai/stable-diffusion-xl-base-1.0`.
### Step 4: Start the server
Run the server in the foreground to verify it works:
```shell
python3 /opt/sd_server.py
```

Press Ctrl+C to stop.
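On first start the model weights are downloaded and loaded, which can take several minutes, so the server may not answer immediately. A small client-side readiness check, using only the standard library (the URL and polling intervals here are assumptions, not part of the server):

```python
import time
import urllib.error
import urllib.request

def wait_for_server(url: str, attempts: int = 30, delay: float = 10.0) -> bool:
    """Poll until the server accepts connections, or give up after `attempts` tries."""
    for _ in range(attempts):
        try:
            # Any HTTP response means the server is up and listening.
            urllib.request.urlopen(url, timeout=5)
            return True
        except urllib.error.HTTPError:
            # An HTTP error status (404, 405, ...) still means the server is up.
            return True
        except OSError:
            time.sleep(delay)
    return False
```

For example, `wait_for_server("http://localhost:8000/docs")` works because FastAPI serves its interactive docs at `/docs` by default.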
### Step 5: Run as a background service
To keep the server running after you close your SSH session, create a systemd service:
```shell
sudo tee /etc/systemd/system/sd-server.service > /dev/null << 'EOF'
[Unit]
Description=Stable Diffusion Image Generation Server
After=network.target

[Service]
Type=simple
ExecStart=/usr/bin/python3 /opt/sd_server.py
Restart=on-failure
RestartSec=10

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable sd-server
sudo systemctl start sd-server
```

## Accessing the server
### SSH tunnel
```shell
ssh -L 8000:localhost:8000 <user>@<ipAddress>
```

### Usage example
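One caveat when choosing sizes: diffusion pipelines expect `width` and `height` divisible by the model's downsampling factor, and a multiple of 16 is a safe choice for both SDXL and SD 3.5. A helper to snap arbitrary sizes (the function and its rounding policy are illustrative assumptions, not part of the server):

```python
def snap(dim: int, multiple: int = 16) -> int:
    """Round dim to the nearest multiple, never below one multiple."""
    return max(multiple, round(dim / multiple) * multiple)

print(snap(1023))  # -> 1024
```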
```python
import requests
import base64
from PIL import Image
import io

response = requests.post(
    "http://localhost:8000/generate",
    json={
        "prompt": "A serene Japanese garden with cherry blossoms, soft morning light",
        "negative_prompt": "blurry, low quality, distorted",
        "width": 1024,
        "height": 1024,
        "num_inference_steps": 40,
    },
)
response.raise_for_status()

image_bytes = base64.b64decode(response.json()["image_b64"])
image = Image.open(io.BytesIO(image_bytes))
image.save("garden.png")
print("Image saved to garden.png")
```

### Check server logs
```shell
journalctl -u sd-server -f
```

## Cloud-init startup script (optional)
If your provider supports cloud-init, you can paste this into the Startup Script field when deploying to automate the setup above.
### SD 3.5 Medium (RTX 4090)
```yaml
#cloud-config
write_files:
  - path: /opt/sd_server.py
    content: |
      import io, base64
      from fastapi import FastAPI
      from pydantic import BaseModel
      import torch
      from diffusers import StableDiffusion3Pipeline

      app = FastAPI()

      pipe = StableDiffusion3Pipeline.from_pretrained(
          "stabilityai/stable-diffusion-3.5-medium",
          torch_dtype=torch.bfloat16,
      ).to("cuda")

      class GenerateRequest(BaseModel):
          prompt: str
          negative_prompt: str = ""
          width: int = 1024
          height: int = 1024
          num_inference_steps: int = 40
          guidance_scale: float = 4.5

      @app.post("/generate")
      def generate(req: GenerateRequest):
          image = pipe(
              req.prompt,
              negative_prompt=req.negative_prompt,
              width=req.width,
              height=req.height,
              num_inference_steps=req.num_inference_steps,
              guidance_scale=req.guidance_scale,
          ).images[0]
          buf = io.BytesIO()
          image.save(buf, format="PNG")
          return {"image_b64": base64.b64encode(buf.getvalue()).decode()}

      if __name__ == "__main__":
          import uvicorn
          uvicorn.run(app, host="0.0.0.0", port=8000)
  - path: /etc/systemd/system/sd-server.service
    content: |
      [Unit]
      Description=Stable Diffusion Image Generation Server
      After=network.target

      [Service]
      Type=simple
      Environment=HF_TOKEN=<your-hf-token>
      ExecStart=/usr/bin/python3 /opt/sd_server.py
      Restart=on-failure
      RestartSec=10

      [Install]
      WantedBy=multi-user.target
runcmd:
  - apt-get update -y
  - apt-get install -y python3-pip
  - pip install diffusers transformers accelerate torch fastapi uvicorn pillow sentencepiece huggingface_hub
  - systemctl daemon-reload
  - systemctl enable sd-server
  - systemctl start sd-server
```

Replace `<your-hf-token>` with your HuggingFace token; `huggingface_hub` reads the `HF_TOKEN` environment variable automatically, so no login step is needed.

For SDXL, replace the pipeline class with `StableDiffusionXLPipeline` and the model ID with `stabilityai/stable-diffusion-xl-base-1.0`.
## What's next
- FLUX.1 & FLUX.2: Higher quality text-to-image models
- ComfyUI: Node-based workflow interface
- Image Generation Overview: VRAM requirements and model comparison
- Networking: SSH tunneling and port access