# FLUX.1 & FLUX.2
Deploy FLUX.1 and FLUX.2 from Black Forest Labs on Spheron GPU instances. FLUX models deliver state-of-the-art photorealistic text-to-image generation.
## Recommended hardware
| Model | Min VRAM | Recommended GPU | Instance Type |
|---|---|---|---|
| FLUX.1-dev | 16 GB | RTX 4090 (24 GB) | Dedicated or Spot |
| FLUX.1-schnell | 16 GB | RTX 4090 (24 GB) | Dedicated or Spot |
| FLUX.2-dev | 80 GB | H100 80 GB (FP8) or H200 141 GB | Dedicated |
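When sizing an instance programmatically, the VRAM thresholds from the table can be expressed as a small helper. This is an illustrative sketch; the function name is hypothetical and the thresholds are simply the minimums listed above:

```python
def suggested_flux_model(vram_gb: float) -> str:
    """Map available VRAM to the FLUX variants from the table above."""
    if vram_gb >= 80:
        return "FLUX.2-dev"
    if vram_gb >= 16:
        return "FLUX.1-dev or FLUX.1-schnell"
    return "below the 16 GB minimum for FLUX"

# Example: an RTX 4090 with 24 GB
print(suggested_flux_model(24))  # prints "FLUX.1-dev or FLUX.1-schnell"
```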
## Prerequisites
- A running Spheron GPU instance (see Instance Types)
- SSH access to the instance
- A HuggingFace account with access granted to the FLUX model you intend to use: FLUX.1-dev is a gated model under a non-commercial license, so you must accept its license on the model page before downloading. FLUX.1-schnell is released under Apache 2.0 and is openly available.
## Manual setup
Use these steps to set up the server manually after SSH-ing into your instance. This works on any provider regardless of cloud-init support.
### Step 1: Connect to your instance
```bash
ssh <user>@<ipAddress>
```

Replace `<user>` with the username shown in the instance details panel (e.g., `ubuntu` for Spheron AI instances) and `<ipAddress>` with your instance's public IP.
### Step 2: Install dependencies
```bash
sudo apt-get update -y
sudo apt-get install -y python3-pip
pip install diffusers transformers accelerate torch fastapi uvicorn pillow sentencepiece huggingface_hub
```

### Step 2a: Authenticate with HuggingFace
FLUX.1-dev is a gated model and requires a HuggingFace token from an account that has accepted its license; FLUX.1-schnell can be downloaded without one, though authenticating is still recommended.
```bash
huggingface-cli login --token <your-hf-token>
```

Replace `<your-hf-token>` with a token from huggingface.co/settings/tokens.
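Alternatively, export the token as the `HF_TOKEN` environment variable, which `huggingface_hub` picks up automatically; this is the same mechanism the systemd unit below uses:

```shell
export HF_TOKEN=<your-hf-token>
```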
### Step 3: Create the server script
```bash
cat > /opt/flux_server.py << 'EOF'
import io, base64

from fastapi import FastAPI
from pydantic import BaseModel
import torch
from diffusers import FluxPipeline

app = FastAPI()

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    torch_dtype=torch.bfloat16,
).to("cuda")
# On 24 GB cards the full bf16 pipeline may not fit; if you see CUDA
# out-of-memory errors, replace .to("cuda") with pipe.enable_model_cpu_offload()

class GenerateRequest(BaseModel):
    prompt: str
    width: int = 1024
    height: int = 1024
    num_inference_steps: int = 28
    guidance_scale: float = 3.5

@app.post("/generate")
def generate(req: GenerateRequest):
    image = pipe(
        req.prompt,
        width=req.width,
        height=req.height,
        num_inference_steps=req.num_inference_steps,
        guidance_scale=req.guidance_scale,
    ).images[0]
    buf = io.BytesIO()
    image.save(buf, format="PNG")
    return {"image_b64": base64.b64encode(buf.getvalue()).decode()}

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
EOF
```

### Step 4: Start the server
Run the server in the foreground to verify it works:
```bash
python3 /opt/flux_server.py
```

The first launch downloads the model weights from HuggingFace (tens of GB for FLUX.1-dev), so expect a delay before the server is ready. Press `Ctrl+C` to stop.
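Once the server is up, you can smoke-test the endpoint from a second SSH session. The prompt and output filename here are arbitrary, and the first request is slow while the pipeline warms up:

```shell
curl -s -X POST http://localhost:8000/generate \
  -H 'Content-Type: application/json' \
  -d '{"prompt": "a lighthouse at dusk", "num_inference_steps": 28}' \
  | python3 -c 'import sys, json, base64; open("test.png", "wb").write(base64.b64decode(json.load(sys.stdin)["image_b64"]))'
```

The same request works from your local machine through an SSH tunnel (see Accessing the server below).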
### Step 5: Run as a background service
To keep the server running after you close your SSH session, create a systemd service:
```bash
sudo tee /etc/systemd/system/flux.service > /dev/null << 'EOF'
[Unit]
Description=FLUX.1 Image Generation Server
After=network.target

[Service]
Type=simple
Environment=HF_TOKEN=<your-hf-token>
ExecStart=/usr/bin/python3 /opt/flux_server.py
Restart=on-failure
RestartSec=10

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable flux
sudo systemctl start flux
```

## Accessing the server
### SSH tunnel
```bash
ssh -L 8000:localhost:8000 <user>@<ipAddress>
```

### Usage example
```python
import requests
import base64
from PIL import Image
import io

response = requests.post(
    "http://localhost:8000/generate",
    json={
        "prompt": "A photorealistic mountain landscape at sunset, golden hour lighting",
        "width": 1024,
        "height": 1024,
        "num_inference_steps": 28,
        "guidance_scale": 3.5,
    },
)
response.raise_for_status()

image_bytes = base64.b64decode(response.json()["image_b64"])
image = Image.open(io.BytesIO(image_bytes))
image.save("output.png")
print("Image saved to output.png")
```

### Check server logs
```bash
journalctl -u flux -f
```

## Cloud-init startup script (optional)
If your provider supports cloud-init, you can paste this into the Startup Script field when deploying to automate the setup above. It installs the required Python packages and starts a FastAPI server exposing a /generate endpoint on port 8000.
```yaml
#cloud-config
write_files:
  - path: /opt/flux_server.py
    content: |
      import io, base64

      from fastapi import FastAPI
      from pydantic import BaseModel
      import torch
      from diffusers import FluxPipeline

      app = FastAPI()

      pipe = FluxPipeline.from_pretrained(
          "black-forest-labs/FLUX.1-dev",
          torch_dtype=torch.bfloat16,
      ).to("cuda")

      class GenerateRequest(BaseModel):
          prompt: str
          width: int = 1024
          height: int = 1024
          num_inference_steps: int = 28
          guidance_scale: float = 3.5

      @app.post("/generate")
      def generate(req: GenerateRequest):
          image = pipe(
              req.prompt,
              width=req.width,
              height=req.height,
              num_inference_steps=req.num_inference_steps,
              guidance_scale=req.guidance_scale,
          ).images[0]
          buf = io.BytesIO()
          image.save(buf, format="PNG")
          return {"image_b64": base64.b64encode(buf.getvalue()).decode()}

      if __name__ == "__main__":
          import uvicorn
          uvicorn.run(app, host="0.0.0.0", port=8000)
  - path: /etc/systemd/system/flux.service
    content: |
      [Unit]
      Description=FLUX.1 Image Generation Server
      After=network.target

      [Service]
      Type=simple
      Environment=HF_TOKEN=<your-hf-token>
      ExecStart=/usr/bin/python3 /opt/flux_server.py
      Restart=on-failure
      RestartSec=10

      [Install]
      WantedBy=multi-user.target
runcmd:
  - apt-get update -y
  - apt-get install -y python3-pip
  - pip install diffusers transformers accelerate torch fastapi uvicorn pillow sentencepiece huggingface_hub
  - systemctl daemon-reload
  - systemctl enable flux
  - systemctl start flux
```

## What's next
- Stable Diffusion 3.5 & SDXL: Alternative image generation models
- ComfyUI: Node-based workflow interface
- Image Generation Overview: VRAM requirements and model comparison
- Networking: SSH tunneling and port access