How to Run SDXL Locally — Complete Offline Setup Guide
You can run Stable Diffusion XL on your own PC with no cloud account, no subscription, and no content filters. This guide covers hardware requirements, which UI to install, where to download models, and how to fix the errors that trip up most people. Total setup time: 15–45 minutes depending on your path.
What You Need
- GPU: NVIDIA with 8+ GB VRAM (RTX 3060 12GB or better). 4 GB works with ComfyUI + fp16 optimizations, but it's painfully slow. AMD GPUs work via DirectML — expect significantly worse performance.
- RAM: 32 GB recommended. 16 GB is technically possible, but your system will freeze when loading the base + refiner models simultaneously.
- Disk space: 20 GB minimum for one model + UI + dependencies. Budget 30–50 GB if you plan to use multiple models and LoRAs.
- OS: Windows 10/11 (primary). Linux works well. macOS with Apple Silicon runs 2–4x slower than equivalent NVIDIA setups.
- Software: Python 3.10.x and Git (skip if using a one-click installer).
Step 1 — Pick Your UI
Three options, one clear recommendation:
- Forge UI — Best for most users. It's a fork of AUTOMATIC1111 with better VRAM management, native SDXL support, and 10–30% faster generation on the same hardware. If you'd pick A1111, pick Forge instead — it's the successor.
- Fooocus — Easiest option. Extract, run, done. Auto-downloads models on first launch. Built-in inpainting, styles, and upscaling. Great if you just want to type prompts and get images.
- ComfyUI — Most powerful, steepest learning curve. Node-based workflow editor gives you total control over the generation pipeline. Pick this for custom workflows or video generation.
Or skip the install entirely: LocalForge AI ships Forge pre-configured with zero setup.
Step 2 — Install Prerequisites
If you're using Fooocus or Forge's one-click package, skip this step — everything's bundled.
For the git clone path: download Python 3.10.x (not 3.11+, which causes dependency conflicts with torch/xformers) and Git for Windows. Verify both:
python --version # should show 3.10.x
git --version # any version works
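If you'd rather script the interpreter check, a short Python snippet can confirm you're on 3.10.x before cloning anything (the function name here is ours, purely illustrative):

```python
import sys

def sdxl_python_ok(version_info=sys.version_info):
    """True when the interpreter is 3.10.x, the version the
    torch/xformers dependency pins expect on the git-clone path."""
    return (version_info[0], version_info[1]) == (3, 10)

if __name__ == "__main__":
    print("Python", sys.version.split()[0],
          "OK" if sdxl_python_ok() else "-> install 3.10.x")
```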
Step 3 — Install and Launch the UI
Forge (one-click): Download the .7z package from the Forge GitHub releases. Extract, run update.bat, then run.bat. First launch installs PyTorch and dependencies — takes 10–20 minutes.
Fooocus: Download from Fooocus releases. Extract, run run.bat. First launch auto-downloads the SDXL model (~6.5 GB), about 10–15 minutes.
ComfyUI: Download the portable package from ComfyUI releases. Extract, run run_nvidia_gpu.bat.
Your UI opens at http://127.0.0.1:7860 (Forge) or http://127.0.0.1:8188 (ComfyUI). Fooocus opens automatically.
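As a quick sanity check, you can probe whether the UI's local port is answering. The port map below mirrors the default URLs above; the helper names are ours:

```python
import socket

# Default local ports for each UI, matching the URLs above.
DEFAULT_PORTS = {"forge": 7860, "comfyui": 8188}

def local_url(ui):
    """Build the default local URL for a given UI name."""
    return f"http://127.0.0.1:{DEFAULT_PORTS[ui]}"

def ui_is_up(ui, timeout=1.0):
    """Return True if something is listening on the UI's default port."""
    with socket.socket() as s:
        s.settimeout(timeout)
        return s.connect_ex(("127.0.0.1", DEFAULT_PORTS[ui])) == 0
```

If `ui_is_up("forge")` returns False after the first launch finishes, check the console window for an error rather than refreshing the browser.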
Step 4 — Download the SDXL Model
Fooocus handles this automatically. For Forge and ComfyUI, download these files:
- SDXL 1.0 Base (~6.9 GB) — Hugging Face download
- SDXL VAE fix (~335 MB, recommended) — sdxl-vae-fp16-fix
- SDXL 1.0 Refiner (~6 GB, optional) — skip if you have 16 GB RAM or less
Place checkpoints in models/Stable-diffusion/ (Forge) or models/checkpoints/ (ComfyUI). VAE goes in models/VAE/ or models/vae/.
Always download .safetensors format — it's safer and loads faster than .ckpt.
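A frequent mistake is dropping the checkpoint into the wrong folder. This sketch (function and constant names are ours) lists the `.safetensors` files each UI would actually see, using the folder layout described above:

```python
from pathlib import Path

# Expected checkpoint folders per UI, as described above.
CHECKPOINT_DIRS = {
    "forge": "models/Stable-diffusion",
    "comfyui": "models/checkpoints",
}

def find_checkpoints(ui_root, ui="forge"):
    """List .safetensors checkpoints in the UI's expected folder
    (empty list if the folder doesn't exist or is empty)."""
    folder = Path(ui_root) / CHECKPOINT_DIRS[ui]
    if not folder.is_dir():
        return []
    return sorted(p.name for p in folder.glob("*.safetensors"))
```

An empty result usually means the file landed one directory too high or kept a `.ckpt` extension.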
Step 5 — Configure and Generate
Select the SDXL model in the checkpoint dropdown, then use these settings:
- Resolution: 1024×1024 (SDXL's native resolution). Other good sizes: 1152×896, 1216×832. Don't use 512×512 — you'll get blurry, distorted output.
- Sampler: DPM++ 2M Karras or Euler a
- Steps: 25–30
- CFG Scale: 5–7 (SDXL prefers lower CFG than SD 1.5)
Verify It Works
Run this test prompt: "a photo of a cat sitting on a windowsill, golden hour lighting, detailed fur, 8k"
You should get a coherent, detailed 1024×1024 image. Expected generation times:
- RTX 4080 (16 GB): 15–20 seconds
- RTX 3060 (12 GB): 30–45 seconds
- 8 GB GPU with --medvram: 1–3 minutes
Troubleshooting
- "CUDA out of memory": Add
--medvramor--lowvramto launch args inwebui-user.bat. Reduce resolution to 1024×1024 if you went higher. Set batch size to 1. - Black/blank images: Switch VAE to
sdxl-vae-fp16-fix. SDXL's default VAE has float16 precision issues that produce black output. - Extremely slow generation: Enable xformers (
--xformersin launch args). Switch from A1111 to Forge for better memory management. Reduce steps to 20–25. - UI won't start: Verify Python 3.10.x (not 3.11/3.12). Delete the
venvfolder and re-run the batch file to rebuild the environment. - LoRA not working: Confirm it's SDXL-compatible, not SD 1.5. Check the weight (0.7–1.0) and add the trigger word from the download page to your prompt.
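The memory flags in the first bullet can be picked mechanically from your VRAM size. The thresholds below are a rough rule of thumb, not official guidance, and the function name is ours:

```python
def vram_launch_args(vram_gb):
    """Suggest webui memory flags by GPU VRAM size.
    Thresholds are a rough rule of thumb, not official guidance."""
    if vram_gb >= 12:
        return []                 # full speed, no offloading needed
    if vram_gb >= 8:
        return ["--medvram"]      # offloads some modules to system RAM
    return ["--lowvram"]          # aggressive offloading, slowest
```

Append the returned flags to the launch-args line in webui-user.bat.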
What to Do Next
- Try fine-tuned models: Best SDXL Realistic Models — Juggernaut XL, RealVisXL, and DreamShaper XL all outperform the base checkpoint.
- Set up ComfyUI workflows: ComfyUI Setup Guide — node-based workflows for maximum control.
- Compare model architectures: Flux vs SDXL vs Hunyuan — which generation matters for your use case.
