Flux — The Best Open-Weight AI Image Model
Flux is the image model that dethroned SDXL. Built by Black Forest Labs — the same team that created Stable Diffusion — it produces sharper images, follows complex prompts more accurately, and can actually render readable text. You can run it locally on your own GPU, though you'll need more VRAM than SDXL requires.
Key Takeaway — March 2026
Flux produces better images than SDXL — sharper details, accurate text rendering, fewer hand errors. The tradeoff: it needs 12GB+ VRAM (vs 8GB for SDXL) and generates images ~4x slower.
For most users, FLUX.1 Dev with GGUF-Q8 quantization in ComfyUI is the sweet spot. If you have under 12GB VRAM, try FLUX.2 Klein 4B — it runs on ~8GB and is Apache 2.0 licensed.
What Is Flux?
Flux is a family of text-to-image models from Black Forest Labs (BFL), founded by the original creators of Stable Diffusion. Instead of the U-Net architecture used in SD 1.5 and SDXL, Flux uses a rectified flow transformer — 12B parameters for FLUX.1, up to 32B for FLUX.2.
The practical result: better image quality in fewer steps and far better natural language understanding, thanks to a T5-XXL text encoder instead of CLIP. It can also generate legible text inside images — something SDXL almost never gets right.
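To see why rectified flow gets away with fewer steps, here is a toy, hedged sketch of the sampling idea on a 1-D "image" instead of pixels. This is not BFL's implementation: the oracle velocity below stands in for the trained 12B transformer, which in the real model predicts the velocity that carries noise toward data along a near-straight path.

```python
import numpy as np

def sample_rectified_flow(target, steps=4, seed=0):
    """Euler-integrate the straight-line ODE from noise (t=1) to data (t=0)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(np.shape(target))  # pure noise at t = 1
    dt = 1.0 / steps
    for i in range(steps):
        t = 1.0 - i * dt
        v = (x - target) / t   # oracle velocity; a real model predicts this
        x = x - dt * v         # one Euler step toward t = 0
    return x

target = np.array([0.2, -1.0, 0.7])   # stand-in for a "clean image"
out = sample_rectified_flow(target, steps=4)
print(np.allclose(out, target))       # straight paths land on target: True
```

Because the trajectories are (ideally) straight lines, coarse Euler steps introduce little error, which is why few-step variants like Schnell are feasible at all.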
Why Flux Over SDXL?
Three things separate Flux from previous-gen models:
- Prompt adherence: Flux follows complex, multi-element prompts far more accurately. Describe a scene with five specific objects and Flux includes all five. SDXL drops elements or misinterprets them.
- Text rendering: One of the first open-weight models that reliably generates readable text in images. SDXL produces garbled text almost every time.
- Human anatomy: The "AI hands" problem is significantly reduced. Fewer extra fingers, fewer melted faces, fewer limb errors.
SDXL still has a more mature ecosystem — more LoRAs, more fine-tunes, more community resources. And it runs on cheaper hardware. But for raw image quality and prompt accuracy, Flux wins.
Which Flux Model Should You Use?
The naming is confusing. Here's what matters for local use:
- FLUX.1 Schnell — 1–4 step generation, Apache 2.0 (fully free, commercial use OK). Fast but lower quality. Good for prototyping.
- FLUX.1 Dev — Full quality, open-weight, non-commercial license. The most popular choice for local generation. Commercial license available from BFL for a fee.
- FLUX.2 Klein 4B — Released January 2026. Apache 2.0, runs on ~8GB VRAM, sub-second generation on consumer GPUs. Best option for lower-end hardware.
- FLUX.1/2 Pro / Max — API only. Best quality, but you can't run these locally.
- FLUX.1 Kontext — In-context image editing via natural language prompts. Also integrated into Adobe Photoshop's Generative Fill.
For most people running locally: use FLUX.1 Dev with GGUF-Q8 quantization. Community testing shows it's 99% identical to full FP16 quality at ~12GB VRAM instead of 24GB+.
System Requirements
Flux is hungrier than SDXL. Here's what you actually need:
- 12GB VRAM (RTX 3060/4070): Runs FLUX.1 Dev with GGUF-Q8. The recommended minimum for good quality.
- 8GB VRAM (RTX 4060/3060 8GB): Runs FLUX.2 Klein 4B natively. FLUX.1 Dev works with NF4 quantization but you'll see quality loss.
- 24GB VRAM (RTX 3090/4090): Full FP16 FLUX.1 without quantization. The ideal setup.
- RAM: 32GB recommended. Model loading peaks at 25–32GB system RAM. 16GB minimum with fp8 text encoder.
- Storage: ~24GB for a FLUX.1 checkpoint plus ~10GB for text encoders and VAE.
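The VRAM figures above fall out of simple back-of-envelope math on weight storage. A minimal sketch (illustrative only: real usage adds activations, text encoders, and the VAE on top of the weights):

```python
def weight_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate weight storage in GB (using 1 GB = 1e9 bytes)."""
    return params_billions * 1e9 * bytes_per_param / 1e9

# FLUX.1 (12B parameters) at common precisions
fp16 = weight_gb(12, 2.0)   # 16-bit floats
q8   = weight_gb(12, 1.0)   # ~8 bits per weight (GGUF-Q8)
nf4  = weight_gb(12, 0.5)   # ~4 bits per weight

print(f"FP16: {fp16:.0f} GB, Q8: {q8:.0f} GB, NF4: {nf4:.0f} GB")
# FP16: 24 GB, Q8: 12 GB, NF4: 6 GB
```

This is why Q8 lands near the 12GB tier and full FP16 needs a 24GB card, and why SDXL's ~3.5B parameters fit comfortably on 8GB.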
Generation speed at 1024×1024 (FLUX.1 Dev unless noted):
- RTX 4090 24GB: ~20s (30 steps)
- RTX 4070 12GB: ~60s (20 steps)
- RTX 3060 12GB: ~38–41s (Schnell, Q8, 4 steps)
For comparison, SDXL does a comparable image in ~13s on an RTX 4070. Flux trades speed for quality.
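The tradeoff is easy to quantify. Using the RTX 4070 numbers quoted above (your timings will vary with step count, resolution, and drivers):

```python
# Rough throughput math behind the "~4x slower" claim.
flux_s, sdxl_s = 60.0, 13.0  # seconds per 1024x1024 image on an RTX 4070

print(f"slowdown: {flux_s / sdxl_s:.1f}x")                      # 4.6x
print(f"images/hour: Flux {3600 / flux_s:.0f}, "
      f"SDXL {3600 / sdxl_s:.0f}")                              # Flux 60, SDXL 277
```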
How to Run Flux Locally
- ComfyUI — The most popular option. Day-one Flux support, drag-and-drop workflow files, native GGUF quantization handling. If you already use ComfyUI, just download the models.
- Forge — Fork of Automatic1111 with built-in Flux support and automatic VRAM offloading. Easier interface than ComfyUI.
- HuggingFace Diffusers — `FluxPipeline.from_pretrained()` for Python developers who want to script generation.
- LocalForge AI — Runs Forge pre-configured with Flux models, zero setup. One option if you don't want to manage Python environments yourself.
GGUF-Q8 is the preferred quantization format for local Flux generation — best quality-to-VRAM tradeoff per community benchmarks.
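For the Diffusers route, GGUF-Q8 checkpoints can be loaded directly. A hedged sketch based on Diffusers' GGUF quantization support — the community GGUF repo, prompt, and step/guidance values are illustrative, the Dev base weights require accepting BFL's license on Hugging Face, and this needs a CUDA GPU with roughly 12GB of VRAM:

```python
import torch
from diffusers import FluxPipeline, FluxTransformer2DModel, GGUFQuantizationConfig

# Load only the transformer from a community Q8 GGUF file.
transformer = FluxTransformer2DModel.from_single_file(
    "https://huggingface.co/city96/FLUX.1-dev-gguf/blob/main/flux1-dev-Q8_0.gguf",
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
)

# Text encoders and VAE still come from the base Dev repo.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # stream weights to fit ~12GB cards

image = pipe(
    "a neon sign that reads 'OPEN LATE'",
    num_inference_steps=20,
    guidance_scale=3.5,
).images[0]
image.save("flux_q8.png")
```

ComfyUI and Forge handle the same GGUF files through their UIs, so this script is only for people who want generation in code.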
The Honest Downsides
- VRAM hungry: 12B parameters vs ~3.5B for SDXL. Minimum 12GB VRAM for good quality, vs 8GB for SDXL.
- Slower generation: ~60s vs ~13s for comparable 1024×1024 images on an RTX 4070. That's over 4x slower.
- Smaller ecosystem: SDXL has years of LoRAs, fine-tunes, and community models. Flux's ecosystem is growing fast but isn't there yet as of early 2026.
- Licensing confusion: Schnell and Klein 4B are Apache 2.0. Dev is non-commercial unless you pay BFL. Pro/Max are API-only. Many users don't realize Dev has commercial restrictions.
Who Should Use Flux?
- You want the best local image quality: Flux is the answer. Nothing open-weight matches it for prompt accuracy and detail right now.
- You need text in images: Flux is the only reliable open-weight option for legible text rendering. FLUX.2 handles complex typography and infographics.
- You're on a tight VRAM budget (8GB or less): Use FLUX.2 Klein 4B, or stick with SDXL until you can upgrade your GPU.
- You need a huge LoRA ecosystem right now: SDXL still wins here. Flux's library is catching up fast — check back in 6 months.
Frequently Asked Questions
Can I run Flux on 8GB VRAM?
Yes. FLUX.2 Klein 4B runs natively on ~8GB. FLUX.1 Dev works with NF4 quantization, but with visible quality loss.
Is Flux better than Stable Diffusion XL?
For image quality, prompt adherence, and text rendering, yes. SDXL is faster, runs on cheaper hardware, and has a larger LoRA ecosystem.
Is Flux free?
Schnell and Klein 4B are Apache 2.0, so commercial use is fine. Dev is open-weight but non-commercial unless you buy a license from BFL. Pro/Max are paid API only.
Can Flux generate readable text in images?
Yes. It's one of the first open-weight models that reliably renders legible text, and FLUX.2 handles complex typography.
Is Flux uncensored?
The open-weight checkpoints ship without a built-in content filter; what you generate is still subject to the model license.
Details
| Detail | Value |
| --- | --- |
| Website | https://blackforestlabs.ai |
| Runs Locally | Yes |
| Open Source | Yes |
| NSFW Allowed | Yes |
