
How to Run Flux Locally

Flux is the best image model available for photorealism in 2026 — faces, hands, typography, skin texture. It's from the same team that created Stable Diffusion, and it runs locally on your own hardware.

You're probably wondering if your GPU can handle it. The short answer: probably yes, with the right format. Here's how to get Flux running without wasting time on the wrong setup.

About this Use Case

Flux is an open-source AI image model that runs fully local and offline. Because generation happens on your own hardware, there are no content filters or restrictions.

The Problem

You want the best-quality AI images available, and you want to run them locally. Flux is the model to beat for photorealism, but it's also the most VRAM-hungry. Most guides assume you have a 24 GB card. If you don't, you need to know which format to use and which frontend will actually work with your hardware.

Can You Run Flux Locally? (Short Answer)

Yes — on any NVIDIA card with 6+ GB VRAM. The trick is picking the right quantization format. Full precision needs 24 GB. But quantized versions (FP8, GGUF) bring that down to 8–12 GB with minimal quality loss. You'll need a frontend like ComfyUI or Forge to actually run it.

How to Install and Run Flux Locally

  1. Know your GPU's VRAM. Open Task Manager (Windows) → Performance → GPU, or run nvidia-smi in a terminal (a scripted check also appears after this list). Your VRAM number determines which Flux format to use:

    • 24 GB (RTX 3090, 4090): Run Flux Dev at full FP16 precision. Best quality, no compromises.
    • 12–16 GB (RTX 3060 12GB, 4070, 4080): Use FP8 quantized Flux Dev. You keep about 98–99% of the quality.
    • 8–10 GB (RTX 3060 8GB, 4060): Use GGUF Q5_K_S format. Quality drops slightly (~95%) but images are still excellent.
    • 6–8 GB (RTX 2060, 3060 Ti): Use GGUF Q4_K_S with CPU offloading. ~90–93% quality. Generation takes 60–90 seconds per image. Usable, but slow.
  2. Pick a frontend. Flux is a model, not an app — you need software to run it:

    • ComfyUI — Best Flux support. Native loading for all formats (FP16, FP8, GGUF, NF4). Node-based, so there's a learning curve, but it gives you the most control. See the ComfyUI install guide.
    • Forge — Simpler form-based UI. Supports Flux via NF4 and GGUF formats. Easier to learn, fewer options for advanced workflows.
    • LocalForge AI — Forge pre-configured with Flux models already included. Zero setup. Good if you want to skip the install process entirely.
  3. Download the model files. You need four files:

    • The Flux checkpoint itself (4–12 GB depending on format) — from HuggingFace or CivitAI
    • A T5-XXL text encoder (5–10 GB depending on precision) — the FP8 version saves ~5 GB with negligible quality loss
    • A CLIP-L text encoder (~250 MB) and a VAE file (~300 MB)

    Place them in the right folders: checkpoints go in models/checkpoints/ (or models/unet/ in ComfyUI), text encoders in models/clip/, VAE in models/vae/. A short script to verify this layout follows the list.

  4. Generate your first image. In ComfyUI, load the Flux workflow template. In Forge, select your Flux model from the dropdown. Type a prompt and generate. Flux excels at natural language prompts — you can describe a scene in a full sentence rather than using keyword-style prompts. (A scripted alternative to the GUI route is sketched below.)
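
If you'd rather check VRAM from code (step 1), here's a minimal sketch using PyTorch, which ComfyUI and Forge both install anyway:

    import torch

    # Report the name and total VRAM of every NVIDIA GPU PyTorch can see.
    if not torch.cuda.is_available():
        print("No CUDA GPU detected - Flux needs a supported NVIDIA card.")
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        print(f"GPU {i}: {props.name}, {props.total_memory / 1024**3:.1f} GB VRAM")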
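
To confirm the files from step 3 landed in the right folders, a quick check script. The install root and file names below are examples only (your downloads may be named differently), so adjust them to match what you actually grabbed:

    from pathlib import Path

    root = Path("ComfyUI")  # adjust to your frontend's install location
    expected = {
        "models/unet": ["flux1-dev-fp8.safetensors"],    # Flux checkpoint (example name)
        "models/clip": ["t5xxl_fp8_e4m3fn.safetensors",  # T5-XXL text encoder
                        "clip_l.safetensors"],           # CLIP-L text encoder
        "models/vae":  ["ae.safetensors"],               # Flux VAE
    }
    for folder, names in expected.items():
        for name in names:
            path = root / folder / name
            print(f"[{'ok' if path.exists() else 'MISSING'}] {path}")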
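
Finally, if you'd rather script the first-image test than click through a GUI, Flux also runs from Python via Hugging Face's diffusers library. This is an optional aside rather than part of the guide above; a minimal sketch, assuming diffusers 0.30+ and a recent PyTorch, using Schnell so a working setup confirms itself in seconds:

    import torch
    from diffusers import FluxPipeline

    # Flux Schnell: 1-4 steps, no classifier-free guidance (guidance_scale=0.0).
    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16
    )
    pipe.enable_model_cpu_offload()  # offloads idle components; helps on 8-12 GB cards

    prompt = ("A woman sitting in a coffee shop reading a book, "
              "afternoon light through the window")
    image = pipe(prompt, num_inference_steps=4, guidance_scale=0.0).images[0]
    image.save("flux_test.png")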

Where It Shines

  • Photorealism that beats everything else: Flux produces the most accurate faces, hands, eyes, and skin texture of any model in 2026. SDXL isn't close. If photorealism matters to you, this is the model.
  • Typography that actually works: Flux can render text in images accurately — signs, logos, t-shirt text. This was nearly impossible with Stable Diffusion models.
  • Natural language prompts: Describe what you want in a normal sentence. "A woman sitting in a coffee shop reading a book, afternoon light through the window" works better than keyword spam.
  • Quantization barely hurts quality: FP8 is nearly indistinguishable from FP16. Even GGUF Q5 looks excellent. You don't need a 24 GB card to get great results.

Where It Struggles

  • VRAM is the defining constraint. Full precision needs 24 GB. The text encoders alone eat 5–10 GB. Even with quantization, Flux is the heaviest model you'll run locally.
  • Generation is slow on lower-end cards. 60–90 seconds per image on 8 GB VRAM with GGUF Q4. If you're used to SDXL at 6 seconds per image, Flux will test your patience.
  • Licensing limits commercial use. Flux.1 Dev (the best-quality variant) has a non-commercial license. Only Flux.1 Schnell is Apache 2.0 for commercial use, and it sacrifices some quality for speed.
  • LoRA ecosystem is smaller than SDXL's. Flux LoRAs exist on CivitAI, but the library is much smaller than SDXL's. If you need very specific styles or characters, SDXL still has more options.

Pro Tips

  1. Start with Flux.1 Schnell if you're testing your setup. It generates in 1–4 steps instead of 20–30, so you'll know in seconds if everything is working. Switch to Dev for quality once your setup is confirmed.

  2. Use the FP8 T5 text encoder even on a 24 GB card. It saves ~5 GB of VRAM with almost zero quality difference. That freed VRAM lets you use higher-resolution generation or add LoRAs without running out of memory.

  3. If Flux feels slow with certain LoRAs, switch to the GGUF model format. A known issue causes standard-format Flux checkpoints to slow down dramatically when combined with some LoRAs; GGUF-format models don't have this problem.

Alternatives for This Use Case

Tool/Model | Why You'd Pick It | Downside
SDXL (via Forge or ComfyUI) | Runs on 6 GB VRAM, huge LoRA library, fast generation | Lower quality than Flux, especially faces/hands
LocalForge AI | Flux and Forge pre-configured, zero setup, runs offline | 50 USD one-time cost
Midjourney (cloud) | High quality, easy to use | Subscription, no local option, content filters

Verdict

Flux is the model to run if you care about image quality — nothing else matches it for photorealism in 2026. Your GPU determines the experience: 24 GB is ideal, 12 GB is great with FP8, and even 8 GB works with GGUF quantization. The setup takes 30–60 minutes, but once it's running you have the best image model available, completely offline and unrestricted. If Flux is too heavy for your hardware, SDXL through Forge is the next best option — lighter, faster, and still excellent.

About Flux

Runs Locally: Yes
Open Source: Yes
NSFW Allowed: Yes
Website: https://blackforestlabs.ai

Frequently Asked Questions

Can I run Flux on an 8 GB GPU?
Yes. Use the GGUF Q4_K_S or Q5_K_S quantized format with CPU offloading. Quality stays around 90–95% of full precision. Generation takes about 60–90 seconds per image.
What's the difference between Flux Schnell and Flux Dev?
Schnell generates in 1–4 steps (fast, lower quality). Dev generates in 20–30 steps (slower, highest quality). Schnell has an Apache 2.0 license for commercial use; Dev is non-commercial only.
Which frontend should I use for Flux?
ComfyUI has the best Flux support with all formats (FP16, FP8, GGUF, NF4). Forge is easier to learn but has fewer advanced options. Both work well for standard generation.
How much disk space does Flux need?
The model checkpoint is 4–12 GB depending on format, text encoders add 5–10 GB, and the frontend itself takes a few more. Budget 20–30 GB total for a complete Flux setup.
Is Flux better than Stable Diffusion XL?
For photorealism, faces, hands, and text rendering — yes, significantly. SDXL is lighter on VRAM, faster, and has a much larger LoRA library. Pick Flux for quality, SDXL for speed and variety.