LocalForge AI

ComfyUI Pony Diffusion V6 XL workflow (2026)

Pony Diffusion V6 XL is a 6.46 GB SDXL fine-tune with 950k+ downloads on Civitai and one hard requirement: clip skip 2. Skip that setting and your outputs look like generic SDXL mush. This guide walks through the exact ComfyUI graph setup - checkpoint loader, VAE node, CLIPSetLastLayer wired to -2, and score tag prompts - so you get Pony-quality results on the first run. You don't need exotic hardware: 8 GB VRAM handles 1024x1024 with ComfyUI's memory management. The model ships with community workflows for detailers, upscale chains, and LoRA stacking, but we're starting simple: one checkpoint, one VAE, one resolution. Get a clean baseline before you touch anything else. LocalForge AI bundles this if you'd rather skip the manual setup.

The Quick Answer

Key Takeaway - May 2026

Pony Diffusion V6 XL (civitai.com/models/257749) is a 6.46 GB SDXL-class checkpoint trained on 2.6M aesthetically ranked images, with 71k+ reviews on Civitai. You need three things configured correctly: clip skip 2 via CLIPSetLastLayer, the external sdxl_vae.safetensors wired explicitly, and score tags (score_9, score_8_up, score_7_up) at the start of your positive prompt. Generate at 1024x1024 with Euler a or DPM++ 2M Karras, 25 steps, CFG 7. The model has 950k+ downloads for a reason - it works when you follow its conventions.


What You Need

  • Pony Diffusion V6 XL checkpoint - download the "V6 (start with this one)" version from Civitai model 257749 (6.46 GB, fp16 safetensors, pruned)
  • sdxl_vae.safetensors - 335 MB, available from the same Civitai page or HuggingFace mirrors
  • ComfyUI installed and launching (this guide assumes you've done that already)
  • GPU with 8+ GB VRAM - RTX 3060 or better runs 1024x1024 without tiling. 6 GB cards work with --lowvram but expect 2-3x slower generations
  • ComfyUI Manager (optional but saves time resolving missing custom nodes)
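A 6.46 GB download can silently corrupt, and a truncated checkpoint produces confusing loader errors rather than a clean failure. One way to check is to hash the file and compare against the SHA256 shown on the Civitai version page. A minimal sketch - the path is a placeholder for wherever you saved the file:

```python
import hashlib

def sha256sum(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream-hash a large file so a 6.46 GB checkpoint never needs to fit in RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Compare the printed digest to the hash listed on the Civitai model page:
# print(sha256sum("models/checkpoints/ponyDiffusionV6XL.safetensors"))
```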

Step 1 - Set Up the Base Graph

Start with ComfyUI's default SDXL workflow. Delete any SD 1.5 nodes - Pony is SDXL-class and needs the dual CLIP encoder path. Your minimum graph:

  • CheckpointLoaderSimple - connects to KSampler, positive/negative CLIP, and VAE
  • CLIPSetLastLayer - wired between checkpoint CLIP output and both CLIPTextEncode nodes
  • VAELoader - override the checkpoint's baked VAE with the external sdxl_vae.safetensors
  • KSampler - feeds into VAEDecode, then SaveImage
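The graph above can be sketched in ComfyUI's API format (the JSON you get from "Save (API Format)" in the UI): keys are node ids, and a two-element list like ["1", 1] wires in another node's output slot. The node class names are stock ComfyUI; the checkpoint and VAE filenames are assumptions - match them to your models/ folders.

```python
# Minimal Pony V6 XL graph in ComfyUI API format.
# Filenames below are placeholders -- use whatever you actually downloaded.
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "ponyDiffusionV6XL.safetensors"}},
    "2": {"class_type": "CLIPSetLastLayer",   # clip skip 2
          "inputs": {"clip": ["1", 1], "stop_at_clip_layer": -2}},
    "3": {"class_type": "CLIPTextEncode",     # positive prompt
          "inputs": {"clip": ["2", 0],
                     "text": "score_9, score_8_up, score_7_up, your prompt here"}},
    "4": {"class_type": "CLIPTextEncode",     # negative prompt
          "inputs": {"clip": ["2", 0],
                     "text": "score_6_up, score_5_up, score_4_up"}},
    "5": {"class_type": "VAELoader",
          "inputs": {"vae_name": "sdxl_vae.safetensors"}},
    "6": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
    "7": {"class_type": "KSampler",
          "inputs": {"model": ["1", 0], "positive": ["3", 0], "negative": ["4", 0],
                     "latent_image": ["6", 0], "seed": 42, "steps": 25, "cfg": 7.0,
                     "sampler_name": "euler_ancestral", "scheduler": "karras",
                     "denoise": 1.0}},
    "8": {"class_type": "VAEDecode",          # decodes with the external VAE
          "inputs": {"samples": ["7", 0], "vae": ["5", 0]}},
    "9": {"class_type": "SaveImage",
          "inputs": {"images": ["8", 0], "filename_prefix": "pony_v6"}},
}
```

Note both text encoders (nodes 3 and 4) take their clip input from node 2, the CLIPSetLastLayer - that detail is what Step 4 below insists on.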

Pin your ComfyUI commit before you install custom nodes. Comfy updates break node compatibility more than the changelogs admit - a frozen install means your workflow JSON stays reproducible.


Step 2 - Load the Checkpoint

Point CheckpointLoaderSimple at your Pony V6 XL file. Verify the filename matches what you downloaded - renaming checkpoint files is how you spend an hour debugging "wrong model loaded" when the problem is a stale symlink.

The file is 6.46 GB in fp16 safetensors format. If you downloaded the full fp32 version (~12 GB), switch to fp16. There's no visible quality difference, and the fp16 file frees roughly 6 GB of disk space and VRAM headroom.


Step 3 - Wire the VAE

Use a separate VAELoader node pointing at sdxl_vae.safetensors (335 MB). The checkpoint has a baked VAE, but the external SDXL VAE produces cleaner color reproduction - particularly in skin tones and gradients.

Connect the VAELoader output to both your VAEDecode node and anywhere else the graph expects a VAE input. Don't leave this on "auto" - explicit VAE wiring prevents silent fallbacks when you swap checkpoints later.
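If you're editing an API-format workflow by hand, the rewiring can be scripted. A sketch - the helper and the default filename are assumptions, not part of ComfyUI itself:

```python
def force_external_vae(workflow: dict, vae_name: str = "sdxl_vae.safetensors") -> str:
    """Add a VAELoader node and point every VAEDecode at it instead of the
    checkpoint's baked VAE. Returns the new loader's node id."""
    loader_id = str(max((int(k) for k in workflow), default=0) + 1)
    workflow[loader_id] = {"class_type": "VAELoader",
                          "inputs": {"vae_name": vae_name}}
    for node in workflow.values():
        if node["class_type"] == "VAEDecode":
            node["inputs"]["vae"] = [loader_id, 0]
    return loader_id
```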


Step 4 - CLIP Set Last Layer (Clip Skip 2)

This is the setting that separates Pony results from generic SDXL. Add a CLIPSetLastLayer node and set stop_at_clip_layer to -2.

Wire it between the checkpoint's CLIP output and both CLIPTextEncode nodes (positive and negative). Every prompt encode must go through this node. If you wire one encoder directly to the checkpoint, that encoder runs at clip skip 1 and your positive/negative prompts interpret the model differently.

Clip skip 2 isn't optional for Pony V6 - skip this and your outputs look like generic SDXL with plastic skin and muddy details. The model was fine-tuned with this setting; fighting it wastes your time.
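The "every encode goes through the node" rule can be checked mechanically against an API-format workflow dict. A small validator sketch (the function is illustrative, not a ComfyUI API):

```python
def clip_skip_ok(workflow: dict) -> bool:
    """True only if every CLIPTextEncode's clip input comes from a
    CLIPSetLastLayer node set to -2 (ComfyUI's clip skip 2)."""
    for node in workflow.values():
        if node["class_type"] != "CLIPTextEncode":
            continue
        src = workflow[node["inputs"]["clip"][0]]
        if src["class_type"] != "CLIPSetLastLayer":
            return False  # this encoder bypasses the clip skip node
        if src["inputs"]["stop_at_clip_layer"] != -2:
            return False
    return True
```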


Step 5 - Score Tags in Your Prompts

Pony V6 was trained with aesthetic score tags derived from CLIP-based ranking across 2.6M images. Place them at the start of your positive prompt:

score_9, score_8_up, score_7_up, [your actual prompt here]

  • Minimum viable set: score_9, score_8_up, score_7_up - this is what most high-quality Civitai workflows use
  • Full stack: score_9, score_8_up, score_7_up, score_6_up, score_5_up, score_4_up - broader quality range, slightly less curated feel
  • Negative prompt: add score_6_up, score_5_up, score_4_up to your negative if you used only the top 3 in positive

Don't treat these as incantations. They're quality priors from training data. If your image gets muddy after adding more score tags, you added too many, not too few. Remove one layer at a time until the composition sharpens.
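The tag conventions above reduce to a few lines of string assembly. A sketch - the helper name is made up, but the tag sets mirror the lists above:

```python
TOP3 = ["score_9", "score_8_up", "score_7_up"]
LOWER = ["score_6_up", "score_5_up", "score_4_up"]

def pony_prompts(subject: str, full_stack: bool = False) -> tuple[str, str]:
    """Return (positive, negative). With only the top 3 tags in the positive,
    the lower tiers go into the negative, as described above."""
    if full_stack:
        return ", ".join(TOP3 + LOWER + [subject]), ""
    return ", ".join(TOP3 + [subject]), ", ".join(LOWER)
```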


Step 6 - Generate and Lock Settings

Set your KSampler:

  • Sampler: euler_ancestral (Euler a) for variety, dpmpp_2m for consistency
  • Scheduler: karras
  • Steps: 25 (start here; going above 30 rarely improves output)
  • CFG: 7 (safe default; 8-9 adds contrast but risks oversaturation)
  • Resolution: 1024x1024 for square, 896x1152 for portrait, 1152x896 for landscape, 832x1216 for tall compositions

Hit Queue Prompt. Your first gen should show Pony-characteristic style: strong color saturation, good anatomy, responsive to the score tags. If it looks like vanilla SDXL, your clip skip node isn't wired correctly - go back to Step 4.
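Queueing doesn't have to go through the UI: ComfyUI exposes an HTTP API, and POSTing an API-format workflow to the /prompt endpoint is equivalent to hitting Queue Prompt. The default port 8188 is ComfyUI's stock setting; adjust if you launched with --port.

```python
import json
import urllib.request

def build_payload(workflow: dict) -> bytes:
    """Wrap an API-format workflow in the envelope /prompt expects."""
    return json.dumps({"prompt": workflow}).encode("utf-8")

def queue_prompt(workflow: dict, host: str = "127.0.0.1:8188") -> dict:
    """POST the workflow to a running ComfyUI instance; the response
    includes the prompt_id of the queued job."""
    req = urllib.request.Request(f"http://{host}/prompt",
                                 data=build_payload(workflow),
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```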


Verify It Works

A correctly configured Pony V6 graph shows these tells:

  • Color saturation is noticeably higher than base SDXL at the same prompt
  • Score tag response - removing score tags from positive prompt produces visibly lower-quality output
  • Clip skip test - temporarily set CLIPSetLastLayer to -1, regenerate with the same seed. If the output changes (more plastic, less detailed), your -2 setting was working correctly
  • VAE test - disconnect your external VAE, let the checkpoint's baked VAE take over. If colors shift slightly, your external VAE was active
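The clip skip test above is easy to get wrong by hand (changing the seed between runs invalidates the comparison). One way to keep the A/B honest is to generate both variants from the same dict so only stop_at_clip_layer differs - a sketch, with an illustrative helper name:

```python
import copy

def clip_skip_ab(workflow: dict) -> tuple[dict, dict]:
    """Return two seed-identical copies of a workflow, one at clip skip 2 (-2)
    and one at -1, so the only difference between the runs is that setting."""
    a, b = copy.deepcopy(workflow), copy.deepcopy(workflow)
    for variant, layer in ((a, -2), (b, -1)):
        for node in variant.values():
            if node["class_type"] == "CLIPSetLastLayer":
                node["inputs"]["stop_at_clip_layer"] = layer
    return a, b
```

Queue both copies and compare the images; the original dict is left untouched.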

Troubleshooting

  • Plastic/waxy skin: Check clip skip first - this is the cause 80% of the time. Then check stacked LoRAs fighting the base model's rendering. Third, lower CFG by 1-2 points
  • VRAM out of memory at 1024x1024: Enable ComfyUI's --lowvram flag. If you're on exactly 8 GB, close other GPU-using apps (Discord, browser hardware acceleration). Tiled VAE decode also helps
  • Score tags do nothing: Your CLIPTextEncode is wired directly to the checkpoint, bypassing CLIPSetLastLayer. Trace the graph connections from checkpoint CLIP output
  • Colors look washed out: You're using the baked VAE instead of the external sdxl_vae.safetensors. Wire a VAELoader node explicitly
  • LoRA breaks composition: You added a LoRA before your baseline was stable. Remove all LoRAs, confirm the base generates cleanly, then add one at a time at 0.5-0.7 strength

Bottom Line

Pony V6 XL in ComfyUI is three things done right: clip skip 2, external SDXL VAE, and score tags at prompt start. Get those three nodes wired correctly and you've got a 950k-download checkpoint doing what made it popular. Build boring first, experiment after.

FAQ

What clip skip setting does Pony V6 XL need in ComfyUI?
Set CLIPSetLastLayer to stop_at_clip_layer = -2. This is mandatory - the model was fine-tuned at clip skip 2 and produces plastic, muddy output without it.
Where do I download Pony Diffusion V6 XL?
Civitai model page at civitai.com/models/257749. Download the "V6 (start with this one)" version - it's 6.46 GB in fp16 safetensors format. Verify file hashes if you mirror from third-party sources.
Does Pony V6 XL need a separate VAE file?
It has a baked VAE, but loading sdxl_vae.safetensors (335 MB) explicitly through a VAELoader node gives cleaner color reproduction - especially skin tones. Worth the extra node.
How much VRAM does Pony V6 XL need?
8 GB runs 1024x1024 in ComfyUI without tiling. 6 GB cards work with --lowvram but generations take 2-3x longer. Adding detailers and upscalers increases requirements - preview small, then commit.
What are score tags and do I have to use them?
Score tags (score_9, score_8_up, score_7_up) are aesthetic quality priors from training data. They're not required, but outputs without them are noticeably lower quality. Start with the top 3 at the beginning of your positive prompt.
Can I use SD 1.5 LoRAs with Pony V6 XL?
No. Pony is SDXL-class - it needs SDXL or Pony-specific LoRAs. SD 1.5 LoRAs will error or produce garbage. Check the LoRA's base model tag on Civitai before downloading.