FLUX 2 Dev
FLUX 2 Dev (FLUX.2-dev) is the frontier-level, open-weight rectified flow Transformer for image generation and editing. It blends a 32B rectified flow core, a long-context vision–language model, and multi-reference editing to deliver production-grade visuals.
Model Type
Rectified flow Transformer · 32B
Highlights
Multi-reference editing · 4MP renders · Long-context VLM
Ecosystem
Hugging Face · Cloudflare Workers AI · NVIDIA RTX · ComfyUI
What Is FLUX 2 Dev?
FLUX 2 Dev (FLUX.2-dev) is Black Forest Labs’ open-weight, 32-billion-parameter rectified flow Transformer for image generation and editing. It merges a latent-space flow transformer, a long-context VLM for reasoning and prompts, and a multi-reference editing path that supports multiple images in a single checkpoint.
FLUX 2 Dev core benefits
- Frontier quality open weights for production-grade image generation.
- Multi-reference editing to keep characters, style, and branding consistent.
- Long-context VLM with ~32K tokens for detailed prompts and layouts.
- Designed for RTX, edge, and cloud with quantized variants.
FLUX 2 vs FLUX.1 at a glance
- Scaled-up 32B rectified flow Transformer core, up from FLUX.1's smaller 12B flow transformer.
- Up to 4MP resolution with improved lighting, hands, faces, and text.
- Guidance distillation reduces steps and guidance scale at inference.
- Multi-reference baked into the main checkpoint for editing and consistency.
FLUX 2 Dev quick facts
Use cases
Ad creatives, hero banners, 3D concept art, product renders, interactive filters, avatars, rapid prototyping.
Ecosystem
Hugging Face Diffusers, Cloudflare Workers AI, NVIDIA RTX pipelines, ComfyUI workflows.
One-liner
FLUX 2 Dev is an open-weight, multi-reference image engine ready for real products.
Features of FLUX 2 Dev
Key capabilities of FLUX 2 Dev that help teams ship high-quality visuals faster.
Multi-reference editing
Combine multiple reference images (up to ~10) to keep characters, branding, and style consistent in one checkpoint.
High-resolution output
Up to 4MP / 4K-class images with improved text rendering, lighting, hands, and faces.
Efficient inference
Rectified flow sampling plus guidance distillation reduces steps and guidance scale for faster iterations.
Long-context VLM
Vision–language encoder with ~32K tokens to follow long prompts, layouts, and hex color instructions.
Flexible deployment
Runs via Hugging Face, Cloudflare Workers AI, RTX FP8/FP4 pipelines, and ComfyUI templates.
Ecosystem ready
Supports Diffusers integration, quantized variants, control hints, and extension APIs for tooling.
FLUX 2 Core Architecture
FLUX 2 Dev architecture combines a rectified flow Transformer, high-resolution VAE, long-context VLM, and adaptive schedulers to deliver quality and speed.
Model core
- Rectified flow Transformer in latent space.
- 32B parameters with transformer-style blocks instead of U-Net.
- High-resolution VAE decoding up to ~4MP.
- Vision–language encoder with multi-image conditioning.
Scheduler & inference
- Custom rectified flow schedules, fewer steps for drafts.
- Guidance distillation bakes guidance into weights.
- Adaptive steps: ~12–20 (preview) or 28–50 (production).
Extension & multi-reference
- Multi-image inputs (2–10) for character, style, brand consistency.
- Control hints: depth, pose, segmentation via custom nodes.
- Prompt embeddings and image masks for localized editing.
High-speed Transformer vs other architectures
FLUX 2 Dev’s Transformer core changes memory access patterns, parallelization, and quantization behavior compared to U-Net diffusion models.
| System | Core architecture | Typical steps | Resolution focus | Best at |
|---|---|---|---|---|
| FLUX 2 Dev | Rectified flow Transformer in latent space | 20–40 | Up to ~4MP | High-end, multi-ref T2I + I2I |
| SDXL | Latent diffusion + U-Net | 25–50 | HD / 4K ensemble | General T2I / I2I |
| LCM / LCM-LoRA | Distilled diffusion | 4–8 | HD | Fast previews / video |
Performance Optimization for FLUX 2 Dev
Optimize FLUX 2 Dev with quantization, weight streaming, and guidance distillation.
Optimization levers
- Quantization: 4-bit variants via Bits-and-Bytes and Diffusers.
- NVIDIA FP8 / FP4 Tensor Core pipelines for RTX speedup.
- Weight streaming/offload to fit high-end GPUs with limited VRAM (see the sketch after this list).
- Guidance distillation for fewer steps and lower guidance scales.
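A minimal sketch of the offload pattern, assuming the same Flux2Pipeline class and community 4-bit checkpoint referenced later in this guide; enable_model_cpu_offload() is the standard Diffusers hook for streaming weights to the GPU per sub-module:

import torch
from diffusers import Flux2Pipeline

pipe = Flux2Pipeline.from_pretrained(
    "diffusers/FLUX.2-dev-bnb-4bit",   # pre-quantized 4-bit variant
    torch_dtype=torch.bfloat16,
)
# Keep weights on CPU and stream each sub-module to the GPU on demand,
# trading some latency for a much smaller peak VRAM footprint.
pipe.enable_model_cpu_offload()

image = pipe(
    "studio product shot of a ceramic mug on marble, soft key light",
    num_inference_steps=20,   # draft-quality step count
    guidance_scale=4.0,
).images[0]
image.save("flux2_offload_demo.png")

For production endpoints, compare this against a full-GPU BF16 run to quantify the latency cost of offloading.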
Indicative GPU throughput
| GPU & precision | VRAM use | Time / image | Notes |
|---|---|---|---|
| RTX 4090, BF16 | 24–30 GB | 9–12 s | Full quality baseline |
| RTX 4090, FP8/4-bit + offload | 14–18 GB | 7–10 s | Weight streaming + quantization |
| RTX 5090, FP4 pipeline | ~10 GB | 4–6 s | Projected 2× over 4090 |
| Cloudflare Workers AI | Abstracted | 8–15 s (edge) | Edge latency + global reach |
Multi-backend support
Start with PyTorch + Diffusers for correctness, then profile and migrate to TensorRT or ONNX for latency-sensitive endpoints.
- PyTorch: official reference and Diffusers integration.
- TensorRT / NIM: NVIDIA-optimized runtimes for RTX and data center GPUs.
- ONNX Runtime / OpenVINO: community conversions for wider hardware.
How to Use FLUX 2 Dev
Quick-start guides for FLUX 2 Dev on Hugging Face Diffusers and Cloudflare Workers AI, plus a latency benchmark snippet.
Using FLUX 2 Dev on Hugging Face
Python + Diffusers text-to-image example using a community 4-bit quantized checkpoint.
import torch
from diffusers import Flux2Pipeline

device = "cuda"
dtype = torch.bfloat16
repo_id = "diffusers/FLUX.2-dev-bnb-4bit"  # community 4-bit quantized variant

# Load the quantized checkpoint and move it onto the GPU
pipe = Flux2Pipeline.from_pretrained(
    repo_id,
    torch_dtype=dtype,
).to(device)

prompt = (
    "Cinematic concept art of a futuristic city at sunset, "
    "soft volumetric lighting, ultra-detailed, 4k, film still"
)

# Fixed seed for reproducible outputs
image = pipe(
    prompt=prompt,
    num_inference_steps=28,
    guidance_scale=4.0,
    generator=torch.Generator(device=device).manual_seed(42),
).images[0]
image.save("flux2_city.png")
- Steps: 16–24 (drafts), 28–40 (production).
- Guidance: 3–5 is often enough thanks to distillation.
- Resolution: start at 1024×1024, upscale via SR if needed.
- Multi-image: pass a list of PIL images via the image argument (image=[...]), as sketched below.
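A minimal multi-reference sketch, assuming Flux2Pipeline accepts a list of reference images through the image argument as noted above; the file names are placeholders:

import torch
from diffusers import Flux2Pipeline
from diffusers.utils import load_image

pipe = Flux2Pipeline.from_pretrained(
    "diffusers/FLUX.2-dev-bnb-4bit",
    torch_dtype=torch.bfloat16,
).to("cuda")

# Hypothetical reference images: a character sheet and a brand style frame
refs = [
    load_image("character_sheet.png"),
    load_image("brand_style.png"),
]

image = pipe(
    prompt="the same character presenting the product in a bright studio, brand colors",
    image=refs,                     # multi-reference conditioning
    num_inference_steps=28,
    guidance_scale=4.0,
).images[0]
image.save("flux2_multiref.png")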
Deploying FLUX 2 Dev on Cloudflare Workers AI
Edge-deployed inference with env.AI.run.
export interface Env {
  AI: Ai;
}

const MODEL = "@cf/black-forest-labs/flux-2-dev";

export default {
  async fetch(request, env): Promise<Response> {
    const url = new URL(request.url);
    const prompt = url.searchParams.get("prompt") ?? "cyberpunk cat";
    const imageArrayBuffer = await env.AI.run(MODEL, { prompt });
    return new Response(imageArrayBuffer, {
      headers: { "content-type": "image/jpeg" },
    });
  },
};
Quick cURL test:
curl "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/ai/run/@cf/black-forest-labs/flux-2-dev" \
-X POST \
-H "Authorization: Bearer $API_TOKEN" \
-d '{ "prompt": "a futuristic racing car on a neon highway, cinematic" }' \
--output flux2_workers.jpg
Pattern: prototype on Workers AI, then move hot paths to custom RTX clusters when you need batching, schedulers, or LoRAs.
Latency benchmark snippet
import time

# Warm-up run so one-off compilation and caching do not skew the measurement
_ = pipe(prompt=prompt, num_inference_steps=28, guidance_scale=4.0).images[0]

start = time.perf_counter()
_ = pipe(prompt=prompt, num_inference_steps=28, guidance_scale=4.0).images[0]
elapsed = time.perf_counter() - start
print(f"Time per image: {elapsed:.2f}s")
Run across resolutions and step counts to populate your own FLUX 2 Dev performance table.
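For example, a small sweep (reusing the pipe and prompt from the Diffusers example above, and assuming the pipeline exposes the usual width/height arguments) can fill in that table:

import itertools
import time

# Sweep step counts and resolutions to build a simple latency table.
for steps, (w, h) in itertools.product([16, 28, 40], [(1024, 1024), (1536, 1024)]):
    start = time.perf_counter()
    _ = pipe(prompt=prompt, num_inference_steps=steps, guidance_scale=4.0,
             width=w, height=h).images[0]
    print(f"{w}x{h} @ {steps} steps: {time.perf_counter() - start:.2f}s")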
FLUX 2 Dev and ComfyUI
FLUX 2 Dev ships with ComfyUI support via quantized checkpoints and ready-made templates.
ComfyUI setup checklist
- Update ComfyUI to latest.
- Download the flux2-dev .safetensors checkpoint to ComfyUI/models/checkpoints/.
- Load the Flux 2 text-to-image template from built-in examples or community workflows.
- Add control nodes (pose, depth), LoRA adapters, and compositing nodes for logos and typography.
Example FLUX 2 Dev workflow skeleton
{
"nodes": [
{ "id": 1, "type": "LoadCheckpoint", "inputs": { "ckpt_name": "flux2-dev.safetensors" } },
{ "id": 2, "type": "CLIPTextEncode", "inputs": { "text": "cinematic cyborg hacker, neon lights" } },
{ "id": 3, "type": "EmptyLatentImage", "inputs": { "width": 1024, "height": 1024, "batch_size": 1 } },
{ "id": 4, "type": "KSampler", "inputs": { "model": 1, "positive": 2, "latent_image": 3, "steps": 28, "cfg": 4.0, "sampler": "flow_dpm" } },
{ "id": 5, "type": "VAEDecode", "inputs": { "samples": 4 } },
{ "id": 6, "type": "SaveImage", "inputs": { "images": 5, "filename_prefix": "flux2_dev_demo" } }
]
}
Extend with style LoRAs, control hints, and video pipelines (e.g., Wan 2.x) for animated outputs.
FLUX 2 Dev Performance Comparison
Position FLUX 2 Dev against SDXL and LCM for latency and quality trade-offs.
| Model | Approx. steps | Relative speed* | Visual quality** | Notes |
|---|---|---|---|---|
| FLUX 2 Dev | 24–32 | 1.0× | 9.4 / 10 | Frontier quality, multi-reference by design |
| SDXL | 30–40 | ~0.8× | 9.0 / 10 | Strong generalist, text weaker in many cases |
| SDXL + LCM | 6–8 | 2.5–3.0× | 8.7 / 10 | Fast iteration and video-friendly |
*Relative speed vs FLUX 2 Dev on the same GPU. **Subjective; replace with your benchmarks.
Typical FLUX 2 Dev Use Cases
Product marketing
Brand-consistent ad creatives, hero banners, multi-language posters.
Creative pipelines
Concept art, storyboards, character sheets, animation keyframes with multi-reference consistency.
Interactive experiences
Edge-hosted filters, avatars, social thumbnails via Workers AI or custom RTX endpoints.
FLUX 2 Dev Visual Gallery
A quick look at FLUX 2 Dev outputs, partner ecosystem, comparisons, and reference demos.
Future of FLUX 2 Dev
Black Forest Labs roadmap themes and likely next steps.
Bigger, better base models
Continued improvements to realism and text alignment, building on the FLUX.1 → FLUX 2 gains.
Tooling & blueprints
NVIDIA AI blueprints, NIM microservices, ComfyUI templates, and 3D-guided workflows.
Safety & provenance
NSFW/IP filters, watermarking, and C2PA metadata in reference pipelines.
Summary & Practical Recommendations for FLUX 2 Dev
Who should use FLUX 2 Dev?
- Researchers exploring rectified flow, guidance distillation, and watermarking.
- Developers building T2I/I2I APIs, creative SaaS, or visual intelligence features.
- Creators needing consistent characters/styles across campaigns and storyboards.
Deployment tips
- Local: start with diffusers/FLUX.2-dev-bnb-4bit, BF16 or FP8, 1024² drafts.
- Cloud: Hugging Face, Replicate, or FAL for managed GPU hosting (no GPU ops on your side).
- Edge: Cloudflare Workers AI for low-latency shareable images and avatars.
FAQ: FLUX 2 Dev
1) What is FLUX 2 Dev?
FLUX 2 Dev (FLUX.2-dev) is an open-weight rectified flow Transformer (32B params) for image generation and editing, built by Black Forest Labs.
2) How is FLUX 2 different from FLUX.1?
FLUX 2 uses a rectified flow Transformer with a long-context VLM, higher resolution (up to ~4MP), stronger text rendering, and multi-reference editing baked into the checkpoint.
3) How many steps should I use?
Use 12–20 steps for previews and 28–40 for production. Guidance scale 3–5 is often sufficient.
4) Does FLUX 2 Dev support multi-reference?
Yes. You can combine multiple reference images (2–10) for character, style, or brand consistency in a single run.
5) Can I run FLUX 2 Dev on a single GPU?
With 4-bit/FP8 pipelines and weight streaming, FLUX 2 Dev can run on high-end RTX cards (e.g., 4090) within ~14–18 GB VRAM.
6) How do I deploy FLUX 2 Dev at the edge?
Use Cloudflare Workers AI model @cf/black-forest-labs/flux-2-dev and the env.AI.run() API for global, low-latency responses.
7) What resolutions does FLUX 2 Dev support?
FLUX 2 Dev decodes up to ~4MP (4K-class). Start at 1024×1024 or 1536×1024 and upscale if needed.
8) How good is FLUX 2 Dev at text rendering?
It significantly improves text fidelity over FLUX.1 and many diffusion peers, making it strong for UI mockups and posters.
9) What are best practices for prompting FLUX 2 Dev?
Use natural, long prompts; include layout hints and hex colors. Keep guidance moderate (3–5) and set seeds for reproducibility.
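For example (an illustrative prompt only, with a layout hint and a hex color, reusing the earlier Diffusers pipeline):

prompt = (
    "Minimalist hero banner: product bottle on the left third, "
    "bold sans-serif headline 'PURE GLOW' on the right, "
    "background gradient from #0E1B2A to #1F4E79, soft rim light, 4k"
)
# Fixed seed so the layout can be iterated on reproducibly
image = pipe(prompt=prompt, num_inference_steps=28, guidance_scale=4.0,
             generator=torch.Generator("cuda").manual_seed(7)).images[0]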
10) Where can I find FLUX 2 Dev resources?
Search for the official FLUX 2 blog, Hugging Face model card “black-forest-labs/FLUX.2-dev”, Diffusers docs for Flux2Pipeline, and Cloudflare Workers AI model docs.