April 2026
What does AI image generation actually cost in 2026?
A founder ran a 500-image catalog refresh on a premium model and expected a $20 bill. The invoice was $147. Here is why, and how to price the next one before you click run.
The list price is not the price
Providers quote per-image cost at a default resolution, on a happy path, with no NSFW filter rejections, no agent re-prompts, and no double-charging for the failed render that came before the keeper. Real workloads pay for all of it. The effective per-image cost on a production run is typically 1.4× to 3× the headline number, depending on how aggressive your agent's retry policy is and how strict the safety filter is on your prompt distribution.
Three multipliers do most of the damage. Resolution: a 2048² render is roughly 4× the compute of 1024². Retries: a sane "regenerate until it looks right" loop runs 1.4× on average and spikes to 2.5× on stylized prompts. Rejections: a filtered output costs the same as a delivered one. Nobody refunds a flagged generation.
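The three multipliers compound, which a few lines of Python make concrete. This is a minimal sketch: the function name and its parameters are illustrative, and the values plugged in are the figures quoted above, not provider data.

```python
def effective_cost_per_image(list_price, retry=1.4, rejection=1.0, resolution=1.0):
    """Effective cost of one delivered image, in dollars.

    list_price: quoted price for a single 1024² render
    retry:      average attempts per keeper (1.4 typical, 2.5 on stylized prompts)
    rejection:  extra factor for filtered outputs that are still billed
    resolution: 1.0 for 1024², roughly 4.0 for 2048²
    """
    return list_price * retry * rejection * resolution


# A $0.04 premium render with average retries and a 5% rejection overhead.
typical = effective_cost_per_image(0.04, retry=1.4, rejection=1.05)
# The same model on stylized prompts, where retries spike to 2.5x.
stylized = effective_cost_per_image(0.04, retry=2.5, rejection=1.05)

print(f"typical:  ${typical:.4f}")
print(f"stylized: ${stylized:.4f}")
```

Passing resolution=4.0 on top of either case shows how quickly a 2048² run leaves the headline price behind.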
Per-model price bands, 2026
Numbers below are typical hosted-API prices for a single 1024² render with default steps and no editing. 2048² is roughly 3.5× to 4×. Inpainting and editing typically add 30% to 60% on the same surface.
Model            Tier      1024²     2048²    Notes
---------------  --------  --------  -------  ----------------------
SDXL Lightning   draft     $0.0025   $0.009   4-step, fastest
FLUX schnell     draft     $0.003    $0.011   open weights
SDXL Turbo       draft     $0.003    $0.011   1-step variants exist
SD3.5 Medium     mid       $0.012    $0.045   open weights
FLUX dev         mid       $0.020    $0.075   open weights
Playground v3    mid       $0.025    $0.090   strong typography
SD3.5 Large      mid       $0.030    $0.110   slower, sharper
HiDream          premium   $0.035    $0.130   photoreal
Imagen 3         premium   $0.040    $0.150   Google, strict filter
FLUX 1.1 pro     premium   $0.040    $0.150   best general quality
Ideogram v2      premium   $0.050    $0.180   best in-image text
Bands shift weekly. The shape does not. Draft models are roughly an order of magnitude cheaper than premium, mid-tier sits in the middle, and premium clusters tightly between three and a half and five cents at 1024². If you want live numbers per call, the AgentFramer model catalog lists every model with its current rate.
Workload one: solo creator, product launch
Eight final assets: one OG image, three blog illustrations, four social variants. The agent regenerates 1.4× on average. Two prompts trip the safety filter and cost as much as a delivered image (counted as a rejection multiplier of 1.05× on top of retries).
All-premium (FLUX 1.1 pro, $0.04): 8 × 1.4 × 1.05 × $0.04 = $0.47.
All-mid (Playground v3, $0.025): 8 × 1.4 × 1.05 × $0.025 = $0.29.
Mixed (draft for blog and social, premium only for the OG): 7 × 1.4 × 1.05 × $0.003 + 1 × 1.4 × 1.05 × $0.04 = $0.03 + $0.06 = $0.09.
Same eight images, and the mixed strategy costs about one-third of all-mid and about one-fifth of all-premium. The OG is the one image that actually needs the premium model. The other seven do not.
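The three totals can be reproduced with a short script. This is a sketch under the workload's stated assumptions (1.4× retries, 1.05× rejection overhead); the function name and the plan format are illustrative, and the prices come from the table above.

```python
RETRY = 1.4        # average attempts per keeper, from the workload above
REJECTION = 1.05   # overhead for filtered outputs that are still billed


def run_cost(plan):
    """plan: list of (image_count, price_per_render) pairs."""
    return sum(count * RETRY * REJECTION * price for count, price in plan)


all_premium = run_cost([(8, 0.04)])          # FLUX 1.1 pro for everything
all_mid = run_cost([(8, 0.025)])             # Playground v3 for everything
mixed = run_cost([(7, 0.003), (1, 0.04)])    # draft for seven, premium for the OG

for name, cost in [("all-premium", all_premium), ("all-mid", all_mid), ("mixed", mixed)]:
    print(f"{name:12s} ${cost:.2f}")
print(f"mixed / all-mid:     {mixed / all_mid:.2f}")
print(f"mixed / all-premium: {mixed / all_premium:.2f}")
```

Expressing a run as a list of (count, price) pairs makes it easy to test other splits, such as moving one or two more assets to the premium tier.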
Workload two: ecommerce catalog, 500 product shots per month
Apparel brand replaces flat lays with on-model shots. Most products ship at 2048² because the PDP zooms. Retries run hot at 1.6× because hands and fabric drape fail often. Filter rejections are negligible on this prompt distribution.
Premium at 2048² (FLUX 1.1 pro, $0.15): 500 × 1.6 × $0.15 = $120/month.
Mid-tier at 2048² (SD3.5 Large, $0.11): 500 × 1.6 × $0.11 = $88/month.
Two-pass: four draft candidates per product at 1024² to pick the keeper, then a premium re-render at 2048² for the winner only, with a 1.1× retry allowance on the re-render. 500 × 4 candidates × $0.003 + 500 × 1.1 × $0.15 = $6 + $82.50 = $88.50/month.
The two-pass pattern matches the mid-tier price while keeping premium quality on the deliverable. The trick is that draft candidates are cheap enough to spam four per slot, then commit compute only on the winner. This is also where tool-call patterns earn their keep: the agent calls generate_image four times in parallel, scores them, then calls generate_image once more on the winner at higher resolution.
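The call pattern described above might look like the sketch below. The generate_image function here is a local stand-in, not the real tool: its parameters (prompt, model, size, seed), the scorer, and the idea of carrying the winning seed into the second pass are all assumptions made for illustration. The cost line also ignores the 1.1× retry allowance on the final render.

```python
import asyncio
import random

DRAFT_PRICE = 0.003     # FLUX schnell at 1024², from the table above
PREMIUM_PRICE = 0.15    # FLUX 1.1 pro at 2048², from the table above


async def generate_image(prompt, model, size, seed):
    """Hypothetical stand-in for the agent's generate_image tool call."""
    await asyncio.sleep(0)  # a real call would await the provider here
    return {"prompt": prompt, "model": model, "size": size, "seed": seed}


def score(image):
    """Placeholder scorer; a real agent would use a vision model or a heuristic."""
    return random.Random(image["seed"]).random()


async def two_pass(prompt, candidates=4):
    # Pass 1: cheap draft candidates, requested concurrently.
    drafts = await asyncio.gather(
        *(generate_image(prompt, "draft", 1024, seed) for seed in range(candidates))
    )
    winner = max(drafts, key=score)
    # Pass 2: a single premium render at full resolution for the winner.
    # Whether a seed carries over between models depends on the provider.
    final = await generate_image(prompt, "premium", 2048, winner["seed"])
    cost = candidates * DRAFT_PRICE + PREMIUM_PRICE
    return final, cost


final, cost = asyncio.run(two_pass("linen shirt, on-model, studio light"))
print(final["model"], final["size"], f"${cost:.3f}")
```

Using asyncio.gather keeps the four draft requests in flight together, so the first pass takes about as long as one draft render rather than four.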
Workload three: content agency, 5,000 images per month, 20 clients
Mostly social, some hero shots, some animated thumbnails. Assume 80% at 1024² on mid-tier, 20% at 2048² on premium. Retries 1.4×, filter rejections 1.05× across the mix.
Mid-tier portion: 4,000 × 1.4 × 1.05 × $0.025 = $147/month.
Premium portion: 1,000 × 1.4 × 1.05 × $0.15 = $220.50/month.
Total: $367.50/month, or roughly $18 per client.
An agency that bills $400 to $2,000 per client per month treats $367.50 as a rounding error. The same agency on all-premium at 2048² would pay 5,000 × 1.4 × 1.05 × $0.15 = $1,102.50/month. The difference is whether image generation is a line item or a margin problem.
Two operational details matter at this volume. First, parallelism: running 20 concurrent generation calls from the same workspace will not always run 20× faster, because most providers throttle per account. Build the schedule around throughput per minute, not raw price. Second, attribution: when one workspace serves 20 clients, you need per-client cost reporting or you cannot bill back. Per-call metadata on every generate_image request, plus a daily roll-up by tag, turns an opaque monthly bill into a defensible invoice line.
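A daily roll-up by tag can be as small as the sketch below. The call log and its field names (client, day, cost) are invented for illustration; a real log would come from whatever metadata the provider lets you attach to each generate_image request.

```python
from collections import defaultdict

# Hypothetical call log: one record per billed generate_image request.
calls = [
    {"client": "acme",   "day": "2026-04-01", "cost": 0.025},
    {"client": "acme",   "day": "2026-04-01", "cost": 0.150},
    {"client": "globex", "day": "2026-04-01", "cost": 0.025},
    {"client": "acme",   "day": "2026-04-02", "cost": 0.025},
]


def daily_rollup(records):
    """Sum cost per (day, client), counting every billed call including retries."""
    totals = defaultdict(float)
    for record in records:
        totals[(record["day"], record["client"])] += record["cost"]
    return dict(totals)


for (day, client), total in sorted(daily_rollup(calls).items()):
    print(day, client, f"${total:.3f}")
```

Because retries and rejected renders are logged as ordinary calls, the per-client totals reflect what was actually billed rather than what was delivered.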
Self-host break-even, 2026 GPU rates
Rented GPU rates have settled. RTX 4090 ~$0.30/hr on community providers, A100 80GB ~$1.40/hr, H100 ~$2.50/hr. Throughput depends on model and resolution; rough numbers for 1024² with reasonable steps:
GPU         Hourly   SDXL Lightning   FLUX schnell   FLUX dev   SD3.5 Large
                     (img/hr)         (img/hr)       (img/hr)   (img/hr)
RTX 4090    $0.30    ~3,600           ~1,800         ~600       ~450
A100 80GB   $1.40    ~6,000           ~3,000         ~1,000     ~750
H100        $2.50    ~12,000          ~6,000         ~2,000     ~1,500
Implied per-image cost on rented GPUs:
RTX 4090   SDXL Lightning $0.00008 | FLUX dev $0.0005
A100       SDXL Lightning $0.00023 | FLUX dev $0.0014
H100       SDXL Lightning $0.00021 | FLUX dev $0.00125
Raw compute on a 4090 is roughly 30× to 40× cheaper than the hosted price for the same model ($0.00008 against $0.0025 for SDXL Lightning, $0.0005 against $0.020 for FLUX dev). That is not the real cost. Add cold-start time on rented hardware, an autoscaler, a queue, model weight downloads, storage, observability, and one engineer who knows how to keep it running. The honest break-even where self-host beats hosted (after engineering load) sits around 40,000 to 60,000 mid-tier images per month for a small team, lower if you already run GPU infrastructure for other reasons. Below that volume, hosted wins on total cost of ownership even when the per-image math says otherwise.
The number that surprises people on self-host is utilization. A rented H100 costs $2.50/hr whether it is rendering or idle. Hit 30% utilization and your effective per-image cost more than triples (1 / 0.3 ≈ 3.3×). Hosted providers solve this by pooling demand across thousands of customers, which is exactly why the per-image price they quote can be lower than your own GPU bill divided by your own throughput.
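The utilization effect is easy to check numerically. The sketch below covers raw GPU rental only and leaves out the engineering and infrastructure costs listed earlier; the function name is illustrative, and the rate and throughput are the H100 and FLUX dev figures from the tables above.

```python
def self_host_cost_per_image(hourly_rate, images_per_hour, utilization=1.0):
    """Raw GPU cost per image.

    utilization is the fraction of paid hours actually spent rendering.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return hourly_rate / (images_per_hour * utilization)


# H100 running FLUX dev at about 2,000 images per hour.
busy = self_host_cost_per_image(2.50, 2000)          # fully utilized
idle = self_host_cost_per_image(2.50, 2000, 0.30)    # 30% utilized

print(f"100% utilization: ${busy:.5f} per image")
print(f" 30% utilization: ${idle:.5f} per image ({idle / busy:.1f}x)")
```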
When the cheapest model is the wrong choice
The cost math breaks in three places, and they all look like savings until you measure outcomes.
In-image text. SDXL Lightning at $0.0025 will mangle "Summer Sale 30% Off" four times in a row. Ideogram v2 at $0.05 nails it on the first try. Run the math: 4 × $0.0025 of garbage plus a human hour to fix it costs more than $0.05 once.
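Brand consistency across a series. Draft models drift on character, palette, and composition. If a launch needs eight cohesive images, a draft run that takes 24 attempts to converge still bills less than eight premium renders (24 × $0.003 = $0.07 against 8 × $0.04 = $0.32), but the review time spent culling those 24 attempts makes it the more expensive path overall.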
Strict safety filters on edgy categories. Imagen 3 has the strictest filter in the premium tier. Wellness, fashion, fitness, and any prompt that mentions a body part can trigger a rejection. Rejections cost as much as deliveries. On rejection-prone categories, FLUX 1.1 pro at the same price often delivers 30% more usable output per dollar.
How to price your run before you click go
Three numbers and you have a budget: image count × retry multiplier × per-model price. Add 5% for filter rejections on body-adjacent categories. Multiply by roughly 4 (3.5× to 4× in the price table) if you are rendering at 2048² instead of 1024². If you want a hard ceiling, set a workspace credit cap and let the agent read get_credits before each batch; it will stop on its own when the wallet runs out.
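The budget formula and the credit check can be combined into a small pre-flight loop. This is a sketch, not the platform's behavior: get_credits here is a local stand-in that reads a dictionary, the wallet structure and batch format are invented for illustration, and a real provider would debit per call rather than per batch.

```python
def estimate_budget(images, retry, price, rejection=1.0, resolution=1.0):
    """Image count x retry multiplier x per-model price, with optional adjustments."""
    return images * retry * price * rejection * resolution


def get_credits(wallet):
    """Hypothetical stand-in for the agent's get_credits tool; returns dollars left."""
    return wallet["balance"]


def run_batches(batches, wallet):
    """Run batches in order, stopping before the first one the wallet cannot cover."""
    completed = []
    for name, images, retry, price in batches:
        cost = estimate_budget(images, retry, price)
        if cost > get_credits(wallet):
            break
        wallet["balance"] -= cost
        completed.append(name)
    return completed


wallet = {"balance": 100.00}  # workspace credit cap, in dollars
batches = [
    ("drafts", 2000, 1.0, 0.003),   # 500 slots x 4 draft candidates
    ("finals", 500, 1.1, 0.15),     # premium re-render at 2048²
    ("extras", 500, 1.6, 0.15),     # costs more than the remaining credit
]
done = run_batches(batches, wallet)
print(done, f"${wallet['balance']:.2f} left")
```

Checking the estimate before each batch, rather than after, means the run stops with credit still in the wallet instead of overshooting the cap.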
The rule that fits on a sticky note: draft models for exploration, mid-tier for volume, premium only for the asset that will actually get printed, posted as a hero, or read aloud in a deck. Every dollar of unnecessary premium is a dollar that could have funded ten more iterations.