When I set out to make a few hundred background images for a wallpaper app, the first wall I hit was not quality. It was how long generation took to finish, and the size of the bill waiting at the end of the month. Each image takes only seconds, but running hundreds through a top-tier model balloons both time and cost beyond what you expected. Nano Banana 2 Lite, newly added to the Gemini family, fits exactly this "many, fast, cheap" demand. It launched as the fastest and lowest-cost Gemini image model.
That does not mean you should push everything onto the cheapest model. A fast, cheap model trades that speed for weaknesses in certain areas. Here is the pattern I use as an indie developer when mass-producing images: measure the cost, then decide where to split the work.
Hold per-image cost as a formula, not a guess
The most dangerous move in batch cost design is running production on a vague "probably about this much." Start by breaking per-image cost into three parts: the model fee for one generation, the regeneration rate for images you had to redo, and the yield of images you accepted versus threw away.
Discarded images cost money too. Miss that, and the effective cost can run 1.5× your estimate or more. As a formula:

    effective cost per accepted image = per-call cost × (1 + regen rate) ÷ accept rate

    e.g. per call = X, regen rate = 8%, accept rate = 70%:
    effective = X × 1.08 ÷ 0.70 ≈ X × 1.54

So what you thought was "X per image" actually costs about 1.5× that on an accepted basis.

What matters here is not the absolute value of X. Model fees get revised, and your prompt and resolution shift it too. What you should do is generate the first 50 images for real and capture two measured values yourself: the regeneration rate and the accept rate. Once you have those two, monthly cost is just a multiplication by the number of images.
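The formula above can be written as a small helper so the two measured rates plug straight in. This is a minimal sketch: the function name is my own, and the per-call cost of 1.0 is a placeholder chosen so the result reads as a multiplier of X, not a real price.

```python
def effective_cost_per_accepted(
    per_call_cost: float, regen_rate: float, accept_rate: float
) -> float:
    """Effective cost per accepted image: per-call cost x (1 + regen rate) / accept rate."""
    if not 0 < accept_rate <= 1:
        raise ValueError("accept_rate must be in (0, 1]")
    if regen_rate < 0:
        raise ValueError("regen_rate must be non-negative")
    return per_call_cost * (1 + regen_rate) / accept_rate


# Placeholder fee of 1.0, so the output is the multiplier applied to X.
multiplier = effective_cost_per_accepted(1.0, regen_rate=0.08, accept_rate=0.70)
print(f"{multiplier:.2f}x per accepted image")  # prints "1.54x per accepted image"
```

Swap in the rates you measured on your first 50 images and the real per-call fee once you have them.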
Catch it in two tiers: cheap model and quality model
The speed and low cost of Nano Banana 2 Lite pay off most in the stage where you produce a large batch of rough drafts. Meanwhile, the single hero image you put up front, or anything where broken detail is unacceptable, is often cheaper in the end to hand to a higher-quality model — because you redo it less.
So I use a two-tier setup: generate many candidates with Lite, then finish only the chosen ones on a quality model, or accept them as-is. The code below is a skeleton that generates a candidate pool with Lite, applies a cheap score to cull, and passes only survivors to the next stage.

    import os

    from google import genai

    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

    # Model ID as written in this article. Confirm the exact ID, and whether the model
    # is served through generate_images or generate_content, in the current API docs.
    LITE_MODEL = "nano-banana-2-lite"  # fastest, cheapest model, for bulk candidate generation


    def generate_candidates(prompt: str, n: int) -> list[bytes]:
        """Generate up to n candidate images with the Lite model."""
        images = []
        for _ in range(n):
            resp = client.models.generate_images(
                model=LITE_MODEL,
                prompt=prompt,
                config={"number_of_images": 1},
            )
            # An empty response yields no candidate; the caller still counts the attempt.
            if resp.generated_images:
                images.append(resp.generated_images[0].image.image_bytes)
        return images


    def keep_or_drop(image_bytes: bytes) -> bool:
        """Minimal accept/reject: a first-pass filter on size only."""
        # Cull extremely small outputs (a sign of a broken generation)
        return len(image_bytes) > 40_000  # tune the threshold against real data


    def run_batch(prompt: str, want: int) -> list[bytes]:
        kept: list[bytes] = []
        attempts = 0
        while len(kept) < want and attempts < want * 3:  # always cap to avoid an infinite loop
            # Request only what is still missing, and never go past the cap.
            n = min(want - len(kept), want * 3 - attempts)
            attempts += n  # count every call, including ones that return nothing
            kept.extend(img for img in generate_candidates(prompt, n) if keep_or_drop(img))
        return kept

Always keep the attempts < want * 3 cap. When a prompt is hard and the accept rate is very low, an uncapped batch spins forever and both time and cost become open-ended. Hitting the cap is the signal to stop and revisit the prompt.
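The second tier, where only the chosen candidates move on, can be sketched without committing to any particular API. In the sketch below, select_and_finish, score, and finish are hypothetical names of mine: score stands in for whatever cheap ranking you use, and finish stands in for a call to your quality model. Passing no finish function means the survivors are accepted as-is.

```python
from typing import Callable, Optional


def select_and_finish(
    candidates: list[bytes],
    score: Callable[[bytes], float],
    keep: int,
    finish: Optional[Callable[[bytes], bytes]] = None,
) -> list[bytes]:
    """Rank candidates by a cheap score, keep the top few, optionally finish them."""
    ranked = sorted(candidates, key=score, reverse=True)
    chosen = ranked[:keep]
    if finish is None:
        return chosen  # accept the Lite output as-is
    return [finish(img) for img in chosen]  # hypothetical quality-model pass


# Stand-in data: byte length plays the role of the score.
pool = [b"a", b"ccc", b"bb"]
print(select_and_finish(pool, score=len, keep=2))  # [b'ccc', b'bb']
```

Keeping the quality-model call behind a plain function argument means the expensive step runs only on the images that survived the cull.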
Do not turn failures into silent retries
In high-volume generation, transient errors and empty responses always occur at some rate. Add careless unlimited retries and, during an outage, the retries pile up and only the bill grows. Give retries a cap and a backoff wait, and record failed inputs rather than discarding them.
| Failure type | Common reaction | Recommended handling |
|---|---|---|
| Transient rate limit | Immediate infinite retry | Exponential backoff, up to 3 tries |
| Empty / broken output | Accept it unnoticed | Cull with the filter; regenerate once only |
| Prompt-driven low accept rate | Brute-force more images | Stop the batch and fix the prompt |
| Interruption mid-run | Restart from scratch | Save accepted images; resume from where you left off |
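
The first row of the table, capped retries with exponential backoff plus a record of failed inputs, might look like the sketch below. The names call_with_backoff, generate, and failed are hypothetical, and catching a bare Exception is a simplification: in real code you would catch only the transient error types your SDK raises.

```python
import time
from typing import Callable, Optional


def call_with_backoff(
    generate: Callable[[str], bytes],
    prompt: str,
    failed: list[str],
    max_tries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[bytes]:
    """Try generate(prompt) up to max_tries times, doubling the wait between tries."""
    for attempt in range(max_tries):
        try:
            return generate(prompt)
        except Exception:  # simplification: narrow this to transient errors
            if attempt < max_tries - 1:
                sleep(base_delay * 2 ** attempt)  # 1s, then 2s, with the defaults
    failed.append(prompt)  # keep the input so it can be rerun later
    return None
```

The sleep parameter is injectable so the wait can be replaced in tests. After a run, the failed list holds every prompt that exhausted its tries.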
Resuming from an interruption is a high-impact design in mass generation. If you accepted 250 of 300 and then errors on the last 50 force you to throw it all out and restart, that is wasted time and cost. Save accepted images incrementally and you only fill in the remainder.
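One way to make a batch resumable is to write each accepted image to disk the moment it passes, and to count what is already there on startup. This is a sketch under assumptions: resume_batch, generate_one, and accept are hypothetical names, the .png extension is a guess at the output format, and the numbering assumes every file in the directory was written by this function with none deleted in between.

```python
from pathlib import Path
from typing import Callable


def resume_batch(
    out_dir: Path,
    want: int,
    generate_one: Callable[[], bytes],
    accept: Callable[[bytes], bool],
    max_attempts: int,
) -> int:
    """Fill out_dir up to `want` accepted images, reusing what earlier runs saved."""
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = len(list(out_dir.glob("*.png")))  # images accepted by earlier runs
    attempts = 0
    while saved < want and attempts < max_attempts:
        attempts += 1
        img = generate_one()
        if accept(img):
            # Save immediately so an interruption loses at most the current image.
            (out_dir / f"{saved:04d}.png").write_bytes(img)
            saved += 1
    return saved
```

If a run stops at 250 of 300, calling the function again with the same directory generates only the missing 50.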
Watch the migration and deprecation schedule too
Image models turn over fast. In fact, some image-generation models already have retirement dates announced. Hard-code a single model name because it is cheapest today, and you touch the whole codebase every time one is retired. Keep the model name as a config value in one place so switching is a one-line change, and the next revision or new model arrives without a scramble.
In my case, I keep the model name and a placeholder per-call cost in a small config file, and at the start of each month I update only that value and the measured accept rate. That alone lets me see at once how the monthly cost changes when I change the image count.
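A config like that could be as small as the sketch below. The field names, the JSON layout, and the numbers are illustrative assumptions rather than my actual file; the per-call cost of 1.0 is a placeholder, not a real price.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageGenConfig:
    model: str            # the only place the model name lives
    per_call_cost: float  # placeholder; update when fees are revised
    regen_rate: float     # measured
    accept_rate: float    # measured

    def monthly_cost(self, accepted_images: int) -> float:
        """Cost of producing this many accepted images in a month."""
        per_accepted = self.per_call_cost * (1 + self.regen_rate) / self.accept_rate
        return per_accepted * accepted_images


def load_config(text: str) -> ImageGenConfig:
    """Parse the config from JSON text (for example, the contents of a config file)."""
    return ImageGenConfig(**json.loads(text))


EXAMPLE = """{"model": "nano-banana-2-lite", "per_call_cost": 1.0,
              "regen_rate": 0.08, "accept_rate": 0.70}"""

cfg = load_config(EXAMPLE)
for count in (100, 300, 1000):
    print(count, round(cfg.monthly_cost(count), 1))
```

Because the rest of the code reads cfg.model instead of a literal string, retiring a model means editing one value.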
Where to put the dividing line
Nano Banana 2 Lite takes the lead in the stage that produces many candidates fast and cheap. Decide the split up front: the hero image and anything intolerant of breakage go to the quality model, and the rough drafts that come before them go to Lite. That way you sacrifice neither cost nor quality.
For a next step, pick one generation flow that you currently run entirely on a high-quality model and whose final accept rate you have never tracked. Swap just its candidate stage to Lite, and measure the regeneration and accept rates over 50 images. With those two numbers in hand, you can decide how far to lean on Lite by measurement, not by feel.
I hope it helps your build. Thanks for reading to the end.