2026-06-04

Pre-Screening Wallpaper App Submissions with Gemini Vision: A Two-Week Field Memo

Before submitting a new batch of wallpapers, I spent two weeks running Gemini's image understanding as a first-pass filter for store review risk. What it caught, what it missed, and where a human still has to decide.


One morning a "possible policy violation" email arrived from Google Play, and the rest of my day quietly fell apart. The cause was a single image buried in a wallpaper batch that contained a small, trademark-like logo in one corner. My own eyes had walked right past it.

I have been building wallpaper apps as an indie developer since 2014, and the catalog has grown past 50 million cumulative downloads. Yet my pre-submission check had always been manual. Reviewing dozens of images one by one wears down your attention exactly when you can least afford to lose it. So I spent two weeks testing a simple question: if I insert Gemini's image understanding as a first-pass filter before submission, how much of that burden actually goes away? Here is what I measured.

Why manual review stopped being enough

A wallpaper update often ships several dozen images at once, sometimes more than a hundred. The time you can spend per image is tiny, and the things that trip review are never the same twice: copyrighted material, excessive exposure, violent motifs, misleading text. Several criteria have to be held in mind at once, and human attention reliably drops during repetitive work.

My grandfather, a temple carpenter, reportedly inspected every piece of timber before joining it. The act of checking with his own hands was, I was told, a kind of devotion. I do not want to treat checking as a chore either, but relying on my eyes alone meant missing the one thing that mattered at the worst moment. That is exactly why I wanted a first-pass filter to back up human attention.

What I asked Gemini to look at

I built a pipeline that hands every image in the submission folder to Gemini and returns a structured "risk: high / review / none" judgment for each predefined criterion. I chose a fast, low-cost Flash model. Because the per-image cost is small, running it across a hundred images is painless.

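import os
from enum import Enum

import google.generativeai as genai
from pydantic import BaseModel

# Assumption: the API key is supplied through the GOOGLE_API_KEY environment variable.
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

class Risk(str, Enum):
    """Three-level rating returned for each criterion."""
    none = "none"
    review = "review"
    high = "high"

class Verdict(BaseModel):
    """Structured result for one image."""
    trademark_or_logo: Risk
    explicit_content: Risk
    violence: Risk
    misleading_text: Risk
    note: str  # brief reason behind the ratings

model = genai.GenerativeModel("gemini-2.5-flash")

def screen(image_path: str) -> Verdict:
    # Upload the image so it can be referenced in the request.
    img = genai.upload_file(image_path)
    prompt = (
        "Inspect this image from the perspective of a mobile app store reviewer. "
        "Rate four criteria (trademark/logo presence, exposure, violent motifs, "
        "and misleading text) as none / review / high, with a brief reason in note."
    )
    # Ask for JSON output constrained to the Verdict schema.
    res = model.generate_content(
        [prompt, img],
        generation_config={"response_mime_type": "application/json",
                           "response_schema": Verdict},
    )
    # Validate the returned JSON text before it is used downstream.
    return Verdict.model_validate_json(res.text)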
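
To cover "every image in the submission folder", the screen function above needs a small driver around it. The following is a minimal sketch, not the pipeline I actually run: the names iter_images, screen_folder and IMAGE_SUFFIXES are hypothetical, and the list of file extensions is an assumption. The screening function is passed in as a parameter so the loop can be tried without calling the API.

```python
from pathlib import Path
from typing import Callable, Iterable

# Assumed set of wallpaper file types; adjust to what the app actually ships.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}

def iter_images(folder: str) -> Iterable[Path]:
    """Yield image files in the folder, sorted so repeated runs are comparable."""
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            yield path

def screen_folder(folder: str, screen_fn: Callable[[str], object]) -> dict:
    """Run screen_fn on every image and map file name to its result.

    A failure on one image is stored as the exception object instead of
    aborting the whole batch, so it can be retried or checked by hand.
    """
    results = {}
    for path in iter_images(folder):
        try:
            results[path.name] = screen_fn(str(path))
        except Exception as exc:
            results[path.name] = exc
    return results
```

Calling screen_folder("submission/", screen) would then return one entry per image, where "submission/" stands in for wherever the batch lives.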

I fixed the output to a JSON schema so I could aggregate it directly downstream. The rule became three-tiered: any single high sends the batch to my own eyes, review gets judged after reading the reason, and none-only images pass through.
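
The three-tier rule can be written down in a few lines. This is a sketch under assumptions: Tier, triage and batch_tier are hypothetical names, and each image's ratings are taken as a plain dict of criterion name to "none", "review" or "high", with the note field left out.

```python
from enum import IntEnum

class Tier(IntEnum):
    PASS = 0         # every criterion is "none"
    READ_REASON = 1  # at least one "review", no "high"
    MANUAL = 2       # at least one "high"

def triage(ratings: dict) -> Tier:
    """Map one image's per-criterion ratings to a tier."""
    values = list(ratings.values())
    if "high" in values:
        return Tier.MANUAL
    if "review" in values:
        return Tier.READ_REASON
    return Tier.PASS

def batch_tier(batch: list) -> Tier:
    """A batch takes the most severe tier of any image in it."""
    return max((triage(r) for r in batch), default=Tier.PASS)
```

With the Verdict model, the ratings dict could come from verdict.model_dump(mode="json", exclude={"note"}). An empty batch is treated as PASS here, which is a design choice rather than something the rule itself dictates.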

Where it pushed back

The first few days surprised me with false positives. The "misleading text" criterion was especially sensitive, repeatedly mistaking decorative English lettering on a wallpaper for a brand name. Once I made the criterion concrete — "mark high only when it is identifiable as a real existing brand" — the noise visibly dropped. Changing behavior by adding a single line to the prompt is part of the quiet joy of working with an API.
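
One way to keep such one-line clarifications manageable is to store them per criterion and assemble the prompt from them. The sketch below is an assumption about layout, not the exact prompt I used: BASE_PROMPT, CRITERION_RULES and build_prompt are hypothetical names, and only the rule text for misleading_text is quoted from above.

```python
BASE_PROMPT = (
    "Inspect this image from the perspective of a mobile app store reviewer. "
    "Rate each criterion as none / review / high, with a brief reason in note."
)

# Per-criterion clarifications. Criteria without an entry get no extra wording.
CRITERION_RULES = {
    "misleading_text": "mark high only when it is identifiable as a real existing brand",
}

def build_prompt(criteria, rules=CRITERION_RULES) -> str:
    """Assemble the prompt: base instruction plus one line per criterion."""
    lines = [BASE_PROMPT, "Criteria:"]
    for name in criteria:
        rule = rules.get(name)
        lines.append(f"- {name}: {rule}" if rule else f"- {name}")
    return "\n".join(lines)
```

Tightening another criterion later then means adding one dictionary entry, and the change shows up cleanly in version control.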

What it struggled with was the impression created by a composition as a whole. There was one image where every element was fine individually, yet the combination could be misread by a reviewer. Gemini returned none there, and in the end it was my own unease that led me to swap it out. It is good at decomposing elements, but reading the overall mood is still a human's job.

Effect and limits after two weeks

The volume sent to manual review felt like it dropped by more than half. Because I can pass none-only batches with confidence, I redirect my concentration toward the review and high verdicts. As a test I re-ran an old batch, and it re-caught the logo-in-the-corner image I had missed back then, flagging it high.
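
To turn that impression into a number, the share of images that still need human attention can be computed from the stored ratings. This is a minimal sketch; SEVERITY, worst_rating and attention_share are hypothetical names, and each image is again assumed to be a plain dict of criterion name to rating string.

```python
from collections import Counter

SEVERITY = {"none": 0, "review": 1, "high": 2}

def worst_rating(ratings: dict) -> str:
    """Return the most severe rating among one image's criteria."""
    return max(ratings.values(), key=SEVERITY.__getitem__)

def attention_share(batch: list) -> float:
    """Fraction of images whose worst rating is review or high."""
    if not batch:
        return 0.0
    counts = Counter(worst_rating(r) for r in batch)
    return (counts["review"] + counts["high"]) / len(batch)
```

Tracking this value per batch would show whether a prompt change really reduces the manual workload or only feels like it does.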

Still, I would not call this "automating review." Gemini catches only what can be decomposed into stated criteria. Store policies update in their details, and judgments involving cultural nuance still rest with me. These apps carry my revenue base, including AdMob, so I have decided not to let go of the final call. The division of labor that fit best was this: Gemini is not a replacement decision-maker but an assistant that steers my attention to where it is needed.

What I am trying next

Next I am building a loop that accumulates the reason text behind each review verdict and compares it against my own swap decisions. Once I can see which criteria Gemini and I disagree on, I should be able to tune the prompt closer to my own app's standards. For fellow indie developers shipping large volumes of imagery to the stores, I would say image understanding as a first-pass filter has already reached a genuinely practical stage.
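
The comparison loop is still being built, so what follows is only a sketch of the shape I have in mind. Decision and compare are hypothetical names, and the record layout is an assumption: one entry per image holding Gemini's ratings, its reason text, and whether I ended up swapping the image.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Decision:
    image: str
    ratings: dict  # criterion name -> "none" | "review" | "high"
    note: str      # Gemini's reason text
    swapped: bool  # True if I replaced the image before submitting

def compare(decisions):
    """Find where Gemini and I disagreed.

    Returns a Counter of criteria that Gemini flagged on images I kept,
    and a list of images Gemini passed entirely but I swapped anyway.
    """
    flagged_but_kept = Counter()
    passed_but_swapped = []
    for d in decisions:
        flagged = [c for c, r in d.ratings.items() if r != "none"]
        if flagged and not d.swapped:
            flagged_but_kept.update(flagged)
        elif not flagged and d.swapped:
            passed_but_swapped.append(d.image)
    return flagged_but_kept, passed_but_swapped
```

A criterion that keeps showing up in the first result is a candidate for tighter prompt wording, while the second list collects the whole-composition cases that no single criterion explains.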

Thank you for reading to the end.
