GEMINI LAB
API / SDK · 2026-06-12 · Advanced

Retiring the Midnight Polling Loop — Rebuilding My Gemini Batch Monitoring Around Webhooks

A working log of migrating Gemini Batch API completion monitoring from 60-second polling to event-driven webhooks: static vs dynamic, signature verification, and real numbers.

Tags: Gemini API, Batch API, Webhook, Event-driven, Operations, Python

Around 4 a.m. I was scrolling through server logs and stopped cold.

My nightly Gemini Batch API job had already finished, but ingestion of the results didn't start until 58 seconds later. The reason was mundane: my completion check polled only once every 60 seconds.

Most of the log was a record of "not done yet" responses. Counting one night's worth, the status-check GETs alone exceeded a thousand. Nine tenths of the traffic was doing no work at all.

When the Gemini API shipped Webhooks in May 2026, I took it as the cue to rebuild this monitoring layer. This is the working log.

Measuring what polling actually cost

Before rebuilding anything, I wanted the current state in numbers. As an indie developer I run everything myself, and this nightly pipeline generates App Store and Google Play descriptions plus localized in-app text for my apps in bulk through the Batch API — three jobs per night.

  • Polling interval: 60 seconds
  • Average job duration: about 2 hours (Batch API is best-effort, so this swings widely night to night)
  • GETs per night: roughly 120 × 3 jobs, plus retries — about 1,080 calls
  • Detection lag after completion: 30 seconds on average, 60 seconds worst case
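
The figures above can be reproduced with a few lines of arithmetic. This is a minimal sketch using only the numbers from the list; the function names are illustrative, and the lag model assumes a job is equally likely to finish at any point within a polling interval. It covers scheduled polls only; retries, which the list counts toward the nightly total, are not modeled.

```python
# Inputs taken from the list above; nothing here is measured independently.
POLL_INTERVAL_S = 60
AVG_JOB_DURATION_S = 2 * 60 * 60
JOBS_PER_NIGHT = 3


def polls_per_job(duration_s: int, interval_s: int) -> int:
    """Status checks issued while a single job is still running."""
    return duration_s // interval_s


def detection_lag(interval_s: int) -> tuple:
    """(average, worst-case) seconds between completion and the next poll.

    Assumes the completion moment is uniformly distributed in the interval.
    """
    return interval_s / 2, interval_s


base_polls = polls_per_job(AVG_JOB_DURATION_S, POLL_INTERVAL_S) * JOBS_PER_NIGHT
avg_lag, worst_lag = detection_lag(POLL_INTERVAL_S)
print(f"scheduled polls per night (before retries): {base_polls}")
print(f"detection lag: avg {avg_lag:.0f}s, worst {worst_lag}s")
```

Halving the interval would halve the lag but double the scheduled polls, which is the trade-off an event-driven design removes.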

The GETs themselves cost next to nothing. The real cost is owning one more always-running component: a cron entry and a polling script. I have been burned before — an unhandled exception once killed the watcher silently while the jobs themselves succeeded. Results sat there, uningested, all morning. That hollow feeling stays with you.

Static or dynamic — deciding where events land

Gemini API webhooks come in two flavors, and getting this decision wrong means a rebuild later, so it deserves care.

Static webhooks are project-level. Register an endpoint once with webhooks.create and every subscribed event in the project (batch.succeeded, batch.failed, and so on) arrives there. Signatures use a symmetric signing secret (HMAC).
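
Symmetric verification for the static flavor can be sketched with the standard library alone. This sketch assumes the endpoint follows the Standard Webhooks convention: `webhook-id`, `webhook-timestamp`, and `webhook-signature` headers, a base64 secret with an optional `whsec_` prefix, and a `v1` HMAC-SHA256 signature over `id.timestamp.body`. Whether Gemini's deliveries match this layout exactly is an assumption, the function names are illustrative, and the 5-minute tolerance is a conventional replay window rather than a documented requirement.

```python
import base64
import hashlib
import hmac
import time

TOLERANCE_S = 5 * 60  # assumed replay window


def _decode_secret(secret: str) -> bytes:
    # Standard Webhooks secrets are base64, usually prefixed with "whsec_".
    return base64.b64decode(secret.removeprefix("whsec_"))


def sign(secret: str, msg_id: str, timestamp: int, body: bytes) -> str:
    """Compute the v1 signature over "<id>.<timestamp>.<raw body>"."""
    signed = f"{msg_id}.{timestamp}.".encode() + body
    digest = hmac.new(_decode_secret(secret), signed, hashlib.sha256).digest()
    return "v1," + base64.b64encode(digest).decode()


def verify(secret: str, headers: dict, body: bytes, now=None) -> bool:
    """Return True only for a fresh, correctly signed delivery.

    `headers` is expected to use lower-case keys; `body` must be the raw
    request bytes, not re-serialized JSON.
    """
    try:
        msg_id = headers["webhook-id"]
        timestamp = int(headers["webhook-timestamp"])
        candidates = headers["webhook-signature"].split()
    except (KeyError, ValueError):
        return False
    now = time.time() if now is None else now
    if abs(now - timestamp) > TOLERANCE_S:
        return False  # too old or too far in the future
    expected = sign(secret, msg_id, timestamp, body).encode()
    # The header may carry several space-separated signatures (key rotation).
    return any(hmac.compare_digest(expected, c.encode()) for c in candidates)
```

In production a maintained library is the safer choice; the sketch is only meant to show what the signature actually covers and why the raw body has to be kept intact.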

Dynamic webhooks are per-job. Pass a webhook_config when calling batches.create and only that job's notifications go to the given URI. Signatures are asymmetric via JWKS, and you can attach routing hints in user_metadata.

My setup settled into two rules.

  1. Recurring nightly batches → static. The endpoint is fixed and feeds shared post-processing — database updates, a Slack ping — common to every job
  2. Ad-hoc and experimental jobs → dynamic. I tag them with user_metadata like {"job_group": "experiment"} and point them at a separate endpoint so they never leak into production post-processing
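
The two rules are simple enough to encode as a single decision function. The sketch below is illustrative only: `Job`, `DeliveryPlan`, `plan_delivery`, and the example URI are hypothetical names, and nothing here calls the Gemini SDK. It just decides whether a job should rely on the project-level static endpoint or carry its own per-job webhook settings.

```python
from dataclasses import dataclass, field
from typing import Optional

STATIC = "static"
DYNAMIC = "dynamic"

# Hypothetical endpoint reserved for ad-hoc and experimental jobs.
EXPERIMENT_URI = "https://example.com/hooks/experiments"


@dataclass
class Job:
    name: str
    recurring: bool
    user_metadata: dict = field(default_factory=dict)


@dataclass
class DeliveryPlan:
    mode: str
    webhook_uri: Optional[str]  # None means "use the static endpoint"
    user_metadata: dict


def plan_delivery(job: Job) -> DeliveryPlan:
    """Rule 1: recurring jobs go static. Rule 2: everything else goes dynamic."""
    if job.recurring:
        return DeliveryPlan(STATIC, None, {})
    # Tag one-off jobs so the receiver can keep them out of production handling.
    metadata = {"job_group": "experiment", **job.user_metadata}
    return DeliveryPlan(DYNAMIC, EXPERIMENT_URI, metadata)


if __name__ == "__main__":
    print(plan_delivery(Job("nightly-store-descriptions", recurring=True)))
    print(plan_delivery(Job("prompt-variant-test", recurring=False)))
```

Keeping the decision in one place means the static receiver never needs to know that experimental jobs exist.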

Resisting the inverse matters. If you keep widening the static subscription to absorb one-off jobs, the receiver's branching logic grows without bound.

What you'll learn

  • How I took roughly 1,080 status-check GETs per night down to zero, and why I still kept a thin fallback poll as insurance
  • A concrete rule for splitting jobs between static and dynamic webhooks that survived three weeks of production use
  • A Flask receiver you can run as-is, covering standardwebhooks signature verification, the 5-minute replay window, and webhook-id deduplication
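
Deduplication on webhook-id can be sketched as a small in-memory record of recently seen ids. This is an assumption-laden illustration, not the receiver from the full article: the class name `SeenIds` is hypothetical, the one-hour default TTL is arbitrary and should cover the sender's retry schedule, and a real deployment would likely need a persistent store so that a restart does not forget earlier deliveries.

```python
import time
from typing import Optional


class SeenIds:
    """Remembers webhook-id values for a limited time to drop redeliveries."""

    def __init__(self, ttl_s: float = 60 * 60):
        self.ttl_s = ttl_s
        self._seen = {}  # webhook-id -> time first seen

    def first_time(self, webhook_id: str, now: Optional[float] = None) -> bool:
        """Return True the first time an id is seen within the TTL."""
        now = time.monotonic() if now is None else now
        # Forget ids older than the TTL so the table cannot grow forever.
        self._seen = {k: t for k, t in self._seen.items() if now - t < self.ttl_s}
        if webhook_id in self._seen:
            return False
        self._seen[webhook_id] = now
        return True


if __name__ == "__main__":
    seen = SeenIds()
    for delivery in ["msg_1", "msg_2", "msg_1"]:
        action = "process" if seen.first_time(delivery) else "skip duplicate"
        print(delivery, "->", action)
```

The check belongs after signature verification, so that unauthenticated requests cannot fill the table or mark a legitimate id as already seen.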