ARTICLES

All Articles

Before You Let a Managed Agent Ship: Designing Your Own Acceptance Gate

Let the public-preview Managed Agents generate files unchecked, and broken artifacts will flow straight into production. Here is how to build a verification gate that artifacts must pass before you accept them, with runnable Python and a rejection-feedback loop.

◈ API & SDK / 2026-06-16 / Advanced

Wiring Gemini Managed Agents Into Your Automation: Keeping Conversation State and Environment State Apart

Managed Agents spin up a Linux sandbox, run an agent loop, and return a result in a single API call. The first thing that trips you up when moving off a hand-rolled loop is that conversation state and file state are two separate things. Here's that design, worked through live.

◈ API & SDK / 2026-06-16 / Advanced

Don't Break When the Default Model Moves: A Startup Capability-Probing Layer for Gemini

Pinning a model name breaks on deprecation; trusting the default breaks when the weights swap silently. This is the design I settled on: probe what the served model can actually do at startup, then build every request from that answer. Includes runnable Python.

◈ API & SDK / 2026-06-16 / Advanced

Your Gemini Live API session forgets the conversation every time it reconnects — field notes on token refresh and session resumption

Why a Gemini Live API WebSocket drops the conversation and the user's in-flight speech on every reconnect, and how to close the gap with single-use ephemeral tokens, session resumption handles, and the goAway warning.

◈ API & SDK / 2026-06-15 / Intermediate

Put Help Docs and Screenshots in One File Search Store and Return Answers That Cite the Image Too

Your text help docs and your screenshots live in separate stores, so a single question can never return both the steps and the matching screen. With gemini-embedding-2 going multimodal in File Search, here is how I merged them and returned the cited screenshot alongside the answer.

◈ API & SDK / 2026-06-15 / Advanced

When the Default Model Silently Upgrades: Catching Prompt Regressions in Numbers

Gemini 3.5 Flash is now the default and you can no longer turn it off. Assuming your responses can shift without you touching the prompt, here is how to bundle prompt, model, and sampling into one variant and catch regressions with canaries and an LLM judge — in working code.

◈ API & SDK / 2026-06-15 / Advanced

Defending Against Prompt Injection When You Pass External Text to the Gemini API

User reviews, scraped articles, and other untrusted text are the entry point for indirect prompt injection when you feed them to the Gemini API. Here is a prioritized, code-backed defense you can drop into a production pipeline: trust-boundary isolation, schema constraints, a two-stage screening pass, and output sanitization.

◈ API & SDK / 2026-06-15 / Advanced

Permission-Aware RAG — Designing Gemini Search That Only Cites What the User Is Allowed to See

The day you add RAG to internal search, drafts and finance memos nobody should see start leaking into answers. This is a production design — metadata filtering, defense in depth, and audit logging — for letting Gemini search while respecting permissions, with working code.

◈ API & SDK / 2026-06-14 / Advanced

How a Deep Think Verification Step Tripled My API Bill, and How thinking_level Got It Back

After wiring API-accessible Gemini 3 Deep Think into my output-verification step, my projected monthly cost jumped roughly 3x. Here is the implementation record of capping it with thinking_level and a cost guardrail, then settling on a two-stage design with Flash.

◈ API & SDK / 2026-06-14 / Intermediate

Generate With Flash, Escalate to Deep Think Only When Unsure: A Two-Stage Pipeline

With Deep Think opening up on the API, the move is not to route every request through the heavy model but to have Deep Think verify only when Flash's output looks shaky. Here is the cost reasoning and working code.

◈ API & SDK / 2026-06-14 / Advanced

Keeping Gemini API's Default-Model Shift From Becoming an Incident — Pinning Model IDs and Detecting Silent Upgrades in Production

When the default model quietly moves up, your output length, reasoning behavior, and cost change with zero code edits. This guide shows how to pin model IDs in a single source of truth and verify the effective model from the response to detect default changes.

◈ API & SDK / 2026-06-14 / Advanced

Controlling Image Tokens with the Gemini API media_resolution Setting — Tuning Batch Image Classification by Measurement

media_resolution, introduced in the Gemini 3 line, switches how many tokens an image input consumes across three levels. Through real batch-classification measurements, this guide shows how to balance cost and accuracy by assigning the right tier per task.
