◉GEMINI LAB JP

●MODEL — Gemini 3.5 Flash is now generally available, beating 3.1 Pro on nearly all benchmarks while running 4x faster●AGENTS — Managed Agents arrive in the Gemini API in public preview, running autonomous agents in isolated Google-hosted Linux sandboxes●SEARCH — File Search adds multimodal search, natively embedding and searching images via gemini-embedding-2●API — Event-driven webhooks now replace polling for the Batch API and long-running operations●STUDIO — Google AI Studio builds Android apps from plain language and generates images on the fly with Nano Banana●MIGRATION — Gemini CLI reaches end-of-life on June 18; migrate to the Agentic 2.0 CLI (two image-preview models retire June 25)

TAG

SLO

2 articles

Related: Gemini API¹, Tail Latency¹, p95¹, Streaming¹, gemini-api¹, error-budget¹, production¹, cloudflare-workers¹, sre¹

◈ Gemini API / 2026-06-23 / Advanced

Your Gemini API Average Latency Looks Great — But Some Users Still Get Stuck. Defending p95/p99

Your average TTFT is fast, yet a fraction of users keep hitting frozen responses. That is a tail-latency problem (p95/p99). From measurement to model routing, streaming budgets, cache accounting, and retry design — here are the defenses that actually held up in production, with code.

◈ Gemini API / 2026-05-28 / Advanced

Running an SLO and Error Budget for the Gemini API as an Indie Developer — Guarding Four Sites with Burn-Rate Monitoring

Notes from running the Gemini API inside four production sites as an indie developer. A practical SLO and Error Budget design that fits a single-person operation: Cloudflare Workers and KV for burn-rate calculation, simplified multi-window alerts, and decision rules for what to freeze when the budget runs out.