●FLASH GA — Gemini 3.5 Flash is now generally available, billed as the most intelligent model for sustained frontier performance on agentic and coding tasks●TOGGLE — From Jun 16 the Gemini 3.5 Flash feature toggle is removed in the Global, US, and EU multi-regions, so check any configs that depend on it●AGENTS — Managed Agents launched in public preview, letting developers build and deploy autonomous, stateful agents inside Google-hosted isolated Linux sandboxes●IMAGE — The image preview models gemini-3.1-flash-image-preview and gemini-3-pro-image-preview shut down Jun 25; migrate to their successors●SEARCH — File Search now supports multimodal search, natively embedding and searching images via the gemini-embedding-2 model●CLI — Gemini CLI and Code Assist end individual access on Jun 18; free users and AI Pro/Ultra subscribers are directed to the Antigravity CLI●FLASH GA — Gemini 3.5 Flash is now generally available, billed as the most intelligent model for sustained frontier performance on agentic and coding tasks●TOGGLE — From Jun 16 the Gemini 3.5 Flash feature toggle is removed in the Global, US, and EU multi-regions, so check any configs that depend on it●AGENTS — Managed Agents launched in public preview, letting developers build and deploy autonomous, stateful agents inside Google-hosted isolated Linux sandboxes●IMAGE — The image preview models gemini-3.1-flash-image-preview and gemini-3-pro-image-preview shut down Jun 25; migrate to their successors●SEARCH — File Search now supports multimodal search, natively embedding and searching images via the gemini-embedding-2 model●CLI — Gemini CLI and Code Assist end individual access on Jun 18; free users and AI Pro/Ultra subscribers are directed to the Antigravity CLI

TAG

max-tokens

1 articles

← Back to all tags

gemini-api¹ finish-reason¹ troubleshooting¹ python¹ typescript¹

◈ Gemini API/2026-06-14Intermediate

When Gemini API Cuts Your Response Off Mid-Sentence — Detecting finish_reason: MAX_TOKENS and Stitching the Continuation

Long-form generation that ends mid-sentence is usually finish_reason: MAX_TOKENS. This failure arrives as a quiet HTTP 200, no exception. Here is how to detect it, stitch a continuation to recover the full text, and avoid the thinking-token trap that makes it worse on 3.x models.