GEMINI LAB
CHROME: Gemini in Chrome lands on Android in late June with Nano Banana and auto browse, rolling out first to 4GB+ RAM devices set to en-US
OMNI-FLASH: Gemini Omni Flash rolls out to all AI Plus, Pro, and Ultra subscribers, and is free for adults in YouTube Shorts Remix and YouTube Create
DEADLINE: 12 days until the image preview models shut down on Jun 25; migrate gemini-3.1-flash and 3-pro image-preview workloads to GA versions now
SCHEMA: The legacy Interactions API schema was removed on Jun 8; double-check your migration to the steps array and the new response_format
FLASH-GA: Gemini 3.5 Flash is generally available via Antigravity, the Gemini API, AI Studio, and Android Studio
SUITE: Deep Think, Deep Research, Gemini Live, and Gemini Omni now form one flow: reason, research, talk, and create
TAG

gemini-embedding-2

1 article
Related: gemini-3-5-flash (1), rag (1), cost-optimization (1), caching (1)
Gemini API / 2026-06-13 / Advanced

Rebuilding a Three-Layer RAG Cache After Migrating to Gemini 3.5 Flash

When Gemini 2.0 Flash was retired, I rebuilt my RAG caching stack around 3.5 Flash. Here are the working implementations for response, semantic, and embedding caches, measured hit rates from production, and how self-managed caching divides the work with the API's Context Caching.