ARTICLES

All Articles

SwiftData × Gemini API Offline Response Cache — Persisting and Reusing AI Responses on iOS

Design a SwiftData-backed cache layer for Gemini API responses so your iOS app keeps working in airplane mode and on flaky networks. Covers the @Model schema, invalidation strategy, store-size discipline, and migration, all drawn from production iOS experience.

◈ API & SDK / 2026-05-24 / Advanced

Taming Gemini API Tail Latency with Request Hedging: A p99 Design Notebook

A four-month operational journal of taming Gemini API tail latency with hedged requests across a production indie app portfolio. Includes measured p50/p95/p99 numbers, a working Swift and TypeScript implementation, and the cost-control parameters that kept monthly billing growth under 18%.

◈ API & SDK / 2026-05-24 / Intermediate

Why Your Gemini File API Uploads Vanish After 48 Hours — and How to Code Around It

Gemini File API resources are auto-deleted 48 hours after upload. Here is how to recognize the failure, why it happens, and concrete patterns for re-uploading, falling back to inline data, and managing expiration safely.

◈ API & SDK / 2026-05-23 / Advanced

Gemini API × Sentry: A Production Pipeline for LLM Error Tracking and Prompt Failure Observability

Pair Sentry's error tracking with Gemini-specific failure modes so you can catch safety filter blocks, recitation rejections, empty completions, and quiet latency drift in production.

◈ API & SDK / 2026-05-23 / Intermediate

When Gemini API Streaming Cuts Off Mid-Response in Production: The Diagnosis Order I Run

How I diagnose mid-response cutoffs in Gemini API streaming: the order in which I check network, SDK, and server-side suspects, with real cases from indie production.

◈ API & SDK / 2026-05-23 / Intermediate

Designing Around the Gemini 2.0 Flash Deprecation Without Letting It Disrupt Indie Development: My May 2026 Risk-Distribution Notes

How I rebuilt my indie-development jobs to absorb the upcoming Gemini 2.0 Flash deprecation: provider abstraction, cost numbers, and a rehearsal day, all captured from my May 2026 review.

◈ API & SDK / 2026-05-23 / Intermediate

Why Your Gemini API Structured Output Keeps Failing Validation — and How to Stabilize It

A field guide to the three layers where Gemini API structured output breaks — server-side schema rejection, silent empty responses, and client-side parsing — with practical fixes from an indie developer's production AdMob reporting pipeline.

◎ Updates / 2026-05-23 / Beginner

Google AI Pro Now Includes YouTube Premium Lite: A Pricing Read From a Solo Developer

Google has bundled YouTube Premium Lite (¥780/month) into Google AI Pro (from ¥2,900/month). Here is how the new structure reads against my actual usage as a solo developer with 50M+ downloads, including who actually benefits.

⟐ Dev Tools / 2026-05-23 / Intermediate

LM Studio 'Failed to Load Model' for Gemma 4 MLX — A 4-Bucket Diagnostic for Apple Silicon

When LM Studio refuses to load mlx-community/gemma-4-26b-a4b-it-4bit with a red 'Failed to load model' dialog, the cause is almost always one of four buckets. Here's how to triage them on an Apple Silicon Mac in under thirty minutes.

◈ API & SDK / 2026-05-23 / Intermediate

Six Weeks of Running an App Store vs. Google Play Review Diff with Gemini

A six-week record of using the Gemini API to classify App Store and Google Play reviews in parallel and surface platform-specific priority items. Notes from running this on an indie wallpaper app with 50M+ cumulative downloads, including the three platform gaps that actually showed up and the monthly cost.

◈ API & SDK / 2026-05-23 / Advanced

Idempotency Key Design for the Gemini API: Patterns I Use to Prevent Duplicate Generation Across Six Sites

After five months of running six AI-driven sites in parallel, I built an idempotency layer in front of the Gemini API to neutralize retry storms. This deep dive shares the SHA-256 + Cloudflare Workers KV design, the operational numbers behind it, and the four gotchas that only surface in production.

◈ API & SDK / 2026-05-22 / Advanced

Why Gemini API Returns Empty Responses with finishReason: RECITATION, and the Prompt + Post-Processing Design That Stopped It

Run a Gemini content agent long enough and one day your logs fill with finishReason: 'RECITATION' and empty content arrays. This is the verbatim-quotation safety system firing. Here is the prompt-rewriting pattern and TypeScript post-processor I deployed across six auto-publishing pipelines at Dolice, which cut my incident rate by 90%.
