All Articles
Controlling Function Calls in Gemini API with tool_config — AUTO, ANY, and NONE in Practice
A practical guide to tool_config in Gemini API. Learn the difference between AUTO, ANY, and NONE, how to stop Gemini from calling functions when you don't want it to, and how to restrict the callable set with allowed_function_names.
Running Gemini API Keys Safely: A Practical Checklist for Indie Developers
API key leaks are a real-world threat for solo developers. This practical 5-point checklist covers the common mistakes — accidental Git commits, client-side exposure, missing spend caps — and how to close those gaps quickly.
Driving Down Gemini 2.0 Flash RAG Costs with a 3-Tier Cache Design
Flash is cheap, but a RAG app's cost still grows linearly with traffic. This tiered caching design (response, retrieval, and embedding layers) routinely cuts our bill by half. Here is the implementation.
Calibrating Gemini 2.5 Pro's Thinking Budget by Task, Not by Default
Gemini 2.5 Pro's thinking_budget isn't a dial where bigger always means smarter. Here is the measurement procedure I use to pick the right value per task — including the results that surprised me.
Scaling a Gemini API SaaS to $10K MRR: Acquisition, LTV, and Churn Defense
Taking a Gemini-powered SaaS from $1,000 MRR to $10,000 MRR is not a product problem but a customer problem. A practical 12-month playbook covering acquisition channels, pricing architecture, and churn defense.
Gemini × DSPy: Retire from Prompt Craftsmanship — A Complete Guide to Automated Prompt Optimization
A hands-on implementation guide for combining Stanford's DSPy framework with Gemini to end the era of hand-written prompts. Covers Signatures, Modules, Optimizers, LLM-as-a-Judge metrics, and production pipelines — all with working code.
Monetizing a Solo SaaS on Gemini 2.5 Pro: Pricing, Billing, and Usage-Control Roadmap
A hands-on roadmap for turning a Gemini 2.5 Pro-powered solo SaaS into a business with recurring monthly revenue, covering pricing design, Stripe integration, and token usage management.
Diagnosing Stuck or Failed Jobs in the Gemini Batch API
A field guide to the Gemini Batch API: how to diagnose jobs stuck in QUEUED or RUNNING, how to read FAILED error messages, and how to design fallbacks that survive the 24-hour SLA.
Google AI Studio's Quota Expansion — What AI Pro and Ultra Users Actually Gained
In April 2026, Google AI Studio relaxed its usage limits for AI Pro and Ultra subscribers. Here's what actually changed across Nano Banana Pro, Gemini Pro, Antigravity, Gemini CLI, and Jules.
Quietly Catching Wrong Answers in Your Gemini-Powered App — A Production Auto-Eval Loop
Running Gemini in production eventually shows you responses that are "kind of wrong." I want to catch them before users do. This is the exact auto-eval loop I run over live traffic, with the prompts I use and the mistakes I made along the way.
Async AI Job Queues with Gemini API and Cloud Tasks — Production Patterns for Timeouts, Retries, and Rate Limits
Migrate synchronous Cloud Run + Gemini calls to a Cloud Tasks async job queue. Covers retries, DLQ, idempotent workers, and cost modeling with working code.
Don't Let Your Gemini Prompts Silently Rot — A Practical Regression Testing Playbook with Pytest
Ever tweaked a prompt and watched production quality quietly degrade? This article walks through testing Gemini API prompts with Pytest, combining snapshot tests and LLM-as-Judge to catch regressions automatically — all from the perspective of an individual developer running things solo.