GEMINI LAB
Advanced · 2026-06-29

When Your Gemini Agent Has Three Tool Routes and Quietly Picks the Wrong One

Put Function Calling, Code Execution, and Grounding into one agent and the model will sometimes choose the wrong route, while the output still looks perfectly plausible. Here is how I instrument route selection and correct it with phase separation and verification gates, with working code.

gemini-api · function-calling · code-execution · grounding · agent · observability

One morning I was going through the logs of a report pipeline I run on a schedule. The outputs were as clean as ever, but the numbers felt slightly stale. There were no errors and no swallowed exceptions. Yet the value that should have come fresh from an external API had quietly been replaced by a plausible-looking number generated from the model's memory.

The cause was that I had given a single agent three routes: Function Calling, Code Execution, and Grounding. When several routes exist, the model decides which one to use. And when it chooses wrong, the output does not break. It comes back looking finished and reasonable. That is the hard part. The failure shows up not as an exception but as a normal-looking response.

This is a set of notes on the design I rebuilt to detect and correct that quiet misrouting. The official docs explain each tool on its own, but what breaks when you bundle all three only became visible once I measured it myself.

Why route selection fails silently

The three tools look similar but solve different problems. Function Calling reaches into external resources: databases, REST APIs, internal systems. Code Execution lets the model write Python and run it itself, which suits computation and aggregation. Grounding with Google Search fetches the "now" that the training data does not contain.

The trouble is that the boundaries are fuzzy in plain language. "Analyze the latest sales" does not uniquely map to one route: should it ground against news, call an internal API, or aggregate local data with code? The model picks one probabilistically and produces a coherent answer inside whatever route it chose. So the mistake is not a blank or an exception — it is a plausible wrong answer.
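One way to take that choice away from the model is to pin each classified intent to a single route in code. The sketch below is an illustration under assumptions, not the pipeline's actual table: the intent labels and the `resolve_route` helper are hypothetical names. The one deliberate choice is that an unknown intent raises instead of falling through to a guess.

```python
from enum import Enum


class Route(str, Enum):
    GROUNDING = "grounding"
    FUNCTION = "function"
    CODE = "code"


# Hypothetical intent labels; a real pipeline would define its own.
INTENT_TO_ROUTE = {
    "market_news": Route.GROUNDING,      # needs fresh public information
    "internal_metrics": Route.FUNCTION,  # lives behind an internal API
    "local_aggregation": Route.CODE,     # pure computation on supplied data
}


def resolve_route(intent: str) -> Route:
    """Return the route for a classified intent, failing loudly on unknowns."""
    try:
        return INTENT_TO_ROUTE[intent]
    except KeyError:
        raise ValueError(f"no route registered for intent {intent!r}") from None
```

With a table like this, "Analyze the latest sales" has to be classified into one intent before any tool is attached, so the ambiguity surfaces as a classification decision that can be logged rather than as a silent pick.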

On top of that, as of 2026, Grounding and custom Function Calling still cannot be combined in the same request. Pass both in tools without knowing this and one of them silently stops working, or the request fails with an error that is easy to misread. Many people stumble here once.
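A cheap guard against that restriction is a pre-flight check that refuses the mix before any request is sent. This is a minimal sketch, assuming the tool configuration is held as a list of plain dicts; the key names `google_search` and `function_declarations` and the helper `check_tool_mix` are assumptions for illustration and should be adapted to whatever objects your SDK version uses.

```python
def check_tool_mix(tools: list[dict]) -> None:
    """Raise if grounding and custom function declarations share one request.

    Assumes each tool is a plain dict; the key names are illustrative.
    """
    has_grounding = any("google_search" in tool for tool in tools)
    has_functions = any(tool.get("function_declarations") for tool in tools)
    if has_grounding and has_functions:
        raise ValueError(
            "grounding and custom function declarations are both present; "
            "split them into separate requests"
        )


# A grounding-only or function-only tool list passes silently.
check_tool_mix([{"google_search": {}}])
check_tool_mix([{"function_declarations": [{"name": "get_sales"}]}])
```

Failing loudly at configuration time turns a quiet degradation into an exception you can see in the logs.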

Measure first — push a structured trace through

Before correcting anything, make what is happening visible. The first thing I added was a single structured log line per request that records which route was chosen, why, and how many attempts it took.

import json
import logging
from dataclasses import dataclass, asdict, field

logger = logging.getLogger("agent.route")

@dataclass
class RouteTrace:
    request_id: str
    intent: str = ""           # classified intent
    route: str = ""            # grounding / function / code
    fallback_count: int = 0    # times the request fell back to another route
    grounded_sources: int = 0  # sources actually referenced
    verified: bool = False     # passed the verification gate
    latency_ms: int = 0        # request latency in milliseconds
    notes: list[str] = field(default_factory=list)  # free-form remarks

    def emit(self) -> None:
        # One request = one line, easy to aggregate later
        logger.info(json.dumps(asdict(self), ensure_ascii=False))

The key is always recording grounded_sources. If the model chose the grounding route but referenced zero sources, it likely answered from memory rather than from search results. That stale number at the start of this article? This field was sitting at 0 the whole time. Only after instrumenting did I learn that about 18% of requests chose grounding yet returned zero sources. You cannot fix what you cannot see.
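Because each request is one JSON line, the zero-source rate can be computed with a few lines of standard-library Python. The sketch below makes two assumptions: each input line contains only the JSON payload (no timestamp or level prefix from the log formatter), and the rate is taken over grounding-route requests rather than all requests. The function name `zero_source_rate` is hypothetical.

```python
import json
from collections import Counter


def zero_source_rate(lines) -> float:
    """Share of grounding-route requests that referenced no sources."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        trace = json.loads(line)
        if trace.get("route") != "grounding":
            continue
        counts["grounding"] += 1
        if trace.get("grounded_sources", 0) == 0:
            counts["zero_sources"] += 1
    if counts["grounding"] == 0:
        return 0.0  # no grounding requests, so nothing to flag
    return counts["zero_sources"] / counts["grounding"]


sample = [
    '{"request_id": "a", "route": "grounding", "grounded_sources": 3}',
    '{"request_id": "b", "route": "grounding", "grounded_sources": 0}',
    '{"request_id": "c", "route": "function", "grounded_sources": 0}',
]
print(zero_source_rate(sample))  # 0.5
```

Tracking this ratio over time, rather than inspecting individual responses, is what makes the memory-instead-of-search failure visible.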

WHAT YOU'LL LEARN
A structured trace that lets you reconstruct which route was chosen and how many times it fell back
Phase separation that works around the Grounding-and-Function-Calling restriction and makes routing explicit
Per-route verification gates and rerouting that stop plausible-but-wrong answers from shipping
