◈ API / SDK/2026-04-03Intermediate

to Gemini API Function Calling: Tool Integration and Practical Usage

A practical deep dive into using Gemini API's Function Calling to give AI real tools and external API access. From design patterns to production implementation, covered systematically.

Gemini API¹⁹³ Function Calling¹⁶ tool integration AI development⁶ automation⁵²

✦ Premium Article

What Is Function Calling?

Gemini API's Function Calling lets AI models invoke external functions and APIs during a conversation. This moves AI beyond text generation alone — enabling real-time data retrieval, computation, and integration with external services that affect the real world.

As of 2026, Gemini API Function Calling has matured considerably. Parallel tool invocation, forced tool-use mode, and well-structured tool definitions are all production-ready. This guide covers everything from first-time setup to advanced patterns.

How Function Calling Works

The End-to-End Flow

Function Calling operates in the following sequence:

The developer defines available tools (functions) in the API request
The user sends a message
The Gemini model decides which tool to call, and with what arguments
The model returns a tool_calls response with those instructions
The application executes the tool and passes results back to the model
The model generates a final response incorporating the tool output

The critical point: the Gemini model itself doesn't execute the tools. It only decides which tool to call and with what arguments. Actual execution happens in your application code. This keeps security and control in your hands.

Basic Tool Definition Structure

import google.generativeai as genai
 
# Define a tool
weather_function = {
    "name": "get_current_weather",
    "description": "Retrieves the current weather for a specified city. Temperature unit can be Celsius or Fahrenheit.",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "City name and country code (e.g., Tokyo, JP)"
            },
            "unit": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"],
                "description": "Temperature unit"
            }
        },
        "required": ["location"]
    }
}
 
# Pass the tool to the model
model = genai.GenerativeModel(
    model_name="gemini-2.0-flash",
    tools=[weather_function]
)

The quality of the description field is what matters most. The model uses it to decide when to invoke the tool. Vague descriptions lead to wrong tool selection. Be explicit: what does this tool do, when should it be used, and what does it take and return?

✦

Thank you for reading this far.

Continue Reading

What follows includes implementation code, benchmarks, and practical content we hope you'll find useful. This site runs without ads — server and development costs are supported entirely by members like you. If it's been helpful, we'd be truly grateful for your support.

WHAT YOU'LL LEARN

✦How Function Calling works and how to design effective tool definitions — ready to implement today

✦Controlling parallel and sequential tool calls to build complex workflows

✦Production-ready implementation with robust error handling and security best practices

Secure payment via Stripe · Cancel anytime

✦

Unlock This Article

Get full access to the rest of this article. Buy once, read anytime. This site is ad-free — your support goes directly toward keeping it running.

Unlock all articles with Membership →

Practical Tool Implementation

Basic Single-Tool Call

import google.generativeai as genai
import json
 
genai.configure(api_key="YOUR_GEMINI_API_KEY")
 
def get_current_weather(location: str, unit: str = "celsius") -> dict:
    """Calls a real weather API — using dummy data here"""
    return {
        "location": location,
        "temperature": 22 if unit == "celsius" else 71.6,
        "unit": unit,
        "condition": "Sunny",
        "humidity": 65
    }
 
def run_function_call_example():
    model = genai.GenerativeModel(
        model_name="gemini-2.0-flash",
        tools=[get_current_weather]  # Pass function directly — schema is auto-generated
    )
    
    chat = model.start_chat()
    response = chat.send_message("What's the weather in Tokyo right now?")
    
    # Process function calls until none remain
    while response.candidates[0].content.parts[0].function_call.name:
        fc = response.candidates[0].content.parts[0].function_call
        
        # Execute the tool
        if fc.name == "get_current_weather":
            result = get_current_weather(**fc.args)
        
        # Return results to the model
        response = chat.send_message(
            genai.protos.Content(
                parts=[genai.protos.Part(
                    function_response=genai.protos.FunctionResponse(
                        name=fc.name,
                        response={"result": json.dumps(result)}
                    )
                )]
            )
        )
    
    return response.text

Implementing Parallel Tool Calls

Gemini 2.0 and later support calling multiple tools in parallel, allowing multiple independent pieces of information to be fetched simultaneously:

def handle_parallel_function_calls(response, tools_map):
    """Generic handler for parallel tool invocations"""
    function_calls = [
        part.function_call 
        for part in response.candidates[0].content.parts
        if hasattr(part, 'function_call') and part.function_call.name
    ]
    
    if not function_calls:
        return None
    
    # Execute all tools (use asyncio or ThreadPool in production)
    results = []
    for fc in function_calls:
        if fc.name in tools_map:
            result = tools_map[fc.name](**fc.args)
            results.append(
                genai.protos.Part(
                    function_response=genai.protos.FunctionResponse(
                        name=fc.name,
                        response={"result": json.dumps(result)}
                    )
                )
            )
    
    return genai.protos.Content(parts=results)

Advanced Function Calling Design Patterns

Controlling Tool Selection Mode

Gemini API provides configuration options to control how tools are used:

from google.generativeai.types import ToolConfig, FunctionCallingConfig
 
# Mode 1: AUTO (default) — model decides on its own
auto_config = ToolConfig(
    function_calling_config=FunctionCallingConfig(mode="AUTO")
)
 
# Mode 2: ANY — must use at least one tool (forced)
forced_config = ToolConfig(
    function_calling_config=FunctionCallingConfig(
        mode="ANY",
        allowed_function_names=["get_current_weather", "search_database"]
    )
)
 
# Mode 3: NONE — no tool use (text response only)
no_tools_config = ToolConfig(
    function_calling_config=FunctionCallingConfig(mode="NONE")
)

The ANY mode is particularly useful when you always need structured output — for example, when parsing user input into a database schema regardless of how the user phrases their request.

Stateful Multi-Turn Conversations

Function Calling really shines across multi-turn conversations. Here's an example of a data analysis assistant:

class DataAnalysisAssistant:
    def __init__(self):
        self.tools = [
            self.query_database,
            self.calculate_statistics,
            self.create_visualization,
            self.export_report
        ]
        self.model = genai.GenerativeModel(
            model_name="gemini-2.0-pro",
            tools=self.tools,
            system_instruction="""
You are a data analysis expert.
Use tools as needed to answer the user's questions.
When multiple tools are required, use them in logical order.
            """
        )
        self.chat = self.model.start_chat()
    
    def analyze(self, user_query: str) -> str:
        response = self.chat.send_message(user_query)
        
        # Loop until no more tool calls
        while True:
            function_calls = self._extract_function_calls(response)
            if not function_calls:
                break
            
            results = self._execute_function_calls(function_calls)
            response = self.chat.send_message(results)
        
        return response.text

Error Handling and Security

Robust Error Handling

Production environments will encounter failures during tool execution. Proper error handling lets the AI understand what went wrong and try alternatives:

def safe_execute_tool(tool_name: str, args: dict, tools_map: dict) -> dict:
    """Tool execution wrapper with graceful error handling"""
    try:
        if tool_name not in tools_map:
            return {
                "error": f"Tool '{tool_name}' not found",
                "available_tools": list(tools_map.keys())
            }
        
        result = tools_map[tool_name](**args)
        return {"success": True, "result": result}
        
    except ValueError as e:
        return {
            "error": "Invalid argument value",
            "details": str(e),
            "suggestion": "Check argument types and acceptable ranges"
        }
    except TimeoutError:
        return {
            "error": "Tool execution timed out",
            "suggestion": "Try again later or consider a different approach"
        }
    except Exception as e:
        return {
            "error": "Unexpected error occurred",
            "details": str(e)
        }

Security Considerations

Security deserves careful attention when using Function Calling in production:

Input validation: Never pass AI-generated tool arguments directly to your systems without validation. Arguments that might contain file paths, SQL queries, or shell commands must always be validated and sanitized first.

Least privilege: Each tool should have only the permissions it needs. A read-only data retrieval tool should never have write access.

Rate limiting: Prevent AI from triggering excessive API calls by rate-limiting tool execution — especially for tools that call external services.

Audit logging: Log every tool invocation in production — what was called, when, and with what arguments. Critical for debugging and detecting anomalous patterns.

import logging
from functools import wraps
 
def audit_tool_call(func):
    """Decorator that logs every tool invocation"""
    @wraps(func)
    def wrapper(*args, **kwargs):
        logging.info(
            f"Tool called: {func.__name__}, "
            f"args: {args}, kwargs: {kwargs}"
        )
        try:
            result = func(*args, **kwargs)
            logging.info(f"Tool {func.__name__} succeeded")
            return result
        except Exception as e:
            logging.error(f"Tool {func.__name__} failed: {e}")
            raise
    return wrapper
 
@audit_tool_call
def sensitive_database_query(query: str) -> list:
    # Database query implementation
    pass

Real-World Use Cases

Use Case 1: Intelligent Customer Support

customer_support_tools = [
    {
        "name": "get_order_status",
        "description": "Retrieves order status, shipping info, and estimated delivery date for a given order number",
        "parameters": {
            "type": "object",
            "properties": {
                "order_id": {"type": "string", "description": "Order number (e.g., ORD-2026-12345)"}
            },
            "required": ["order_id"]
        }
    },
    {
        "name": "process_return_request",
        "description": "Processes a return or exchange request. Only use after getting explicit user confirmation.",
        "parameters": {
            "type": "object",
            "properties": {
                "order_id": {"type": "string"},
                "reason": {"type": "string"},
                "type": {"type": "string", "enum": ["return", "exchange"]}
            },
            "required": ["order_id", "reason", "type"]
        }
    }
]

Use Case 2: Code Review Assistant

code_review_tools = [
    "analyze_code_complexity",           # Measure code complexity
    "check_security_vulnerabilities",    # Scan for security issues
    "suggest_refactoring",               # Propose refactoring opportunities
    "run_tests",                         # Execute test suite
    "check_dependency_updates"           # Check for dependency updates
]

Orchestrating these tools creates a fully automated pipeline from code submission to comprehensive review report.

Performance Optimization

Caching Tool Definitions

When using the same toolset repeatedly, caching tool definitions saves costs. The Gemini API supports caching system prompts that include tool definitions — especially valuable when you have many tools.

Async Parallelization for Speed

When multiple tools can run independently, async parallel execution dramatically reduces latency:

import asyncio
 
async def execute_parallel_tools(function_calls: list, tools_map: dict) -> list:
    """Execute multiple tools in parallel using async"""
    async def execute_single(fc):
        tool = tools_map.get(fc.name)
        if not tool:
            return {"error": f"Unknown tool: {fc.name}"}
        
        if asyncio.iscoroutinefunction(tool):
            result = await tool(**fc.args)
        else:
            # Run synchronous functions in a thread pool
            loop = asyncio.get_event_loop()
            result = await loop.run_in_executor(None, lambda: tool(**fc.args))
        
        return result
    
    return await asyncio.gather(*[execute_single(fc) for fc in function_calls])

A Note from an Indie Developer

Closing Thoughts

Gemini API's Function Calling opens a window from AI into the real world. With well-designed tool definitions, robust error handling, and security-conscious implementation, you can build AI assistants that are genuinely useful in production.

Start with a simple single-tool implementation and work your way up to parallel calls and complex workflows as you get comfortable. Mastering Function Calling substantially expands what's possible with AI application development.

We're rooting for you to build something truly impactful.

Streaming Responses: The Basics

With standard API requests, the model generates the entire response before sending it back. Streaming delivers text incrementally as it's generated, dramatically improving the user experience.

Streaming in Python

import google.generativeai as genai
 
# Configure your API key
genai.configure(api_key="YOUR_API_KEY")
 
# Initialize the model
model = genai.GenerativeModel("gemini-2.5-pro")
 
# Receive streaming response
response = model.generate_content(
    "Explain how to build a web scraper in Python, step by step.",
    stream=True
)
 
# Output each chunk in real time
for chunk in response:
    if chunk.text:
        print(chunk.text, end="", flush=True)
 
print()  # Final newline
 
# Expected behavior:
# Text appears in real time (token by token)
# No waiting for the complete response to generate

Streaming in TypeScript/JavaScript

import { GoogleGenerativeAI } from "@google/generative-ai";
 
const genAI = new GoogleGenerativeAI("YOUR_API_KEY");
const model = genAI.getGenerativeModel({ model: "gemini-2.5-pro" });
 
async function streamResponse(prompt: string) {
  const result = await model.generateContentStream(prompt);
  
  // Process chunks as they arrive
  for await (const chunk of result.stream) {
    const text = chunk.text();
    process.stdout.write(text);
  }
  
  // You can also get the complete final response
  const finalResponse = await result.response;
  console.log("\n\nTotal tokens:", finalResponse.usageMetadata?.totalTokenCount);
}
 
// Usage
streamResponse("What are the best practices for designing REST APIs in TypeScript?");
 
// Expected output:
// (Text streams in real time)
// Total tokens: 1234

Combining Streaming with Function Calling

By combining streaming and Function Calling, you can build advanced applications that respond in real time while also fetching external data.

import google.generativeai as genai
 
genai.configure(api_key="YOUR_API_KEY")
 
model = genai.GenerativeModel(
    model_name="gemini-2.5-pro",
    tools=[get_weather]
)
 
# Streaming + Function Calling
response = model.generate_content(
    "Help me plan a picnic for tomorrow. Check Tokyo's weather and create an appropriate packing list.",
    stream=True
)
 
for chunk in response:
    # Handle Function Call requests in stream
    for part in chunk.parts:
        if hasattr(part, 'function_call'):
            result = get_weather(city="Tokyo")
            print(f"\n[Weather data fetched: {result}]\n")
    
    # Handle text responses
    if chunk.text:
        print(chunk.text, end="", flush=True)

Wrapping Up

Gemini API's streaming and Function Calling capabilities are powerful building blocks for real-time AI applications. Streaming improves the user experience with instant feedback, while Function Calling seamlessly connects your AI to external data and tools.

Start with a simple streaming chat, then layer in Function Calling to integrate external APIs like weather services or news feeds. This incremental approach is the most reliable way to build sophisticated AI-powered applications.

Thank You for Reading

Gemini Lab is ad-free, supported entirely by members like you. We publish practical guides daily with implementation code, benchmarks, and production-ready patterns. If you've found it useful, we'd love to have you on board.