GEMINI LABJP
SIRI — WWDC 2026 confirms the revamped Siri runs on a Google Gemini model, though it won't ship in the EU at iOS 27 due to the DMAFLASH3.5 — Gemini 3.5 Flash is now GA, the top Flash model for sustained frontier performance on agentic and coding tasksIMAGE-GA — Gemini 3.1 Flash Image and 3.1 Pro Image are GA as native visual models; the preview versions shut down Jun 25MANAGED-AGENTS — Managed Agents launch in public preview in the Gemini API, running autonomous agents in Google-hosted isolated Linux sandboxesFILE-SEARCH — File Search now supports multimodal search, with native image embedding and retrieval via gemini-embedding-2DEPRECATION — gemini-3.1-flash-image-preview and gemini-3-pro-image-preview shut down Jun 25 — migrate to the GA models soonSIRI — WWDC 2026 confirms the revamped Siri runs on a Google Gemini model, though it won't ship in the EU at iOS 27 due to the DMAFLASH3.5 — Gemini 3.5 Flash is now GA, the top Flash model for sustained frontier performance on agentic and coding tasksIMAGE-GA — Gemini 3.1 Flash Image and 3.1 Pro Image are GA as native visual models; the preview versions shut down Jun 25MANAGED-AGENTS — Managed Agents launch in public preview in the Gemini API, running autonomous agents in Google-hosted isolated Linux sandboxesFILE-SEARCH — File Search now supports multimodal search, with native image embedding and retrieval via gemini-embedding-2DEPRECATION — gemini-3.1-flash-image-preview and gemini-3-pro-image-preview shut down Jun 25 — migrate to the GA models soon
Articles/API / SDK
API / SDK/2026-04-07Advanced

Gemini API × PostgreSQL Complete Implementation Guide — Building an AI-Driven Database Optimization System for Production

A complete production-ready guide to automating PostgreSQL optimization with Gemini 2.5 Pro — covering Text-to-SQL generation, EXPLAIN plan analysis, index recommendations, and schema reviews using Python and FastAPI.

gemini-api285postgresql2databasesql2optimization6python132fastapi7

Setup and context — Why AI × PostgreSQL?

Database optimization is an unavoidable challenge for every backend engineer. Slow queries, missing indexes, suboptimal schema design — diagnosing and resolving these issues traditionally required deep expertise and significant time investment.

Gemini 2.5 Pro is changing that. By embedding schema information in a system prompt, you can ask it to generate SQL from natural language, analyze EXPLAIN plans, recommend indexing strategies, and review DDL changes — all through a single API.

This article walks you through building a production-ready AI-driven database optimization system that combines PostgreSQL with the Gemini API. Beyond basic Text-to-SQL, we cover query performance diagnostics, index recommendations, schema reviews, and migration planning assistance — turning Gemini into a full-fledged database assistant.

This guide targets backend engineers and SREs who are comfortable with Python and PostgreSQL basics.


1. System Architecture Overview

The system is built around five core components:

  • Schema Introspector: Automatically collects schema information from PostgreSQL system catalogs (pg_catalog)
  • Gemini Query Agent: Converts natural language to SQL with schema context embedded in the system prompt
  • Query Analyzer: Feeds EXPLAIN plans to Gemini to identify performance bottlenecks
  • Index Advisor: Recommends optimal indexing strategies based on table statistics and query patterns
  • Schema Reviewer: Reviews proposed DDL changes for design issues before they reach production
# System architecture (pseudocode)
#
# ┌─────────────────────────────────────────────┐
# │              FastAPI Application             │
# │                                             │
# │  /sql/generate  ← Text-to-SQL              │
# │  /query/analyze ← EXPLAIN Plan Analysis    │
# │  /index/advise  ← Index Recommendations    │
# │  /schema/review ← DDL Review               │
# └──────────────┬──────────────────────────────┘
#                │
#     ┌──────────▼──────────┐
#     │   Gemini 2.5 Pro    │
#     │   (google-genai)    │
#     └──────────┬──────────┘
#                │
#     ┌──────────▼──────────┐
#     │   PostgreSQL 16     │
#     │   (asyncpg)         │
#     └─────────────────────┘

2. Setup and Dependencies

Install required libraries

pip install google-genai asyncpg fastapi uvicorn pydantic python-dotenv

Configure your .env file

# .env
GEMINI_API_KEY=YOUR_GEMINI_API_KEY
DATABASE_URL=postgresql://user:password@localhost:5432/yourdb
 
# Optional: cost control
GEMINI_MAX_TOKENS_PER_REQUEST=8192
GEMINI_DAILY_TOKEN_BUDGET=1000000

Project structure

db_ai_optimizer/
├── main.py                 # FastAPI application
├── schema_introspector.py  # Schema collection
├── gemini_agent.py         # Gemini API client
├── query_analyzer.py       # Query analysis
├── index_advisor.py        # Index recommendations
├── schema_reviewer.py      # Schema reviews
├── models.py               # Pydantic models
└── .env

3. Automatic Schema Collection from PostgreSQL

For Gemini to generate accurate SQL, it needs precise schema context. The SchemaIntrospector collects that information from PostgreSQL system catalogs.

# schema_introspector.py
import asyncpg
import json
from typing import Optional
 
class SchemaIntrospector:
    """Collects schema information from PostgreSQL system catalogs"""
 
    def __init__(self, dsn: str):
        self.dsn = dsn
        self._pool: Optional[asyncpg.Pool] = None
 
    async def init_pool(self):
        self._pool = await asyncpg.create_pool(
            self.dsn,
            min_size=2,
            max_size=10,
            command_timeout=30
        )
 
    async def get_schema_context(self, schema_name: str = "public") -> str:
        """
        Returns a formatted string with table/column/index/FK information
        suitable for passing to Gemini as context
        """
        async with self._pool.acquire() as conn:
            # Fetch tables and columns
            tables = await conn.fetch("""
                SELECT
                    t.table_name,
                    c.column_name,
                    c.data_type,
                    c.is_nullable,
                    c.column_default,
                    col_description(
                        (t.table_schema || '.' || t.table_name)::regclass::oid,
                        c.ordinal_position
                    ) AS column_comment
                FROM information_schema.tables t
                JOIN information_schema.columns c
                    ON t.table_name = c.table_name
                    AND t.table_schema = c.table_schema
                WHERE t.table_schema = $1
                    AND t.table_type = 'BASE TABLE'
                ORDER BY t.table_name, c.ordinal_position
            """, schema_name)
 
            # Fetch indexes
            indexes = await conn.fetch("""
                SELECT tablename, indexname, indexdef
                FROM pg_indexes
                WHERE schemaname = $1
                ORDER BY tablename, indexname
            """, schema_name)
 
            # Fetch foreign key constraints
            fkeys = await conn.fetch("""
                SELECT
                    tc.table_name,
                    kcu.column_name,
                    ccu.table_name AS foreign_table_name,
                    ccu.column_name AS foreign_column_name
                FROM information_schema.table_constraints AS tc
                JOIN information_schema.key_column_usage AS kcu
                    ON tc.constraint_name = kcu.constraint_name
                    AND tc.table_schema = kcu.table_schema
                JOIN information_schema.constraint_column_usage AS ccu
                    ON ccu.constraint_name = tc.constraint_name
                WHERE tc.constraint_type = 'FOREIGN KEY'
                    AND tc.table_schema = $1
            """, schema_name)
 
            # Fetch estimated row counts from statistics
            row_counts = await conn.fetch("""
                SELECT relname AS table_name, n_live_tup AS estimated_rows
                FROM pg_stat_user_tables
                WHERE schemaname = $1
            """, schema_name)
 
        return self._format_schema_context(tables, indexes, fkeys, row_counts)
 
    def _format_schema_context(self, tables, indexes, fkeys, row_counts) -> str:
        """Format schema data into a structured string for the AI prompt"""
        row_count_map = {r["table_name"]: r["estimated_rows"] for r in row_counts}
 
        table_cols: dict[str, list] = {}
        for row in tables:
            tname = row["table_name"]
            if tname not in table_cols:
                table_cols[tname] = []
            table_cols[tname].append(row)
 
        lines = ["## Database Schema\n"]
        for tname, cols in table_cols.items():
            est_rows = row_count_map.get(tname, "unknown")
            lines.append(f"### Table: {tname} (estimated rows: {est_rows})")
            for col in cols:
                nullable = "NULL" if col["is_nullable"] == "YES" else "NOT NULL"
                default = f" DEFAULT {col['column_default']}" if col["column_default"] else ""
                comment = f" -- {col['column_comment']}" if col["column_comment"] else ""
                lines.append(
                    f"  - {col['column_name']}: {col['data_type']} {nullable}{default}{comment}"
                )
 
        lines.append("\n## Indexes")
        for idx in indexes:
            lines.append(f"  - [{idx['tablename']}] {idx['indexname']}: {idx['indexdef']}")
 
        lines.append("\n## Foreign Keys")
        for fk in fkeys:
            lines.append(
                f"  - {fk['table_name']}.{fk['column_name']} → "
                f"{fk['foreign_table_name']}.{fk['foreign_column_name']}"
            )
 
        return "\n".join(lines)
 
    async def close(self):
        if self._pool:
            await self._pool.close()

4. The Gemini Query Agent — Core Text-to-SQL Implementation

The query agent embeds schema context into the system prompt and converts natural language queries into safe, executable SQL. Crucially, generated SQL should only ever run against a read-only database connection.

# gemini_agent.py
import re
from google import genai
from google.genai import types
 
class GeminiQueryAgent:
    """
    AI agent that converts natural language to PostgreSQL queries
    using Gemini 2.5 Pro with embedded schema context
    """
 
    MODEL_ID = "gemini-2.5-pro"
 
    def __init__(self, api_key: str, schema_context: str):
        self.client = genai.Client(api_key=api_key)
        self.schema_context = schema_context
 
    def _build_system_prompt(self) -> str:
        return f"""You are a PostgreSQL expert. Using the database schema below,
convert the user's natural language question into a precise PostgreSQL query.
 
Rules:
1. ONLY generate SELECT statements — never INSERT, UPDATE, DELETE, DROP, or any DDL
2. Always end the query with a semicolon
3. Always quote column and table names (e.g., "users"."email")
4. Always include a LIMIT clause (default: LIMIT 100)
5. Return only the SQL — no additional explanation
 
{self.schema_context}
"""
 
    async def generate_sql(self, natural_language_query: str) -> dict:
        """
        Convert a natural language query to SQL.
 
        Args:
            natural_language_query: e.g. "Show me users whose revenue last month exceeded $1000"
 
        Returns:
            {"sql": "...", "explanation": "...", "usage": {...}}
        """
        response = self.client.models.generate_content(
            model=self.MODEL_ID,
            contents=[
                types.Content(
                    role="user",
                    parts=[types.Part(text=natural_language_query)]
                )
            ],
            config=types.GenerateContentConfig(
                system_instruction=self._build_system_prompt(),
                temperature=0.1,   # Low temperature for deterministic output
                max_output_tokens=2048,
            )
        )
 
        raw_text = response.text.strip()
        sql = self._extract_sql(raw_text)
 
        # Safety check: SELECT-only enforcement
        if not self._is_safe_sql(sql):
            raise ValueError(f"Unsafe SQL detected: {sql[:100]}...")
 
        explanation = await self._generate_explanation(sql)
 
        return {
            "sql": sql,
            "explanation": explanation,
            "raw_response": raw_text,
            "model": self.MODEL_ID,
            "usage": {
                "input_tokens": response.usage_metadata.prompt_token_count,
                "output_tokens": response.usage_metadata.candidates_token_count,
            }
        }
 
    def _extract_sql(self, text: str) -> str:
        """Extract SQL from a code block or raw text"""
        match = re.search(r"```(?:sql)?\s*(.*?)```", text, re.DOTALL | re.IGNORECASE)
        if match:
            return match.group(1).strip()
        return text.strip()
 
    def _is_safe_sql(self, sql: str) -> bool:
        """Reject any SQL that isn't a pure SELECT statement"""
        sql_upper = sql.upper().strip()
        dangerous = [
            "INSERT", "UPDATE", "DELETE", "DROP", "CREATE",
            "ALTER", "TRUNCATE", "GRANT", "REVOKE", "EXECUTE"
        ]
        if not (sql_upper.startswith("SELECT") or sql_upper.startswith("WITH")):
            return False
        for keyword in dangerous:
            if re.search(rf"\b{keyword}\b", sql_upper):
                return False
        return True
 
    async def _generate_explanation(self, sql: str) -> str:
        """Generate a plain-English explanation of the SQL"""
        response = self.client.models.generate_content(
            model="gemini-2.5-flash",   # Flash is sufficient and more cost-efficient
            contents=[
                types.Content(
                    role="user",
                    parts=[types.Part(
                        text=f"Explain the following SQL in 1-2 sentences:\n{sql}"
                    )]
                )
            ],
            config=types.GenerateContentConfig(
                temperature=0.3,
                max_output_tokens=256
            )
        )
        return response.text.strip()

5. Query Performance Analysis — AI-Powered EXPLAIN Plan Reader

EXPLAIN plans are notoriously difficult to read for engineers who don't work with them daily. By feeding the plan directly to Gemini, you get instant bottleneck identification and concrete improvement suggestions.

# query_analyzer.py
import asyncpg
from google import genai
from google.genai import types
import json
 
class QueryAnalyzer:
    """Analyzes SQL EXPLAIN plans using Gemini"""
 
    def __init__(self, api_key: str, dsn: str):
        self.client = genai.Client(api_key=api_key)
        self.dsn = dsn
 
    async def analyze_query(self, sql: str) -> dict:
        """
        Fetches the EXPLAIN plan for a SQL query and runs AI analysis.
 
        Returns:
            {"explain_plan": "...", "analysis": "...", "bottlenecks": [...], ...}
        """
        explain_plan = await self._get_explain_plan(sql)
        analysis_result = await self._analyze_with_gemini(sql, explain_plan)
 
        return {"explain_plan": explain_plan, **analysis_result}
 
    async def _get_explain_plan(self, sql: str) -> str:
        """Fetch the EXPLAIN plan in JSON format from PostgreSQL"""
        conn = await asyncpg.connect(self.dsn)
        try:
            async with conn.transaction():
                plan = await conn.fetchval(
                    f"EXPLAIN (FORMAT JSON, COSTS true, BUFFERS false) {sql}"
                )
            return str(plan)
        finally:
            await conn.close()
 
    async def _analyze_with_gemini(self, sql: str, explain_plan: str) -> dict:
        """Run Gemini analysis on the EXPLAIN plan"""
        prompt = f"""You are a PostgreSQL performance tuning expert.
Analyze the following SQL and its EXPLAIN plan, then respond in JSON format.
 
## SQL:
```sql
{sql}

EXPLAIN Plan:

{explain_plan}

Respond with this JSON structure: {{ "summary": "One-line performance verdict (Good / Needs Improvement / Critical)", "estimated_cost": <total_cost as number>, "bottlenecks": [ "Identified bottleneck 1", "Identified bottleneck 2" ], "suggestions": [ "Concrete improvement suggestion 1 (e.g., Add index on users.email)", "Concrete improvement suggestion 2" ], "analysis": "Detailed analysis (3-5 sentences)" }} """

    response = self.client.models.generate_content(
        model="gemini-2.5-pro",
        contents=[types.Content(role="user", parts=[types.Part(text=prompt)])],
        config=types.GenerateContentConfig(
            temperature=0.2,
            max_output_tokens=1024,
            response_mime_type="application/json"
        )
    )

    return json.loads(response.text)

---

## 6. AI-Powered Index Recommendation Engine

By combining table statistics from `pg_stats` with representative query patterns, Gemini can suggest precise, actionable index strategies.

```python
# index_advisor.py
import asyncpg
from google import genai
from google.genai import types
import json

class IndexAdvisor:
    """Recommends optimal indexes based on table statistics and query patterns"""

    def __init__(self, api_key: str, dsn: str):
        self.client = genai.Client(api_key=api_key)
        self.dsn = dsn

    async def advise(self, table_name: str, sample_queries: list[str]) -> dict:
        """
        Generate index recommendations for a given table.

        Args:
            table_name: Target table name
            sample_queries: Representative queries that use this table

        Returns:
            {"recommended_indexes": [...], "ddl_statements": [...], "rationale": "..."}
        """
        table_stats = await self._get_table_statistics(table_name)
        existing_indexes = await self._get_existing_indexes(table_name)
        return await self._generate_recommendations(
            table_name, table_stats, existing_indexes, sample_queries
        )

    async def _get_table_statistics(self, table_name: str) -> list[dict]:
        """Fetch column statistics from pg_stats"""
        conn = await asyncpg.connect(self.dsn)
        try:
            stats = await conn.fetch("""
                SELECT
                    attname AS column_name,
                    n_distinct,
                    correlation,
                    most_common_vals::text,
                    null_frac
                FROM pg_stats
                WHERE tablename = $1
                ORDER BY n_distinct DESC
            """, table_name)
            return [dict(s) for s in stats]
        finally:
            await conn.close()

    async def _get_existing_indexes(self, table_name: str) -> list[str]:
        """Retrieve existing index definitions"""
        conn = await asyncpg.connect(self.dsn)
        try:
            indexes = await conn.fetch(
                "SELECT indexdef FROM pg_indexes WHERE tablename = $1", table_name
            )
            return [idx["indexdef"] for idx in indexes]
        finally:
            await conn.close()

    async def _generate_recommendations(
        self, table_name, stats, existing_indexes, sample_queries
    ) -> dict:
        prompt = f"""You are a PostgreSQL index design expert.

## Table: {table_name}

## Column Statistics:
{json.dumps(stats, indent=2)}

## Existing Indexes:
{chr(10).join(existing_indexes)}

## Representative Query Patterns:
{chr(10).join(f"- {q}" for q in sample_queries)}

Based on the above, return index recommendations in this JSON format:
{{
  "recommended_indexes": [
    {{
      "columns": ["col1", "col2"],
      "type": "btree",
      "reason": "Why this index is beneficial",
      "priority": "high/medium/low"
    }}
  ],
  "ddl_statements": [
    "CREATE INDEX CONCURRENTLY idx_{table_name}_xxx ON {table_name}(col1, col2);"
  ],
  "existing_redundant_indexes": ["Any existing indexes that could be dropped"],
  "rationale": "Overall reasoning behind the recommendations (2-3 sentences)"
}}
"""

        response = self.client.models.generate_content(
            model="gemini-2.5-pro",
            contents=[types.Content(role="user", parts=[types.Part(text=prompt)])],
            config=types.GenerateContentConfig(
                temperature=0.2,
                max_output_tokens=2048,
                response_mime_type="application/json"
            )
        )

        return json.loads(response.text)

7. Schema Review System

Before any DDL change reaches production, let Gemini review it. This catches common mistakes — missing NOT NULL constraints, poor data type choices, risky migrations without CONCURRENT — before they cause downtime.

# schema_reviewer.py
from google import genai
from google.genai import types
import json
 
class SchemaReviewer:
    """Reviews proposed DDL changes using Gemini"""
 
    def __init__(self, api_key: str, schema_context: str):
        self.client = genai.Client(api_key=api_key)
        self.schema_context = schema_context
 
    async def review_ddl(self, proposed_ddl: str) -> dict:
        """
        Review a proposed DDL statement.
 
        Returns:
            {"score": 85, "approved": True, "issues": [...], "suggestions": [...]}
        """
        prompt = f"""You are a PostgreSQL database design reviewer.
Review the proposed DDL change in the context of the existing schema.
 
## Existing Schema:
{self.schema_context}
 
## Proposed DDL:
```sql
{proposed_ddl}

Return your review as JSON: {{ "score": <0-100>, "approved": <true/false>, "issues": [ {{ "severity": "critical/warning/info", "message": "Description of the issue", "line": "Relevant line if applicable" }} ], "suggestions": [ "Improvement suggestion 1", "Improvement suggestion 2" ], "performance_impact": "Assessment of performance impact", "migration_notes": "Notes for safe production deployment" }}

Review dimensions:

  • NOT NULL constraints and DEFAULT values

  • Index strategy (PK, FK, search indexes)

  • Data type selection (UUID vs SERIAL vs BIGSERIAL)

  • Naming convention consistency

  • Impact on existing data (locks, downtime risk)

  • Use of CONCURRENT for index creation """

      response = self.client.models.generate_content(
          model="gemini-2.5-pro",
          contents=[types.Content(role="user", parts=[types.Part(text=prompt)])],
          config=types.GenerateContentConfig(
              temperature=0.1,
              max_output_tokens=2048,
              response_mime_type="application/json"
          )
      )
    
      return json.loads(response.text)
    

---

## 8. Assembling the FastAPI Server

With all components built, wire them together into a REST API.

```python
# main.py
import os
from contextlib import asynccontextmanager
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from dotenv import load_dotenv

from schema_introspector import SchemaIntrospector
from gemini_agent import GeminiQueryAgent
from query_analyzer import QueryAnalyzer
from index_advisor import IndexAdvisor
from schema_reviewer import SchemaReviewer

load_dotenv()

GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")
DATABASE_URL = os.getenv("DATABASE_URL")

introspector: SchemaIntrospector = None
query_agent: GeminiQueryAgent = None

@asynccontextmanager
async def lifespan(app: FastAPI):
    """Load schema context at startup"""
    global introspector, query_agent
    introspector = SchemaIntrospector(DATABASE_URL)
    await introspector.init_pool()
    schema_context = await introspector.get_schema_context()
    query_agent = GeminiQueryAgent(GEMINI_API_KEY, schema_context)
    print("✅ Schema context loaded")
    yield
    await introspector.close()

app = FastAPI(
    title="DB AI Optimizer API",
    description="Gemini 2.5 Pro × PostgreSQL Optimization System",
    lifespan=lifespan
)

class SQLGenerateRequest(BaseModel):
    query: str

class QueryAnalyzeRequest(BaseModel):
    sql: str

class IndexAdviseRequest(BaseModel):
    table_name: str
    sample_queries: list[str]

class SchemaReviewRequest(BaseModel):
    ddl: str

@app.post("/sql/generate")
async def generate_sql(req: SQLGenerateRequest):
    """Convert natural language to SQL"""
    try:
        return await query_agent.generate_sql(req.query)
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"SQL generation error: {str(e)}")

@app.post("/query/analyze")
async def analyze_query(req: QueryAnalyzeRequest):
    """Analyze query performance with EXPLAIN + Gemini"""
    analyzer = QueryAnalyzer(GEMINI_API_KEY, DATABASE_URL)
    try:
        return await analyzer.analyze_query(req.sql)
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Analysis error: {str(e)}")

@app.post("/index/advise")
async def advise_index(req: IndexAdviseRequest):
    """Generate index recommendations for a table"""
    advisor = IndexAdvisor(GEMINI_API_KEY, DATABASE_URL)
    try:
        return await advisor.advise(req.table_name, req.sample_queries)
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Advisor error: {str(e)}")

@app.post("/schema/review")
async def review_schema(req: SchemaReviewRequest):
    """AI-powered DDL review"""
    schema_context = await introspector.get_schema_context()
    reviewer = SchemaReviewer(GEMINI_API_KEY, schema_context)
    try:
        return await reviewer.review_ddl(req.ddl)
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Review error: {str(e)}")

@app.get("/health")
async def health_check():
    return {"status": "ok", "model": "gemini-2.5-pro"}

Starting the server

# Development
uvicorn main:app --host 0.0.0.0 --port 8000 --reload
 
# Production (multi-worker)
uvicorn main:app --host 0.0.0.0 --port 8000 --workers 4

9. Production Security Design

Create a read-only database role

All SQL generated by the AI should execute under a read-only database user. This is non-negotiable for production systems.

-- Create read-only role
CREATE ROLE ai_optimizer_readonly;
GRANT CONNECT ON DATABASE yourdb TO ai_optimizer_readonly;
GRANT USAGE ON SCHEMA public TO ai_optimizer_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO ai_optimizer_readonly;
 
-- Apply to future tables automatically
ALTER DEFAULT PRIVILEGES IN SCHEMA public
  GRANT SELECT ON TABLES TO ai_optimizer_readonly;
 
CREATE USER ai_optimizer_user WITH PASSWORD 'strong_password'
  IN ROLE ai_optimizer_readonly;

Token budget management

# token_budget.py
import asyncio
from collections import defaultdict
from datetime import date
 
class TokenBudgetManager:
    """
    Tracks Gemini API token consumption and enforces
    daily budget limits to prevent runaway costs
    """
 
    def __init__(self, daily_budget: int = 1_000_000):
        self.daily_budget = daily_budget
        self._usage: dict[str, int] = defaultdict(int)
        self._lock = asyncio.Lock()
 
    async def consume(self, tokens: int) -> bool:
        """Consume tokens. Returns False if daily budget is exceeded."""
        async with self._lock:
            today = str(date.today())
            if self._usage[today] + tokens > self.daily_budget:
                return False
            self._usage[today] += tokens
            return True
 
    def get_today_usage(self) -> int:
        return self._usage.get(str(date.today()), 0)

10. Cost Management and Monitoring

Understanding the cost profile of this system helps you plan capacity and avoid surprises.

Estimated token costs (Gemini 2.5 Pro, April 2026):

  • Input: $1.25 / 1M tokens (≤128k context)
  • Output: $10.00 / 1M tokens

Per-request estimates:

  • Text-to-SQL: 2,000–5,000 input tokens, 200–500 output tokens
  • EXPLAIN analysis: 3,000–8,000 input tokens, 500–1,000 output tokens
  • Index recommendations: 2,000–4,000 input tokens, 1,000–2,000 output tokens

At 100 requests per day, total costs typically fall in the $0.50–$2.00 range — well within budget for internal tooling.

To cut costs further, leverage Gemini's native context caching (cachedContent). Since the schema prompt is reused across every request, caching it can reduce input token costs significantly on high-volume deployments. See the context caching cost optimization guide for implementation details.


11. Extending to a Slack Bot for Team-Wide Access

One of the most practical extensions of this system is a Slack bot that lets any engineer on the team query the database using natural language — without needing direct database access or SQL knowledge.

# slack_bot.py — minimal Slack integration using slack_bolt
from slack_bolt import App
from slack_bolt.adapter.fastapi import SlackRequestHandler
import httpx
 
app = App(token="YOUR_SLACK_BOT_TOKEN", signing_secret="YOUR_SIGNING_SECRET")
handler = SlackRequestHandler(app)
 
DB_AI_API = "http://localhost:8000"
 
@app.command("/dbquery")
async def handle_db_query(ack, body, say):
    """
    Slack slash command: /dbquery <natural language question>
    Example: /dbquery Show top 10 users by revenue this month
    """
    await ack()
 
    user_query = body.get("text", "").strip()
    if not user_query:
        await say("Please provide a query. Example: `/dbquery Show top 10 users by revenue`")
        return
 
    # Post "thinking" message first
    await say(f"🤖 Generating SQL for: _{user_query}_")
 
    async with httpx.AsyncClient() as client:
        try:
            # Generate SQL
            sql_response = await client.post(
                f"{DB_AI_API}/sql/generate",
                json={"query": user_query},
                timeout=30.0
            )
            result = sql_response.json()
 
            # Format and post result
            message = (
                f"*Generated SQL:*\n```{result['sql']}```\n"
                f"*Explanation:* {result['explanation']}\n"
                f"*Token usage:* {result['usage']['input_tokens']} in / "
                f"{result['usage']['output_tokens']} out"
            )
            await say(message)
 
        except Exception as e:
            await say(f"❌ Error: {str(e)}")
 
@app.command("/dbexplain")
async def handle_db_explain(ack, body, say):
    """
    Slack slash command: /dbexplain <SQL>
    Analyzes the EXPLAIN plan and returns bottlenecks
    """
    await ack()
    sql = body.get("text", "").strip()
    if not sql:
        await say("Please provide a SQL statement to analyze.")
        return
 
    await say("🔍 Analyzing query performance...")
 
    async with httpx.AsyncClient() as client:
        try:
            response = await client.post(
                f"{DB_AI_API}/query/analyze",
                json={"sql": sql},
                timeout=30.0
            )
            result = response.json()
 
            bottlenecks = "\n".join(f"• {b}" for b in result.get("bottlenecks", []))
            suggestions = "\n".join(f"• {s}" for s in result.get("suggestions", []))
 
            message = (
                f"*Performance Summary:* {result.get('summary', 'N/A')}\n"
                f"*Estimated Cost:* {result.get('estimated_cost', 'N/A')}\n\n"
                f"*Bottlenecks:*\n{bottlenecks}\n\n"
                f"*Suggestions:*\n{suggestions}"
            )
            await say(message)
 
        except Exception as e:
            await say(f"❌ Error: {str(e)}")

To deploy the Slack bot alongside your FastAPI server, add it as a new router:

# In main.py — add Slack event router
from fastapi import Request
from slack_bot import handler
 
@app.post("/slack/events")
async def slack_events(req: Request):
    return await handler.handle(req)

With this in place, your team can run natural language database queries directly from Slack — and get EXPLAIN analysis without ever touching the database console.


12. Deploying with Docker and GitHub Actions

To make this system production-ready, containerize it and set up automated CI/CD with schema review as a gate.

Dockerfile

FROM python:3.12-slim
 
WORKDIR /app
 
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
 
COPY . .
 
EXPOSE 8000
 
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "2"]

GitHub Actions: Automated Schema Review Gate

Add this workflow to your repository to automatically review any migration file that is modified in a pull request:

# .github/workflows/schema-review.yml
name: AI Schema Review
 
on:
  pull_request:
    paths:
      - 'migrations/**/*.sql'
      - 'db/migrations/**'
 
jobs:
  schema-review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
 
      - name: Get changed migration files
        id: changed
        run: |
          CHANGED=$(git diff --name-only origin/main...HEAD | grep -E '\.(sql)$' | head -5)
          echo "files=$CHANGED" >> $GITHUB_OUTPUT
 
      - name: Run AI schema review
        if: steps.changed.outputs.files != ''
        env:
          GEMINI_API_KEY: ${{ secrets.GEMINI_API_KEY }}
          DB_AI_API: ${{ secrets.DB_AI_API_URL }}
        run: |
          for file in ${{ steps.changed.outputs.files }}; do
            echo "Reviewing: $file"
            DDL=$(cat "$file")
            RESULT=$(curl -s -X POST "$DB_AI_API/schema/review" \
              -H "Content-Type: application/json" \
              -d "{\"ddl\": $(echo "$DDL" | python3 -c 'import json,sys; print(json.dumps(sys.stdin.read()))')}")
            SCORE=$(echo $RESULT | python3 -c "import json,sys; print(json.loads(sys.stdin.read()).get('score', 0))")
            echo "Score for $file: $SCORE/100"
            if [ "$SCORE" -lt 60 ]; then
              echo "❌ Schema review failed (score: $SCORE < 60). Check issues in the review result."
              echo "$RESULT" | python3 -m json.tool
              exit 1
            fi
            echo "✅ Schema review passed (score: $SCORE)"
          done

This workflow ensures that every database migration is reviewed by Gemini before it can be merged into main — giving your team an always-on AI code reviewer for database changes.


Summary

In this guide, we built a complete AI-driven PostgreSQL optimization system powered by Gemini 2.5 Pro. The four core capabilities we implemented:

  • Text-to-SQL: Natural language to safe, executable PostgreSQL queries
  • Query Analysis: EXPLAIN plan parsing with AI-generated bottleneck identification and improvement suggestions
  • Index Recommendations: Data-driven indexing strategy from table statistics and query patterns
  • Schema Review: Pre-production DDL review to catch design issues before they cause incidents

The two most important production concerns to keep in mind: enforce read-only database access for all AI-generated queries, and use context caching to control Gemini API costs at scale. From this foundation, you can extend the system to integrate with Slack for interactive DB queries, embed it in your CI/CD pipeline for automatic schema review gates, or build a full internal DB assistant portal.

Share

Thank You for Reading

Gemini Lab is ad-free, supported entirely by members like you. We publish practical guides daily with implementation code, benchmarks, and production-ready patterns. If you've found it useful, we'd love to have you on board.

  • Copy-paste ready implementation code
  • New advanced guides published daily
  • $5/mo or $10 for lifetime access
View Membership →

If you found this article helpful, a small tip ($1.50) would mean a lot to us. Your support helps keep this site ad-free and covers server and hosting costs.

Related Articles

API / SDK2026-03-24
Building a Semantic Search Engine with Gemini API and PostgreSQL pgvector — From Embeddings to Production Deployment
Learn how to build a production-ready semantic search engine using Gemini's Embedding API and PostgreSQL pgvector. Covers vector generation, index optimization, and building a FastAPI search service with practical code examples.
API / SDK2026-04-21
Building a Voice Memo → Structured Data Pipeline with Gemini API: From Recording to Calendar Integration
Build a complete Python pipeline that transcribes voice memos with Gemini API, extracts tasks and events using Structured Output, and automatically registers them to Google Calendar and Tasks. Includes production-ready FastAPI implementation with error handling and cost analysis.
API / SDK2026-04-15
Designing a Production Prompt Management System for Gemini API — Versioning, A/B Testing, and Canary Rollouts
A complete implementation guide for solving the prompt versioning, attribution, and safety challenges in production Gemini API deployments — using FastAPI, PostgreSQL, Redis, A/B testing, and canary rollouts.
📚RECOMMENDED BOOKS
Build a Large Language Model (From Scratch)
Sebastian Raschka
LLM Dev
Prompt Engineering for LLMs
Berryman & Ziegler
Prompting
AI Engineering
Chip Huyen
AI Eng
* Contains affiliate links
See all →