GEMINI LAB
Dev Tools · 2026-04-30 · Intermediate

Ship a Production Gemini Agent in 30 Minutes with Mastra and TypeScript

Mastra keeps the lightness of Vercel AI SDK while adding the agent primitives you actually need in production. This guide walks through building, debugging, and deploying a Gemini-powered Mastra agent end-to-end, including the Cloudflare Workers gotchas that bit me first.


The moment you decide to write an AI agent in TypeScript, you immediately have to pick a stack — call Vercel AI SDK directly, wire up LangChain.js, try Google's TypeScript ADK, or something else. Each option has a real reason to exist, and the choice paralysis is real.

After spending the past few months actually shipping with all of them, I keep coming back to Mastra for the indie-and-production sweet spot. It preserves the thinness of Vercel AI SDK while giving you a unified API for agents, tools, memory, and workflows — which fits the rhythm of a single developer iterating quickly.

Why Mastra (vs. AI SDK alone, vs. LangChain.js)

The shortest description I can give Mastra is: a thin layer on top of Vercel AI SDK that adds only what an agent really needs. It uses AI SDK's generateText and streamText under the hood, so if you already know AI SDK, the learning curve is essentially zero.

Comparing the three options as I've used them:

  • Vercel AI SDK alone. Minimal and elegant for one-shot model calls, but you end up writing your own agent loop, memory store, and workflow runner. The codebase grows fast.
  • LangChain.js. Feature-rich but the abstractions are deep. When something goes wrong, tracing where the prompt was actually assembled tends to take more time than it should.
  • Mastra. Stays close to AI SDK's surface. You get the Agent class, createTool, and Workflow primitives without losing the feeling of writing TypeScript. The built-in mastra dev dashboard makes agent debugging far less painful.

If you've never used Vercel AI SDK with Gemini, the Next.js + AI SDK guide is a smoother on-ramp before this one.

Project setup

The fastest path is the official create-mastra scaffold. You'll want Node.js 20 or newer.

# Interactive scaffold (asks for project name / components / examples)
npx create-mastra@latest my-gemini-agent
 
cd my-gemini-agent
 
# Add the Google provider for Gemini
npm install @ai-sdk/google

Drop your Gemini API key into .env. Forgetting this gives you a "model not found" style error on first run, not a clean 401, so it's worth checking before anything else.

# .env
GOOGLE_GENERATIVE_AI_API_KEY=YOUR_GEMINI_API_KEY

You can issue a key from Google AI Studio. For production we'll move it into Cloudflare Workers Secrets or Vercel environment variables, so this value is just for local dev.
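
Because a missing key surfaces as a confusing error rather than a clean 401, it can help to check for it explicitly at startup. A minimal sketch, assuming a helper named requireEnv (hypothetical, not part of Mastra or AI SDK):

```typescript
// Hypothetical helper: fail fast when a required variable is missing or blank.
type EnvSource = Record<string, string | undefined>;

function requireEnv(source: EnvSource, name: string): string {
  // Treat whitespace-only values the same as a missing variable.
  const value = source[name]?.trim();
  if (!value) {
    throw new Error(`Missing environment variable: ${name}`);
  }
  return value;
}

// Usage in local dev (Node.js):
//   requireEnv(process.env, "GOOGLE_GENERATIVE_AI_API_KEY");
```

Taking the source object as a parameter, rather than reading process.env directly, means the same check can be reused later against whatever object your deployment target hands you.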

A first agent on Gemini 2.5 Flash

An agent in Mastra is "model + instructions + tools." We'll start with no tools and just instructions to keep things minimal.

// src/mastra/agents/assistant.ts
import { Agent } from "@mastra/core/agent";
import { google } from "@ai-sdk/google";
 
export const assistantAgent = new Agent({
  name: "AssistantAgent",
  instructions: `
You are an editor-assistant for a developer blog.
- Reply concisely.
- If you are unsure about something, say "needs verification" rather than guessing.
- When asked for code, always answer in TypeScript.
  `,
  model: google("gemini-2.5-flash"),
});

Register it on the Mastra instance:

// src/mastra/index.ts
import { Mastra } from "@mastra/core";
import { assistantAgent } from "./agents/assistant";
 
export const mastra = new Mastra({
  agents: { assistantAgent },
});

Run npx mastra dev and a local dashboard opens in your browser. You can chat with the agent, but more importantly you can see every internal LLM call, the tokens used, and the latency per step. That alone makes the early tuning loop much shorter.

A pragmatic note on model choice: I run on gemini-2.5-flash during the build phase and only switch to gemini-2.5-pro once I'm tightening quality. The latency and cost profile of Flash matches the trial-and-error rhythm of agent development much better than Pro.
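
One way to make that switch a configuration change instead of a code edit is to map a phase name to a model ID. This is only a sketch: the phase names and the AGENT_PHASE variable are assumptions for illustration, and the two model IDs are the ones named above.

```typescript
// Phase names are hypothetical; the model IDs are the two used in this article.
type Phase = "build" | "tune";

const MODEL_BY_PHASE: Record<Phase, string> = {
  build: "gemini-2.5-flash", // fast and cheap while iterating
  tune: "gemini-2.5-pro",    // switch once you are tightening quality
};

function modelIdFor(rawPhase: string | undefined): string {
  // Anything other than "tune" falls back to the cheaper build-phase model.
  const phase: Phase = rawPhase === "tune" ? "tune" : "build";
  return MODEL_BY_PHASE[phase];
}

// Possible usage in the agent definition (AGENT_PHASE is an assumed variable name):
//   model: google(modelIdFor(process.env.AGENT_PHASE)),
```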

Adding a tool so the agent can act

A purely conversational agent can be written without Mastra. The leverage shows up the moment you start adding tools, because Mastra lets you declare them with Zod schemas and Gemini's function calling routes to them automatically.

// src/mastra/tools/weather.ts
import { createTool } from "@mastra/core/tools";
import { z } from "zod";
 
export const weatherTool = createTool({
  id: "get-weather",
  description: "Fetch the current weather for a given city",
  inputSchema: z.object({
    city: z.string().describe("City name, e.g. Tokyo, New York"),
  }),
  outputSchema: z.object({
    temperature: z.number(),
    description: z.string(),
  }),
  execute: async ({ context }) => {
    const res = await fetch(
      `https://wttr.in/${encodeURIComponent(context.city)}?format=j1`,
    );
    if (!res.ok) {
      throw new Error(`Weather API failed: ${res.status}`);
    }
    const data = await res.json();
    return {
      temperature: Number(data.current_condition[0].temp_C),
      description: data.current_condition[0].weatherDesc[0].value,
    };
  },
});
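
The execute function above indexes straight into the wttr.in response, so an unexpected payload would surface as a TypeError or a NaN temperature. A defensive parsing sketch, assuming a helper named parseWttr (hypothetical) that only relies on the fields the tool already reads:

```typescript
interface Weather {
  temperature: number;
  description: string;
}

// Hypothetical helper: validate only the fields the tool uses.
function parseWttr(data: unknown): Weather {
  const current = (data as { current_condition?: unknown[] } | null)
    ?.current_condition?.[0] as
    | { temp_C?: unknown; weatherDesc?: Array<{ value?: unknown }> }
    | undefined;

  const rawTemp = current?.temp_C;
  const description = current?.weatherDesc?.[0]?.value;

  // Reject empty strings explicitly, since Number("") would silently become 0.
  const temperature =
    typeof rawTemp === "string" && rawTemp.trim() !== "" ? Number(rawTemp) : NaN;

  if (!Number.isFinite(temperature) || typeof description !== "string") {
    throw new Error("Unexpected weather payload shape");
  }
  return { temperature, description };
}
```

Inside execute, the return statement could then become `return parseWttr(await res.json());`, so a malformed response produces one clear error message.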

Wire it to the agent:

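// src/mastra/agents/assistant.ts
import { Agent } from "@mastra/core/agent";
import { google } from "@ai-sdk/google";
import { weatherTool } from "../tools/weather";
 
export const assistantAgent = new Agent({
  name: "AssistantAgent",
  instructions: "...(see above)",
  model: google("gemini-2.5-flash"),
  // The object key is the name the tool is registered under on this agent
  tools: { weatherTool },
});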

The non-obvious part: write your description strings as if they were the only thing the model can see, because they are. Gemini decides whether and when to call a tool largely from the description, so vague descriptions cause both false positives ("called the tool when it shouldn't have") and false negatives ("ignored the tool when it should have used it"). I burned a lot of cycles on this before realizing how much weight that one string carries.

Memory: keep context without blowing up the prompt

For short conversations you can stuff history into the prompt manually, but it doesn't scale: you eventually hit Gemini's input limit, and every extra token shows up on your bill. Mastra's Memory abstracts this out and lets you swap storage backends without touching your agent code.

// src/mastra/memory.ts
import { Memory } from "@mastra/memory";
import { LibSQLStore } from "@mastra/libsql";
 
export const memory = new Memory({
  storage: new LibSQLStore({
    url: "file:./mastra.db", // swap to Turso or Cloudflare D1 in production
  }),
  options: {
    lastMessages: 20,        // include the last 20 messages in the prompt
    semanticRecall: false,   // flip to true to enable embedding-based recall
  },
});
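
Attach it to the agent:

// src/mastra/agents/assistant.ts
import { Agent } from "@mastra/core/agent";
import { memory } from "../memory";
 
export const assistantAgent = new Agent({
  // ...
  memory,
});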

Local dev runs against a SQLite file. Cloudflare Workers has no filesystem, so plan to switch to Turso (hosted libSQL) or Cloudflare D1 before you deploy — much cheaper to do that swap when there are 50 lines of code than 500.
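
To make the lastMessages setting concrete: conceptually it keeps only the most recent N messages in the prompt. The following is an illustration of that windowing idea, not Mastra's actual implementation; the ChatMessage type and lastMessagesWindow function are assumptions made up for this sketch.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Illustrative only: keep the most recent `lastMessages` entries of a history.
function lastMessagesWindow(
  history: ChatMessage[],
  lastMessages: number,
): ChatMessage[] {
  // Guard against 0 or negatives: slice(-0) would return the whole array.
  if (lastMessages <= 0) return [];
  return history.slice(-lastMessages);
}
```

With 25 stored messages and a window of 20, the five oldest messages drop out of the prompt. That is the trade-off semantic recall is meant to soften, by pulling older but relevant messages back in.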

If you want a deeper comparison of memory and retrieval design, the LangChain.js production agent guide is a good companion read for this section.

Tightening with mastra dev traces

npx mastra dev does more than chat — it gives you a step-by-step trace per request: which tools fired, which prompts were sent, how many tokens each call used. The loop I recommend is:

  1. After adding a tool, run a few prompts in mastra dev and check the trace to see whether the tool gets selected when expected.
  2. After every instruction tweak, re-verify that tool selection still works for both positive and negative cases.
  3. When the agent picks the wrong tool, edit the description first — not the system prompt.

Agent development comes down to a tight "instructions → run → inspect → edit" loop. Mastra's win is that this loop fits on one screen.
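
Once the manual loop stabilizes, the positive and negative checks from step 2 can be captured as a small regression harness. This is a sketch under assumptions: RunAgent is a hypothetical adapter that returns the names of the tools an agent called for a prompt (how you extract those from a real agent response depends on your Mastra version), and fakeRun is a stand-in so the example is self-contained.

```typescript
interface ToolCase {
  prompt: string;
  expectTool: string | null; // null means no tool should fire
}

// Hypothetical adapter: returns the names of the tools called for a prompt.
type RunAgent = (prompt: string) => Promise<string[]>;

async function checkToolSelection(
  run: RunAgent,
  cases: ToolCase[],
): Promise<string[]> {
  const failures: string[] = [];
  for (const c of cases) {
    const called = await run(c.prompt);
    const ok =
      c.expectTool === null
        ? called.length === 0
        : called.includes(c.expectTool);
    if (!ok) {
      failures.push(
        `"${c.prompt}": expected ${c.expectTool ?? "no tool"}, got [${called.join(", ")}]`,
      );
    }
  }
  return failures; // an empty array means every case passed
}

// Stand-in runner for illustration; a real one would call the agent.
const fakeRun: RunAgent = async (prompt) =>
  /weather/i.test(prompt) ? ["weatherTool"] : [];
```

Re-running a list like this after each instruction or description tweak catches the case where fixing one false negative quietly introduces a false positive elsewhere.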

Cloudflare Workers deployment gotchas

A few things I tripped over the first time I shipped a Mastra agent to Workers:

  • @mastra/libsql does not run on Workers. It assumes a filesystem. Replace it with @mastra/cloudflare-d1 or a hosted Turso client.
  • You need the Node.js compatibility flag. Without compatibility_flags = ["nodejs_compat"] in wrangler.toml, @ai-sdk/google blows up on a Buffer reference.
  • Workers don't expose process.env. Read secrets from the env argument of the fetch handler. Either initialize your Mastra agent inside the handler, or inject the API key into a global once per request — both work, but mixing the two patterns gets messy.
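
The "initialize inside the handler" option from the last bullet could look roughly like this. It is a sketch, not a drop-in Worker: buildAgent and AgentLike are hypothetical stand-ins for constructing your Mastra agent with the key, and the q query parameter is an assumption made for the example.

```typescript
interface Env {
  GOOGLE_GENERATIVE_AI_API_KEY?: string;
}

interface AgentLike {
  generate(prompt: string): Promise<string>;
}

// Hypothetical factory: a real one would build the Mastra agent with a
// Google provider configured from apiKey. This stub just echoes the prompt.
function buildAgent(apiKey: string): AgentLike {
  if (!apiKey) throw new Error("apiKey is required");
  return { generate: async (prompt) => `echo: ${prompt}` };
}

const worker = {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Secrets arrive on the env argument, not on process.env.
    const apiKey = env.GOOGLE_GENERATIVE_AI_API_KEY;
    if (!apiKey) {
      return new Response("GOOGLE_GENERATIVE_AI_API_KEY is not configured", {
        status: 500,
      });
    }
    const prompt = new URL(request.url).searchParams.get("q") ?? "";
    const agent = buildAgent(apiKey); // built per request, inside the handler
    return new Response(await agent.generate(prompt));
  },
};

// In a real Worker module you would finish with: export default worker;
```

Building the agent per request keeps all configuration flowing through env, at the cost of repeating a small amount of setup work on every call.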

The Hono + Cloudflare Workers + Gemini guide is also a useful reference if you want a thin API gateway in front of your Mastra agent at the edge.

Wrapping up

After running through this stack a few times, my honest take is that Mastra hits the rare sweet spot of "doesn't hide what AI SDK is doing, but adds the production scaffolding you'd otherwise write yourself." It's not as feature-complete as LangChain.js, but the trade-off is that you spend less time decoding abstractions and more time writing prompts and tools.

If a single agent worked for you today, the natural next step is Workflow. Splitting "user question → research agent → summarizer → social-post formatter" into a workflow makes each agent's prompt simpler and the overall behavior much easier to reason about. That's where I'd point you next.
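
The shape of that split can be sketched without any framework: each stage takes the previous stage's output and does one job. This is a conceptual illustration, not Mastra's Workflow API; Step, runPipeline, and the three stand-in stages are all hypothetical.

```typescript
type Step = (input: string) => Promise<string>;

// Run each step in order, feeding its output to the next one.
async function runPipeline(steps: Step[], input: string): Promise<string> {
  let current = input;
  for (const step of steps) {
    current = await step(current);
  }
  return current;
}

// Stand-in stages; in practice each would call its own agent.
const research: Step = async (question) => `notes on: ${question}`;
const summarize: Step = async (notes) => `summary(${notes})`;
const formatPost: Step = async (summary) => `POST: ${summary}`;
```

Calling `runPipeline([research, summarize, formatPost], question)` keeps each stage's prompt small and lets you test the stages one at a time.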
