Gemini Lab
Dev Tools / 2026-06-02 / Advanced

A Lightweight Gemini Backend with Bun and Hono — Reclaiming the Small Tools of Indie Development

Has your Node and Express Gemini backend grown heavy with dependencies and build times? Here is how I moved one to Bun and Hono — folding streaming, rate limiting, cost caps, testing, and self-hosting into a single light runtime — along with the pitfalls I hit in production.

Tags: gemini-api, bun, hono, backend, streaming, indie-dev, production, cloudflare-workers


Late one night, while fixing a tiny internal tool that does nothing more than call Gemini, I noticed its node_modules had crept past 300 MB. The whole job was to take one app review and return a summary. Yet it was dragging along Express, a TypeScript build, and a process manager for hot reload, and every small change meant waiting for the environment to spin back up.

I have built iOS and Android apps on my own since 2014. These days they are mostly wallpaper apps and calm well-being titles, with several running in parallel. Cumulative downloads have passed 50 million, and revenue comes mainly from AdMob. The apps themselves are written in Swift and Kotlin, but I had long left the small tools behind them (review analysis, metadata generation, image tagging) on the same recycled Node and Express stack. Moving one of them to Bun and Hono made it noticeably lighter, so I want to record the reasoning as much as the code.

Why I decided to add "one more backend"

Let me be honest up front: most of my production backends still run on Cloudflare Workers. For a few hundred yen a month I get global edge deployment, and I can add a Worker per app without operations falling apart. So this is not a "move everything to Bun" story.

The trigger was the local development experience. Workers are wonderful in production, but when I want to iterate locally on a long Gemini stream, nudging the rate-limit logic a little at a time, the emulator restarts and build waits add up. For a small tool I run by hand, an environment where install, run, and test all live in one binary suits a one-person operation better.

Bun is a runtime, a package manager, and a test runner at once. Hono is a routing framework on top of it that stays close to web standards (Request / Response), and it also runs on Cloudflare Workers. As long as the handlers avoid Bun-only globals, "iterate locally on Bun, ship the same code to Workers" actually holds. That is what made the extra environment worth its keep.

What changes from Node + Express — the smallest Before / After

Let me start with the dullest and most effective difference. Here is an endpoint that calls Gemini once, in Express and then in Hono.

The Express version I used to write:

// server.express.ts — the old way
import express from "express";
import { GoogleGenerativeAI } from "@google/generative-ai";
 
const app = express();
app.use(express.json());
 
const genai = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
 
app.post("/summarize", async (req, res) => {
  try {
    const model = genai.getGenerativeModel({ model: "gemini-2.5-flash" });
    const result = await model.generateContent(req.body.text);
    res.json({ summary: result.response.text() });
  } catch (e) {
    res.status(500).json({ error: "failed" });
  }
});
 
app.listen(3000, () => console.log("listening on 3000"));

The same thing in Hono:

// server.ts — Bun + Hono
import { Hono } from "hono";
import { env } from "hono/adapter";
import { GoogleGenerativeAI } from "@google/generative-ai";

const app = new Hono();

app.post("/summarize", async (c) => {
  // env() reads the process environment on Bun and the bindings on Workers,
  // so the handler does not touch the Bun-only `Bun.env` global.
  const { GEMINI_API_KEY } = env<{ GEMINI_API_KEY: string }>(c);
  const genai = new GoogleGenerativeAI(GEMINI_API_KEY);

  const { text } = await c.req.json();
  const model = genai.getGenerativeModel({ model: "gemini-2.5-flash" });
  const result = await model.generateContent(text);
  return c.json({ summary: result.response.text() });
});

export default app; // Bun and Workers both accept this as-is

The line count barely differs; the real difference is the final export default app. Express's app.listen is code that "starts" a server, so it is bound to its environment. Hono only "exposes" a handler that takes a Request and returns a Response, and leaves it to the caller to decide who starts it: Bun's server, the Workers runtime, or app.request() inside a test. That single fact is what later lets the same code run in two places.
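To make the "expose, don't start" idea concrete, here is a minimal sketch of the same shape with no framework at all. It is not Hono's implementation, only an illustration under my own assumptions: the names handle, json, and Summarizer are hypothetical, and the default summarizer is a stub standing in for the Gemini call.

```typescript
// Dependency-free sketch: the whole "app" is a function from Request to Response.
// Nothing in this file starts a server. All names here are hypothetical.
type Summarizer = (text: string) => Promise<string>;

// Stub standing in for the Gemini call (assumption, not the real SDK).
const stubSummarize: Summarizer = async (text) => text.slice(0, 20);

// Small helper that builds a JSON Response with a status code.
function json(data: unknown, status = 200): Response {
  return new Response(JSON.stringify(data), {
    status,
    headers: { "content-type": "application/json" },
  });
}

async function handle(req: Request, run: Summarizer = stubSummarize): Promise<Response> {
  const url = new URL(req.url);
  if (req.method !== "POST" || url.pathname !== "/summarize") {
    return json({ error: "not found" }, 404);
  }

  let body: { text?: unknown };
  try {
    body = await req.json();
  } catch {
    return json({ error: "invalid JSON" }, 400);
  }

  if (typeof body.text !== "string" || body.text.length === 0) {
    return json({ error: "text is required" }, 400);
  }
  return json({ summary: await run(body.text) });
}

// A test is just another caller: build a Request, await the Response.
async function callSummarize(text: string, run?: Summarizer): Promise<Response> {
  const req = new Request("http://localhost/summarize", {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ text }),
  });
  return handle(req, run);
}
```

A runtime would pick this up through a thin wrapper such as export default { fetch: (req: Request) => handle(req) }. The wrapper matters because both Bun and Workers pass their own second argument to fetch, which would otherwise land in the run parameter. Hono's export default app plays the same role, with routing and middleware added on top.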

Running it is just bun run server.ts. No ts-node, no nodemon. Switch to bun --hot server.ts and hot reload comes built in. On my machine, node_modules dropped from roughly 300 MB on the Express setup to the low 40 MB range, and bun install finished in under a second. More than the numbers, the edit-and-try round trip simply felt lighter, which matters for a tool you touch every day.
