Dev Tools · 2026-04-18 · Advanced

Building Cross-Platform AI Desktop Apps with Gemini API and Tauri 2.0

A practical guide to building macOS, Windows, and Linux desktop AI apps using Tauri 2.0 and Gemini API. Learn secure API key management with a Rust backend, real-time streaming via Tauri's event system, and native OS integration with working production code.

The question of where to store an API key is one of those nagging problems that follows you when building AI-powered apps. A separate backend server works, but it's overkill for a solo project. Putting the key directly in browser JavaScript is obviously a non-starter. For a while, I just lived with the trade-off — until I started taking Tauri seriously.

Tauri 2.0 solves this cleanly. Its Rust backend owns the API key entirely, and frontend JavaScript can never reach it. The bundle size comes in at a fraction of Electron's footprint, startup is noticeably faster, and the Rust-to-TypeScript interop — via Tauri's IPC command system — is genuinely ergonomic once you understand the patterns.

This guide covers everything you need to ship a Gemini-powered desktop AI assistant with Tauri 2.0: secure API key handling in the Rust layer, real-time streaming through Tauri's event system, multi-turn conversation management, native OS feature integration (clipboard, notifications, filesystem), common pitfalls with concrete fixes, and a production build pipeline for all three major platforms.

Why Tauri 2.0 Over Electron for AI Apps

The Electron vs. Tauri conversation usually focuses on bundle size, but for AI apps the more meaningful difference is the security model.

Bundle size is still worth mentioning. Electron ships Node.js and Chromium together, landing at 60–100+ MB minimum for even a trivial app. Tauri uses the OS's existing browser engine — WKWebView on macOS, WebView2 on Windows, WebKitGTK on Linux — bringing bundles down to 6–15 MB. For a desktop tool you distribute to friends or colleagues, this is a real quality-of-life improvement.

Security model matters more for AI apps specifically. By default, Tauri's frontend JavaScript cannot access OS-level capabilities — clipboard, filesystem, HTTP, notifications — without explicit permission grants defined in src-tauri/capabilities/. This isn't a configuration option you can accidentally omit; it's the default posture. Your Gemini API key lives only in the Rust process. The TypeScript layer calls named commands and receives results. That's it.

Backend performance: Rust's async runtime (tokio) handles concurrent Gemini API requests efficiently. If you're building something that fans out multiple API calls — like a document analyzer that processes several sections in parallel — you get genuine concurrency without worrying about an event loop.

Memory footprint: A typical Tauri app idles at 20–40 MB RAM. The equivalent Electron app runs 150–300 MB. For a background AI assistant that users keep open all day, this actually matters.

Project Setup and Prerequisites

You'll need: Rust (via rustup, version 1.77 or later), Node.js 20+, and npm or pnpm. On macOS, install Xcode Command Line Tools first (xcode-select --install). On Windows, WebView2 is preinstalled on Windows 11 and on up-to-date Windows 10 installations; older or stripped-down systems need the standalone WebView2 Runtime installer. On Linux, install your distribution's WebKitGTK 4.1 development packages before building.

# Install the Tauri CLI globally
npm install --global @tauri-apps/cli@latest
 
# Scaffold a new project using the React + TypeScript template
npm create tauri-app@latest gemini-desktop -- --template react-ts
cd gemini-desktop
 
# Install frontend dependencies
npm install

Open src-tauri/Cargo.toml and replace the [dependencies] section with the following:

[dependencies]
tauri = { version = "2", features = [] }
tauri-plugin-shell = "2"
tauri-plugin-notification = "2"
tauri-plugin-clipboard-manager = "2"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
reqwest = { version = "0.12", features = ["json", "stream"] }
futures-util = "0.3"
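tokio = { version = "1", features = ["full"] }
# A regular dependency, not a dev-dependency: main.rs calls it in debug builds,
# and dev-dependencies are only available to tests, examples, and benchmarks.
dotenvy = "0.15"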
 
[build-dependencies]
tauri-build = { version = "2", features = [] }

Create a .env file inside src-tauri/ for local development and add it to .gitignore immediately:

# src-tauri/.env  (never commit this file)
GEMINI_API_KEY=YOUR_GEMINI_API_KEY

Secure API Key Management in the Rust Backend

Here's the pattern that makes Tauri compelling for AI apps: the Rust backend loads the API key from the environment, and the frontend JavaScript never touches it. All Gemini API calls flow through named Rust commands.

The full src-tauri/src/main.rs entry point:

// src-tauri/src/main.rs
#![cfg_attr(not(debug_assertions), windows_subsystem = "windows")]
 
use reqwest::Client;
use serde::{Deserialize, Serialize};
use std::sync::OnceLock;
 
// Reuse a single HTTP client across all requests (connection pool)
static HTTP_CLIENT: OnceLock<Client> = OnceLock::new();
 
fn get_client() -> &'static Client {
    HTTP_CLIENT.get_or_init(|| {
        Client::builder()
            .timeout(std::time::Duration::from_secs(120))
            .build()
            .expect("Failed to initialize HTTP client")
    })
}
 
fn get_api_key() -> Result<String, String> {
    std::env::var("GEMINI_API_KEY").map_err(|_| {
        "GEMINI_API_KEY is not set. \
         Create a .env file in src-tauri/ with your key.".to_string()
    })
}
 
fn main() {
    // Load .env in debug builds only — release binaries never include this file
    #[cfg(debug_assertions)]
    dotenvy::dotenv().ok();
 
    tauri::Builder::default()
        .plugin(tauri_plugin_shell::init())
        .plugin(tauri_plugin_notification::init())
        .plugin(tauri_plugin_clipboard_manager::init())
        .invoke_handler(tauri::generate_handler![
            generate_text,
            generate_text_stream,
        ])
        .run(tauri::generate_context!())
        .expect("Failed to start Tauri application");
}

For production distribution, you have two practical strategies:

User-managed key: Let users enter their own Gemini API key on first launch, then store it in an encrypted vault via tauri-plugin-stronghold (a password-protected store, not the OS keychain). The user controls their key, their billing, and their usage. This works well for developer tools and power-user utilities where your audience already has API access.

Proxy backend: The app communicates with your own server (a Cloudflare Worker works well here), which holds the shared API key. The distributed binary contains no secrets. This adds a small round-trip latency overhead but is the correct choice for general consumer distribution.

Shared Request Structures and Error Handling

Define the Gemini API request shape once and reuse it across commands:

#[derive(Serialize, Deserialize, Clone)]
#[serde(rename_all = "camelCase")] // serialize generation_config as generationConfig
struct GeminiRequest {
    contents: Vec<GeminiContent>,
    #[serde(skip_serializing_if = "Option::is_none")]
    generation_config: Option<GenerationConfig>,
}
 
#[derive(Serialize, Deserialize, Clone)]
struct GeminiContent {
    role: String,
    parts: Vec<GeminiPart>,
}
 
#[derive(Serialize, Deserialize, Clone)]
struct GeminiPart {
    text: String,
}
 
#[derive(Serialize, Deserialize, Clone)]
#[serde(rename_all = "camelCase")]
struct GenerationConfig {
    temperature: f32,
    max_output_tokens: u32,
}

Single-Turn Text Generation

The simplest command — takes a prompt string, returns a response string, handles errors explicitly:

#[tauri::command]
async fn generate_text(prompt: String) -> Result<String, String> {
    let api_key = get_api_key()?;
    let client = get_client();

    // The key travels in the x-goog-api-key header rather than the query string:
    // reqwest includes the request URL in its error messages, so a key embedded
    // in the URL could reach the frontend through the error strings below.
    let url = "https://generativelanguage.googleapis.com/v1beta/\
               models/gemini-2.5-pro:generateContent";

    let request_body = GeminiRequest {
        contents: vec![GeminiContent {
            role: "user".to_string(),
            parts: vec![GeminiPart { text: prompt }],
        }],
        generation_config: Some(GenerationConfig {
            temperature: 0.7,
            max_output_tokens: 8192,
        }),
    };

    let response = client
        .post(url)
        .header("x-goog-api-key", api_key)
        .json(&request_body)
        .send()
        .await
        .map_err(|e| format!("Network error: {}", e))?;

    if !response.status().is_success() {
        let status = response.status();
        let body = response.text().await.unwrap_or_default();
        return Err(format!("Gemini API error (HTTP {}): {}", status, body));
    }

    let json: serde_json::Value = response
        .json()
        .await
        .map_err(|e| format!("Failed to parse response: {}", e))?;

    // Check for safety filter blocks before accessing the text
    if let Some(reason) = json["promptFeedback"]["blockReason"].as_str() {
        return Err(format!("Blocked by safety filter: {}", reason));
    }

    json["candidates"][0]["content"]["parts"][0]["text"]
        .as_str()
        .ok_or_else(|| "Response contained no text content".to_string())
        .map(|s| s.to_string())
}

Calling it from TypeScript:

import { invoke } from "@tauri-apps/api/core";
 
async function askGemini(prompt: string): Promise<string> {
    // A Rust Err(String) rejects the Promise with that string itself, not an Error object
    return await invoke<string>("generate_text", { prompt });
}

Real-Time Streaming via Tauri's Event System

Streaming is where the Tauri model requires a mental shift compared to web APIs. Instead of returning a ReadableStream or an AsyncGenerator, you use Tauri's window event system: Rust emits named events as SSE chunks arrive, and TypeScript listens for them.

The flow: TypeScript registers event listeners → calls invoke() → Rust opens the SSE connection to Gemini → emits each text chunk as a stream-chunk event → emits stream-done when finished → TypeScript accumulates chunks into state.

use futures_util::StreamExt;
use tauri::Emitter; // Tauri 2: the trait that provides `emit` on Window

#[tauri::command]
async fn generate_text_stream(
    window: tauri::Window,
    prompt: String,
) -> Result<(), String> {
    let api_key = get_api_key()?;
    let client = get_client();

    // The key goes in a header, not the URL, so it cannot leak through
    // reqwest error messages (which include the request URL).
    let url = "https://generativelanguage.googleapis.com/v1beta/\
               models/gemini-2.5-pro:streamGenerateContent?alt=sse";

    let request_body = GeminiRequest {
        contents: vec![GeminiContent {
            role: "user".to_string(),
            parts: vec![GeminiPart { text: prompt }],
        }],
        generation_config: Some(GenerationConfig {
            temperature: 0.7,
            max_output_tokens: 8192,
        }),
    };

    let response = client
        .post(url)
        .header("x-goog-api-key", api_key)
        .json(&request_body)
        .send()
        .await
        .map_err(|e| format!("Request failed: {}", e))?;

    if !response.status().is_success() {
        let error = response.text().await.unwrap_or_default();
        window.emit("stream-error", &error).ok();
        return Err(error);
    }

    let mut byte_stream = response.bytes_stream();
    // Buffer raw bytes rather than a String: a multi-byte UTF-8 character
    // (common in Japanese text) can be split across two chunks, and decoding
    // each chunk separately would corrupt it.
    let mut line_buffer: Vec<u8> = Vec::new();

    while let Some(chunk) = byte_stream.next().await {
        match chunk {
            Ok(bytes) => {
                line_buffer.extend_from_slice(&bytes);

                // SSE uses newlines to delimit events.
                // Process only complete lines; carry incomplete tail in the buffer.
                while let Some(newline_pos) = line_buffer.iter().position(|&b| b == b'\n') {
                    let raw: Vec<u8> = line_buffer.drain(..=newline_pos).collect();
                    let line = String::from_utf8_lossy(&raw).trim().to_string();

                    if let Some(data) = line.strip_prefix("data: ") {
                        // Defensive: handle a [DONE] sentinel if one is ever sent.
                        if data == "[DONE]" {
                            window.emit("stream-done", ()).ok();
                            return Ok(());
                        }
                        if let Ok(json) = serde_json::from_str::<serde_json::Value>(data) {
                            if let Some(text) =
                                json["candidates"][0]["content"]["parts"][0]["text"].as_str()
                            {
                                if !text.is_empty() {
                                    window.emit("stream-chunk", text).ok();
                                }
                            }
                        }
                    }
                }
            }
            Err(e) => {
                let msg = e.to_string();
                window.emit("stream-error", &msg).ok();
                return Err(msg);
            }
        }
    }

    window.emit("stream-done", ()).ok();
    Ok(())
}
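
The line-buffering step can also be isolated into a small, dependency-free type, which makes it easy to unit test. This is a minimal sketch rather than part of the original project: the name SseLineBuffer is hypothetical, and it assumes the server sends UTF-8 `data:` lines terminated by newlines. It holds raw bytes until a full line has arrived, so a multi-byte character split across two network chunks is decoded only once it is complete.

```rust
/// Accumulates raw bytes from a streaming HTTP body and yields the payload of
/// each complete `data:` line. Decoding happens per complete line, so a
/// multi-byte UTF-8 character split across two chunks stays intact.
#[derive(Default)]
pub struct SseLineBuffer {
    pending: Vec<u8>,
}

impl SseLineBuffer {
    /// Feed one network chunk; returns the `data:` payloads it completed.
    pub fn push(&mut self, chunk: &[u8]) -> Vec<String> {
        self.pending.extend_from_slice(chunk);
        let mut payloads = Vec::new();

        // Consume every complete line; an unterminated tail stays buffered.
        while let Some(pos) = self.pending.iter().position(|&b| b == b'\n') {
            let line: Vec<u8> = self.pending.drain(..=pos).collect();
            let text = String::from_utf8_lossy(&line);
            // trim() also removes the '\r' of CRLF line endings.
            if let Some(data) = text.trim().strip_prefix("data:") {
                payloads.push(data.trim_start().to_string());
            }
        }
        payloads
    }
}
```

Inside generate_text_stream, each chunk from bytes_stream() would be passed to push(), and each returned payload parsed with serde_json as before.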

The React Streaming Hook

This custom hook handles listener registration order, state accumulation, and cleanup:

// src/hooks/useGeminiStream.ts
import { invoke } from "@tauri-apps/api/core";
import { listen, type UnlistenFn } from "@tauri-apps/api/event";
import { useState, useCallback } from "react";
 
export function useGeminiStream() {
    const [response, setResponse] = useState("");
    const [isStreaming, setIsStreaming] = useState(false);
    const [error, setError] = useState<string | null>(null);
 
    const sendMessage = useCallback(
        async (prompt: string) => {
            if (!prompt.trim() || isStreaming) return;
 
            setResponse("");
            setError(null);
            setIsStreaming(true);
 
            const unlisteners: UnlistenFn[] = [];
 
            try {
                // Register listeners BEFORE calling invoke().
                // Calling invoke() first risks missing initial chunks.
                unlisteners.push(
                    await listen<string>("stream-chunk", (event) => {
                        setResponse((prev) => prev + event.payload);
                    })
                );
                unlisteners.push(
                    await listen("stream-done", () => {
                        setIsStreaming(false);
                    })
                );
                unlisteners.push(
                    await listen<string>("stream-error", (event) => {
                        setError(`Stream error: ${event.payload}`);
                        setIsStreaming(false);
                    })
                );
 
                await invoke("generate_text_stream", { prompt });
            } catch (err) {
                setError(err instanceof Error ? err.message : String(err));
                setIsStreaming(false);
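            } finally {
                // Always remove listeners: leaking them causes subtle bugs
                // when the same event names are reused across sessions
                unlisteners.forEach((fn) => fn());
                // The command has returned, so the stream is over even if the
                // stream-done event was not delivered before unlistening.
                setIsStreaming(false);
            }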
        },
        [isStreaming]
    );
 
    const reset = useCallback(() => {
        setResponse("");
        setError(null);
    }, []);
 
    return { response, isStreaming, error, sendMessage, reset };
}

Wiring It Into the UI

// src/App.tsx
import { useState, useRef, useEffect } from "react";
import { useGeminiStream } from "./hooks/useGeminiStream";
import "./App.css";
 
export default function App() {
    const [prompt, setPrompt] = useState("");
    const { response, isStreaming, error, sendMessage, reset } = useGeminiStream();
    const responseRef = useRef<HTMLDivElement>(null);
 
    // Auto-scroll as new content streams in
    useEffect(() => {
        if (responseRef.current) {
            responseRef.current.scrollTop = responseRef.current.scrollHeight;
        }
    }, [response]);
 
    const handleSubmit = async (e: React.FormEvent) => {
        e.preventDefault();
        const trimmed = prompt.trim();
        if (!trimmed) return;
        setPrompt("");
        await sendMessage(trimmed);
    };
 
    return (
        <div className="container">
            <div className="response-area" ref={responseRef}>
                {error ? (
                    <div className="error-message">{error}</div>
                ) : response ? (
                    <div className="response-text">
                        {response}
                        {isStreaming && <span className="typing-cursor">▋</span>}
                    </div>
                ) : (
                    <div className="placeholder">
                        Ask Gemini anything — responses stream in real time
                    </div>
                )}
            </div>
 
            <form className="input-area" onSubmit={handleSubmit}>
                <textarea
                    value={prompt}
                    onChange={(e) => setPrompt(e.target.value)}
                    placeholder="Enter your prompt… (⌘+Enter to send)"
                    disabled={isStreaming}
                    rows={3}
                    onKeyDown={(e) => {
                        if (e.key === "Enter" && (e.metaKey || e.ctrlKey)) {
                            handleSubmit(e as unknown as React.FormEvent);
                        }
                    }}
                />
                <div className="button-row">
                    <button type="submit" disabled={isStreaming || !prompt.trim()}>
                        {isStreaming ? "Generating…" : "Send"}
                    </button>
                    <button
                        type="button"
                        onClick={reset}
                        disabled={isStreaming}
                        className="secondary"
                    >
                        Clear
                    </button>
                </div>
            </form>
        </div>
    );
}

Native OS Feature Integration

This is where desktop apps genuinely justify themselves over web alternatives. Clipboard access, system notifications, and filesystem operations create experiences that browser sandboxing simply can't replicate.

First, grant necessary permissions in src-tauri/capabilities/default.json:

{
  "identifier": "default",
  "description": "Default capability set",
  "windows": ["main"],
  "permissions": [
    "core:default",
    "clipboard-manager:allow-read-text",
    "clipboard-manager:allow-write-text",
    "notification:default",
    "shell:allow-open"
  ]
}

Clipboard AI Summarizer

Copy text in any application, trigger the shortcut, and get an AI summary written back to your clipboard:

import { readText, writeText } from "@tauri-apps/plugin-clipboard-manager";
import { sendNotification } from "@tauri-apps/plugin-notification";
import { invoke } from "@tauri-apps/api/core";
 
async function summarizeClipboard(): Promise<void> {
    const text = await readText();
 
    if (!text?.trim()) {
        await sendNotification({
            title: "Gemini AI",
            body: "Clipboard is empty or contains no readable text.",
        });
        return;
    }
 
    // Truncate very long clipboard content to avoid runaway token costs
    const maxLength = 8000;
    const truncated = text.length > maxLength
        ? text.slice(0, maxLength) + "\n[truncated — content was too long]"
        : text;
 
    const prompt = `Summarize the following text in three concise bullet points. \
Use plain text, no markdown formatting:\n\n${truncated}`;
 
    try {
        const summary = await invoke<string>("generate_text", { prompt });
        await writeText(summary);
        await sendNotification({
            title: "Gemini AI — Summary ready",
            body: "Result written to clipboard.",
        });
    } catch (err) {
        await sendNotification({
            title: "Gemini AI — Error",
            body: typeof err === "string" ? err : "An unexpected error occurred.",
        });
    }
}

Pair this with a global keyboard shortcut using tauri-plugin-global-shortcut, and you've built a system-wide AI utility that operates independently of which app is in focus. That's a fundamentally desktop-native capability.
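
The truncation in the snippet above happens in TypeScript, where slice counts UTF-16 code units. If you would rather enforce a limit inside the Rust command as well, one possible approach is to cut at a character boundary. The helper name truncate_chars is hypothetical and the sketch uses only the standard library. It counts Unicode scalar values, so its limit is not identical to the JavaScript one.

```rust
/// Returns `text` cut to at most `max_chars` characters, appending `marker`
/// when anything was removed. Slicing at a `char_indices` offset guarantees
/// the cut never lands inside a multi-byte UTF-8 sequence.
pub fn truncate_chars(text: &str, max_chars: usize, marker: &str) -> String {
    match text.char_indices().nth(max_chars) {
        // `byte_idx` is the start of the first character beyond the limit.
        Some((byte_idx, _)) => format!("{}{}", &text[..byte_idx], marker),
        // The text has `max_chars` characters or fewer: keep it unchanged.
        None => text.to_string(),
    }
}
```

Calling this at the top of generate_text would cap the prompt size regardless of which frontend code path invoked the command.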

System Tray Integration

For background AI tools, adding a system tray icon lets the app run invisibly until summoned. In Tauri 2 the tray is part of the core crate rather than a separate plugin: enable the tray-icon feature on the tauri dependency and build the icon with tauri::tray::TrayIconBuilder. The pattern is: register the global shortcut on startup → bring the window to front on shortcut activation → hide back to tray on blur. This creates the "always available, never in the way" UX that the best desktop productivity tools share.

Common Pitfalls and How to Fix Them

These are issues I've hit across several Tauri + Gemini projects. None are obvious from the documentation alone.

Pitfall 1: Missing the first stream chunks

If you register event listeners after calling invoke(), you'll miss whichever chunks Rust has already emitted during the registration window. Always await listen(...) before await invoke(...). The custom hook above enforces this by registering all three listeners before the invoke call.

Pitfall 2: Incomplete reqwest feature flags

Using JSON response parsing and streaming together requires features = ["json", "stream"] in Cargo.toml. Omitting stream makes bytes_stream() fail at compile time with a "method not found" error that never mentions the missing feature. Also, avoid reqwest's blocking client inside Tauri commands: it can panic when called from within an async runtime, so always use the async API.

Pitfall 3: Capabilities config vs. tauri.conf.json

Tauri 2.0 moved permission management from tauri.conf.json into the src-tauri/capabilities/ directory. Adding a plugin to Cargo.toml and registering it in main() isn't enough; you also need the corresponding permission in capabilities/default.json. A "not allowed" or "permission denied" error when calling a plugin API at runtime almost always means a missing capability entry, not a code bug.

Pitfall 4: Cross-compilation doesn't work

You cannot build a Windows MSI installer from a macOS machine, or vice versa. Each platform's native bundle format requires the target OS. The practical solution: set up GitHub Actions with windows-latest, macos-14, and ubuntu-22.04 runner matrices and upload the artifacts. This is the standard Tauri release workflow.

# .github/workflows/release.yml
name: Release
on:
  push:
    tags: ["v*"]
 
jobs:
  build:
    strategy:
      matrix:
        os: [windows-latest, macos-14, ubuntu-22.04]
    runs-on: ${{ matrix.os }}
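    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
      # Linux runners need the WebKitGTK development packages to build Tauri
      - if: matrix.os == 'ubuntu-22.04'
        run: |
          sudo apt-get update
          sudo apt-get install -y libwebkit2gtk-4.1-dev libappindicator3-dev librsvg2-dev patchelf
      - run: npm ci
      - run: npm run tauri build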
      - uses: actions/upload-artifact@v4
        with:
          name: bundle-${{ matrix.os }}
          path: src-tauri/target/release/bundle/

Pitfall 5: The windows_subsystem attribute

Without #![cfg_attr(not(debug_assertions), windows_subsystem = "windows")] at the top of main.rs, release builds on Windows open a terminal window behind the app on launch. This is the correct attribute for GUI apps. It's in the scaffold by default but disappears if you start main.rs from scratch.

Pitfall 6: Safety filter false positives on Japanese content

If you're serving Japanese-speaking users (or building content that includes Japanese text), the default safety filter thresholds can occasionally block benign prompts. When a prompt is blocked, candidates is empty or missing and promptFeedback.blockReason contains the reason. Always check for this before trying to extract text from the response. Otherwise you'll get a misleading "no text in response" error instead of a meaningful message.
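
The order of those two checks can be captured in a small pure function. The sketch below is illustrative only: ReplyFields, ReplyError, and extract_reply are hypothetical names, and the struct assumes the two JSON fields have already been pulled out of the response (for example with serde_json, as in generate_text).

```rust
/// Hypothetical, already-decoded view of the two response fields this
/// guide reads from a generateContent response.
pub struct ReplyFields {
    /// Value of promptFeedback.blockReason, if present.
    pub block_reason: Option<String>,
    /// Value of candidates[0].content.parts[0].text, if present.
    pub first_text: Option<String>,
}

#[derive(Debug, PartialEq)]
pub enum ReplyError {
    /// The prompt was blocked; carries the reported reason.
    Blocked(String),
    /// No block was reported, but there was no text either.
    Empty,
}

pub fn extract_reply(fields: ReplyFields) -> Result<String, ReplyError> {
    // Check the block reason first, so a blocked prompt is reported as such
    // instead of falling through to the generic "empty" error.
    if let Some(reason) = fields.block_reason {
        return Err(ReplyError::Blocked(reason));
    }
    fields
        .first_text
        .filter(|text| !text.is_empty())
        .ok_or(ReplyError::Empty)
}
```

Keeping the two failure modes as separate enum variants lets the frontend show a different message for a blocked prompt than for an unexpectedly empty reply.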

Production Build and Code Signing

Run npm run tauri dev during development for hot reload. For production:

npm run tauri build
 
# Outputs (per platform):
# macOS:   src-tauri/target/release/bundle/dmg/*.dmg
#          src-tauri/target/release/bundle/macos/*.app
# Windows: src-tauri/target/release/bundle/msi/*.msi
#          src-tauri/target/release/bundle/nsis/*.exe
# Linux:   src-tauri/target/release/bundle/deb/*.deb
#          src-tauri/target/release/bundle/rpm/*.rpm

macOS code signing and notarization: Unsigned macOS apps trigger Gatekeeper's "unidentified developer" or "malicious software" alert. Most non-technical users won't know how to override this. With an Apple Developer ID certificate (Apple Developer Program, $99/year), set bundle.macOS.signingIdentity and bundle.macOS.providerShortName in tauri.conf.json. Tauri then signs the app during tauri build, and also notarizes it when Apple credentials are supplied through environment variables (for example APPLE_ID, APPLE_PASSWORD, and APPLE_TEAM_ID).

Windows: SmartScreen warnings appear for unsigned executables. For developer tools distributed to technical audiences, this is generally acceptable. For broader consumer distribution, an EV (Extended Validation) code signing certificate removes the warning.

Linux: Most Linux distribution methods (Flatpak, Snap, AUR) have their own verification processes. If you're targeting desktop Linux users, DEB and RPM packages without signing are commonly accepted for personal distributions.

What to Build Next

The Rust command structure you've built here extends cleanly in several directions. Multi-turn conversations work by passing a full contents array — alternating user and model roles — instead of a single message, which maps naturally to a conversation history array in your React state. Multimodal inputs (images, documents) extend the GeminiPart struct to include inline_data with base64-encoded content alongside the text parts.

For deeper coverage of the Gemini API streaming internals that power the Rust layer in this guide, see Implementing Streaming Responses and Multi-Turn Conversations with the Gemini API. The Function Calling patterns in Gemini API Function Calling — Complete Beginner's Guide can extend your desktop app into a genuine local AI agent that calls system tools on your behalf.

The combination of Gemini's reasoning capability and Tauri's native OS access is genuinely interesting territory. Start with the streaming hook working end-to-end — once you see response text appearing in real time from a Rust-backed command, the rest of the architecture opens up naturally.

Managing Application State with Tauri's State API

When your app grows beyond a single page, you'll need to share state — like the HTTP client or an API configuration — across multiple Tauri commands without passing it as a parameter every time. Tauri provides a managed state system for this.

use std::sync::Mutex;
 
// Define application state to share across commands
struct AppState {
    api_key: String,
    client: reqwest::Client,
    conversation_history: Mutex<Vec<GeminiContent>>,
}
 
// Register state in main()
fn main() {
    #[cfg(debug_assertions)]
    dotenvy::dotenv().ok();
 
    let api_key = std::env::var("GEMINI_API_KEY")
        .expect("GEMINI_API_KEY must be set");
 
    let client = reqwest::Client::builder()
        .timeout(std::time::Duration::from_secs(120))
        .build()
        .expect("Failed to build HTTP client");
 
    tauri::Builder::default()
        .manage(AppState {
            api_key,
            client,
            conversation_history: Mutex::new(Vec::new()),
        })
        .invoke_handler(tauri::generate_handler![
            send_chat_message,
            reset_conversation,
        ])
        .run(tauri::generate_context!())
        .expect("Failed to run app");
}

Commands access state via the tauri::State parameter:

#[tauri::command]
async fn send_chat_message(
    message: String,
    state: tauri::State<'_, AppState>,
) -> Result<String, String> {
    let user_turn = GeminiContent {
        role: "user".to_string(),
        parts: vec![GeminiPart { text: message }],
    };

    // Snapshot the history plus the new user turn. The stored history is only
    // updated after a successful reply, so a failed request leaves no dangling
    // user message behind. The lock guard is dropped before any .await.
    let contents = {
        let history = state.conversation_history.lock()
            .map_err(|_| "Failed to acquire conversation lock".to_string())?;
        let mut contents = history.clone();
        contents.push(user_turn.clone());
        contents
    };

    // The key goes in a header, not the URL, so it cannot leak through
    // reqwest error messages (which include the request URL).
    let url = "https://generativelanguage.googleapis.com/v1beta/\
               models/gemini-2.5-pro:generateContent";

    let request_body = GeminiRequest {
        contents,
        generation_config: None,
    };

    let response = state.client
        .post(url)
        .header("x-goog-api-key", state.api_key.as_str())
        .json(&request_body)
        .send()
        .await
        .map_err(|e| format!("Network error: {}", e))?;

    if !response.status().is_success() {
        let status = response.status();
        let body = response.text().await.unwrap_or_default();
        return Err(format!("Gemini API error (HTTP {}): {}", status, body));
    }

    let json: serde_json::Value = response
        .json()
        .await
        .map_err(|e| format!("Parse error: {}", e))?;

    let reply_text = json["candidates"][0]["content"]["parts"][0]["text"]
        .as_str()
        .ok_or("No response text")?
        .to_string();

    // Record the completed exchange: user turn, then model reply
    {
        let mut history = state.conversation_history.lock()
            .map_err(|_| "Lock error".to_string())?;
        history.push(user_turn);
        history.push(GeminiContent {
            role: "model".to_string(),
            parts: vec![GeminiPart { text: reply_text.clone() }],
        });

        // Sliding window: keep only the last 20 turns (10 exchanges)
        // This prevents unbounded growth and keeps token costs predictable
        if history.len() > 20 {
            let excess = history.len() - 20;
            history.drain(0..excess);
        }
    }

    Ok(reply_text)
}
 
#[tauri::command]
fn reset_conversation(state: tauri::State<'_, AppState>) -> Result<(), String> {
    let mut history = state.conversation_history.lock()
        .map_err(|_| "Lock error".to_string())?;
    history.clear();
    Ok(())
}

The Mutex<Vec<GeminiContent>> pattern is straightforward but blocks the thread while locked. For high-throughput scenarios, tokio::sync::Mutex (async-aware) is more appropriate. For a desktop chat app with one active conversation at a time, the standard std::sync::Mutex is fine and simpler.
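
The sliding-window trim can also live in a standalone helper so it is testable without a Mutex or a running app. This is a sketch under one stated assumption: that you want the trimmed history to keep starting with a user turn, which means dropping whole user/model pairs. The name trim_history is hypothetical, and the function is generic so it does not depend on GeminiContent.

```rust
/// Keeps at most `max_len` of the most recent items, dropping the oldest.
/// An even number of items is always removed so that, in a history that
/// alternates user/model turns, the first remaining entry is a user turn.
pub fn trim_history<T>(history: &mut Vec<T>, max_len: usize) {
    if history.len() <= max_len {
        return;
    }
    let mut excess = history.len() - max_len;
    // Round an odd excess up to the next even number to drop whole exchanges.
    if excess % 2 == 1 {
        excess += 1;
    }
    // Rounding up can exceed the length only when max_len is 0 and len is odd.
    let excess = excess.min(history.len());
    history.drain(..excess);
}
```

With an odd excess the result is one item shorter than max_len, which trades a single turn of context for a history that always begins with a user message.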

Testing Your Commands in Isolation

One of the less-discussed advantages of putting logic in Rust commands is testability. You can write unit tests for the API response parsing logic without spinning up a full Tauri app:

#[cfg(test)]
mod tests {
    use super::*;
    use serde_json::json;
 
    #[test]
    fn test_extract_text_from_valid_response() {
        let response = json!({
            "candidates": [{
                "content": {
                    "parts": [{ "text": "Hello, world!" }]
                }
            }]
        });

        let text = response["candidates"][0]["content"]["parts"][0]["text"]
            .as_str()
            .unwrap();

        assert_eq!(text, "Hello, world!");
    }
 
    #[test]
    fn test_blocked_response_detection() {
        let response = json!({
            "promptFeedback": {
                "blockReason": "SAFETY"
            },
            "candidates": []
        });

        let is_blocked = response["promptFeedback"]["blockReason"].is_string();
        assert!(is_blocked);
    }
 
    #[test]
    fn test_conversation_history_sliding_window() {
        let mut history: Vec<GeminiContent> = Vec::new();
 
        // Simulate filling beyond the 20-message limit
        for i in 0..25 {
            history.push(GeminiContent {
                role: if i % 2 == 0 { "user".to_string() } else { "model".to_string() },
                parts: vec![GeminiPart { text: format!("message {}", i) }],
            });
        }
 
        if history.len() > 20 {
            let excess = history.len() - 20;
            history.drain(0..excess);
        }
 
        assert_eq!(history.len(), 20);
        // Oldest messages removed; latest messages retained
        assert_eq!(history[0].parts[0].text, "message 5");
    }
}

Run tests with cargo test from the src-tauri/ directory. This gives you fast feedback on your data processing logic without the overhead of launching the full GUI.

Distributing Your App

Beyond just building the binaries, there are a few distribution considerations specific to AI desktop apps.

Auto-update: Tauri ships tauri-plugin-updater for automatic updates. When you push a new version, the app can check a GitHub releases endpoint, download the update, and apply it without user intervention. For a Gemini-powered app that you'll iterate on frequently (as the API evolves), this is worth setting up from the start.

First-launch onboarding: If users need to enter their own API key, the first-launch experience matters. A simple approach: check for a stored key on startup, prompt if it is missing, and store it using tauri-plugin-stronghold (an encrypted, password-protected vault rather than a plain file). If you need the app data directory, note that Tauri 2.0 exposes it as app.path().app_data_dir() through the tauri::Manager trait; tauri::api::path::app_data_dir() is the 1.x API. Present a link to https://aistudio.google.com/ to help users create a key.

Crash reporting: For production apps, integrating Sentry or a similar crash reporter helps you catch panics and API failures you didn't anticipate. The Tauri plugin ecosystem has tauri-plugin-sentry for this, though basic std::panic::set_hook logging to a local file is a reasonable starting point for personal tools.

Putting It All Together

To recap the architecture:

The Rust backend (src-tauri/src/main.rs) is responsible for: holding the API key, building and sending HTTP requests to Gemini, parsing responses, managing conversation history, and emitting events during streaming. The TypeScript frontend is responsible for: calling named Rust commands via invoke(), listening for events via listen(), managing UI state, and handling user input.

This separation is clean and easy to reason about. The Rust layer handles everything that touches the network or secrets. The TypeScript layer handles everything that touches the user. Neither layer bleeds into the other's domain.

The resulting app bundle is 8–15 MB, starts in under a second, idles at 20–40 MB RAM, and can be distributed to macOS, Windows, and Linux users without requiring them to have Node.js, Python, or any runtime installed. For a solo-built AI productivity tool, that's a compelling set of properties.

A Note on Choosing Between Tauri and Other Approaches

Before committing to Tauri, it's worth briefly mapping when the approach doesn't fit. If your team has strong web experience and weak Rust familiarity, the Rust learning curve adds meaningful ramp-up time. Electron remains the pragmatic choice in that scenario — the ecosystem is more mature, the hiring pool is larger, and the tooling is more polished. PWAs (Progressive Web Apps) are worth considering if you don't need deep OS integration or offline-first capabilities that go beyond what service workers provide.

Tauri shines when you have: a need to keep secrets out of the client, a desire for small bundle size and fast startup, an audience that values a native-feeling app, or a requirement for OS integrations that browsers explicitly block. Gemini API-powered desktop tools — local document analyzers, clipboard utilities, system-wide summarizers — fit this profile well.

For Gemini API integration specifically, the Gemini API Function Calling — Complete Beginner's Guide is a natural companion to this article. With Tauri's native OS access and Gemini's function calling capability, you can build agents that read your local filesystem, analyze files, and write results back — without any data ever leaving the user's machine beyond the text sent to the API.

Why Electron — Choosing the Right Deployment Target

Before writing a line of code, it's worth asking whether Electron is the right choice for your use case.

PWAs are great when you want zero-install distribution and seamless updates. The limitation is filesystem access — the File System Access API has improved, but you still can't recursively read an arbitrary folder the user points you to, which rules out many "local AI assistant" scenarios.

Mobile apps (React Native, Flutter) are the right choice when your users are on iOS or Android. But if your target is "Mac or Windows desktop, working alongside the user's existing files and workflows," mobile doesn't help.

Electron wins when you need:

Full local filesystem access (reading entire codebases, batch processing folders of PDFs, writing output files directly)
System tray integration for always-on AI assistants
Native OS features like global hotkeys, clipboard access, and desktop notifications
A packaged installer that non-developers can double-click and run

For comparison, see the Gemini API × PWA Complete Implementation Guide for cases where a PWA is a better fit.

Project Setup and Architecture

The fastest path to a working Electron + TypeScript + Vite project is electron-vite:

npm create electron-vite@latest gemini-desktop -- --template react-ts
cd gemini-desktop
npm install
npm install @google/genai keytar electron-store
npm install -D @types/node

@google/genai — Google's official JavaScript/TypeScript SDK for the Gemini API
keytar — stores secrets in the OS keychain (macOS Keychain, Windows Credential Manager, Linux libsecret)
electron-store — simple persistent storage for non-sensitive app settings

The project structure that matters most:

gemini-desktop/
├── src/
│   ├── main/
│   │   ├── index.ts          ← Main process (Gemini API calls go HERE ONLY)
│   │   ├── gemini.ts         ← Gemini API wrapper
│   │   └── ipc-handlers.ts   ← IPC handler registration
│   ├── preload/
│   │   └── index.ts          ← Safe bridge between main and renderer
│   └── renderer/
│       └── src/
│           └── App.tsx        ← UI layer (never calls Gemini API directly)

The guiding principle: all Gemini API calls live in the main process. The renderer never touches an API key or the SDK directly.

Secure API Key Management — The Design Decision That Changes Everything

The most common Electron security mistake I see in tutorials is placing the API key in the renderer process:

// ❌ WRONG — renderer/src/App.tsx
import { GoogleGenAI } from '@google/genai';
const ai = new GoogleGenAI({ apiKey: 'AIza...' }); // Key is exposed in DevTools

Anyone who opens DevTools in your packaged app can read this key. Electron apps ship with Node.js built in, and extracting the source from an .asar archive takes about thirty seconds.

The correct pattern — API key lives only in the main process, stored in the OS keychain:

// ✅ main/gemini.ts
import { GoogleGenAI } from '@google/genai';
import keytar from 'keytar';
 
const SERVICE_NAME = 'gemini-desktop';
const ACCOUNT_NAME = 'gemini-api-key';
 
export async function getGenAI(): Promise<GoogleGenAI | null> {
  const apiKey = await keytar.getPassword(SERVICE_NAME, ACCOUNT_NAME);
  if (!apiKey) return null;
  return new GoogleGenAI({ apiKey });
}
 
export async function saveApiKey(apiKey: string): Promise<void> {
  await keytar.setPassword(SERVICE_NAME, ACCOUNT_NAME, apiKey);
}
 
export async function deleteApiKey(): Promise<void> {
  await keytar.deletePassword(SERVICE_NAME, ACCOUNT_NAME);
}

The preload script creates a typed bridge that exposes only the operations the renderer needs — nothing more:

// preload/index.ts
import { contextBridge, ipcRenderer } from 'electron';
 
contextBridge.exposeInMainWorld('geminiAPI', {
  saveApiKey: (key: string) => ipcRenderer.invoke('save-api-key', key),
  checkApiKey: () => ipcRenderer.invoke('check-api-key'),
  sendMessage: (message: string, history: unknown[]) =>
    ipcRenderer.invoke('send-message', message, history),
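  onStreamChunk: (callback: (chunk: string) => void) => {
    const listener = (_event: unknown, chunk: string) => callback(chunk);
    ipcRenderer.on('stream-chunk', listener);
    // Remove only this listener so other subscribers on the channel keep working
    return () => { ipcRenderer.removeListener('stream-chunk', listener); };
  },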
});

contextBridge.exposeInMainWorld is the key API here. The renderer gets window.geminiAPI with a fixed set of methods — it cannot access ipcRenderer directly, cannot require Node modules, and has no path to the API key.

Streaming Chat Implementation

The main process IPC handler for chat uses webContents.send() to push stream chunks to the renderer in real time:

// main/ipc-handlers.ts
import { ipcMain, BrowserWindow } from 'electron';
import { getGenAI } from './gemini';
 
export function registerIpcHandlers(mainWindow: BrowserWindow): void {
 
  ipcMain.handle('check-api-key', async () => {
    const ai = await getGenAI();
    return ai !== null;
  });
 
  ipcMain.handle('save-api-key', async (_event, apiKey: string) => {
    const { saveApiKey } = await import('./gemini');
    await saveApiKey(apiKey);
    return { success: true };
  });
 
  ipcMain.handle('send-message', async (
    _event,
    userMessage: string,
    history: Array<{ role: string; parts: string }>
  ) => {
    const ai = await getGenAI();
    if (!ai) return { error: 'API key not configured' };
 
    try {
      const contents = history.map(h => ({
        role: h.role as 'user' | 'model',
        parts: [{ text: h.parts }],
      }));
      contents.push({ role: 'user', parts: [{ text: userMessage }] });
 
      const result = await ai.models.generateContentStream({
        model: 'gemini-2.5-flash',
        contents,
      });
 
      let fullResponse = '';
      for await (const chunk of result) {
        const text = chunk.text ?? '';
        if (text) {
          fullResponse += text;
          mainWindow.webContents.send('stream-chunk', text);
        }
      }
      mainWindow.webContents.send('stream-chunk', '__DONE__');
      return { success: true, fullText: fullResponse };
 
    } catch (error) {
      const message = error instanceof Error ? error.message : 'Unknown error';
      return { error: message };
    }
  });
}

The React component in the renderer:

// renderer/src/App.tsx
import { useState, useEffect, useRef } from 'react';
 
interface Message {
  role: 'user' | 'model';
  content: string;
}
 
export default function App() {
  const [messages, setMessages] = useState<Message[]>([]);
  const [input, setInput] = useState('');
  const [streaming, setStreaming] = useState(false);
  const bufferRef = useRef('');
 
  useEffect(() => {
    const cleanup = window.geminiAPI.onStreamChunk((chunk: string) => {
      if (chunk === '__DONE__') {
        setStreaming(false);
        bufferRef.current = '';
      } else {
        bufferRef.current += chunk;
        setMessages(prev => {
          const msgs = [...prev];
          const last = msgs[msgs.length - 1];
          if (last?.role === 'model') {
            msgs[msgs.length - 1] = { ...last, content: bufferRef.current };
          }
          return msgs;
        });
      }
    });
    return cleanup;
  }, []);
 
  const handleSend = async () => {
    if (!input.trim() || streaming) return;
    const userMessage = input.trim();
    setInput('');
    setStreaming(true);
    setMessages(prev => [
      ...prev,
      { role: 'user', content: userMessage },
      { role: 'model', content: '' },
    ]);
    const history = messages.map(m => ({ role: m.role, parts: m.content }));
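    const result = await window.geminiAPI.sendMessage(userMessage, history);
    // On failure the main process never sends '__DONE__', so unlock the UI here
    if (result?.error) {
      bufferRef.current = '';
      setStreaming(false);
      setMessages(prev => [
        ...prev.slice(0, -1),
        { role: 'model', content: `Error: ${result.error}` },
      ]);
    }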
  };
 
  return (
    <div className="chat-container">
      <div className="messages">
        {messages.map((msg, i) => (
          <div key={i} className={`message ${msg.role}`}>
            <strong>{msg.role === 'user' ? 'You' : 'Gemini'}:</strong>
            <p>{msg.content}</p>
          </div>
        ))}
      </div>
      <div className="input-row">
        <textarea
          value={input}
          onChange={e => setInput(e.target.value)}
          onKeyDown={e => { if (e.key === 'Enter' && !e.shiftKey) { e.preventDefault(); handleSend(); } }}
          placeholder="Type a message (Shift+Enter for newline)"
          disabled={streaming}
        />
        <button onClick={handleSend} disabled={streaming}>
          {streaming ? 'Generating...' : 'Send'}
        </button>
      </div>
    </div>
  );
}
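
The handler and component above signal completion by sending the literal string '__DONE__' on the same channel as the text. That would misfire if the model ever produced that exact chunk, and it gives the renderer no way to distinguish an error from a normal finish. One possible alternative is a tagged event type with a pure reducer, sketched below. StreamEvent, StreamState, and applyStreamEvent are hypothetical names introduced for illustration; they are not part of the code above or of any Electron API.

```typescript
// Hypothetical event shape to send over 'stream-chunk' instead of raw strings.
type StreamEvent =
  | { kind: "chunk"; text: string }
  | { kind: "done" }
  | { kind: "error"; message: string };

interface StreamState {
  buffer: string;     // text accumulated for the in-progress reply
  streaming: boolean; // true while more chunks are expected
  error?: string;     // set when the stream ended with a failure
}

const initialStreamState: StreamState = { buffer: "", streaming: true };

// Pure reducer: returns the next state without mutating the previous one.
function applyStreamEvent(state: StreamState, event: StreamEvent): StreamState {
  switch (event.kind) {
    case "chunk":
      return { ...state, buffer: state.buffer + event.text };
    case "done":
      return { ...state, streaming: false };
    case "error":
      return { ...state, streaming: false, error: event.message };
  }
}
```

Because the sentinel is now a separate variant, a chunk whose text happens to be "__DONE__" is treated as ordinary content, and the renderer can reset its streaming flag on either "done" or "error".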

One timing issue to watch for: registerIpcHandlers(mainWindow) must be called after the BrowserWindow is created. If it runs earlier, mainWindow itself is still undefined, and the first mainWindow.webContents.send() call will throw.

Function Calling With Local OS Resources

This is where Electron earns its place over a web app. With Function Calling, you can give Gemini the ability to read files, list directories, or run local commands — all through the secure main process.

// main/ipc-handlers.ts (Function Calling section)
import { Tool } from '@google/genai';
import fs from 'fs/promises';
import path from 'path';
 
const localTools: Tool[] = [
  {
    functionDeclarations: [
      {
        name: 'list_files',
        description: 'List files in a local directory',
        parameters: {
          type: 'OBJECT',
          properties: {
            directory: { type: 'STRING', description: 'Absolute path of the directory' },
            extension: { type: 'STRING', description: 'File extension filter (e.g. .txt). Omit for all files.' },
          },
          required: ['directory'],
        },
      },
      {
        name: 'read_file',
        description: 'Read a text file from the local filesystem',
        parameters: {
          type: 'OBJECT',
          properties: {
            file_path: { type: 'STRING', description: 'Absolute path to the file' },
          },
          required: ['file_path'],
        },
      },
    ],
  },
];
 
async function executeTool(name: string, args: Record<string, string>): Promise<string> {
  switch (name) {
    case 'list_files': {
      const entries = await fs.readdir(args.directory, { withFileTypes: true });
      const files = entries
        .filter(e => e.isFile())
        .map(e => e.name)
        .filter(n => !args.extension || n.endsWith(args.extension));
      return JSON.stringify({ files, count: files.length });
    }
    case 'read_file': {
      // Normalize the path. resolve() alone does not stop traversal:
      // the result must still be checked against an allowed root directory.
      const resolved = path.resolve(args.file_path);
      const content = await fs.readFile(resolved, 'utf-8');
      return content.length > 4000
        ? content.slice(0, 4000) + '\n[...truncated...]'
        : content;
    }
    default:
      return JSON.stringify({ error: `Unknown tool: ${name}` });
  }
}

The agent loop (up to 5 iterations to prevent infinite loops):

ipcMain.handle('send-agent-message', async (_event, userMessage: string) => {
  const ai = await getGenAI();
  if (!ai) return { error: 'API key not configured' };
 
  const contents: Array<{ role: 'user' | 'model'; parts: unknown[] }> = [
    { role: 'user', parts: [{ text: userMessage }] },
  ];
 
  for (let i = 0; i < 5; i++) {
    const response = await ai.models.generateContent({
      model: 'gemini-2.5-flash',
      contents,
      config: { tools: localTools },
    });
 
    const candidate = response.candidates?.[0];
    if (!candidate) break;
 
    const hasFunctionCall = candidate.content.parts.some((p: unknown) => (p as { functionCall?: unknown }).functionCall);
 
    if (!hasFunctionCall) {
      const text = candidate.content.parts.find((p: unknown) => (p as { text?: string }).text) as { text: string } | undefined;
      mainWindow.webContents.send('stream-chunk', text?.text ?? '');
      mainWindow.webContents.send('stream-chunk', '__DONE__');
      return { success: true };
    }
 
    contents.push({ role: 'model', parts: candidate.content.parts });
 
    const toolResults = [];
    for (const part of candidate.content.parts) {
      const fc = (part as { functionCall?: { name: string; args: Record<string, string> } }).functionCall;
      if (!fc) continue;
      const result = await executeTool(fc.name, fc.args);
      toolResults.push({ functionResponse: { name: fc.name, response: { output: result } } });
    }
    contents.push({ role: 'user', parts: toolResults });
  }
  return { error: 'Max iterations reached' };
});

For production path security, add a whitelist check after path.resolve() to ensure the resolved path starts with an allowed root directory. See Gemini API Function Calling Complete Guide for more advanced patterns.
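
The allow-list check described above could look like the following sketch. It compares paths with path.relative() rather than a plain string prefix, because a prefix test would accept a sibling such as /home/user/docs-secret when the allowed root is /home/user/docs. isInsideRoot and isPathAllowed are hypothetical helper names, and the sketch does not resolve symlinks (fs.realpath would be needed for that).

```typescript
import * as path from "node:path";

// Returns true only when `candidate` resolves to `root` itself or to a path inside it.
function isInsideRoot(root: string, candidate: string): boolean {
  const resolvedRoot = path.resolve(root);
  const resolvedCandidate = path.resolve(candidate);
  const relative = path.relative(resolvedRoot, resolvedCandidate);
  // Empty string: the candidate is the root directory itself.
  if (relative === "") return true;
  // A leading ".." segment, or an absolute result (e.g. another drive on Windows),
  // means the candidate escaped the root.
  const escapes = relative === ".." || relative.startsWith(".." + path.sep);
  return !escapes && !path.isAbsolute(relative);
}

// Accepts the candidate if it falls under at least one allowed root.
function isPathAllowed(allowedRoots: string[], candidate: string): boolean {
  return allowedRoots.some(root => isInsideRoot(root, candidate));
}
```

In executeTool, a call such as isPathAllowed(allowedRoots, args.file_path) would run before fs.readFile, returning an error payload to the model when the check fails.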

Multimodal Input — Passing Local Files to the Gemini API

Drag-and-drop file analysis is a natural use case for an Electron app. Here's how to send a local image or PDF to the Gemini API from the main process:

// main/gemini.ts (file analysis)
import fs from 'fs/promises';
import path from 'path';
 
const MIME_MAP: Record<string, string> = {
  '.jpg': 'image/jpeg',
  '.jpeg': 'image/jpeg',
  '.png': 'image/png',
  '.webp': 'image/webp',
  '.gif': 'image/gif',
  '.pdf': 'application/pdf',
};
 
const INLINE_LIMIT = 10 * 1024 * 1024; // 10 MB
 
export async function analyzeLocalFile(
  filePath: string,
  prompt: string,
  ai: GoogleGenAI
): Promise<string> {
  const resolved = path.resolve(filePath);
  const ext = path.extname(resolved).toLowerCase();
  const mimeType = MIME_MAP[ext];
 
  if (!mimeType) throw new Error(`Unsupported file type: ${ext}`);
 
  const buffer = await fs.readFile(resolved);
 
  if (buffer.length < INLINE_LIMIT) {
    // Inline base64 for small files
    const response = await ai.models.generateContent({
      model: 'gemini-2.5-flash',
      contents: [{
        role: 'user',
        parts: [
          { inlineData: { mimeType, data: buffer.toString('base64') } },
          { text: prompt },
        ],
      }],
    });
    return response.text ?? '';
  }
 
  // File API for large files
  const uploaded = await ai.files.upload({
    file: new Blob([buffer], { type: mimeType }),
    config: { mimeType, displayName: path.basename(resolved) },
  });
 
  if (!uploaded.uri) throw new Error('File upload failed');
 
  // Wait for ACTIVE state
  await waitForFileActive(ai, uploaded);
 
  const response = await ai.models.generateContent({
    model: 'gemini-2.5-flash',
    contents: [{
      role: 'user',
      parts: [
        { fileData: { mimeType, fileUri: uploaded.uri } },
        { text: prompt },
      ],
    }],
  });
  return response.text ?? '';
}
 
async function waitForFileActive(
  ai: GoogleGenAI,
  file: { name: string; state?: string }
): Promise<void> {
  let current = file;
  while (current.state === 'PROCESSING') {
    await new Promise(r => setTimeout(r, 2000));
    current = await ai.files.get({ name: file.name! });
  }
  if (current.state !== 'ACTIVE') {
    throw new Error(`File processing failed: state=${current.state}`);
  }
}

Offline Detection, Error Handling, and Retry Logic

Desktop apps get used on trains, in cafés with spotty Wi-Fi, and in environments where the Gemini API returns 503 occasionally. Handle it properly or your app will appear frozen.

// main/ipc-handlers.ts (resilience layer)
import { net } from 'electron';
 
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseMs = 1000
): Promise<T> {
  let lastError: Error | undefined;
  for (let i = 0; i < maxAttempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err instanceof Error ? err : new Error(String(err));
      const msg = lastError.message.toLowerCase();
      // Non-retryable errors: fail fast
      if (msg.includes('api key') || msg.includes('billing') || msg.includes('permission')) {
        throw lastError;
      }
      if (i < maxAttempts - 1) {
        const delay = baseMs * Math.pow(2, i);
        await new Promise(r => setTimeout(r, delay));
        mainWindow.webContents.send('status', `Retrying... (${i + 2}/${maxAttempts})`);
      }
    }
  }
  throw lastError ?? new Error('Max retries exceeded');
}
 
// In send-message handler:
if (!net.isOnline()) {
  return { error: 'No internet connection. Please check your network.' };
}
return withRetry(() => /* ... Gemini API call */);

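net.isOnline() is Electron's native network check, available in the main process. A false result is a strong signal that requests will fail, but a true result is inconclusive (the machine may be on a network with no route to the API), so keep the retry logic in place either way. For rate limit handling patterns, see Gemini API Rate Limiting and Quota Management and Gemini API Cost Optimization.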
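
The withRetry helper above reaches for mainWindow directly and reports "Retrying..." only after the delay has already elapsed. A variant that takes its side effects as parameters is easier to unit test and reuse; the sketch below is one way to write it. RetryOptions, backoffDelay, and retryWithBackoff are hypothetical names, and the 8-second cap is an arbitrary assumption.

```typescript
interface RetryOptions {
  maxAttempts?: number;
  baseMs?: number;
  maxDelayMs?: number;
  isRetryable?: (error: Error) => boolean;
  onRetry?: (nextAttempt: number, delayMs: number) => void;
  sleep?: (ms: number) => Promise<void>;
}

// Exponential backoff capped at maxDelayMs: base, 2*base, 4*base, ...
function backoffDelay(attempt: number, baseMs: number, maxDelayMs: number): number {
  return Math.min(baseMs * Math.pow(2, attempt), maxDelayMs);
}

async function retryWithBackoff<T>(fn: () => Promise<T>, options: RetryOptions = {}): Promise<T> {
  const {
    maxAttempts = 3,
    baseMs = 1000,
    maxDelayMs = 8000,
    isRetryable = () => true,
    onRetry = () => {},
    sleep = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms)),
  } = options;

  let lastError = new Error("retryWithBackoff: maxAttempts must be at least 1");
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err instanceof Error ? err : new Error(String(err));
      const isLastAttempt = attempt === maxAttempts - 1;
      // Fail fast on the final attempt or on errors the caller marks as permanent.
      if (isLastAttempt || !isRetryable(lastError)) throw lastError;
      const delayMs = backoffDelay(attempt, baseMs, maxDelayMs);
      onRetry(attempt + 2, delayMs); // report before waiting so the UI updates immediately
      await sleep(delayMs);
    }
  }
  throw lastError;
}
```

In the main process, onRetry could forward to mainWindow.webContents.send('status', ...) and isRetryable could carry the 'api key' / 'billing' / 'permission' checks from the original helper.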

Packaging, Code Signing, and Auto-Updates

Install electron-builder:

npm install -D electron-builder

Add build config to package.json:

{
  "build": {
    "appId": "net.dolice.gemini-desktop",
    "productName": "Gemini Desktop",
    "directories": { "output": "dist-installer" },
    "mac": {
      "category": "public.app-category.productivity",
      "target": [{ "target": "dmg", "arch": ["arm64", "x64"] }],
      "hardenedRuntime": true,
      "entitlements": "entitlements.mac.plist",
      "entitlementsInherit": "entitlements.mac.plist"
    },
    "win": {
      "target": [{ "target": "nsis", "arch": ["x64"] }]
    },
    "publish": [{ "provider": "github", "owner": "your-username", "repo": "gemini-desktop" }]
  }
}

For macOS distribution, code signing and notarization are required. Self-signed or unsigned apps are blocked by Gatekeeper on modern macOS:

CSC_LINK=./certs/dev-cert.p12 \
CSC_KEY_PASSWORD=yourpassword \
APPLE_ID=you@example.com \
APPLE_APP_SPECIFIC_PASSWORD=your-app-specific-password \
APPLE_TEAM_ID=XXXXXXXXXX \
npx electron-builder --mac

Auto-updates via electron-updater:

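// main/index.ts
import { ipcMain } from 'electron';
import { autoUpdater } from 'electron-updater';
 
// With autoDownload off, nothing is downloaded until the user opts in
autoUpdater.autoDownload = false;
autoUpdater.checkForUpdates();
 
autoUpdater.on('update-available', info => {
  mainWindow.webContents.send('update-available', info.version);
});
autoUpdater.on('update-downloaded', () => {
  mainWindow.webContents.send('update-ready');
});
 
// The renderer calls this after the user accepts the update prompt
ipcMain.handle('download-update', () => autoUpdater.downloadUpdate());
ipcMain.handle('install-update', () => autoUpdater.quitAndInstall());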

Upload release artifacts to GitHub Releases. electron-updater will pick them up automatically on the next app launch.

Common Mistakes and Pitfalls

Pitfall 1: nodeIntegration: true in outdated tutorials

Many older examples still show nodeIntegration: true. This gives the renderer full Node.js access — a serious security risk, especially if your app loads any external URLs. Always use nodeIntegration: false (the default) with contextIsolation: true and a preload script.

// ❌ Dangerous
new BrowserWindow({ webPreferences: { nodeIntegration: true, contextIsolation: false } });
 
// ✅ Correct
new BrowserWindow({
  webPreferences: {
    nodeIntegration: false,
    contextIsolation: true,
    preload: path.join(__dirname, 'preload.js'),
  },
});

Pitfall 2: Unhandled promise rejections in the main process

Renderer-side errors are caught by the browser runtime. Main process rejections are not — they silently fail unless you set up a handler:

process.on('unhandledRejection', (reason, promise) => {
  console.error('Unhandled Rejection:', reason);
});

Pitfall 3: Storing API keys in electron-store

electron-store writes a JSON file to the user's app data directory — readable by anyone with filesystem access. Store only non-sensitive settings (theme, selected model, window position) there. API keys belong in keytar.

Pitfall 4: Double-registering IPC handlers in development

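During development, electron-vite restarts the main process on file changes, and in any build registerIpcHandlers() may run again when a window is recreated (for example on macOS activate). ipcMain.handle() throws if the same channel is registered twice in one process. Remove existing handlers with ipcMain.removeHandler(channel) before re-registering, or use a flag: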

let handlersRegistered = false;
export function registerIpcHandlers(win: BrowserWindow): void {
  if (handlersRegistered) return;
  handlersRegistered = true;
  // ... register handlers
}

Pitfall 5: Using the File API without waiting for ACTIVE state

After ai.files.upload(), the file state is PROCESSING for a few seconds. Calling generateContent() with a file in PROCESSING state returns empty candidates. Always poll for ACTIVE before use (see the waitForFileActive() function above).
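
The waitForFileActive() function shown earlier loops for as long as the state stays PROCESSING, so a stuck upload would hang the handler indefinitely. A bounded version is sketched below. waitUntilActive and FileStatus are hypothetical names, the limit of 30 polls is an arbitrary assumption, and the status lookup is injected rather than tied to the SDK.

```typescript
interface FileStatus {
  name: string;
  state?: string;
}

// Polls `getStatus` until the file leaves PROCESSING, giving up after `maxPolls` checks.
// `getStatus` and `sleep` are injected so the helper has no SDK or timer dependency in tests.
async function waitUntilActive(
  initial: FileStatus,
  getStatus: (name: string) => Promise<FileStatus>,
  maxPolls = 30,
  intervalMs = 2000,
  sleep: (ms: number) => Promise<void> = ms => new Promise<void>(r => setTimeout(r, ms)),
): Promise<FileStatus> {
  let current = initial;
  for (let poll = 0; poll < maxPolls && current.state === "PROCESSING"; poll++) {
    await sleep(intervalMs);
    current = await getStatus(initial.name);
  }
  if (current.state === "PROCESSING") {
    throw new Error(`Timed out waiting for ${initial.name} to finish processing`);
  }
  if (current.state !== "ACTIVE") {
    throw new Error(`File processing failed: state=${current.state}`);
  }
  return current;
}
```

In the app, getStatus would wrap the ai.files.get({ name }) call used in waitForFileActive().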

Your Next Step

Start simple: create the project, wire up the API key save/load flow, and get your first streaming response working. The architecture feels more complex than a web app at first, but once you've locked down the IPC structure, adding features follows the same patterns as any Node.js backend.

For TypeScript type safety patterns across the entire Gemini SDK, Gemini API TypeScript Type-Safe Application Architecture is a good companion read.

The promise of a desktop AI app is real — files that live locally, no round-trips through your server, and a native install experience your users can actually rely on. I hope this guide helps you get there.

System Instructions and Conversation Context Management

One aspect of Electron-based AI apps that deserves more attention is conversation context management. Unlike a stateless API integration, a desktop app is expected to maintain coherent, multi-session memory. Here's how to handle this well.

Setting Up System Instructions

System instructions establish the persona and behavior of your AI across all conversations. In an Electron app, store them in electron-store (not in the renderer state, which resets on reload):

// main/gemini.ts
import Store from 'electron-store';
 
const store = new Store<{ systemInstruction: string }>();
 
const DEFAULT_INSTRUCTION = `You are a helpful desktop AI assistant. You have access to the user's local filesystem through tool calls. When working with files, always confirm the file path before reading or modifying anything. Be concise but thorough.`;
 
export function getSystemInstruction(): string {
  return store.get('systemInstruction', DEFAULT_INSTRUCTION);
}
 
export function setSystemInstruction(instruction: string): void {
  store.set('systemInstruction', instruction);
}
 
// Use in generateContent calls:
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents,
  config: {
    systemInstruction: getSystemInstruction(),
    tools: localTools,
  },
});

Persisting Chat History

Conversation history needs to survive app restarts if your users expect continuity. A pragmatic approach: serialize the last N messages to electron-store, with a configurable limit.

// main/chat-store.ts
import Store from 'electron-store';
 
interface StoredMessage {
  role: 'user' | 'model';
  content: string;
  timestamp: number;
}
 
interface ChatStore {
  messages: StoredMessage[];
}
 
const chatStore = new Store<ChatStore>({ name: 'chat-history' });
const MAX_STORED_MESSAGES = 100;
 
export function getStoredHistory(): StoredMessage[] {
  return chatStore.get('messages', []);
}
 
export function appendMessage(role: 'user' | 'model', content: string): void {
  const messages = getStoredHistory();
  messages.push({ role, content, timestamp: Date.now() });
  // Keep only the most recent messages
  const trimmed = messages.slice(-MAX_STORED_MESSAGES);
  chatStore.set('messages', trimmed);
}
 
export function clearHistory(): void {
  chatStore.set('messages', []);
}

Context Window Budget Management

Gemini 2.5 Flash supports up to 1M tokens of context, but sending the entire conversation history on every request is wasteful and eventually expensive. A sliding window approach works well for most desktop assistant use cases:

// main/ipc-handlers.ts (context trimming)
const MAX_HISTORY_MESSAGES = 20; // Last 20 messages sent to API
 
function buildContents(
  history: Array<{ role: string; parts: string }>,
  newMessage: string
) {
  // Always include the most recent messages for coherence
  const recentHistory = history.slice(-MAX_HISTORY_MESSAGES);
  const contents = recentHistory.map(h => ({
    role: h.role as 'user' | 'model',
    parts: [{ text: h.parts }],
  }));
  contents.push({ role: 'user', parts: [{ text: newMessage }] });
  return contents;
}
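
A fixed-size slice can begin on a model turn when the cut lands mid-exchange: with strictly alternating turns, keeping the last 20 of 25 messages starts on a model reply. Whether the API tolerates a history that opens with a model turn is worth verifying for your model version; as a precaution, the sketch below drops leading model turns from the window. HistoryEntry and trimHistoryWindow are hypothetical names that mirror the { role, parts } shape used in buildContents.

```typescript
interface HistoryEntry {
  role: "user" | "model";
  parts: string;
}

// Takes the last `maxMessages` entries, then drops leading model turns so the
// window always opens with a user message.
function trimHistoryWindow(history: HistoryEntry[], maxMessages: number): HistoryEntry[] {
  // slice(-0) would return the whole array, so handle non-positive limits explicitly.
  const recent = maxMessages > 0 ? history.slice(-maxMessages) : [];
  const firstUser = recent.findIndex(entry => entry.role === "user");
  return firstUser === -1 ? [] : recent.slice(firstUser);
}
```

The trade-off is that the window may be one message shorter than the configured limit.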

For long-running assistant apps, consider implementing a "summarize old context" step: periodically ask Gemini to produce a condensed summary of older conversation segments, store that summary, and include it with each request. Gemini's contents array only carries user and model roles, so the summary belongs either in config.systemInstruction or in a leading user turn rather than in a separate system message. This pattern keeps the useful history without blowing up your token budget. See Gemini API TypeScript Type-Safe Application Architecture for how to type these patterns correctly across your entire codebase.
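
A minimal sketch of that summarization step follows. It keeps the most recent turns verbatim, folds everything older into a summary, and appends the summary to the system instruction text. The summarize callback is injected (in the app it would wrap a Gemini call); ChatTurn, compactHistory, and withSummary are hypothetical names, and the prompt wording is an assumption.

```typescript
interface ChatTurn {
  role: "user" | "model";
  text: string;
}

interface CompactedContext {
  summary: string | null; // condensed form of the older turns, if any
  recent: ChatTurn[];     // turns that are still sent verbatim
}

// Summarizes everything except the last `keepRecent` turns.
// `summarize` is any async function; an earlier summary is carried forward into the transcript.
async function compactHistory(
  turns: ChatTurn[],
  keepRecent: number,
  summarize: (transcript: string) => Promise<string>,
  previousSummary: string | null = null,
): Promise<CompactedContext> {
  if (turns.length <= keepRecent) {
    return { summary: previousSummary, recent: turns };
  }
  const cut = turns.length - keepRecent;
  const older = turns.slice(0, cut);
  const transcript = [
    ...(previousSummary ? [`Summary so far: ${previousSummary}`] : []),
    ...older.map(turn => `${turn.role}: ${turn.text}`),
  ].join("\n");
  return { summary: await summarize(transcript), recent: turns.slice(cut) };
}

// Appends the summary to the base system instruction text.
function withSummary(baseInstruction: string, summary: string | null): string {
  return summary
    ? `${baseInstruction}\n\nEarlier conversation summary:\n${summary}`
    : baseInstruction;
}
```

The result of withSummary(getSystemInstruction(), summary) could be passed as config.systemInstruction, with recent mapped into contents as before.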

Advanced UI Patterns for Electron AI Apps

System Tray Integration

A persistent system tray icon makes your AI assistant always-accessible without keeping the main window open:

// main/index.ts
import { app, BrowserWindow, Tray, Menu, nativeImage } from 'electron';
import path from 'path';
 
let tray: Tray | null = null;
 
function createTray(mainWindow: BrowserWindow): void {
  const iconPath = path.join(__dirname, 'assets/tray-icon.png');
  tray = new Tray(nativeImage.createFromPath(iconPath));
 
  const contextMenu = Menu.buildFromTemplate([
    {
      label: 'Open Chat',
      click: () => {
        mainWindow.show();
        mainWindow.focus();
      },
    },
    {
      label: 'Quick Ask...',
      accelerator: 'CommandOrControl+Shift+G',
      click: () => {
        // Show a minimal input overlay
        createQuickInputWindow();
      },
    },
    { type: 'separator' },
    { label: 'Quit', click: () => app.quit() },
  ]);
 
  tray.setToolTip('Gemini Desktop');
  tray.setContextMenu(contextMenu);
 
  // Click on tray icon to toggle window
  tray.on('click', () => {
    mainWindow.isVisible() ? mainWindow.hide() : mainWindow.show();
  });
}

Global Hotkey Registration

Allow users to summon your AI from anywhere on their desktop:

// main/index.ts
import { app, globalShortcut } from 'electron';
 
app.whenReady().then(() => {
  const mainWindow = createWindow();
 
  // Register a global shortcut
  const registered = globalShortcut.register('CommandOrControl+Shift+G', () => {
    if (mainWindow.isVisible()) {
      mainWindow.hide();
    } else {
      mainWindow.show();
      mainWindow.focus();
    }
  });
 
  if (!registered) {
    console.warn('Global shortcut registration failed — may be taken by another app');
  }
});
 
app.on('will-quit', () => {
  globalShortcut.unregisterAll();
});

Native Drag-and-Drop for File Analysis

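Enable drag-and-drop anywhere on the app window, not just a specific drop zone. The snippet reads File.path, which Electron removed in version 32; on newer versions, resolve the path in the preload script with webUtils.getPathForFile(file) instead: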

// main/index.ts (drag-and-drop setup on BrowserWindow)
mainWindow.webContents.on('did-finish-load', () => {
  // Prevent default Electron behavior of navigating to dropped files
  mainWindow.webContents.executeJavaScript(`
    document.addEventListener('dragover', e => e.preventDefault());
    document.addEventListener('drop', e => {
      e.preventDefault();
      const files = Array.from(e.dataTransfer.files).map(f => f.path);
      window.geminiAPI.processDroppedFiles(files);
    });
  `);
});

Add the IPC handler in the preload and main process:

// preload/index.ts (add to exposeInMainWorld)
processDroppedFiles: (paths: string[]) =>
  ipcRenderer.invoke('process-dropped-files', paths),
 
// main/ipc-handlers.ts
ipcMain.handle('process-dropped-files', async (_event, filePaths: string[]) => {
  const ai = await getGenAI();
  if (!ai) return { error: 'API key not configured' };
 
  const results = [];
  for (const filePath of filePaths) {
    try {
      const analysis = await analyzeLocalFile(
        filePath,
        'Briefly describe what this file contains.',
        ai
      );
      results.push({ path: filePath, analysis });
    } catch (err) {
      results.push({ path: filePath, error: String(err) });
    }
  }
  return { results };
});

Production Monitoring and Error Telemetry

Once you're distributing your app, you need visibility into what's going wrong for real users without compromising their privacy.

Local Error Logging

Before setting up any remote telemetry, set up structured local logging:

// main/logger.ts
import fs from 'fs';
import path from 'path';
import { app } from 'electron';
 
const LOG_PATH = path.join(app.getPath('userData'), 'app.log');
const MAX_LOG_SIZE = 5 * 1024 * 1024; // 5 MB
 
function rotateLogs(): void {
  try {
    const stats = fs.statSync(LOG_PATH);
    if (stats.size > MAX_LOG_SIZE) {
      // Anchor the match to the end so a ".log" elsewhere in the path is untouched
      const backupPath = LOG_PATH.replace(/\.log$/, '.old.log');
      fs.renameSync(LOG_PATH, backupPath);
    }
  } catch {
    // File doesn't exist yet — that's fine
  }
}
 
export function log(level: 'INFO' | 'WARN' | 'ERROR', message: string, data?: unknown): void {
  rotateLogs();
  const entry = JSON.stringify({
    timestamp: new Date().toISOString(),
    level,
    message,
    // Check for undefined explicitly so falsy payloads (0, '', false) are still logged
    ...(data !== undefined ? { data } : {}),
  });
  fs.appendFileSync(LOG_PATH, entry + '\n');
}
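
Since the goal is visibility without compromising privacy, it may be worth scrubbing secrets before anything reaches the log file. The sketch below is one way to do that. The `redactSecrets` helper is hypothetical, and the key pattern assumes Gemini API keys follow the usual Google format of `AIza` followed by 35 URL-safe characters; verify that against your own keys before relying on it.

```typescript
// Assumption: Google-style API keys look like "AIza" + 35 URL-safe characters.
const GEMINI_KEY_PATTERN = /AIza[0-9A-Za-z_-]{35}/g;

// Recursively copies a value, replacing anything that looks like an API key.
function redactSecrets(value: unknown): unknown {
  if (typeof value === 'string') {
    return value.replace(GEMINI_KEY_PATTERN, '[REDACTED_API_KEY]');
  }
  if (value instanceof Error) {
    // Error fields are non-enumerable, so copy the useful ones explicitly.
    return { name: value.name, message: redactSecrets(value.message) };
  }
  if (Array.isArray(value)) {
    return value.map(item => redactSecrets(item));
  }
  if (value !== null && typeof value === 'object') {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      out[key] = redactSecrets(source[key]);
    }
    return out;
  }
  return value; // numbers, booleans, null, undefined pass through unchanged
}
```

A call site could then look like `log('ERROR', 'Gemini call failed', redactSecrets(err))`. This only catches keys that match the pattern; it does not remove user prompts or file contents, which you may also want to keep out of logs.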

Crash Reports with Electron's Built-In Reporter

// main/index.ts
import { crashReporter } from 'electron';
 
// Local crash dump collection (no remote server needed)
crashReporter.start({
  productName: 'GeminiDesktop',
  companyName: 'YourCompany',
  submitURL: '', // Empty to disable remote submission
  uploadToServer: false,
});

For remote crash reporting with user consent, Sentry has an Electron SDK that integrates cleanly with the main process error handling patterns shown earlier in this guide.

API Usage Tracking

If you want to give users visibility into how many Gemini API tokens they're consuming, track it locally:

// main/usage-tracker.ts
import Store from 'electron-store';
 
interface UsageStore {
  dailyUsage: Record<string, { inputTokens: number; outputTokens: number; calls: number }>;
}
 
const usageStore = new Store<UsageStore>({ name: 'usage' });
 
export function recordUsage(inputTokens: number, outputTokens: number): void {
  const today = new Date().toISOString().slice(0, 10); // YYYY-MM-DD, in UTC rather than local time
  const daily = usageStore.get('dailyUsage', {});
  const existing = daily[today] ?? { inputTokens: 0, outputTokens: 0, calls: 0 };
  daily[today] = {
    inputTokens: existing.inputTokens + inputTokens,
    outputTokens: existing.outputTokens + outputTokens,
    calls: existing.calls + 1,
  };
  // Keep only the 30 most recent recorded days (ISO dates sort chronologically)
  const keys = Object.keys(daily).sort();
  for (const staleKey of keys.slice(0, Math.max(0, keys.length - 30))) {
    delete daily[staleKey];
  }
  usageStore.set('dailyUsage', daily);
}
 
export function getUsageSummary(): UsageStore['dailyUsage'] {
  return usageStore.get('dailyUsage', {});
}

Surface this in a "Usage" settings panel so users understand their API consumption. This kind of transparency is especially valuable if you're building a tool for teams where multiple people share a single API key.
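
For the settings panel itself, per-day records usually need to be rolled up into a total. The following is a minimal sketch of that aggregation; `summarizeUsage` and its parameters are hypothetical names, and it assumes the same UTC `YYYY-MM-DD` keys that `recordUsage` writes.

```typescript
type DailyUsage = Record<string, { inputTokens: number; outputTokens: number; calls: number }>;

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Totals usage over the trailing `days` days, inclusive of `today` (UTC dates).
function summarizeUsage(daily: DailyUsage, days: number, today: Date = new Date()) {
  const end = today.toISOString().slice(0, 10);
  const start = new Date(today.getTime() - (days - 1) * MS_PER_DAY).toISOString().slice(0, 10);
  const totals = { inputTokens: 0, outputTokens: 0, calls: 0 };
  for (const day of Object.keys(daily)) {
    // ISO date strings compare correctly as plain strings.
    if (day < start || day > end) continue;
    totals.inputTokens += daily[day].inputTokens;
    totals.outputTokens += daily[day].outputTokens;
    totals.calls += daily[day].calls;
  }
  return totals;
}
```

A "last 7 days" row in the panel could then be filled from `summarizeUsage(getUsageSummary(), 7)`, exposed to the renderer through an IPC handler in the same style as the others in this guide.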

Distribution Checklist Before Your First Release

Before publishing to GitHub Releases, run through this list:

nodeIntegration: false and contextIsolation: true in all BrowserWindow configs — verify with grep -r "nodeIntegration" src/
No API keys, tokens, or secrets hardcoded anywhere — check with grep -r "AIza" src/
Preload script uses only contextBridge.exposeInMainWorld — no direct Node.js API exposure
macOS: hardenedRuntime: true is set and entitlements file is present (required for notarization)
Windows: NSIS installer is configured with oneClick: false to let users choose the install path
Auto-updater is configured and tested against a staging GitHub release
process.on('unhandledRejection', ...) handler is in place
Log rotation is implemented to prevent disk space issues for long-running apps
Rate limit and retry logic is in place for all Gemini API calls
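
The last checklist item, retry logic, can be sketched as a small wrapper with exponential backoff. This is an illustration under stated assumptions, not a definitive implementation: `withRetry`, `isRetryable`, and `backoffDelayMs` are hypothetical names, and the code assumes failed calls throw an object with a numeric HTTP `status` field, which you should confirm against the SDK version you use.

```typescript
// Assumption: errors carry a numeric HTTP `status`; adapt this check to your SDK.
function isRetryable(err: unknown): boolean {
  if (typeof err !== 'object' || err === null) return false;
  const status = (err as { status?: unknown }).status;
  return typeof status === 'number' && (status === 429 || status >= 500);
}

// Exponential backoff: base, 2x base, 4x base, ... capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>(resolve => setTimeout(resolve, ms));

// Runs `fn`, retrying on rate-limit and server errors up to `maxAttempts` times.
// Non-retryable errors and the final failure are rethrown unchanged.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 4,
  sleep: (ms: number) => Promise<void> = defaultSleep
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt + 1 >= maxAttempts || !isRetryable(err)) throw err;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

In the drop handler, for example, the analysis call could be wrapped as `withRetry(() => analyzeLocalFile(filePath, prompt, ai))`. Adding random jitter to the delay is a common refinement when many clients share one key, since it keeps them from retrying in lockstep.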

Building and shipping a desktop app is more involved than deploying a web service, but the result is a product your users can install, trust, and rely on, even with their most sensitive local files. The Gemini calls themselves still need a network connection, but logs, usage data, and the API key all stay on the user's machine. That's a meaningful value proposition that web apps can't easily match.
