Building Cross-Platform AI Desktop Apps with Gemini API and Tauri 2.0
A practical guide to building macOS, Windows, and Linux desktop AI apps using Tauri 2.0 and Gemini API. Learn secure API key management with a Rust backend, real-time streaming via Tauri's event system, and native OS integration with working production code.
The question of where to store an API key is one of those nagging problems that follows you when building AI-powered apps. A separate backend server works, but it's overkill for a solo project. Putting the key directly in browser JavaScript is obviously a non-starter. For a while, I just lived with the trade-off — until I started taking Tauri seriously.
Tauri 2.0 solves this cleanly. Its Rust backend owns the API key entirely, and frontend JavaScript can never reach it. The bundle size comes in at a fraction of Electron's footprint, startup is noticeably faster, and the Rust-to-TypeScript interop — via Tauri's IPC command system — is genuinely ergonomic once you understand the patterns.
This guide covers everything you need to ship a Gemini-powered desktop AI assistant with Tauri 2.0: secure API key handling in the Rust layer, real-time streaming through Tauri's event system, multi-turn conversation management, native OS feature integration (clipboard, notifications, filesystem), common pitfalls with concrete fixes, and a production build pipeline for all three major platforms.
Why Tauri 2.0 Over Electron for AI Apps
The Electron vs. Tauri conversation usually focuses on bundle size, but for AI apps the more meaningful difference is the security model.
Bundle size is still worth mentioning. Electron ships Node.js and Chromium together, landing at 60–100+ MB minimum for even a trivial app. Tauri uses the OS's existing browser engine — WKWebView on macOS, WebView2 on Windows, WebKitGTK on Linux — bringing bundles down to 6–15 MB. For a desktop tool you distribute to friends or colleagues, this is a real quality-of-life improvement.
Security model matters more for AI apps specifically. By default, Tauri's frontend JavaScript cannot access OS-level capabilities — clipboard, filesystem, HTTP, notifications — without explicit permission grants defined in src-tauri/capabilities/. This isn't a configuration option you can accidentally omit; it's the default posture. Your Gemini API key lives only in the Rust process. The TypeScript layer calls named commands and receives results. That's it.
Backend performance: Rust's async runtime (tokio) handles concurrent Gemini API requests efficiently. If you're building something that fans out multiple API calls — like a document analyzer that processes several sections in parallel — you get genuine concurrency without worrying about an event loop.
Memory footprint: A typical Tauri app idles at 20–40 MB RAM. The equivalent Electron app runs 150–300 MB. For a background AI assistant that users keep open all day, this actually matters.
Project Setup and Prerequisites
You'll need: Rust (via rustup, version 1.77 or later), Node.js 20+, and npm or pnpm. On macOS, install Xcode Command Line Tools first (xcode-select --install). On Windows, WebView2 ships with Windows 11 automatically; for Windows 10 you'll need the standalone WebView2 Runtime installer.
# Install the Tauri CLI globally
npm install --global @tauri-apps/cli@latest

# Scaffold a new project using the React + TypeScript template
npm create tauri-app@latest gemini-desktop -- --template react-ts
cd gemini-desktop

# Install frontend dependencies
npm install
Open src-tauri/Cargo.toml and replace the [dependencies] section with the following:
[dependencies]
tauri = { version = "2", features = ["protocol-asset"] }
tauri-plugin-shell = "2"
tauri-plugin-notification = "2"
tauri-plugin-clipboard-manager = "2"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
reqwest = { version = "0.12", features = ["json", "stream"] }
futures-util = "0.3"
tokio = { version = "1", features = ["full"] }
# dotenvy must be a regular dependency: main() calls it in debug builds,
# and dev-dependencies are only linked into tests, examples, and benches.
dotenvy = "0.15"

[build-dependencies]
tauri-build = { version = "2", features = [] }
Create a .env file inside src-tauri/ for local development and add it to .gitignore immediately:
# src-tauri/.env (never commit this file)
GEMINI_API_KEY=YOUR_GEMINI_API_KEY
Secure API Key Management in the Rust Backend
Here's the pattern that makes Tauri compelling for AI apps: the Rust backend loads the API key from the environment, and the frontend JavaScript never touches it. All Gemini API calls flow through named Rust commands.
The full src-tauri/src/main.rs entry point:
// src-tauri/src/main.rs
#![cfg_attr(not(debug_assertions), windows_subsystem = "windows")]

use reqwest::Client;
use serde::{Deserialize, Serialize};
use std::sync::OnceLock;

// Reuse a single HTTP client across all requests (connection pool)
static HTTP_CLIENT: OnceLock<Client> = OnceLock::new();

fn get_client() -> &'static Client {
    HTTP_CLIENT.get_or_init(|| {
        Client::builder()
            .timeout(std::time::Duration::from_secs(120))
            .build()
            .expect("Failed to initialize HTTP client")
    })
}

fn get_api_key() -> Result<String, String> {
    std::env::var("GEMINI_API_KEY").map_err(|_| {
        "GEMINI_API_KEY is not set. \
         Create a .env file in src-tauri/ with your key."
            .to_string()
    })
}

fn main() {
    // Load .env in debug builds only; release binaries never include this file
    #[cfg(debug_assertions)]
    dotenvy::dotenv().ok();

    tauri::Builder::default()
        .plugin(tauri_plugin_shell::init())
        .plugin(tauri_plugin_notification::init())
        .plugin(tauri_plugin_clipboard_manager::init())
        .invoke_handler(tauri::generate_handler![
            generate_text,
            generate_text_stream,
        ])
        .run(tauri::generate_context!())
        .expect("Failed to start Tauri application");
}
For production distribution, you have two practical strategies:
User-managed key: Let users enter their own Gemini API key on first launch, then store it with tauri-plugin-stronghold, which keeps secrets in an encrypted, password-protected vault file (it is not the OS keychain). The user controls their key, their billing, and their usage. This works well for developer tools and power-user utilities where your audience already has API access.
Proxy backend: The app communicates with your own server (a Cloudflare Worker works well here), which holds the shared API key. The distributed binary contains no secrets. This adds a small round-trip latency overhead but is the correct choice for general consumer distribution.
Shared Request Structures and Error Handling
Define the Gemini API request shape once and reuse it across commands:
The simplest command — takes a prompt string, returns a response string, handles errors explicitly:
#[tauri::command]
async fn generate_text(prompt: String) -> Result<String, String> {
    let api_key = get_api_key()?;
    let client = get_client();

    let url = format!(
        "https://generativelanguage.googleapis.com/v1beta/\
         models/gemini-2.5-pro-latest:generateContent?key={}",
        api_key
    );

    let request_body = GeminiRequest {
        contents: vec![GeminiContent {
            role: "user".to_string(),
            parts: vec![GeminiPart { text: prompt }],
        }],
        generation_config: Some(GenerationConfig {
            temperature: 0.7,
            max_output_tokens: 8192,
        }),
    };

    let response = client
        .post(&url)
        .json(&request_body)
        .send()
        .await
        .map_err(|e| format!("Network error: {}", e))?;

    if !response.status().is_success() {
        let status = response.status();
        let body = response.text().await.unwrap_or_default();
        return Err(format!("Gemini API error (HTTP {}): {}", status, body));
    }

    let json: serde_json::Value = response
        .json()
        .await
        .map_err(|e| format!("Failed to parse response: {}", e))?;

    // Check for safety filter blocks before accessing the text
    if let Some(reason) = json["promptFeedback"]["blockReason"].as_str() {
        return Err(format!("Blocked by safety filter: {}", reason));
    }

    json["candidates"][0]["content"]["parts"][0]["text"]
        .as_str()
        .ok_or_else(|| "Response contained no text content".to_string())
        .map(|s| s.to_string())
}
Calling it from TypeScript:
import { invoke } from "@tauri-apps/api/core";

async function askGemini(prompt: string): Promise<string> {
  // Errors from Rust become rejected Promises with the error string as message
  return await invoke<string>("generate_text", { prompt });
}
Real-Time Streaming via Tauri's Event System
Streaming is where the Tauri model requires a mental shift compared to web APIs. Instead of returning a ReadableStream or an AsyncGenerator, you use Tauri's window event system: Rust emits named events as SSE chunks arrive, and TypeScript listens for them.
The flow: TypeScript registers event listeners → calls invoke() → Rust opens the SSE connection to Gemini → emits each text chunk as a stream-chunk event → emits stream-done when finished → TypeScript accumulates chunks into state.
use futures_util::StreamExt;
use tauri::Emitter; // Tauri 2: brings `emit` into scope for Window

#[tauri::command]
async fn generate_text_stream(
    window: tauri::Window,
    prompt: String,
) -> Result<(), String> {
    let api_key = get_api_key()?;
    let client = get_client();

    let url = format!(
        "https://generativelanguage.googleapis.com/v1beta/\
         models/gemini-2.5-pro-latest:streamGenerateContent?alt=sse&key={}",
        api_key
    );

    let request_body = GeminiRequest {
        contents: vec![GeminiContent {
            role: "user".to_string(),
            parts: vec![GeminiPart { text: prompt }],
        }],
        generation_config: Some(GenerationConfig {
            temperature: 0.7,
            max_output_tokens: 8192,
        }),
    };

    let response = client
        .post(&url)
        .json(&request_body)
        .send()
        .await
        .map_err(|e| format!("Request failed: {}", e))?;

    if !response.status().is_success() {
        let error = response.text().await.unwrap_or_default();
        window.emit("stream-error", &error).ok();
        return Err(error);
    }

    let mut byte_stream = response.bytes_stream();
    // Buffer raw bytes rather than decoded text: a multi-byte UTF-8 character
    // can be split across network chunks, and decoding each chunk on its own
    // would corrupt it.
    let mut line_buffer: Vec<u8> = Vec::new();

    while let Some(chunk) = byte_stream.next().await {
        match chunk {
            Ok(bytes) => {
                line_buffer.extend_from_slice(&bytes);

                // SSE uses newlines to delimit events.
                // Process only complete lines; carry the incomplete tail in the buffer.
                while let Some(newline_pos) = line_buffer.iter().position(|&b| b == b'\n') {
                    let raw: Vec<u8> = line_buffer.drain(..=newline_pos).collect();
                    let line = String::from_utf8_lossy(&raw).trim().to_string();

                    if let Some(data) = line.strip_prefix("data: ") {
                        if data == "[DONE]" {
                            window.emit("stream-done", ()).ok();
                            return Ok(());
                        }
                        if let Ok(json) = serde_json::from_str::<serde_json::Value>(data) {
                            if let Some(text) =
                                json["candidates"][0]["content"]["parts"][0]["text"].as_str()
                            {
                                if !text.is_empty() {
                                    window.emit("stream-chunk", text).ok();
                                }
                            }
                        }
                    }
                }
            }
            Err(e) => {
                let msg = e.to_string();
                window.emit("stream-error", &msg).ok();
                return Err(msg);
            }
        }
    }

    window.emit("stream-done", ()).ok();
    Ok(())
}
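The newline-delimited buffering inside the streaming command can also be isolated into a small, dependency-free helper, which makes it easy to unit-test without a network connection. The sketch below is one way to do that; LineBuffer and sse_data are hypothetical names, not part of Tauri or reqwest. It buffers raw bytes rather than decoded text, so a multi-byte UTF-8 character split across two network chunks is decoded only once its line is complete.

```rust
/// Accumulates raw bytes from a streaming response and yields complete lines.
#[derive(Default)]
pub struct LineBuffer {
    pending: Vec<u8>,
}

impl LineBuffer {
    /// Append a network chunk and return every line it completes.
    /// Bytes after the last newline stay buffered for the next call.
    pub fn push(&mut self, chunk: &[u8]) -> Vec<String> {
        self.pending.extend_from_slice(chunk);
        let mut lines = Vec::new();
        // 0x0A never occurs inside a multi-byte UTF-8 sequence,
        // so splitting on it cannot cut a character in half.
        while let Some(pos) = self.pending.iter().position(|&b| b == b'\n') {
            let raw: Vec<u8> = self.pending.drain(..=pos).collect();
            lines.push(String::from_utf8_lossy(&raw).trim().to_string());
        }
        lines
    }
}

/// Return the payload of an SSE `data:` line, or None for any other line.
pub fn sse_data(line: &str) -> Option<&str> {
    line.strip_prefix("data:").map(str::trim_start)
}
```

Inside the command, each chunk from bytes_stream() would be passed to push, and every line for which sse_data returns Some would be parsed as JSON.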
The React Streaming Hook
This custom hook handles listener registration order, state accumulation, and cleanup:
Native OS Integration
This is where desktop apps genuinely justify themselves over web alternatives. Clipboard access, system notifications, and filesystem operations create experiences that browser sandboxing simply can't replicate.
First, grant necessary permissions in src-tauri/capabilities/default.json:
Select text in any application, trigger the shortcut, get an AI summary written back to your clipboard:
import { readText, writeText } from "@tauri-apps/plugin-clipboard-manager";
import { sendNotification } from "@tauri-apps/plugin-notification";
import { invoke } from "@tauri-apps/api/core";

async function summarizeClipboard(): Promise<void> {
  const text = await readText();

  if (!text?.trim()) {
    await sendNotification({
      title: "Gemini AI",
      body: "Clipboard is empty or contains no readable text.",
    });
    return;
  }

  // Truncate very long clipboard content to avoid runaway token costs
  const maxLength = 8000;
  const truncated =
    text.length > maxLength
      ? text.slice(0, maxLength) + "\n[truncated: content was too long]"
      : text;

  const prompt = `Summarize the following text in three concise bullet points. \
Use plain text, no markdown formatting:\n\n${truncated}`;

  try {
    const summary = await invoke<string>("generate_text", { prompt });
    await writeText(summary);
    await sendNotification({
      title: "Gemini AI - Summary ready",
      body: "Result written to clipboard.",
    });
  } catch (err) {
    await sendNotification({
      title: "Gemini AI - Error",
      body: typeof err === "string" ? err : "An unexpected error occurred.",
    });
  }
}
Pair this with a global keyboard shortcut using tauri-plugin-global-shortcut, and you've built a system-wide AI utility that operates independently of which app is in focus. That's a fundamentally desktop-native capability.
System Tray Integration
For background AI tools, a system tray icon lets the app run invisibly until summoned. In Tauri 2 the tray API is part of the core crate (the tauri::tray module, enabled with the tray-icon Cargo feature) rather than a separate plugin. The pattern is: register the global shortcut on startup → bring the window to front on shortcut activation → hide back to tray on blur. This creates the "always available, never in the way" UX that the best desktop productivity tools share.
Common Pitfalls and How to Fix Them
These are issues I've hit across several Tauri + Gemini projects. None are obvious from the documentation alone.
Pitfall 1: Missing the first stream chunks
If you register event listeners after calling invoke(), you'll miss whichever chunks Rust has already emitted during the registration window. Always await listen(...) before await invoke(...). The custom hook above enforces this by registering all three listeners before the invoke call.
Pitfall 2: Incomplete reqwest feature flags
Using JSON response parsing and streaming together requires features = ["json", "stream"] in Cargo.toml. Omitting stream makes bytes_stream() fail at compile time with a "method not found" error that doesn't mention the missing feature. Also, reqwest's blocking client panics when called from inside an async runtime such as the Tokio runtime Tauri uses for async commands, so always use the async API.
Pitfall 3: Capabilities config vs. tauri.conf.json
Tauri 2.0 moved permission management from tauri.conf.json into the src-tauri/capabilities/ directory. Adding a plugin to Cargo.toml and registering it in main() isn't enough: you also need the corresponding permission in capabilities/default.json. A "permission denied" or "not allowed" error at runtime usually means a missing capability entry, not a code bug.
Pitfall 4: Cross-compilation doesn't work
You cannot build a Windows MSI installer from a macOS machine, or vice versa. Each platform's native bundle format requires the target OS. The practical solution: set up GitHub Actions with windows-latest, macos-14, and ubuntu-22.04 runner matrices and upload the artifacts. This is the standard Tauri release workflow.
Pitfall 5: Console window on Windows release builds
Without #![cfg_attr(not(debug_assertions), windows_subsystem = "windows")] at the top of main.rs, release builds on Windows open a terminal window behind the app on launch. This is the correct attribute for GUI apps. It's in the scaffold by default but disappears if you start main.rs from scratch.
Pitfall 6: Safety filter false positives on Japanese content
If you're serving Japanese-speaking users (or building content that includes Japanese text), the default safety filter thresholds can occasionally block benign prompts. When a response is blocked, candidates is empty and promptFeedback.blockReason contains the reason. Always check for this before trying to extract text from the response — otherwise you'll get a misleading "no text in response" error instead of a meaningful message.
Production Build and Code Signing
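Run npm run tauri dev during development for hot reload. For production, run npm run tauri build, which compiles the release binary and produces the installers for the platform you are building on.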
macOS code signing and notarization: Unsigned macOS apps trigger Gatekeeper's "unidentified developer" or "malicious software" alert. Most non-technical users won't know how to override this. With an Apple Developer ID certificate (Apple Developer Program, $99/year), set bundle.macOS.signingIdentity and bundle.macOS.providerShortName in tauri.conf.json. Tauri then signs the app during tauri build, and it also submits the app for notarization when Apple credentials are supplied through environment variables such as APPLE_ID, APPLE_PASSWORD, and APPLE_TEAM_ID.
Windows: SmartScreen warnings appear for unsigned executables. For developer tools distributed to technical audiences, this is generally acceptable. For broader consumer distribution, an EV (Extended Validation) code signing certificate removes the warning.
Linux: Most Linux distribution methods (Flatpak, Snap, AUR) have their own verification processes. If you're targeting desktop Linux users, DEB and RPM packages without signing are commonly accepted for personal distributions.
What to Build Next
The Rust command structure you've built here extends cleanly in several directions. Multi-turn conversations work by passing a full contents array — alternating user and model roles — instead of a single message, which maps naturally to a conversation history array in your React state. Multimodal inputs (images, documents) extend the GeminiPart struct to include inline_data with base64-encoded content alongside the text parts.
The combination of Gemini's reasoning capability and Tauri's native OS access is genuinely interesting territory. Start with the streaming hook working end-to-end — once you see response text appearing in real time from a Rust-backed command, the rest of the architecture opens up naturally.
Managing Application State with Tauri's State API
When your app grows beyond a single page, you'll need to share state — like the HTTP client or an API configuration — across multiple Tauri commands without passing it as a parameter every time. Tauri provides a managed state system for this.
use std::sync::Mutex;

// Define application state to share across commands
struct AppState {
    api_key: String,
    client: reqwest::Client,
    conversation_history: Mutex<Vec<GeminiContent>>,
}

// Register state in main()
fn main() {
    #[cfg(debug_assertions)]
    dotenvy::dotenv().ok();

    let api_key = std::env::var("GEMINI_API_KEY")
        .expect("GEMINI_API_KEY must be set");

    let client = reqwest::Client::builder()
        .timeout(std::time::Duration::from_secs(120))
        .build()
        .expect("Failed to build HTTP client");

    tauri::Builder::default()
        .manage(AppState {
            api_key,
            client,
            conversation_history: Mutex::new(Vec::new()),
        })
        .invoke_handler(tauri::generate_handler![
            send_chat_message,
            reset_conversation,
        ])
        .run(tauri::generate_context!())
        .expect("Failed to run app");
}
Commands access state via the tauri::State parameter:
#[tauri::command]
async fn send_chat_message(
    message: String,
    state: tauri::State<'_, AppState>,
) -> Result<String, String> {
    // Add user message to history
    {
        let mut history = state.conversation_history.lock()
            .map_err(|_| "Failed to acquire conversation lock".to_string())?;
        history.push(GeminiContent {
            role: "user".to_string(),
            parts: vec![GeminiPart { text: message.clone() }],
        });
    }

    // Build the full request with history
    let contents = {
        let history = state.conversation_history.lock()
            .map_err(|_| "Lock error".to_string())?;
        history.clone()
    };

    let url = format!(
        "https://generativelanguage.googleapis.com/v1beta/\
         models/gemini-2.5-pro-latest:generateContent?key={}",
        state.api_key
    );

    let request_body = GeminiRequest {
        contents,
        generation_config: None,
    };

    let response = state.client
        .post(&url)
        .json(&request_body)
        .send()
        .await
        .map_err(|e| format!("Network error: {}", e))?;

    let json: serde_json::Value = response
        .json()
        .await
        .map_err(|e| format!("Parse error: {}", e))?;

    let reply_text = json["candidates"][0]["content"]["parts"][0]["text"]
        .as_str()
        .ok_or("No response text")?
        .to_string();

    // Add model response to history
    {
        let mut history = state.conversation_history.lock()
            .map_err(|_| "Lock error".to_string())?;
        history.push(GeminiContent {
            role: "model".to_string(),
            parts: vec![GeminiPart { text: reply_text.clone() }],
        });

        // Sliding window: keep only the last 20 turns (10 exchanges)
        // This prevents unbounded growth and keeps token costs predictable
        if history.len() > 20 {
            let excess = history.len() - 20;
            history.drain(0..excess);
        }
    }

    Ok(reply_text)
}

#[tauri::command]
fn reset_conversation(state: tauri::State<'_, AppState>) -> Result<(), String> {
    let mut history = state.conversation_history.lock()
        .map_err(|_| "Lock error".to_string())?;
    history.clear();
    Ok(())
}
The Mutex<Vec<GeminiContent>> pattern is straightforward but blocks the thread while locked. For high-throughput scenarios, tokio::sync::Mutex (async-aware) is more appropriate. For a desktop chat app with one active conversation at a time, the standard std::sync::Mutex is fine and simpler.
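One way to keep std::sync::Mutex safe inside async commands is to confine every lock to a short synchronous method, so a guard is never held across an .await. The following is a minimal sketch using only the standard library; the History type and its method names are hypothetical and would wrap the conversation_history field rather than replace any Tauri API.

```rust
use std::sync::Mutex;

/// A bounded, thread-safe history that keeps only the newest entries.
pub struct History<T> {
    inner: Mutex<Vec<T>>,
    max_len: usize,
}

impl<T: Clone> History<T> {
    pub fn new(max_len: usize) -> Self {
        Self { inner: Mutex::new(Vec::new()), max_len }
    }

    /// Append an item, then drop the oldest entries beyond `max_len`.
    /// The guard is released when this function returns.
    pub fn push(&self, item: T) -> Result<(), String> {
        let mut guard = self
            .inner
            .lock()
            .map_err(|_| "history lock poisoned".to_string())?;
        guard.push(item);
        if guard.len() > self.max_len {
            let excess = guard.len() - self.max_len;
            guard.drain(..excess);
        }
        Ok(())
    }

    /// Copy the current contents so the caller can await without holding the lock.
    pub fn snapshot(&self) -> Result<Vec<T>, String> {
        let guard = self
            .inner
            .lock()
            .map_err(|_| "history lock poisoned".to_string())?;
        Ok(guard.clone())
    }
}
```

A command would call push for the user turn, take a snapshot to build the request, await the HTTP call, and then push the model turn. The cost is one clone of the history per request, which is small at a 20-turn cap.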
Testing Your Commands in Isolation
One of the less-discussed advantages of putting logic in Rust commands is testability. You can write unit tests for the API response parsing logic without spinning up a full Tauri app:
#[cfg(test)]
mod tests {
    use super::*;
    use serde_json::json;

    #[test]
    fn test_extract_text_from_valid_response() {
        let response = json!({
            "candidates": [{
                "content": {
                    "parts": [{ "text": "Hello, world!" }]
                }
            }]
        });
        let text = response["candidates"][0]["content"]["parts"][0]["text"]
            .as_str()
            .unwrap();
        assert_eq!(text, "Hello, world!");
    }

    #[test]
    fn test_blocked_response_detection() {
        let response = json!({
            "promptFeedback": { "blockReason": "SAFETY" },
            "candidates": []
        });
        let is_blocked = response["promptFeedback"]["blockReason"].is_string();
        assert!(is_blocked);
    }

    #[test]
    fn test_conversation_history_sliding_window() {
        let mut history: Vec<GeminiContent> = Vec::new();

        // Simulate filling beyond the 20-message limit
        for i in 0..25 {
            history.push(GeminiContent {
                role: if i % 2 == 0 { "user".to_string() } else { "model".to_string() },
                parts: vec![GeminiPart { text: format!("message {}", i) }],
            });
        }

        if history.len() > 20 {
            let excess = history.len() - 20;
            history.drain(0..excess);
        }

        assert_eq!(history.len(), 20);
        // Oldest messages removed; latest messages retained
        assert_eq!(history[0].parts[0].text, "message 5");
    }
}
Run tests with cargo test from the src-tauri/ directory. This gives you fast feedback on your data processing logic without the overhead of launching the full GUI.
Distributing Your App
Beyond just building the binaries, there are a few distribution considerations specific to AI desktop apps.
Auto-update: Tauri ships tauri-plugin-updater for automatic updates. When you push a new version, the app can check a GitHub releases endpoint, download the update, and apply it without user intervention. For a Gemini-powered app that you'll iterate on frequently (as the API evolves), this is worth setting up from the start.
First-launch onboarding: If users need to enter their own API key, the first-launch experience matters. A simple approach: check for the key in Tauri's app data directory on startup (in Tauri 2, app.path().app_data_dir() via the tauri::Manager trait), prompt if missing, and store it using tauri-plugin-stronghold (an encrypted, password-protected vault). Present a link to https://aistudio.google.com/ to help users create a key.
Crash reporting: For production apps, integrating Sentry or a similar crash reporter helps you catch panics and API failures you didn't anticipate. The Tauri plugin ecosystem has tauri-plugin-sentry for this, though basic std::panic::set_hook logging to a local file is a reasonable starting point for personal tools.
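For the local-file option, a minimal sketch of a std::panic::set_hook logger might look like the following. The function name install_panic_logger and the choice of log path are assumptions for illustration; in a Tauri app the path would typically point somewhere inside the app data or log directory, and a real logger would probably add timestamps.

```rust
use std::fs::OpenOptions;
use std::io::Write;
use std::panic;
use std::path::PathBuf;

/// Install a panic hook that appends each panic report to `log_path`.
/// Call this once, early in main(), before building the Tauri app.
pub fn install_panic_logger(log_path: PathBuf) {
    panic::set_hook(Box::new(move |info| {
        // Best effort only: failing to write the log must not panic again.
        if let Ok(mut file) = OpenOptions::new()
            .create(true)
            .append(true)
            .open(&log_path)
        {
            // The Display output includes the source location and,
            // for string payloads, the panic message.
            let _ = writeln!(file, "{info}");
        }
    }));
}
```

Replacing the hook also replaces the default stderr output, which is usually invisible in a GUI release build anyway.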
Putting It All Together
To recap the architecture:
The Rust backend (src-tauri/src/main.rs) is responsible for: holding the API key, building and sending HTTP requests to Gemini, parsing responses, managing conversation history, and emitting events during streaming. The TypeScript frontend is responsible for: calling named Rust commands via invoke(), listening for events via listen(), managing UI state, and handling user input.
This separation is clean and easy to reason about. The Rust layer handles everything that touches the network or secrets. The TypeScript layer handles everything that touches the user. Neither layer bleeds into the other's domain.
The resulting app bundle is 8–15 MB, starts in under a second, idles at 20–40 MB RAM, and can be distributed to macOS, Windows, and Linux users without requiring them to have Node.js, Python, or any runtime installed. For a solo-built AI productivity tool, that's a compelling set of properties.
A Note on Choosing Between Tauri and Other Approaches
Before committing to Tauri, it's worth briefly mapping when the approach doesn't fit. If your team has strong web experience and weak Rust familiarity, the Rust learning curve adds meaningful ramp-up time. Electron remains the pragmatic choice in that scenario — the ecosystem is more mature, the hiring pool is larger, and the tooling is more polished. PWAs (Progressive Web Apps) are worth considering if you don't need deep OS integration or offline-first capabilities that go beyond what service workers provide.
Tauri shines when you have: a need to keep secrets out of the client, a desire for small bundle size and fast startup, an audience that values a native-feeling app, or a requirement for OS integrations that browsers explicitly block. Gemini API-powered desktop tools — local document analyzers, clipboard utilities, system-wide summarizers — fit this profile well.
For Gemini API integration specifically, the Gemini API Function Calling — Complete Beginner's Guide is a natural companion to this article. With Tauri's native OS access and Gemini's function calling capability, you can build agents that read your local filesystem, analyze files, and write results back — without any data ever leaving the user's machine beyond the text sent to the API.
Why Electron — Choosing the Right Deployment Target
Before writing a line of code, it's worth asking whether Electron is the right choice for your use case.
PWAs are great when you want zero-install distribution and seamless updates. The limitation is filesystem access — the File System Access API has improved, but you still can't recursively read an arbitrary folder the user points you to, which rules out many "local AI assistant" scenarios.
Mobile apps (React Native, Flutter) are the right choice when your users are on iOS or Android. But if your target is "Mac or Windows desktop, working alongside the user's existing files and workflows," mobile doesn't help.
Electron wins when you need:
Full local filesystem access (reading entire codebases, batch processing folders of PDFs, writing output files directly)
System tray integration for always-on AI assistants
Native OS features like global hotkeys, clipboard access, and desktop notifications
A packaged installer that non-developers can double-click and run
The key dependencies:
@google/genai: Google's official JavaScript/TypeScript SDK for the Gemini API
keytar — stores secrets in the OS keychain (macOS Keychain, Windows Credential Manager, Linux libsecret)
electron-store — simple persistent storage for non-sensitive app settings
The project structure that matters most:
gemini-desktop/
├── src/
│ ├── main/
│ │ ├── index.ts ← Main process (Gemini API calls go HERE ONLY)
│ │ ├── gemini.ts ← Gemini API wrapper
│ │ └── ipc-handlers.ts ← IPC handler registration
│ ├── preload/
│ │ └── index.ts ← Safe bridge between main and renderer
│ └── renderer/
│ └── src/
│ └── App.tsx ← UI layer (never calls Gemini API directly)
The guiding principle: all Gemini API calls live in the main process. The renderer never touches an API key or the SDK directly.
Secure API Key Management — The Design Decision That Changes Everything
The most common Electron security mistake I see in tutorials is placing the API key in the renderer process:
// ❌ WRONG: renderer/src/App.tsx
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: 'AIza...' }); // Key is exposed in DevTools
Anyone who opens DevTools in your packaged app can read this key. Electron apps ship with Node.js built in, and extracting the source from an .asar archive takes about thirty seconds.
The correct pattern — API key lives only in the main process, stored in the OS keychain:
// ✅ main/gemini.ts
import { GoogleGenAI } from '@google/genai';
import keytar from 'keytar';

const SERVICE_NAME = 'gemini-desktop';
const ACCOUNT_NAME = 'gemini-api-key';

export async function getGenAI(): Promise<GoogleGenAI | null> {
  const apiKey = await keytar.getPassword(SERVICE_NAME, ACCOUNT_NAME);
  if (!apiKey) return null;
  return new GoogleGenAI({ apiKey });
}

export async function saveApiKey(apiKey: string): Promise<void> {
  await keytar.setPassword(SERVICE_NAME, ACCOUNT_NAME, apiKey);
}

export async function deleteApiKey(): Promise<void> {
  await keytar.deletePassword(SERVICE_NAME, ACCOUNT_NAME);
}
The preload script creates a typed bridge that exposes only the operations the renderer needs — nothing more:
contextBridge.exposeInMainWorld is the key API here. The renderer gets window.geminiAPI with a fixed set of methods — it cannot access ipcRenderer directly, cannot require Node modules, and has no path to the API key.
Streaming Chat Implementation
The main process IPC handler for chat uses webContents.send() to push stream chunks to the renderer in real time:
One timing issue to watch for: registerIpcHandlers(mainWindow) must be called after the BrowserWindow is created. Calling it before that gives you a webContents that's undefined, and your mainWindow.webContents.send() calls will throw.
Function Calling With Local OS Resources
This is where Electron earns its place over a web app. With Function Calling, you can give Gemini the ability to read files, list directories, or run local commands — all through the secure main process.
// main/ipc-handlers.ts (Function Calling section)
import { Tool } from '@google/genai';
import fs from 'fs/promises';
import path from 'path';

const localTools: Tool[] = [
  {
    functionDeclarations: [
      {
        name: 'list_files',
        description: 'List files in a local directory',
        parameters: {
          type: 'OBJECT',
          properties: {
            directory: { type: 'STRING', description: 'Absolute path of the directory' },
            extension: { type: 'STRING', description: 'File extension filter (e.g. .txt). Omit for all files.' },
          },
          required: ['directory'],
        },
      },
      {
        name: 'read_file',
        description: 'Read a text file from the local filesystem',
        parameters: {
          type: 'OBJECT',
          properties: {
            file_path: { type: 'STRING', description: 'Absolute path to the file' },
          },
          required: ['file_path'],
        },
      },
    ],
  },
];

async function executeTool(name: string, args: Record<string, string>): Promise<string> {
  switch (name) {
    case 'list_files': {
      const entries = await fs.readdir(args.directory, { withFileTypes: true });
      const files = entries
        .filter(e => e.isFile())
        .map(e => e.name)
        .filter(n => !args.extension || n.endsWith(args.extension));
      return JSON.stringify({ files, count: files.length });
    }
    case 'read_file': {
      // Security: normalize path to prevent traversal attacks
      const resolved = path.resolve(args.file_path);
      const content = await fs.readFile(resolved, 'utf-8');
      return content.length > 4000
        ? content.slice(0, 4000) + '\n[...truncated...]'
        : content;
    }
    default:
      return JSON.stringify({ error: `Unknown tool: ${name}` });
  }
}
The agent loop (up to 5 iterations to prevent infinite loops):
// mainWindow is the BrowserWindow passed to registerIpcHandlers()
ipcMain.handle('send-agent-message', async (_event, userMessage: string) => {
  const ai = await getGenAI();
  if (!ai) return { error: 'API key not configured' };

  const contents: Array<{ role: 'user' | 'model'; parts: unknown[] }> = [
    { role: 'user', parts: [{ text: userMessage }] },
  ];

  for (let i = 0; i < 5; i++) {
    const response = await ai.models.generateContent({
      model: 'gemini-2.5-flash',
      contents,
      config: { tools: localTools },
    });

    const candidate = response.candidates?.[0];
    if (!candidate) return { error: 'No response from model' };

    // content can be missing on blocked or empty responses
    const parts: unknown[] = candidate.content?.parts ?? [];
    const hasFunctionCall = parts.some(
      p => (p as { functionCall?: unknown }).functionCall
    );

    if (!hasFunctionCall) {
      // Plain text answer: forward it to the renderer and finish
      const text = parts.find(p => (p as { text?: string }).text) as
        | { text: string }
        | undefined;
      mainWindow.webContents.send('stream-chunk', text?.text ?? '');
      mainWindow.webContents.send('stream-chunk', '__DONE__');
      return { success: true };
    }

    // Record the model's tool request, run each tool, and send the results back
    contents.push({ role: 'model', parts });
    const toolResults = [];
    for (const part of parts) {
      const fc = (part as { functionCall?: { name: string; args: Record<string, string> } }).functionCall;
      if (!fc) continue;
      const result = await executeTool(fc.name, fc.args);
      toolResults.push({
        functionResponse: { name: fc.name, response: { output: result } },
      });
    }
    contents.push({ role: 'user', parts: toolResults });
  }

  return { error: 'Max iterations reached' };
});
For production path security, add a whitelist check after path.resolve() to ensure the resolved path starts with an allowed root directory. See Gemini API Function Calling Complete Guide for more advanced patterns.
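The allowed-root check described above could look like the following sketch. The function name and the example roots are hypothetical. It compares paths with path.relative() rather than a plain string prefix, because a prefix test would wrongly accept a sibling such as /srv/app/database when the allowed root is /srv/app/data.

```typescript
import * as path from 'node:path';

// Hypothetical allowlist; a real app would probably let the user choose these roots.
const ALLOWED_ROOTS: string[] = ['/srv/app/data'];

/** Returns true when `target` is one of `roots` or lies inside one of them. */
function isInsideAllowedRoot(target: string, roots: string[] = ALLOWED_ROOTS): boolean {
  const resolved = path.resolve(target);
  return roots.some(root => {
    const rel = path.relative(path.resolve(root), resolved);
    // A result of '..', a '../' prefix, or an absolute path (another drive on
    // Windows) means the target sits outside this root.
    const escapes =
      rel === '..' || rel.startsWith('..' + path.sep) || path.isAbsolute(rel);
    return !escapes;
  });
}
```

In read_file, this would run right after path.resolve() and return an error string when the check fails. The sketch does not follow symlinks; resolving the path with fs.realpath() first would be a reasonable extra step if that matters for your app.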
Multimodal Input — Passing Local Files to the Gemini API
Drag-and-drop file analysis is a natural use case for an Electron app. Here's how to send a local image or PDF to the Gemini API from the main process:
// main/gemini.ts (file analysis)
import { GoogleGenAI } from '@google/genai';
import fs from 'fs/promises';
import path from 'path';

const MIME_MAP: Record<string, string> = {
  '.jpg': 'image/jpeg',
  '.jpeg': 'image/jpeg',
  '.png': 'image/png',
  '.webp': 'image/webp',
  '.gif': 'image/gif',
  '.pdf': 'application/pdf',
};

const INLINE_LIMIT = 10 * 1024 * 1024; // 10 MB

export async function analyzeLocalFile(
  filePath: string,
  prompt: string,
  ai: GoogleGenAI
): Promise<string> {
  const resolved = path.resolve(filePath);
  const ext = path.extname(resolved).toLowerCase();
  const mimeType = MIME_MAP[ext];
  if (!mimeType) throw new Error(`Unsupported file type: ${ext}`);

  const buffer = await fs.readFile(resolved);

  if (buffer.length < INLINE_LIMIT) {
    // Inline base64 for small files
    const response = await ai.models.generateContent({
      model: 'gemini-2.5-flash',
      contents: [{
        role: 'user',
        parts: [
          { inlineData: { mimeType, data: buffer.toString('base64') } },
          { text: prompt },
        ],
      }],
    });
    return response.text ?? '';
  }

  // File API for large files
  const uploaded = await ai.files.upload({
    file: new Blob([buffer], { type: mimeType }),
    config: { mimeType, displayName: path.basename(resolved) },
  });
  if (!uploaded.uri) throw new Error('File upload failed');

  // Wait for ACTIVE state
  await waitForFileActive(ai, uploaded);

  const response = await ai.models.generateContent({
    model: 'gemini-2.5-flash',
    contents: [{
      role: 'user',
      parts: [
        { fileData: { mimeType, fileUri: uploaded.uri } },
        { text: prompt },
      ],
    }],
  });
  return response.text ?? '';
}

// Polls the File API every 2 seconds until the file leaves PROCESSING.
async function waitForFileActive(
  ai: GoogleGenAI,
  file: { name?: string; state?: string }
): Promise<void> {
  const name = file.name;
  if (!name) throw new Error('Uploaded file has no name');

  let state = file.state;
  while (state === 'PROCESSING') {
    await new Promise(r => setTimeout(r, 2000));
    state = (await ai.files.get({ name })).state;
  }
  if (state !== 'ACTIVE') {
    throw new Error(`File processing failed: state=${state}`);
  }
}
Offline Detection, Error Handling, and Retry Logic
Desktop apps get used on trains, in cafés with spotty Wi-Fi, and in environments where the Gemini API returns 503 occasionally. Handle it properly or your app will appear frozen.
// main/ipc-handlers.ts (resilience layer)
import { net } from 'electron';

// Retries fn with exponential backoff (baseMs, 2x baseMs, 4x baseMs, ...)
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseMs = 1000
): Promise<T> {
  let lastError: Error | undefined;

  for (let i = 0; i < maxAttempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err instanceof Error ? err : new Error(String(err));
      const msg = lastError.message.toLowerCase();

      // Non-retryable errors: fail fast
      if (msg.includes('api key') || msg.includes('billing') || msg.includes('permission')) {
        throw lastError;
      }

      if (i < maxAttempts - 1) {
        const delay = baseMs * Math.pow(2, i);
        await new Promise(r => setTimeout(r, delay));
        mainWindow.webContents.send('status', `Retrying... (${i + 2}/${maxAttempts})`);
      }
    }
  }

  throw lastError ?? new Error('Max retries exceeded');
}

// In send-message handler:
if (!net.isOnline()) {
  return { error: 'No internet connection. Please check your network.' };
}
return withRetry(() => /* ... Gemini API call */);
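Matching on error message text is easy to write but brittle. As an alternative sketch, the retry decision can key off an HTTP status code, and the delay can include jitter so that several app instances do not retry in lockstep. The StatusError shape below is an assumption: check what your SDK version actually throws before relying on a status field. The helper names are hypothetical too.

```typescript
// Assumed error shape: the thrown error exposes a numeric HTTP status.
interface StatusError {
  status?: number;
}

// Transient statuses worth retrying (timeouts, rate limits, server errors).
const RETRYABLE_STATUS = new Set([408, 429, 500, 502, 503, 504]);

function isRetryable(err: unknown): boolean {
  const status = (err as StatusError | null)?.status;
  // No status usually points to a network-level failure, which is worth a retry.
  return status === undefined || RETRYABLE_STATUS.has(status);
}

/** Full-jitter backoff: a random delay in [0, min(capMs, baseMs * 2^attempt)). */
function backoffDelayMs(
  attempt: number,
  baseMs = 1000,
  capMs = 30_000,
  rand: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(rand() * ceiling);
}
```

Inside a retry loop, isRetryable(err) would replace the message checks and backoffDelayMs(i) would replace the fixed exponential delay. Passing rand as a parameter keeps the function deterministic in tests.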
Upload release artifacts to GitHub Releases. electron-updater will pick them up automatically on the next app launch.
Common Mistakes and Pitfalls
Pitfall 1: nodeIntegration: true in outdated tutorials
Many older examples still show nodeIntegration: true. This gives the renderer full Node.js access — a serious security risk, especially if your app loads any external URLs. Always use nodeIntegration: false (the default) with contextIsolation: true and a preload script.
electron-store writes a JSON file to the user's app data directory — readable by anyone with filesystem access. Store only non-sensitive settings (theme, selected model, window position) there. API keys belong in keytar.
Pitfall 4: Double-registering IPC handlers in development
During development, electron-vite restarts the main process on file changes. ipcMain.handle() throws if the same channel is registered twice. Remove existing handlers before re-registering, or use a flag:
let handlersRegistered = false;

export function registerIpcHandlers(win: BrowserWindow): void {
  // Skip if this process has already registered its handlers
  if (handlersRegistered) return;
  handlersRegistered = true;
  // ... register handlers
}
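The other option mentioned above, removing an existing handler before registering again, might be sketched like this. HandlerRegistry is a hypothetical structural type that covers only the two ipcMain methods used here (handle and removeHandler), so the helper can be exercised without Electron. In the app you would pass ipcMain itself.

```typescript
// Minimal shape of the registry; Electron's ipcMain is expected to satisfy it.
interface HandlerRegistry {
  handle(channel: string, listener: (...args: any[]) => unknown): void;
  removeHandler(channel: string): void;
}

/** Registers a handler after clearing any earlier one on the same channel. */
function replaceHandler(
  registry: HandlerRegistry,
  channel: string,
  listener: (...args: any[]) => unknown,
): void {
  registry.removeHandler(channel); // does nothing when no handler is registered
  registry.handle(channel, listener);
}
```

Compared with the flag, this variant always leaves the newest listener in place, which matters if the handler closes over a window reference that has since been recreated.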
Pitfall 5: Using the File API without waiting for ACTIVE state
After ai.files.upload(), the file may remain in the PROCESSING state for a few seconds. A generateContent() call that references it before it becomes ACTIVE can fail or come back without usable candidates. Always poll for ACTIVE before use (see the waitForFileActive() function above).
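A polling loop without an upper bound can hang forever if a file never leaves PROCESSING. Here is a minimal sketch of a bounded variant. pollUntilSettled and its parameters are hypothetical names, and fetchState stands in for whatever call reads the uploaded file's current state.

```typescript
/**
 * Calls fetchState until it returns something other than 'PROCESSING',
 * and gives up after maxAttempts polls.
 */
async function pollUntilSettled(
  fetchState: () => Promise<string | undefined>,
  maxAttempts = 30,
  intervalMs = 2000,
): Promise<string | undefined> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const state = await fetchState();
    if (state !== 'PROCESSING') return state; // ACTIVE, FAILED, or unknown
    await new Promise(resolve => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Still PROCESSING after ${maxAttempts} polls`);
}
```

The caller still has to check that the returned state is ACTIVE. With the defaults, the helper waits about a minute before giving up, which you may want to raise for large uploads.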
Your Next Step
Start simple: create the project, wire up the API key save/load flow, and get your first streaming response working. The architecture feels more complex than a web app at first, but once you've locked down the IPC structure, adding features follows the same patterns as any Node.js backend.
The promise of a desktop AI app is real — files that live locally, no round-trips through your server, and a native install experience your users can actually rely on. I hope this guide helps you get there.
System Instructions and Conversation Context Management
One aspect of Electron-based AI apps that deserves more attention is conversation context management. Unlike a stateless API integration, a desktop app is expected to maintain coherent, multi-session memory. Here's how to handle this well.
Setting Up System Instructions
System instructions establish the persona and behavior of your AI across all conversations. In an Electron app, store them in electron-store (not in the renderer state, which resets on reload):
// main/gemini.ts
import Store from 'electron-store';

const store = new Store<{ systemInstruction: string }>();

const DEFAULT_INSTRUCTION = `You are a helpful desktop AI assistant. You have access to the user's local filesystem through tool calls. When working with files, always confirm the file path before reading or modifying anything. Be concise but thorough.`;

// Falls back to the default until the user saves a custom instruction
export function getSystemInstruction(): string {
  return store.get('systemInstruction', DEFAULT_INSTRUCTION);
}

export function setSystemInstruction(instruction: string): void {
  store.set('systemInstruction', instruction);
}

// Use in generateContent calls:
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents,
  config: {
    systemInstruction: getSystemInstruction(),
    tools: localTools,
  },
});
Persisting Chat History
Conversation history needs to survive app restarts if your users expect continuity. A pragmatic approach: serialize the last N messages to electron-store, with a configurable limit.
// main/chat-store.ts
import Store from 'electron-store';

interface StoredMessage {
  role: 'user' | 'model';
  content: string;
  timestamp: number;
}

interface ChatStore {
  messages: StoredMessage[];
}

const chatStore = new Store<ChatStore>({ name: 'chat-history' });
const MAX_STORED_MESSAGES = 100;

export function getStoredHistory(): StoredMessage[] {
  return chatStore.get('messages', []);
}

export function appendMessage(role: 'user' | 'model', content: string): void {
  const messages = getStoredHistory();
  messages.push({ role, content, timestamp: Date.now() });
  // Keep only the most recent messages
  const trimmed = messages.slice(-MAX_STORED_MESSAGES);
  chatStore.set('messages', trimmed);
}

export function clearHistory(): void {
  chatStore.set('messages', []);
}
Context Window Budget Management
Gemini 2.5 Flash supports up to 1M tokens of context, but sending the entire conversation history on every request is wasteful and eventually expensive. A sliding window approach works well for most desktop assistant use cases:
// main/ipc-handlers.ts (context trimming)
const MAX_HISTORY_MESSAGES = 20; // Last 20 messages sent to API

function buildContents(
  history: Array<{ role: string; parts: string }>,
  newMessage: string
) {
  // Always include the most recent messages for coherence
  const recentHistory = history.slice(-MAX_HISTORY_MESSAGES);

  const contents = recentHistory.map(h => ({
    role: h.role as 'user' | 'model',
    parts: [{ text: h.parts }],
  }));

  contents.push({ role: 'user', parts: [{ text: newMessage }] });
  return contents;
}
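One detail a fixed-size window can miss: cutting at an arbitrary index may leave a model reply as the first turn, with the user message it answered already dropped. If your setup expects the conversation to open with a user turn (an assumption worth verifying against the API's behavior), a small adjustment is to discard leading model turns after slicing. The Turn type and function name below are hypothetical.

```typescript
interface Turn {
  role: 'user' | 'model';
  text: string;
}

/** Keeps at most maxTurns recent turns, then drops any leading model turns. */
function windowFromUserTurn(history: Turn[], maxTurns = 20): Turn[] {
  if (maxTurns <= 0) return [];
  const recent = history.slice(-maxTurns);
  const firstUser = recent.findIndex(turn => turn.role === 'user');
  // No user turn in the window means there is nothing coherent to send.
  return firstUser === -1 ? [] : recent.slice(firstUser);
}
```

The result can be mapped to the contents array in the same way as recentHistory. The window may end up one turn shorter than maxTurns, which is usually an acceptable trade for a coherent opening.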
For long-running assistant apps, consider implementing a "summarize old context" step: periodically ask Gemini to produce a condensed summary of older conversation segments, store that summary, and prepend it to the context. Because the contents array only accepts user and model roles, the summary has to go in through the system instruction or as a leading user turn. This pattern keeps the useful history without blowing up your token budget. See Gemini API TypeScript Type-Safe Application Architecture for how to type these patterns correctly across your entire codebase.
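A rough sketch of that summarization step, with the model call injected as a function so the trimming logic can be tested on its own. All names here are hypothetical. In the app, summarize would wrap a Gemini request that merges the prior summary with the older turns.

```typescript
interface ChatTurn {
  role: 'user' | 'model';
  text: string;
}

interface CompactedContext {
  summary: string;    // condensed older history
  recent: ChatTurn[]; // turns kept verbatim
}

/**
 * Folds everything older than the last keepRecent turns into a summary.
 * When the history is short enough, the prior summary is returned unchanged.
 */
async function compactHistory(
  history: ChatTurn[],
  priorSummary: string,
  summarize: (priorSummary: string, older: ChatTurn[]) => Promise<string>,
  keepRecent = 20,
): Promise<CompactedContext> {
  if (history.length <= keepRecent) {
    return { summary: priorSummary, recent: history };
  }
  const cut = history.length - keepRecent;
  const summary = await summarize(priorSummary, history.slice(0, cut));
  return { summary, recent: history.slice(cut) };
}
```

After compaction, persist both fields: store the summary next to the chat history and replace the stored turns with recent, so the same older turns are not summarized again on the next request.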
Advanced UI Patterns for Electron AI Apps
System Tray Integration
A persistent system tray icon makes your AI assistant always-accessible without keeping the main window open:
// main/index.ts
import { crashReporter } from 'electron';

// Local crash dump collection (no remote server needed)
crashReporter.start({
  productName: 'GeminiDesktop',
  companyName: 'YourCompany',
  submitURL: '', // Empty to disable remote submission
  uploadToServer: false,
});
For remote crash reporting with user consent, Sentry has an Electron SDK that integrates cleanly with the main process error handling patterns shown earlier in this guide.
API Usage Tracking
If you want to give users visibility into how many Gemini API tokens they're consuming, track it locally:
// main/usage-tracker.ts
import Store from 'electron-store';

interface UsageStore {
  dailyUsage: Record<string, { inputTokens: number; outputTokens: number; calls: number }>;
}

const usageStore = new Store<UsageStore>({ name: 'usage' });

export function recordUsage(inputTokens: number, outputTokens: number): void {
  const today = new Date().toISOString().slice(0, 10); // YYYY-MM-DD (UTC)
  const daily = usageStore.get('dailyUsage', {});
  const existing = daily[today] ?? { inputTokens: 0, outputTokens: 0, calls: 0 };

  daily[today] = {
    inputTokens: existing.inputTokens + inputTokens,
    outputTokens: existing.outputTokens + outputTokens,
    calls: existing.calls + 1,
  };

  // Keep only the last 30 days: drop every key older than the 30 newest
  const keys = Object.keys(daily).sort();
  for (const key of keys.slice(0, Math.max(0, keys.length - 30))) {
    delete daily[key];
  }

  usageStore.set('dailyUsage', daily);
}

export function getUsageSummary(): UsageStore['dailyUsage'] {
  return usageStore.get('dailyUsage', {});
}
Surface this in a "Usage" settings panel so users understand their API consumption. This kind of transparency is especially valuable if you're building a tool for teams where multiple people share a single API key.
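For the panel itself, a small aggregation helper is usually enough. The sketch below assumes the same per-day record shape as the tracker above. The function name and the seven-day default are hypothetical.

```typescript
interface DailyUsage {
  inputTokens: number;
  outputTokens: number;
  calls: number;
}

/** Sums usage over the `days` most recent dates in the map (keys are YYYY-MM-DD). */
function summarizeRecentUsage(
  daily: Record<string, DailyUsage>,
  days = 7,
): DailyUsage {
  // ISO date strings sort chronologically, so a plain sort() is sufficient.
  const recentKeys = days > 0 ? Object.keys(daily).sort().slice(-days) : [];
  return recentKeys.reduce<DailyUsage>(
    (total, key) => ({
      inputTokens: total.inputTokens + daily[key].inputTokens,
      outputTokens: total.outputTokens + daily[key].outputTokens,
      calls: total.calls + daily[key].calls,
    }),
    { inputTokens: 0, outputTokens: 0, calls: 0 },
  );
}
```

Note that this counts the most recent dates that have data, not calendar days, so a week with no usage on some days reaches further back than seven days.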
Distribution Checklist Before Your First Release
Before publishing to GitHub Releases, run through this list:
nodeIntegration: false and contextIsolation: true in all BrowserWindow configs — verify with grep -r "nodeIntegration" src/
No API keys, tokens, or secrets hardcoded anywhere — check with grep -r "AIza" src/
Preload script uses only contextBridge.exposeInMainWorld — no direct Node.js API exposure
macOS: hardenedRuntime: true is set and entitlements file is present (required for notarization)
Windows: NSIS installer is configured with oneClick: false to let users choose the install path
Auto-updater is configured and tested against a staging GitHub release
process.on('unhandledRejection', ...) handler is in place
Log rotation is implemented to prevent disk space issues for long-running apps
Rate limit and retry logic is in place for all Gemini API calls
Building and shipping a desktop app is more involved than deploying a web service, but the result is a product your users can install, trust, and rely on — even offline, even with their most sensitive local files. That's a meaningful value proposition that web apps simply can't match.