API / SDK · 2026-06-19 · Advanced

Catch Near-Duplicate Images Before You Publish with gemini-embedding-2

This is about removing near-duplicates, not image search. Use gemini-embedding-2 multimodal embeddings to vectorize images, cluster them, and build a pre-publish gate — with working code and threshold guidance.

When you run several sites, image assets cause trouble later not because you have too few, but because near-identical ones quietly pile up. As an indie developer I maintain the OGP images for four blogs plus a set of wallpaper apps under Dolice, and over the last six months I have caught myself pausing more and more often: "wait, haven't I already published this pale blue abstract background?" Checking one by one stops being realistic once the count reaches three digits.

In June 2026, File Search gained multimodal search with gemini-embedding-2, which gives us a clean tool for this problem. But what we want here is not search. Instead of finding and pulling back similar images, we want to reject images that are too similar before they ship. The two goals differ in both intent and implementation, and if you conflate them you end up with a gate that lets everything through.

Why image search can't reject near-duplicates

Retrieval returns "the top N closest to a query." It always returns something, so even a loosely tuned threshold does no harm. Near-duplicate detection needs something else: a binary judgment of "are these two images close enough to be considered effectively the same?"

If you repurpose retrieval directly, the single closest item is always returned, so completely unrelated images still line up as "similar candidates." Conversely, if you leave the threshold tuned for search, you miss the recolors and crops you actually want to catch. A near-duplicate gate has to switch to a design where the score itself is the decision boundary.
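
The contrast can be made concrete with a small sketch. The similarity scores and the 0.90 cutoff below are made-up illustrative values, not measurements from gemini-embedding-2, and `top_n` and `gate_hits` are hypothetical helper names introduced only for this example.

```python
def top_n(scores: dict[str, float], n: int = 1) -> list[tuple[str, float]]:
    """Retrieval: always returns the n closest items, however far away they are."""
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:n]


def gate_hits(scores: dict[str, float], threshold: float) -> list[tuple[str, float]]:
    """Gate: returns only items at or above the threshold, possibly none."""
    return [(name, s) for name, s in scores.items() if s >= threshold]


# Illustrative cosine similarities of one new image against existing assets.
unrelated = {"forest.png": 0.41, "city.png": 0.38}
recolor = {"blue_gradient.png": 0.95, "city.png": 0.38}

print(top_n(unrelated))            # [('forest.png', 0.41)]  still a "candidate"
print(gate_hits(unrelated, 0.90))  # []  nothing to reject
print(gate_hits(recolor, 0.90))    # [('blue_gradient.png', 0.95)]  reject
```

With retrieval, the unrelated image still gets a "closest match"; with the gate, an empty result is a valid and common answer.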

I underestimated this difference at first and concluded "no duplicates" after just glancing at the top File Search results. In reality, similarly composed gradient backgrounds had grown into three separate families; search simply returned them as unrelated hits.

Vectorize the images

First, convert each image into an embedding vector with gemini-embedding-2. Because it's multimodal, you pass an image part to the same endpoint you'd use for text.

import os
from pathlib import Path
from google import genai
from google.genai import types

# Read the key from the environment rather than hard-coding it in source.
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

EMBED_MODEL = "gemini-embedding-2"  # multimodal (GA, 2026-06)

def embed_image(path: Path) -> list[float]:
    """Return the embedding vector for a single PNG or JPEG image."""
    data = path.read_bytes()
    # Only PNG and JPEG are distinguished here; any other suffix is sent as JPEG.
    mime = "image/png" if path.suffix.lower() == ".png" else "image/jpeg"
    resp = client.models.embed_content(
        model=EMBED_MODEL,
        contents=[types.Part.from_bytes(data=data, mime_type=mime)],
    )
    return resp.embeddings[0].values

If you L2-normalize the vectors up front, the dot product is the cosine similarity, which keeps the later math simple.

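import math

def l2_normalize(v: list[float]) -> list[float]:
    # Fall back to 1.0 so an all-zero vector does not divide by zero.
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def cosine(a: list[float], b: list[float]) -> float:
    # a, b are normalized -> dot product is the cosine similarity
    return sum(x * y for x, y in zip(a, b))

def build_index(paths) -> dict[str, list[float]]:
    # Map each image path to its normalized embedding (one API call per image).
    index = {}
    for p in paths:
        index[str(p)] = l2_normalize(embed_image(p))
    return index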
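
Once an index like the one from `build_index` exists, one possible way to isolate near-duplicates is to link every pair whose similarity clears a threshold and collect the connected groups. The sketch below does this with a small union-find. The function name `cluster_near_duplicates`, the toy 2-D vectors, and the 0.9 threshold are all assumptions for illustration; a real threshold would have to be tuned on your own images.

```python
from itertools import combinations


def cluster_near_duplicates(
    index: dict[str, list[float]], threshold: float
) -> list[list[str]]:
    """Group paths linked by a pairwise cosine similarity >= threshold.

    Assumes the vectors are already L2-normalized, so the dot product
    equals the cosine similarity.
    """
    parent = {p: p for p in index}

    def find(p: str) -> str:
        # Follow parent links to the group root, shortening the path as we go.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b in combinations(index, 2):
        sim = sum(x * y for x, y in zip(index[a], index[b]))
        if sim >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups: dict[str, list[str]] = {}
    for p in index:
        groups.setdefault(find(p), []).append(p)
    # Singletons are not duplicates of anything, so drop them.
    return [sorted(g) for g in groups.values() if len(g) > 1]


# Toy normalized vectors standing in for real embeddings.
toy = {
    "a.png": [1.0, 0.0],
    "a_crop.png": [0.96, 0.28],
    "c.png": [0.0, 1.0],
}
print(cluster_near_duplicates(toy, threshold=0.9))  # [['a.png', 'a_crop.png']]
```

Two caveats on this design: it compares every pair, so the work grows quadratically with the number of images, and groups are formed by chaining, so two images can land in the same group through an intermediate even if their direct similarity is below the threshold.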

Because embedding costs one API call per image, a naive implementation pays that cost again on every run, and the cost grows with the asset count. In my own setup, I store vectors locally keyed by the file hash and skip re-embedding any image whose content hasn't changed.
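
A minimal sketch of that kind of cache follows, assuming a single JSON file on disk and SHA-256 as the content hash. The names `file_sha256` and `embed_with_cache` and the JSON layout are hypothetical, not the author's actual setup; the embedding function is passed in as a parameter so the sketch runs without an API key.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def file_sha256(path: Path) -> str:
    """Hash the file contents, so a renamed but unchanged file still hits the cache."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def embed_with_cache(
    paths: list[Path],
    embed: Callable[[Path], list[float]],
    cache_file: Path,
) -> dict[str, list[float]]:
    """Build a path -> vector index, calling `embed` only for unseen content."""
    cache: dict[str, list[float]] = (
        json.loads(cache_file.read_text()) if cache_file.exists() else {}
    )
    index: dict[str, list[float]] = {}
    for p in paths:
        key = file_sha256(p)
        if key not in cache:
            cache[key] = embed(p)  # only new or changed content reaches the API
        index[str(p)] = cache[key]
    cache_file.write_text(json.dumps(cache))
    return index
```

With the functions from this article, the `embed` argument could be something like `lambda p: l2_normalize(embed_image(p))`, so that the cached vectors are already normalized.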
