Your Gemini Structured Output Keys Keep Reordering — Pin Them With propertyOrdering
You constrained the shape with responseSchema, yet the JSON key order shifts between calls and your snapshot tests go red for no reason. Here is why field order is not guaranteed by default, how propertyOrdering fixes it, how Pydantic sets it for you, and how to align few-shot examples — all with working code.
When you automate content operations across several sites, you end up asking Gemini for structured metadata constantly: give me the title, category, tags, and summary as JSON, then push that straight into the next stage. I felt safe because responseSchema constrained the shape. Then one day my snapshot tests started going red with no explanation.
Opening the diff was almost anticlimactic. The values were identical. The only thing that had changed was the order of the keys. An output that was {"title": ..., "tags": ...} came back the next run as {"tags": ..., "title": ...}. Right types, right values, and yet a text comparison reports a mismatch.
This article walks through that behavior — structured output does not guarantee field order by default — and how propertyOrdering pins it, including the trap that bites you once you add few-shot examples. It follows the same order in which I actually hit each issue as an indie developer maintaining pipelines across my own Dolice Labs sites.
Why the order moves — JSON schema being honest
The root of the confusion is in JSON itself: object keys carry no notion of order. The properties field of the Schema object (which descends from OpenAPI) is conceptually an unordered map, and the order the model writes fields in is decided at generation time. Even with temperature at 0 and a byte-identical prompt, the arrangement can still change between calls.
Google's structured output guidance says as much: if you depend on order, specify it explicitly. So there are two separate guarantees here — the type can be fixed, but the ordering has to be fixed separately. Miss that distinction and you lose time on the assumption that "I constrained the type, so the output must be fully determined," which is exactly the hole I fell into.
Here is the typical code that hands Gemini a raw dict schema.
    # pip install google-genai
    from google import genai
    from google.genai import types

    client = genai.Client(api_key="YOUR_GEMINI_API_KEY")

    # A dict schema with no ordering specified
    schema = {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "category": {"type": "string"},
            "tags": {"type": "array", "items": {"type": "string"}},
            "summary": {"type": "string"},
        },
        "required": ["title", "category", "tags", "summary"],
    }

    resp = client.models.generate_content(
        model="gemini-2.5-flash",
        contents="Extract the metadata for this article: ...",
        config=types.GenerateContentConfig(
            response_mime_type="application/json",
            response_schema=schema,
        ),
    )
    print(resp.text)
    # Sometimes {"title":..., "category":..., "tags":[...], "summary":...}
    # Other times {"summary":..., "title":..., "tags":[...], "category":...}
Even though response_schema is set, the raw key order of resp.text wobbles between calls. If you only json.loads it and read values, nothing is wrong. The trouble appears only when you save and compare the generated JSON text itself. For me that was snapshot tests, plus a workflow that committed generated results to git as diffs.
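To make that distinction concrete, here is a minimal sketch using two hand-written payloads (illustrative strings, not captured API output) that hold the same values in a different key order. It only shows why a value-level comparison passes while a text-level comparison fails.

```python
import json

# Two illustrative payloads with identical values but different key order.
# These strings are made up for the example, not real API responses.
run_a = '{"title": "Hello", "tags": ["a", "b"]}'
run_b = '{"tags": ["a", "b"], "title": "Hello"}'

# Comparing parsed objects ignores key order, so the values match.
same_values = json.loads(run_a) == json.loads(run_b)

# Comparing the raw text does not ignore it, which is what a snapshot test sees.
same_text = run_a == run_b

print(same_values, same_text)  # True False
```

Any stage that reads the parsed dict is unaffected. Only stages that store or diff the raw string notice the reordering.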
Pin the order with propertyOrdering
The fix is simple: add propertyOrdering (camelCase in REST/JSON) to the Schema and list the property names in the order you want them emitted.
    schema = {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "category": {"type": "string"},
            "tags": {"type": "array", "items": {"type": "string"}},
            "summary": {"type": "string"},
        },
        # Add just this. The output stabilizes in this order.
        "propertyOrdering": ["title", "category", "tags", "summary"],
        "required": ["title", "category", "tags", "summary"],
    }

    resp = client.models.generate_content(
        model="gemini-2.5-flash",
        contents="Extract the metadata for this article: ...",
        config=types.GenerateContentConfig(
            response_mime_type="application/json",
            response_schema=schema,
        ),
    )
    # Output is always title -> category -> tags -> summary
propertyOrdering is an ordering hint, not a type constraint. Because validation still passes without it, it is easy to forget when you hand-write dict schemas — that has been my lived experience. If you have nested objects, each child Schema needs its own propertyOrdering too. Write it only at the top and the nested fields keep shuffling, and the diffs come back.
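One way to avoid forgetting the nested levels is to fill them in mechanically. The sketch below is a hypothetical helper, not part of any SDK: it walks a raw dict schema and sets propertyOrdering on every object node from the insertion order of its properties dict, leaving any ordering you wrote by hand untouched. The author and sections fields are invented for the example.

```python
def with_property_ordering(schema: dict) -> dict:
    """Return a copy of a dict schema with propertyOrdering on every object node.

    Hypothetical helper: the order comes from the insertion order of each
    "properties" dict. An existing propertyOrdering is kept as written.
    """
    node = dict(schema)  # shallow copy so the caller's dict is not mutated
    props = node.get("properties")
    if isinstance(props, dict):
        node["properties"] = {
            name: with_property_ordering(child) for name, child in props.items()
        }
        node.setdefault("propertyOrdering", list(props))
    items = node.get("items")
    if isinstance(items, dict):
        # Arrays of objects need the ordering on the item schema as well.
        node["items"] = with_property_ordering(items)
    return node


schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "author": {
            "type": "object",
            "properties": {"name": {"type": "string"}, "url": {"type": "string"}},
        },
        "sections": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {"heading": {"type": "string"}, "body": {"type": "string"}},
            },
        },
    },
}

pinned = with_property_ordering(schema)
print(pinned["propertyOrdering"])                                    # ['title', 'author', 'sections']
print(pinned["properties"]["author"]["propertyOrdering"])            # ['name', 'url']
print(pinned["properties"]["sections"]["items"]["propertyOrdering"])  # ['heading', 'body']
```

This assumes the order you typed the properties in is the order you want emitted. If it is not, write propertyOrdering explicitly on that node and the helper will leave it alone.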
Pass a Pydantic model and declaration order becomes the order
Raw dicts are flexible, but hand-written propertyOrdering slips through the cracks. What I have settled on is passing a Pydantic model as response_schema to the google-genai SDK. In that case the SDK converts the model's field declaration order straight into propertyOrdering. You no longer write the order anywhere — "the order in your code" becomes "the order in the output."
    from pydantic import BaseModel
    from google import genai
    from google.genai import types

    class ArticleMeta(BaseModel):
        title: str
        category: str
        tags: list[str]
        summary: str

    client = genai.Client(api_key="YOUR_GEMINI_API_KEY")

    resp = client.models.generate_content(
        model="gemini-2.5-flash",
        contents="Extract the metadata for this article: ...",
        config=types.GenerateContentConfig(
            response_mime_type="application/json",
            response_schema=ArticleMeta,  # declaration order becomes propertyOrdering
        ),
    )
    meta = resp.parsed  # comes back as an ArticleMeta instance
    print(meta.title, meta.tags)
It is safer to verify the order really is pinned rather than assume it. When I migrated, I ran a small check that hit the same input dozens of times and counted whether the raw key order matched.
    import json

    def key_order(raw: str) -> list[str]:
        # Top-level keys, in the order they appear in the raw text
        return list(json.loads(raw).keys())

    orders = set()
    for _ in range(30):
        r = client.models.generate_content(
            model="gemini-2.5-flash",
            contents="Extract the metadata for this article: ...",
            config=types.GenerateContentConfig(
                response_mime_type="application/json",
                response_schema=ArticleMeta,
            ),
        )
        orders.add(tuple(key_order(r.text)))

    assert len(orders) == 1, f"Order split into {len(orders)} variants: {orders}"
    print("OK, key order pinned to one:", orders.pop())
Since Python 3.7, dicts preserve insertion order, and json.loads fills the dict in the order the keys appear in the text, so keys() lets you observe the on-the-wire ordering. If orders holds a single element after thirty calls, the order is pinned for that input. On my machine, an order that split into 4-6 variants across 30 calls without propertyOrdering collapsed to exactly one after I set it.
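The check above looks only at top-level keys. Since nested objects can shuffle independently, a variant that records the key order of every object in the response can be useful. The following is a sketch under that assumption, run here on made-up sample strings rather than live API output. The function names and the path notation are my own.

```python
import json


def key_order_signature(value, path="$"):
    """Yield (path, keys) for every object in a parsed JSON value, in document order."""
    if isinstance(value, dict):
        yield path, tuple(value)
        for key, child in value.items():
            yield from key_order_signature(child, f"{path}.{key}")
    elif isinstance(value, list):
        for child in value:
            # All elements of a list share one path, so their orders are pooled.
            yield from key_order_signature(child, f"{path}[]")


def order_variants(raw_outputs):
    """Map each object path to the set of key orders seen across the raw outputs."""
    seen = {}
    for raw in raw_outputs:
        for path, keys in key_order_signature(json.loads(raw)):
            seen.setdefault(path, set()).add(keys)
    return seen


# Illustrative samples: top-level order is stable, the nested object is not.
samples = [
    '{"title": "A", "author": {"name": "x", "url": "u"}}',
    '{"title": "B", "author": {"url": "u", "name": "y"}}',
]
variants = order_variants(samples)
unstable = sorted(path for path, orders in variants.items() if len(orders) > 1)
print(unstable)  # ['$.author']
```

Feeding it the thirty raw r.text strings instead of samples would show which nesting level, if any, still needs its own propertyOrdering.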
Aligning order when you use few-shot
This was the trap I overlooked the longest. When you feed few-shot examples as JSON inside the prompt, matching the field order inside the examples to propertyOrdering measurably helps quality. If summary comes first in your examples while the schema pins title first, the model receives a contradictory signal about which to follow. That can degrade not just stability but the extraction accuracy itself.
The countermeasure is almost embarrassingly simple: do not hand-write the examples, generate them mechanically from the order you want to pin.
    import json

    ORDER = ["title", "category", "tags", "summary"]

    def ordered_json(d: dict) -> str:
        # Rebuild in ORDER, then serialize
        ordered = {k: d[k] for k in ORDER if k in d}
        return json.dumps(ordered, ensure_ascii=False)

    few_shot = ordered_json({
        "title": "Summarizing meeting notes with Gemini",
        "category": "gemini-api",
        "tags": ["summary", "notes"],
        "summary": "How to summarize transcribed meeting notes with Gemini.",
    })
    # -> always {"title":...,"category":...,"tags":[...],"summary":...} in ORDER
If your few-shot examples, your propertyOrdering, and your Pydantic declaration order all derive from a single source of truth — one ORDER list — a mismatch becomes structurally impossible. I moved this ORDER constant to the top of the module and now build both the schema and the examples from it.
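A minimal sketch of that single-source-of-truth layout might look like the following. It covers the raw dict schema and the few-shot examples only. FIELDS, build_schema, and build_example are names I made up for the illustration, and unlike ordered_json above, this version raises on a missing field instead of skipping it.

```python
import json

# Single source of truth: field name -> field schema, in the order to emit.
# Dicts keep insertion order, so this one literal defines the ordering.
FIELDS = {
    "title": {"type": "string"},
    "category": {"type": "string"},
    "tags": {"type": "array", "items": {"type": "string"}},
    "summary": {"type": "string"},
}
ORDER = list(FIELDS)


def build_schema() -> dict:
    """Derive properties, propertyOrdering, and required from the same source."""
    return {
        "type": "object",
        "properties": dict(FIELDS),
        "propertyOrdering": list(ORDER),
        "required": list(ORDER),
    }


def build_example(values: dict) -> str:
    """Serialize a few-shot example with keys in ORDER, whatever order they arrive in."""
    missing = [name for name in ORDER if name not in values]
    if missing:
        raise KeyError(f"example is missing fields: {missing}")
    return json.dumps({name: values[name] for name in ORDER}, ensure_ascii=False)


schema = build_schema()
example = build_example({
    "summary": "How to summarize transcribed meeting notes with Gemini.",
    "tags": ["summary", "notes"],
    "title": "Summarizing meeting notes with Gemini",
    "category": "gemini-api",
})
print(list(json.loads(example)) == schema["propertyOrdering"])  # True
```

Adding or reordering a field then means editing FIELDS once, and both the schema and every example follow.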
Do not rely on order alone — normalize on compare
Even with the order pinned, a comparison that demands exact text equality can still go red after some future minor change. So I run a two-layer defense: pin the order on the generation side, and normalize the keys on the comparison side. When taking a snapshot, parse once, re-sort the keys, then save and compare. That way the ordering wobble never reaches the comparison step at all.
Pinning the order stabilizes the artifact a human reads; normalization stabilizes the text a machine compares. The roles differ, so keeping both rather than one also makes it easier to isolate the cause later. In fact, normalization alone, added before propertyOrdering, would have turned the tests green. But the generated JSON committed to git still produced noisy diffs on every run, so in the end I needed both.
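The comparison-side normalization can be sketched with the standard library alone. The normalize function name and the sample strings are illustrative, and the indent setting is just a choice that keeps stored snapshots readable.

```python
import json


def normalize(raw: str) -> str:
    """Parse JSON text and re-serialize it with keys sorted at every nesting level."""
    return json.dumps(json.loads(raw), sort_keys=True, ensure_ascii=False, indent=2)


# The snapshot is stored in normalized form...
stored_snapshot = normalize('{"title": "Hello", "tags": ["a", "b"]}')

# ...and new output is normalized the same way before comparing.
new_output = '{"tags": ["a", "b"], "title": "Hello"}'
matches = normalize(new_output) == stored_snapshot
print(matches)  # True
```

Note that sort_keys reorders object keys only. The order of array elements, such as the entries in tags, still counts as a difference.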
The table below organizes the three layers I kept conflating: type, order, and comparison.
| Layer | What it guarantees | Tool |
|---|---|---|
| Type | Presence and type of fields | responseSchema / required |
| Order | Key order (visual stability) | propertyOrdering / Pydantic declaration order |
| Compare | Stable diff judgement | parse + sort_keys normalization |
Small snags I hit during migration
A few things tripped me up when wiring this into an existing pipeline, kept short. If it saves a little time for someone on the same setup, that would make me happy.
The first was missing the nesting, as above: writing propertyOrdering at the top does not govern the order inside nested objects, so write it on each child Schema. The second was a casing mix-up between snake_case and camelCase. The REST/JSON field is propertyOrdering, but some SDKs accept a snake_case alias, and I once wrote property_ordering in a raw dict by mistake and had it silently ignored. When something you set does not take effect, the fastest path is to log the actual schema you are sending and eyeball whether the field name is really there.
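Both of those snags can be caught before the request goes out. The sketch below is a hypothetical lint for raw dict schemas only: it flags object nodes without propertyOrdering and any node carrying the snake_case spelling. Whether property_ordering is actually ignored depends on the SDK and how the schema is passed, so treat the second warning as a prompt to check rather than a definite error.

```python
def lint_schema(schema: dict, path: str = "$") -> list[str]:
    """Return human-readable warnings for a raw dict schema (hypothetical helper)."""
    warnings = []
    if "property_ordering" in schema:
        warnings.append(
            f"{path}: found snake_case 'property_ordering', expected 'propertyOrdering'"
        )
    props = schema.get("properties")
    if isinstance(props, dict):
        if "propertyOrdering" not in schema:
            warnings.append(f"{path}: object has no 'propertyOrdering'")
        for name, child in props.items():
            warnings.extend(lint_schema(child, f"{path}.{name}"))
    items = schema.get("items")
    if isinstance(items, dict):
        warnings.extend(lint_schema(items, f"{path}[]"))
    return warnings


# Illustrative schema: the nested object uses the wrong spelling.
schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "author": {
            "type": "object",
            "properties": {"name": {"type": "string"}},
            "property_ordering": ["name"],
        },
    },
    "propertyOrdering": ["title", "author"],
}

found = lint_schema(schema)
for warning in found:
    print(warning)
```

Running it on the exact dict you pass as response_schema doubles as the "log what you are sending" step.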
Finally, a key listed in propertyOrdering that is absent from properties — or the reverse, present in properties but missing from propertyOrdering — is its own source of accidents. On the test side I assert in one line that the two sets are equal, so any edit to the schema surfaces the drift immediately.
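That set-equality assertion can be as small as the sketch below. ordering_drift is a name I chose for the example, and it checks one schema node only, so nested objects would each need the same check.

```python
def ordering_drift(schema: dict) -> set[str]:
    """Names that appear in only one of propertyOrdering / properties.

    An empty set means the two are in sync for this schema node.
    """
    return set(schema.get("propertyOrdering", [])) ^ set(schema.get("properties", {}))


in_sync = {
    "type": "object",
    "properties": {"title": {"type": "string"}, "category": {"type": "string"}},
    "propertyOrdering": ["title", "category"],
}

# Illustrative drift: "tags" was listed but never defined, "category" was never listed.
drifted = {
    "type": "object",
    "properties": {"title": {"type": "string"}, "category": {"type": "string"}},
    "propertyOrdering": ["title", "tags"],
}

assert ordering_drift(in_sync) == set()
print(sorted(ordering_drift(drifted)))  # ['category', 'tags']
```

Returning the differing names rather than a bare boolean makes the test failure message point straight at the field that drifted.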
It is tempting to believe structured output is fully determined once the type is fixed, but in practice three separate concerns overlap: type, order, and comparison. Start by checking whether your own pipeline saves and compares generated JSON text. If it does, add propertyOrdering (or rely on Pydantic declaration order) in one place today. The relief of a diff that finally stays quiet is modest, but it is real.