Last Saturday morning, I opened the Gemini app to work through the spec for my app's next update — and froze at the mode picker. Deep Think for careful reasoning, Deep Research for gathering sources, Live for talking things through, Omni for generating video and mixed media. With the June 2026 updates, the lineup now covers thinking, researching, talking, and creating end to end. That's genuinely good news, but standing in front of an actual task, I couldn't immediately say which one deserved it.
So last week I did an inventory of my recurring indie-development work and explicitly reassigned each task to one of the four modes. The punchline: the whole assignment came down to three questions. If you've been feeling that more modes have somehow made Gemini harder to use, I hope this record saves you some of the trial and error.
The problem wasn't the number of modes — it was having no criteria
The first thing I noticed was what the hesitation actually was. Having four modes wasn't the problem. The problem was that I had no criteria to consult when a task was in front of me.
Before this exercise, I picked modes on vague grounds: "this decision matters, so Deep Think" or "I need information, so Deep Research." The weakness of that rule is that almost every task both matters and needs information. Speccing a new feature involves competitor research and design reasoning at the same time, so either justification works — which means the choice ends up being made by mood.
To fix that, I went back through two weeks of requests I had sent to Gemini and sorted them into ones I was happy with and ones that disappointed. A pattern emerged: each mode's strengths had almost nothing to do with how important the task was. They were explained by three different conditions.
Question 1: Do I already have the raw material? — the line between Deep Think and Deep Research
The pair that's easiest to confuse is Deep Think and Deep Research. Both take their time and return deep answers, so on the surface their roles overlap.
My conclusion is a simple split: if the raw material is already in my hands, Deep Think; if collecting the material is part of the job, Deep Research.
Deep Think shone on problems where I could supply every input myself. One example: I prepared three candidate price revisions for my app's in-app purchases, handed them over together with past revenue trends, and asked for the weaknesses of each. The problem is closed, meaning everything needed to answer it is already in the request, so the depth of reasoning translates directly into answer quality. When I instead asked "think about my pricing strategy" with no material attached, I got a politely reworded set of generalities that wasn't worth the wait.
Deep Research is the mirror image: it earns its keep at the stage where you don't yet know what you don't know. For surveys of unfamiliar territory, the cited, comprehensive collection is hard to replace. But pointed at a question my own material could already answer, it just adds research time. My overall impressions haven't changed much since I wrote Six Months with Gemini Deep Research — an honest look at the gap between expectations and reality.
Once this axis was in place, much of the hesitation disappeared. Work like feature speccing decomposes naturally into two stages: Deep Research up front to gather material, then Deep Think once the material is on the table.
Question 2: Will my assumptions shift mid-way? — Live as a venue for deliberation, not an input method
As long as I treated Gemini Live as "voice instead of keyboard," it had no place in my week. Text is more precise, and dictating code or URLs out loud is hopeless.
What changed my mind was using it for deliberation where the assumptions aren't settled yet. While out on a walk, I was talking through bug-fix priorities for the next update when, mid-conversation, I realized one of the bugs sat in territory the upcoming OS update would change anyway. I could correct that assumption on the spot and keep going. You can do the same in a text chat, but in speech there is no earlier message to go back and rewrite, so you stop hesitating to revise your premises. That's the real difference.
So Live now owns the early, fluid stage of deliberation — concretely, my start-of-week pass where I list the week's tasks and rank them. It does not get the later stages, like finalizing an implementation approach or polishing prose. Re-explaining settled premises out loud is pure duplicated effort.
Question 3: What shape is the deliverable? — what I handed to Omni, and what I deliberately didn't
With Gemini Omni Flash rolling out to all subscribers including AI Plus, video and mixed-media generation is suddenly easy to try. I welcome the wider menu of "creating," but this is where I kept the assignment deliberately narrow.
What Omni got from me: deliverables that are short explanatory media and cheap to redo. Concretely, prototype clips of about fifteen seconds introducing a new app feature — an experiment, one slot per week, to see whether a single-prompt video can replace the static-image-plus-text explainers I used to assemble by hand.
What it didn't get: production assets that actually ship to the App Store listing. The reason isn't quality — it's reproducibility. I haven't yet worked out a procedure that reliably gets the same tone of video from the same prompt, and store assets need series-level consistency. That makes it too early; I'll migrate the work once the prototyping settles the procedure.
Text-centric routine work (release-note drafts and the like) stays with the regular chat and with Gemini 3.5 Flash on the API side. For deciding which workloads belong on the API side at all, I'm reusing the evaluation I described in Where to Adopt Gemini 3.5 Flash GA First — per-workload evaluation and a staged rollout with a model router.
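Taken together, the three questions can be written down as a small decision function. What follows is only a rough sketch of my own assignment, not anything Gemini provides: the `Task` fields, the `deliverable` labels, and the order in which the questions are checked are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    DEEP_RESEARCH = "Deep Research"
    DEEP_THINK = "Deep Think"
    LIVE = "Live"
    OMNI = "Omni"
    CHAT = "regular chat"


@dataclass
class Task:
    # Hypothetical description of a task; the field names are illustrative.
    name: str
    has_material: bool      # Question 1: is the raw material already in hand?
    premises_settled: bool  # Question 2: are the assumptions fixed?
    deliverable: str        # Question 3: "decision", "routine_text", or "media"
    cheap_to_redo: bool = True


def assign_modes(task: Task) -> list[Mode]:
    """Return the modes to use, in order. An empty list means: keep it manual."""
    # Question 2: while premises are still moving, deliberate out loud first.
    if not task.premises_settled:
        return [Mode.LIVE]
    # Question 3: the shape of the deliverable.
    if task.deliverable == "media":
        # Only short, cheap-to-redo media goes to Omni; shipping assets do not.
        return [Mode.OMNI] if task.cheap_to_redo else []
    if task.deliverable == "routine_text":
        return [Mode.CHAT]
    # Question 1: gather material first if it is not already on the table.
    if task.has_material:
        return [Mode.DEEP_THINK]
    return [Mode.DEEP_RESEARCH, Mode.DEEP_THINK]


if __name__ == "__main__":
    spec = Task("feature spec", has_material=False,
                premises_settled=True, deliverable="decision")
    print([m.value for m in assign_modes(spec)])  # ['Deep Research', 'Deep Think']
```

The empty list for production store assets is deliberate: it stands for "not delegated yet," which matches where I drew the line with Omni. The function only encodes the outcome of the sorting; it doesn't replace doing the sorting yourself.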
Two adjustments after a week of running this
After a week, exactly two things needed fixing.
First, the granularity of what goes to Deep Think. I started by sending large units like "the feature spec," but with several intertwined issues in one request, the reasoning felt thinly spread across all of them. Now I send one issue at a time — "compare the weaknesses of these three price options" — and split multi-issue work into separate requests. If Deep Think isn't behaving the way you expect, the diagnostics in Gemini 3's Deep Think Not Working as Expected — five common problems and fixes are a useful checklist.
Second, when to launch Deep Research. Runs can take a while, and firing one off on impulse between tasks meant that by the time results arrived, my head was elsewhere and I skimmed them carelessly. I've now fixed the routine: launch at night, read the next morning. Pushing "researching" into the night and collecting it in the morning mirrors the off-peak scheduling I already use for automated publishing, and it fits the rhythm of an indie developer's week well.
Start by sorting your own last two weeks of requests
If you're stuck on mode selection, my suggested first step isn't to draw up an assignment table. It's to go back through your last two weeks of Gemini requests and sort them into satisfying and disappointing. In my case, the three questions — was the material in hand, did the premises move, what shape is the deliverable — emerged from that sorting, and the assignments then mostly decided themselves.
Counted by features, Gemini has grown more complex. With the right criteria, though, choosing has actually become faster. Thanks for reading — I hope this helps you rethink your own week's assignments.