Gemini 3.5 Pro Missed June. The Slip Says More Than the Launch Will.

Gemini 3.5 Pro Google AI Models Vertex AI LLM Ops

At the end of my Gemini 3.5 Flash review I said I wasn't making a final call on the Gemini 3.5 family until Pro was on the table. The table stayed empty. June came and went, the public API stayed closed, and Gemini 3.5 Pro is now targeting general availability in July, locked until then in a limited Vertex AI preview for a small set of enterprise customers.

So this is the follow-up nobody wants to write: a post about a model that didn't ship. I'm writing it anyway, because the slip itself is data — about what's hard at the frontier right now, and about how expensive a missed month has become.

The timeline, straight

May 19    Unveiled at Google I/O — GA promised for June
June      Enterprise-only Vertex AI preview; public API closed
June 23   Still no GA; prediction markets ~50-55% on a June ship
June 29   Reports: GA re-targeted to July
July 2    (today) Still in limited preview

At I/O, Sundar Pichai told a room expecting immediate access to wait another month, and reportedly got audible groans for it. The month passed. And this isn't an isolated wobble: it's Google's second major AI delivery miss of the year, after Gemini Ultra 1.5 slipped three months earlier in 2026.

Why it slipped (reportedly)

The reporting attributes the delay to quality refinements after early enterprise testing — specifically token-efficiency issues, coding performance, and long-task, multi-step reasoning that weren't yet at flagship standard.

Read that list again next to my Flash review. Long-context retrieval regressing and deep reasoning trailing the frontier were exactly Flash's weak spots — and closing them is Pro's entire job description: the 2-million-token context window, the Deep Think reasoning mode, the flagship tier of the family. If those are the pieces that aren't ready, the delay is at least honest. A flagship that loses the thread at token 800,000 shipped "on time" would be worse than a flagship that's late — I say that as someone whose agent sessions are one long stress test of exactly that failure mode.

The week that made it worse

In the week of June 21–27, four senior Gemini researchers announced they were leaving Google — for Anthropic. No names or roles were reported, and the honest read is that we don't know why: technical direction, execution frustration, or simply better offers. I'm not going to build a narrative on four resignations. But as a platform-stability signal in the same month your flagship slips, it's the kind of thing the reporting rightly tells developers to keep an eye on.

And the market did not wait politely. Anthropic shipped Claude Sonnet 5 on June 30 — an agent-grade model at $2/$10 introductory pricing, made the default for every Free and Pro Claude user the next morning — landing in the exact window Google left open. In 2026, missing a month doesn't mean shipping late. It means shipping into a market that re-routed around you while you were gone.

What I'm doing while Pro is a promise

The routing split stands. Flash still carries the breadth work — cheap, fast, benchmark-leading at tool orchestration — and the expensive tier still handles the one step that has to be smart. Nothing about a delayed Pro changes that; it just delays the day I re-run the comparison.

Don't rebuild around a July date. That's Bind AI's advice and I'll co-sign it with a scar: I once made a frontier model my default four days before it vanished by government directive. Model availability promises aren't architecture inputs — in either direction. Build so that the model is a config line, and launch dates stop mattering.

A 2M-token context is a spec sheet until it's an API. Before you plan around double-everyone-else's window, check whether you actually need it — in my experience, most "long context" workloads are retrieval problems wearing a costume. The workloads that genuinely need 2M tokens will still need them in July.

The verdict

The honest one-liner: the most interesting thing Gemini 3.5 Pro has done so far is not ship. The delay tells you where the frontier actually hurts — token efficiency, long-horizon reasoning, coding at flagship standard — and the quiet weeks around it tell you what a missed month costs when competitors ship agent-grade models into your window at a discount.

If Pro lands in July with the long-context and reasoning gaps genuinely closed, Flash-plus-Pro becomes a fearsome routing pair and I'll write the real review against my own MCP server, as promised. Until then, Flash carries the family name, and my final call stays where I left it in the last post: on hold, with the meter running.

Sources

The I/O announcement and family specs are in Google's Gemini 3.5 post; the July slip, preview status, and quality-refinement reporting come from Tech Insider's delay coverage; the researcher departures and the developer guidance are from Bind AI's write-up. Treat the $15/$60 pricing figure floating around as an estimate — Google hasn't published a rate card.

Share this article

Share on X LinkedIn Bluesky Reddit WhatsApp Email

More writing

Like what you read?

Stay in the loop.

New articles on engineering, architecture, and building software that lasts. Straight to your inbox.

or follow

GitHub LinkedIn @flcn16