Google’s AI endgame is here… everything you missed at Google I/O 2026

Fireship | 00:05:43 | May 22, 2026


Google positions Gemini as the interface to reality, embedding AI across products rather than just improving search.

Google I/O 2026 signals a Gemini-driven shift where AI becomes the interface to reality across Google’s products, with rapid model demos, new UX systems, and a bold push into agent-based development.

Summary

Fireship’s recap of Google I/O 2026 centers on Sundar Pichai and Demis Hassabis outlining a Gemini-centric future where AI permeates every product. The keynote framed Gemini as the core engine, spawning variants like Gemini Spark, Gemini Omni, and Gemini Flow, all moving toward an agentic era where search, Gmail, Android, and even wearables become AI-driven interfaces. The video argues that Google aims to redefine how we interact with information by embedding AI agents into every product, effectively prioritizing reality modeling over classic hyperlink-based search. Beyond AI, the video highlights a new web API for developers and the sheer scale of Google’s infrastructure. It also covers hardware shifts, noting the split of TPU chips into training (TPU-T) and inference (TPU-I) roles, which the host jokingly frames as thinking versus hallucinating at global scale. Gemini Omni is described as a multimodal model capable of taking text, video, and sound as input and producing any output, paired with a Neural Expressive design system that generates UI elements on demand. Gemini Flash 3.5 is presented as a fast-but-not-top-tier model, with Gemini 3.5 Pro still under wraps and not expected until later this summer. The segment closes with Chrome’s HTML on Canvas API, which renders HTML elements directly in a canvas, and a plug for the sponsor, Emergent, which takes an agent-based approach to building full-stack apps quickly. Overall, The Code Report paints a future where AI agents, new design systems, and scalable hardware redefine software development and user experience at Google.

Key Takeaways

Gemini Omni is a new multimodal model that handles text, video, and sound to produce any output, signaling a true cross-input capability.
Google splits TPU chips into TPU-T for training and TPU-I for inference, signaling a dedicated hardware path for thinking versus hallucinating at scale.
Gemini Flash 3.5 is a fast model that reportedly performs nearly on par with Opus 4.7 and GPT-5.5, but it is not Google’s top-tier model.
Gemini 3.5 Pro remains under wraps with expectations set for later this summer, creating anticipation and online disappointment among fans.
The Neural Expressive design system aims to generate UI elements on demand within Gemini apps, enabling diagrams, timelines, and mini apps via prompts.
Chrome’s HTML on Canvas API now lets developers render HTML elements directly in a canvas, blending WebGL/WebGPU with traditional DOM UI.
Emergent offers a multi-agent approach to building full-stack apps, allowing parallel front-end, back-end, database, testing, and deployment workstreams.

Who Is This For?

This video is essential viewing for frontend and backend developers curious about the AI-first direction of Google, plus product managers and AI enthusiasts tracking Gemini’s rollout and the implications for tooling like Emergent.

Notable Quotes

"Gemini Omni, a model that takes any input like text, video, and sound and produces any output."

—Shows the multimodal ambition behind Gemini Omni.

"The search is now an AI agent, Gmail is an AI agent, Android is an AI agent... your glasses are an AI agent."

—Captures the broad shift to AI-powered interfaces.

"Neural Expressive... optimized for generating UI elements on demand, like diagrams, timelines, and even mini apps."

—Defines the new UX design system for Gemini apps.

"The speed is not the only thing increasing. But the price of Gemini 3.5 Flash is three times more than the previous version and 30 times more than Gemini 1.5 Flash."

—Highlights cost dynamics amid faster models.

"Emergent spins up specialized agents to work on your app's front end, back end, database, testing, and deployment all in parallel."

—Explains the multi-agent development workflow.
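
The pricing quote above gives two multipliers, and together they imply a third figure the video does not state: how the previous Flash version compared with Gemini 1.5 Flash. The sketch below is only arithmetic on the quoted numbers. It assumes "three times more" and "30 times more" mean plain 3x and 30x ratios, and it uses relative units because the video quotes no absolute prices. The function and constant names are illustrative.

```python
# Multipliers as quoted in the video, read as plain ratios (an assumption).
PRICE_VS_PREVIOUS = 3    # Gemini 3.5 Flash vs. the previous Flash version
PRICE_VS_1_5_FLASH = 30  # Gemini 3.5 Flash vs. Gemini 1.5 Flash


def relative_prices(base_price: float = 1.0) -> dict[str, float]:
    """Return prices relative to Gemini 1.5 Flash, which is set to base_price."""
    newest = base_price * PRICE_VS_1_5_FLASH
    # The previous version's price follows from dividing out the 3x step.
    previous = newest / PRICE_VS_PREVIOUS
    return {
        "Gemini 1.5 Flash": base_price,
        "previous Flash": previous,
        "Gemini 3.5 Flash": newest,
    }


if __name__ == "__main__":
    for name, price in relative_prices().items():
        print(f"{name}: {price:g}x")
```

Under that reading, the previous Flash version would already have cost about 10 times as much as Gemini 1.5 Flash before the latest increase.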

Questions This Video Answers

What is Gemini Omni and why does Google position it as a universal multimodal model?
How do TPU-T and TPU-I differ in Google's new AI infrastructure?
What is Emergent and how does its agent-based approach speed up full-stack development?
What does Chrome HTML on Canvas API enable for web developers in 2026?
When will Gemini 3.5 Pro be released and what should we expect from it?

Google I/O 2026, Gemini Omni, Gemini Spark, Gemini Flow, Neural Expressive, TPU-T, TPU-I, Gemini Flash 3.5, Gemini 3.5 Pro, Android AI agent, Gmail AI agent, Chrome HTML on Canvas API, Emergent AI tooling, Windsurf/Cursor reference

Full Transcript

Yesterday, Google I/O wrapped, and I was able to watch in person as Sundar and Demis laid out an ambitious vision for the future of software. And apparently, that future is Gemini hiding inside of every product like the microplastics in your bloodstream. But the road map is basically take Gemini, append a noun to it, and ship it. Gemini Spark, Gemini Omni, Gemini Flow, and the list goes on. But they're calling it the agentic Gemini era. The search is now an AI agent, Gmail is an AI agent, Android is an AI agent, your glasses are an AI agent. And as I watched the keynote, I realized something: that Google is no longer trying to organize the world's information with blue hyperlinks, because search engines are now an archaic technology. Instead, Google is trying to become the interface to reality itself before Anthropic and OpenAI create better realities.

But luckily, Google I/O wasn't all about AI. I didn't see any updates about Angular, but I did come across a new awesome web API that every web developer should know about. In today's video, we'll break down everything you missed at Google I/O. It is May 22nd, 2026, and you're watching The Code Report.

Whether you love it or hate it, one thing is undeniably impressive about Google, and that's its ability to scale. Not only is it serving its core products to billions of daily active users, but in the last 2 years, they've gone from serving 9.7 trillion tokens per month to a staggering 3.2 quadrillion tokens per month. And that number is going to continue accelerating. In addition, Alphabet's capital expenditures have exploded, building new infrastructure to support all these stupid AI images you guys create with Nano Banana. You ever see a pug dressed like an accountant? No. You want to? Uh.

One thing that makes this massive scale possible is Google's TPU chip, or Tensor Processing Unit. I remember being amazed seeing a TPU at my first Google I/O back in 2018. But this week, they announced they're splitting these chips into two distinct jobs, training and inference, with the TPU-T and TPU-I. In other words, Google now has one chip that's optimized to teach a robot how to think, and another chip that's optimized for it to hallucinate search results on a global scale.

The headline announcement at Google I/O though was Gemini Omni, a model that takes any input like text, video, and sound and produces any output. Demis Hassabis, who might be the smartest guy at Google, appears to be fully world-model-pilled because models like this don't just generate pixels anymore. They understand language, physics, motion, and everything else in your world just well enough to simulate reality on demand. But along with this new model comes an entirely new design system for the Gemini app called Neural Expressive. At first glance, the UI looks like a simple glow-up with new icons and better gradients. But what's unique about it is that it's optimized for generating UI elements on demand, like diagrams, timelines, and even mini apps that didn't exist before your prompt.

Now, when it comes to Google's core large language models, they released Gemini Flash 3.5, which is not the big brain model, but the fast model. According to the trust-me-bro benchmarks, it performs nearly on par with Opus 4.7 and GPT-5.5, but runs at a much faster speed. Like if we look at this trust-me-bro diagram, we see that Flash is entirely in a quadrant of its own in terms of speed and intelligence. However, it's important to remember that this is not their top-tier model. Gemini 3.5 Pro is still under wraps and not expected to release until later this summer, which was very disappointing to a lot of people on the internet.

Speaking of disappointment though, not everybody was happy with the new direction of Google's Antigravity IDE. Antigravity was formerly known as Windsurf and was a VS Code fork for AI coding just like Cursor. And once again, following in the footsteps of Cursor, its latest version looks like an OpenAI Codex clone that's more focused on managing agents than writing code. Old school programmers might not be happy with this change, but the live demo was pretty badass. They used the tool to build a complete operating system from scratch, which took like 12 hours and billions of tokens. But then, they tried to play Doom on it and it failed due to missing drivers. However, live on stage, they had Gemini code up those drivers and within a few seconds, Doom was up and running. The most impressive part was just the sheer speed at which this thing could spit out tokens. But the speed is not the only thing increasing. But the price of Gemini 3.5 Flash is three times more than the previous version and 30 times more than Gemini 1.5 Flash. It's still a lot cheaper than Claude, but not nearly as cheap as it used to be.

Almost everything at I/O involved AI in one way or another. But if you're a web developer, one cool thing you should know about in Chrome is the HTML on Canvas API, which as the name implies, allows you to use HTML elements directly in a canvas now. Awesome. Native HTML elements rendered into the canvas. Woo! That means you can build highly interactive UIs where you control every pixel with tools like WebGL and WebGPU, while simultaneously using HTML for your more basic UI elements.

The only question is which AI coding model should you use to work with this API? Well, that's why you need to know about Emergent, the sponsor of today's video. Everyone's switching between five different coding models these days, but we still need something to help us ship full-stack applications that actually work. And that's exactly where Emergent can help. Right now, I'm using it to build a pull request review dashboard where I can paste in a GitHub link and get an AI summary of all the changes and risks per repo. You still start with a prompt, but instead of one LLM guessing how to build everything, Emergent spins up specialized agents to work on your app's front end, back end, database, testing, and deployment all in parallel. You also don't need to mess with any Supabase wiring or Express boilerplate, because that one prompt sets up your app's database, auth, and APIs. If you're really into self-torture, feel free to keep scaffolding this stuff by hand, or you could just describe the tool you want and let Emergent's agent swarm build it all for you. You can try it out for free at the link below.

This has been The Code Report. Thanks for watching and I will see you in the next one.
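
The scale figures quoted in the transcript (9.7 trillion to 3.2 quadrillion tokens per month over "the last 2 years") work out to roughly a 330x increase. The sketch below turns that into a monthly growth rate and a doubling time. It assumes the two figures are exactly 24 months apart and that growth was smoothly exponential, neither of which the video states, so the results are an illustration rather than reported numbers. The function and constant names are illustrative.

```python
import math

# Figures as quoted in the transcript.
TOKENS_PER_MONTH_START = 9.7e12  # 9.7 trillion
TOKENS_PER_MONTH_END = 3.2e15    # 3.2 quadrillion
MONTHS_ELAPSED = 24              # assumption: "the last 2 years" = 24 months


def growth_factor(start: float, end: float) -> float:
    """Overall multiple between the two monthly token counts."""
    return end / start


def monthly_growth_rate(start: float, end: float, months: int) -> float:
    """Compound monthly rate, assuming smooth exponential growth."""
    return (end / start) ** (1 / months) - 1


def doubling_time_months(start: float, end: float, months: int) -> float:
    """Months needed to double at the implied exponential rate."""
    return months * math.log(2) / math.log(end / start)


if __name__ == "__main__":
    args = (TOKENS_PER_MONTH_START, TOKENS_PER_MONTH_END)
    print(f"growth factor: {growth_factor(*args):.0f}x")
    print(f"monthly rate:  {monthly_growth_rate(*args, MONTHS_ELAPSED):.1%}")
    print(f"doubling time: {doubling_time_months(*args, MONTHS_ELAPSED):.1f} months")
```

Under those assumptions, token volume grew by a bit over a quarter each month, doubling roughly every three months.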