
Google I/O 2025: The most important new launches

Wednesday, May 21, 2025, 15:12, by InfoWorld
At its annual developer conference, Google presented groundbreaking AI innovations that go far beyond mere product updates. Rather, they outline a future in which artificial intelligence is no longer just a tool, but an omnipresent, proactive companion. Here are the most important new products.

Project Astra: the universal AI assistant

Project Astra, which has been continuously developed since its first presentation at Google I/O 2024, embodies Google’s vision of a universal AI assistant. The most revolutionary innovation: Astra can now act proactively. Instead of only reacting to direct commands, the assistant continuously observes its surroundings and decides independently when it should intervene.


“Astra can decide for itself when it wants to speak based on events it sees,” explains Greg Wayne, Research Director at Google DeepMind. “It observes continuously and can then comment.” This ability marks a fundamental change in human-AI interaction.

The potential applications are manifold: if a student is working on their homework, Astra could notice a mistake and point it out. Another scenario would be for the AI assistant to notify intermittent fasters shortly before the end of their fasting period, or to carefully ask whether they really want to eat something outside their meal window.

DeepMind CEO Demis Hassabis calls this ability “reading the room” — a feel for the situation. At the same time, he emphasizes how incredibly difficult it is to teach a computer when it should intervene, what tone it should strike and when it should simply be quiet.

Astra can now also access information from the web and other Google products as well as operate Android devices. In an impressive demo, Bibo Xiu, Product Manager in the DeepMind team, showed how Astra independently paired Bluetooth headphones with a smartphone.

Gemini 2.5: the new flagship model

Gemini 2.5 is at the center of Google’s AI strategy and has received significant upgrades. The family comprises two main variants:

Gemini 2.5 Pro, the flagship model for complex tasks, and

Gemini 2.5 Flash, the more efficient and faster version for everyday applications.

The most exciting new feature is “Deep Think,” an experimental, advanced thinking mode for 2.5 Pro. This feature uses new research techniques that enable the model to consider multiple hypotheses before answering. The results are impressive: Deep Think achieves excellent scores on the USAMO 2025 math benchmarks, leads on LiveCodeBench for competitive coding and scores 84 percent on the MMMU test for multimodal reasoning.

Gemini 2.5 Flash, on the other hand, has been optimized as an efficient “workhorse” and, according to Google, now delivers better performance with 20 to 30 percent less token usage. Flash is already available in the Gemini app for all users and will become generally available for production use in June.

Both models will also receive new features: The Live API introduces audio-visual input and native audio output, allowing for more natural conversations. The model can customize its tone of voice, accent and speaking style – for example, the user can instruct it to use a dramatic voice when telling a story.

As a further innovation, the text-to-speech capabilities now support multiple speakers for the first time and work in over 24 languages with seamless switching between them. In addition, the security measures against indirect prompt injections have been significantly strengthened.
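
For developers, the multi-speaker capability surfaces in the Gemini API’s speech configuration. Below is a minimal sketch using the google-genai Python SDK; the preview TTS model name, the prebuilt voice names and the raw-PCM output format are assumptions based on the current preview API and may change.

```python
# Sketch: requesting a two-speaker audio rendition via the google-genai SDK.
# The model name "gemini-2.5-flash-preview-tts" and the voice names ("Kore",
# "Puck") are assumptions taken from the preview API and may change.
import wave

from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")  # placeholder key

prompt = (
    "TTS the following conversation between Anna and Ben:\n"
    "Anna: Did you watch the I/O keynote?\n"
    "Ben: I did. The audio demos were impressive."
)

response = client.models.generate_content(
    model="gemini-2.5-flash-preview-tts",
    contents=prompt,
    config=types.GenerateContentConfig(
        response_modalities=["AUDIO"],
        speech_config=types.SpeechConfig(
            multi_speaker_voice_config=types.MultiSpeakerVoiceConfig(
                speaker_voice_configs=[
                    types.SpeakerVoiceConfig(
                        speaker="Anna",
                        voice_config=types.VoiceConfig(
                            prebuilt_voice_config=types.PrebuiltVoiceConfig(voice_name="Kore")
                        ),
                    ),
                    types.SpeakerVoiceConfig(
                        speaker="Ben",
                        voice_config=types.VoiceConfig(
                            prebuilt_voice_config=types.PrebuiltVoiceConfig(voice_name="Puck")
                        ),
                    ),
                ]
            )
        ),
    ),
)

# In the preview API the audio is returned as raw PCM (24 kHz, 16-bit mono);
# wrap it in a WAV container so it can be played back directly.
pcm = response.candidates[0].content.parts[0].inline_data.data
with wave.open("dialogue.wav", "wb") as f:
    f.setnchannels(1)
    f.setsampwidth(2)
    f.setframerate(24000)
    f.writeframes(pcm)
```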

AI integration in Google services

AI is also increasingly finding its way into existing Google services. For example, AI mode is now being rolled out in Google Search for all US users, with new functions debuting there first before being incorporated into regular search.

Particularly exciting is “Deep Search,” a function that takes over complex searches and analyzes, compares and merges multiple sources. “Search Live,” on the other hand, enables real-time information searches: if you point your smartphone camera at a building, for example, Google Search immediately provides information on its history, architectural style and opening hours.

Gmail also benefits from AI upgrades. The personalized Smart Replies now take into account users’ personal writing style, previous interactions and even their calendar. “If you have entered an important appointment at 3 p.m., the Smart Reply could suggest moving the meeting to 4 p.m. instead of simply accepting it,” explained a Google representative.

“Thought Summaries” provide insight into the AI’s “thought processes” and make it possible to understand how it comes to certain conclusions. “Thinking Budgets” enable developers to manage and optimize the “thinking time” of their AI applications.
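
Both features are exposed to developers through the Gemini API’s request configuration. The following is a minimal sketch with the google-genai Python SDK; the model string and the budget value are illustrative, and the exact fields may differ as the API evolves.

```python
# Sketch: capping Gemini 2.5 "thinking time" per request and reading back
# thought summaries. The budget (in thinking tokens) is illustrative.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")  # placeholder key

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Summarize the trade-offs between latency and answer quality in three bullet points.",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(
            thinking_budget=512,     # max tokens the model may spend on reasoning; 0 disables thinking on Flash
            include_thoughts=True,   # return thought summaries alongside the answer
        )
    ),
)

# Thought summaries come back as content parts flagged with thought=True.
for part in response.candidates[0].content.parts:
    if getattr(part, "thought", False):
        print("[thought summary]", part.text)
    else:
        print(part.text)
```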

Creative AI tools for media production

Google is also revolutionizing the way images, videos and music are created with a range of new and improved tools. The most exciting new tool is “Flow,” an AI app specifically for filmmakers that can generate complex video scenes from simple text descriptions.

Even renowned director Darren Aronofsky (The Whale, Black Swan, among others) is already using AI in his creative processes, as was revealed at the conference. This underlines the fact that these tools are not just for amateurs, but are also being adopted by professionals.

Imagen 4 was also unveiled; the latest version of Google’s image generation system is expected to set new standards in detail and realism. Veo 3 is set to achieve similar advances in video generation. In the audio sector, Lyria 2 is causing a stir: Google’s music generation system can now create complete pieces of music and edit existing music.

With SynthID, Google has also introduced a system for authenticating and labeling AI-generated content. The tool inserts invisible watermarks into generated media that can later be verified – an important step towards transparency in a world where it is becoming increasingly difficult to distinguish between human-generated and machine-generated content.

Google Beam and XR technologies

By renaming Project Starline to Google Beam and introducing new XR technologies, Google is making it clear that immersive experiences are a central part of its vision for the future.

Google Beam, the successor to Project Starline unveiled in 2021, marks a significant advance in telepresence technology. According to Google, the new version of the solution takes up less space, consumes less energy and still offers the same immersive presence experience.

Beam is Google’s contribution to telepresence; the first commercial products are to be launched in cooperation with HP. (Image: Google LLC)

The integration of real-time voice translation in Google Meet is particularly impressive. This function translates conversations simultaneously and displays subtitles in the desired language, while the speaker’s voice is synthesized in the target language.

Android XR, on the other hand, marks Google’s ambitious foray into the world of augmented reality. The platform offers developers tools to create immersive applications that work seamlessly across smartphones, tablets and XR glasses.

Xreal’s Project Aura prototype, which was developed in collaboration with Google, shows what the future of AR glasses could look like. The smart glasses are almost indistinguishable from normal glasses, a decisive step for the social acceptance of such technologies.

The integration of Gemini on headsets represents another milestone. The AI assistant can not only process voice commands, but also interpret visual information from the user’s surroundings.

Agentic AI: the future of automation

“Agentic AI,” i.e. AI systems that can plan and execute tasks independently, was the focus of numerous announcements, as it marks a paradigm shift in human-machine interaction.

Project Mariner is particularly worth mentioning. First presented in December 2024, the solution now comprises a system of agents that can perform up to ten different tasks simultaneously. Among other things, they are able to look up information, make bookings or complete purchases – all at the same time.

Agent Mode goes even further: here, the AI understands the user’s intention and independently chooses the most efficient way to achieve the desired goal. In a demo, Google showed how a simple command such as “Plan a weekend trip to Berlin” led to a cascade of actions: The agent then researched flights, hotels and activities and presented a complete itinerary – all without further user interaction.

Particularly revolutionary is Agentic Checkout, a function that could fundamentally change the online shopping experience. The agent can take over the entire checkout process, find the best offers, fill in forms and complete the purchase – all with minimal user interaction.

Google emphasizes that security and responsibility are central to its work. The agents explain their actions, ask questions about important decisions and can be interrupted by the user at any time.

Scientific AI applications

The AI research apps presented cover a broad spectrum of scientific disciplines. These specialized applications combine Gemini’s understanding of scientific literature with domain-specific models and simulation capabilities.

Particularly impressive was the demonstration of an application for protein folding that builds on the findings of DeepMind’s AI system AlphaFold. The new version can not only predict the three-dimensional structure of proteins, but also simulate their interactions with other molecules, a crucial step for drug development.

The Jules Coding Assistant in turn represents a quantum leap for AI-supported software development. Unlike conventional code assistants, Jules not only understands programming languages, but also the intention behind the code and the broader context of the project.

Canvas, Google’s collaborative AI environment, is designed to take scientific collaboration to a new level. The platform enables researchers to visualize complex data, develop models and interpret results – all in a shared virtual environment.

Ironwood and Project Mariner, two of Google’s most advanced research prototypes, combine multimodal understanding capabilities with agent-based action and can independently plan and execute complex scientific workflows.

Risks and side effects…

Despite all the euphoria, it is essential to take a critical look at the new solutions. We must not forget that AI systems remain prone to error despite all the progress made. They can invent facts, misinterpret correlations or fail in unexpected situations. The demos at I/O took place in controlled environments; in the messier real world, the results are likely to be less impressive.

Data protection and security aspects cast further shadows. The more context AI systems have, the better they work, but the more sensitive data they have to process. If an AI agent can act on someone’s behalf, what security guarantees are there against misuse or manipulation?

The social impact is perhaps the most difficult to assess. AI systems could transform many professions or even make them redundant. At the same time, these technologies could reinforce existing inequalities. Access to advanced AI requires fast internet, modern devices and often paid subscriptions, resources that are unevenly distributed globally.

Google proved at I/O 2025 that it is at the forefront of AI innovation. But the true success of these technologies will not be measured by benchmarks or demos but by whether they actually improve people’s lives, whether they enable us to be more creative, productive and fulfilled without sacrificing our autonomy, privacy or humanity.
https://www.infoworld.com/article/3992127/google-i-o-2025-the-most-important-new-launches.html
