DeepMind Views New Genie 3 World Model As A Milestone Toward AGI


Google DeepMind rolled out Genie 3, its new general-purpose world model for training AI agents. The lab calls it a key step toward artificial general intelligence (AGI).

Genie 3 goes beyond past models by creating real-time interactive 3D environments from text prompts. It can generate several minutes of photo-realistic or imagined scenes at 720p and 24fps—far longer than Genie 2’s 10 to 20 seconds.

The model remembers what it’s generated to maintain physical consistency over time, without explicit programming. This lets it simulate physics naturally, predicting things like an object teetering before it falls.


DeepMind says Genie 3 is critical for training embodied AI agents that need to operate in complex, dynamic worlds—a major hurdle on the path to AGI.

DeepMind researcher Jack Parker-Holder said:

“We think world models are key on the path to AGI, specifically for embodied agents, where simulating real world scenarios is particularly challenging.”

Genie 3 teaches itself physics and world interactions by generating frames one at a time and referencing past frames for continuity. This auto-regressive design mirrors how humans understand cause and effect.
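The auto-regressive rollout described above can be sketched in a few lines. This is a minimal toy illustration, not DeepMind's implementation: the predictor here is a trivial stand-in for the learned neural network, and the memory-window size is an arbitrary assumption. The point is the loop structure, where each new frame is conditioned on a window of previously generated frames, which is what keeps the world consistent over time.

```python
from collections import deque

def predict_next_frame(context):
    # Stand-in for the learned model: a real world model would be a
    # neural network mapping past frames (and user actions) to the
    # next frame. Here the "frame" is just an integer that increments.
    return context[-1] + 1

def generate(num_frames, memory_window=4):
    """Auto-regressive rollout: frames are generated one at a time,
    each conditioned on a sliding window of earlier frames."""
    frames = [0]  # initial frame (e.g., decoded from the text prompt)
    context = deque(frames, maxlen=memory_window)
    for _ in range(num_frames - 1):
        nxt = predict_next_frame(list(context))
        frames.append(nxt)
        context.append(nxt)  # the model "remembers" what it generated
    return frames

print(generate(5))  # → [0, 1, 2, 3, 4]
```

Because each step feeds the model's own output back in as input, errors and drift would normally accumulate; maintaining consistency over minutes of rollout is what the article frames as Genie 3's advance.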

The model held up in tests with DeepMind’s recent generalist agent, SIMA, which successfully completed tasks like approaching objects in a simulated warehouse.

Parker-Holder added,

“In all three cases, the SIMA agent is able to achieve the goal. It just receives the actions from the agent. So the agent takes the goal, sees the world simulated around it, and then takes the actions in the world. Genie 3 simulates forward, and the fact that it’s able to achieve it is because Genie 3 remains consistent.”

Genie 3 still has limits. Complex physics interactions—like snow displacement in a skiing demo—aren’t perfect. Agent actions are constrained, and multiple independent agents interacting remain tricky. Plus, it only supports a few minutes of continuous interaction, short of ideal training times measured in hours.

Still, the model pushes agents toward self-driven learning: planning, exploration, and trial and error.

Parker-Holder said DeepMind hasn’t yet seen a “Move 37 moment” for embodied agents but believes Genie 3 could kickstart that breakthrough.

“We haven’t really had a Move 37 moment for embodied agents yet, where they can actually take novel actions in the real world,” he said. “But now, we can potentially usher in a new era.”

Genie 3 is in research preview and not publicly available.


Prompt-to-World demo with Genie 3


Genie 3 event simulation


The launch follows DeepMind’s earlier video generation model Veo 3 and Genie 2, which Genie 3 builds on. Look for more on this at TechCrunch’s event in San Francisco, October 27-29, 2025.
