Robbyant Open-Sources LingBot-World, a World Model for Millisecond-Level Real-Time Interaction

Business Wire India

Robbyant, an embodied AI company within Ant Group, today announced the open-source release of LingBot-World, a world model that achieves industry-leading performance in video quality, dynamic fidelity, long-term consistency, and interactivity. Designed for embodied intelligence, autonomous driving, and game development, LingBot-World offers a high-fidelity, highly dynamic, and real-time controllable “digital sandbox” for simulation and training.

This press release features multimedia. View the full release here: https://www.businesswire.com/news/home/20260128459962/en/

LingBot-World leads the industry in applicable scenarios, generation duration, dynamic content, resolution, and more

A common challenge in video generation is “long-term drift”, in which prolonged generation often leads to object deformation, detail collapse, subject disappearance, or scene structure breakdown. To address it, LingBot-World uses multi-stage training and parallelized acceleration to achieve nearly 10 minutes of continuous, stable, and lossless video generation. This capability supports complex, multi-step tasks that require extended temporal coherence.

In terms of interactivity, LingBot-World delivers a generation throughput of approximately 16 FPS and maintains end-to-end interaction latency under one second. Users can control characters and camera perspectives in real time via keyboard or mouse and receive immediate visual feedback. Additionally, users can trigger environmental changes and world events through text commands (for example, adjusting weather conditions, altering visual styles, or initiating specific scenarios), all while preserving consistent spatial relationships within the scene.
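For a sense of scale, the figures quoted in this release can be turned into a rough timing budget. The sketch below is only back-of-the-envelope arithmetic: the constant and function names are illustrative and not part of LingBot-World, and it treats "approximately 16 FPS" as exactly 16 and "nearly 10 minutes" as exactly 10.

```python
# Back-of-the-envelope arithmetic based on figures quoted in the release.
# All names here are illustrative; none come from the LingBot-World codebase.

FPS = 16                 # approximate generation throughput (assumed exact here)
MAX_LATENCY_S = 1.0      # stated upper bound on end-to-end interaction latency
CLIP_MINUTES = 10        # "nearly 10 minutes" of generation (assumed exact here)


def frame_interval_ms(fps: float) -> float:
    """Time between consecutive generated frames, in milliseconds."""
    return 1000.0 / fps


def frames_in(seconds: float, fps: float) -> int:
    """Number of whole frames generated in the given time span."""
    return int(seconds * fps)


frame_ms = frame_interval_ms(FPS)                        # time per frame
frames_within_latency = frames_in(MAX_LATENCY_S, FPS)    # frames per latency bound
frames_in_clip = frames_in(CLIP_MINUTES * 60, FPS)       # frames in a full clip

print(f"{frame_ms} ms per frame")
print(f"{frames_within_latency} frames within the latency bound")
print(f"{frames_in_clip} frames in a {CLIP_MINUTES}-minute clip")
```

Under these assumptions, each frame takes about 62.5 ms to produce, a one-second latency bound spans at most 16 frames, and a 10-minute sequence contains roughly 9,600 frames that must stay mutually consistent.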

LingBot-World also demonstrates strong zero-shot generalization. With just a single real-world image (e.g., an urban street view) or a game screenshot as input, LingBot-World can generate an interactive video stream without requiring additional scene-specific training or data collection, significantly lowering deployment and operational costs across diverse environments.

To address the scarcity of high-quality interactive data for world model training, LingBot-World adopts a hybrid data acquisition strategy. It combines large-scale, carefully curated web videos covering diverse real-world scenes with synthetic data from game engines, including Unreal Engine (UE) pipelines. Clean, UI-free frames are extracted directly from the rendering layer while precise action commands and camera poses are logged simultaneously, so the model receives accurately aligned training signals that capture how actions drive environmental changes.
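To illustrate what "accurately aligned training signals" could look like in practice, here is a minimal sketch that pairs each rendered frame with the logged action and camera pose nearest to it in time. The release does not describe the actual data format or alignment method, so everything here is an assumption: the `LogEntry` and `AlignedSample` records, the six-value pose layout, and the nearest-timestamp matching.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    """One logged control event (hypothetical schema)."""
    t: float             # timestamp in seconds
    action: str          # e.g. a key name such as "W"
    camera_pose: tuple   # assumed layout: (x, y, z, yaw, pitch, roll)


@dataclass(frozen=True)
class AlignedSample:
    """A frame paired with the control signal closest to it in time."""
    frame_index: int
    frame_time: float
    action: str
    camera_pose: tuple


def align(frame_times, log):
    """Pair each frame time with the nearest log entry.

    `log` must be non-empty and sorted by timestamp.
    """
    if not log:
        raise ValueError("log must contain at least one entry")
    log_times = [entry.t for entry in log]
    samples = []
    for i, ft in enumerate(frame_times):
        j = bisect_left(log_times, ft)
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = [k for k in (j - 1, j) if 0 <= k < len(log)]
        best = min(candidates, key=lambda k: abs(log_times[k] - ft))
        samples.append(AlignedSample(i, ft, log[best].action, log[best].camera_pose))
    return samples


log = [
    LogEntry(0.0, "W", (0, 0, 0, 0, 0, 0)),
    LogEntry(0.0625, "A", (1, 0, 0, 0, 0, 0)),
    LogEntry(0.125, "D", (2, 0, 0, 0, 0, 0)),
]
samples = align([0.0, 0.06, 0.13], log)
print([s.action for s in samples])
```

If frames and control signals are captured together at the rendering layer, as the release suggests, each frame may already carry its own action and pose, and no timestamp matching would be needed; the matching step here only illustrates the general idea of keeping the two streams in step.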

LingBot-World excels in long-sequence consistency, real-time responsiveness, and modeling the causal relationship between actions and environmental dynamics. This enables it to “imagine” the physical world in a digital space, providing AI agents with a cost-effective, high-fidelity environment for trial-and-error learning. Its support for diverse scene variations, such as lighting conditions or object placements, further boosts the real-world generalization of embodied AI algorithms.

Zhu Xing, CEO of Robbyant, said, “The release of LingBot-World is the third AI model in the LingBot series dedicated to embodied intelligence. This is an important extension of Ant Group’s artificial general intelligence (AGI) strategy from the digital realm to physical perception, and underscores our full-stack roadmap spanning foundational models, general-purpose applications, and physical-world interaction.”

During Robbyant’s “Evolution of Embodied AI Week” initiative, the company has already unveiled LingBot-Depth, a high-precision spatial perception model, and LingBot-VLA, a vision-language-action model designed to serve as a “universal brain” for real-world robotics.

To learn more about LingBot-World, please visit:

About Robbyant

Robbyant is an embodied intelligence company within Ant Group, dedicated to advancing embodied intelligence through cutting-edge software and hardware technologies. Robbyant independently develops foundational large models for embodied AI and actively explores next-generation intelligent devices, aiming to create robotic companions and caregivers that truly understand and enhance people’s everyday lives and deliver reliable intelligent services across key use cases, such as elderly care, medical assistance, and household tasks.

To learn more about Robbyant, please visit: www.robbyant.com

