What is D4RT?
D4RT (Dynamic 4D Reconstruction and Tracking) is a unified AI model from Google DeepMind that reconstructs dynamic 4D scenes (3D space plus time) from monocular video, efficiently disentangling camera motion from object motion.
When was D4RT announced?
D4RT was introduced by Google DeepMind on January 22, 2026, via their official blog post.
Is D4RT open-source or publicly available?
No, D4RT is currently a research model with no code, weights, or public demo released; only the technical report and project page are available.
How fast is D4RT compared to previous methods?
It processes one-minute videos in about 5 seconds on a single TPU, up to 300x faster than prior state-of-the-art approaches.
What tasks does D4RT support?
It supports dense (all-pixel) 3D tracking, point cloud reconstruction, camera pose estimation, and long-term prediction, all through a single unified query interface.
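To make the idea of a single query interface concrete, here is a minimal sketch of how one query format can cover tracking, reconstruction, and prediction. All names, shapes, and the dummy motion model below are assumptions for illustration only; D4RT's actual API has not been released.

```python
# Hypothetical sketch of a D4RT-style unified query interface.
# Every name here is an assumption; no real D4RT code is public.
from dataclasses import dataclass

@dataclass
class Query:
    u: float    # pixel x-coordinate in the source frame
    v: float    # pixel y-coordinate in the source frame
    t_src: int  # frame index where the point is observed
    t_tgt: int  # frame index at which to predict its 3D location

def answer_query(q: Query) -> tuple[float, float, float]:
    """Toy stand-in for the model: return an (x, y, z) world-space
    position for the queried pixel at the target time."""
    # Dummy linear-motion model so the sketch runs end to end.
    dt = q.t_tgt - q.t_src
    return (q.u * 0.01, q.v * 0.01, 1.0 + 0.1 * dt)

# One interface, several tasks, by varying the query:
# - 3D tracking:     fix (u, v, t_src), sweep t_tgt over all frames
# - reconstruction:  sweep (u, v) over all pixels with t_tgt == t_src
# - prediction:      set t_tgt beyond the last observed frame
track = [answer_query(Query(320, 240, 0, t)) for t in range(4)]
```

The design point this illustrates is that the tasks differ only in which query fields are swept, so one decoder can serve all of them.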
What are D4RT’s main applications?
Primarily robotics (dynamic navigation/manipulation), augmented reality (low-latency scene understanding), and advancing AI world models for physical perception.
What benchmarks does D4RT excel on?
It achieves state-of-the-art results on MPI Sintel (complex motion), Aria Digital Twin (ego-motion and occlusions), and RealEstate10K (diverse scenes) for 4D reconstruction and tracking.
Who developed D4RT?
D4RT was developed by Google DeepMind researchers Guillaume Le Moing and Mehdi S. M. Sajjadi.




