Seeing the Wind from a Falling Leaf

Abstract

Modeling motion from video is a longstanding goal in computer vision, yet the representations behind motion, i.e., the invisible physical interactions that cause objects to deform and move, remain largely unexplored. In this work, we present an end-to-end differentiable inverse graphics framework that jointly models object geometry, physical properties, and interactions directly from videos. By backpropagating through physics simulations, we recover force representations from object movements. We validate our approach on both synthetic and real-world scenarios, demonstrating that it can estimate plausible force fields, such as the wind patterns acting on a falling leaf. Our method shows promise for physics-based video generation and editing, and helps bridge computer vision and physics by modeling the physical processes that underlie visual data.
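To make the idea of backpropagating through a physics simulation concrete, the following toy sketch may help. It is not the framework described above. It assumes a single point mass, a constant unknown force, known mass and gravity, and noiseless observed positions standing in for motion extracted from video. All names (`simulate`, `loss_and_grad`, `recover_force`) and all constants are hypothetical choices for illustration, and the reverse pass is written by hand rather than with an autodiff library.

```python
# Toy sketch: recover a constant force on a point mass by backpropagating
# through a hand-written simulator. All names and constants are hypothetical.

GRAVITY = (0.0, -9.81)  # m/s^2, assumed known
DT = 0.01               # time step in seconds (assumed)
MASS = 1.0              # kg (assumed known)


def simulate(force, steps):
    """Semi-implicit Euler rollout from rest at the origin; returns positions."""
    x, v = [0.0, 0.0], [0.0, 0.0]
    trajectory = []
    for _ in range(steps):
        for k in range(2):
            v[k] += DT * (GRAVITY[k] + force[k] / MASS)
            x[k] += DT * v[k]
        trajectory.append((x[0], x[1]))
    return trajectory


def loss_and_grad(force, observed):
    """Mean squared position error and its gradient with respect to force."""
    steps = len(observed)
    predicted = simulate(force, steps)
    loss = 0.0
    grad_x, grad_v, grad_f = [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]
    # Reverse pass: walk the time steps backwards, accumulating adjoints.
    for t in reversed(range(steps)):
        for k in range(2):
            err = predicted[t][k] - observed[t][k]
            loss += err * err / steps
            grad_x[k] += 2.0 * err / steps      # direct term: d loss / d x_t
            grad_v[k] += DT * grad_x[k]         # through x_t = x_{t-1} + DT * v_t
            grad_f[k] += DT / MASS * grad_v[k]  # through v_t = v_{t-1} + DT * a
    return loss, grad_f


def recover_force(observed, learning_rate=5.0, iterations=200):
    """Plain gradient descent on the force, starting from zero."""
    force = [0.0, 0.0]
    for _ in range(iterations):
        _, grad = loss_and_grad(force, observed)
        force = [f - learning_rate * g for f, g in zip(force, grad)]
    return force


true_force = (0.8, 0.3)               # synthetic "wind", chosen arbitrarily
observed = simulate(true_force, 100)  # stand-in for positions tracked in video
estimated = recover_force(observed)
print(estimated)
```

In this toy setting the loss is quadratic in the force, so plain gradient descent converges quickly. The step size of 5.0 is tuned to these particular values of `DT`, `MASS`, and the rollout length, and would need adjusting for other settings. A spatially varying force field, deformable geometry, or image-space losses would call for a far richer simulator and renderer than this sketch.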

Additional resources and videos are available on the project page.
