AI Illuminated
The AI Illuminators
25 episodes
1 day ago
A new way to keep up with AI research. Delivered to your ears. Illuminated by AI. Part of the GenAI4Good initiative.
Courses
Education
All content for AI Illuminated is the property of The AI Illuminators and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
Dynamic 3D Gaussian Tracking for Graph-Based Neural Dynamics Modeling
AI Illuminated
15 minutes 31 seconds
1 year ago
[00:00] Introduction to 3D Gaussian tracking for robotic manipulation

[00:26] Limitations of current video prediction methods

[01:11] Advantages of 3D Gaussian representation

[02:04] Graph Neural Networks for modeling object dynamics

[02:54] Control particle implementation and computation reduction

[03:42] Physics-based optimization for prediction stability

[04:25] Integration with real-world robotic systems

[05:12] Performance testing across different materials

[05:58] Advantages over traditional physics-based methods

[09:16] Implementation of object detection systems

[10:02] Data collection and synchronization challenges

[14:39] Long-term prediction capabilities and limitations


Authors: Mingtong Zhang, Kaifeng Zhang, Yunzhu Li


Affiliations: University of Illinois Urbana-Champaign, Columbia University


Abstract: Videos of robots interacting with objects encode rich information about the objects' dynamics. However, existing video prediction approaches typically do not explicitly account for the 3D information from videos, such as robot actions and objects' 3D states, limiting their use in real-world robotic applications. In this work, we introduce a framework to learn object dynamics directly from multi-view RGB videos by explicitly considering the robot's action trajectories and their effects on scene dynamics. We utilize the 3D Gaussian representation of 3D Gaussian Splatting (3DGS) to train a particle-based dynamics model using Graph Neural Networks. This model operates on sparse control particles downsampled from the densely tracked 3D Gaussian reconstructions. By learning the neural dynamics model on offline robot interaction data, our method can predict object motions under varying initial configurations and unseen robot actions. The 3D transformations of Gaussians can be interpolated from the motions of control particles, enabling the rendering of predicted future object states and achieving action-conditioned video prediction. The dynamics model can also be applied to model-based planning frameworks for object manipulation tasks. We conduct experiments on various kinds of deformable materials, including ropes, clothes, and stuffed animals, demonstrating our framework's ability to model complex shapes and dynamics. Our project page is available at this https URL.
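Sketch: the abstract describes two geometric steps around the learned dynamics model: downsampling sparse control particles from the dense Gaussian centers, and interpolating the dense Gaussians' motion from the control particles' motion. The Python below is a minimal illustration of that idea under loud assumptions, not the authors' implementation. Farthest point sampling and inverse-distance weighting are stand-in choices that the abstract does not name, only translations are handled (no rotations), and the function names are hypothetical. It uses only the standard library.

```python
import math


def farthest_point_sampling(points, k):
    """Return indices of k well-spread points (assumed downsampling scheme)."""
    chosen = [0]  # arbitrary seed: start from the first point
    # Distance from every point to its nearest already-chosen point.
    nearest = [math.dist(p, points[0]) for p in points]
    while len(chosen) < min(k, len(points)):
        # Pick the point that is farthest from everything chosen so far.
        idx = max(range(len(points)), key=nearest.__getitem__)
        chosen.append(idx)
        nearest = [min(d, math.dist(p, points[idx]))
                   for d, p in zip(nearest, points)]
    return chosen


def interpolate_motion(points, controls, control_motion, eps=1e-8):
    """Move dense points by an inverse-distance blend of control displacements.

    Stand-in for the paper's interpolation step; translation only.
    """
    moved = []
    for p in points:
        # Closer control particles get larger weights; eps avoids division by zero.
        weights = [1.0 / (math.dist(p, c) ** 2 + eps) for c in controls]
        total = sum(weights)
        shift = [sum(w * m[axis] for w, m in zip(weights, control_motion)) / total
                 for axis in range(3)]
        moved.append(tuple(p[axis] + shift[axis] for axis in range(3)))
    return moved


# Toy "rope": 11 dense points on a line, reduced to 3 control particles.
rope = [(float(i), 0.0, 0.0) for i in range(11)]
control_idx = farthest_point_sampling(rope, 3)
controls = [rope[i] for i in control_idx]

# Hypothetical predicted displacements: lift only the first control particle.
predicted = [(0.0, 0.0, 1.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
deformed_rope = interpolate_motion(rope, controls, predicted)
```

In the framework the abstract describes, the control-particle displacements would come from the Graph Neural Network dynamics model rather than from a hand-written list; the toy values here only show how a sparse prediction can drive a dense set of points.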


Link: https://arxiv.org/abs/2410.18912
