GSWorld is a photo-realistic simulator that combines 3D Gaussian Splatting with physics engines to reduce the sim2real visual gap and support diverse downstream applications.
Creating a digital twin is getting closer to reality, but how can we maximize its utility for robotic manipulation?
To this end, we propose GSWorld and explore its applications below.
GSWorld generates photo-realistic simulation data with multiple camera viewpoints. The videos below showcase our simulation capabilities across different manipulation tasks and robot platforms.
Arrange Cans
Stack Cans
Place Box
Pour Sauce
Align Cans
Tidy Table
Grasp Banana
Our zero-shot sim2real approach enables policies trained entirely in GSWorld simulation to perform manipulation tasks directly on real robots without any real-world fine-tuning. The videos demonstrate successful task execution across four manipulation tasks.
Arrange Cans
Place Box
Pour Sauce
Stack Cans
With DAgger, we iteratively evaluate the policy in GSWorld, record failure cases, restore the scene to its state before each failure, and collect corrective data, improving policies in a closed loop.
Arrange Cans
Place Box
Pour Sauce
Stack Cans
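The closed loop described above (evaluate, record failures, restore the scene before the failure, collect corrections, retrain) can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in and not GSWorld's API: `ToySim` is a 1-D toy simulator with save/restore, `expert_action` plays the role of the corrective supervisor, the policy is a single linear gain, and the success and failure thresholds are arbitrary.

```python
from dataclasses import dataclass

GOAL_TOL = 0.1    # toy success criterion: |x| < GOAL_TOL
FAIL_BOUND = 5.0  # toy failure criterion: |x| > FAIL_BOUND


@dataclass
class ToySim:
    """Hypothetical 1-D stand-in for a simulator that can save and restore state."""
    x: float = 0.0

    def get_state(self) -> float:
        return self.x

    def set_state(self, state: float) -> None:
        self.x = state

    def step(self, action: float) -> float:
        self.x += action
        return self.x


def expert_action(obs: float) -> float:
    """Toy corrective supervisor: proportional controller toward x = 0."""
    return -0.5 * obs


def rollout(sim, act, x0, horizon=20):
    """Run `act` from x0; return (success, visited states, (obs, action) pairs)."""
    sim.set_state(x0)
    states, pairs = [x0], []
    for _ in range(horizon):
        obs = sim.get_state()
        action = act(obs)
        pairs.append((obs, action))
        states.append(sim.step(action))
        if abs(sim.x) < GOAL_TOL:
            return True, states, pairs
        if abs(sim.x) > FAIL_BOUND:
            return False, states, pairs
    return False, states, pairs


def fit_linear_policy(dataset):
    """Least-squares fit of action = w * obs on the aggregated dataset."""
    num = sum(o * a for o, a in dataset)
    den = sum(o * o for o, _ in dataset)
    return num / den


def dagger(sim, w, starts, iterations=3):
    """Return the final policy gain and the success rate at each evaluation."""
    dataset, history = [], []
    for _ in range(iterations):
        policy = lambda obs, w=w: w * obs
        # 1) Evaluate the current policy in simulation.
        results = [rollout(sim, policy, x0) for x0 in starts]
        history.append(sum(ok for ok, _, _ in results) / len(starts))
        # 2) Record failure cases.
        failures = [states for ok, states, _ in results if not ok]
        if not failures:
            break
        for states in failures:
            # 3) Restore the scene to the last state before the failure.
            restore = states[-2]
            # 4) Collect corrective data from the supervisor, starting there.
            _, _, corrective = rollout(sim, expert_action, restore)
            dataset.extend(corrective)
        # 5) Retrain on the aggregated dataset.
        w = fit_linear_policy(dataset)
    return w, history


if __name__ == "__main__":
    gain, success_history = dagger(ToySim(), w=0.5, starts=[3.0, -2.0, 1.5])
    print(gain, success_history)
```

The key design point the sketch tries to capture is that corrections start from the restored pre-failure state, so the added data covers exactly the states the current policy reaches and mishandles.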
Instead of starting from a randomly initialized policy, we start from a policy trained with real-world teleoperation data.
Align Cans (3x)
Grasp Banana (2x)
Tidy Table (3x)
Align Cans (3x)
Grasp Banana (3x)
Tidy Table (1x)
We benchmark state-of-the-art manipulation policies (ACT and Pi0) in our GSWorld simulation environment. The comparison shows how different methods perform across various manipulation tasks, providing a standardized evaluation platform for robotic manipulation research.
Arrange Cans (Sim)
Place Box (Sim)
Pour Sauce (Sim)
Stack Cans (Sim)
Arrange Cans (Real)
Place Box (Real)
Pour Sauce (Real)
Stack Cans (Real)
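As a rough illustration of how such a benchmark could be tabulated, the sketch below aggregates per-trial outcomes into success rates keyed by policy, task, and domain (sim or real). The record layout, the function name `success_rates`, and the sample records are assumptions made for this example; the outcomes are placeholders, not results from GSWorld.

```python
from collections import defaultdict


def success_rates(trials):
    """Aggregate (policy, task, domain, success) records into success rates."""
    counts = defaultdict(lambda: [0, 0])  # key -> [successes, total trials]
    for policy, task, domain, success in trials:
        entry = counts[(policy, task, domain)]
        entry[0] += int(success)
        entry[1] += 1
    return {key: wins / total for key, (wins, total) in counts.items()}


# Placeholder outcomes for illustration only; NOT measured results.
trials = [
    ("policy_a", "task_x", "sim", True),
    ("policy_a", "task_x", "sim", False),
    ("policy_a", "task_x", "real", True),
    ("policy_b", "task_x", "sim", True),
]

rates = success_rates(trials)
for key in sorted(rates):
    print(key, rates[key])
```

Keeping the domain in the key makes it straightforward to place simulated and real success rates for the same policy and task side by side.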
GSWorld enables virtual teleoperation where human operators can control simulated robots in photo-realistic environments. This capability allows for efficient data collection and policy demonstration without requiring physical robot hardware.
R1 Virtual Teleoperation (6x)
We demonstrate zero-shot sim2real transfer for visual reinforcement learning (VRL) tasks. Policies trained in GSWorld with SAC (Soft Actor-Critic) are deployed directly on real robots and outperform policies trained in other simulation environments; the videos compare them with policies trained in ManiSkill.
Grasp Banana - GSWorld (4x)
Grasp Banana - ManiSkill (3x)
Tidy Table - GSWorld (3x)
Tidy Table - ManiSkill (3x)
@inproceedings{jiang2025gsworld,
title={GSWorld: Closed-Loop Photo-Realistic Simulation Suite for Robotic Manipulation},
author={Jiang, Guangqi and Chang, Haoran and Qiu, Ri-Zhao and Liang, Yutong and Ji, Mazeyu and Zhu, Jiyue and Dong, Zhao and Zou, Xueyan and Wang, Xiaolong},
booktitle={arXiv preprint arXiv:2510.20813},
year={2025}
}