GSWorld is a photo-realistic simulator that combines 3D Gaussian Splatting with physics engines to reduce the sim2real visual gap and to support diverse downstream applications.

Creating a digital twin is getting closer to reality, but how can we maximize its utility for robotic manipulation?

To this end, we propose GSWorld and explore five applications:

  1. Learning zero-shot sim2real pixel-to-action manipulation policies with photo-realistic rendering
  2. Automated high-quality DAgger data collection for adapting policies to deployment environments
  3. Reproducible benchmarking of real-robot manipulation policies in simulation
  4. Simulation data collection by virtual teleoperation
  5. Sim2real visual reinforcement learning

Summary Video

GSWorld Generated Simulation Data

GSWorld generates photo-realistic simulation data with multiple camera viewpoints. The videos below showcase our simulation capabilities across different manipulation tasks and robot platforms.

FR3

Arrange Cans

Stack Cans

Place Box

Pour Sauce

XArm6

Align Cans

Tidy Table

Grasp Banana

Zero-shot Sim2real Imitation

Our zero-shot sim2real approach enables policies trained entirely in GSWorld simulation to perform manipulation tasks directly on real robots without any real-world fine-tuning. The videos demonstrate successful task execution across four manipulation tasks.

Arrange Cans

Place Box

Pour Sauce

Stack Cans

Closed-Loop DAgger Results

Our closed-loop DAgger pipeline iteratively evaluates a policy in GSWorld, records failure cases, restores the scene to a state before each failure, and collects corrective data from there, improving the policy with every round.
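The loop described above can be sketched on a toy problem. This is a minimal illustration, not the GSWorld implementation: `ToyEnv`, `expert_action`, `fit_policy`, `rollout`, and `dagger` are hypothetical names, the "scene" is a single integer, the expert is a hand-written rule, and the policy is a lookup table rather than a visuomotor network.

```python
from dataclasses import dataclass


@dataclass
class ToyEnv:
    """Hypothetical 1-D stand-in for a simulated scene: walk from 0 to `goal`."""
    goal: int = 5
    horizon: int = 12
    state: int = 0
    t: int = 0

    def reset(self, state=0):
        # Resetting to an arbitrary state models "restore the scene before failure".
        self.state, self.t = state, 0
        return self.state

    def step(self, action):
        self.state += action
        self.t += 1
        success = self.state == self.goal
        return self.state, success or self.t >= self.horizon, success


def expert_action(state, goal):
    # Stand-in for the corrective demonstrator: always move toward the goal.
    return 1 if state < goal else -1


def fit_policy(dataset):
    # Tabular behavior cloning: remember the expert label for each seen state.
    return {state: action for state, action in dataset}


def rollout(env, table, start):
    """Run the learned policy; return (success, states visited before each step)."""
    state, done, success, visited = env.reset(start), False, False, []
    while not done:
        visited.append(state)
        # Unseen states map to action 0 (stay), so an untrained policy fails.
        state, done, success = env.step(table.get(state, 0))
    return success, visited


def dagger(env, iterations=4, starts=(0,), rewind=3, correction_steps=3):
    dataset, table, history = [], {}, []
    for _ in range(iterations):
        wins, recover_states = 0, []
        # 1) Evaluate the current policy and record failures.
        for start in starts:
            success, visited = rollout(env, table, start)
            wins += success
            if not success:
                # 2) Pick a state a few steps before the episode failed.
                recover_states.append(visited[max(0, len(visited) - rewind)])
        history.append(wins / len(starts))
        # 3) Restore each pre-failure state and collect corrective labels.
        for state in recover_states:
            s = env.reset(state)
            for _ in range(correction_steps):
                a = expert_action(s, env.goal)
                dataset.append((s, a))
                s, done, _ = env.step(a)
                if done:
                    break
        # 4) Retrain on the aggregated dataset and repeat.
        table = fit_policy(dataset)
    return table, history
```

Calling `dagger(ToyEnv())` returns the final table and the per-iteration success rate. Limiting each correction to `correction_steps` labels is a choice made only so the toy needs more than one round to succeed.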

FR3 Sim2Real

Arrange Cans

Place Box

Pour Sauce

Stack Cans

XArm6 Real2Sim2Real

Instead of starting from a randomly initialized policy, we start from a policy trained with real-world teleoperation data.

Align Cans (3x)

Grasp Banana (2x)

Tidy Table (3x)

Align Cans (3x)

Grasp Banana (3x)

Tidy Table (1x)

Visual Benchmarking

We benchmark two state-of-the-art manipulation policies, ACT and Pi0, in GSWorld. Each policy is run on the same four tasks (Arrange Cans, Place Box, Pour Sauce, and Stack Cans) both in simulation and on the real robot, so the rollouts can be compared side by side. This gives robotic manipulation research a standardized, repeatable evaluation platform.
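One way to make such a comparison repeatable is to evaluate every policy on the same seeded set of initial conditions. The sketch below assumes this setup and is not GSWorld's evaluation code: `make_initial_states`, `benchmark`, and the `run_episode(task, start)` callable are hypothetical, and a single float stands in for a full scene configuration.

```python
import random


def make_initial_states(task, episodes, seed):
    # Seeding with a string derived from (task, seed) gives every policy the
    # same start states for a task, independent of evaluation order.
    rng = random.Random(f"{task}-{seed}")
    return [rng.uniform(-1.0, 1.0) for _ in range(episodes)]


def benchmark(policies, tasks, episodes=20, seed=0):
    """Return {policy_name: {task: success_rate}} over shared start states.

    `policies` maps a name to a callable run_episode(task, start) -> bool.
    """
    results = {name: {} for name in policies}
    for task in tasks:
        starts = make_initial_states(task, episodes, seed)
        for name, run_episode in policies.items():
            wins = sum(bool(run_episode(task, start)) for start in starts)
            results[name][task] = wins / episodes
    return results
```

A call such as `benchmark({"policy_a": run_a, "policy_b": run_b}, ["Stack Cans"], seed=0)` then yields per-task success rates that are identical across reruns, provided each `run_episode` is itself deterministic.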

ACT Benchmarking

Arrange Cans (Sim)

Place Box (Sim)

Pour Sauce (Sim)

Stack Cans (Sim)

Arrange Cans (Real)

Place Box (Real)

Pour Sauce (Real)

Stack Cans (Real)

Pi0 Benchmarking

Arrange Cans (Sim)

Place Box (Sim)

Pour Sauce (Sim)

Stack Cans (Sim)

Arrange Cans (Real)

Place Box (Real)

Pour Sauce (Real)

Stack Cans (Real)

Virtual Teleoperation

GSWorld enables virtual teleoperation where human operators can control simulated robots in photo-realistic environments. This capability allows for efficient data collection and policy demonstration without requiring physical robot hardware.

R1 Virtual Teleoperation (6x)

Zero-shot Sim2real VRL

We demonstrate zero-shot sim2real transfer for visual reinforcement learning (VRL). Policies trained in GSWorld with Soft Actor-Critic (SAC) are deployed directly on real robots and outperform policies trained in other simulation environments. The videos below compare them with policies trained in ManiSkill on the same tasks.

Grasp Banana - GSWorld (4x)

Grasp Banana - ManiSkill (3x)

Tidy Table - GSWorld (3x)

Tidy Table - ManiSkill (3x)

BibTeX

@inproceedings{jiang2025gsworld,
title={GSWorld: Closed-Loop Photo-Realistic Simulation Suite for Robotic Manipulation},
author={Jiang, Guangqi and Chang, Haoran and Qiu, Ri-Zhao and Liang, Yutong and Ji, Mazeyu and Zhu, Jiyue and Dong, Zhao and Zou, Xueyan and Wang, Xiaolong},
booktitle={arXiv preprint arXiv:2510.20813},
year={2025}
}