Skip to content

ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation

License

Notifications You must be signed in to change notification settings

Dingpx/ManiGaussian

 
 

Repository files navigation

ManiGaussian

🦾 ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation
Guanxing Lu, Shiyi Zhang, Ziwei Wang, Changliu Liu, Jiwen Lu, Yansong Tang

[Project Page] | [Paper]

ManiGaussian is an end-to-end behavior cloning agent that learns to perform various language-conditioned robotic manipulation tasks, which consists of a dynamic Gaussian Splatting framework and a Gaussian world model to model scene-level spatiotemporal dynamics. The dynamic Gaussian Splatting framework models the propagation of semantic features in the Gaussian embedding space for manipulation, and the Gaussian world model parameterizes distributions to provide supervision by reconstructing the future scene.

📝 TODO

  • Release pretrained checkpoints.
  • Provide more results (csv files).
  • Provide a Dockerfile for installation.

💻 Installation

NOTE: ManiGaussian is mainly built upon the GNFactor repo by Ze et al.

See INSTALL.md for installation instructions.

See ERROR_CATCH.md for error catching.

🛠️ Usage

The following steps are structured in order.

🦉 Generate Demonstrations

To generate demonstrations for all 10 tasks we use in our paper, run:

bash scripts/gen_demonstrations_all.sh

📈 Training

We use wandb to log some curves and visualizations. Login to wandb before running the scripts.

wandb login

To train our ManiGaussian without semantic features and deformation predictor (the fastest version), run:

bash scripts/train_and_eval_w_geo.sh ManiGaussian_BC 0,1 12345 ${exp_name}

where the exp_name can be specified as you like. You can also train other baselines such as GNFACTOR_BC and PERACT_BC.

To train our ManiGaussian without semantic features, run:

bash scripts/train_and_eval_w_geo_dyna.sh ManiGaussian_BC 0,1 12345 ${exp_name}

To train our ManiGaussian without deformation predictor, run:

bash scripts/train_and_eval_w_geo_sem.sh ManiGaussian_BC 0,1 12345 ${exp_name}

To train our vanilla ManiGaussian, run:

bash scripts/train_and_eval_w_geo_sem_dyna.sh ManiGaussian_BC 0,1 12345 ${exp_name}

We train our ManiGaussian on two NVIDIA RTX 4090 GPUs for ~1 day.

🧪 Evaluation

To evaluate the checkpoint, you can use:

bash scripts/eval.sh ManiGaussian_BC ${exp_name} 0

NOTE: The performances on push_buttons and stack_blocks may fluctuate slightly due to different variations.

📊 Analyze Evaluation Results

After evaluation, the following command is used to compute the average success rates. For example, to compute the average success rate of our provided csv files, run:

python scripts/compute_results.py --file_paths ManiGaussian_results/w_geo/0.csv ManiGaussian_results/w_geo/1.csv ManiGaussian_results/w_geo/2.csv --method last

🏷️ License

This repository is released under the MIT license.

🙏 Acknowledgement

Our code is built upon GNFactor, LangSplat, GPS-Gaussian, splatter-image, PerAct, RLBench, pixelNeRF, ODISE, and CLIP. We thank all these authors for their nicely open sourced code and their great contributions to the community.

🥰 Citation

If you find this repository helpful, please consider citing:

@article{lu2024manigaussian,
      title={ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation}, 
      author={Lu, Guanxing and Zhang, Shiyi and Wang, Ziwei and Liu, Changliu and Lu, Jiwen and Tang, Yansong},
      journal={arXiv preprint arXiv:2403.08321},
      year={2024}
}

About

ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 98.1%
  • Shell 1.9%