This is the primary repository for my Master's Thesis conducted at Aalborg University, Denmark. The focus of this project is to apply Deep Reinforcement Learning to acquire a robust policy that allows a robot to grasp arbitrary objects from compact octree observations.
Below are some examples of grasping objects using Panda and UR5 with RG2 gripper in the same environment. The observations, represented as a point cloud, are visualised on the right, where the floating coordinate frame represents the camera pose. (3x speed)
An example of Sim2Real transfer can be seen below (trained inside simulation, no re-training in the real world).
Local Installation (click to expand)
- OS: Ubuntu 20.04 (Focal)
- GPU: CUDA is required to process octree observations on GPU. Everything else should function normally on CPU.
These are the dependencies required to use the entirety of this project. If no "(tested with `<version>`)" note is specified, the latest release from the relevant distribution is expected to function properly.
- Python 3 (tested with 3.8)
- PyTorch (tested with 1.7)
- ROS 2 Foxy
- Ignition Dome
- gym-ignition
  - The AndrejOrsula/gym-ignition fork is currently required
- MoveIt 2
- O-CNN
  - The AndrejOrsula/O-CNN fork is currently required
Several other dependencies can be installed via `pip` with this one-liner.

```bash
pip3 install numpy scipy optuna seaborn stable-baselines3[extra] sb3-contrib open3d trimesh pcg-gazebo
```
All other dependencies are pulled from git and built together with this repository, see drl_grasping.repos for more details.
In case you run into any problems along the way, check the Dockerfile, which includes the full instructions.
Clone this repository and import VCS dependencies. Then build with colcon.
```bash
# Create workspace for the project
mkdir -p drl_grasping/src && cd drl_grasping/src
# Clone this repository
git clone https://github.com/AndrejOrsula/drl_grasping.git
# Import and install dependencies
vcs import < drl_grasping/drl_grasping.repos && cd ..
rosdep install -r --from-paths src -i -y --rosdistro ${ROS_DISTRO}
# Build with colcon
colcon build --merge-install --symlink-install --cmake-args "-DCMAKE_BUILD_TYPE=Release"
```
Before using, remember to source the ROS 2 workspace overlay.

```bash
source <drl_grasping dir>/install/local_setup.bash
```
This enables:
- Use of the `drl_grasping` Python module (a quick import check is shown below)
- Execution of scripts and examples via `ros2 run drl_grasping <executable>`
- Launching of setup scripts via `ros2 launch drl_grasping <launch_script>`
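As a quick check that the overlay is sourced correctly, the module should be importable from Python (a minimal sketch; it assumes nothing beyond the module name):

```python
# Verify that the `drl_grasping` Python module is available
# after sourcing the workspace overlay.
import drl_grasping

print(drl_grasping.__file__)
```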
Docker (click to expand)
- OS: Any system that supports Docker should work (Linux, Windows, macOS). However, only Linux was properly tested.
- GPU: CUDA is required to process octree observations on GPU. Therefore, only Docker images with CUDA support are currently available.
Before starting, make sure your system is set up to use Nvidia Docker, e.g.:
```bash
# Docker
curl https://get.docker.com | sh \
  && sudo systemctl --now enable docker
# Nvidia Docker
distribution=$(. /etc/os-release; echo $ID$VERSION_ID) \
  && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
  && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update && sudo apt-get install -y nvidia-docker2
sudo systemctl restart docker
```
The easiest way to try out this project is by using the included Dockerfile.
Instead of building it locally, you can pull a pre-built Docker image directly from Docker Hub. Currently, there is only a development image available.
```bash
docker pull andrejorsula/drl_grasping:latest
```
To run the image, please use the included `run.bash` script, as it significantly simplifies the setup.

```bash
run.bash andrejorsula/drl_grasping:latest /bin/bash
```
If you are struggling to get CUDA working on your system with an Nvidia GPU (no `nvidia-smi` output), you might need to use a different version of the CUDA base image that supports your driver version.
This repository contains environments for robotic manipulation that are compatible with OpenAI Gym. All of these make use of the Ignition Gazebo robotic simulator, which is interfaced via gym-ignition.
Currently, the following environments are included inside this repository. Take a look at their gym environment registration and source code if you are interested in configuring them (a usage sketch follows the list).
- Reach task
  - Observation variants
    - Reach - simulation states
    - ReachColorImage
    - ReachDepthImage
    - ReachOctree (with and without color features)
- Grasp task
  - Observation variants
    - GraspOctree (with and without color features)
  - Includes GraspCurriculum
    - This curriculum can be used to progressively increase the difficulty of the task by automatically adjusting behaviour based on the current success rate. It affects the following:
      - Workspace size
      - Number of objects
      - Termination state (task is divided into hierarchical sub-tasks, further guiding the agent)
TODO: Add animation for Reach task
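Once the workspace is sourced, these environments can be created through the standard Gym API. A minimal sketch (the environment ID below is illustrative; check the gym environment registration for the actual names):

```python
import gym

import drl_grasping  # assumed to register the environments on import

# The environment ID is an assumption for illustration; see the
# gym environment registration in this repository for the real IDs.
env = gym.make("Grasp-Octree-Gazebo-v0")

observation = env.reset()
done = False
while not done:
    action = env.action_space.sample()  # random agent, for demonstration
    observation, reward, done, info = env.step(action)
env.close()
```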
These environments can be wrapped by a randomizer in order to introduce domain randomization and improve generalization of the trained policies, which is especially beneficial for Sim2Real transfer.
The included ManipulationGazeboEnvRandomizer allows randomization of the following properties at each reset of the environment (see the sketch after this list).
- Object model - primitive geometry
  - Random type (box, sphere and cylinder are currently supported)
  - Random color, scale, mass, friction
- Object model - mesh geometry
  - Random type (see Object Model Database)
  - Random scale, mass, friction
- Object pose
- Ground plane texture
- Initial robot configuration
- Camera pose
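In code, the randomizer wraps environment creation in the style of gym-ignition's randomizers. A minimal sketch (the import path, environment ID and constructor signature are assumptions; see the source for the actual interface):

```python
import gym

from drl_grasping import randomizers  # hypothetical import path

def make_env(**kwargs):
    # Factory for the inner environment; the ID is illustrative
    return gym.make("Grasp-Octree-Gazebo-v0", **kwargs)

# Wrap environment creation with the randomizer so that every reset
# re-randomizes object models, poses, textures, robot configuration
# and camera pose
env = randomizers.ManipulationGazeboEnvRandomizer(env=make_env)
observation = env.reset()  # triggers domain randomization
```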
As a database of objects with mesh geometry, this project currently utilises the Google Scanned Objects collection from Ignition Fuel. You can also try to use a different Fuel collection or just a couple of models stored locally (some tweaks might be required to support certain models).
All models are automatically configured in several ways before their insertion into the world:
- Inertial properties are automatically estimated (uniform density is assumed; see the sketch below this list)
- Collision geometry is decimated in order to improve performance
- Models can be filtered and automatically blacklisted based on several aspects, e.g. too much geometry or disconnected components
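To illustrate the uniform-density estimation, here is a conceptual sketch using trimesh (one of the dependencies above); it is not necessarily the exact code used by the processing scripts:

```python
import trimesh

# Estimate inertial properties of a mesh under the uniform-density
# assumption; the density value is an arbitrary example.
mesh = trimesh.load("model.obj")
mesh.density = 1000.0  # kg/m^3

print("mass:", mesh.mass)
print("center of mass:", mesh.center_mass)
print("inertia tensor:\n", mesh.moment_inertia)
```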
This repository includes a few scripts that simplify interaction with the dataset and its splitting into training/testing subsets. By default, these subsets contain 80 training and 20 testing models.
- `dataset_download_train` / `dataset_download_test` - Download models from Fuel
- `dataset_unset_train` / `dataset_unset_test` - Unset the current train/test dataset
- `dataset_set_train` / `dataset_set_test` - Set the dataset to use the train/test subset
- `process_collection` - Process the collection with the steps mentioned above
The `DRL_GRASPING_PBR_TEXTURES_DIR` environment variable can be exported if the ground plane texture should be randomized. It should point to a directory with the following structure.
```
./  # Directory pointed to by `DRL_GRASPING_PBR_TEXTURES_DIR`
├── texture_0
│   ├── *albedo*.png || *basecolor*.png
│   ├── *normal*.png
│   ├── *roughness*.png
│   └── *specular*.png || *metalness*.png
├── ...
└── texture_n
```
There are several databases with free PBR textures that you can use. Alternatively, you can clone AndrejOrsula/pbr_textures with 80 training and 20 testing textures.
Only the Franka Emika Panda and UR5 with RG2 gripper are currently supported. This project currently lacks a more generic solution that would allow arbitrary robot models to be utilised easily, e.g. a full MoveIt 2 setup with a ros2_control implementation.
This project makes direct use of stable-baselines3 as well as sb3_contrib. Furthermore, scripts for training and evaluation were largely inspired by rl-baselines3-zoo.
To train an agent, please take a look at the `ex_train` example. Similarly, the `ex_enjoy` example demonstrates a way to evaluate a trained agent.
TODO: Add graphics for learning curve (TD3 vs SAC vs TQC)
The OctreeCnnFeaturesExtractor makes use of the O-CNN implementation to enable training on GPU. This feature extractor is part of the `OctreeCnnPolicy` policy, which is currently implemented for the TD3, SAC and TQC algorithms.
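A minimal training sketch with TQC from sb3_contrib is shown below; the environment ID and the policy import path are assumptions for illustration, and `ex_train` remains the actual entry point:

```python
import gym
from sb3_contrib import TQC

import drl_grasping  # assumed to register the environments on import
from drl_grasping.algorithms import OctreeCnnPolicy  # hypothetical import path

# Train TQC with the octree-based policy on the grasping task
env = gym.make("Grasp-Octree-Gazebo-v0")  # illustrative ID
model = TQC(OctreeCnnPolicy, env, verbose=1)
model.learn(total_timesteps=500_000)
model.save("tqc_grasp_octree")
```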
TODO: Add graphics for network architecture
Hyperparameters for the training of RL agents can be found in the hyperparameters directory. Optuna was used to autotune some of them, but certain algorithm/environment combinations require far more tuning. If needed, you can try running Optuna yourself; see the `ex_optimize` example.
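For reference, tuning with Optuna follows this general pattern (a toy sketch; `train_and_evaluate` is a hypothetical helper that would train briefly and return the mean reward):

```python
import optuna

def train_and_evaluate(learning_rate: float, gamma: float) -> float:
    # Hypothetical helper: train an agent for a short budget and
    # return its mean evaluation reward (stubbed out here).
    raise NotImplementedError

def objective(trial: optuna.Trial) -> float:
    # Sample candidate hyperparameters for the RL algorithm
    learning_rate = trial.suggest_float("learning_rate", 1e-5, 1e-3, log=True)
    gamma = trial.suggest_float("gamma", 0.95, 0.9999)
    return train_and_evaluate(learning_rate=learning_rate, gamma=gamma)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```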
```
├── drl_grasping         # Primary Python module of this project
│   ├── algorithms       # Definitions of policies and slight modifications to RL algorithms
│   ├── envs             # Environments for grasping (compatible with OpenAI Gym)
│   │   ├── tasks        # Tasks for the agent that are identical for simulation
│   │   ├── randomizers  # Domain randomization of the tasks, which also populates the world
│   │   └── models       # Functional models for the environment (Ignition Gazebo)
│   ├── control          # Control for the agent
│   ├── perception       # Perception for the agent
│   └── utils            # Other utilities, used across the module
├── examples             # Examples for training and enjoying RL agents
├── hyperparams          # Hyperparameters for training RL agents
├── scripts              # Helpful scripts for training, evaluating, ...
├── launch               # ROS 2 launch scripts that can be used to help with setup
├── docker               # Dockerfile for this project
└── drl_grasping.repos   # List of other dependencies created for `drl_grasping`
```
In case you have any problems or questions, feel free to open an Issue or Discussion.