Dataset Reproducibility Guide
This page contains the commands used to generate the datasets for D4RL. All scripts are located in the scripts/generation folder.
Command:
python scripts/generation/generate_maze2d_datasets.py --env_name <env_name>
python scripts/generation/relabel_maze2d_rewards.py --maze <open/umaze/medium/large> --filename <hdf5 file>
For the pybullet implementation, use the scripts/generation/generate_maze2d_bullet_datasets.py script.
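The two Maze2D steps can be chained from a small driver script. The sketch below only assembles the two command lines shown above. The helper name `maze2d_commands` is hypothetical, and the environment name and HDF5 filename are left as caller-supplied placeholders because this page does not state what file the generation script writes.

```python
import shlex


def maze2d_commands(env_name, maze, hdf5_file):
    """Return the generate and relabel commands as argument lists.

    env_name, maze and hdf5_file are supplied by the caller; nothing here
    guesses the name of the file written by the generation script.
    """
    if maze not in ("open", "umaze", "medium", "large"):
        raise ValueError(f"unknown maze layout: {maze!r}")
    generate = [
        "python", "scripts/generation/generate_maze2d_datasets.py",
        "--env_name", env_name,
    ]
    relabel = [
        "python", "scripts/generation/relabel_maze2d_rewards.py",
        "--maze", maze, "--filename", hdf5_file,
    ]
    return generate, relabel


if __name__ == "__main__":
    # Placeholder values for illustration only.
    for cmd in maze2d_commands("<env_name>", "umaze", "<hdf5 file>"):
        print(shlex.join(cmd))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root to execute the step.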
Download the pre-trained antmaze policy here: http://rail.eecs.berkeley.edu/datasets/offline_rl/ant_hierarch_pol.pkl
Loading the pickle file requires this fork of RLkit (https://github.com/aviralkumar2907/rlkit-offline-rl-benchmark), together with torch 1.5.1 and torchvision 0.6.0.
Command:
python scripts/generation/generate_ant_maze_datasets.py --maze <umaze/medium/large> --policy_file ant_hierarch_pol.pkl
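Because unpickling the policy depends on specific torch and torchvision versions, it may help to check the environment before running the command above. The following is a minimal sketch using only the standard library; the function name `version_mismatches` and the choice to ignore local build tags such as `+cu101` are assumptions, not part of D4RL.

```python
from importlib import metadata

# Version pins taken from the requirement stated above.
REQUIRED = {"torch": "1.5.1", "torchvision": "0.6.0"}


def version_mismatches(required=REQUIRED, lookup=metadata.version):
    """Return {package: message} for packages that are missing or not at the pinned version."""
    problems = {}
    for name, wanted in required.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            problems[name] = f"not installed (need {wanted})"
            continue
        # Assumption: a local build tag such as "1.5.1+cu101" still counts as 1.5.1.
        if found.split("+")[0] != wanted:
            problems[name] = f"found {found}, need {wanted}"
    return problems


if __name__ == "__main__":
    for name, message in version_mismatches().items():
        print(f"{name}: {message}")
```

An empty result means both pins are satisfied. The check does not verify that the RLkit fork itself is installed.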
Behavior policies are generated by training an agent using SAC in rlkit (https://github.com/vitchyr/rlkit). The checkpoint (pkl file) can be passed into the following scripts.
Medium/Expert Datasets
python scripts/generation/mujoco/collect_data.py <env_name> --num_data=1000000 --output_file=<hdf5 output filename> --pklfile=<rlkit snapshot file>
Random Datasets
python scripts/generation/mujoco/collect_data.py <env_name> --num_data=1000000 --output_file=<hdf5 output filename> --random
Replay Buffer Datasets
python scripts/generation/mujoco/convert_buffer.py --pklfile=<rlkit snapshot file> --output_file=<hdf5 output filename>
Merging Random/Medium/Expert Datasets
python scripts/generation/mujoco/stitch_data.py <h5file_1> <h5file_2> --output_file=<output h5file>
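Conceptually, merging two datasets amounts to concatenating their arrays key by key along the transition axis. The sketch below illustrates that idea on plain Python lists. It is an assumption about what merging involves, not a copy of `stitch_data.py`, and the function name `merge_datasets` and the toy key names are hypothetical.

```python
def merge_datasets(first, second):
    """Concatenate two dataset dicts key by key along the transition axis."""
    if first.keys() != second.keys():
        raise ValueError("datasets must contain the same keys")
    return {key: list(first[key]) + list(second[key]) for key in first}


if __name__ == "__main__":
    # Toy data; the key names are illustrative, not read from a real file.
    random_part = {"observations": [[0.0], [0.1]], "rewards": [0.0, 0.0]}
    expert_part = {"observations": [[0.9]], "rewards": [1.0]}
    merged = merge_datasets(random_part, expert_part)
    print(len(merged["rewards"]))
```

Real datasets hold arrays in HDF5 files rather than lists, so the actual script should be used to produce the merged file.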
Download the expert policies and demonstrations from the hand_dapg repository (https://github.com/aravindr93/hand_dapg).
Place the files at “./demonstrations/<env_name>_demos.pickle” and “./policies/.pickle”.
Expert dataset:
python scripts/generation/hand_dapg_policies.py --env-name=<env name>
Human dataset:
python scripts/generation/hand_dapg_demos.py --env-name=<env name>
Cloned dataset: (TODO)
Download demonstrations from the relay-policy-learning repository (https://github.com/google-research/relay-policy-learning)
Place the demonstrations in the folder “~/relay-policy-learning/kitchen_demos_multitask”
python scripts/generation/generate_kitchen_datasets.py
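A quick pre-flight check can confirm that the demonstrations are where the instructions above say to put them before the generation script is launched. This is a small sketch; the helper name `find_demo_dir` is hypothetical, and it only tests that the directory exists, not that its contents are complete.

```python
from pathlib import Path

# Location given in the instructions above.
DEMO_DIR = "~/relay-policy-learning/kitchen_demos_multitask"


def find_demo_dir(path=DEMO_DIR):
    """Return the expanded demonstrations directory, or raise if it is missing."""
    resolved = Path(path).expanduser()
    if not resolved.is_dir():
        raise FileNotFoundError(f"demonstrations not found at {resolved}")
    return resolved


if __name__ == "__main__":
    try:
        print(find_demo_dir())
    except FileNotFoundError as err:
        print(err)
```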
python scripts/generation/flow_idm.py --controller=<random|idm> --env_name=<env name>
(TODO)