Learn the Concept of a Daily Object (DALF: Dual-stage Affordance Learning Framework for Object Concept Learning)
Official code for "DALF: Dual-stage Affordance Learning Framework for Object Concept Learning".
We present DALF to enable a comprehensive understanding of object affordances.
We strongly encourage you to create a separate CONDA environment.
conda create -n openad python=3.8
conda activate openad
pip install -r requirements.txt
(shaofeng: just install any additional packages that turn out to be missing)
Download data from this drive folder.
Currently, we support two models (OpenAD with PointNet++ and DGCNN backbones) and two settings (full-shape and partial-view).
(shaofeng: the data is currently placed under the ./data directory; this can be changed)
Please train the model on a single GPU for the best performance. Below are the steps for training the model with the PointNet++ backbone on the full-shape setting; the steps for other combinations are analogous.
- In config/openad_pn2/full_shape_cfg.py, change the value of data_root to your downloaded data folder, and change the path to class weights to the path of the file full_shape_weights.npy (contained in the data folder); a sketch of the edited fields is shown after the training command below.
- Assuming you use GPU 0, run the following command to start training:
CUDA_VISIBLE_DEVICES=0 python3 train.py --config ./config/openad_pn2/full_shape_cfg.py --work_dir ./log/openad_pn2/OPENAD_PN2_FULL_SHAPE_Release/ --gpu 0
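As a rough illustration of the config edit above (the actual field names in full_shape_cfg.py may differ, so treat these as placeholders):

data_root = './data'  # path to the downloaded data folder
# hypothetical key for the class-weights file; use whichever field full_shape_cfg.py actually defines
category_weights = './data/full_shape_weights.npy'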
The following are the steps for open-vocabulary testing of a trained model with the PointNet++ backbone on the full-shape setting; the steps for other combinations are analogous.
- Change the value of data_root in config/openad_pn2/full_shape_open_vocab_cfg.py to your downloaded data folder.
- Run the following command:
CUDA_VISIBLE_DEVICES=0 python3 test_open_vocab.py --config ./config/openad_pn2/full_shape_open_vocab_cfg.py --checkpoint <path to your checkpoint model> --gpu 0
Here, <path to your checkpoint model> is the path to your trained model.
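For intuition, open-vocabulary testing scores each point of the input object against arbitrary affordance texts by comparing the per-point features with the text embeddings. A minimal sketch of that scoring step, assuming the per-point features and label embeddings have already been computed (the actual logic lives in test_open_vocab.py):

import torch.nn.functional as F

def open_vocab_affordance_scores(point_features, text_embeddings):
    # point_features: (N, D) per-point features from the PointNet++ backbone
    # text_embeddings: (K, D) embeddings of K free-form affordance labels
    point_features = F.normalize(point_features, dim=-1)
    text_embeddings = F.normalize(text_embeddings, dim=-1)
    # cosine similarity of every point with every label -> (N, K) score matrix
    return point_features @ text_embeddings.t()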
We provide the pretrained models at this drive.
(shaofeng: the checkpoints are currently placed under the ./pretrain directory; this can be changed)
Currently, our CLPP model does not generate per-point features. Therefore, we combine the feature propagation layers from the pretrained OpenAD checkpoint with the point aggregation layers from the CLPP checkpoint to obtain point-wise features and compute per-point affordance classification scores.
The evaluation code is implemented in eval_local.py, and both the OpenAD checkpoint and the CLPP checkpoint are required.
CUDA_VISIBLE_DEVICES=0 python3 eval_local.py --OpenAD_config ./config/openad_pn2/full_shape_open_vocab_cfg.py --CLPP_config ./config/openad_pn2_clpp/clpp_full_shape_open_vocab_cfg.py --OpenAD_checkpoint ./log/openad_pn2/OPENAD_PN2_FULL_SHAPE_Release/best_model_openad_pn2_estimation.t7 --CLPP_checkpoint ./log/openad_pn2_clpp/OPENAD_PN2_CLPP/current_clpp_model.t7 --gpu 0
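Conceptually, eval_local.py stitches the two checkpoints together: feature propagation weights are taken from the OpenAD checkpoint and point aggregation (set abstraction) weights from the CLPP checkpoint. A rough sketch of that merging idea, assuming both checkpoints are plain PyTorch state dicts and using hypothetical 'fp'/'sa' key prefixes:

import torch

# hypothetical paths; in practice these come from --OpenAD_checkpoint / --CLPP_checkpoint
openad_state = torch.load('openad.t7', map_location='cpu')
clpp_state = torch.load('clpp.t7', map_location='cpu')

merged_state = {}
for key, value in openad_state.items():
    if key.startswith('fp'):          # feature propagation layers from OpenAD
        merged_state[key] = value
for key, value in clpp_state.items():
    if key.startswith('sa'):          # point aggregation layers from CLPP
        merged_state[key] = value

# model.load_state_dict(merged_state, strict=False)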
To generate new data for the following steps, please paste your API key here and run:
python caption.py
This creates pairs of the form <point cloud, functionality (text prompt)>. The data can be downloaded here and placed under the ./data directory.
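The exact on-disk format written by caption.py is not spelled out here; as an assumption, each sample can be pictured as a point cloud paired with a generated functionality prompt:

# Hypothetical illustration of one <point cloud, functionality (text prompt)> pair.
sample = {
    'point_cloud': ...,   # e.g. an (N, 3) array of xyz coordinates for one object
    'functionality': 'It can contain some objects or water',   # generated text prompt
}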
To obtain a model that can align the semantics of a text prompt with the semantics of a set of point clouds of an object, the following step leverages contrastive learning to finetune the PointNet++ encoder.
CUDA_VISIBLE_DEVICES=0 python3 train_clpp.py --config ./config/openad_pn2_clpp/clpp_full_shape_cfg.py --work_dir ./log/openad_pn2_clpp/OPENAD_PN2_CLPP/ --checkpoint <path to your checkpoint model> --gpu 0
Here, <path to your checkpoint model> is the model you trained in step 3.
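As a reference for what this step optimizes, below is a minimal sketch of a symmetric contrastive (InfoNCE-style) loss between point cloud embeddings and text prompt embeddings; the actual objective in train_clpp.py may differ in details such as the temperature or projection heads:

import torch
import torch.nn.functional as F

def contrastive_loss(point_emb, text_emb, temperature=0.07):
    # point_emb, text_emb: (B, D) embeddings of matched <point cloud, prompt> pairs
    point_emb = F.normalize(point_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = point_emb @ text_emb.t() / temperature      # (B, B) similarity matrix
    targets = torch.arange(point_emb.size(0), device=point_emb.device)
    # each point cloud should be closest to its own prompt, and vice versa
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)) / 2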
The following step provides a training-free method to rank multiple objects based on a query.
CUDA_VISIBLE_DEVICES=0 python3 rank_multi_obj.py --config ./config/openad_pn2/full_shape_open_vocab_cfg.py --checkpoint <path to your checkpoint model> --gpu 0 --query "It can contain some objects or water"
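The ranking itself reduces to a similarity sort: embed the query once, embed each candidate object, and order the objects by score. A minimal sketch of that idea (the actual script is rank_multi_obj.py):

import torch
import torch.nn.functional as F

def rank_objects(object_embeddings, query_embedding):
    # object_embeddings: (M, D) one global embedding per candidate object
    # query_embedding: (D,) embedding of the text query
    scores = F.normalize(object_embeddings, dim=-1) @ F.normalize(query_embedding, dim=0)
    order = torch.argsort(scores, descending=True)   # indices of objects, best match first
    return order, scores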
Evaluate the model on a test dataset created from ysf_full_shape_val, which gives the model cand_num objects and one ground-truth functionality prompt and asks the model which object best suits the prompt.
Each object is sampled from different categories to avoid ambiguity.
CUDA_VISIBLE_DEVICES=0 python3 eval_global.py --config ./config/openad_pn2_clpp/clpp_full_shape_open_vocab_cfg.py --checkpoint ${ckpt_path} --gpu 0 --test_num 50 --cand_num 5
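Put differently, eval_global.py runs test_num multiple-choice trials: each trial presents cand_num candidate objects (from different categories) together with one ground-truth prompt, and counts as correct when the highest-scoring object is the ground-truth one. A hedged sketch of that accuracy computation, with score_objects standing in for whatever similarity function the script uses:

def multiple_choice_accuracy(trials, score_objects):
    # trials: iterable of (candidate_objects, prompt, ground_truth_index) tuples
    # score_objects: callable returning one similarity score per candidate object
    correct = 0
    total = 0
    for candidates, prompt, gt_index in trials:
        scores = score_objects(candidates, prompt)
        predicted = max(range(len(scores)), key=lambda i: scores[i])
        correct += int(predicted == gt_index)
        total += 1
    return correct / max(total, 1)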
If you find our work useful for your research, please cite:
@inproceedings{Nguyen2023open,
title={Open-vocabulary affordance detection in 3d point clouds},
author={Nguyen, Toan and Vu, Minh Nhat and Vuong, An and Nguyen, Dzung and Vo, Thieu and Le, Ngan and Nguyen, Anh},
booktitle = {IROS},
year = {2023}
}
Our source code is built with heavy support from 3D AffordanceNet. We express our sincere thanks to them.