VI-MID

This repository contains VI-MID: a multi-instance dynamic RGB-D-inertial SLAM system using an object-level, octree-based volumetric representation.


We present a tightly-coupled visual-inertial object-level multi-instance dynamic SLAM system. Even in extremely dynamic scenes, it can robustly optimise the camera pose, velocity, and IMU biases while building a dense object-level 3D map of the environment. Thanks to its robust sensor and object tracking, our system can track and reconstruct the geometry, semantics, and motion of arbitrary objects by incrementally fusing associated colour, depth, semantic, and foreground object probabilities into each object model. In addition, when an object is lost or moves outside the camera's field of view, our system can reliably recover its pose upon re-observation. We demonstrate the robustness and accuracy of our method through quantitative and qualitative evaluation on real-world data sequences.

TODO:

  • Create custom dataset
  • Evaluation examples
  • Documentation
  • License

How to use our software

Dependencies

Check Dockerfile and .devcontainer/docker-compose.yml for the required dependencies. We provide a VS Code devcontainer for easy development. xhost + is required to run the container with GUI support. If you are developing on a remote machine without a monitor and run into GUI issues, try the workaround export DISPLAY=$LOCAL_DISPLAY.
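
A minimal sketch of the GUI-forwarding workaround, assuming an X11 host; the exact DISPLAY value depends on your setup:

# on the host, allow local clients (e.g. the container) to connect to the X server
xhost +
# on a remote/headless machine, point DISPLAY at a display you can actually reach
export DISPLAY=$LOCAL_DISPLAY   # e.g. :0 or localhost:10.0, depending on your setup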

Installation:

Get our code and submodules:

git clone git@github.com:binbin-xu/vimid.git
cd vimid
git submodule update --init --recursive

Install the custom OKVIS and realsense libraries in the third-party folder.
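
The exact build steps depend on each library's own instructions; as a rough sketch, assuming standard CMake projects under the third-party folder (paths below are illustrative):

cd third-party/okvis            # illustrative path; use the actual submodule folder
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j$(nproc)
sudo make install               # repeat the same steps for the realsense library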

Then go into the apps/kfusion folder and run the following command to build the software and its dependencies:

make

If any error occurs, check the Dockerfile for possibly missing dependencies.

Uninstall:

make clean

Demo:

We provide some usage samples. Please see the bash files in the apps/kfusion/demo folder.

The data used to run those bash scripts can be downloaded via this link. Remember to modify the dataset paths in the bash files accordingly.
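
For example, after editing the dataset path in a demo script, it can be launched from that folder (the script name below is hypothetical; use one of the files actually provided in apps/kfusion/demo):

cd apps/kfusion/demo
bash run_demo.sh                # hypothetical name; substitute an actual demo script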

Customised settings:

RGB-D sequences need to be provided in the demo format. Specifically, RGB images go in the /cam0 folder, depth images in the /depth0 folder, Mask R-CNN outputs in the /mask_RCNN folder, and IMU data in the /imu0 folder (/cam1 and /cam0_ori can be ignored). If the RGB and depth images are not aligned, you may need to associate them (check this link for details).
The input images follow the TUM RGB-D dataset convention: the resolution is 640 x 480 and depth is scaled by 5000. Images are named by their recorded timestamps (in nanoseconds).
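
As an illustration, a sequence folder following the format described above would look roughly like this (file names are example nanosecond timestamps):

<sequence_root>/
  cam0/        # RGB images, e.g. 1403636579763555584.png
  depth0/      # 16-bit depth images, scaled by 5000
  mask_RCNN/   # per-frame Mask R-CNN outputs (masks, classes, probabilities)
  imu0/        # IMU measurements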

You may need to tune some hyper-parameters defined in the parameter file and pass them as arguments for your own sequences.
Make sure to compile and run in debug mode first to expose bugs that may be hidden in unoptimized code.

make debug

Then you can run our modified Mask R-CNN script (see demo/vimid-mask.py in this repo) to generate masks, classes, and semantic probabilities in cnpy format. Here we provide a Detectron2-based version. We did not fine-tune the pretrained COCO models, and the results would be much improved with better or more domain-suited segmentation masks. Therefore, if you want to increase performance in your specific domain, please consider training a network on your own data.
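
The script's actual command-line interface is defined in demo/vimid-mask.py itself; an invocation might look roughly like the following, where the argument names are purely illustrative:

python demo/vimid-mask.py --input /path/to/sequence/cam0 --output /path/to/sequence/mask_RCNN   # illustrative arguments; check the script for the real interface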

Those tunable parameters can be found in apps/kfusion/include/default_parameters.h.

Citations

Please consider citing this project in your publications if it helps your work. The following are BibTeX references. The BibTeX entries require the url LaTeX package.

@inproceedings{Ren:Xu:etal:IROS2022,
  title={Visual-Inertial Multi-Instance Dynamic SLAM with Object-level Relocalisation},
  author={Ren, Yifei and Xu, Binbin and Choi, Christopher L and Leutenegger, Stefan},
  booktitle={2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  pages={11055--11062},
  year={2022},
}

@inproceedings{Xu:etal:ICRA2019,
  title={{MID-Fusion}: Octree-based Object-Level Multi-Instance Dynamic SLAM},
  author={Binbin Xu and Wenbin Li and Dimos Tzoumanikas and Michael Bloesch and Andrew Davison and Stefan Leutenegger},
  booktitle={IEEE International Conference on Robotics and Automation (ICRA)},
  year={2019},
}

License

Copyright © 2017-2023 Smart Robotics Lab, Imperial College London
Copyright © 2021-2023 Yifei Ren
Copyright © 2017-2023 Binbin Xu

Distributed under the BSD 3-clause license.
