nburrus/stereodemo

stereodemo

Small Python utility to compare and visualize the output of various stereo reconstruction algorithms:

  • Make it easy to get a qualitative evaluation of several state-of-the-art models in the wild
  • Feed it left/right images or capture live from an OAK-D camera
  • Interactive colored point-cloud view since nice-looking disparity images can be misleading
  • Try different parameters on the same image
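
To illustrate why a point-cloud view can expose errors that a disparity image hides: for a rectified stereo pair, depth equals focal length times baseline divided by disparity, so a one-pixel disparity error that is barely visible in a color-mapped image can move distant points by meters. The sketch below is not taken from stereodemo; the function name and the camera values (800 px focal length, 7.5 cm baseline) are assumptions chosen for illustration.

```python
def disparity_to_depth(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Convert a disparity in pixels to a depth in meters (rectified pinhole model)."""
    if disparity_px <= 0:
        # Zero or negative disparity means no valid match or a point at infinity.
        return float("inf")
    return focal_px * baseline_m / disparity_px


if __name__ == "__main__":
    FOCAL_PX = 800.0     # assumed focal length in pixels
    BASELINE_M = 0.075   # assumed baseline in meters

    # The same 1 px error matters far more at small disparities (far objects).
    for d in (60.0, 59.0, 6.0, 5.0):
        depth = disparity_to_depth(d, FOCAL_PX, BASELINE_M)
        print(f"disparity {d:5.1f} px -> depth {depth:6.2f} m")
```

With these assumed values, a one-pixel error near 60 px of disparity changes the depth by a couple of centimeters, while the same error near 6 px changes it by about two meters.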

Supported methods (implementation/pre-trained models taken from their respective authors):

  • OpenCV stereo block matching and Semi-global block matching baselines, with all their parameters
  • CREStereo: "Practical Stereo Matching via Cascaded Recurrent Network with Adaptive Correlation." (CVPR 2022)
  • RAFT-Stereo: "Multilevel Recurrent Field Transforms for Stereo Matching." (3DV 2021)
  • Hitnet: "Hierarchical Iterative Tile Refinement Network for Real-time Stereo Matching." (CVPR 2021)
  • Chang et al. RealtimeStereo: "Attention-Aware Feature Aggregation for Real-time Stereo Matching on Edge Devices" (ACCV 2020)

See below for more details and credits for each of these methods.

Getting started

Installation

python3 -m pip install stereodemo

Running it

With an OAK-D camera

To capture data directly from an OAK-D camera, use:

stereodemo --oak

Then click on Next Image to capture a new one.

With image files

For convenience, a tiny subset of some popular datasets is included in this repository. Provide a folder to stereodemo and it will look for left/right image pairs (files with either im0/im1 or left/right in their names):

# To evaluate on the oak-d images
stereodemo datasets/oak-d 

# To cycle through all images
stereodemo datasets

Then click on Next Image to cycle through the images.
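
The naming convention described above can be sketched as follows. This is a rough illustration, not the code stereodemo actually uses: the function name `find_stereo_pairs`, the set of accepted extensions, and the recursive search are all assumptions.

```python
from pathlib import Path
from typing import List, Tuple

# (left, right) name tokens from the description above. How the real tool
# matches names is not documented here, so this matching rule is an assumption.
NAME_TOKENS = [("im0", "im1"), ("left", "right")]
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}  # assumed set of image types


def find_stereo_pairs(folder: str) -> List[Tuple[Path, Path]]:
    """Recursively collect (left, right) image paths found under `folder`."""
    pairs = []
    for left in sorted(Path(folder).rglob("*")):
        if not left.is_file() or left.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        for left_token, right_token in NAME_TOKENS:
            if left_token in left.name:
                # Derive the expected right image name and keep the pair
                # only if that file really exists next to the left image.
                right = left.with_name(left.name.replace(left_token, right_token))
                if right.is_file():
                    pairs.append((left, right))
                break
    return pairs


if __name__ == "__main__":
    for left, right in find_stereo_pairs("datasets"):
        print(left, "<->", right)
```

Left images without a matching right image are skipped rather than reported, which keeps the sketch short; a real tool might prefer to warn about them.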

Sample images included in this repository:

Dependencies

pip will install the dependencies automatically. Here is the list:

  • Open3D. For the point cloud visualization and the GUI.
  • OpenCV. For image loading and the traditional block matching baselines.
  • onnxruntime. To run pretrained models in the ONNX format.
  • pytorch. To run pretrained models exported as TorchScript.
  • depthai. Optional, to grab images from a Luxonis OAK camera.
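
To check which of these packages are importable in the current environment, a small standard-library script like the one below may help. It is only a sketch: the import names (`open3d`, `cv2`, `onnxruntime`, `torch`, `depthai`) are the usual module names for these packages, and `check_dependencies` is a hypothetical helper, not part of stereodemo.

```python
from importlib.util import find_spec

# Display label -> Python import name. The import name can differ from the
# name used above (OpenCV is imported as cv2, PyTorch as torch).
DEPENDENCIES = {
    "Open3D": "open3d",
    "OpenCV": "cv2",
    "onnxruntime": "onnxruntime",
    "pytorch": "torch",
    "depthai": "depthai",  # optional, only needed for OAK cameras
}


def check_dependencies() -> dict:
    """Return {label: True/False} without importing the packages themselves."""
    return {label: find_spec(module) is not None
            for label, module in DEPENDENCIES.items()}


if __name__ == "__main__":
    for label, available in check_dependencies().items():
        print(f"{label:12s} {'found' if available else 'missing'}")
```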

Credits for each method

I did not implement any of these methods myself; I only collected pre-trained models or converted them to TorchScript / ONNX.

License

The code of stereodemo is MIT licensed, but the pre-trained models are subject to the license of their respective implementation.

The sample images have the license of their respective source, except for datasets/oak-d, which is licensed under the Creative Commons Attribution 4.0 International License.