Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] How to use visualizer during inference on visual grounding #95

Open
3 tasks done
Mintinson opened this issue Jan 3, 2025 · 1 comment
Open
3 tasks done

Comments

@Mintinson
Copy link

Prerequisite

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmdetection3d

Environment


System environment:
sys.platform: linux
Python: 3.8.20 (default, Oct 3 2024, 15:24:27) [GCC 11.2.0]
CUDA available: True
MUSA available: False
numpy_random_seed: 1278117654
GPU 0: NVIDIA A100-PCIE-40GB
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 11.3, V11.3.58
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.11.0
PyTorch compiling details: PyTorch built with:

  • GCC 7.3

  • C++ Version: 201402

  • Intel(R) oneAPI Math Kernel Library Version 2023.1-Product Build 20230303 for Intel(R) 64 architecture applications

  • Intel(R) MKL-DNN v2.5.2 (Git Hash a9302535553c73243c632ad3c4c80beec3d19a1e)

  • OpenMP 201511 (a.k.a. OpenMP 4.5)

  • LAPACK is enabled (usually provided by MKL)

  • NNPACK is enabled

  • CPU capability usage: AVX2

  • CUDA Runtime 11.3

  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37

  • CuDNN 8.2

  • Magma 2.5.2

  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.11.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

    TorchVision: 0.12.0
    OpenCV: 4.10.0
    MMEngine: 0.10.5

Reproduces the problem - code sample

python tools/test.py configs/grounding/mv-grounding_8xb12_embodiedscan-vg-9dof.py work_dirs/mv-grounding/epoch_12.pth 

Reproduces the problem - command or script

python tools/test.py configs/grounding/mv-grounding_8xb12_embodiedscan-vg-9dof.py work_dirs/mv-grounding/epoch_12.pth 

Reproduces the problem - error message

Traceback (most recent call last):
  File "tools/test.py", line 157, in <module>
    main()
  File "tools/test.py", line 153, in main
    runner.test()
  File "/root/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1823, in test
    metrics = self.test_loop.run()  # type: ignore
  File "/root/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/runner/loops.py", line 463, in run
    self.run_iter(idx, data_batch)
  File "/root/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/root/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/runner/loops.py", line 487, in run_iter
    outputs = self.runner.model.test_step(data_batch)
  File "/root/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/model/base_model/base_model.py", line 145, in test_step
    return self._run_forward(data, mode='predict')  # type: ignore
  File "/root/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/model/base_model/base_model.py", line 361, in _run_forward
    results = self(**data, mode=mode)
  File "/root/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/EmbodiedScan/embodiedscan/models/detectors/sparse_featfusion_grounder.py", line 766, in forward
    return self.predict(inputs, data_samples, **kwargs)
  File "/root/EmbodiedScan/embodiedscan/models/detectors/sparse_featfusion_grounder.py", line 650, in predict
    visualizer.visualize_scene(predictions)
  File "/root/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/dist/utils.py", line 427, in wrapper
    return func(*args, **kwargs)
  File "/root/EmbodiedScan/embodiedscan/visualizer/base_visualizer.py", line 93, in visualize_scene
    assert gt["gt_labels_3d"].shape[0] == 1
AssertionError

Additional information

I'm trying to follow the README in visualizer to visualize the prediction at the time of visual grounding inference. But it fails with the above errors.

I noticed that the visualization in the README is for the multi-views detection task, does it mean that the visual grounding task cannot be visualized? If not, how can I modify it to visualize. Thanks in advance for your answer.

@mxh1999
Copy link
Collaborator

mxh1999 commented Jan 8, 2025

The visualization module is developed based on the multi-view detection task and it hasn't been tested on the visual grounding task.
It may be necessary to adjust some APIs to make visualizer compatible with the visual grounding task.

In visualizer, the label is used to determine the color of the box. Since visual grounding task doesn't require labels, you can pass a pseudo label to make visualizer work properly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants