forked from xxlong0/Wonder3D

Single Image to 3D using Cross-Domain Diffusion for 3D Generation

dqj5182/Wonder3D

 
 

Wonder3D

Single Image to 3D using Cross-Domain Diffusion

Wonder3D reconstructs highly detailed textured meshes from a single-view image in only 2 to 3 minutes. It first generates consistent multi-view normal maps with corresponding color images via a cross-domain diffusion model, and then uses a novel normal fusion method to achieve fast, high-quality reconstruction.

Preparation for inference

Linux System Setup.

conda create -n wonder3d
conda activate wonder3d
pip install -r requirements.txt
pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch

Inference

  1. Optional. If you have trouble connecting to Hugging Face, download the checkpoints manually and place them in the root folder.

If you are in mainland China, you may download them via Aliyun.

Wonder3D
|-- ckpts
    |-- unet
    |-- scheduler
    |-- vae
    ...

Then edit the file ./configs/mvdiffusion-joint-ortho-6views.yaml and set pretrained_model_name_or_path="./ckpts".
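Before launching inference, it can help to confirm that the manually downloaded checkpoint has the layout shown above. The following is a minimal sketch, not part of the repository: the helper name `missing_checkpoint_dirs` is hypothetical, and it checks only the three subfolders named in the tree (the `...` there means the real checkpoint may contain more).

```python
from pathlib import Path

# Subfolders named in the layout above. The "..." in that tree means the
# real checkpoint may contain more, so this list is deliberately partial.
EXPECTED_SUBDIRS = ("unet", "scheduler", "vae")


def missing_checkpoint_dirs(ckpt_root="./ckpts", expected=EXPECTED_SUBDIRS):
    """Return the expected subfolder names that are absent under ckpt_root."""
    root = Path(ckpt_root)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_checkpoint_dirs()
    if missing:
        print("Missing under ./ckpts:", ", ".join(missing))
    else:
        print("All listed checkpoint folders are present.")
```

Run it from the repository root so that `./ckpts` resolves to the same folder the config points at.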

  2. Download the SAM model and put it in the sam_pt folder.
Wonder3D
|-- sam_pt
    |-- sam_vit_h_4b8939.pth
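  3. Predict the foreground mask and store it as the alpha channel. We use Clipdrop to segment the foreground object interactively. You may also use rembg to remove the background.
# !pip install rembg
import rembg
from PIL import Image

result = Image.open("your_img_file")  # placeholder: path to your input image
result = rembg.remove(result)  # returns an RGBA image with the background removed
result.show()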
  4. Run Wonder3D to produce multi-view-consistent normal maps and color images, then check the results in the ./outputs folder. We use rembg to remove the backgrounds of the results, but the segmentations are not always perfect. Consider using Clipdrop to get masks for the generated normal maps and color images, since mask quality significantly influences the quality of the reconstructed mesh.
accelerate launch --config_file 1gpu.yaml test_mvdiffusion_seq.py \
            --config configs/mvdiffusion-joint-ortho-6views.yaml validation_dataset.root_dir={your_data_path} \
            validation_dataset.filepaths=['your_img_file'] save_dir={your_save_path}

See example:

accelerate launch --config_file 1gpu.yaml test_mvdiffusion_seq.py \
            --config configs/mvdiffusion-joint-ortho-6views.yaml validation_dataset.root_dir=./example_images \
            validation_dataset.filepaths=['owl.png'] save_dir=./outputs
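When the command above is driven from a script, building it as an argument list avoids shell quoting issues around the bracketed `filepaths` value. The sketch below is illustrative only: `build_mvdiffusion_command` is a hypothetical helper, the flags and keys are copied from the command above, and passing several comma-separated file names in one list is an assumption that has not been checked against the script.

```python
import subprocess  # only needed if you uncomment the run line below


def build_mvdiffusion_command(root_dir, filenames, save_dir):
    """Build the argv list for the multi-view generation command shown above."""
    # A POSIX shell removes the inner quotes from ['owl.png'] before the
    # script sees the argument, so the list is written unquoted here.
    # Assumption: several names can be joined with commas.
    filepaths = "[" + ",".join(filenames) + "]"
    return [
        "accelerate", "launch", "--config_file", "1gpu.yaml",
        "test_mvdiffusion_seq.py",
        "--config", "configs/mvdiffusion-joint-ortho-6views.yaml",
        f"validation_dataset.root_dir={root_dir}",
        f"validation_dataset.filepaths={filepaths}",
        f"save_dir={save_dir}",
    ]


if __name__ == "__main__":
    cmd = build_mvdiffusion_command("./example_images", ["owl.png"], "./outputs")
    print(" ".join(cmd))
    # Uncomment to launch it from the repository root:
    # subprocess.run(cmd, check=True)
```

Because the list is handed to the process directly, no shell is involved and the brackets are never treated as a glob pattern.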
  5. Mesh Extraction

Instant-NSR Mesh Extraction

cd ./instant-nsr-pl
python launch.py --config configs/neuralangelo-ortho-wmask.yaml --gpu 0 --train dataset.root_dir=../{your_save_path}/cropsize-{crop_size}-cfg{guidance_scale:.1f}/ dataset.scene={scene}

See example:

cd ./instant-nsr-pl
python launch.py --config configs/neuralangelo-ortho-wmask.yaml --gpu 0 --train dataset.root_dir=../outputs/cropsize-192-cfg1.0/ dataset.scene=owl
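The `dataset.root_dir` placeholder in the mesh extraction command uses Python format syntax, so `{guidance_scale:.1f}` renders a guidance scale of 1 as `1.0`. The sketch below shows how the example values above could be derived. Both helper names are hypothetical, and taking the scene name from the image file name without its extension is an assumption based on the `owl.png` and `owl` pair in the examples.

```python
import posixpath
from pathlib import PurePosixPath


def nsr_root_dir(save_dir, crop_size, guidance_scale):
    """Fill in ../{your_save_path}/cropsize-{crop_size}-cfg{guidance_scale:.1f}/."""
    leaf = f"cropsize-{crop_size}-cfg{guidance_scale:.1f}"
    # The leading ".." is there because the command runs inside ./instant-nsr-pl.
    return posixpath.normpath(posixpath.join("..", save_dir, leaf)) + "/"


def scene_name(image_file):
    # Assumption: the scene is the image file name without its extension,
    # as the owl.png -> owl pair in the examples suggests.
    return PurePosixPath(image_file).stem


if __name__ == "__main__":
    print(nsr_root_dir("./outputs", 192, 1.0))  # ../outputs/cropsize-192-cfg1.0/
    print(scene_name("owl.png"))                # owl
```

This assumes `save_dir` is given relative to the repository root, as in the example. An absolute path would be returned without the `..` prefix.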
