Tong Wu, Zhibing Li, Shuai Yang, Pan Zhang, Xingang Pan, Jiaqi Wang, Dahua Lin, Ziwei Liu
Official implementation of HyperDreamer: Hyper-Realistic 3D Content Generation and Editing from a Single Image
- Install PyTorch >= 1.12. We have tested on torch1.12.1+cu113, but other versions should also work fine.
# torch1.12.1+cu113
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 --extra-index-url https://download.pytorch.org/whl/cu113
# install kaolin
pip install kaolin==0.14.0 -f https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-1.12.1_cu113.html
- Other dependencies:
pip install -r requirements.txt
pip install ./raymarching
pip install ./shencoder
pip install ./freqencoder
pip install ./gridencoder
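After installing, it can be useful to confirm that the PyTorch in the active environment meets the >= 1.12 requirement. The following is a minimal sketch, not part of the repository: `version_ge` and `check_torch` are hypothetical helper names, and the sketch assumes that `sort -V` (GNU coreutils) is available and that `python` resolves to the environment you installed into.

```shell
# Hypothetical helpers (not part of the HyperDreamer repository).

# version_ge HAVE WANT: succeeds if version HAVE >= version WANT.
# Local build suffixes such as "+cu113" are stripped before comparing.
version_ge() {
  have="${1%%+*}"
  want="${2%%+*}"
  # sort -V orders version strings; if WANT sorts first (or both are
  # equal), then HAVE >= WANT. Assumes GNU coreutils sort.
  lowest="$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)"
  [ "$lowest" = "$want" ]
}

# check_torch: reports whether the importable PyTorch is at least 1.12.
check_torch() {
  if ! torch_version="$(python -c 'import torch; print(torch.__version__)' 2>/dev/null)"; then
    echo "PyTorch is not importable in this environment" >&2
    return 1
  fi
  if version_ge "$torch_version" "1.12"; then
    echo "PyTorch $torch_version satisfies >= 1.12"
  else
    echo "PyTorch $torch_version is older than 1.12" >&2
    return 1
  fi
}
```

Source the snippet and run `check_torch` once the dependencies are installed. It only checks the version string, so it does not verify that the CUDA build matches your driver.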
- Zero123 for diffusion guidance
cd pretrained/zero123
wget https://zero123.cs.columbia.edu/assets/zero123-xl.ckpt
- Omnidata for depth and normal prediction
mkdir pretrained/omnidata
cd pretrained/omnidata
gdown '1Jrh-bRnJEjyMCS7f-WsaFlccfPjJPPHI&confirm=t' # omnidata_dpt_depth_v2.ckpt
gdown '1wNxVO4vVbDEMEpnAi_jwQObf2MFodcBR&confirm=t' # omnidata_dpt_normal_v2.ckpt
- 256 resolution tetrahedron for DMTet. Download and move it to tets/.
- SAM for segmentation
mkdir models
cd models
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
- derender3d for derendering
mkdir models/co3d
wget -O models/co3d/checkpoint010.pth https://www.robots.ox.ac.uk/~vgg/research/derender3d/data/co3d.pth
- PASD for super-resolution module
  - Download SD1.5 models from huggingface and put them into PASD/checkpoints/stable-diffusion-v1-5.
  - Download the PASD pre-trained models (pasd) and place the directory checkpoint-100000 inside PASD/runs/pasd/.
- ControlNet-Normal2img for editing
  - You can download control_v11p_sd15_normalbae.pth from the HuggingFace Model Page, and put it under pretrained/controlnet/...
  - You need to download the Stable Diffusion 1.5 model "v1-5-pruned.ckpt" and put it under pretrained/controlnet/...
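Because the checkpoints come from several sources, a quick existence check before training can catch a missed download. The sketch below is not part of the repository: `check_pretrained` is a hypothetical helper, and it assumes the commands above were each run from the repository root and that `gdown` saves the Omnidata files under the names given in the comments. The tetrahedron file and the ControlNet files are left out because their exact file names or target paths are not fully specified here.

```shell
# Hypothetical helper (not part of the HyperDreamer repository).
# check_pretrained ROOT: prints every expected path that is missing under
# ROOT (default: current directory) and returns non-zero if any are absent.
check_pretrained() {
  root="${1:-.}"
  missing=0
  # Entries ending in "/" are expected to be directories.
  for path in \
    pretrained/zero123/zero123-xl.ckpt \
    pretrained/omnidata/omnidata_dpt_depth_v2.ckpt \
    pretrained/omnidata/omnidata_dpt_normal_v2.ckpt \
    models/sam_vit_h_4b8939.pth \
    models/co3d/checkpoint010.pth \
    PASD/checkpoints/stable-diffusion-v1-5/ \
    PASD/runs/pasd/checkpoint-100000/
  do
    if [ ! -e "$root/$path" ]; then
      echo "missing: $path" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `check_pretrained .` from the repository root lists anything still missing. The check only tests for existence, so a partially downloaded file would still pass.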
Preprocess the input image to remove the background and obtain its depth, normal and caption.
python preprocess_image.py /path/to/image.png
We adopt a two-stage training pipeline. You can run it with:
image_path='data/strawberry_rgba.png'
nerf_workspace='exp/strawberry_s1'
dmtet_workspace='exp/strawberry_s2'
# Stage 1: NeRF
bash run_nerf.sh ${image_path} ${nerf_workspace}
# Stage 2: DMTet
bash run_dmtet.sh ${image_path} ${nerf_workspace} ${dmtet_workspace}
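When running several images, the two stages can be wrapped so that stage 2 only starts if stage 1 succeeds. This is a minimal sketch rather than a script shipped with the repository: `workspace_name` and `run_two_stage` are hypothetical helpers, and the `exp/<name>_s1` and `exp/<name>_s2` naming simply mirrors the strawberry example above (it assumes input files are PNGs, optionally ending in `_rgba`).

```shell
# Hypothetical wrapper (not part of the HyperDreamer repository).

# workspace_name IMAGE STAGE: derives a workspace path such as
# exp/strawberry_s1 from data/strawberry_rgba.png.
workspace_name() {
  name="$(basename "$1" .png)"
  name="${name%_rgba}"   # drop the "_rgba" suffix if present
  echo "exp/${name}_s$2"
}

# run_two_stage IMAGE: runs stage 1 (NeRF), then stage 2 (DMTet) only if
# stage 1 exited successfully.
run_two_stage() {
  image_path="$1"
  nerf_workspace="$(workspace_name "$image_path" 1)"
  dmtet_workspace="$(workspace_name "$image_path" 2)"
  bash run_nerf.sh "$image_path" "$nerf_workspace" &&
    bash run_dmtet.sh "$image_path" "$nerf_workspace" "$dmtet_workspace"
}
```

For example, `run_two_stage data/strawberry_rgba.png` would use the same workspaces as the commands above.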
[optional] We also support importing pre-defined material masks in the reference view. You can use Semantic-SAM or Materialistic to obtain more accurate masks.
bash run_dmtet.sh ${image_path} ${nerf_workspace} ${dmtet_workspace} --material_masks material_masks/xxx.npy
To relight:
bash run_dmtet.sh ${image_path} ${nerf_workspace} ${dmtet_workspace} --test --relight_sg envmaps/lgtSGs_studio.npy
To edit:
python editing/scripts/run_editing.py --config_path=editing/configs/sculpture.yaml
Gradio Demo (Editing)
python editing/app_edit.py
- Release editing code.
This code is built on the open-source projects stable-dreamfusion, Zero123, derender3d, SAM and PASD.
Thanks to the maintainers of these projects for their contribution to the community!
If you find HyperDreamer helpful for your research, please cite:
@InProceedings{wu2023hyperdreamer,
author = {Tong Wu and Zhibing Li and Shuai Yang and Pan Zhang and Xingang Pan and Jiaqi Wang and Dahua Lin and Ziwei Liu},
title = {HyperDreamer: Hyper-Realistic 3D Content Generation and Editing from a Single Image},
booktitle={ACM SIGGRAPH Asia 2023 Conference Proceedings},
year={2023}
}