
more overview
Angjoo Kanazawa committed Dec 19, 2017
1 parent bfa2a47 commit d7a2325
Showing 1 changed file with 72 additions and 14 deletions.
86 changes: 72 additions & 14 deletions index.html
@@ -238,26 +238,84 @@
<tr>
<td><center> <br>
<span style="font-size:20px">&nbsp;<a href='https://github.com/akanazawa/hmr'>
Code [coming soon]</a> </span>
<br>
</center>
</td>
</tr>
</table>
<br>

<br>
<hr>
We present an end-to-end framework for recovering a full 3D mesh
of a human body from a single RGB image. We use the generative
human body model <a href="http://smpl.is.tue.mpg.de/">SMPL</a>,
which parameterizes the mesh by 3D joint angles and a
low-dimensional linear shape space (this parameterization is
sketched in code below).
Estimating a 3D mesh opens the door to a wide range of applications such as foreground and
part segmentation and dense correspondences that are beyond
what is practical with a simple skeleton. The output mesh can be
immediately used by animators, modified, measured, manipulated
and retargeted. Our output is also holistic – we always infer
the full 3D body, even in cases of occlusion and
truncation. <br>
<br>
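Concretely, here is a minimal Python sketch of that parameterization
(variable names and ordering are ours, not the released code): SMPL
poses the mesh with 72 axis-angle parameters (a global rotation plus
23 joint rotations) and 10 linear shape coefficients, and HMR
additionally regresses a 3-parameter weak-perspective camera, giving
an 85-D output.
<pre>
import numpy as np

# Sketch of the 85-D vector HMR regresses per image:
# 72 SMPL pose + 10 SMPL shape + 3 weak-perspective camera.
# Names and ordering are illustrative, not the released code.
NUM_JOINTS = 23                      # SMPL body joints
POSE_DIM = 3 * (NUM_JOINTS + 1)      # axis-angle per joint + global rotation = 72
SHAPE_DIM = 10                       # linear shape coefficients (betas)
CAM_DIM = 3                          # scale s and image translation (tx, ty)

theta = np.zeros(POSE_DIM + SHAPE_DIM + CAM_DIM)  # 85-D
pose = theta[:POSE_DIM]                           # 3D joint angles
shape = theta[POSE_DIM:POSE_DIM + SHAPE_DIM]      # body shape
cam = theta[-CAM_DIM:]                            # [s, tx, ty]
</pre>
<br>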
There are several challenges in training such a model in an end-to-end
manner:
<ol>
<li> First is the lack of large-scale ground truth 3D
annotation for <i>in-the-wild</i> images. Existing datasets with
accurate 3D annotations are captured in constrained
environments
(<a href="http://humaneva.is.tue.mpg.de/">HumanEva</a>
, <a href="http://vision.imar.ro/human3.6m/description.php">Human3.6M</a>
, <a href="http://gvv.mpi-inf.mpg.de/3dhp-dataset/">MPI-INF-3DHP</a>
). Models trained on these datasets do not generalize
well to the richness of images in the real world.

<li> Second are the inherent ambiguities of the single-view 2D-to-3D
mapping: many different 3D configurations can explain the same 2D
observation, and many of them are not
anthropometrically reasonable, such as impossible joint angles
or extremely skinny bodies. In addition, estimating the camera explicitly introduces an additional scale ambiguity between the size of the person and the camera distance, illustrated in the sketch after this list.
</ol>
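To make the scale ambiguity concrete, here is a toy sketch of a
weak-perspective camera (our notation, not code from the paper):
scaling the body up while scaling the camera down yields identical
2D keypoints.
<pre>
import numpy as np

def weak_perspective_project(X, s, t):
    """Project 3D points X (N, 3) with scale s and 2D translation t (2,)."""
    return s * X[:, :2] + t

X = np.random.randn(24, 3)                                    # 3D joints of some body
x1 = weak_perspective_project(X, s=1.0, t=np.zeros(2))
x2 = weak_perspective_project(2.0 * X, s=0.5, t=np.zeros(2))  # twice the size, half the scale
assert np.allclose(x1, x2)  # the two projections are indistinguishable
</pre>
<br>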
In this work we propose a novel approach to mesh reconstruction that
addresses both of these challenges. The key insight is that even though
we don't have large-scale paired 2D-to-3D labels for images in-the-wild, we have
a lot of <i>unpaired</i> data: large-scale 2D keypoint
annotations of in-the-wild images
(<a href="http://sam.johnson.io/research/lsp.html">LSP</a>
, <a href="http://human-pose.mpi-inf.mpg.de/">MPII</a>
, <a href="http://cocodataset.org/#keypoints-challenge2017">COCO</a>
, etc) and a
separate large-scale dataset of 3D meshes of people with various
poses and shapes from MoCap. Our key contribution is to take
advantage of these <i>unpaired</i> 2D keypoint annotations and 3D
scans in a conditional generative adversarial manner. <br>

The idea is that, given an image, the network has to infer the 3D
mesh parameters and the camera such that the 3D keypoints match the
annotated 2D keypoints after projection. To deal with ambiguities,
these parameters are sent to a discriminator network, whose task is
to determine if the 3D parameters correspond to bodies of real
humans or not. Hence the network is encouraged to output parameters
on the human manifold, and the discriminator acts as weak
supervision. The network implicitly learns the angle limits for each
joint and is discouraged from producing people with unusual body
shapes.
<br>
<br>
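In code, a hedged sketch of the resulting objective (the symbols and
the least-squares adversarial form are our reading of the paper): the
encoder minimizes a keypoint reprojection loss plus an adversarial
term, while the discriminator is trained to separate real MoCap
parameters from generated ones.
<pre>
import numpy as np

# Hedged sketch of the training losses; our reading, not the released code.
def reprojection_loss(pred_2d, gt_2d, vis):
    """L1 distance on annotated 2D keypoints, masked by visibility."""
    return np.sum(vis[:, None] * np.abs(pred_2d - gt_2d))

def encoder_loss(pred_2d, gt_2d, vis, disc_fake, lam=1.0):
    """Match the 2D annotations and fool the discriminator
    (least-squares adversarial term: generated parameters should score 1)."""
    return reprojection_loss(pred_2d, gt_2d, vis) + lam * np.sum((disc_fake - 1.0) ** 2)

def discriminator_loss(disc_real, disc_fake):
    """Real MoCap parameters should score 1, generated ones 0."""
    return np.sum((disc_real - 1.0) ** 2) + np.sum(disc_fake ** 2)
</pre>
<br>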
We take advantage of the structure of the body model and propose a
factorized adversarial prior. We show that we can train the model even
<i>without</i> using any paired 2D-to-3D training data (the pink meshes are all
results of this unpaired model), and HMR still produces reasonable 3D
reconstructions. This is most exciting because it opens up
possibilities for learning 3D from large amounts of 2D data.
<br>
<br>
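The factorization can be pictured as a set of small discriminators:
one per joint rotation, one for the shape coefficients, and one for
the full pose, K + 2 in total. A toy sketch with stand-in scoring
functions (the real discriminators are small learned networks
described in the paper):
<pre>
import numpy as np

K = 23  # SMPL body joints

def make_toy_disc():
    """Stand-in for a small learned discriminator mapping its input to a score."""
    w = np.random.randn()
    return lambda x: float(np.tanh(w * np.sum(x)))

discs_joint = [make_toy_disc() for _ in range(K)]  # per-joint angle limits
disc_shape = make_toy_disc()                       # plausible body shapes
disc_pose = make_toy_disc()                        # joint combinations

def factorized_scores(joint_rotations, shape):
    scores = [d(r) for d, r in zip(discs_joint, joint_rotations)]  # each joint alone
    scores.append(disc_shape(shape))                               # shape
    scores.append(disc_pose(joint_rotations))                      # all joints together
    return scores                                                  # K + 2 scores

scores = factorized_scores(np.random.randn(K, 3, 3), np.random.randn(10))
</pre>
<br>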
Please see the <a href="https://arxiv.org/pdf/1712.06584.pdf">paper</a> for more details.
<hr>
<br>
<table align=center width=1100px>
<tr>
<td>
