Published June 13, 2018 | Submitted
Report Open

It's all Relative: Monocular 3D Human Pose Estimation from Weakly Supervised Data

Abstract

We address the problem of 3D human pose estimation from 2D input images using only weakly supervised training data. Although supervised machine learning has shown considerable success for 2D pose estimation, its application to 3D pose estimation in real-world images is currently hampered by the lack of varied training images with associated 3D poses. Existing 3D pose estimation algorithms train on data that has either been collected in carefully controlled studio settings or been generated synthetically. We take a different approach and propose a 3D human pose estimation algorithm that requires only relative estimates of depth at training time. Such a training signal, although noisy, can be easily collected from crowd annotators and is of sufficient quality to enable successful training and evaluation of 3D pose. Our results are competitive with fully supervised regression-based approaches on the Human3.6M dataset, despite using significantly weaker training data. Our proposed approach opens the door to using existing widespread 2D datasets for 3D pose estimation by allowing fine-tuning with noisy relative constraints, resulting in more accurate 3D poses.
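
To make the idea of training from relative depth estimates concrete, the following is a minimal sketch of a pairwise ranking-style loss. It is an illustration only, not the formulation used in the report: the function names, the sign convention for the label `r`, and the choice of a softplus penalty for ordered pairs with a squared penalty for "same depth" pairs are all assumptions made for this example.

```python
import math


def relative_depth_loss(z_i, z_j, r):
    """Loss for one relative-depth annotation between two joints.

    Hypothetical convention (an assumption for this sketch):
      r = +1  annotator says joint i is farther from the camera than joint j
      r = -1  annotator says joint i is closer than joint j
      r =  0  annotator says both joints are at roughly the same depth
    z_i and z_j are the depths predicted by the model.
    """
    diff = z_i - z_j
    if r == 0:
        # Pull the two predicted depths together.
        return diff ** 2
    # softplus(-r * diff): small when the predicted order agrees with r.
    x = -r * diff
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))  # numerically stable


def mean_relative_loss(depths, annotations):
    """Average the pairwise loss over (i, j, r) annotations for one pose."""
    total = sum(relative_depth_loss(depths[i], depths[j], r)
                for i, j, r in annotations)
    return total / len(annotations)


# Example: three joints with predicted depths and two noisy crowd labels.
predicted = [2.0, 0.5, 0.6]
labels = [(0, 1, +1), (1, 2, 0)]
loss = mean_relative_loss(predicted, labels)
```

A loss of this kind needs only an ordering label per joint pair, which is why it can be fitted to crowd annotations rather than to metric 3D ground truth.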

Additional Information

© 2018. The copyright of this document resides with its authors. It may be distributed unchanged freely in print or electronic forms. We would like to thank Google for their gift to the Visipedia project and Amazon Web Services (AWS) for Research Credits.

Attached Files

Submitted - 1805.06880.pdf

Files

1805.06880.pdf (4.4 MB)
md5:0ae4103a24e3b8cb13a7fa9e0ac29aca

Additional details

Created: August 19, 2023
Modified: October 18, 2023