Contextual Attention for Hand Detection in the Wild
Abstract
We present Hand-CNN, a novel convolutional network architecture for detecting hand masks and predicting hand orientations in unconstrained images. Hand-CNN extends MaskRCNN with a novel attention mechanism to incorporate contextual cues in the detection process. This attention mechanism can be implemented as an efficient network module that captures non-local dependencies between features. This network module can be inserted at different stages of an object detection network, and the entire detector can be trained end-to-end. We also introduce large-scale annotated hand datasets containing hands in unconstrained images for training and evaluation. We show that Hand-CNN outperforms existing methods on the newly collected datasets and the publicly available PASCAL VOC human layout dataset. Data and code: https://www3.cs.stonybrook.edu/~cvl/projects/hand_det_attention/
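As a rough illustration of what "capturing non-local dependencies between features" can mean, the sketch below applies a generic non-local (self-attention) update to a handful of feature vectors. It is not the Hand-CNN contextual attention module. The function names, the identity embeddings, and the scaled dot-product similarity are all assumptions made for illustration; a trained module would use learned projections.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def non_local_block(features):
    """Generic non-local update with a residual connection (illustrative only).

    features: list of N feature vectors, one per spatial location.
    Assumption: identity embeddings and scaled dot-product similarity.
    """
    dim = len(features[0])
    scale = math.sqrt(dim)
    out = []
    for query in features:
        # Similarity of this location to every location, itself included.
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in features]
        weights = softmax(scores)
        # Context vector: attention-weighted sum over all locations.
        context = [sum(w * f[d] for w, f in zip(weights, features))
                   for d in range(dim)]
        # Residual connection keeps the original feature in the output.
        out.append([x + c for x, c in zip(query, context)])
    return out


feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
updated = non_local_block(feats)
```

Because the output has the same shape as the input, a block of this kind can in principle be placed between existing layers of a detector, which is consistent with the abstract's remark that the module can be inserted at different stages of the network.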
Additional Information
© 2019 IEEE. This work is partially supported by VinAI Research and NSF IIS-1763981. Many thanks to Tomas Simon for his suggestion about the COCO dataset and Rakshit Gautam for his contribution to the data annotation process.
Attached Files
Submitted - 1904.04882.pdf
Additional details
- Eprint ID: 101740
- DOI: 10.1109/ICCV.2019.00966
- Resolver ID: CaltechAUTHORS:20200306-124356519
- Funders: VinAI Research; NSF (IIS-1763981)
- Created: 2020-03-06 (from EPrint's datestamp field)
- Updated: 2021-11-16 (from EPrint's last_modified field)