Online crowdsourcing: rating annotators and obtaining cost-effective labels
- Creators
- Welinder, Peter
- Perona, Pietro
Abstract
Labeling large datasets has become faster, cheaper, and easier with the advent of crowdsourcing services like Amazon Mechanical Turk. How can one trust the labels obtained from such services? We propose a model of the labeling process which includes label uncertainty, as well as a multi-dimensional measure of the annotators' ability. From the model we derive an online algorithm that estimates the most likely value of the labels and the annotator abilities. It finds and prioritizes experts when requesting labels, and actively excludes unreliable annotators. Based on labels already obtained, it dynamically chooses which images will be labeled next, and how many labels to request in order to achieve a desired level of confidence. Our algorithm is general and can handle binary, multi-valued, and continuous annotations (e.g. bounding boxes). Experiments on a dataset containing more than 50,000 labels show that our algorithm reduces the number of labels required, and thus the total cost of labeling, by a large factor while keeping error rates low on a variety of datasets.
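To make the idea of requesting labels only until a desired confidence is reached more concrete, here is a minimal Python sketch. It is not the authors' model or algorithm: it assumes binary labels, a single known accuracy per annotator (strictly between 0 and 1) instead of the multi-dimensional ability measure described above, and independent annotator errors. The function names `posterior_positive` and `collect_until_confident` are hypothetical and chosen only for illustration.

```python
import math


def posterior_positive(labels, accuracies, prior=0.5):
    """Posterior probability that the true binary label is 1.

    Simplifying assumptions (not taken from the paper): each annotator
    is correct with a known probability `acc` in (0, 1), and annotators
    err independently of one another.
    """
    log_odds = math.log(prior / (1.0 - prior))
    for label, acc in zip(labels, accuracies):
        # Each vote shifts the log-odds by the annotator's log-likelihood ratio.
        llr = math.log(acc / (1.0 - acc))
        log_odds += llr if label == 1 else -llr
    return 1.0 / (1.0 + math.exp(-log_odds))


def collect_until_confident(stream, threshold=0.95, max_labels=10):
    """Consume (label, accuracy) pairs until the posterior is confident.

    Returns (estimated_label, confidence, number_of_labels_used).
    """
    labels, accuracies = [], []
    p = 0.5
    for label, acc in stream:
        labels.append(label)
        accuracies.append(acc)
        p = posterior_positive(labels, accuracies)
        # Stop as soon as either class is sufficiently likely,
        # or the labeling budget for this item is exhausted.
        if max(p, 1.0 - p) >= threshold or len(labels) >= max_labels:
            break
    estimate = 1 if p >= 0.5 else 0
    return estimate, max(p, 1.0 - p), len(labels)


if __name__ == "__main__":
    votes = [(1, 0.9), (1, 0.9), (1, 0.9)]
    print(collect_until_confident(iter(votes)))
```

In this simplified setting, two agreeing votes from annotators assumed to be 90% accurate already push the posterior past 0.95, so the third vote is never requested. The saving in labels comes from that early stopping; the paper's algorithm additionally estimates annotator abilities online rather than assuming them known.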
Additional Information
©2010 IEEE. We thank Catherine Wah, Florian Schroff, Steve Branson, and Serge Belongie for motivation, discussions and help with the data collection. We also thank Piotr Dollar, Merrielle Spain, Michael Maire, and Kristen Grauman for helpful discussions and feedback. This work was supported by ONR MURI Grant #N00014-06-1-0734 and ONR/Evolution Grant #N00173-09-C-4005.
Attached Files
Accepted Version - WelinderPerona10.pdf
Files
Name | Size |
---|---|
md5:4f978fd46d4c3b9f62e8c22b4d954fa1 | 1.2 MB |
Additional details
- Eprint ID
- 47669
- Resolver ID
- CaltechAUTHORS:20140730-102210223
- ONR MURI
- #N00014-06-1-0734
- ONR Evolution
- #N00173-09-C-4005
- Created
- 2014-07-30 (created from EPrint's datestamp field)
- Updated
- 2021-11-10 (created from EPrint's last_modified field)