Image-Level or Object-Level? A Tale of Two Resampling Strategies for Long-Tailed Detection
Abstract
Training on datasets with long-tailed distributions has been challenging for major recognition tasks such as classification and detection. To deal with this challenge, image resampling is typically introduced as a simple but effective approach. However, we observe that long-tailed detection differs from classification, since multiple classes may be present in one image. As a result, image resampling alone is not enough to yield a sufficiently balanced distribution at the object level. We address object-level resampling by introducing an object-centric sampling strategy based on a dynamic, episodic memory bank. Our proposed strategy has two benefits: 1) convenient object-level resampling without significant extra computation, and 2) implicit feature-level augmentation from model updates. We show that image-level and object-level resampling are both important, and thus unify them with a joint resampling strategy. Our method achieves state-of-the-art performance on the rare categories of LVIS, with 1.89% and 3.13% relative improvements over Forest R-CNN on detection and instance segmentation, respectively.
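To make the two levels of resampling concrete, below is a minimal Python sketch of the general idea, not the authors' implementation. The names (`ObjectMemoryBank`, `repeat_factor`) and all numbers are illustrative assumptions: images containing rare classes are drawn more often (image level), and a per-class memory of recently computed object features is read back to top up rare classes within a batch (object level).

```python
# Minimal sketch (not the authors' code) of joint image- and object-level
# resampling. ObjectMemoryBank, repeat_factor, and the toy data are assumptions.
import math
import random
from collections import defaultdict, deque


def repeat_factor(image_classes, class_freq, t=0.001):
    """Image-level weight in the style of LVIS repeat-factor sampling:
    an image is repeated according to its rarest class."""
    return max(max(1.0, math.sqrt(t / class_freq[c])) for c in image_classes)


class ObjectMemoryBank:
    """Per-class FIFO queue of object features. Because entries are written
    during ordinary forward passes, stored features roughly track the current
    model, giving an implicit form of feature-level augmentation."""

    def __init__(self, capacity=64):
        self.bank = defaultdict(lambda: deque(maxlen=capacity))

    def write(self, class_id, feature):
        # Store a (detached) object feature computed in the current episode.
        self.bank[class_id].append(feature)

    def sample(self, class_id):
        # Read back a stored feature for object-level resampling, if any exist.
        items = self.bank[class_id]
        return random.choice(list(items)) if items else None


# Toy usage with made-up frequencies and counts.
freq = {"common_cat": 0.5, "rare_cat": 0.0005}
print(repeat_factor({"common_cat", "rare_cat"}, freq))  # > 1: image drawn more often

bank = ObjectMemoryBank()
bank.write("rare_cat", feature=[0.1, 0.2])
batch_objects = {"common_cat": 8, "rare_cat": 1}
target = max(batch_objects.values())
extra = {c: [bank.sample(c) for _ in range(target - n)] for c, n in batch_objects.items()}
print({c: len([f for f in feats if f is not None]) for c, feats in extra.items()})
```

In this toy run, the image-level weight exceeds 1 because the image contains a rare class, and the object-level step pads the rare class with features drawn from the memory bank so that the batch is closer to balanced at the object level.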
Additional Information
© 2021 by the author(s). We would like to sincerely thank Achal Dave, Kenneth Marino, Senthil Purushwalkam and other NVIDIA colleagues for the discussion and constructive suggestions.
Attached Files
Published - chang21c.pdf
Submitted - 2104.05702.pdf
Files
Name | Size | MD5
---|---|---
chang21c.pdf | 3.9 MB | md5:9ae3abf67ca481c758b0c5904f9d9666
2104.05702.pdf | 3.8 MB | md5:030e541794929a0e488258463efed818
Additional details
- Eprint ID: 109038
- Resolver ID: CaltechAUTHORS:20210510-134322482
- Created: 2021-05-10 (from EPrint's datestamp field)
- Updated: 2023-06-02 (from EPrint's last_modified field)