Improving Object Detection Quality by Incorporating Global Contexts via Self-Attention

Lee, Donghyeon and Kim, Joonyoung and Jung, Kyomin (2021) Improving Object Detection Quality by Incorporating Global Contexts via Self-Attention. Electronics, 10 (1). p. 90. ISSN 2079-9292

Abstract

Fully convolutional structures produce feature maps that capture the local context of an image simply by stacking many convolutional layers. These structures are known to be effective in modern state-of-the-art object detectors such as Faster R-CNN and SSD, which find objects from local context. However, detection quality can be further improved by incorporating global context, particularly when an ambiguous object can only be identified from surrounding objects or the background. In this paper, we introduce a self-attention module that allows object detectors to incorporate global context. More specifically, the module lets the feature extractor compute feature maps with global context through the self-attention mechanism: it computes relationships among all elements in the feature maps and then blends the feature maps according to those relationships. The module can therefore capture long-range relationships among objects or backgrounds, which is difficult for fully convolutional structures. Furthermore, the proposed module is not tied to any specific object detector and can be applied to any CNN-based model for any computer vision task. In experiments on the object detection task, our method shows remarkable gains in average precision (AP) over popular models with fully convolutional structures. In particular, compared to Faster R-CNN with the ResNet-50 backbone, our module applied to the same backbone achieves a gain of +4.0 AP without bells and whistles. In image semantic segmentation and panoptic segmentation tasks, our module improves performance on all metrics used for each task.
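
The two steps the abstract describes, computing relationships among all elements of a feature map and then blending the features according to those relationships, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of scaled dot-product similarity, the absence of learned query/key/value projections, and the residual addition are all assumptions made to keep the example short.

```python
import math


def softmax(scores):
    """Turn raw similarity scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_2d(feature_map):
    """Blend every position of an H x W x C feature map with all others.

    Hypothetical helper: identity projections are assumed here, whereas a
    trained module would normally learn separate query/key/value weights.
    """
    height, width = len(feature_map), len(feature_map[0])
    channels = len(feature_map[0][0])

    # Flatten the spatial grid into N = H * W feature vectors.
    positions = [feature_map[i][j] for i in range(height) for j in range(width)]
    scale = math.sqrt(channels)

    blended_positions = []
    for query in positions:
        # Step 1: relationship between this position and every position.
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale for key in positions
        ]
        weights = softmax(scores)

        # Step 2: blend all feature vectors using those relationships.
        blended = [
            sum(w * value[ch] for w, value in zip(weights, positions))
            for ch in range(channels)
        ]

        # Assumed residual connection: keep the local feature, add global context.
        blended_positions.append([q + b for q, b in zip(query, blended)])

    # Restore the H x W x C layout.
    return [
        blended_positions[i * width:(i + 1) * width] for i in range(height)
    ]
```

Because every position attends to every other position, the cost grows quadratically with the number of spatial locations, which is one reason such a module would typically be applied to downsampled feature maps rather than to the input image.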

Item Type: Article
Uncontrolled Keywords: object detection; global context; self-attention; convolutional neural network
Subjects: STM Repository > Engineering
Depositing User: Managing Editor
Date Deposited: 03 Aug 2024 13:06
Last Modified: 03 Aug 2024 13:06
URI: http://classical.goforpromo.com/id/eprint/693
