• KSII Transactions on Internet and Information Systems
    Monthly Online Journal (eISSN: 1976-7277)

Dual Attention Based Image Pyramid Network for Object Detection


Compared with two-stage object detection algorithms, one-stage algorithms provide a better trade-off between real-time performance and accuracy. However, these methods treat intermediate features equally and therefore lack the flexibility to emphasize information that is meaningful for classification and localization. In addition, they ignore the interaction of contextual information across scales, which is important for detecting medium and small objects. To tackle these problems, we propose a dual attention based image pyramid network (DAIPNet) for one-stage object detection, which builds an image pyramid to enrich spatial information while emphasizing informative multi-scale features through dual attention mechanisms. Our framework uses a pre-trained backbone as the standard detection network, and the designed image pyramid network (IPN) serves as an auxiliary network that provides complementary information. The dual attention mechanism is composed of the adaptive feature fusion module (AFFM) and the progressive attention fusion module (PAFM). AFFM is designed to automatically attend to feature maps from the backbone and the auxiliary network according to their importance, while PAFM adaptively learns channel-attentive information during the context transfer process. Furthermore, in the IPN, we build an image pyramid to extract scale-wise features from downsampled images of different scales; these features are further fused at different states to enrich scale-wise information and learn more comprehensive feature representations. Experiments are conducted on the MS COCO dataset. With a 300×300 input, our proposed detector achieves 32.6% mAP on MS COCO test-dev, a superior result compared with state-of-the-art methods.
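The three ideas named in the abstract (an image pyramid built from downsampled inputs, importance-weighted fusion of backbone and auxiliary features, and channel attention) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the layers of IPN, AFFM, or PAFM, so the function names, the 2×2 average-pooling downsampler, the softmax-normalised scalar fusion weights, and the squeeze-and-excitation style gate below are all assumptions chosen only to make the concepts concrete.

```python
import numpy as np


def build_image_pyramid(image, num_levels=3):
    """Return num_levels images, each half the resolution of the previous one.

    Assumption: downsampling is plain 2x2 average pooling; the paper may
    use a different resampling scheme.
    """
    levels = [image]
    for _ in range(num_levels - 1):
        h, w, c = levels[-1].shape
        h2, w2 = h // 2, w // 2
        cropped = levels[-1][: h2 * 2, : w2 * 2]  # drop an odd last row/column
        pooled = cropped.reshape(h2, 2, w2, 2, c).mean(axis=(1, 3))
        levels.append(pooled)
    return levels


def adaptive_fusion(backbone_feat, aux_feat, logits):
    """Fuse two same-shaped feature maps with softmax-normalised weights.

    Assumption: one learnable scalar logit per source stands in for
    whatever importance weighting AFFM actually learns.
    """
    logits = np.asarray(logits, dtype=float)
    weights = np.exp(logits - logits.max())
    weights = weights / weights.sum()
    return weights[0] * backbone_feat + weights[1] * aux_feat


def channel_attention(feat, w1, w2):
    """Rescale channels of an (H, W, C) map with a sigmoid gate.

    Assumption: a squeeze-and-excitation style gate (global average pool,
    two-layer MLP, sigmoid) is used as a generic example of channel attention.
    """
    squeezed = feat.mean(axis=(0, 1))             # (C,)
    hidden = np.maximum(squeezed @ w1, 0.0)       # ReLU, shape (C // r,)
    gate = 1.0 / (1.0 + np.exp(-(hidden @ w2)))   # sigmoid, shape (C,)
    return feat * gate


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((300, 300, 3))             # 300x300 input, as in the abstract
    for level in build_image_pyramid(image):
        print(level.shape)                        # (300, 300, 3), (150, 150, 3), (75, 75, 3)
```

In a trained detector the fusion logits and the matrices `w1` and `w2` would be learned parameters; here they are passed in explicitly so the sketch stays free of any deep learning framework.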



Cite this article

[IEEE Style]
X. Dong, F. Li, H. Bai and Y. Zhao, "Dual Attention Based Image Pyramid Network for Object Detection," KSII Transactions on Internet and Information Systems, vol. 15, no. 12, pp. 4439-4455, 2021. DOI: 10.3837/tiis.2021.12.010.

[ACM Style]
Xiang Dong, Feng Li, Huihui Bai, and Yao Zhao. 2021. Dual Attention Based Image Pyramid Network for Object Detection. KSII Transactions on Internet and Information Systems, 15, 12, (2021), 4439-4455. DOI: 10.3837/tiis.2021.12.010.

[BibTeX Style]
@article{tiis:25147, title="Dual Attention Based Image Pyramid Network for Object Detection", author="Xiang Dong and Feng Li and Huihui Bai and Yao Zhao", journal="KSII Transactions on Internet and Information Systems", DOI={10.3837/tiis.2021.12.010}, volume={15}, number={12}, year="2021", month={December}, pages={4439-4455}}