KSII Transactions on Internet and Information Systems
Monthly Online Journal (eISSN: 1976-7277)

Multi-level Cross-attention Siamese Network For Visual Object Tracking

Vol. 16, No. 12, December 31, 2022
DOI: 10.3837/tiis.2022.12.011

Abstract

Cross-attention is now widely used in Siamese trackers in place of traditional correlation operations for fusing template and search-region features. Compared with correlation, cross-attention better captures the similarity between the target and the search region, which makes visual object tracking more robust. However, existing cross-attention trackers focus only on the rich semantic information in high-level features and ignore the appearance information contained in low-level features, which leaves them vulnerable to interference from similar objects. In this paper, we propose a Multi-level Cross-attention Siamese network (MCSiam) that aggregates semantic and appearance information at the same time. Specifically, a multi-level cross-attention module is designed to fuse the multi-layer features extracted from the backbone. It integrates template and search-region features at different levels, so that rich appearance information and semantic information can be used for tracking simultaneously. In addition, a target-aware module is applied before cross-attention to enhance the target feature and alleviate interference, which lets the multi-level cross-attention module fuse target and search-region information more efficiently. We test MCSiam on four tracking benchmarks, and the results show that the proposed tracker achieves performance comparable to that of state-of-the-art trackers.
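The fusion idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the MCSiam implementation: it assumes plain scaled dot-product cross-attention with no learned projections, a residual connection, and concatenation across levels, and it assumes that every level has already been flattened to the same number of tokens. The function names `cross_attention` and `multi_level_fusion` are hypothetical and chosen only for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(search, template):
    """Search tokens (Ns, C) attend to template tokens (Nt, C).

    Assumption: queries come from the search region, keys and values
    from the template, and no learned projections are applied.
    """
    d = search.shape[-1]
    scores = search @ template.T / np.sqrt(d)   # (Ns, Nt) similarities
    weights = softmax(scores, axis=-1)          # each row sums to 1
    fused = search + weights @ template         # residual fusion, (Ns, C)
    return fused, weights

def multi_level_fusion(search_levels, template_levels):
    # Hypothetical aggregation: fuse each backbone level separately,
    # then concatenate the results along the channel axis.
    fused = [cross_attention(s, t)[0]
             for s, t in zip(search_levels, template_levels)]
    return np.concatenate(fused, axis=-1)

# Toy features: two levels, 16 search tokens, 4 template tokens, 8 channels.
rng = np.random.default_rng(0)
search_levels = [rng.standard_normal((16, 8)) for _ in range(2)]
template_levels = [rng.standard_normal((4, 8)) for _ in range(2)]

fused = multi_level_fusion(search_levels, template_levels)
_, weights = cross_attention(search_levels[0], template_levels[0])
print(fused.shape, weights.shape)
```

In a real tracker the levels would come from different backbone stages with different resolutions and channel counts, so some resizing or projection step would be needed before they could be combined this way.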




Cite this article

[IEEE Style]
J. Zhang, J. Wang, H. Zhang, M. Miao, Z. Cai, and F. Chen, "Multi-level Cross-attention Siamese Network For Visual Object Tracking," KSII Transactions on Internet and Information Systems, vol. 16, no. 12, pp. 3976-3990, 2022. DOI: 10.3837/tiis.2022.12.011.

[ACM Style]
Jianwei Zhang, Jingchao Wang, Huanlong Zhang, Mengen Miao, Zengyu Cai, and Fuguo Chen. 2022. Multi-level Cross-attention Siamese Network For Visual Object Tracking. KSII Transactions on Internet and Information Systems, 16, 12, (2022), 3976-3990. DOI: 10.3837/tiis.2022.12.011.

[BibTeX Style]
@article{tiis:38216, title="Multi-level Cross-attention Siamese Network For Visual Object Tracking", author="Jianwei Zhang and Jingchao Wang and Huanlong Zhang and Mengen Miao and Zengyu Cai and Fuguo Chen", journal="KSII Transactions on Internet and Information Systems", DOI={10.3837/tiis.2022.12.011}, volume={16}, number={12}, year="2022", month={December}, pages={3976-3990}}