• KSII Transactions on Internet and Information Systems
    Monthly Online Journal (eISSN: 1976-7277)

Image Captioning with Synergy-Gated Attention and Recurrent Fusion LSTM

Vol. 16, No. 10, October 31, 2022
DOI: 10.3837/tiis.2022.10.010

Abstract

Long Short-Term Memory (LSTM) combined with an attention mechanism is widely used in image captioning models to generate semantically meaningful sentences describing images. However, most related works do not make sufficient use of salient-region features and spatial information. In addition, the LSTM underutilizes the information available in a single time step. In this paper, two approaches are proposed to address these problems. First, the Synergy-Gated Attention (SGA) method is proposed, which processes the spatial features and the salient-region features of a given image simultaneously. SGA establishes a gating mechanism driven by the global features to guide the interaction of information between these two kinds of features. Second, the Recurrent Fusion LSTM (RF-LSTM) mechanism is proposed, which predicts the next hidden vectors within a single time step and improves linguistic coherence by fusing this future information. Experimental results on the MSCOCO benchmark dataset show that the proposed method improves the performance of the image captioning model and achieves performance competitive with state-of-the-art methods on multiple evaluation metrics.
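The gating idea described for SGA can be illustrated with a minimal sketch. The abstract does not give the actual equations, so everything below is an assumption made for illustration only: the dot-product attention, the sigmoid gate computed from the global feature, the convex combination of the two attended contexts, and all names (`attend`, `gated_fusion`, `w_gate`) and feature sizes are hypothetical and are not taken from the paper.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def softmax(x):
    e = np.exp(x - x.max())  # subtract the max for numerical stability
    return e / e.sum()


def attend(features, query):
    """Dot-product attention: a weighted average of the feature rows."""
    weights = softmax(features @ query)  # one weight per row, sums to 1
    return weights @ features            # context vector of shape (d,)


def gated_fusion(spatial, regions, global_feat, query, w_gate):
    """Blend two attended contexts with a gate computed from global features.

    Hypothetical formulation; the paper's real SGA equations are not
    stated in the abstract.
    """
    gate = sigmoid(w_gate @ global_feat)  # shape (d,), each value in (0, 1)
    spatial_ctx = attend(spatial, query)
    region_ctx = attend(regions, query)
    return gate * spatial_ctx + (1.0 - gate) * region_ctx


rng = np.random.default_rng(0)
d = 8                                 # assumed feature dimension
spatial = rng.normal(size=(49, d))    # assumed 7x7 grid of spatial features
regions = rng.normal(size=(36, d))    # assumed 36 salient-region features
global_feat = spatial.mean(axis=0)    # assumed global feature: mean-pooled grid
query = rng.normal(size=d)            # stands in for the decoder hidden state
w_gate = rng.normal(size=(d, d))      # stands in for a learned gate matrix

fused = gated_fusion(spatial, regions, global_feat, query, w_gate)
print(fused.shape)
```

Because the gate lies strictly between 0 and 1, each element of the fused vector in this sketch falls between the corresponding elements of the spatial and region contexts, which is one simple way a global feature could control how much each source contributes.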


Cite this article

[IEEE Style]
Y. Yang, L. Chen, L. Pan, J. Hu, "Image Captioning with Synergy-Gated Attention and Recurrent Fusion LSTM," KSII Transactions on Internet and Information Systems, vol. 16, no. 10, pp. 3390-3405, 2022. DOI: 10.3837/tiis.2022.10.010.

[ACM Style]
You Yang, Lizhi Chen, Longyue Pan, and Juntao Hu. 2022. Image Captioning with Synergy-Gated Attention and Recurrent Fusion LSTM. KSII Transactions on Internet and Information Systems, 16, 10, (2022), 3390-3405. DOI: 10.3837/tiis.2022.10.010.

[BibTeX Style]
@article{tiis:37887, title="Image Captioning with Synergy-Gated Attention and Recurrent Fusion LSTM", author="You Yang and Lizhi Chen and Longyue Pan and Juntao Hu", journal="KSII Transactions on Internet and Information Systems", DOI={10.3837/tiis.2022.10.010}, volume={16}, number={10}, year="2022", month={October}, pages={3390-3405}}