• KSII Transactions on Internet and Information Systems
    Monthly Online Journal (eISSN: 1976-7277)

Deep Image Annotation and Classification by Fusing Multi-Modal Semantic Topics


Abstract

Due to the semantic gap across different modalities, automatic retrieval from multimedia collections remains a major challenge. It is desirable to provide an effective joint model that bridges the gap and organizes the relationships between modalities. In this work, we develop a deep image annotation and classification by fusing multi-modal semantic topics (DAC_mmst) model, which can discover visual and non-visual topics by jointly modeling an image and its loosely related text for deep image annotation, while simultaneously learning and predicting the class label. More specifically, DAC_mmst relies on a non-parametric Bayesian model to estimate the number of visual topics that best explains the image. To evaluate the effectiveness of the proposed algorithm, we collect a real-world dataset and conduct various experiments. The experimental results show that DAC_mmst performs favorably in perplexity, image annotation, and classification accuracy compared to several state-of-the-art methods.
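The abstract notes that DAC_mmst uses a non-parametric Bayesian model to let the data determine the number of visual topics rather than fixing it in advance. The core idea can be illustrated with a Chinese Restaurant Process simulation, the standard metaphor for Dirichlet-process priors; this is an illustrative sketch of the general technique, not the authors' model or code:

```python
import random

def crp_table_counts(n_customers: int, alpha: float, seed: int = 0) -> list:
    """Simulate a Chinese Restaurant Process: each new customer joins an
    existing table with probability proportional to its occupancy, or opens
    a new table with probability proportional to alpha. The number of
    occupied tables (topics) is not fixed in advance -- it grows with the
    data, which is the defining property of non-parametric topic models."""
    rng = random.Random(seed)
    tables = []  # tables[k] = number of customers seated at table k
    for i in range(n_customers):
        # Candidate weights: each existing table, plus a "new table" option.
        weights = tables + [alpha]
        r = rng.uniform(0, i + alpha)  # total weight so far is i + alpha
        acc = 0.0
        for k, w in enumerate(weights):
            acc += w
            if r < acc:
                break
        if k == len(tables):
            tables.append(1)   # open a new table (a new topic appears)
        else:
            tables[k] += 1     # join an existing table (reuse a topic)
    return tables
```

With a small concentration parameter `alpha`, most customers cluster at a few large tables; a larger `alpha` yields more tables, growing roughly as `alpha * log(n)`. In a topic model this lets the posterior concentrate on however many topics the image collection actually supports.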



Cite this article

[IEEE Style]
YongHeng Chen, Fuquan Zhang and WanLi Zuo, "Deep Image Annotation and Classification by Fusing Multi-Modal Semantic Topics," KSII Transactions on Internet and Information Systems, vol. 12, no. 1, pp. 392-412, 2018. DOI: 10.3837/tiis.2018.01.019

[ACM Style]
Chen, Y., Zhang, F., and Zuo, W. 2018. Deep Image Annotation and Classification by Fusing Multi-Modal Semantic Topics. KSII Transactions on Internet and Information Systems, 12, 1, (2018), 392-412. DOI: 10.3837/tiis.2018.01.019