• KSII Transactions on Internet and Information Systems
    Monthly Online Journal (eISSN: 1976-7277)

Question Similarity Measurement of Chinese Crop Diseases and Insect Pests Based on Mixed Information Extraction


Abstract

Question Similarity Measurement of Chinese Crop Diseases and Insect Pests (QSM-CCD&IP) aims to determine the intent behind a user's input question. It underpins agricultural knowledge question answering (Q&A) systems, information retrieval, and other tasks. However, the corpora and measurement methods available in this field are deficient. In addition, general sentence-embedding methods ignore word boundary features and local context information, which can cause error propagation. These factors make the task challenging. To address these problems, this work first establishes a corpus of Chinese crop diseases and insect pests (CCDIP) containing 13 categories. Taking the CCDIP as the research object, the study then proposes AgrCQS, a Chinese agricultural text similarity matching model based on mixed information extraction. Specifically, a hybrid embedding layer enriches character information and improves the model's ability to recognize word boundaries; a multi-core convolutional neural network based on multi-weight (MM-CNN) extracts multi-scale local information; and a self-attention mechanism enhances the model's ability to fuse global information. The performance of AgrCQS is verified on the CCDIP and on three benchmark datasets, namely AFQMC, LCQMC, and BQ. The accuracy rates on CCDIP, AFQMC, LCQMC, and BQ are 93.92%, 74.42%, 86.35%, and 83.05%, respectively, which are higher than those of baseline systems without using any external knowledge. Additionally, the proposed modules can be extracted separately and applied to other models, providing a reference for related research.
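
The combination of multi-scale local features and self-attention described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' AgrCQS implementation: the hybrid embedding layer is replaced by random vectors, the "multi-weight" part of MM-CNN is omitted, the kernel widths (2, 3, 4) and filter counts are arbitrary, and scoring two questions by cosine similarity is an assumption made only to keep the example short. All function names are hypothetical.

```python
import numpy as np


def multi_scale_conv(x, kernels):
    """Apply 1D convolutions of several widths and concatenate the results.

    x: (seq_len, dim) character embeddings (random stand-ins in this sketch).
    kernels: list of arrays shaped (width, dim, n_filters), one per scale.
    """
    outputs = []
    for w in kernels:
        width = w.shape[0]
        # Zero-pad on the right so every scale yields seq_len positions.
        padded = np.pad(x, ((0, width - 1), (0, 0)))
        windows = np.stack([padded[i:i + width] for i in range(x.shape[0])])
        feat = np.einsum("swd,wdf->sf", windows, w)  # (seq_len, n_filters)
        outputs.append(np.maximum(feat, 0.0))        # ReLU
    return np.concatenate(outputs, axis=1)


def self_attention(h):
    """Scaled dot-product self-attention without learned projections."""
    scores = h @ h.T / np.sqrt(h.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ h


def encode(x, kernels):
    """Local features, then global fusion, then max-pooling over positions."""
    return self_attention(multi_scale_conv(x, kernels)).max(axis=0)


def cosine(a, b):
    """Cosine similarity (an assumed scoring function, not from the paper)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


rng = np.random.default_rng(0)
dim, n_filters = 8, 4
kernels = [rng.normal(size=(width, dim, n_filters)) for width in (2, 3, 4)]

question_a = rng.normal(size=(6, dim))  # 6 characters
question_b = rng.normal(size=(9, dim))  # 9 characters

vec_a = encode(question_a, kernels)
vec_b = encode(question_b, kernels)
print("similarity:", cosine(vec_a, vec_b))
```

Because each scale is padded to the same number of positions, the per-scale feature maps can be concatenated along the feature axis before attention, and questions of different lengths still map to vectors of the same size.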


Cite this article

[IEEE Style]
H. Zhou, X. Guo, C. Liu, Z. Tang, S. Lu and L. Li, "Question Similarity Measurement of Chinese Crop Diseases and Insect Pests Based on Mixed Information Extraction," KSII Transactions on Internet and Information Systems, vol. 15, no. 11, pp. 3991-4010, 2021. DOI: 10.3837/tiis.2021.11.007.

[ACM Style]
Han Zhou, Xuchao Guo, Chengqi Liu, Zhan Tang, Shuhan Lu, and Lin Li. 2021. Question Similarity Measurement of Chinese Crop Diseases and Insect Pests Based on Mixed Information Extraction. KSII Transactions on Internet and Information Systems, 15, 11, (2021), 3991-4010. DOI: 10.3837/tiis.2021.11.007.

[BibTeX Style]
@article{tiis:25099,
  title={Question Similarity Measurement of Chinese Crop Diseases and Insect Pests Based on Mixed Information Extraction},
  author={Han Zhou and Xuchao Guo and Chengqi Liu and Zhan Tang and Shuhan Lu and Lin Li},
  journal={KSII Transactions on Internet and Information Systems},
  doi={10.3837/tiis.2021.11.007},
  volume={15},
  number={11},
  year={2021},
  month={November},
  pages={3991-4010}
}