KSII Transactions on Internet and Information Systems
Monthly Online Journal (eISSN: 1976-7277)

A Distance Approach for Open Information Extraction Based on Word Vector


Abstract

Web-scale open information extraction (Open IE) plays an important role in NLP tasks such as acquiring common-sense knowledge, learning selectional preferences, and automatic text understanding. Many Open IE approaches have been proposed in the last decade, and most of them rely on supervised learning or dependency parsing. In this paper, we present a novel method for web-scale open information extraction that uses the cosine distance between Google word vectors as the confidence score of an extraction. The proposed method is purely unsupervised: it requires neither hand-labeled training data nor dependency parse features. We also give a mathematically rigorous proof for the new method using Bayesian inference and artificial neural network theory. The proposed algorithm turns out to be equivalent to maximum likelihood estimation of the joint probability distribution over the elements of the candidate extraction. The proof also suggests, on theoretical grounds, a typical use of word vectors in other NLP tasks. Experiments on three benchmark datasets show that the distance-based method improves on recently proposed Open IE systems in both effectiveness and efficiency.
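
As a rough illustration of the idea of scoring a candidate extraction by cosine distance between word vectors, the sketch below may help. It is not the authors' algorithm: the abstract does not specify how the elements of an extraction are combined, so the averaging of word vectors per phrase, the pairwise scoring in the hypothetical `extraction_confidence` function, and the three-dimensional `TOY_VECTORS` table are all assumptions made for this example. A real system would presumably load pretrained vectors instead.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def cosine_distance(u, v):
    """Cosine distance, defined as 1 minus cosine similarity."""
    return 1.0 - cosine_similarity(u, v)


def phrase_vector(words, vectors):
    """Average the vectors of the known words (an assumed pooling choice)."""
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def extraction_confidence(arg1, rel, arg2, vectors):
    """Hypothetical score: mean pairwise cosine similarity of the triple."""
    parts = [phrase_vector(p.split(), vectors) for p in (arg1, rel, arg2)]
    if any(p is None for p in parts):
        return 0.0  # no vector available for at least one element
    pairs = [(0, 1), (1, 2), (0, 2)]
    total = sum(cosine_similarity(parts[i], parts[j]) for i, j in pairs)
    return total / len(pairs)


# Made-up 3-dimensional vectors, for illustration only.
TOY_VECTORS = {
    "edison": [0.9, 0.1, 0.2],
    "invented": [0.8, 0.2, 0.3],
    "phonograph": [0.85, 0.15, 0.25],
    "banana": [0.1, 0.9, 0.1],
}

good = extraction_confidence("edison", "invented", "phonograph", TOY_VECTORS)
bad = extraction_confidence("edison", "invented", "banana", TOY_VECTORS)
print(good > bad)
```

With these toy vectors, the triple whose elements lie close together in the vector space receives the higher score, which is the intuition behind using a distance as an unsupervised confidence measure.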



Cite this article

[IEEE Style]
Liu Peiqian and Wang Xiaojie, "A Distance Approach for Open Information Extraction Based on Word Vector," KSII Transactions on Internet and Information Systems, vol. 12, no. 6, pp. 2470-2491, 2018. DOI: 10.3837/tiis.2018.06.003

[ACM Style]
Peiqian, L. and Xiaojie, W. 2018. A Distance Approach for Open Information Extraction Based on Word Vector. KSII Transactions on Internet and Information Systems, 12, 6, (2018), 2470-2491. DOI: 10.3837/tiis.2018.06.003