Vol. 18, No. 10, October 31, 2024
DOI: 10.3837/tiis.2024.10.005
Abstract
Named Entity Recognition (NER) is a key element of many Natural Language Processing (NLP) applications. It identifies spans of text and assigns them to predefined categories, such as a location or a person's name. Arabic NER (ANER) supports numerous other Arabic NLP (ANLP) tasks, such as Machine Translation (MT), Question Answering (QA), and Information Extraction (IE). ANER systems generally fall into three major groups: rule-based, Machine Learning (ML), and hybrid. This study examines ML-based ANER developments, particularly in the context of Classical Arabic, which presents unique challenges due to its complex morphological structure and limited linguistic resources. We propose a supervised approach that integrates word-level, morphological, and knowledge-based features to improve NER performance for Classical Arabic. Our method was evaluated on the CANERCorpus, a specialized dataset containing annotated texts from Classical Arabic literature. The Naive Bayes (NB) approach achieved an F-measure of 80%, with precision and recall of 86% and 75%, respectively. These results indicate a significant improvement over traditional methods, particularly in dealing with the intricate structure of Classical Arabic. The study highlights the potential of ML in overcoming the challenges of ANER and provides directions for further research in this domain.
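As a quick consistency check on the reported scores, the F-measure can be recomputed from the stated precision and recall. The sketch below assumes the abstract's F-measure is the balanced F1 score (the harmonic mean of precision and recall); the abstract does not say which variant was used, and the function name `f_measure` is ours, not the authors'.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F1 score: harmonic mean of precision and recall.

    Assumption: the paper's "F-measure" is F1 (beta = 1).
    """
    if precision + recall == 0:
        # Avoid division by zero when both scores are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract for Naive Bayes.
reported_precision = 0.86
reported_recall = 0.75

f1 = f_measure(reported_precision, reported_recall)
print(f"F1 = {f1:.4f}")
```

Under that assumption the result rounds to 80%, which agrees with the figure given in the abstract.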
Cite this article
[IEEE Style]
R. Salah, M. Mukred, L. Q. b. Zakaria, F. A. M. Al-Yarimi, "A Machine Learning Approach for Named Entity Recognition in Classical Arabic Natural Language Processing," KSII Transactions on Internet and Information Systems, vol. 18, no. 10, pp. 2895-2919, 2024. DOI: 10.3837/tiis.2024.10.005.
[ACM Style]
Ramzi Salah, Muaadh Mukred, Lailatul Qadri binti Zakaria, and Fuad A. M. Al-Yarimi. 2024. A Machine Learning Approach for Named Entity Recognition in Classical Arabic Natural Language Processing. KSII Transactions on Internet and Information Systems, 18, 10, (2024), 2895-2919. DOI: 10.3837/tiis.2024.10.005.
[BibTeX Style]
@article{tiis:101406, title="A Machine Learning Approach for Named Entity Recognition in Classical Arabic Natural Language Processing", author="Ramzi Salah and Muaadh Mukred and Lailatul Qadri binti Zakaria and Fuad A. M. Al-Yarimi", journal="KSII Transactions on Internet and Information Systems", DOI={10.3837/tiis.2024.10.005}, volume={18}, number={10}, year="2024", month={October}, pages={2895-2919}}