• KSII Transactions on Internet and Information Systems
    Monthly Online Journal (eISSN: 1976-7277)

An Arabic Script Recognition System

Vol. 9, No. 9, September 29, 2015
DOI: 10.3837/tiis.2015.09.023

Abstract

A system for recognizing machine-printed Arabic script is proposed. The Arabic script is shared by three languages: Arabic, Urdu, and Farsi. The three languages have a decent amount of vocabulary in common, which compounds the problem of identification. Therefore, in an ideal scenario, not only must the script be differentiated from other scripts, but the language of the script must also be recognized. The recognition process involves segregating Arabic-script documents from Latin, Han, and other script documents using horizontal and vertical projection profiles, and then identifying the language. Identification mainly involves extracting connected components, which are subjected to a Principal Component Analysis (PCA) transformation to extract uncorrelated features. The traditional K-Nearest Neighbours (KNN) algorithm is then used for recognition. Experiments were carried out by varying the number of principal components and the number of connected components extracted per document to find the combination that gives the best accuracy. An accuracy of 100% is achieved with at least 18 connected components and 15 principal components. The proposed system would play a vital role in the automatic archiving of multilingual documents and in the selection of the appropriate Arabic script in multilingual Optical Character Recognition (OCR) systems.
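The projection profiles mentioned in the abstract can be illustrated with a minimal sketch. It assumes a binarized page stored as a list of rows in which 1 marks an ink pixel; the function name `projection_profiles` and the toy image are hypothetical. The abstract does not say how the profiles are used to separate scripts, so only the profiles themselves are computed here.

```python
from typing import List, Tuple


def projection_profiles(image: List[List[int]]) -> Tuple[List[int], List[int]]:
    """Return (horizontal, vertical) projection profiles of a binary image.

    `image` is a list of equal-length rows; 1 marks ink, 0 marks background.
    horizontal[i] counts the ink pixels in row i.
    vertical[j] counts the ink pixels in column j.
    """
    if not image:
        return [], []
    horizontal = [sum(row) for row in image]
    vertical = [sum(column) for column in zip(*image)]
    return horizontal, vertical


# Hypothetical 4x5 toy image: one text-like stroke with a small descender.
toy = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

horizontal, vertical = projection_profiles(toy)
print(horizontal)  # ink count per row
print(vertical)    # ink count per column
```

In a real page image, peaks and valleys in the horizontal profile typically correspond to text lines and the gaps between them, which is why such profiles are a common first step in script and layout analysis.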



Cite this article

[IEEE Style]
Y. M. Alginahi, M. Mudassar, M. N. Kabir, "An Arabic Script Recognition System," KSII Transactions on Internet and Information Systems, vol. 9, no. 9, pp. 3701-3720, 2015. DOI: 10.3837/tiis.2015.09.023.

[ACM Style]
Yasser M. Alginahi, Mohammed Mudassar, and Muhammad Nomani Kabir. 2015. An Arabic Script Recognition System. KSII Transactions on Internet and Information Systems, 9, 9, (2015), 3701-3720. DOI: 10.3837/tiis.2015.09.023.

[BibTeX Style]
@article{tiis:20900, title="An Arabic Script Recognition System", author="Yasser M. Alginahi and Mohammed Mudassar and Muhammad Nomani Kabir", journal="KSII Transactions on Internet and Information Systems", DOI={10.3837/tiis.2015.09.023}, volume={9}, number={9}, year="2015", month={September}, pages={3701-3720}}