An-Najah University Journal for Research - B (Humanities)

Abstract

This paper discusses part-of-speech (PoS) tagging for Arabic prepositions. Arabic has a number of predefined sets of particles, such as the particles of Nasb, the particles of Jazm, and the particles of Jarr (also called prepositions). Each set plays a particular role in the context in which it appears. In general, PoS tagging is the process of assigning a tag (e.g. noun, verb, particle) to each word based on its context. PoS tagging is a useful component of many natural language processing (NLP) toolkits. For instance, it is used in syntactic parsing to validate the grammar of a sentence, and it supports the textual analysis that search engines use to recover intended meaning. Many other language processing applications rely on PoS tagging, including machine translation, speech synthesis, speech recognition, and diacritization. The quality of many NLP applications therefore depends on the accuracy of the tagging system they use. This study examines the Stanford tagger to explore the tag set it applies to the text under examination and its performance in tagging Arabic prepositions. The study also discusses weaknesses of the Stanford tagger. First, it does not handle the merging case, in which a preposition joins an adjacent word to form a single word. Second, it assigns the same tag to particles that differ in linguistic function, such as the particles of Jarr and Jazm. In our inductive study of prepositions in terms of linguistic functions such as Jazm and Istifham (interrogation), we did not observe differences in the tagging of prepositions such as “to” (إلى) and “in” (في). Other prepositions are difficult to distinguish unless they are contextualized; these include “until” (حتى) and “except” (عدا). These findings indicate that the tagging system is inaccurate in these cases and that tagging systems need continued refinement, which is the significance of our research.
In this work, we used the Holy Quran as the text for evaluating the performance of the Stanford tagger in tagging prepositions. This work encourages further research on tagging other Arabic prepositions to explore how well the tag set employed in the Stanford tagger matches the prepositions used in the Arabic language in general.
