Lit Lang Library
Event Detail Information
Linguistics Seminar Series -- Rania Al-Sabbagh, Ph.D. candidate in Linguistics
In this work, I present Mumkin 1.0, the first version of my large-scale modality tagger for Standard and Egyptian Arabic. (Mumkin is the Arabic word for 'possible'.) Modality is the grammaticalized expression of the speaker's subjective attitudes and opinions, including possibility, probability, necessity, obligation, permissibility, ability, desire, and contingency. Extracting modality from user-generated Web content on social networks therefore makes it possible to identify the social values, beliefs and opinions of a given community. However, the automatic extraction of modality is challenging for a number of reasons. First, there is much theoretical controversy about the definition of modality, its semantic types and its syntactic characteristics. Second, Arabic modality in particular is not expressed by a closed set of auxiliaries but by a wide range of syntactic structures, including auxiliaries, lexical verbs, nominals (both adjectives and nouns) and prepositional phrases. Third, given the morphologically rich nature of Arabic, modals, except for auxiliaries, are inflected for gender, number, person, tense, aspect and mood. That is, the Arabic modality system is quite complex. Finally, some Arabic modals are semantically and/or syntactically ambiguous. Mumkin 1.0 efficiently operationalizes consensus modality definitions and characteristics to enable the automatic extraction of Arabic modalities from user-generated content on social networks. It tackles six modality types, namely epistemic, evidential, deontic, dynamic, boulomaic and volition modality. It uses machine learning techniques and linguistically defined features to resolve semantic and syntactic ambiguities simultaneously in both Standard and Egyptian Arabic tweets. As the first modality tagger in the Arabic NLP repository, Mumkin 1.0 sets the state-of-the-art results for automatic Arabic modality tagging.