In this project, we aim to develop the first system that automatically scores Arabic essays written by high school and first-year university students, assessing their Arabic writing proficiency along four dimensions: style, relevance, organization, and development. The system is planned for integration at two end-user sites: the Qatar University Testing Center (QUTC) and the Pre-University Education schools at Qatar Foundation (QF-PUE).
With advances in artificial intelligence, machine learning, and several related fields, the problem of Automated Essay Scoring (AES) has been studied for years, and more intensively in recent years. However, most of this work has focused on English (and a few other languages), while very little attention has been directed towards Arabic. In this project, we propose to design, implement, and deploy an Arabic AES system that takes a student’s essay written in Arabic for a given prompt (describing the topic and requirements of the essay), analyzes the essay along the four aforementioned dimensions, and predicts a score per dimension that assesses the writing quality. Performed manually, this scoring process is lengthy, costly, and tedious.
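As a rough illustration of the input and output described above, the following Python sketch shows one possible shape for the scoring interface. It is not the project's design: the names `Submission`, `DIMENSIONS`, and `score_submission` are hypothetical, and the constant-valued scorers are placeholders standing in for trained models.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# The four dimensions named in the proposal.
DIMENSIONS = ("style", "relevance", "organization", "development")


@dataclass(frozen=True)
class Submission:
    """Hypothetical container for one essay and the prompt it answers."""
    prompt: str  # topic and requirements of the essay
    essay: str   # the student's Arabic text


# A scorer maps a submission to a numeric score (assumed interface).
Scorer = Callable[[Submission], float]


def score_submission(submission: Submission,
                     scorers: Dict[str, Scorer]) -> Dict[str, float]:
    """Return one score per dimension, failing loudly if a scorer is missing."""
    missing = [d for d in DIMENSIONS if d not in scorers]
    if missing:
        raise ValueError(f"no scorer registered for: {missing}")
    return {d: scorers[d](submission) for d in DIMENSIONS}


# Placeholder scorers: each returns a constant, NOT a real prediction.
placeholder_scorers = {d: (lambda s: 0.0) for d in DIMENSIONS}
example = Submission(prompt="(prompt text)", essay="(essay text)")
example_scores = score_submission(example, placeholder_scorers)
```

In a real system each placeholder would be replaced by a trained model, and the score range would follow whatever rubric the end-users adopt.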
The project has five major objectives: (1) releasing the first publicly available dataset for Arabic AES, (2) developing effective automated holistic and dimension-specific scoring models, (3) developing scoring models that are cross-lingual (relying only on non-Arabic labeled data) and cross-prompt (relying only on labeled data from previously used prompts), (4) developing models that use multi-task learning to exploit the inter-relationships between the different dimensions of language proficiency, and (5) integrating and deploying our models in the environments of the two end-users. Overall, the project aims to build research capacity in the relevant research areas (natural language processing, information retrieval, and machine learning) and to establish Qatar as a hub for research on the particular problem of Arabic AES.
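To make the cross-prompt setting in objective (3) concrete, the sketch below shows a leave-one-prompt-out split, a common way to simulate scoring essays for a prompt that has no labeled data. It is only an illustration under stated assumptions: the function name `leave_one_prompt_out`, the record fields `prompt_id` and `score`, and the toy records are all hypothetical and do not come from the project's dataset.

```python
from collections import defaultdict


def leave_one_prompt_out(essays):
    """Yield (held_out_prompt, train, test) splits.

    For each prompt, the training set contains only essays written for
    the OTHER prompts, so the held-out prompt is unseen during training.
    """
    by_prompt = defaultdict(list)
    for essay in essays:
        by_prompt[essay["prompt_id"]].append(essay)

    for target in sorted(by_prompt):
        train = [e for pid, group in by_prompt.items()
                 if pid != target for e in group]
        test = list(by_prompt[target])
        yield target, train, test


# Toy records with made-up values, for illustration only.
toy_essays = [
    {"prompt_id": "p1", "score": 3},
    {"prompt_id": "p1", "score": 4},
    {"prompt_id": "p2", "score": 2},
    {"prompt_id": "p3", "score": 5},
    {"prompt_id": "p3", "score": 1},
]

splits = list(leave_one_prompt_out(toy_essays))
```

Each split trains on essays from previously used prompts only, which matches the constraint stated in the objective; the cross-lingual setting is analogous, with the training data drawn from non-Arabic essays instead.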
Our proposed solution decomposes the problem into four sub-problems: (1) acquiring the labeled Arabic dataset, (2) building supporting models, namely Arabic AES models that leverage existing non-Arabic labeled data and multilingual pre-trained language models, AES-specific pre-trained language models, and holistic models, (3) building the core prompt-specific and cross-prompt models, in addition to leveraging multi-task learning, and (4) developing production-quality trained models for deployment at the end-users.
We expect our project to make several technical and societal contributions. First, there is very little research on automated scoring for Arabic essays. Thus, our proposed system is the first of its kind in the region and, to the best of our knowledge, worldwide. More specifically, there is no publicly available Arabic dataset for AES, and no work has been conducted on dimension-specific scoring of Arabic essays of any length. Second, we aim to advance the state of the art with novel research on AES-specific pre-trained language models, cross-lingual learning, cross-prompt learning, and multi-task learning approaches; several conference and journal publications at top venues are planned. Third, in the context of standardized assessment, automated scoring offers greater operational efficiency than human scoring: it saves valuable teacher time, provides a faster turnaround of results, allows for fairer and less biased scoring, and opens the door to more testing windows for students. Fourth, our proposed project contributes to achieving Qatar's national priorities as stated in national strategic documents such as the Qatar National Vision (QNV) 2030, the Qatar National Research Strategy (QNRS, 2012), and the Qatar Second National Development Strategy (QNDS2, 2018-2022). Furthermore, the project supports the State of Qatar Law no. 7, issued in 2019, regarding the protection of the Arabic language. Fifth, our proposed work is planned to be deployed by two end-users in Qatar. We plan to deliver a system at technology readiness level (TRL) 6-7. Successful integration at these two sites is a significant step towards wider adoption, e.g., at other universities and other Ministry of Education (MoE) schools. Finally, two or three graduate students will be co-supervised within the project team, and one or two patents will potentially be filed with the USPTO and/or EPO to protect the developed technology.