In this work, we propose to address these two challenges of developing an empathy-based negotiating chatbot and a question answering (QA) system with a platform called MURASALAT, which means correspondences (مراسلات) in Arabic. The name is short for Multi-dialect Arabic Understanding Models for Empathy-based Negotiating Chatbot and Question Answering. The key benefit of the platform is that it can be configured for various applications and functionalities, including but not limited to: 1) a company can use it to detect users’ satisfaction with the services it provides and to conduct automated customer service, and 2) a user can pose a question related to a specific predefined topic and receive the most relevant answer. We aim to build a single engine that serves these functionalities and responds to input quickly, accurately, and robustly, where the input can be either Modern Standard Arabic (MSA) text or dialectal Arabic (DA), with a focus on the Gulf, Egyptian, and Levantine dialects.
There have been several efforts to develop chatbots and QA systems, but these systems remain limited in their ability to hold human-level conversations. For example, Google recently announced Meena, a chatbot model with 2.6 billion parameters that is intended to cover any topic of conversation. However, even major advances such as Meena still fall short of human-level interaction. Another major limitation of advanced chatbots is that they lack the skills to respond emotionally. In particular, they are limited in their ability to show empathy or to tailor their responses so that user interactions are handled with empathy. As for QA systems, the current state of Arabic QA lags behind, relying on small datasets and non-learning-based approaches.
These issues are magnified in Arabic: Arabic chatbots and QA systems still lag significantly behind their English counterparts, and they share the gaps that English bots have not yet addressed. While many resources have already been developed for Arabic, the scale of resources needed to achieve performance similar to that of Google’s Meena remains a major challenge. Significant computational resources will also be needed. While such resources may be available sporadically, it will be a challenge to dedicate them to our experimentation. This may lead us to develop alternative models that achieve comparable performance with far fewer parameters, similar to what SqueezeNet (about 1.2 million parameters for image classification) achieved compared to AlexNet (about 60 million parameters). Furthermore, Arabic has its own set of challenges that chatbots and QA systems need to handle with care.
Empathy-based negotiating chatbots require, as a building block, efficient and accurate models for extracting emotions from written text. Recognizing and responding to emotions in text therefore continues to be an important area of research, as many individual and societal decisions are driven by feelings. While there have been significant advances in the development of automated models for emotion recognition (ER) from natural language, several challenges remain in the field. These include the ability of the models to: 1) perform different natural language understanding (NLU) tasks such as ER and QA, 2) account for the variety of dialects (e.g., Gulf, Egyptian, and Levantine) in addition to MSA, as all may be used to express feelings, especially with the expected spread of conversational systems, 3) handle the shortage of language resources for Arabic, and 4) integrate with online systems that can accurately extract emotions and answer questions on any topic. While these challenges exist for any language, including English, Arabic poses additional challenges resulting from the complexity of the language, the limited availability of resources for natural language processing (NLP), and the need for systems that can handle both MSA and DA. Our previous efforts in developing NLP models for various applications have produced state-of-the-art models for Arabic; however, these models also fall short of addressing the open challenges mentioned above. Existing models do not capture emotions, and Arabic QA lags behind English QA due, in part, to the difficulty of automating the language analysis.
To address these challenges, we will develop multi-dialect Arabic NLP resources and models that can simultaneously handle various NLU tasks. In particular, we propose the following contributions: 1) development of novel models for Arabic NLU, based on transfer learning (TL) and multi-task learning (MTL) techniques, that can simultaneously perform the NLU tasks of ER and QA, leading to an empathy-based negotiating chatbot with question answering capabilities; such models will be designed to learn patterns that are common across tasks while capturing the unique characteristics of individual tasks, 2) extension of these MTL models to handle different dialects, 3) development of new language resources for Arabic General Language Understanding Evaluation, and 4) implementation of an online system, together with lightweight variants of the models that are scalable and can handle real-time processing tasks.
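As an illustration of the MTL idea, the sketch below shows one common way to share parameters across ER and QA: a single shared encoder feeds two task-specific heads, and the two task losses are combined into one joint objective. It is a toy, forward-pass-only example and not the proposed MURASALAT architecture. The hashed bag-of-words encoder, the layer sizes, the emotion label set, the example sentences, and all names (`MultiTaskModel`, `featurize`, `er_weight`) are assumptions made for illustration.

```python
import math
import random

random.seed(0)

DIM_IN, DIM_SHARED = 32, 8               # assumed sizes, chosen only for the demo
EMOTIONS = ["joy", "anger", "sadness"]   # hypothetical label set


def rand_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def featurize(text):
    """Hashed bag-of-words vector; a stand-in for a real Arabic text encoder."""
    vec = [0.0] * DIM_IN
    for token in text.split():
        vec[sum(token.encode("utf-8")) % DIM_IN] += 1.0
    return vec


def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class MultiTaskModel:
    def __init__(self):
        self.shared = rand_matrix(DIM_SHARED, DIM_IN)           # shared by both tasks
        self.er_head = rand_matrix(len(EMOTIONS), DIM_SHARED)   # ER-specific
        self.qa_head = rand_matrix(DIM_SHARED, DIM_SHARED)      # QA-specific

    def encode(self, text):
        return [math.tanh(h) for h in matvec(self.shared, featurize(text))]

    def emotion_probs(self, text):
        return softmax(matvec(self.er_head, self.encode(text)))

    def answer_probs(self, question, candidates):
        # Score each candidate answer against the projected question vector.
        q = matvec(self.qa_head, self.encode(question))
        scores = [sum(a * b for a, b in zip(q, self.encode(c))) for c in candidates]
        return softmax(scores)

    def joint_loss(self, er_example, qa_example, er_weight=0.5):
        # Weighted sum of the two cross-entropy losses (weights are an assumption).
        text, emotion = er_example
        question, candidates, gold_index = qa_example
        er_loss = -math.log(self.emotion_probs(text)[EMOTIONS.index(emotion)])
        qa_loss = -math.log(self.answer_probs(question, candidates)[gold_index])
        return er_weight * er_loss + (1.0 - er_weight) * qa_loss


model = MultiTaskModel()
loss = model.joint_loss(
    ("الخدمة ممتازة", "joy"),
    ("متى يفتح الفرع", ["يفتح الساعة التاسعة", "شكرا لك"], 0),
)
```

In a trained system, gradients of the joint loss would update the shared encoder from both tasks while each head is updated only by its own task, which is how patterns common to ER and QA could be learned alongside task-specific ones. Training, dialect handling, and a pretrained encoder are deliberately left out of this sketch.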