Header Information

NPRP11S-1204-170060
NPRP11S
Qatar University
Award Tech. Completed
01 Jun 2019
Dr. Tamer Elsayed
3 Year(s)
01 Dec 2022
New
Early Detection of Fake News over Arabic Social Media

Project Summary
In this project, we aim to develop the first system that can detect fake news on Arabic social media, analyze its spread among online users, and identify the key influencers behind it. The system is planned to be integrated with the AlJazeera network as an industrial partner and end user. Fake news is everywhere nowadays. For a long time, such news propagated through newspapers and traditional mainstream media; however, the recent emergence and popularity of social media platforms (e.g., Twitter) has made it very easy to spread a rumor across continents in just a few hours or even seconds. That usually happens even before the news is picked up by the mainstream media. The spread of false information can have a strong negative influence, not only on the individuals involved, but also on large communities and even countries. As a clear example, fake news has played a major role in the current GCC crisis since its beginning, and it continues to be a big factor. With the advances in artificial intelligence, machine learning, and several other related fields, the problem of fake news detection has been studied in the past few years, and more extensively in the past couple of years. However, most of this work has focused on English (and a few other languages), while no attention has been directed towards Arabic. In this project, we propose to design, implement, and deploy an end-to-end system that monitors Arabic social media (in particular Twitter) to detect fake news early, analyze its propagation, and provide supporting and/or refuting evidence that can be understood and verified by the end user. The entire process would be very lengthy, costly, and tedious if performed manually.
The project has five major scientific objectives: (1) early detection of fake news/claims that are propagated through social media but have not yet made it to the news, (2) effective verification of such claims by estimating their veracity, (3) analyzing the propagation pattern of the detected claims and identifying the key influencers behind them, (4) building an end-to-end system serving an end user in the media field with a dashboard summarizing detected fake news over time, and (5) achieving all of that over Arabic content, with all of the challenges of the Arabic language and its dialects that are widely used on social media. Our proposed solution decomposes the problem into five sub-problems: (1) topic tracking, to identify posts relevant to given topics of interest, (2) detection of check-worthy claims, to identify them among the emerging claims within the relevant posts, (3) credibility estimation of users and posts related to the claim, (4) veracity classification of the claim, and (5) spread analysis, to discover the propagation dynamics of the claims. We adopt an approach that integrates natural language processing, information retrieval, social analytics, image processing, and machine learning techniques, leveraging signals from the textual and image content of social media, social networks, the history of social posts, the history of news articles, the Web, and user feedback. We expect our project to make several contributions. First, we propose the first fake news detection system over Arabic content. Second, our proposed system provides both a confidence score and a justification for supporting/refuting the detected claims, which allows the user to understand the system's decision and verify it. Third, we tackle the problem in a personalized mode, where the system adapts its algorithms based on the user's (continuous) implicit/explicit feedback. Fourth, our system integrates both textual and visual features to detect and verify the claims.
Fifth, we provide several labelled datasets as evaluation testbeds for further research in the area, especially on Arabic. Finally, we plan to conduct a case study on the most widely propagated fake news during the Gulf crisis. By the end of the project, we envision several major outcomes. Our research advances the state of the art in fake news detection in general and over Arabic in particular; several conference and journal publications at top venues are planned. We also plan to release annotated data on Arabic fake news (the first of its kind). A real-time end-to-end system will be developed and integrated with end users (e.g., journalists); we plan to deliver a system at technology readiness level (TRL) 6-7. Moreover, our system will provide a handy news verification tool that can be used by ordinary users too. Finally, two graduate students will be co-supervised within the project team, and 1-2 patents will potentially be filed with the USPTO and/or EPO to protect the developed technology. Our project is expected to have a clear impact on society and the journalism profession, in addition to the research community. Our system has the potential to change the way journalists work today by adding another source of evidence. We also aim to allow ordinary users to verify news and to significantly decrease the propagation of fake news. Identifying the key influencers can also inspire proactive actions against possible future incidents. Research on fake news detection is still in its infancy; therefore, we plan to organize a workshop or a shared task at either SemEval (a top NLP evaluation forum) or ICWSM (a top social computing conference), and to make our annotated Arabic data available on Kaggle to further promote research in this emerging area. The outcomes of a real-time fake news detection system would clearly benefit news agencies around the world.
We envision our system being used by journalists as a source of evidence when newly emerging claims appear on social media and have not yet been verified or even picked up by mainstream media. A direct beneficiary of our proposed system is the AlJazeera network in Qatar, a leading news outlet with recognized worldwide influence. In this project, we will work very closely with AlJazeera as an end user.
Artificial intelligence; Machine Learning and Data Mining; Information retrieval; Arabic; Social media
Applied research
1. Natural Sciences
1.2 Computer and Information Sciences
Computer Sciences
Yes
No
5. Social Sciences
5.8 Media and Communications
Information Science
No
Yes

Institution
Qatar University
Qatar
Submitting Institution
University of Edinburgh
United Kingdom
Collaborative Institution

Personnel
Lead PI
Dr. Tamer Elsayed
Qatar University
PI
Dr. Walid Magdy
University of Edinburgh
PI
Dr. Abdulaziz Alali
Qatar University
Consultant
Mrs. Maryam Al-Khater
Australian National University

Outputs/Outcomes
Conference Paper
CheckThat! at CLEF 2020: Enabling the Automatic Identification and Verification of Claims in Social Media
Alberto Barrón-Cedeño, Tamer Elsayed, Preslav Nakov, Giovanni Da San Martino, Maram Hasanain, Reem Suwaileh, and Fatima Haouari.
DOI:10.1007/978-3-030-45442-5_65
Conference Paper
bigIR at TREC 2019: Graph-based Analysis for News Background Linking
Marwa Essam and Tamer Elsayed
DOI:10.2019/trec.QU.N
Online Paper
ArCOV-19: The First Arabic COVID-19 Twitter Dataset with Propagation Networks
Fatima Haouari, Maram Hasanain, Reem Suwaileh, Tamer Elsayed
DOI:10.2020/arXiv.2004.05861
Conference Paper
The CLEF-2021 CheckThat! Lab on Detecting Check-Worthy Claims, Previously Fact-Checked Claims, and Fake News
Preslav Nakov, Giovanni Da San Martino, Tamer Elsayed, Alberto Barrón-Cedeño, Rubén Míguez, Shaden Shaar, Firoj Alam, Fatima Haouari, Maram Hasanain, Nikolay Babulkov, Alex Nikolov, Gautam Kishore Shahi, Julia Maria Struß, and Thomas Mandl
DOI:10.1007/978-3-030-72240-1-75
Conference Paper
ArCOV19-Rumors: Arabic COVID-19 Twitter Dataset for Misinformation Detection
Fatima Haouari, Maram Hasanain, Reem Suwaileh, Tamer Elsayed
DOI:10.2021/wanlp.72
Conference Paper
AraFacts: The First Large Arabic Dataset of Naturally-Occurring Professionally-Verified Claims
Zien Sheikh Ali, Watheq Mansour, Tamer Elsayed, and Abdulaziz Al-Ali
DOI:10.2021/wanlp.231
Conference Paper
ArCOV-19: The First Arabic COVID-19 Twitter Dataset with Propagation Networks
Fatima Haouari, Maram Hasanain, Reem Suwaileh, Tamer Elsayed
DOI:10.2021/wanlp.82
Conference Paper
bigIR at TREC 2020: Simple but Deep Retrieval of Passages and Documents
Fatima Haouari, Marwa Essam, Tamer Elsayed
DOI:10.2020/trec.QU.DL
Conference Paper
Why is that a Background Article? A Qualitative Analysis of Relevance for News Background Linking
Marwa Essam and Tamer Elsayed
DOI:10.1145/3340531.3412120
Conference Paper
Overview of CheckThat! 2020: Automatic Identification and Verification of Claims in Social Media
Alberto Barrón-Cedeño, Tamer Elsayed, Preslav Nakov, Giovanni Da San Martino, Maram Hasanain, Reem Suwaileh, Fatima Haouari, Nikolay Babulkov, Bayan Hamdan, Alex Nikolov, Shaden Shaar, and Zien Sheikh Ali
DOI:10.1007/978-3-030-58219-7-17
Conference Paper
bigIR at CheckThat! 2020: Multilingual BERT for Ranking Arabic Tweets by Check-worthiness
Maram Hasanain and Tamer Elsayed
DOI:10.2020/checkthat.clef20.qu
Online Paper
Overview of CheckThat! 2020 Arabic: Automatic identification and verification of claims in social media
Maram Hasanain, Fatima Haouari, Reem Suwaileh, Z Ali, Bayan Hamdan, Tamer Elsayed, Alberto Barrón-Cedeño, Giovanni Da San Martino, Preslav Nakov
DOI:10.2020/checkthat20.arabic.overview
Online Paper
Overview of CheckThat! 2020 English: Automatic identification and verification of claims in social media
Shaden Shaar, Alex Nikolov, Nikolay Babulkov, Firoj Alam, Alberto Barrón-Cedeño, Tamer Elsayed, Maram Hasanain, Reem Suwaileh, Fatima Haouari, Giovanni Da San Martino, Preslav Nakov
DOI:10.2020/checkthat20.english.overview
Conference Paper
Overview of the CLEF–2021 CheckThat! Lab on Detecting Check-Worthy Claims, Previously Fact-Checked Claims, and Fake News.
Preslav Nakov, Giovanni Da San Martino, Tamer Elsayed, Alberto Barrón-Cedeño, Rubén Míguez, Shaden Shaar, Firoj Alam, Fatima Haouari, Maram Hasanain, Watheq Mansour, Bayan Hamdan, Zien Sheikh Ali, Nikolay Babulkov, Alex Nikolov, Gautam Kishore Shahi, Julia Maria Struß, Thomas Mandl, Mucahid Kutlu, Yavuz Selim Kartal.
DOI:10.1007/978-3-030-85251-1-19
Online Paper
Overview of the CLEF-2021 CheckThat! Lab Task 1 on Check-Worthiness Estimation in Tweets and Political Debates
Shaden Shaar, Maram Hasanain, Bayan Hamdan, Zien Sheikh Ali, Fatima Haouari, Alex Nikolov, Mucahid Kutlu, Yavuz Selim Kartal, Firoj Alam, Giovanni Da San Martino, Alberto Barrón-Cedeño, Rubén Míguez, Javier Beltrán, Tamer Elsayed and Preslav Nakov.
DOI:10.2021/checkthat21.task1.overview
Online Paper
Overview of the CLEF-2021 CheckThat! Lab Task 2 on Detecting Previously Fact-Checked Claims in Tweets and Political Debates
Shaden Shaar, Fatima Haouari, Watheq Mansour, Maram Hasanain, Nikolay Babulkov, Firoj Alam, Giovanni Da San Martino, Tamer Elsayed, Preslav Nakov.
DOI:10.2021/checkthat21.task2.overview
Conference Paper
bigIR at TREC 2021: Adopting Transfer Learning for News Background Linking
Marwa Essam and Tamer Elsayed
DOI:10.2022/trec.qu.DL
Online Paper
Cross-lingual Transfer Learning for Check-worthy Claim Identification over Twitter
Maram Hasanain and Tamer Elsayed
DOI:10.2022/arXiv.2211.05087
Conference Paper
Automated Fact-Checking for Assisting Human Fact-Checkers
Preslav Nakov, David Corney, Maram Hasanain, Firoj Alam, Tamer Elsayed, Alberto Barrón-Cedeño, Paolo Papotti, Shaden Shaar, Giovanni Da San Martino.
DOI:10.24963/ijcai.2021/619
Conference Paper
Did I see it before? Detecting previously-checked claims over Twitter
Watheq Mansour, Tamer Elsayed, Abdulaziz Al-Ali
DOI:10.1007/978-3-030-99736-6-25
Conference Paper
Detecting Users Prone to Spread Fake News on Arabic Twitter
Zien Sheikh Ali, Abdulaziz Al-Ali, Tamer Elsayed
DOI:10.2022/osact.12
Journal Paper
Studying effectiveness of Web search for fact checking
Maram Hasanain and Tamer Elsayed
ISSN:00224577