Researchers from Qatar University have created a dataset of tweets about COVID-19 posted by Arab users between January and April 2020. Analysis of the dataset reveals that by far the largest share of tweets originated in Saudi Arabia, and that mentions of Allah increased with the onset of the disease, presumably as users debated whether the disease is a punishment by God. The dataset can further be used to track down misinformation on social networks.
We all know and love (or hate) Twitter, with its hundreds of millions of tweets every day. For more than a decade now, Twitter has been used as a way to spread and consume news. It has been shown that Twitter can reflect occurrences in the physical world, such as the spread of diseases like influenza, Zika, and Ebola. What’s more, conversations on Twitter can have a very real effect on events in the real world, with the 2016 presidential election in the US providing a sharp reminder of that.
By analyzing tweets on Twitter, researchers gain the opportunity to study the structure and dynamics of events. They can even obtain a detailed understanding of the decision-making processes of the public and of public representatives.
Ever since COVID-19 began spreading around the world, Arab tweeters have been having heated discussions about the new virus. The discussions reached a feverish peak when the virus reached the United Arab Emirates, with many Arab tweeters asking for more transparency about the disease and the way governments in the Arab world have been addressing it. Other Arab tweeters focused more on the impact of the disease on their day-to-day lives, habits, and work.
Researchers from Qatar University have recently created a dataset of Arabic tweets about COVID-19, called ArCOV-19 (Arabic COVID-19). The dataset, which was brought to the scientific community's attention in a paper on arXiv, includes only tweets in Arabic that mention COVID-19 and were posted between January and April 2020. Overall, more than one million tweets were captured, together with their propagation networks: data about the way those tweets spread on Twitter. This is the first publicly available dataset of its kind.
Analyzing the dataset, the researchers made several observations. For one, they found that the largest share of tweets about COVID-19 (38.7%) originated in Saudi Arabia. That makes sense, as Saudi Arabia has the largest number of active Twitter users in the Arab world. Kuwait, in second place, accounts for only 11.1% of all tweets.
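Country shares like the ones above come down to counting tweets per country of origin and dividing by the total. The following is a minimal sketch of that arithmetic, not the researchers' actual code: the function name `country_shares`, the use of country codes, and the sample list are all assumptions made for illustration, and the sample is made up rather than taken from ArCOV-19.

```python
from collections import Counter


def country_shares(countries):
    """Return (country, percentage share) pairs, largest share first.

    `countries` holds one entry per tweet; entries that are None or empty
    (tweets with no known location) are left out of the total.
    """
    counts = Counter(c for c in countries if c)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(country, 100.0 * n / total) for country, n in counts.most_common()]


# Made-up toy sample for illustration only; NOT data from ArCOV-19.
sample = ["SA", "SA", "KW", None, "SA", "QA", "KW", "SA"]

for country, share in country_shares(sample):
    print(f"{country}: {share:.1f}%")
```

Note that tweets without a known location are dropped before the percentages are computed here; whether the 38.7% and 11.1% figures were calculated that way is not stated in the article.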
The researchers also identified two categories among the most frequently used words. One category included words that were directly related to COVID-19, like “health”. The other category was not directly related to the disease, but instead contained references to prayers and mentions of Allah or God. Mentions of the word Allah, in particular, revealed a unique pattern: they appeared very frequently when news about the virus first emerged, declined over time, and then became frequent again when the virus reached the Arab world. The researchers believe that early on, the discussions focused mainly on the religious aspect of the pandemic, with many users claiming that it is a punishment by God.
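A pattern like this can be observed by tracking, day by day, what fraction of tweets mention a given word. The sketch below shows one simple way to do that, assuming each tweet is available as a (date, text) pair; the function name `daily_term_rate`, the plain substring match, and the sample tweets are illustrative assumptions, not the method or data used in the paper.

```python
from collections import defaultdict
from datetime import date


def daily_term_rate(tweets, term):
    """Map each day to the fraction of that day's tweets containing `term`.

    `tweets` is an iterable of (day, text) pairs. Matching is a plain
    substring test, so longer words that contain `term` also count.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for day, text in tweets:
        totals[day] += 1
        if term in text:
            hits[day] += 1
    return {day: hits[day] / totals[day] for day in sorted(totals)}


# Made-up toy tweets for illustration only; NOT data from ArCOV-19.
sample = [
    (date(2020, 1, 27), "الله يحفظنا"),
    (date(2020, 1, 27), "اخبار عن الفيروس"),
    (date(2020, 2, 10), "وزارة الصحة تعلن"),
    (date(2020, 3, 5), "الله يحمينا"),
    (date(2020, 3, 5), "الله المستعان"),
]

for day, rate in daily_term_rate(sample, "الله").items():
    print(day.isoformat(), f"{rate:.2f}")
```

Using a fraction rather than a raw count keeps days with many tweets from dominating the picture. A real analysis of Arabic text would likely need normalization and tokenization before matching, which this sketch leaves out.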
ArCOV-19 is currently open for researchers to use. The researchers hope that it can be used for tracking information about the virus, for detecting misinformation, and for emergency management when needed. For example, data from ArCOV-19 can help government agencies detect new spread events. The data can also help decision makers understand the public's opinions and mindset about changes to their lifestyle and new habits like social distancing. It may even be used to identify and analyze cases of offensive language and hate speech.
Finally, and perhaps most importantly, ArCOV-19 can be used to detect misinformation: the spread of unsubstantiated rumors and false claims. Misinformation can spread rapidly through social networks, hindering governments’ efforts to mount an effective response to the pandemic. In extreme cases, misinformation can even cause panic and hoarding of supplies.
ArCOV-19 is therefore expected to be a highly useful tool for the Arab world to analyze the response to COVID-19 and to keep track of misinformation. As more data accumulates in the dataset, it is expected to become even more useful and to aid in halting the spread of misinformation online, and of COVID-19 in the physical world.
Original content by Nawartna