A Python Library for Exploratory Data Analysis and Knowledge Discovery on Twitter Data

  • 2020-09-03 17:44:44
  • Mario Graff, Daniela Moctezuma, Sabino Miranda-Jiménez, Eric S. Tellez

Abstract

Twitter is perhaps the social media platform most amenable to research. It requires only a few steps to obtain information, and there are plenty of libraries that can help in this regard. Nonetheless, knowing whether a particular event is expressed on Twitter is a challenging task that requires a considerable collection of tweets. This proposal aims to facilitate the process of mining events on Twitter for researchers interested in Twitter data. The events could be related to natural disasters, health issues, and people's mobility, among other studies that can be pursued with the proposed library. Different applications are presented in this contribution to illustrate the library's capabilities, starting with an exploratory analysis of the topics discovered in tweets, followed by a study of the similarity among dialects of the Spanish language, and complemented by a mobility report on different countries. In summary, the Python library presented retrieves a plethora of information processed from Twitter (since December 2015) in terms of words, bigrams of words, and their frequencies by day for the Arabic, English, Spanish, and Russian languages. Finally, the mobility information considered is related to the number of trips among locations for more than 245 countries or territories.

 
