Deep Neural Networks for Bot Detection

Abstract

The problem of detecting bots, automated social media accounts governed by software but disguised as human users, has strong implications. For example, bots have been used to sway political elections by distorting online discourse, to manipulate the stock market, or to push anti-vaccine conspiracy theories that caused health epidemics. Most techniques proposed to date detect bots at the account level, by processing large amounts of social media posts and leveraging information from network structure, temporal dynamics, sentiment analysis, etc. In this paper, we propose a deep neural network based on a contextual long short-term memory (LSTM) architecture that exploits both content and metadata to detect bots at the tweet level: contextual features are extracted from user metadata and fed as auxiliary input to LSTM deep nets processing the tweet text. Another contribution is a technique based on synthetic minority oversampling that generates a large labeled dataset, suitable for training deep nets, from a minimal amount of labeled data (roughly 3,000 examples of sophisticated Twitter bots). We demonstrate that, from a single tweet, our architecture can achieve high classification accuracy (AUC > 96%) in separating bots from humans. We apply the same architecture to account-level bot detection, achieving nearly perfect classification accuracy (AUC > 99%). Our system outperforms the previous state of the art while leveraging a small and interpretable set of features and requiring minimal training data.
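
To illustrate the oversampling idea mentioned above, the following is a minimal sketch of plain synthetic minority oversampling (SMOTE): each synthetic point is an interpolation between a minority-class sample and one of its nearest minority-class neighbours. It is not the paper's implementation; the abstract does not specify which variant is used, and the function name `smote`, its parameters, and the toy feature vectors are assumptions made for illustration.

```python
import math
import random


def smote(samples, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (plain SMOTE sketch)."""
    if len(samples) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(samples) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(samples))
        base = samples[i]
        # Indices of the k nearest neighbours of `base`, excluding itself.
        neighbours = sorted(
            (j for j in range(len(samples)) if j != i),
            key=lambda j: math.dist(base, samples[j]),
        )[:k]
        other = samples[rng.choice(neighbours)]
        gap = rng.random()  # interpolation weight in [0, 1)
        # The new point lies on the segment between `base` and `other`.
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical toy feature vectors (not data from the paper).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority, n_synthetic=10, k=2)
```

Because every synthetic point lies on a segment between two real minority samples, the generated data stays inside the region the labeled examples already cover, which is why a few thousand labeled accounts can be expanded into a larger training set without inventing feature values outside the observed range.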
