A Novel Way of Identifying Cyber Predators

  • 2017-12-11 17:24:13
  • Dan Liu, Ching Yee Suen, Olga Ormandjieva

Abstract

Recurrent Neural Networks with Long Short-Term Memory cells (LSTM-RNN) have an impressive ability to process sequence data, particularly for language modeling and text classification. This research proposes combining sentiment analysis, a new approach to sentence vectors, and LSTM-RNN as a novel method for Sexual Predator Identification (SPI). An LSTM-RNN language model is applied to generate sentence vectors, which are the last hidden states of the language model. The sentence vectors are fed into another LSTM-RNN classifier to capture suspicious conversations. Because the vectors come from the hidden state, they can be generated for sentences never seen before. fastText is used to filter the contents of conversations and generate a sentiment score in order to identify potential predators. The experiment achieves a record-breaking accuracy and precision of 100% with a recall of 81.10%, exceeding the top-ranked result in the SPI competition.
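
To illustrate how the last hidden state of an LSTM can serve as a sentence vector, here is a minimal sketch in plain Python. It is not the authors' implementation: the toy vocabulary, the dimensions, the random (untrained) weights, the omission of bias terms, and the names `sentence_vector` and `lstm_step` are all assumptions made for illustration. In the paper's setting the weights would come from a trained language model.

```python
import math
import random

random.seed(0)  # fixed seed so the illustrative weights are reproducible

# Hypothetical toy vocabulary; unknown words map to "<unk>".
VOCAB = {"<unk>": 0, "hi": 1, "how": 2, "are": 3, "you": 4}
EMBED, HIDDEN = 4, 3  # assumed sizes, chosen only to keep the example small


def rand_matrix(rows, cols):
    """Random weights standing in for parameters a trained model would supply."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


embeddings = rand_matrix(len(VOCAB), EMBED)
# One weight matrix per gate, applied to the concatenation [x; h].
# Bias terms are omitted for brevity.
gates = {name: rand_matrix(HIDDEN, EMBED + HIDDEN) for name in ("i", "f", "o", "g")}


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def lstm_step(x, h, c):
    """One LSTM time step: returns the new hidden state and cell state."""
    xh = x + h  # list concatenation: [x; h]
    i = [sigmoid(z) for z in matvec(gates["i"], xh)]    # input gate
    f = [sigmoid(z) for z in matvec(gates["f"], xh)]    # forget gate
    o = [sigmoid(z) for z in matvec(gates["o"], xh)]    # output gate
    g = [math.tanh(z) for z in matvec(gates["g"], xh)]  # candidate cell values
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new


def sentence_vector(sentence):
    """Run the LSTM over the tokens and return the last hidden state."""
    h = [0.0] * HIDDEN
    c = [0.0] * HIDDEN
    for token in sentence.lower().split():
        x = embeddings[VOCAB.get(token, VOCAB["<unk>"])]
        h, c = lstm_step(x, h, c)
    return h


vec = sentence_vector("hi how are you")
unseen = sentence_vector("you are how")  # a word order not listed anywhere
print(len(vec), len(unseen))
```

Every sentence, whatever its length, maps to a vector of the same size (`HIDDEN`), which is what allows a sequence of such vectors to be passed to a second classifier, and why a vector can be computed for a sentence that was never seen during training.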

 
