Using Deep Neural Networks to Translate Multi-lingual Threat Intelligence

Abstract

The multilingual nature of the Internet complicates the cybersecurity community's ongoing efforts to strategically mine threat intelligence from OSINT data on the web. OSINT sources such as social media, blogs, and dark web vulnerability markets exist in diverse languages, which hinders security analysts who cannot draw conclusions from intelligence in languages they do not understand. Although third-party translation engines are growing stronger, they are unsuited for private security environments. First, sensitive intelligence is not a permitted input to third-party engines due to privacy and confidentiality policies. In addition, third-party engines produce generalized translations that tend to lack cybersecurity-specific terminology. In this paper, we address these issues and describe our system, which enables threat intelligence understanding across unfamiliar languages. We create a neural network based system that takes in cybersecurity data in a different language and outputs the corresponding English translation. The English translation can then be understood by an analyst, and can also serve as input to an AI-based cyber-defense system that can take mitigative action. As a proof of concept, we have created a pipeline that takes Russian threats and generates their corresponding English, RDF, and vectorized representations. Our network optimizes translations specifically for cybersecurity data.
