Leveraging Foreign Language Labeled Data for Aspect-Based Opinion Mining

  • 2020-03-15 15:53:53
  • Nguyen Thi Thanh Thuy, Ngo Xuan Bach, Tu Minh Phuong

Abstract

Aspect-based opinion mining is the task of identifying sentiment at the aspect level in opinionated text, which consists of two subtasks: aspect category extraction and sentiment polarity classification. While aspect category extraction aims to detect and categorize opinion targets such as product features, sentiment polarity classification assigns a sentiment label, i.e., positive, negative, or neutral, to each identified aspect. Supervised learning methods have been shown to deliver better accuracy for this task, but they require labeled data, which is costly to obtain, especially for resource-poor languages like Vietnamese. To address this problem, we present a supervised aspect-based opinion mining method that utilizes labeled data from a foreign language (English in this case), which is translated to Vietnamese by an automated translation tool (Google Translate). Because aspects and opinions in different languages may be expressed by different words, we propose using word embeddings, in addition to other features, to reduce the vocabulary difference between the original and translated texts, thus improving the effectiveness of the aspect category extraction and sentiment polarity classification processes. We also introduce an annotated corpus of aspect categories and sentiment polarities extracted from restaurant reviews in Vietnamese, and conduct a series of experiments on the corpus. Experimental results demonstrate the effectiveness of the proposed approach.
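To illustrate why word embeddings can reduce the vocabulary difference between translated and native text, here is a minimal sketch in Python. It is not the authors' feature set or model: the toy embedding values, the vocabulary, and the helper names (`featurize`, `mean_embedding`, `cosine`) are all assumptions made for this example. It concatenates a bag-of-words vector with an averaged word embedding and compares two short reviews that express the same opinion with different words.

```python
from math import sqrt

# Toy 3-dimensional embeddings. The values are invented for illustration;
# a real system would load vectors trained on a large Vietnamese corpus.
TOY_EMBEDDINGS = {
    "ngon": [0.9, 0.1, 0.0],    # "delicious"
    "tuyệt": [0.8, 0.2, 0.1],   # "great"
    "dở": [-0.9, 0.1, 0.0],     # "bad"
    "phở": [0.0, 0.9, 0.2],     # a noodle dish
}
VOCABULARY = ["phở", "ngon", "tuyệt", "dở"]

# Hypothetical inputs: one machine-translated review, one native review.
TRANSLATED_REVIEW = "phở ngon"
NATIVE_REVIEW = "phở tuyệt"


def tokenize(text):
    """Whitespace tokenizer; real Vietnamese text needs word segmentation."""
    return text.lower().split()


def bag_of_words(tokens, vocabulary):
    """Count how often each vocabulary word occurs in the tokens."""
    return [float(tokens.count(word)) for word in vocabulary]


def mean_embedding(tokens, embeddings, dim):
    """Average the embeddings of known tokens; zeros if none are known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def featurize(text, vocabulary=VOCABULARY, embeddings=TOY_EMBEDDINGS, dim=3):
    """Concatenate lexical (bag-of-words) and embedding features."""
    tokens = tokenize(text)
    return bag_of_words(tokens, vocabulary) + mean_embedding(tokens, embeddings, dim)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def lexical_similarity(text_a, text_b):
    """Similarity using bag-of-words features only."""
    return cosine(bag_of_words(tokenize(text_a), VOCABULARY),
                  bag_of_words(tokenize(text_b), VOCABULARY))


def combined_similarity(text_a, text_b):
    """Similarity using bag-of-words plus averaged embeddings."""
    return cosine(featurize(text_a), featurize(text_b))
```

With these toy values, `combined_similarity(TRANSLATED_REVIEW, NATIVE_REVIEW)` is higher than `lexical_similarity(TRANSLATED_REVIEW, NATIVE_REVIEW)`, because "ngon" and "tuyệt" share no surface form but lie close together in the embedding space. The resulting feature vectors could then be passed to any supervised classifier for aspect category or polarity prediction.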

 
