Learning Disentangled Representations of Texts with Application to Biomedical Abstracts

  • 2018-04-19 15:09:14
  • Sarthak Jain, Edward Banner, Jan-Willem van de Meent, Iain J. Marshall, Byron C. Wallace

Abstract

We propose a method for learning disentangled sets of vector representations of texts that capture distinct aspects. We argue that such representations afford model transfer and interpretability. To induce disentangled embeddings, we propose an adversarial objective based on the (dis)similarity between triplets of documents w.r.t. specific aspects. Our motivating application concerns embedding abstracts describing clinical trials in a manner that disentangles the populations, interventions, and outcomes in a given trial. We show that the induced representations indeed encode these targeted clinically salient aspects and that they can be effectively used to perform aspect-specific retrieval. We demonstrate that the approach generalizes beyond this motivating example via experiments on two multi-aspect review corpora.

 
