Text Simplification with Sentence Embeddings

Abstract

Sentence embeddings can be decoded to give approximations of the original texts used to create them. We explore this effect in the context of text simplification, demonstrating that reconstructed text embeddings preserve complexity levels. We experiment with a small feed-forward neural network to effectively learn a transformation between sentence embeddings representing high-complexity and low-complexity texts. We provide a comparison with a Seq2Seq and an LLM-based approach, showing encouraging results in our much smaller learning setting. Finally, we demonstrate the applicability of our transformation to an unseen simplification dataset (MedEASI), as well as to datasets in languages outside the training data (ES, DE). We conclude that learning transformations in sentence embedding space is a promising direction for future research and has the potential to unlock the development of small but powerful models for text simplification and other natural language generation tasks.
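To make the idea of learning a transformation in embedding space concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state the architecture details, loss, embedding model, or dimensionality, so everything here is an assumption: a single hidden layer, a mean-squared-error objective, 16-dimensional vectors, and synthetic arrays standing in for paired complex/simple sentence embeddings. All function and variable names are hypothetical. The step that decodes a transformed embedding back to text is omitted.

```python
import numpy as np

# Hypothetical sketch: a one-hidden-layer feed-forward network that maps a
# "complex" sentence embedding to a "simple" one. Synthetic vectors stand in
# for real embeddings; the sizes and the MSE loss are assumptions.

def init_mlp(dim, hidden, rng):
    """Create small random weights for a dim -> hidden -> dim network."""
    return {
        "W1": rng.normal(0.0, 0.1, size=(dim, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 0.1, size=(hidden, dim)),
        "b2": np.zeros(dim),
    }

def forward(params, x):
    """Return the predicted embeddings and the hidden activations."""
    h = np.maximum(0.0, x @ params["W1"] + params["b1"])  # ReLU
    return h @ params["W2"] + params["b2"], h

def mse(params, x, y):
    """Mean squared error over all elements of the batch."""
    pred, _ = forward(params, x)
    return float(np.mean((pred - y) ** 2))

def train_step(params, x, y, lr):
    """One step of full-batch gradient descent on the MSE loss."""
    pred, h = forward(params, x)
    grad_pred = 2.0 * (pred - y) / pred.size   # d(loss)/d(pred)
    grad_W2 = h.T @ grad_pred
    grad_b2 = grad_pred.sum(axis=0)
    grad_h = grad_pred @ params["W2"].T
    grad_h[h <= 0.0] = 0.0                     # ReLU gradient mask
    grad_W1 = x.T @ grad_h
    grad_b1 = grad_h.sum(axis=0)
    params["W1"] -= lr * grad_W1
    params["b1"] -= lr * grad_b1
    params["W2"] -= lr * grad_W2
    params["b2"] -= lr * grad_b2

rng = np.random.default_rng(0)
dim, hidden, n_pairs = 16, 32, 256

# Synthetic stand-ins: "complex" embeddings and their "simple" counterparts,
# related here by an arbitrary fixed linear map (purely illustrative).
complex_emb = rng.normal(size=(n_pairs, dim))
true_map = rng.normal(0.0, 1.0 / np.sqrt(dim), size=(dim, dim))
simple_emb = complex_emb @ true_map

params = init_mlp(dim, hidden, rng)
loss_before = mse(params, complex_emb, simple_emb)
for _ in range(300):
    train_step(params, complex_emb, simple_emb, lr=0.5)
loss_after = mse(params, complex_emb, simple_emb)

# Apply the learned transformation to a few embeddings.
simplified, _ = forward(params, complex_emb[:4])
```

In a real pipeline, the inputs would come from a sentence encoder and the outputs would be passed to an embedding-to-text decoder; neither component is specified in the abstract, so neither is modelled here.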
