RecoBERT: A Catalog Language Model for Text-Based Recommendations

Abstract

Language models that utilize extensive self-supervised pre-training from unlabeled text have recently been shown to significantly advance the state-of-the-art performance in a variety of language understanding tasks. However, it is yet unclear if and how these recent models can be harnessed for conducting text-based recommendations. In this work, we introduce RecoBERT, a BERT-based approach for learning catalog-specialized language models for text-based item recommendations. We suggest novel training and inference procedures for scoring similarities between pairs of items that do not require item similarity labels. Both the training and the inference techniques were designed to utilize the unlabeled structure of textual catalogs and to minimize the discrepancy between training and inference. By incorporating four scores during inference, RecoBERT can infer text-based item-to-item similarities more accurately than other techniques. In addition, we introduce a new language understanding task for wine recommendations using similarities based on professional wine reviews. As an additional contribution, we publish an annotated recommendations dataset crafted by human wine experts. Finally, we evaluate RecoBERT and compare it to various state-of-the-art NLP models on wine and fashion recommendations tasks.
