Resource-Efficient Adaptation of Large Language Models for Text Embeddings via Prompt Engineering and Contrastive Fine-tuning

Abstract

Large Language Models (LLMs) have become a cornerstone in Natural Language Processing (NLP), achieving impressive performance in text generation. Their token-level representations capture rich, human-aligned semantics. However, pooling these vectors into a text embedding discards crucial information. Nevertheless, many non-generative downstream tasks, such as clustering, classification, or retrieval, still depend on accurate and controllable sentence- or document-level embeddings. We explore several adaptation strategies for pre-trained, decoder-only LLMs: (i) various aggregation techniques for token embeddings, (ii) task-specific prompt engineering, and (iii) text-level augmentation via contrastive fine-tuning. Combining these components yields state-of-the-art performance on the English clustering track of the Massive Text Embedding Benchmark (MTEB). An analysis of the attention map further shows that fine-tuning shifts focus from prompt tokens to semantically relevant words, indicating more effective compression of meaning into the final hidden state. Our experiments demonstrate that LLMs can be effectively adapted as text embedding models through a combination of prompt engineering and resource-efficient contrastive fine-tuning on synthetically generated positive pairs.
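
As a rough illustration of two of the ingredients named above, the sketch below shows two common ways to aggregate token vectors into a single text embedding (mean pooling and last-token pooling) and a contrastive objective over positive pairs. The abstract does not specify the loss, so the InfoNCE form with in-batch negatives, the temperature value, and all function names here are assumptions made for illustration, not the authors' implementation.

```python
import math

def mean_pool(token_vectors):
    """Average all token vectors into one text embedding."""
    n = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(vec[i] for vec in token_vectors) / n for i in range(dim)]

def last_token_pool(token_vectors):
    """Use the final token's hidden state as the text embedding."""
    return list(token_vectors[-1])

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def info_nce_loss(anchors, positives, temperature=0.05):
    """Assumed InfoNCE loss: positives[i] matches anchors[i];
    every other positive in the batch serves as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over the batch.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denom - logits[i]
    return total / len(anchors)
```

In this toy form, the loss is small when each anchor embedding is closest to its own positive and grows when the pairs are mismatched, which is the signal contrastive fine-tuning uses to pull paired texts together.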
