Attention Overflow: Language Model Input Blur during Long-Context Missing Items Recommendation

Abstract

Large language models (LLMs) can suggest missing elements from items listed in a prompt, which can be used for list completion or recommendations based on users' history. However, their performance degrades when presented with too many items, as they start to suggest items already included in the input list. This occurs at around 100 items for mid-2024 flagship LLMs. We evaluate this phenomenon on both synthetic problems (e.g., finding missing numbers in a given range of shuffled integers) and realistic movie recommendation scenarios. We refer to this issue as "attention overflow", as preventing repetition requires attending to all items simultaneously. Although iterative loops can mitigate this problem, their costs increase with the repetition rate, affecting the language models' ability to derive novelty from lengthy inputs.
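As an illustration of the synthetic setting mentioned above, the following is a minimal sketch of how a missing-numbers task could be built and how repeated suggestions could be counted. It is not the authors' evaluation code: the function names `make_missing_numbers_task` and `score_suggestions`, the 1-based integer range, and the two reported metrics are assumptions made for this example, and no model is called.

```python
import random


def make_missing_numbers_task(n, n_missing, seed=0):
    """Build a shuffled list of integers from 1..n with n_missing values held out.

    Hypothetical helper: the range, hold-out scheme, and seed are assumptions.
    """
    rng = random.Random(seed)
    full = list(range(1, n + 1))
    missing = set(rng.sample(full, n_missing))
    shown = [x for x in full if x not in missing]
    rng.shuffle(shown)  # the prompt presents the items in shuffled order
    return shown, missing


def score_suggestions(shown, missing, suggestions):
    """Return the share of suggestions that repeat an input item or hit a held-out one."""
    total = len(suggestions)
    if total == 0:
        return {"repetition_rate": 0.0, "hit_rate": 0.0}
    shown_set = set(shown)
    repeats = sum(1 for s in suggestions if s in shown_set)
    hits = sum(1 for s in suggestions if s in missing)
    return {"repetition_rate": repeats / total, "hit_rate": hits / total}


if __name__ == "__main__":
    shown, missing = make_missing_numbers_task(n=100, n_missing=5, seed=1)
    # Stand-in for a model's answer: three correct items and one repeated input item.
    fake_answer = sorted(missing)[:3] + shown[:1]
    print(score_suggestions(shown, missing, fake_answer))
```

In this sketch, a suggestion counts as a repetition when it already appears in the list shown to the model, which is the failure the abstract describes. Sweeping `n` upward while tracking `repetition_rate` would be one way to look for the point at which repetitions begin.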
