Abstract
In-context learning (ICL) is critical for large language models (LLMs), but its effectiveness is constrained by finite context windows, particularly in ultra-long contexts. To overcome this, we introduce InfiniteICL, a framework that parallels context and parameters in LLMs with short- and long-term memory in human cognitive systems, focusing on transforming temporary context knowledge into permanent parameter updates. This approach significantly reduces memory usage, maintains robust performance across varying input lengths, and theoretically enables infinite context integration through the principles of context knowledge elicitation, selection, and consolidation. Evaluations demonstrate that our method reduces context length by 90% while achieving, on average, 103% of the performance of full-context prompting across fact recall, grounded reasoning, and skill acquisition tasks. When conducting sequential multi-turn transformations on complex, real-world contexts (with lengths up to 2M tokens), our approach surpasses full-context prompting while using only 0.4% of the original contexts. These findings highlight InfiniteICL's potential to enhance the scalability and efficiency of LLMs by breaking the limitations of conventional context window sizes.