Extending Context Windows in Large Language Models: A Survey of Techniques, Architectures, and Evaluation Methods
Abstract: Since the introduction of the Transformer, context windows have grown dramatically: from 512 tokens in the original architecture to 2 million tokens in Gemini 2.5 Pro, a roughly 4,000-fold increase. This survey examines how positional encoding methods such as RoPE scaling and ALiBi, and architectural alternatives such as state space models and hybrid designs, enable these longer context windows. A recurring finding is that few models achieve an effective context length close to what they advertise: most operate effectively over only 10-50% of their claimed window, and the RULER benchmark shows that many models advertising 128K contexts maintain performance without decay only up to about 32K. The "lost in the middle" effect compounds this: models systematically fail to retrieve information placed in the middle of the context, whether that context spans 32K or 128K tokens. The survey organizes these findings across positional encoding methods, sparse attention, memory-augmented architectures, state space models, retrieval-augmented generation, and computational optimization. Evidence is drawn from peer-reviewed literature at NeurIPS, ICML, ICLR, ACL, and EMNLP to provide a practical overview for practitioners working with long-context LLMs.
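The abstract names RoPE scaling as one route to longer contexts. As a minimal sketch of the underlying idea (assuming a PyTorch setting, with illustrative head dimension and scale factor not taken from any cited model), position interpolation compresses position indices so the rotary angles stay inside the range seen during training:

```python
import torch

def rope_angles(head_dim: int, max_pos: int, base: float = 10000.0,
                scale: float = 1.0) -> torch.Tensor:
    """Rotary angles with optional position interpolation.

    scale > 1 divides position indices, so a model trained on
    max_pos / scale tokens can be run over max_pos tokens while its
    rotary angles remain in the trained range. Values are illustrative.
    """
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    positions = torch.arange(max_pos).float() / scale   # interpolated positions
    return torch.outer(positions, inv_freq)             # (max_pos, head_dim // 2)

def apply_rope(x: torch.Tensor, angles: torch.Tensor) -> torch.Tensor:
    """Rotate query/key vectors; x has shape (seq_len, head_dim)."""
    x1, x2 = x[..., 0::2], x[..., 1::2]
    cos, sin = angles.cos(), angles.sin()
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)

# Example: a head trained with 4K positions, interpolated to cover 16K.
angles = rope_angles(head_dim=64, max_pos=16_384, scale=4.0)
q = torch.randn(16_384, 64)
q_rot = apply_rope(q, angles)
print(q_rot.shape)  # torch.Size([16384, 64])
```

This sketch uses an interleaved-pair rotation; production implementations differ in pairing convention and typically cache the cosine and sine tables, but the interpolation step shown here is the core of the technique.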
Keywords: Large Language Models, Context Length, Positional Encoding, RoPE, ALiBi, Sparse Attention, State Space Models, Mamba, Retrieval-Augmented Generation, FlashAttention, KV Cache
How to Cite: Mohammed Sule, "Extending Context Windows in Large Language Models: A Survey of Techniques, Architectures, and Evaluation Methods", Volume 14 Issue 12, December 2025, International Journal of Science and Research (IJSR), Pages: 1141-1147, https://www.ijsr.net/getabstract.php?paperid=SR251215184109, DOI: https://dx.doi.org/10.21275/SR251215184109