
BlockRank is a novel, efficient method for In-Context Ranking (ICR) with large language models (LLMs) in Information Retrieval (IR). The core problem it addresses is the computational cost of standard LLMs when ranking many candidate documents, which scales quadratically with context length due to the attention mechanism. BlockRank tackles this by first analyzing LLM attention patterns, which reveal inherent inter-document block sparsity and query-document block relevance signals in the middle layers.
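
The inter-document block sparsity idea can be illustrated with a small sketch, not the paper's implementation: a context laid out as [shared prompt | documents | query], where each document block attends only to the shared prompt and to itself, while query tokens attend to everything. The layout, block sizes, and function name here are assumptions for illustration only.

```python
import numpy as np

def block_sparse_mask(prompt_len, doc_lens, query_len):
    """Build a boolean attention mask with inter-document block sparsity.

    Layout of the context: [prompt | doc_1 ... doc_k | query].
    (Illustrative sketch only; causality is ignored for simplicity.)
    """
    total = prompt_len + sum(doc_lens) + query_len
    mask = np.zeros((total, total), dtype=bool)
    # Shared prompt tokens attend among themselves.
    mask[:prompt_len, :prompt_len] = True
    start = prompt_len
    for d in doc_lens:
        blk = slice(start, start + d)
        mask[blk, :prompt_len] = True  # each document sees the shared prompt
        mask[blk, blk] = True          # ...and its own tokens, but no other document
        start += d
    mask[start:, :] = True             # query tokens attend to the full context
    return mask

mask = block_sparse_mask(prompt_len=4, doc_lens=[3, 3, 3], query_len=2)
# Dense attention over this 15-token context would need 15 * 15 = 225 entries;
# the block-sparse mask keeps only 109 of them.
print(mask.sum(), mask.shape)  # → 109 (15, 15)
```

Because each document block never attends to the others, the attention cost grows roughly linearly in the number of candidate documents rather than quadratically in total context length.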