Insights into LLM Long-Context Failures: When Transformers Know but Don't Tell
2024-06-20
Taiming Lu, Muhan Gao, Kuai Yu, Adam Byerly, Daniel Khashabi
Code: github.com/TaiMingLu/know-dont-tell (official PyTorch implementation)
Abstract
Large Language Models (LLMs) exhibit positional bias, struggling to utilize information from the middle or end of long contexts. Our study explores LLMs' long-context reasoning by probing their hidden representations. We find that while LLMs encode the position of target information, they often fail to leverage this in generating accurate responses. This reveals a disconnect between information retrieval and utilization, a "know but don't tell" phenomenon. We further analyze the relationship between extraction time and final accuracy, offering insights into the underlying mechanics of transformer models.
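The probing approach the abstract describes — testing whether the position of target information is linearly decodable from hidden states — can be illustrated with a minimal sketch. This is not the paper's actual probe: the hidden states below are synthetic stand-ins (a random linear encoding of position plus noise), and all dimensions and names are hypothetical.

```python
import numpy as np

# Hypothetical setup: d-dim "hidden states" that linearly encode which of
# n_pos context positions holds the target information (synthetic data,
# not real LLM activations).
rng = np.random.default_rng(0)
d, n_pos, n_samples = 64, 10, 2000

W_true = rng.normal(size=(n_pos, d))            # assumed encoding directions
positions = rng.integers(0, n_pos, size=n_samples)
hidden = W_true[positions] + 0.5 * rng.normal(size=(n_samples, d))

# Fit a least-squares linear probe mapping hidden state -> one-hot position.
onehot = np.eye(n_pos)[positions]
train, test = slice(0, 1500), slice(1500, None)
probe, *_ = np.linalg.lstsq(hidden[train], onehot[train], rcond=None)

# High held-out accuracy means position information is linearly decodable
# from the representation, even if the model's generated answer ignores it.
pred = hidden[test] @ probe
acc = (pred.argmax(axis=1) == positions[test]).mean()
print(f"probe accuracy: {acc:.2f} (chance = {1/n_pos:.2f})")
```

In the paper's setting the probe is trained on actual transformer layer activations; a gap between high probe accuracy and low answer accuracy is what the authors call "know but don't tell".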