SOTAVerified

Multi-hop Question Answering

Papers

Showing 1-25 of 202 papers

Title | Status | Hype
HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models | Code | 7
Retrieval-Augmented Generation with Hierarchical Knowledge | Code | 4
Meta-Chunking: Learning Text Segmentation and Semantic Completion via Logical Perception | Code | 3
Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning | Code | 3
MASKSEARCH: A Universal Pre-Training Framework to Enhance Agentic Search Capability | Code | 2
LevelRAG: Enhancing Retrieval-Augmented Generation with Multi-hop Logic Planning over Rewriting Augmented Searchers | Code | 2
Iteration of Thought: Leveraging Inner Dialogue for Autonomous Large Language Model Reasoning | Code | 2
EfficientRAG: Efficient Retriever for Multi-Hop Question Answering | Code | 2
AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents | Code | 2
Scaling External Knowledge Input Beyond Context Windows of LLMs via Multi-Agent Collaboration | Code | 1
KnowTrace: Bootstrapping Iterative Retrieval-Augmented Generation with Structured Knowledge Tracing | Code | 1
HopWeaver: Synthesizing Authentic Multi-Hop Questions Across Text Corpora | Code | 1
TreeHop: Generate and Filter Next Query Embeddings Efficiently for Multi-hop Question Answering | Code | 1
Collab-RAG: Boosting Retrieval-Augmented Generation for Complex Question Answering via White-Box and Black-Box LLM Collaboration | Code | 1
Knowledge Editing with Dynamic Knowledge Graphs for Multi-Hop Question Answering | Code | 1
KG-Retriever: Efficient Knowledge Indexing for Retrieval-Augmented Large Language Models | Code | 1
Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos | Code | 1
Hierarchical Retrieval-Augmented Generation Model with Rethink for Multi-hop Question Answering | Code | 1
CompAct: Compressing Retrieved Documents Actively for Question Answering | Code | 1
Re-ReST: Reflection-Reinforced Self-Training for Language Agents | Code | 1
Retrieval-enhanced Knowledge Editing in Language Models for Multi-Hop Question Answering | Code | 1
Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language Models as Agents | Code | 1
LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration | Code | 1
Direct Evaluation of Chain-of-Thought in Multi-hop Reasoning with Knowledge Graphs | Code | 1
DeepEdit: Knowledge Editing as Decoding with Constraints | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Multi-hop Dense Passage Retriever (MDR) | Answer F1 | 56.5 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Beam Retrieval | An | 69.2 | | Unverified