SOTAVerified

StrategyQA

StrategyQA aims to measure the ability of models to answer questions that require multi-step implicit reasoning.

Source: BIG-bench

Papers

Showing 11-20 of 40 papers

| Title | Status | Hype |
| --- | --- | --- |
| Visconde: Multi-document QA with GPT-3 and Neural Reranking | Code | 1 |
| Improving Planning with Large Language Models: A Modular Agentic Architecture | Code | 1 |
| Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies | Code | 1 |
| Distillation Contrastive Decoding: Improving LLMs Reasoning with Contrastive Decoding and Distillation | Code | 1 |
| Rule-Guided Feedback: Enhancing Reasoning by Enforcing Rule Adherence in Large Language Models | | 0 |
| Self-Evaluation Guided Beam Search for Reasoning | | 0 |
| Hint of Thought prompting: an explainable and zero-shot approach to reasoning tasks with LLMs | | 0 |
| Advancing Process Verification for Large Language Models via Tree-Based Preference Learning | | 0 |
| A Looming Replication Crisis in Evaluating Behavior in Language Models? Evidence and Solutions | | 0 |
| Answering Unseen Questions With Smaller Language Models Using Rationale Generation and Dense Retrieval | | 0 |

No leaderboard results yet.