StrategyQA

StrategyQA aims to measure the ability of models to answer questions that require multi-step implicit reasoning.

Source: BIG-bench

Papers

Showing 11–20 of 40 papers

Title	Date	Tasks	Status	Hype
Visconde: Multi-document QA with GPT-3 and Neural Reranking	Dec 19, 2022	Language Modeling, Language Modelling	Code Available	1
Improving Planning with Large Language Models: A Modular Agentic Architecture	Sep 30, 2023	In-Context Learning, Reinforcement Learning (RL)	Code Available	1
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies	Jan 6, 2021	Question Answering, StrategyQA	Code Available	1
Distillation Contrastive Decoding: Improving LLMs Reasoning with Contrastive Decoding and Distillation	Feb 21, 2024	Arithmetic Reasoning, GSM8K	Code Available	1
Rule-Guided Feedback: Enhancing Reasoning by Enforcing Rule Adherence in Large Language Models	Mar 14, 2025	Checkmate In One, GSM8K	Unverified	0
Self-Evaluation Guided Beam Search for Reasoning	May 1, 2023	Arithmetic Reasoning, GSM8K	Unverified	0
Hint of Thought prompting: an explainable and zero-shot approach to reasoning tasks with LLMs	May 19, 2023	Arithmetic Reasoning, GSM8K	Unverified	0
Advancing Process Verification for Large Language Models via Tree-Based Preference Learning	Jun 29, 2024	Binary Classification, GSM8K	Unverified	0
A Looming Replication Crisis in Evaluating Behavior in Language Models? Evidence and Solutions	Sep 30, 2024	Prompt Engineering, StrategyQA	Unverified	0
Answering Unseen Questions With Smaller Language Models Using Rationale Generation and Dense Retrieval	Aug 9, 2023	ARC, Language Modelling	Unverified	0

No leaderboard results yet.