SOTAVerified

StrategyQA

StrategyQA aims to measure the ability of models to answer questions that require multi-step implicit reasoning.

Source: BIG-bench

Papers

Showing 21–30 of 40 papers

Title | Status | Hype
Rationale-Aware Answer Verification by Pairwise Self-Evaluation | Code | 0
A Looming Replication Crisis in Evaluating Behavior in Language Models? Evidence and Solutions | | 0
Proof of Thought: Neurosymbolic Program Synthesis allows Robust and Interpretable Reasoning | | 0
Meta-prompting Optimized Retrieval-augmented Generation | | 0
Question-Analysis Prompting Improves LLM Performance in Reasoning Tasks | | 0
Advancing Process Verification for Large Language Models via Tree-Based Preference Learning | | 0
Improving Attributed Text Generation of Large Language Models via Preference Learning | | 0
Towards Uncertainty-Aware Language Agent | | 0
IAG: Induction-Augmented Generation Framework for Answering Reasoning Questions | | 0
The ART of LLM Refinement: Ask, Refine, and Trust | | 0
Page 3 of 4

No leaderboard results yet.