SOTAVerified

valid

Papers

Showing 631–640 of 3589 papers

Title	Date	Tasks	Status	Hype
REGen: A Reliable Evaluation Framework for Generative Event Argument Extraction	Feb 24, 2025	Event Argument Extraction, valid	Unverified	0
Quantifying Logical Consistency in Transformers via Query-Key Alignment	Feb 24, 2025	Logical Reasoning, valid	Unverified	0
Your Assumed DAG is Wrong and Here's How To Deal With It	Feb 24, 2025	Causal Discovery, valid	Code Available	0
Auto-Bench: An Automated Benchmark for Scientific Discovery in LLMs	Feb 21, 2025	scientific discovery, valid	Unverified	0
Pricing Valid Cuts for Price-Match Equilibria	Feb 21, 2025	valid	Unverified	0
Towards a Perspectivist Turn in Argument Quality Assessment	Feb 20, 2025	valid	Code Available	0
EquivaMap: Leveraging LLMs for Automatic Equivalence Checking of Optimization Formulations	Feb 20, 2025	Combinatorial Optimization, valid	Code Available	0
Explainable Distributed Constraint Optimization Problems	Feb 19, 2025	valid	Unverified	0
Conformal Prediction under Levy-Prokhorov Distribution Shifts: Robustness to Local and Global Perturbations	Feb 19, 2025	Conformal Prediction, Prediction	Code Available	0
What are Models Thinking about? Understanding Large Language Model Hallucinations "Psychology" through Model Inner State Analysis	Feb 19, 2025	Hallucination, Language Modeling	Unverified	0

