
valid

Papers


Showing 51–60 of 3589 papers

Title	Date	Tasks	Status	Hype
What Has Been Lost with Synthetic Evaluation?	May 28, 2025	Negation, Reading Comprehension	Unverified	0
Automatic Transmission for LLM Tiers: Optimizing Cost and Accuracy in Large Language Models	May 27, 2025	valid	Code Available	0
STACI: Spatio-Temporal Aleatoric Conformal Inference	May 27, 2025	Gaussian Processes, GPU	Unverified	0
PrivATE: Differentially Private Confidence Intervals for Average Treatment Effects	May 27, 2025	Privacy Preserving, Uncertainty Quantification	Unverified	0
Collision- and Reachability-Aware Multi-Robot Control with Grounded LLM Planners	May 26, 2025	MuJoCo, valid	Unverified	0
On the Robustness of RSMA to Adversarial BD-RIS-Induced Interference	May 26, 2025	valid	Unverified	0
Regret Analysis of Average-Reward Unichain MDPs via an Actor-Critic Approach	May 26, 2025	TAR, valid	Unverified	0
HomeBench: Evaluating LLMs in Smart Homes with Valid and Invalid Instructions Across Single and Multiple Devices	May 26, 2025	In-Context Learning, Retrieval-augmented Generation	Code Available	0
PAMD: Plausibility-Aware Motion Diffusion Model for Long Dance Generation	May 26, 2025	valid	Unverified	0
We Need to Measure Data Diversity in NLP -- Better and Broader	May 26, 2025	Diversity, valid	Unverified	0


No leaderboard results yet.