SOTAVerified
valid
Papers
Showing 31–40 of 3589 papers
| Title | Date | Tasks | Status | Hype |
| --- | --- | --- | --- | --- |
| Language Models over Canonical Byte-Pair Encodings | Jun 9, 2025 | valid | Unverified | 0 |
| Ensuring Reliability of Curated EHR-Derived Data: The Validation of Accuracy for LLM/ML-Extracted Information and Data (VALID) Framework | Jun 9, 2025 | Benchmarking, Fairness | Unverified | 0 |
| AutoSDT: Scaling Data-Driven Discovery Tasks Toward Open Co-Scientists | Jun 9, 2025 | scientific discovery, valid | Unverified | 0 |
| Inference on the value of a linear program | Jun 7, 2025 | valid | Unverified | 0 |
| Can LLMs Generate Reliable Test Case Generators? A Study on Competition-Level Programming Problems | Jun 7, 2025 | Code Generation, valid | Unverified | 0 |
| On Efficient Estimation of Distributional Treatment Effects under Covariate-Adaptive Randomization | Jun 6, 2025 | regression, valid | Code Available | 0 |
| Speech Neurophysiology in Realistic Contexts: Big Hype or Big Leap? | Jun 5, 2025 | valid | Unverified | 0 |
| Does It Make Sense to Speak of Introspection in Large Language Models? | Jun 5, 2025 | valid | Unverified | 0 |
| DRE: An Effective Dual-Refined Method for Integrating Small and Large Language Models in Open-Domain Dialogue Evaluation | Jun 4, 2025 | Dialogue Evaluation, valid | Unverified | 0 |
| SQLens: An End-to-End Framework for Error Detection and Correction in Text-to-SQL | Jun 4, 2025 | Text to SQL, Text-To-SQL | Unverified | 0 |
Page 4 of 359
No leaderboard results yet.